GPT-5 Mini is costing requests on the old plan/500 mode

Describe the Bug

I remember seeing a post mentioning that GPT-5 Mini would be free on the old plan and would not cost requests.

Steps to Reproduce

Send a request using GPT-5 Mini on the old (500-request) plan.

Expected Behavior

GPT-5 Mini should not consume any of the included requests.

Operating System

macOS

Current Cursor Version (Menu → About Cursor → Copy)

Version: 1.7.54
VSCode Version: 1.99.3
Commit: 5c17eb2968a37f66bc6662f48d6356a100b67be0
Date: 2025-10-21T19:07:38.476Z
Electron: 34.5.8
Chromium: 132.0.6834.210
Node.js: 20.19.1
V8: 13.2.152.41-electron.0
OS: Darwin arm64 25.1.0

Does this stop you from using Cursor

No - Cursor works, but with this issue



Also, gemini-2.5-flash has been doing the same since 11/19 :face_with_crossed_out_eyes:

I’m also seeing the same thing. On 18 November GPT-5 Mini cost 0 requests, but since then it has been costing 1 request. The same goes for gemini-2.5-flash and GPT-5 Nano. I also noticed, before this, that the number of requests for each model had been obfuscated in the UI and on the models page of the website.

Perhaps the legacy plan is getting rug-pulled and this is the start.

To add, gpt-5.1-codex-mini has also cost 1 request since it was introduced, which doesn’t make much sense if gpt-5.1-codex is also 1 request.

For more context: deepseek-v3.1 is costing 0.3 requests as before, grok-code-fast-1 is costing 0 requests as before, and claude-4.5-sonnet-thinking is costing 2 requests as before.


I emailed them. The free models have been rug-pulled. Here is the reply from James at Cursor:

Hi Nathan,

I wanted to provide some clarity on how the system operates behind the scenes. The “free” slow-pool models will now utilize your included premium requests as long as you have some remaining. Once those requests are fully consumed, you can continue using the slow-pool models at no request cost; however, they will operate at a slower rate. This change is part of a recent system update. Additionally, please note that the gpt-5.1-codex-mini model is now categorized in the same request tier as the gpt-5.1-codex model, which is why both models require one request each; this aligns with the updated structure.

If your goal is to optimize your usage, the most efficient approach is to use the premium models first, and then switch to the slow-pool models after exhausting your included requests. If you’d like assistance in finding a setup that better suits your workflow, I’m here to help.

Best,


Thanks for sharing this here @Nathan_Mullings
