Hard Rate Limits or Queues/Throttling?

moneybags · June 26, 2025, 1:29pm

I am probably missing the details, but I still have not found enough information about the Ultra plan. It’s probably in a comment somewhere that I have not seen.

Does anyone have a good breakdown of rate limits? I have no issue paying $200, but I won’t if I don’t understand what it is buying me. Are rate limits only hard limits, i.e. there is a message in the UI stating that I have exceeded my limits, or is there also throttling or queueing that works differently on the Ultra plan? I’m really only curious about the last part.

thepKz · June 26, 2025, 1:56pm

20x usage (compared to the Pro plan) on all OpenAI, Claude, and Gemini models.
Today there seems to be a bug where using Claude 1-2 times is enough to hit the rate limit. The documentation is also still a little vague. I think you should buy Pro+ ($60) and try that before moving to Ultra.

danperks · June 27, 2025, 9:42am

Hey, thanks for the question on Ultra!

With Ultra, you get near-unlimited requests each month. There is no hard cap on the number of requests you can make. Ultra follows the same rate limit system as Pro but has 20 times the rate limits, which should be incredibly difficult to ever reach.

If you were to hit a rate limit on any plan, you would be shown an alert in-app, with one of three options:

Upgrade to a higher plan
Enable usage-based pricing to pay for exactly what you use
Switch to a different model / model family

There are no “queues” or “slow requests” anymore, either, so things are much simpler.

G4Q4 · June 27, 2025, 10:15am

Switching to different models or model families doesn’t work anymore. For the past couple of days, when I hit a rate limit with one model, I hit it with every model other than Gemini Flash. And yes, I can’t even use models like 4.1 or Sonnet 3.5.

Is this expected behavior or a glitch?

danperks · June 27, 2025, 10:21am

We may have had an issue where rate limits were not correctly maintained across all models, but GPT 4.1 and Gemini Flash should both be truly unlimited, I believe. Is this not the case for you?

G4Q4 · June 27, 2025, 10:23am

Now, when I hit a rate limit (which for me currently means after 3-4 prompts with Sonnet 4 non-thinking), I’m unable to use any model other than Gemini Flash and Auto.

I think Auto basically just uses 4.1 the vast majority of the time. I technically can use it, but I can’t specifically select 4.1 and use it.

danperks · July 1, 2025, 2:16pm

This suggests you have likely used your burst rate limit already, which is large (always >= your plan cost) but is very slow to refill - this leaves you with your ‘local’ limit, which is smaller but refills every few hours.

Of the users who have hit this already, we’ve found the vast majority have either changed their usage pattern or previously relied on usage-based pricing to top up their usage each month, and they should expect to continue doing so now!
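
To make the burst/local description above more concrete, here is a minimal Python sketch of a two-tier limiter. It is only an illustration under stated assumptions, not Cursor's actual implementation: the class names, the capacities, the refill rates, and the rule that the burst allowance is spent before the local one are all hypothetical, chosen to mirror the behavior described in the post (a large, slow-refilling burst limit and a smaller local limit that recovers within a few hours).

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """A simple token bucket (hypothetical model, not Cursor's real system)."""
    capacity: float          # maximum units the bucket can hold
    refill_per_hour: float   # units regained per hour of elapsed time
    level: float = field(init=False)

    def __post_init__(self) -> None:
        self.level = self.capacity  # every bucket starts full

    def refill(self, hours: float) -> None:
        # Regain units over time, never exceeding capacity.
        self.level = min(self.capacity, self.level + self.refill_per_hour * hours)

    def try_spend(self, cost: float) -> bool:
        # Hard limit: either the request fits or it is rejected outright.
        if cost > self.level:
            return False
        self.level -= cost
        return True


class TwoTierLimiter:
    """Assumed policy: spend the burst bucket first, then fall back to local."""

    def __init__(self, burst: Bucket, local: Bucket) -> None:
        self.burst = burst
        self.local = local

    def advance(self, hours: float) -> None:
        self.burst.refill(hours)
        self.local.refill(hours)

    def request(self, cost: float = 1.0) -> str:
        if self.burst.try_spend(cost):
            return "burst"
        if self.local.try_spend(cost):
            return "local"
        return "rate_limited"  # no queueing: the caller sees the limit at once


# Illustrative numbers only: a large, slow burst bucket and a small, fast local one.
limiter = TwoTierLimiter(
    burst=Bucket(capacity=5.0, refill_per_hour=0.01),
    local=Bucket(capacity=2.0, refill_per_hour=0.5),
)
outcomes = [limiter.request() for _ in range(8)]
limiter.advance(hours=4.0)
after_wait = limiter.request()
print(outcomes, after_wait)
```

In this sketch, once the burst bucket is drained, only the small local bucket recovers quickly, so a user in that state would hit the limit after a handful of requests and regain a few more some hours later. That is one way to read the "3-4 prompts" experience reported earlier in the thread, though the real limits and refill rates are not stated here.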
