Cursor hitting a rate limit on Claude-4 Sonnet despite low usage, raising concerns about infrastructure and pricing direction

Describe the Bug

Cursor showed a provider rate limit error when using Claude-4 Sonnet, even though my usage was well below my plan quota (Pro plan, 500 requests/month, only 54 used so far). I am not on Usage-Based Pricing and wasn’t using Max Mode.

Steps to Reproduce

Be on Pro plan with plenty of requests remaining (in my case, 54/500 used).
Use Claude-4 Sonnet for regular code requests (not Max Mode).
Randomly hit a rate limit error message:
“We’ve hit a rate limit with the provider. Please switch to the ‘auto-select’ model, another model, or try again in a few moments.”

Expected Behavior

Cursor should process the request as normal since I am well within my paid quota. If there is a provider-side capacity issue, this should be communicated clearly, and Cursor could automatically switch to another available model or offer real-time provider status so I can make informed decisions.

Operating System

MacOS

Current Cursor Version (Menu → About Cursor → Copy)

Version: 1.1.7
VSCode Version: 1.96.2
Commit: 7111807980fa9c93aedd455ffa44b682c0dc1350
Date: 2025-07-01T07:26:06.233Z (14 hrs ago)
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Darwin x64 23.1.0

Additional Information

This raises bigger concerns:

Is Cursor’s infrastructure or provider agreement unable to support demand at advertised plan levels?

Are users being quietly nudged toward enabling usage-based pricing or upgrading to Ultra to avoid hitting these limits?

Is Cursor overcommitting plan quotas that it can’t consistently fulfill, at least for certain models?

Clearer communication would help, especially distinguishing between provider API issues vs Cursor-imposed limits.

Does this stop you from using Cursor

Yes - Cursor is unusable

Hi @cviator and welcome to Cursor Forum!

This is not about your rate limits or quota.

At times certain AI models are in heavy demand, and providers have only limited server capacity available to handle that load.

Thanks for the suggestion to switch automatically. This is why Auto selection is available: it lets Cursor pick good models that aren't overloaded. The message you saw is essentially real-time, because these capacity limits aren't predictable. Often the issue is resolved within minutes; sometimes it takes longer.
So retrying very often works.
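The "retry, then switch models" advice can be sketched in a few lines of Python. This is only an illustration for a script that calls a model API itself, not how Cursor works internally: `ProviderRateLimitError`, `ask_with_fallback`, the `send` callable, and the default delays are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


class ProviderRateLimitError(Exception):
    """Hypothetical error standing in for a provider-side rate limit."""


def ask_with_fallback(
    send: Callable[[str, str], str],
    prompt: str,
    models: Sequence[str],
    retries_per_model: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Try each model in order, retrying with exponential backoff.

    `send(model, prompt)` is an assumed callable that returns the reply or
    raises ProviderRateLimitError. Returns (model_used, reply).
    """
    last_error: Optional[ProviderRateLimitError] = None
    for model in models:
        for attempt in range(retries_per_model):
            try:
                return model, send(model, prompt)
            except ProviderRateLimitError as err:
                last_error = err
                # Back off (2s, 4s, ...) only if another attempt on this
                # model follows; otherwise move straight to the next model.
                if attempt < retries_per_model - 1:
                    sleep(base_delay * 2 ** attempt)
    if last_error is None:
        raise ValueError("need at least one model and one attempt")
    raise last_error
```

Passing the preferred model first and a less loaded one second gives roughly the behavior described above: a few retries over several seconds, then a switch to another model.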

No, this is not about nudging users toward usage-based pricing: the same model on usage-based pricing would show the same message.

The message is clear: the API provider's rate limit was reached. If you reach your own rate limit, the message says "Your rate limit".

While I don't have insight into Cursor's provider agreements, most likely the API provider itself also has internal capacity limits.

I am hitting this limit every day now as well, sooner rather than later. It is rather annoying because, let's face it, claude-4 produces much better results on complex tasks than claude-3.5.

I can't even ask 3 questions, 2 prompts, before I get hit with the rate limit:
"We've hit a rate limit with the provider"
It's almost unusable right now.

@alorbach The Cursor team has updated the Dashboard, which now shows usage consumption clearly. Please have a look at the details.

@Abenu10 This still seems to be a rate limit between Cursor and the AI provider. Is this happening on Claude 4 Sonnet?

I see. Now it would be good to know why sometimes 0 tokens are used and other times millions of tokens. I was working on the same code (Python tools); you can see that token consumption is highest with 4.0, lower with 3.7, and lowest with the default (3.5). I find this unexpected, as the tasks I gave them were not that different.

@alorbach Your screenshot clearly shows why it is 0: there was an error, so the request was not charged.

The other entries show moderate to high token consumption. Have a look at the Tokens feature in the Usage logs (top right); it shows what the tokens were used for. This depends heavily on how you use Cursor.

It appears that the cache tokens make up the difference between the model calls. Shouldn't cached tokens be cheaper in general?

Regarding the 0 ones, I found some more - but perhaps these are just connection failures?

@alorbach Some may still be processing or being updated.

Cache reads are the cheapest tokens, but having 500,000 tokens read from cache means that a lot of context was needed to handle that part of the request. That is an unusually large amount; check the chat to see what you requested and how the AI performed.

Changing what you ask in the prompt, or what needs to be done, may affect this.
Also consider whether a lighter model may suffice for some tasks; it would save you on usage.

I am letting it code rather complex Python scripts which display a UI, perform AI requests of their own, process data and so on. So using lighter models is not really an option, as they fail more often on the tasks I am giving them.

claude-sonnet 4.0 is so ■■■■■■■ good, I got so used to its quality of code that it really hurts to go back to 3.5 all the time :wink:

Sure I also use Claude 4 Sonnet :slight_smile: and it is the best so far.

Not that you have to use 3.5 all the time. I meant more looking into the prompts, what AI does and how you can save yourself tokens for the important parts of your work.

Do you use any framework e.g. for UI / crud or similar processes in your code? Usually this helps avoid AI having to code a lot of things manually and spend tokens on the business logic instead.

Not yet. I had Claude decide how to do the GUI coding, which btw produced great results. It would be great if you could publish a list somewhere of how many tokens per model and plan it takes to reach the rate limit. Perhaps I need to ask my boss for a better plan ^^

I personally don't have that information, as I am not part of the Cursor team, but they mentioned that the included usage is always worth more than the cost of the plan if you look at the token consumption.

With the old plan, as a developer I used up my 500 requests in about 2 weeks; after that it was usage-based pricing for me. With the new plan I can do a lot within the plan, and I enable usage-based pricing when I go over the limit. But I also check my usage and make sure that no unnecessary AI processing is happening.

From what I personally see it may be that those who really need more intensive AI usage do benefit from higher plans.

Me too :sob:

Please have a look at the recent update

I agree, brother. It's been only 10 days since I bought a Cursor subscription and it's already saying "You've save 101$ on pro plan, switch to auto..."
It's so frustrating; the previous subscription plan with 500 requests per month was far better.

I also ran into this after only a few questions. I had not attached many files, yet it still counted a lot of tokens. I wonder whether Cursor's new memory feature is counted in tokens. This annoying thing makes me want to ask for a refund again.

A tool that you can’t rely on to work when needed can’t be used as a primary tool. Without reliability, Cursor can only be a hobbyist’s tool, nothing more imo.

They have to fix this.
June 1st - I used my cursor, $20/m + some usage based.
July 1st - Used cursor, $20/m and unlimited claude 4 sonnet usage
July 6th - Hit the ■■■■ limit of $20/m in a few minutes, forced to upgrade to $60/m just to keep coding.

There's some ridiculous usage:

Jul 6, 05:01 PM	You	pro-plus	No	claude-4-sonnet-thinking	78,065	Included
Jul 6, 04:53 PM	You	pro-plus	No	claude-4-sonnet-thinking	573,165	Included
Jul 6, 04:45 PM	You	pro-plus	No	claude-4-sonnet-thinking	549,854	Included
Jul 6, 04:16 PM	You	pro-plus	No	claude-4-sonnet-thinking	517,228	Included
Jul 6, 04:14 PM	You	pro-plus	No	claude-4-sonnet-thinking	654,669	Included
Jul 6, 04:06 PM	You	pro-plus	No	claude-4-sonnet-thinking	880,135	Included
Jul 6, 04:04 PM	You	pro-plus	No	claude-4-sonnet-thinking	60,677	Included
Jul 6, 03:55 PM	You	pro-plus	No	claude-4-sonnet-thinking	850,678	Included
Jul 6, 03:50 PM	You	pro-plus	No	claude-4-sonnet-thinking	2,258,447	Included
Jul 6, 03:50 PM	You	pro-plus	No	claude-4-sonnet-thinking	172,237	Included
Jul 6, 03:46 PM	You	pro-plus	No	claude-4-sonnet-thinking	106,684	Included
Jul 6, 03:46 PM	You	pro-plus	No	claude-4-sonnet-thinking	1,273,890	Included
Jul 6, 03:46 PM	You	pro-plus	No	claude-4-sonnet-thinking	104,210	Included
Jul 6, 03:45 PM	You	pro-plus	No	claude-4-sonnet-thinking	101,598	Included
Jul 6, 03:45 PM	You	pro-plus	No	claude-4-sonnet-thinking	237,666	Included

What the hell, 2 million+ tokens used? Look at the timings: these are just follow-up messages in the chat window with the agent.
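To put the log above in perspective, here is a small Python sketch that totals it. The times and token counts are copied from the rows as posted; `ROWS` and `summarize` are names made up for this example, and what exactly the token column includes (for instance cache reads) is not stated in the log, so treat the total as a raw sum of that column.

```python
from datetime import datetime

# (time, tokens) pairs copied from the Jul 6 usage log above, newest first.
ROWS = [
    ("05:01 PM", 78_065),
    ("04:53 PM", 573_165),
    ("04:45 PM", 549_854),
    ("04:16 PM", 517_228),
    ("04:14 PM", 654_669),
    ("04:06 PM", 880_135),
    ("04:04 PM", 60_677),
    ("03:55 PM", 850_678),
    ("03:50 PM", 2_258_447),
    ("03:50 PM", 172_237),
    ("03:46 PM", 106_684),
    ("03:46 PM", 1_273_890),
    ("03:46 PM", 104_210),
    ("03:45 PM", 101_598),
    ("03:45 PM", 237_666),
]


def summarize(rows):
    """Return request count, token totals, and the time span in minutes."""
    tokens = [count for _, count in rows]
    times = [datetime.strptime(stamp, "%I:%M %p") for stamp, _ in rows]
    span_minutes = (max(times) - min(times)).total_seconds() / 60
    return {
        "requests": len(rows),
        "total_tokens": sum(tokens),
        "largest_request": max(tokens),
        "span_minutes": span_minutes,
    }


summary = summarize(ROWS)
print(summary)
```

For these 15 rows the sum comes to 8,419,203 tokens over about 76 minutes, so the single 2,258,447-token request accounts for a bit more than a quarter of the total.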
