User Provided Rate Limit Exceeded in Agent Mode

This is not related to the API key or account running out of funds. Rather, it is caused by the agent making a very high number of requests within a very short period of time.

It can be resolved by introducing a delay between requests. It would be nice to have this as a configuration option in Cursor settings.

Slower but complete execution is better than having work interrupted midway and then asking the agent to pick up where it stopped.
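For anyone calling the Anthropic API from their own scripts, the delay idea can be sketched roughly as follows. This is only an illustration: nothing in this thread indicates that Cursor exposes such a setting, and `MinIntervalThrottle`, its `min_interval` parameter, and the injectable `clock` and `sleep` hooks are hypothetical names chosen for the example.

```python
import time


class MinIntervalThrottle:
    """Blocks so that consecutive calls to wait() are at least
    `min_interval` seconds apart. Hypothetical helper, not a Cursor API."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without real waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous request: pause for the remainder.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` on an instance such as `MinIntervalThrottle(2.0)` before each request trades speed for fewer rate-limit interruptions, which matches the "slower but complete" preference above.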

Edit: I think this issue happens specifically when using your own Anthropic API key. I tested with the Cursor-provided key and the issue has not occurred so far.


I have exactly the same issue, and I confirm that it is really annoying.

I don’t have the issue with gpt-4o… but gpt-4o is far less efficient as an agent.

The current solution I found is to stop using my own Anthropic API key and switch to the Cursor-provided key, then purchase credit from Cursor directly. I believe they have higher rate limits, so I don’t see the issue anymore.


@level09 is correct: when using the API directly, your rate limits are much lower than Cursor’s own. We don’t enforce any limits on you ourselves.

You can now enable usage-based pricing in your Cursor dashboard, where you are charged $0.40 per Claude 3.5 request. This means you aren’t forced to buy a batch of 500, and instead pay only for what you use beyond the base 500 requests!


I think the exact issue lies on Anthropic’s side.
They have a rate limit of 40,000 input tokens per minute, and agentic flows send larger contexts across multiple requests.


Now I am using my org’s account, so I am not sure whether these limits can be changed.
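To put the 40,000 input-tokens-per-minute figure in perspective, here is a rough back-of-the-envelope calculation. The per-request context sizes are assumptions picked purely for illustration, not measurements of what Cursor actually sends, and the function names are hypothetical.

```python
def max_requests_per_minute(itpm_limit, tokens_per_request):
    """How many whole requests of a given input size fit under the limit."""
    return itpm_limit // tokens_per_request


def min_seconds_between_requests(itpm_limit, tokens_per_request):
    """Average spacing needed to stay under the limit over one minute."""
    return 60.0 * tokens_per_request / itpm_limit


ITPM_LIMIT = 40_000  # figure quoted in the post above

# Assumed input sizes per agent request (illustrative only).
for tokens in (5_000, 15_000, 30_000):
    print(
        tokens,
        max_requests_per_minute(ITPM_LIMIT, tokens),
        min_seconds_between_requests(ITPM_LIMIT, tokens),
    )
```

Under these assumptions, a 15,000-token context allows only two requests per minute, roughly one every 22.5 seconds, which an agent chaining several tool calls can easily exceed.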

@danperks Sorry to “re-open” this topic, and I realise that a lot of water has passed under this “rate limit” bridge, not least the changes to your pricing model. But was this a valid “solution” to @level09’s proposal, and is it still a solution?

There are other reasons for us to use our Model API keys, not least requirements for centralised monitoring and billing for aggregate model use and spend across different services using different APIs.

For example, I don’t want to hide my use of Anthropic Claude Models within Cursor; I want all my use of Anthropic Models to be visible in the Anthropic Console and on my Anthropic Bills.

I’m constantly getting hit with “User API Key Rate limit exceeded” when using my Anthropic API key. Looking at the Anthropic console, the Input Tokens Per Minute (ITPM) limit is 20,000 for Claude Sonnet 4. Looking at the usage for my API token over the last 2 days, I have ~5.6 million tokens in and 36 thousand tokens out. I cannot type very quickly, so it seems that Cursor is sending a great deal of context to Anthropic, and this rapidly exceeds the ITPM limit. Even the most (seemingly) trivial request in Cursor triggers the rate limit. It’s pretty much useless.
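As a rough illustration of why large contexts hit an ITPM limit so quickly, here is a minimal sketch of a client-side limiter that tracks input tokens over a sliding 60-second window. It is an approximation under stated assumptions: the provider's real accounting may differ, `TokenWindowLimiter` and its methods are hypothetical names, and token counts are taken as given rather than measured.

```python
import time
from collections import deque


class TokenWindowLimiter:
    """Tracks input tokens sent in the last 60 seconds (hypothetical helper)."""

    WINDOW = 60.0

    def __init__(self, tokens_per_minute, clock=time.monotonic):
        self.limit = tokens_per_minute
        self._clock = clock
        self._events = deque()  # (timestamp, tokens), oldest first
        self._used = 0

    def _expire(self, now):
        # Drop requests that have left the 60-second window.
        while self._events and now - self._events[0][0] >= self.WINDOW:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def seconds_until_allowed(self, tokens):
        """Seconds to wait before a request of `tokens` fits under the limit."""
        if tokens > self.limit:
            raise ValueError("a single request exceeds the per-minute limit")
        now = self._clock()
        self._expire(now)
        used = self._used
        if used + tokens <= self.limit:
            return 0.0
        # Walk oldest-first until enough earlier tokens would have expired.
        for timestamp, sent in self._events:
            used -= sent
            if used + tokens <= self.limit:
                return max(0.0, timestamp + self.WINDOW - now)
        return 0.0

    def record(self, tokens):
        """Register a request that was just sent."""
        now = self._clock()
        self._expire(now)
        self._events.append((now, tokens))
        self._used += tokens
```

With a 20,000-token limit, two back-to-back requests of 12,000 input tokens each cannot both fit in the same minute, so the second would have to wait until the first leaves the window.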

The simplest explanation seems to be the one from danperks: Cursor’s own API key(s) have much higher rate limits.

It would seem that the best solution for those who have other uses for Anthropic models is to purchase a separate API key for external use and use Cursor’s own key(s) for coding. Either that, or find a way to acquire an API key with sufficiently high rate limits, or drastically reduce the context sent with requests. I found that simply requesting a file conversion at the Claude 4 Sonnet prompt would quickly hit the rate limit, but if the file was explicitly added to the context before issuing the command, the rate limit was not hit. YMMV.