Rate limit is hit after a few requests

Describe the Bug

The rate limit is hit after only 3-4 requests.

Steps to Reproduce

Just ask the AI to do anything in any model (I tested the o3 and Sonnet models) and you will see that requests use more than 300k tokens.

Expected Behavior

It should just work the way it did before.

Screenshots / Screen Recordings

Operating System

Windows 10/11

Current Cursor Version (Menu → About Cursor → Copy)

Cursor: v1.2.0 - v1.2.1
VSCode: 1.99.1

Additional Information

I tested on 1.2.0 and 1.2.1; the problem is the same on both.
Other people have also written about it, but I haven't seen an answer to the problem.

My files are not very large. I used Sonnet 4 Thinking yesterday with Max mode and everything worked perfectly.

Does this stop you from using Cursor

Yes - Cursor is unusable


Hi @FlamesONE, the screenshot shows a lot of tokens being used: an average of 500k tokens per request, up to 2M, which does suggest heavier usage even if smaller files are being handled.

The Tokens feature at the top right breaks down usage into 4 types of tokens, which can be matched against AI providers' API pricing.
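As a rough illustration of how those 4 token types map to cost, here is a small sketch. The per-million-token rates below are hypothetical placeholders, not Cursor's or any provider's actual pricing:

```python
# Rough cost estimate from the four token categories shown in the Tokens view.
# Rates are HYPOTHETICAL per-million-token prices, for illustration only.
RATES = {
    "input": 3.00,        # fresh, uncached input tokens
    "output": 15.00,      # generated output tokens
    "cache_write": 3.75,  # tokens written to the provider's prompt cache
    "cache_read": 0.30,   # tokens served from the cache (cheapest)
}

def estimate_cost(tokens: dict) -> float:
    """Return the estimated cost in USD for one request's token counts."""
    return sum(tokens[k] / 1_000_000 * RATES[k] for k in RATES)

# Example: a single 500k-token request, mostly cache reads
print(estimate_cost({
    "input": 50_000,
    "output": 5_000,
    "cache_write": 45_000,
    "cache_read": 400_000,
}))
```

Note how a large cache-read share keeps the cost of a nominally huge request relatively low compared to the same volume of fresh input tokens.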

While I have no insight into your other usage details or yesterday's usage, I suggest:

  • Have a look at what can be done with less intensive models or Auto, as this greatly reduces your plan consumption.
  • Switch to Sonnet 4 when more complex tasks are needed.
  • Check how much context is really necessary, as a high token count consumes your usage quickly.

Could you share a bit about what you are working on (what kind of files, and other details that could shed light on why so many tokens are needed)?

A PHP file with ~190 lines.
(screenshot)

That was just one request. I suppose after around 5 of the same requests I will hit the rate limit.

Other people have also written about this there.

Please try using it first and you will see the problem.

OK, while this does not show all possible causes of high context, I can pinpoint a few things I would do differently:

  • Consider not attaching the file; it's not necessary, as the AI can find and read it anyway.
  • The AI had to search the codebase, but the issue details were not provided; this causes more reads and more tokens to be processed unnecessarily.

What I do:

  • Give a clear error description, with the relevant lines from the error log or the browser issue.

Why it helps:

  • The AI doesn't have to search through files and consider all possible usages of the file, its functions, and how they are called.

@FlamesONE I am using it daily for my own work, also with PHP, and I'm not having the issue though.

The AI uses tool calls to find files, so for me the easier way is to just attach the needed files.

I don't quite understand. Which error? This is an error on Cursor's servers, not mine.

A few days ago I used Sonnet 4 Thinking to edit a file with more than 6,000 lines and spent around 20 requests; all was fine. Now something has changed, and I'm not alone: a lot of people are already facing the same issue.

Please ask the Cursor team what they changed, and why the models are eating so many tokens and hitting the limiter much earlier than before.

I'm talking about the error you asked the AI to fix. The convenience of not providing bug details to the AI costs you a lot of tokens. This is not an AI issue.

I just asked that as an example, to show how many tokens the AI used.

Yes, I suggest looking at the Tokens view (top right in your screenshot).
It can tell you much more about your usage and where the cost comes from.

This does depend a lot on how you use the AI.


I separated 3 and 4 July with a line.

I can't show you how many tool calls it used, because after the update Cursor cleared all my history…

Was anything different between 3 and 4 July? I don't have insight into your chats. Depending on whether you have privacy mode on, Cursor may also not be able to see what happened in those chats.

Yes, on 3 July I used files with a lot of lines of code (more than 1,000) and all was good.
Today, apparently, the situation is worse.

I know about Cursor's privacy mode; I was just mentioning a second bug (which is why I can't share what happened there) :face_with_diagonal_mouth:


From what I see, there was no such heavy before/after change in my own usage. If the issue persists, I would suggest making one request with privacy off and posting the Request ID here, so the Cursor team can check whether there are technical issues that could be improved.

Please refer to the timeline details on this bug here:

(screenshot)
953fba50-d106-4b82-aafd-2cf50f3bb762

I think the problem is in how Cursor's new context handling works.
Because my project is not small, it takes a lot more tokens than before, but in my opinion it works the same as before :person_shrugging:


Hey @condor
I have a question. I just created a blank project and ran Taskmaster AI to generate rules automatically, but I don't understand what "cache read" means. Also, I've deleted all my old projects from the workspace, but is the cache still there?

It seems like my quota evaporates after just 5 questions; that's crazy. I only asked 5 times today using Claude 4 Thinking.


Cache Read is the part of the chat session that is already cached at the API provider. This avoids repeated processing, as that content has already been ingested for the session; the provider reads it from the cache and uses it as context for the response. Cache Read is the cheapest of all 4 columns.
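To make the "Cache Read is cheapest" point concrete, here is a small sketch comparing the cost of re-sending a large session context as fresh input versus reading it from the provider's prompt cache. The rates are hypothetical, for illustration only:

```python
# HYPOTHETICAL per-million-token rates (illustration only, not real pricing)
INPUT_RATE = 3.00       # fresh, uncached input tokens
CACHE_READ_RATE = 0.30  # tokens replayed from the provider's prompt cache

def followup_cost(context_tokens: int, cached: bool) -> float:
    """Cost of re-processing the session context on a follow-up request."""
    rate = CACHE_READ_RATE if cached else INPUT_RATE
    return context_tokens / 1_000_000 * rate

context = 400_000  # a long chat session's accumulated context
print(followup_cost(context, cached=False))  # priced as fresh input
print(followup_cost(context, cached=True))   # much cheaper via cache read
```

Under these assumed rates, the cached follow-up costs a tenth of the uncached one, which is why a large Cache Read column is far less alarming than the same number of plain input tokens.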

I do not think this is from previous sessions. Could you confirm whether the new project has any relation to the old projects, and whether there is any other project in Cursor that might have used similar code or still has files indexed?
