Just boosted my sub to Ultra, because I want to drive fast, not be the “mechanic” in the car and race driver analogy. What I noticed: on two relatively short-horizon requests it fed Claudey 1M tokens. You can see the previous ones on my older, limited Pro account were all around 150k tokens. That's way more, and both seem over the top in input and output tokens? My own input was just two lines of English: “On this phone number field, can we add validation for US phone numbers so that it puts hyphenation on it or we use Zod and it knows the exact number of digits?” And that is Claudey 4 Sonnet, not maximus…
I just noticed my usage dash used to say “included in Ultra” and now says “unknown”. Wonder what's up with that.
Totally scamming users
has to be a bug. that token usage is not even close to reasonable
The token count here should be correct. It includes cache read tokens: at every tool call, all prior input tokens in the chat are read back from the cache and counted again.
Cached input tokens are 10x cheaper than input tokens. If you hover over the token count, it should give you a breakdown by token type.
The unknown text is a bug; we’ll get that fixed.
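To make the "10x cheaper" point concrete, here is a rough sketch of how a raw token count could be converted into an "input-equivalent" count. It only uses the 10x ratio stated above; the function name and the example breakdown are made up for illustration, it ignores output tokens and cache writes, and it is not Cursor's actual billing formula.

```python
def input_equivalent_tokens(
    input_tokens: int,
    cache_read_tokens: int,
    cache_discount: float = 10.0,  # "10x cheaper", as stated in the post above
) -> float:
    """Weight cache reads at 1/cache_discount of a regular input token.

    Assumption: only input and cache-read tokens are considered; output
    tokens and cache writes are left out of this sketch.
    """
    return input_tokens + cache_read_tokens / cache_discount


# Hypothetical breakdown of a request that shows "1M tokens" in the dashboard.
raw_total = 50_000 + 950_000
weighted = input_equivalent_tokens(input_tokens=50_000, cache_read_tokens=950_000)
print(f"raw: {raw_total:,} tokens, input-equivalent: {weighted:,.0f} tokens")
```

With that invented breakdown, a 1M-token request costs about as much as 145k regular input tokens, which is why the hover breakdown matters more than the headline number.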
can’t say that was the part that concerned me, but appreciate it
Nah, token usage has suddenly gotten pretty wild…
This is on the same chat, but it was ticking along at 40-50k and then suddenly jumped into the multi-millions!??
The longer a chat grows, the more the cache read tokens will grow. E.g. if there are 100k tokens in the chat context and it makes 10 tool calls, then that's at least 1 million cache read tokens (100k re-read on each of the 10 calls).
Cache read tokens are 10x cheaper than the input tokens though
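The arithmetic in that example can be sketched in a few lines. This is only a toy model of the explanation above: the function name and the `growth_per_call` parameter are hypothetical, and it assumes the whole context is re-read from the cache once per tool call.

```python
def total_cache_reads(context_tokens: int, tool_calls: int, growth_per_call: int = 0) -> int:
    """Sum the cache read tokens across a run of tool calls.

    Assumption: every tool call re-reads the full current context from the
    cache, and each call's result adds growth_per_call tokens to the context.
    """
    total = 0
    context = context_tokens
    for _ in range(tool_calls):
        total += context            # the whole context is read again on this call
        context += growth_per_call  # tool output makes the next read larger
    return total


# The example from the post: 100k context, 10 tool calls.
print(total_cache_reads(100_000, 10))                          # 1000000
# Same chat, but each tool result adds 2k tokens to the context.
print(total_cache_reads(100_000, 10, growth_per_call=2_000))   # 1090000
```

The "at least" in the post comes from the second case: tool results keep enlarging the context, so the real total is higher than context size times number of calls.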
I feel like this needs to be much more clearly documented. This is the first response I've seen that clearly illustrates what's happening. You may want to post this in the trending thread where people were asking how this happens.
Link to thread in question: Frustrated with Cursor’s Sudden Token Drain and Access Restrictions - #43 by Jaeder
That being said, I don’t really know if every user really wants Cursor to go through millions of tokens just to contextualize their codebase. If you could find ways to truncate context usage, or introduce customization when it comes to having a maximum amount of tokens used per prompt, or stage in the request, that would be great.
I’m sure power users would appreciate having a new useful tool to play around with.
When not using MAX mode, Cursor should intelligently build the final prompt to the AI in a more economical way, which should mean the requests are somewhat uniform in size and therefore more predictable in their usage!
With MAX enabled, Cursor will quite happily fill the model’s context window which is where things can really add up!
How do I disable cached tokens? I want to work the old-fashioned way, like on March 1st! Maybe you need money, but I also need my money and don’t need your cached tokens.
This is literally an architectural feature of the Chat, which reduces your expenses. If you want to see what happens if you turn it off, use o3-Pro.
I just upgraded to the Ultra plan and have already consumed approximately 20% of my monthly quota in less than half a day of regular work. At this rate, I will exhaust the entire monthly quota within 3 to 4 days.
This level of usage makes the platform the most expensive solution I’ve encountered, and unfortunately, it’s not sustainable. To continue working as I did just yesterday, I would now need to spend over $1,500 USD per month.
I’m looking for alternative solutions; even Windsurf costs way less than this.
You can try to use my balanced strategy and give me feedback about it
The auto mode is just plain stupid. I can't rely on a model that "fixes" stuff by just adding ONE comment. The auto mode breaks stuff more than it helps. The photo is just a simple example I did to show you how it normally works.
Also try to use Agent Compass…
You should also give him a task, and not just state what is happening. Especially with silly models like Auto.
For real? I'm thinking about upgrading to the Ultra plan because my Pro plan usage ran out in just 3 days xd. I'm wondering whether Ultra would last me the whole month or run out in 3 days just like Pro.