Massive token increase on Ultra?

fusiondev · July 15, 2025, 9:07pm

I just upgraded my subscription to Ultra because I want to drive fast, not be the “mechanic” in the car-and-race-driver analogy. What I noticed is that on two relatively short-horizon requests it fed Claudey 1M tokens. The previous requests on my older, limited Pro account were all around 150k tokens. This is way more, and both requests seem over the top in input and output tokens. My own input was just two lines of English: “On this phone number field, can we add validation for US phone numbers so that it puts hyphenation on it or we use Zod and it knows the exact number of digits?” That is Claudey 4 Sonnet, not maximus…

banth31 · July 16, 2025, 4:03am

I just noticed my usage dashboard used to say “Included in Ultra” and now says “Unknown”. Wonder what's up with that.

gprethesh · July 16, 2025, 5:50am

Totally scamming users

banth31 · July 16, 2025, 5:55am

Has to be a bug. That token usage is not even close to reasonable.

Zack_Holbrook · July 16, 2025, 6:07am

The token count here should be correct. It includes cache read tokens, which are counted again for all prior input tokens at every tool call.

Cached input tokens are 10x cheaper than input tokens. If you hover over the token count, it should give you a breakdown by token type.
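
As a rough illustration of why the displayed total overstates the cost, the sketch below converts a token breakdown into input-token equivalents. It assumes only the 10x ratio stated above; the function name and the example breakdown are hypothetical, output tokens are ignored, and actual per-model pricing may differ.

```python
def input_equivalent_tokens(input_tokens: int,
                            cache_read_tokens: int,
                            cache_discount: int = 10) -> float:
    """Express a token breakdown in units of regular input tokens.

    Assumption: a cache read token costs 1/cache_discount of a regular
    input token (the 10x figure quoted in this thread).
    """
    return input_tokens + cache_read_tokens / cache_discount


# Hypothetical breakdown of a request that displays as 1M tokens:
# 50k fresh input tokens plus 950k cache read tokens.
displayed_total = 50_000 + 950_000
billed_like = input_equivalent_tokens(50_000, 950_000)
print(displayed_total, billed_like)
```

Under these assumed numbers, a request that displays as 1,000,000 tokens is priced like 145,000 regular input tokens.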

banth31 · July 16, 2025, 6:12am

you’re right. shame that this means 1 prompt was supposed to cost $3.40.

Zack_Holbrook · July 16, 2025, 6:17am

The unknown text is a bug; we’ll get that fixed.

banth31 · July 16, 2025, 6:22am

can’t say that was the part that concerned me, but appreciate it

Zaz · July 16, 2025, 6:32am

Nah, token usage has suddenly gotten pretty wild…

This is in the same chat. It was ticking along at 40-50k and then suddenly jumped into the multi-millions!?

Zack_Holbrook · July 16, 2025, 6:38am

The longer a chat grows, the more the cache read tokens will grow. E.g. if there are 100k tokens in the chat context and it makes 10 tool calls, then that's at least 1 million cache read tokens, because each tool call reads the whole context again.

Cache read tokens are 10x cheaper than input tokens, though.
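
The arithmetic above can be sketched as a short loop. This is a minimal model, not Cursor's actual accounting: it assumes every tool call re-reads the full context from cache, and the hypothetical `growth_per_call` parameter stands in for tool results being appended to the context, which is why the figure is "at least" 1 million.

```python
def total_cache_read_tokens(context_tokens: int,
                            tool_calls: int,
                            growth_per_call: int = 0) -> int:
    """Sum the cache read tokens across all tool calls in one request.

    Assumption: each tool call re-reads the entire context so far, and the
    context grows by `growth_per_call` tokens after every call.
    """
    total = 0
    for call in range(tool_calls):
        total += context_tokens + call * growth_per_call
    return total


# The example from the post: 100k tokens of context, 10 tool calls.
flat = total_cache_read_tokens(100_000, 10)

# Same request, but each tool result adds a hypothetical 5k tokens.
growing = total_cache_read_tokens(100_000, 10, growth_per_call=5_000)
print(flat, growing)
```

With a fixed context the loop gives exactly 1,000,000 cache read tokens; with the assumed 5k of growth per call it gives 1,225,000.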

Jaeder · July 16, 2025, 10:26am

I feel like this needs to be much more clearly documented. This is the first response I've seen that clearly illustrates what's happening. You may want to post this in the trending thread where people were asking how this happens.

Link to thread in question: Frustrated with Cursor’s Sudden Token Drain and Access Restrictions - #43 by Jaeder

That being said, I don’t really know if every user really wants Cursor to go through millions of tokens just to contextualize their codebase. If you could find ways to truncate context usage, or introduce customization when it comes to having a maximum amount of tokens used per prompt, or stage in the request, that would be great.

I’m sure power users would appreciate having a new useful tool to play around with.

danperks · July 16, 2025, 11:40am

When not using MAX mode, Cursor should intelligently build the final prompt to the AI in a more economical way, which should mean requests are somewhat uniform in size and therefore more predictable in their usage!

With MAX enabled, Cursor will quite happily fill the model’s context window which is where things can really add up!

Alxbrondj · July 16, 2025, 12:31pm

How do I disable cached tokens? I want to work the old-fashioned way, like on March 1st! Maybe you need money, but I also need my money and don't need your cached tokens.

cocode · July 16, 2025, 12:36pm

very interesting observation.

Artemonim · July 16, 2025, 2:06pm

This is literally an architectural feature of the Chat, which reduces your expenses. If you want to see what happens if you turn it off, use o3-Pro.

carlosmartinpavon · July 16, 2025, 8:27pm

I just upgraded to the Ultra plan and have already consumed approximately 20% of my monthly quota in less than half a day of regular work. At this rate, I will exhaust the entire monthly quota within 3 to 4 days.

This level of usage makes the platform the most expensive solution I’ve encountered, and unfortunately, it’s not sustainable. To continue working as I did just yesterday, I would now need to spend over $1,500 USD per month.

I’m looking for alternative solutions; even Windsurf costs way less than this.

Artemonim · July 16, 2025, 8:40pm

You can try to use my balanced strategy and give me feedback about it

carlosmartinpavon · July 16, 2025, 9:03pm

The Auto mode is just plain stupid. I can't rely on a model that “fixes” stuff by adding just ONE comment. Auto mode breaks stuff more than it helps. The photo is just a simple example I made to show you how it normally works.

Artemonim · July 16, 2025, 9:10pm

Also try to use Agent Compass…

You should also give it a task, not just state what is happening. Especially with silly models like Auto.

Dante · July 17, 2025, 1:16am

For real? I'm thinking of upgrading to the Ultra plan because my Pro plan usage ran out in just 3 days xd. I'm wondering whether the Ultra plan would last me the whole month or run out in 3 days just like the Pro plan.
