I swear Cursor has NEVER used this many tokens per request. I've been using Cursor for a long time now, usually under wayyyyyy heavier workloads, and I have never gotten even close to $5 of usage in less than a day. While tracking my usage I've also noticed that sometimes, even when the model does 0 tool calls, it isn't the cached read tokens that blow up but the "Input" tokens. How does this make sense?

I asked it how I'd package the extension I'm building for distribution. My input was 113 tokens and the output was 900 tokens according to tokencounter, yet the dashboard claims I had 92k input tokens and 4k output tokens and "charged" me 16 cents. This does not make any sense, as there were 0 tool calls. After doing some research, I can see many people are hitting their monthly limit extremely quickly, so this is just very weird.
I just don't understand how a request with 0 tool calls, a 113-token input, and a 900-token output shows up in the usage dashboard as 92k input tokens and "16 cents" of usage.
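Just to show why the 16 cents lines up with the 92k figure (and not with what I actually sent), here's a quick back-of-the-envelope calculation. The per-token rates are an assumption based on the published GPT-5 API list prices; Cursor's actual rates or margins may differ.

```python
# Sanity check on the billing math. Rates below are ASSUMED from OpenAI's
# published GPT-5 API list prices ($1.25 per 1M input tokens, $10 per 1M
# output tokens); Cursor's real pricing may differ.

INPUT_RATE = 1.25 / 1_000_000    # assumed dollars per input token
OUTPUT_RATE = 10.00 / 1_000_000  # assumed dollars per output token

def cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated charge in dollars for a single request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# What the dashboard claims: 92k input + 4k output
print(f"dashboard numbers: ${cost(92_000, 4_000):.3f}")  # ~$0.155, i.e. the "16 cents"

# What I actually sent/received: 113 input + 900 output
print(f"my numbers:        ${cost(113, 900):.4f}")       # ~$0.0091, well under a cent
```

So the charge itself is consistent with the 92k input count; the question is where those 92k tokens are supposed to have come from.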
This was GPT-5-High. What doesn't make sense to me is that the only way I can imagine the tokens SOMEHOW getting that high is if the model did a ton of thinking and that thinking got factored into the input token count. But here's the thing: on all the requests I have that show 1-2.1 million cache read tokens, where it literally spent like 5 minutes thinking before it gave an output, the "input tokens" are only around 5k. So if the model can think for that long in those requests and still only have 5k input tokens, how can a model that thinks for 1 minute, does no tool calling, gets a 113-token input, and outputs 900 tokens worth of text end up with a 92k input token count?
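For what it's worth, in the raw OpenAI API the thinking/reasoning tokens are reported on the completion (output) side of the usage object, not the prompt (input) side. The field names below are from OpenAI's Chat Completions usage object; the actual numbers are hypothetical, and I obviously can't see how Cursor maps any of this to its dashboard, so treat that mapping as an assumption.

```python
# Rough sketch of how the raw OpenAI API reports usage. Field names are from
# the Chat Completions usage object; the values are HYPOTHETICAL, just to show
# which bucket thinking would land in.

example_usage = {
    "prompt_tokens": 113,                 # what I actually sent (the "Input" side)
    "completion_tokens": 4_000,           # visible output PLUS hidden reasoning
    "prompt_tokens_details": {
        "cached_tokens": 0,               # cache reads come out of the prompt side
    },
    "completion_tokens_details": {
        "reasoning_tokens": 3_100,        # hypothetical: thinking is counted as OUTPUT here
    },
}

# If thinking were the explanation, it should inflate completion_tokens,
# not prompt_tokens -- which is exactly why the 92k "input" figure is confusing.
visible_output = (example_usage["completion_tokens"]
                  - example_usage["completion_tokens_details"]["reasoning_tokens"])
print(visible_output)  # ~900, roughly what tokencounter measured for the visible reply
```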