Why does Cursor consume an absurd amount of cache read tokens?

Just to call out another example: I have a repo open where the tool definitions, system prompt, and other information (rules and skills I’ve defined) take up ~47.5k tokens of context. No files from my repo are included in this starter context.

I just sent “hi”, nothing else. But because I’d been working in other chats in the same repo, the provider’s cache already had most of that prefix, so it shows up as 47,499 cache read tokens and only 171 input tokens. The cache is doing exactly what it should: avoiding re-processing tokens the provider has already seen.
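
To make the accounting concrete, here’s a minimal sketch of how a request’s tokens divide into cache reads versus fresh input under prefix caching. The numbers come from the example above; the split itself happens provider-side, so the function here is just illustrative:

```python
# Minimal sketch of how a provider splits one request's prompt into
# cache read tokens vs. fresh input tokens under prefix caching.
# Numbers are from the example above; the real accounting is provider-side.

def split_tokens(prompt_tokens: int, cached_prefix_tokens: int) -> tuple[int, int]:
    """Return (cache_read_tokens, input_tokens) for one request."""
    cache_read = min(prompt_tokens, cached_prefix_tokens)
    fresh_input = prompt_tokens - cache_read
    return cache_read, fresh_input

# Stable prefix: tool definitions, system prompt, rules, skills (~47.5k tokens).
PREFIX = 47_499

# Sending just "hi" adds a handful of new tokens on top of the cached prefix.
cache_read, fresh = split_tokens(prompt_tokens=PREFIX + 171, cached_prefix_tokens=PREFIX)
print(cache_read, fresh)  # 47499 171
```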

Imagine I submit this prompt:

> read files and then decide the next file to look at. Do this 10 times, and make sure you think in between.

No surprise: this session racked up a huge number of cache read tokens. It took 13 requests and eventually opened a file of ~17k tokens, which was then included in every subsequent request as cached tokens.
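
Here’s a rough simulation of how cache reads compound over a session like that. The per-step context growth and the step at which the large file is opened are assumptions for illustration, not measurements from Cursor:

```python
# Rough simulation of cache read growth across the 13-request session
# described above. STEP_TOKENS and BIG_FILE_STEP are assumed values:
# each step appends ~500 tokens of tool output, and one step pulls in
# a ~17k-token file that then rides along in every later request.

PREFIX = 47_499          # stable starter context from the example above
STEP_TOKENS = 500        # assumed per-step tool results
BIG_FILE_TOKENS = 17_000
BIG_FILE_STEP = 6        # assumed: the large file is opened mid-session

context = PREFIX
total_cache_reads = 0
for step in range(1, 14):  # 13 requests in the session
    # Everything already in context was seen by the provider on the
    # previous request, so it is billed as cache reads; only the newly
    # appended tokens count as fresh input.
    total_cache_reads += context
    context += STEP_TOKENS
    if step == BIG_FILE_STEP:
        context += BIG_FILE_TOKENS

print(f"cache reads across session: {total_cache_reads:,}")
# ~775k cache reads, even though fresh input per request stays small.
```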

One factor that may contribute to the perception of higher cache token usage is that models and our agent harness have improved at sustained, multi-step work. A single message now often triggers 10+ LLM calls autonomously, rather than 3-4. The total work (and tokens) is similar to what multiple shorter turns would have consumed, just rolled up into one line item.
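
A quick sketch of that last point, assuming a fixed per-call context growth: twelve LLM calls cost the same total prompt tokens whether they run as one autonomous turn or as four shorter turns of three calls each. Only the per-message attribution changes:

```python
# Sketch of the "rolled up into one line item" point. The per-call
# context growth is an assumption for illustration.

PREFIX = 47_499
STEP = 500  # assumed tokens added to context per call

def tokens_for_calls(n_calls: int, start_context: int) -> tuple[int, int]:
    """Total prompt tokens over n calls, plus the final context size."""
    total, context = 0, start_context
    for _ in range(n_calls):
        total += context   # whole context is re-sent (mostly as cache reads)
        context += STEP
    return total, context

# One autonomous turn of 12 calls.
one_turn, _ = tokens_for_calls(12, PREFIX)

# Four user turns of 3 calls each, carrying the context forward.
four_turns, context = 0, PREFIX
for _ in range(4):
    spent, context = tokens_for_calls(3, context)
    four_turns += spent

print(one_turn, four_turns)  # identical totals, different line items
```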