Sent once, but recorded many times in usage: is this a bug or incorrect behavior?
They do some insane imaginary token caching for each tool call.
Switch to gemini-2.5-pro. Gemini does it all in one pass, while each claude-4 tool call triggers a full usage charge.
Claude-4 and even 3.7 are simply not sustainable with this dumb pricing model they just introduced.
@Dream you can find the detailed breakdown of each record by using the view option “Tokens” at the top right of Dashboard > Usage.
It will show the following 4 columns:
- Input tokens (prompt, rules, attached files, MCP tools, tools, …)
- Output tokens (chat text, code, tool calls, MCP tool calls, …)
- Cache Write (writing the chat session to the cache so it can be reused in your next request on the same thread)
- Cache Read (reading that session cache; the cheapest token of all 4, since it no longer requires pre-processing)
API providers have the pricing on their pages (add 20% on Cursor's side, as per the documentation).
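As a rough reference, here is a minimal sketch of how one record's cost can be estimated from those four columns. The per-million-token rates are Anthropic's published Claude Sonnet 4 prices (cache write = 1.25x input, cache read = 0.1x input) and the 20% margin is from Cursor's docs, so treat both as assumptions that can change:

```python
# Rough cost estimate for one usage record (claude-4-sonnet).
# Rates are USD per million tokens, assumed from Anthropic's public pricing;
# check the provider's pricing page before relying on them.
RATES = {
    "input": 3.00,
    "output": 15.00,
    "cache_write": 3.75,  # 1.25x the input rate
    "cache_read": 0.30,   # 0.1x the input rate
}
CURSOR_MARGIN = 1.20  # +20% on top of API cost, per Cursor's documentation


def estimate_cost(input_t, output_t, cache_write_t, cache_read_t):
    """Return the estimated charge in USD for one usage record."""
    api_cost = (
        input_t * RATES["input"]
        + output_t * RATES["output"]
        + cache_write_t * RATES["cache_write"]
        + cache_read_t * RATES["cache_read"]
    ) / 1_000_000
    return api_cost * CURSOR_MARGIN
```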
But what is up with the insane amount of cache Cursor uses?
token-based usage calls to claude-4-sonnet, totalling: $2.62. Input tokens: 1120, Output tokens: 10751, Cache write tokens: 117388, Cache read tokens: 5252615
5 million tokens in cache read, for 1120 input / 10751 output tokens? What happened, did it memorize the whole of Wikipedia in the cache?
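For what it's worth, plugging those numbers into the assumed rates from the sketch above roughly reproduces the charge, so the total looks dominated by the 5M cache read tokens rather than by a counting bug:

```python
# Numbers from the record quoted above, at the assumed Sonnet 4 rates
# ($3 input, $15 output, $3.75 cache write, $0.30 cache read per MTok),
# plus Cursor's documented 20% margin.
api_cost = (1_120 * 3 + 10_751 * 15 + 117_388 * 3.75 + 5_252_615 * 0.30) / 1_000_000
print(round(api_cost * 1.20, 2))  # ~2.62; cache reads alone are ~$1.58 of the raw API cost
```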
Cursor is exaggerating things, making you think they're trying hard rather than going too far.
Every tool call includes all prior tokens as cached tokens. E.g., if there were 120k cache write tokens, roughly 42 tool calls reusing those cached tokens would result in 5M cache read tokens (for Anthropic models, cache read tokens are 10x cheaper than input tokens).
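A toy sketch of why it multiplies like that in an agentic session: every tool call sends the whole conversation again, and whatever is already cached gets billed again as a cache read. The 120k prefix and 42 calls below are just the rough numbers from this thread, not anything measured:

```python
# Toy model: an agent loop where each tool call re-reads the cached prefix.
cached_prefix = 120_000  # tokens cache-written early in the session (assumed)
tool_calls = 42          # rough number of tool calls in the session (assumed)

cache_read_total = 0
for _ in range(tool_calls):
    # Each request replays the full cached context as cache-read tokens;
    # new tool output would also be appended and cache-written here.
    cache_read_total += cached_prefix

print(cache_read_total)                     # 5,040,000 cache read tokens
print(cache_read_total * 0.30 / 1_000_000)  # ~$1.51 at $0.30/MTok (10x cheaper than input)
```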
Cache write (or cache creation) tokens are generally the user input, the files a tool call reads for the first time, and so on.
Input tokens are probably the small bits of input that Cursor or Claude decides are not worth caching; just my guess.
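If you want to see these four counters outside Cursor, here is a minimal sketch of a direct Anthropic API call with prompt caching, assuming the official anthropic Python SDK; the model id and prompt text are placeholders. The usage object on the response carries the same four numbers the dashboard aggregates:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model id
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "Long project rules / attached files would go here...",
            # Mark the prefix as cacheable: billed as cache write the first
            # time, then as (much cheaper) cache read on follow-up requests.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize the project rules."}],
)

u = response.usage
print(u.input_tokens, u.output_tokens,
      u.cache_creation_input_tokens, u.cache_read_input_tokens)
```

On the first request the marked prefix shows up as cache write tokens; on a follow-up request within the cache lifetime it comes back as cache read tokens instead.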