Cursor high token usage

Why has Cursor recently been consuming an absurd amount of tokens? Each API request in my usage dashboard is 500k-1M tokens, 90% of which is cache write/read. I work with large open source projects, but in every request I send I make sure to attach the relevant files, both the context files and the files to edit/debug. Is Cursor sending the whole project as cache write to the AI providers? This is getting prohibitively expensive and it’s driving me back to the old ways of using claude.ai or chatgpt.com for AI-assisted coding.


Do you have any MCP servers enabled?

No.

Hey, good question. No, Cursor doesn’t send your whole project. But in Agent mode, every tool call and every follow-up message is a separate API call, and each one has to include the full chat history, the system prompt, tool call results, and so on. When you’re working with a big open source project, that context grows fast.
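To make that concrete, here is a rough back-of-the-envelope sketch. All the numbers are made up for illustration (they are not Cursor’s actual figures); it only shows how resending a growing history on every tool call adds up over a single agent task:

```python
# Rough sketch (made-up numbers) of how an agent session's token usage grows
# when every tool call resends the full conversation so far.

system_prompt = 3_000      # tokens, assumed
attached_files = 40_000    # tokens, assumed for a few large source files
per_step_output = 1_500    # tokens added each step (reply + tool result), assumed

context = system_prompt + attached_files
total_input_tokens = 0

for step in range(1, 11):           # 10 tool calls / follow-ups in one task
    total_input_tokens += context   # the whole history is sent again each step
    context += per_step_output      # and the history keeps growing

print(f"context at end of task: {context:,} tokens")          # ~58,000
print(f"total input tokens billed across the session: {total_input_tokens:,}")  # ~497,500
# Even a modest 10-step agent task can bill roughly half a million input
# tokens, although the attached files were only ~40k to begin with.
```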

About cache write and cache read: that’s prompt caching from the provider (Anthropic or Google), and it reduces costs rather than increasing them. Cache reads are about 10x cheaper than normal input tokens. Without caching, those same tokens would be billed as full input, and the cost would be much higher.
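Here is a quick worked example of why the cache traffic is a discount rather than an extra charge. The prices below are assumptions, roughly in line with published Claude Sonnet rates; check your provider’s pricing page for the real numbers:

```python
# Illustrative cost comparison (prices are assumptions, roughly in the ballpark
# of published Claude Sonnet rates; not authoritative).
PRICE_INPUT       = 3.00 / 1_000_000  # $/token, normal uncached input
PRICE_CACHE_WRITE = 3.75 / 1_000_000  # $/token, writing tokens into the cache
PRICE_CACHE_READ  = 0.30 / 1_000_000  # $/token, re-reading cached tokens (~10x cheaper than input)

# One agent request of ~500k tokens, ~90% of it cache traffic as in the post above.
cache_write = 50_000    # new context written to the cache this step
cache_read  = 400_000   # previously cached context re-read this step
fresh_input = 50_000    # genuinely new, uncached input

with_caching = (cache_write * PRICE_CACHE_WRITE
                + cache_read * PRICE_CACHE_READ
                + fresh_input * PRICE_INPUT)
without_caching = (cache_write + cache_read + fresh_input) * PRICE_INPUT

print(f"with caching:    ${with_caching:.2f}")     # ~$0.46
print(f"without caching: ${without_caching:.2f}")  # ~$1.50
```

So the cache read/write lines in the dashboard are the cheaper way of paying for context that would otherwise be billed as full-price input.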

A detailed explanation from our team with example calculations is here: Someone please explain - Why are cache read and write chargeable? - #8 by condor

A few tips to reduce token usage:

  • Start a new chat for each new task; long chats keep building up context
  • Only attach the files you actually need in context
  • For simple tasks, use cheaper models
  • If you’re in Agent mode, keep an eye on tool calls; each one resends the full context

Which model and which mode are you using, Agent or Ask? That’ll help me give more specific advice.

I’m observing a similar thing. Previously, a monthly subscription was easily enough for me. Last time the monthly quota ran out in about 2.5 weeks; now 40% of the tokens are gone in just two and a half days. I’m in Auto mode, and my usage pattern is roughly the same.

I thought the issue might be with MCP and the rules, so I deleted most of the rules and disabled all MCPs, but it’s still the same: even a relatively minor code-change request (taking about 0.5 to 1 minute to process) consumes anywhere from 1M to 4–6M tokens. Only the most minor requests (like asking a question) use less than 1M.
