Cursor high token usage

deanrie · April 9, 2026, 4:23pm

Hey, good question. No, Cursor doesn’t send your whole project. But in Agent mode, every tool call and every follow-up message is a separate API call, and each one has to include the full chat history, the system prompt, earlier tool call results, and so on. When you’re working with a big open source project, that context grows fast.
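To make the growth concrete, here is a rough sketch of how input tokens add up when every call resends the whole context. The numbers (a 5,000-token starting context, 2,000 tokens added per tool call, 20 calls) are made up for illustration and are not measurements from Cursor.

```python
def total_input_tokens(start_context: int, growth_per_call: int, calls: int) -> int:
    """Sum the input tokens sent over `calls` API calls when each call
    resends the full context and the context grows after every call."""
    total = 0
    context = start_context
    for _ in range(calls):
        total += context            # the whole context is sent again
        context += growth_per_call  # tool result or reply is appended
    return total


# Hypothetical session: 5,000-token start, +2,000 tokens per call, 20 calls.
print(total_input_tokens(5_000, 2_000, 20))  # 480000
```

In this made-up session the context never exceeds 43,000 tokens, yet 480,000 input tokens are processed in total, because the earlier part of the conversation is sent again on every call.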

Cache write and cache read come from the provider’s prompt caching (Anthropic or Google), and caching actually lowers your cost rather than raising it. Cache read is about 10x cheaper than normal input tokens. Without caching, those same tokens would be billed as full-price input, and the cost would be much higher.
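As a rough illustration of why caching saves money, the sketch below compares the same hypothetical session with and without caching. Everything numeric here is an assumption for the example: a base price of $3 per million input tokens, cache reads billed at 0.1x that price (matching the "about 10x cheaper" figure above), and cache writes billed at 1.25x. Check your provider's pricing page for real rates.

```python
def session_cost(start_context: int, growth_per_call: int, calls: int,
                 cached: bool, price_per_mtok: float = 3.0,
                 read_mult: float = 0.1, write_mult: float = 1.25) -> float:
    """Estimate the input cost in dollars for a session where each call
    resends the full, growing context. All prices are assumed values."""
    billed = 0.0       # cost measured in full-price input-token equivalents
    cached_prefix = 0  # tokens already stored in the cache
    context = start_context
    for _ in range(calls):
        if cached:
            new_tokens = context - cached_prefix
            billed += cached_prefix * read_mult  # old prefix: cache read
            billed += new_tokens * write_mult    # new part: cache write
            cached_prefix = context
        else:
            billed += context                    # everything at full price
        context += growth_per_call
    return billed * price_per_mtok / 1_000_000


without = session_cost(5_000, 2_000, 20, cached=False)
with_cache = session_cost(5_000, 2_000, 20, cached=True)
print(f"without caching: ${without:.2f}, with caching: ${with_cache:.2f}")
```

Under these assumptions the session costs about $1.44 without caching and about $0.29 with it. The cache read count looks large on the usage page because the same prefix is read on every call, but each of those tokens is billed at a fraction of the normal input price.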

A detailed explanation from our team with example calculations is here: Someone please explain - Why are cache read and write chargeable? - #8 by condor

A few tips to reduce token usage:

Start a new chat for each new task. Long chats build up context.
Only attach the files you actually need in context.
For simple tasks, use cheaper models.
If you’re in Agent mode, keep an eye on tool calls, since each one resends the full context.

Which model and which mode are you using, Agent or Ask? That’ll help me give more specific advice.
