Hey, let’s clear this up.
That 6.3M tokens-per-request number in the usage dashboard isn't what went into the context window in a single model call. Composer 2 has a 200k token window, and each individual server-side LLM call stays within that limit. The dashboard sums input + output + cache_read + cache_write across all turns in one agent conversation. On a long agent run with dozens of tool calls, the numbers add up turn over turn, because each later turn re-sends the growing history, and most of that ends up coming from cache.
Important: cache read tokens are billed, but at a much lower rate, around 10% of normal input. So 6M cache reads and 6M fresh input are very different in cost. Colin broke down the numbers in the thread "Why does Cursor consume an absurd amount of cache read tokens?" (post #24).
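To make the arithmetic concrete, here is a rough sketch of how a per-conversation total can reach millions of tokens while every single call stays under 200k. The turn count and per-turn sizes are made-up numbers, and the caching model is simplified (all prior history counts as a cache read, cache writes are ignored), so treat it as an illustration rather than how Cursor actually meters usage.

```python
CONTEXT_LIMIT = 200_000      # per-call window mentioned above
CACHE_READ_DISCOUNT = 0.10   # "around 10% of normal input", an approximation


def simulate_run(turns, base_prompt, growth_per_turn, output_per_turn):
    """Sum tokens across all turns, the way a per-conversation total would."""
    history = 0  # tokens already seen by earlier calls in this conversation
    totals = {"input": 0, "cache_read": 0, "output": 0}
    largest_call = 0
    for turn in range(turns):
        # Only the new material (first prompt, then tool results) is fresh input.
        fresh = base_prompt if turn == 0 else growth_per_turn
        call_size = history + fresh
        largest_call = max(largest_call, call_size)
        # Simplifying assumption: the whole prior history is served from cache.
        totals["cache_read"] += history
        totals["input"] += fresh
        totals["output"] += output_per_turn
        history = call_size + output_per_turn
    return totals, largest_call


def input_equivalent(totals):
    """Input-side tokens weighted by the cache discount (output priced separately)."""
    return totals["input"] + totals["cache_read"] * CACHE_READ_DISCOUNT


# Hypothetical run: 60 turns, 20k initial prompt, 2k new tokens and 500 output per turn.
totals, largest_call = simulate_run(
    turns=60, base_prompt=20_000, growth_per_turn=2_000, output_per_turn=500
)
print("summed across turns:", sum(totals.values()))
print("largest single call:", largest_call)
print("input-equivalent tokens:", round(input_equivalent(totals)))
```

With these assumed numbers the summed total lands around 5.66M tokens, while the largest single call is 167,500 tokens, still inside the 200k window. Nearly all of the total is cache reads.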
On your questions:
- You can’t directly change the Composer context window limit. It’s fixed at 200k. But you can control what gets included in it.
- Re-uploading millions of tokens between turns is basically cache reads. Technically they’re re-sent, but in practice they’re pulled from the provider’s prompt cache, so the price is much lower than full input.
- Indexing and context timing for a large CAD project:
  - Use `.cursorignore` for heavy vendor libs, generated files, geometry assets, binaries. Syntax is like `.gitignore`. Docs: "Ignore files" in the Cursor Docs.
  - Start a new chat for each unrelated task. Long history means more cache reads turn over turn.
  - Use targeted `@file` or `@folder` mentions instead of letting the agent roam the tree. `codebase_search` pulls relevant chunks, but on huge projects it can expand more than you want.
  - For routine edits, try composer-2 without `-fast`. It's cheaper, and the fast variant costs a lot more because it's optimized for speed.
  - Use Plan mode before running. First scope what to touch, then execute.
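As a starting point for the ignore file, something like the sketch below might work. Every path and extension here is a guess at what a CAD project could contain, not something taken from your repo, so adjust it to your actual layout.

```
# .cursorignore (same pattern syntax as .gitignore)
# All entries below are hypothetical examples.

# Vendored and generated code
vendor/
third_party/
build/

# Geometry assets and binaries
*.step
*.stl
*.iges
*.bin
```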
If you open that specific 6.3M request in the dashboard, you should see a breakdown by token type (input vs cache_read vs output). That'll show how much of it was billed at the cheaper rate.