I’m trying to understand a sudden spike in my token usage when using Cursor in Auto mode, and I’m wondering if anyone has experienced something similar.
I’ve been using Cursor daily for several months, always in Auto mode, with very stable usage. Typically my prompts cost around $0.10–$0.20, sometimes a bit more for larger tasks.
Yesterday, however, I unexpectedly consumed the entire $20 usage limit allocated by my company in just a few hours, without doing anything unusual compared to my normal workflow.
After reviewing the usage dashboard, I noticed something unusual across the 18 prompts made on 09/03/2026:
17 prompts from yesterday show 0 cache read and 0 cache write
Previously (for example on March 5th) caching was working normally with significant cache usage
1 prompt shows about 4k cache read on ~584k total tokens, but every other request has no caching at all
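For anyone wanting to check their own dashboard for the same pattern, here is a minimal sketch that scans an exported usage CSV for requests with no caching at all. The column names (`Cache Read`, `Cache Write`) are assumptions on my part and may not match the actual export format, so adjust them to whatever your dashboard produces:

```python
import csv

def find_uncached(path):
    """Return rows where both cache read and cache write are zero.

    Column names ("Cache Read", "Cache Write") are assumptions;
    rename them to match the actual dashboard export.
    """
    flagged = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            cache_read = int(row.get("Cache Read", 0) or 0)
            cache_write = int(row.get("Cache Write", 0) or 0)
            if cache_read == 0 and cache_write == 0:
                flagged.append(row)
    return flagged
```

In my case this would flag 17 of the 18 rows from that day, versus essentially none from earlier days.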
To test this further, I tried the same prompt with different settings:
Forcing Claude Sonnet 4.6 → caching works normally (cache read/write visible)
Using Auto mode → no cache read/write appears
So my assumption is that either:
Auto mode started routing my requests to a model that does not support prompt caching, or
There might be a bug where caching is not applied when using Auto mode
One important detail: privacy mode is enforced by my company, so I cannot disable it even if needed for debugging. I understand this may limit the information available on your side, but I wanted to mention it in case it affects caching behavior.
Has anyone seen something similar recently, or could someone from the Cursor team help clarify what might be happening here?
Request ID: f9a7046a-279b-47e5-ab48-6e8dc12daba1
I am experiencing the exact same issue as described in this thread. My usage dashboard shows a massive spike in costs because prompt caching has suddenly stopped working when I use the Auto mode.
I’ve performed some tests and can confirm that the system is sending my entire codebase as fresh input every time I’m in Auto mode, while manual selection seems to handle the cache correctly.
Here is the detailed bug report and the evidence:
Where does the bug appear (feature/product)?
Cursor IDE
Describe the Bug
There is a major discrepancy in prompt caching behavior between “Auto” mode and manual model selection. When using Auto mode, the system consistently fails to use prompt caching (0 Cache Read / 0 Cache Write), forcing the entire context to be resent and billed as fresh Input tokens. When manually selecting a model (like Claude 4.5 Haiku), caching works perfectly.
Steps to Reproduce
Use the “Auto” model selector in a project with a large context.
Ask a question and check the dashboard: Cache Read is 0 and costs are high.
Switch to a manual model (e.g., Claude 4.5 Haiku).
Ask a question: Cache Read/Write is active and costs are normal.
Expected Behavior
Auto mode should utilize prompt caching to avoid redundant token billing, especially on large requests.
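To make the financial impact concrete, here is a rough cost comparison for a request the size of the ~584k-token one mentioned above. The rates are illustrative assumptions, not Cursor's actual pricing: I'm using Anthropic-style multipliers where cache reads bill at about 10% of the fresh-input price and cache writes at about 125%:

```python
# Illustrative assumptions only -- actual Cursor/Anthropic pricing may differ.
INPUT_PER_MTOK = 3.00      # assumed fresh-input price, $/million tokens
CACHE_READ_MULT = 0.10     # cache-read discount (assumption)
CACHE_WRITE_MULT = 1.25    # one-time cache-write surcharge (assumption)

def cost(fresh, cache_read=0, cache_write=0):
    """Input-side dollar cost of one request under the assumed rates."""
    return (fresh
            + cache_read * CACHE_READ_MULT
            + cache_write * CACHE_WRITE_MULT) * INPUT_PER_MTOK / 1_000_000

tokens = 584_000
uncached = cost(fresh=tokens)                   # Auto mode: everything billed fresh
cached = cost(fresh=4_000, cache_read=580_000)  # same context served from cache
print(f"uncached: ${uncached:.2f}, cached: ${cached:.2f}")
# prints "uncached: $1.75, cached: $0.19"
```

Under these assumptions a single large request costs roughly 9x more without caching, which is consistent with burning through a $20 limit in hours.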
Screenshots / Screen Recordings
1. Auto Mode (Bug): 0 Cache Read / High Input Costs
2. Auto mode (problematic) vs. Claude 4.5 Haiku (working)
Additional Information
My company enforces Privacy Mode. The Auto mode seems to bypass the caching mechanism entirely, which is a critical financial issue when working on large repositories.
The issue is still present three days later. The first occurrence was on March 9th, and as of March 12th, prompt caching is still not working for me when using Auto mode.
Since my original post, I tried several things to rule out local configuration or environment issues:
Tested on another repository with a different and smaller codebase
Reset the repository index
Deleted the local Cursor configuration and re-logged into my account
Reset my Cursor settings
Created new agents and new chats
Tested different modes (Ask / Plan / Agent)
Tested on another computer (macOS) using the same account
None of these tests changed the behavior.
The result is always the same:
Auto mode → 0 Cache Read / 0 Cache Write
Manual model selection (e.g. Claude Sonnet) → caching works normally
So the problem does not appear to be related to:
local machine configuration
repository indexing
project size
operating system
One additional note: Privacy Mode is enforced by my company, so I cannot disable it for testing.
At this point it seems very likely that the issue is related specifically to Auto mode routing, rather than a local configuration problem.
If anyone from the Cursor team is investigating this, I’m happy to provide additional details if needed.
This matches what we saw in a similar report, "Auto mode: Prompt caching not working". Version 2.6.12 routed requests differently, which made Auto mode prefer models that do not support Anthropic-style prompt caching. Version 2.6.18 and newer fixed this.
If caching drops again in Auto mode, grab a Request ID (top-right of the chat → Copy Request ID) and share it here so I can check the model routing for your request.
I’m marking this as solved based on your update. Let me know if anything changes.