Auto mode not using prompt caching (0 cache read/write) → sudden usage spike

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

Hi,

I’m trying to understand a sudden spike in my token usage when using Cursor in Auto mode, and I’m wondering if anyone has experienced something similar.

I’ve been using Cursor daily for several months, always in Auto mode, with very stable usage. Typically my prompts cost around $0.10–$0.20, sometimes a bit more for larger tasks.

Yesterday, however, I unexpectedly consumed the entire $20 usage limit allocated by my company in just a few hours, without doing anything unusual compared to my normal workflow.

After reviewing the usage dashboard, I noticed something unusual about the 18 prompts made on March 9th, 2026:

  • 17 prompts from yesterday show 0 cache read and 0 cache write
  • Previously (for example on March 5th) caching was working normally with significant cache usage
  • 1 prompt shows about 4k cache read on ~584k total tokens, but every other request has no caching at all (rough cost sketch below)
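
To put the spike in perspective, here is a back-of-envelope sketch of what caching normally saves. The rates are my assumptions based on published Anthropic Sonnet-class list prices (roughly $3 per million input tokens, with cache reads at ~0.1× that), not Cursor's actual billing:

```python
# Back-of-envelope only. Assumed Anthropic Sonnet-class list prices,
# not Cursor's actual rates: base input ~$3/MTok, cache reads ~0.1x base.
BASE_PER_TOKEN = 3.00 / 1_000_000
CACHE_READ_PER_TOKEN = 0.10 * BASE_PER_TOKEN

CONTEXT = 500_000   # large, mostly-stable context resent with each request
FRESH = 10_000      # the part that actually changes between requests

uncached = CONTEXT * BASE_PER_TOKEN
warm_cache = FRESH * BASE_PER_TOKEN + (CONTEXT - FRESH) * CACHE_READ_PER_TOKEN

print(f"no caching: ${uncached:.2f} per request")    # ~$1.50
print(f"warm cache: ${warm_cache:.2f} per request")  # ~$0.18
```

The warm-cache figure matches my usual $0.10–$0.20 per prompt; the uncached figure explains how a $20 limit can disappear in a few hours.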

To test this further, I tried the same prompt with different settings:

  • Forcing Claude Sonnet 4.6 → caching works normally (cache read/write visible)
  • Using Auto mode → no cache read/write appears

So my assumption is that either:

  1. Auto mode started routing my requests to a model that does not support prompt caching, or
  2. There might be a bug where caching is not applied when using Auto mode

One important detail: Privacy Mode is enforced by my company, so I cannot disable it, even for debugging. I understand this may limit the information available on your side, but I wanted to mention it in case it affects caching behavior.

Has anyone seen something similar recently, or could someone from the Cursor team help clarify what might be happening here?

Thanks!

Operating System

Linux (Windows WSL2)

Version Information

Version: 2.6.14 (system setup)
VSCode Version: 1.105.1
Commit: eb1c4e0702d201d1226d2a7afb25c501c2e56080
Date: 2026-03-08T15:36:54.709Z
Build Type: Stable
Release Track: Default
Electron: 39.6.0
Chromium: 142.0.7444.265
Node.js: 22.22.0
V8: 14.2.231.22-electron.0
OS: Windows_NT x64 10.0.26200

For AI issues: which model did you use?

Auto

Does this stop you from using Cursor

No - Cursor works, but with this issue

Hi there!

We detected that this may be a bug report, so we’ve moved your post to the Bug Reports category.

To help us investigate and fix this faster, could you edit your original post to include the details from the template below?

Bug Report Template

Where does the bug appear (feature/product)?

  • Cursor IDE
  • Cursor CLI
  • Background Agent (GitHub, Slack, Web, Linear)
  • BugBot
  • Somewhere else…

Describe the Bug
A clear and concise description of what the bug is.


Steps to Reproduce
How can you reproduce this bug? We have a much better chance at fixing issues if we can reproduce them!


Expected Behavior
What is meant to happen here that isn’t working correctly?


Screenshots / Screen Recordings
If applicable, attach images or videos (.jpg, .png, .gif, .mp4, .mov)


Operating System

  • Windows 10/11
  • MacOS
  • Linux

Version Information

  • For Cursor IDE: Menu → About Cursor → Copy
  • For Cursor CLI: Run agent about in your terminal
IDE:
Version: 2.xx.x
VSCode Version: 1.105.1
Commit: ......

CLI:
CLI Version 2026.01.17-d239e66

For AI issues: which model did you use?
Model name (e.g., Sonnet 4, Tab…)


For AI issues: add Request ID with privacy disabled
Request ID: f9a7046a-279b-47e5-ab48-6e8dc12daba1
For Background Agent issues, also post the ID: bc-…


Additional Information
Add any other context about the problem here.


Does this stop you from using Cursor?

  • Yes - Cursor is unusable
  • Sometimes - I can sometimes use Cursor
  • No - Cursor works, but with this issue

The more details you provide, the easier it is for us to reproduce and fix the issue. Thanks!

Hi,

I am experiencing the exact same issue as described in this thread. My usage dashboard shows a massive spike in costs because prompt caching has suddenly stopped working when I use the Auto mode.

I’ve performed some tests and can confirm that the system is sending my entire codebase as fresh input every time I’m in Auto mode, while manual selection seems to handle the cache correctly.

Here is the detailed bug report and the evidence:

Where does the bug appear (feature/product)?

  • Cursor IDE

Describe the Bug

There is a major discrepancy in prompt caching behavior between “Auto” mode and manual model selection. When using Auto mode, the system consistently fails to use prompt caching (0 Cache Read / 0 Cache Write), forcing the entire context to be resent and billed as fresh Input tokens. When manually selecting a model (like Claude 4.5 Haiku), caching works perfectly.
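
For context, this is roughly what "caching works" means at the API level: a minimal sketch using the Anthropic Python SDK, with a placeholder model id and context (Cursor's internals may of course differ):

```python
# Minimal sketch of Anthropic-style prompt caching. Model id and
# context are placeholders; this is not Cursor's implementation.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

resp = client.messages.create(
    model="claude-3-5-haiku-latest",
    max_tokens=512,
    system=[
        {"type": "text", "text": "You are a coding assistant."},
        {
            "type": "text",
            "text": "<large, stable codebase context goes here>",
            # Marks everything up to and including this block as cacheable.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Explain the build script."}],
)

u = resp.usage
# First request: cache_creation_input_tokens > 0 (a cache write).
# Repeats with the same prefix: cache_read_input_tokens > 0 (cheap reads).
# The bug here looks like both staying at 0 whenever Auto mode is used.
print(u.input_tokens, u.cache_creation_input_tokens, u.cache_read_input_tokens)
```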


Steps to Reproduce

  1. Use the “Auto” model selector in a project with a large context.

  2. Ask a question and check the dashboard: Cache Read is 0 and costs are high.

  3. Switch to a manual model (e.g., Claude 4.5 Haiku).

  4. Ask a question: Cache Read/Write is active and costs are normal.


Expected Behavior

Auto mode should utilize prompt caching to avoid redundant token billing, especially on large requests.


Screenshots / Screen Recordings

1. Auto Mode (Bug): 0 Cache Read / High Input Costs

2. Manual Selection (Normal): Cache working correctly


Operating System

  • Linux - Ubuntu 24.04

Version Information

IDE:
Version: 2.6.12
VSCode Version: 1.105.1
Commit: 1917e900a0c4b0111dc7975777cfff60853059d0
Date: 2026-03-04T21:41:18.914Z
Build Type: Stable
Release Track: Default
Electron: 39.6.0
Chromium: 142.0.7444.265
Node.js: 22.22.0
V8: 14.2.231.22-electron.0
OS: Linux x64 6.17.0-1011-oem


For AI issues: which model did you use?

Auto mode (problematic) vs. Claude 4.5 Haiku (working)


Additional Information

My company enforces Privacy Mode. Auto mode appears to bypass the caching mechanism entirely, which is a critical cost issue when working on large repositories.


Does this stop you from using Cursor?

  • Not yet - it will once I reach the spend limit

Thanks for your help

Update after additional testing:

The issue is still present three days later. The first occurrence was on March 9th, and as of March 12th, prompt caching is still not working for me when using Auto mode.

Since my original post, I tried several things to rule out local configuration or environment issues:

  • Tested on another repository with a different and smaller codebase

  • Reset the repository index

  • Deleted the local Cursor configuration and re-logged into my account

  • Reset my Cursor settings

  • Created new agents and new chats

  • Tested different modes (Ask / Plan / Agent)

  • Tested on another computer (macOS) using the same account

None of these tests changed the behavior.

The result is always the same:

  • Auto mode → 0 Cache Read / 0 Cache Write

  • Manual model selection (e.g. Claude Sonnet) → caching works normally

So the problem does not appear to be related to:

  • local machine configuration

  • repository indexing

  • project size

  • operating system

One additional note: Privacy Mode is enforced by my company, so I cannot disable it for testing.

At this point it seems very likely that the issue is related specifically to Auto mode routing, rather than a local configuration problem.

If anyone from the Cursor team is investigating this, I’m happy to provide additional details if needed.

Update:

After installing the Cursor update released today, prompt caching started working again when using Auto mode.

New requests in Auto mode now show normal Cache Read / Cache Write values, and token usage has returned to the expected behavior.

So it looks like the issue may have been resolved by the latest Cursor update.

Would be great to know if the update fixed it for others as well.

Hey, glad to hear the update fixed it.

This matches what we saw in a similar report, “Auto mode: Prompt caching not working”. Version 2.6.12 routed requests differently, which made Auto mode prefer models that do not support Anthropic-style prompt caching; version 2.6.18 and newer fix this.
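
For anyone curious how a routing change produces exactly this pattern, here is a purely illustrative sketch (hypothetical model names, not Cursor's actual code): if the selected backend does not support Anthropic-style caching, no cache_control marker is attached, so the whole context is billed as fresh input on every request.

```python
# Purely illustrative -- hypothetical models, not Cursor's implementation.
SUPPORTS_CACHING = {"model-with-caching": True, "model-without-caching": False}

def build_system_blocks(model: str, stable_context: str) -> list[dict]:
    block = {"type": "text", "text": stable_context}
    if SUPPORTS_CACHING.get(model, False):
        # Only mark the prefix cacheable when the target model supports it.
        block["cache_control"] = {"type": "ephemeral"}
    return [block]

# A 2.6.12-style regression amounts to Auto preferring the wrong backend:
auto_choice = "model-without-caching"
blocks = build_system_blocks(auto_choice, "large repo context")
print("cache_control" in blocks[0])  # False -> 0 cache read / 0 cache write
```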

If caching drops again in Auto mode, grab a Request ID (top right of the chat → Copy Request ID) and share it here so I can check the model routing for your request.

I’m marking this as solved based on your update. Let me know if anything changes.

This topic was automatically closed 22 days after the last reply. New replies are no longer allowed.