Millions of tokens without cache read and extremely high cost

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

Cache Read: 0
Cache Write: 0
Input: 3,134,794
Output: 14,216
Total: 3,149,010
Cost: $16.03

Steps to Reproduce

None

Expected Behavior

Cache Read: 2,744,297
Cache Write: 112,200
Input: 520
Output: 27,651
Total: 2,884,668
Cost: $2.77
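The gap between the two cost figures can be sanity-checked with simple arithmetic. A minimal sketch, assuming Anthropic's published Opus 4.5 list rates ($5/M uncached input, $25/M output, cache reads at roughly 10% of the input rate, cache writes at roughly 125%) — these rates are an assumption, not stated anywhere in this report:

```python
# Hypothetical per-million-token rates, assuming Anthropic's published
# Opus 4.5 list pricing (an assumption -- not confirmed by this report):
RATES = {
    "input": 5.00,        # $/M uncached input tokens
    "output": 25.00,      # $/M output tokens
    "cache_read": 0.50,   # $/M -- cache reads bill at ~10% of the input rate
    "cache_write": 6.25,  # $/M -- cache writes bill at ~125% of the input rate
}

def cost(usage: dict) -> float:
    """Total cost in dollars for a usage breakdown (tokens per category)."""
    return sum(tokens * RATES[kind] / 1_000_000 for kind, tokens in usage.items())

# Figures from "Describe the Bug" (no cache hits at all):
actual = {"input": 3_134_794, "output": 14_216,
          "cache_read": 0, "cache_write": 0}
# Figures from "Expected Behavior" (cache working):
expected = {"input": 520, "output": 27_651,
            "cache_read": 2_744_297, "cache_write": 112_200}

print(f"actual:   ${cost(actual):.2f}")    # -> actual:   $16.03
print(f"expected: ${cost(expected):.2f}")  # -> expected: $2.77
```

Both computed totals match the reported figures, which suggests the cost difference is entirely explained by the missing cache reads.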

Operating System

MacOS

Current Cursor Version (Menu → About Cursor → Copy)

Version: 2.3.34 (Universal)
VSCode Version: 1.105.1
Commit: 643ba67cd252e2888e296dd0cf34a0c5d7625b90
Date: 2026-01-10T21:17:10.428Z
Electron: 37.7.0
Chromium: 138.0.7204.251
Node.js: 22.20.0
V8: 13.8.258.32-electron.0
OS: Darwin arm64 24.6.0

For AI issues: add Request ID with privacy disabled

a124aa9e-e426-4d2a-af95-948a5db727ee

Does this stop you from using Cursor?

No - Cursor works, but with this issue

You didn’t provide any steps to reproduce, or mention which model you used.

It just happened.

I just created a new chat session in agent mode with Opus 4.5 and asked one question (349k tokens, no cache) and a follow-up question (3.1M tokens, no cache).

Before this, Cursor's cache read worked as expected.


Hey, it looks like you have Privacy Mode enabled, which blocks us from diagnosing the issue on our side.

For the team to investigate, please disable Privacy Mode:

  • Cursor Settings > General
  • Privacy section
  • Turn off Privacy Mode
  • Reproduce the issue again
  • Copy the new Request ID

We suspect summarization might be breaking caching for requests this large (3.1M tokens is a huge amount of context). Summarization usually kicks in for long chats and can change the context in a way that stops Anthropic’s cache from working.
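Why summarization would break caching: Anthropic's prompt cache matches on an exact prefix, so everything before a `cache_control` breakpoint must be byte-identical to the previous request to score a hit. A minimal sketch of a Messages API payload illustrating this (the payload shape follows Anthropic's documented caching API; no request is actually sent, and the helper name is illustrative):

```python
# Sketch of an Anthropic Messages API payload using prompt caching.
# Cache hits require the prefix up to the `cache_control` breakpoint to be
# identical to a prior request. (Illustrative only -- no API call is made.)

def build_payload(history: list[dict], new_question: str) -> dict:
    messages = [dict(m) for m in history]
    if messages:
        # Mark the end of the stable prefix as an ephemeral cache breakpoint.
        last = messages[-1]
        last["content"] = [{"type": "text", "text": last["content"],
                            "cache_control": {"type": "ephemeral"}}]
    messages.append({"role": "user", "content": new_question})
    return {"model": "claude-opus-4-5", "max_tokens": 1024, "messages": messages}

history = [{"role": "user", "content": "first question"},
           {"role": "assistant", "content": "first answer"}]
p1 = build_payload(history, "follow-up")

# If summarization rewrites the history between turns, the prefix changes,
# the exact-match lookup fails, and every token is billed as fresh input:
summarized = [{"role": "user", "content": "summary of the chat so far"}]
p2 = build_payload(summarized, "follow-up")
assert p1["messages"][0] != p2["messages"][0]  # prefix differs -> cache miss
```

In other words, even a small rewrite of the earlier turns forces the whole multi-million-token context to be re-billed at the uncached input rate.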

You can also try:

  • Starting a new chat instead of continuing a long one
  • Using shorter requests

Send the new Request ID after turning off Privacy Mode, and the team can check the details.

This topic was automatically closed 22 days after the last reply. New replies are no longer allowed.