*sigh* more optimization hell: Summarizing chat neverending

Describe the Bug

Typing along - 50% context window - Cursor decides to summarize the chat. Forever. WHY? I don’t want this. Please stop trying to be smarter than the user; it’s just beyond frustrating. If I want to summarize, I will. If I want to shrink the context window, I will. If there is an error condition, flag it and let me deal with it. Stop being “helpful” with automagic.

Steps to Reproduce

No idea – type stuff, get stuck in hell.

Expected Behavior

An error message or something other than an infinite loop.

Operating System

macOS

Current Cursor Version (Menu → About Cursor → Copy)

Version: 1.3.9
VSCode Version: 1.99.3
Commit: 54c27320fab08c9f5dd5873f07fca101f7a3e070
Date: 2025-08-01T20:07:18.002Z
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Darwin arm64 24.5.0

Additional Information

Bricked the chat; every message just keeps triggering the summarizing function.

Does this stop you from using Cursor

Yes - Cursor is unusable

I’ve not been able to reproduce this. Can you please update to the latest version and see if the issue persists?

Please let us know if you face any issues on the latest version.

Haven’t seen this in the latest versions. Can close this out.

Okay, I spoke too soon. It went into an infinite loop; this was after several back-and-forths between the GPT-5 and Opus models, and a final request to Opus. Context size was 51.7% on GPT-5 and 80%+ on Opus. Infinite hang.

Please NEVER EVER summarize. Or execute this code path. This should not be in the code base. Continue on until the context window is full (or nearly full), then ask the user what they want to do. Automated solutions like this do not spark joy. We’re way under the context window limit in both cases here; this should not exist/trigger.

Now just random summarizing for no apparent reason at 19%, and the chat ignored that and just kept on going. It’s still there if I scroll back up and look for it.

@charles Could you post a Request ID with privacy disabled so we can look into the details? Cursor – Getting a Request ID

Also, add the current Cursor version you use so we can see if it is related to the app.

Next time it happens I will do so. This summarization nonsense is totally breaking the entire product. I have a chat context window at 80%; this ■■■■ runs and crushes it to 12%, and the LLM is now unable to continue properly. How can I say this: do not do this. This is a terrible design pattern. I’m paying for the tokens; stop trying to save me from myself. Do not summarize without being asked to. You are destroying carefully crafted context, or appear to be anyway.

I have a fix for this summarization doom loop in the works (will hopefully deploy in the next few days), but it won’t prevent summarization from happening automatically; auto-summarization is an intentional choice that is fairly industry standard.

We know summarization is annoying and are working on making it less likely (making tool call results more concise etc.), but the context window is only so big, and we don’t want to trigger an error by exceeding the model’s context limit. I believe our threshold is 90% context window usage before summarization is triggered.
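In rough terms, that’s a threshold check against the model’s context limit. A minimal sketch of the idea (hypothetical names, not the actual implementation):

```typescript
// Hypothetical sketch of a usage-threshold trigger for auto-summarization.
const SUMMARIZE_THRESHOLD = 0.9; // roughly 90% of the model's context window

interface ChatState {
  tokensUsed: number;    // tokens currently consumed by the conversation
  contextLimit: number;  // the model's maximum context size
}

function shouldSummarize(state: ChatState): boolean {
  return state.tokensUsed / state.contextLimit >= SUMMARIZE_THRESHOLD;
}

// The "doom loop" described in this thread would occur if summarization fails
// to bring tokensUsed back below the threshold, so every new message
// re-triggers the check.
```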

👉👈 👀

It’s an industry standard for consumer use. This is a professional tool. We know what we’re doing. You are literally deleting user-generated content that cannot be recovered. You should never alter, delete, or change user-generated content without explicit permission. Any summarization you do loses context, and it does so magically and opaquely, likely making it useless, or pretty much useless, for many, many contexts. If this has to be a feature because someone thinks it’s a good idea, at least make it a switch so that people who know what they’re doing can manage it appropriately (like duplicating chats at earlier points). Or let me clean up the context by showing me all the attachments and letting me select what to remove intelligently. Magic never wins.

…and to preserve context: instead of reading and re-reading and re-reading bits and pieces of files continually, have it keep a copy of every file it has ever tried to read in an updated side context sent with the chat, instead of inline with the chat. With token caching, and files that rarely change serving as context, life will be much, much better. Only re-read files if they’ve been updated during the chat. It would also be pleasantly faster.
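Something like this, roughly (the names are made up, just to show the shape of the idea, not anything Cursor actually does):

```typescript
// Hypothetical sketch of the proposed "side context" file cache.
import { readFileSync, statSync } from "fs";

interface CachedFile {
  path: string;
  mtimeMs: number;  // last-modified time when the snapshot was taken
  content: string;  // snapshot included in the side context
}

const fileCache = new Map<string, CachedFile>();

// Re-read a file only if it changed on disk since the cached snapshot;
// otherwise keep the existing copy so the token-cached side context stays stable.
function getFileForContext(path: string): CachedFile {
  const mtimeMs = statSync(path).mtimeMs;
  const cached = fileCache.get(path);
  if (cached && cached.mtimeMs === mtimeMs) return cached;
  const entry = { path, mtimeMs, content: readFileSync(path, "utf8") };
  fileCache.set(path, entry);
  return entry;
}

// The side context sent with each request is just the current cache,
// rather than repeated inline reads scattered through the chat history.
function buildSideContext(): string {
  return [...fileCache.values()]
    .map(f => `--- ${f.path} ---\n${f.content}`)
    .join("\n\n");
}
```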

  1. I have never encountered a situation where summarization occurs too early. Usually it happens either on time or even later than I would like. I have also not seen Cursor optimizations hurt performance.

  2. If you do not enable summarization, then, unless this causes an error on the provider side, you will end up with a completely filled context that gets cut off at inappropriate places (for example, at the very beginning of the context you may be left with a fragment of a sentence or some important link, which will break the Agent’s mind).

  3. The more loaded the context is, the dumber the model. Subjectively, I do not feel this when working in Cursor, but there are generally accepted benchmarks on this matter. The more compact the information in the context window, the better the LLM works.

Not if the actual context you need is now missing. Provide options: “Summarize”, “FIFO”, “Manually clean”. Assumptions are the mother of all screw-ups, as they say.
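Even a simple user-selectable strategy would do. A sketch of what I mean (all names hypothetical, just illustrating the options):

```typescript
// Hypothetical sketch of a user-selectable context-pressure strategy.
type ContextStrategy = "summarize" | "fifo" | "manual";

interface Message {
  role: "user" | "assistant" | "tool";
  tokens: number;
  text: string;
}

function handleContextPressure(
  messages: Message[],
  limit: number,
  strategy: ContextStrategy,
): Message[] {
  const used = messages.reduce((sum, m) => sum + m.tokens, 0);
  if (used < limit * 0.9) return messages; // nowhere near the limit yet

  switch (strategy) {
    case "fifo": {
      // Drop the oldest messages until we are back under the threshold.
      const trimmed = [...messages];
      let total = used;
      while (total >= limit * 0.9 && trimmed.length > 1) {
        total -= trimmed.shift()!.tokens;
      }
      return trimmed;
    }
    case "manual":
      // Leave the history alone and let the UI prompt the user to prune
      // attachments or old messages themselves.
      return messages;
    case "summarize":
    default:
      // Fall through to whatever automatic summarization already exists.
      return messages;
  }
}
```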

100% agreed. Every SINGLE time it happens, it starts making mistakes I had it correct very recently, then it spends time undoing its damage, and then, by the time you’re ready to move forward, you see it: “Summarizing Chat Context.”

It forgets the most important nuances, starts trying to “solve” a problem with one new feature by re-writing areas of the code that didn’t have any issues before, and instead of actually fixing the issue, makes your code broken in all circumstances instead of only when using your new experimental feature.

It’s a vicious and annoying cycle.

Also, why is it there’s always someone saying “it doesn’t happen to me, therefore…”?

It’s like saying, “I’ve never seen a bear breaking into my garbage at night, therefore you didn’t either.”

EDIT: Some of this could be avoided if the LLM were aware that summarization was happening and actually checked to see if it was still on the same page before rushing into a ton of changes.

AI is doing the summarization, so AI is ‘aware’ of it.

Could those who still have the issue update to the latest version and let us know if it still breaks usability?

Yes, just saw it pop up yesterday after a long, fruitful chat. It then a) never appeared to complete, but b) trashed the history, so when I switched models I was screwed, lost all my data and an hour of great conversation/context-building work, and was hosed. Would you for the love of god please make this a freaking option: “DO NOT SUMMARIZE EVER on|off”. I have options; I can switch to the 1M-token Sonnet model, for one. The second summarization annihilates my data without asking me, I’m screwed and have no recourse. This is truly, truly, the worst behavior ever. Don’t destroy people’s data (with no way to retrieve it) without their permission; it’s the most frustrating UX anti-pattern ever. LLM magic here will never work, because by definition every token counts in what predicts the next token. You change the history, you change everything.

Still happening.

Can I just say, this feature will someday make me and others abandon Cursor. For the love of all things holy, I just spent a bunch of money carefully constructing a chat history, and then you go ahead and obliterate it, and spend more of my money to do so. This is so, so, so needlessly infuriating!

The gods hate me nearly as much as I hate this feature.

For context optimization, why don’t you keep a shadow copy of the files being read by the LLM and substitute them in as file context instead of inline in the chat? Then you can keep them in the LLM cache and only update them when the referenced files actually change. You can then show the cache of all the files, and I can update/delete as needed, and the LLM doesn’t have to keep reading, and re-reading, and re-reading the same content over and over, blowing the context window.

Can you folks add lobotomy options for us to manipulate? Whatever the summarize does, it’s annihilating everything; I’d like a FIFO option or something. Or identify duplicate content in the chat and remove it (e.g., code writes that are more than a diff). The current re-write basically means that after a summary I am better off starting over than trying to continue the chat, since all useful context gets obliterated and squashed and it’s unable to continue doing what it was doing. Especially difficult when it’s literally in the middle of an implementation and then gets summarized out from underneath and loses the plot.
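The duplicate-content idea could be as simple as keeping only the latest copy of each file’s contents in the history. A rough sketch (made-up names, just to illustrate):

```typescript
// Hypothetical sketch: keep only the most recent copy of each file's contents
// in the chat history, replacing older duplicates with short stubs.
interface ChatEntry {
  id: number;
  filePath?: string;  // set when the entry is a file read or full-file write
  text: string;
}

function dedupeFileContent(entries: ChatEntry[]): ChatEntry[] {
  const latestByPath = new Map<string, number>();
  for (const e of entries) {
    if (e.filePath) latestByPath.set(e.filePath, e.id);
  }
  return entries.map(e =>
    e.filePath && latestByPath.get(e.filePath) !== e.id
      ? { ...e, text: `[older contents of ${e.filePath} removed]` }
      : e
  );
}
```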