In longer conversations with extensive context, the context-usage indicator (the small circle with a percentage next to it) climbs steadily; then, when you hit the context limit, it cuts itself down to about 32%, or at least drops significantly.
This also seems to correspond with the AI entirely losing its place and restarting the conversation in a loop that burns a lot of API calls on nothing.
Steps to Reproduce
Use the latest version of Cursor (1.3.2), VS Code version 1.99.3.
Start a conversation and have it work on a single document after performing research deep dives into a codebase. During the process, when the context percentage hits its maximum, it will either reset or jump to a lower value, causing the AI to start a loop, repeating the same task over and over without resolution.
Expected Behavior
The AI continues with its existing context instead of getting stuck behind a constantly resetting context window.
Operating System
Windows 10/11
Current Cursor Version (Menu → About Cursor → Copy)
1.3.2
Hi @Sean_Murphy and thank you for your bug report.
The looping issue has been fixed. If you see it again, please post a Request ID with Privacy Mode disabled so we can look into the details: Cursor – Getting a Request ID
Note that once the context reaches its limit, it is summarized to avoid excessive token usage, which shows up as a drop in the context percentage.
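Conceptually, the flow looks something like the sketch below. This is illustrative only, not Cursor's actual implementation; the threshold value, `summarize`, and the other names are hypothetical.

```python
# Minimal sketch of threshold-triggered summarization (illustrative only,
# not Cursor's actual implementation; all names and values are hypothetical).

SUMMARIZE_THRESHOLD = 0.95  # fraction of the window that triggers compression
KEEP_RECENT = 4             # most recent turns kept verbatim

def maybe_compress(turns, token_count, window_size, summarize):
    """Replace older turns with a model-written summary once usage crosses the threshold."""
    if token_count / window_size < SUMMARIZE_THRESHOLD or len(turns) <= KEEP_RECENT:
        return turns  # under the limit (or history too short): keep everything
    old, recent = turns[:-KEEP_RECENT], turns[-KEEP_RECENT:]
    summary = summarize(old)  # lossy step: detail in `old` can be dropped here
    return [f"Summary of earlier turns: {summary}"] + recent
```

The `summarize(old)` step is lossy by design; it trades detail for headroom in the window, which is why the percentage drops afterwards.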
Hmm. Is this the new industry approach to resolving context overruns? It has several problematic caveats: context compression is often overly simplistic, which can cause context drift when working on very complex tasks, ultimately leading to technical debt.
Claude Code is currently experiencing significant issues with this approach. Can you confirm whether your context summarization system can be disabled, or at least modified to preserve the original, genuine context (especially for technical systems) rather than losing critical details in the summary?
Note that by the time the context limit is reached, the context usually holds a lot of information that may lead the AI to make more mistakes due to confusion or conflicting details. Summarization helps reduce such issues.
There is currently no setting that would disable summarization.
When dealing with complex tasks, it's important to break them down, either yourself or with the AI's help. Such a breakdown can be written to an MD file for tracking, and the AI can be instructed to follow those details or requirements.
That way, even with summarization, the AI will not lose track of the details.
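For example, a tracking file might look like this (the file name, project, and checklist items below are made up for illustration):

```markdown
<!-- tasks.md: hypothetical tracking file the AI is told to consult and update -->
# Refactor: payment service

## Requirements (do not change)
- Keep the public API of `PaymentClient` unchanged.
- All currency math uses integer cents, never floats.

## Task breakdown
- [x] 1. Map all call sites of `PaymentClient.charge()`
- [ ] 2. Extract retry logic into `RetryPolicy`
- [ ] 3. Update the unit tests and run them
```

Because the file lives on disk rather than in the conversation, the AI can re-read it after a summarization pass and recover the requirements verbatim.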
Rolling context is essential for working with complex, non-public, internal documentation—the kind of material these models were never trained on and simply cannot be summarized effectively.
I’m often dealing with massive, disorganized documentation sets that support complex engineering systems—things like custom hardware, proprietary architectures, or undocumented edge cases. These are not web articles or open standards. They’re internal, messy, and full of nuance. Summarization consistently strips out that nuance, and I end up in a loop of correcting context drift, re-priming the AI, and wasting time and tokens.
This is more than inconvenient—it breaks the workflow entirely. Full-context priming is what allowed me to align the model with domain-specific knowledge it has no prior exposure to. That’s the real power of a large context window: building understanding in real-time. Summarization can’t replicate that.
This change needs to be optional. Rolling context isn’t for everyone—but for those of us working with private, highly technical, and unsummarizable information, it’s not optional. It’s foundational.
Please reconsider adding a toggle for that chat compressor.
Yes, I've often said that the requirements and docs governing expected behavior need to be pinned in the context and cached with a very long TTL.
This feature would be a HUGE benefit; I don't see why they don't provide it. The cache is not used effectively, so we end up working more and paying more.
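To make the request concrete, here is a minimal sketch of what rolling context with pinned, long-TTL entries could look like. This is not an existing Cursor API; every name and value here is hypothetical.

```python
import time
from collections import deque

# Sketch of a rolling context with pinned, long-TTL entries
# (hypothetical; not an existing Cursor API).

PIN_TTL_SECONDS = 7 * 24 * 3600  # assumed "very long" TTL: one week

class RollingContext:
    def __init__(self, max_tokens, count_tokens):
        self.max_tokens = max_tokens
        self.count_tokens = count_tokens  # caller supplies a tokenizer length function
        self.pinned = []                  # (expires_at, text): requirements/docs, kept verbatim
        self.turns = deque()              # ordinary turns, oldest first

    def pin(self, text):
        """Keep governing requirements/docs verbatim until the TTL expires."""
        self.pinned.append((time.time() + PIN_TTL_SECONDS, text))

    def add_turn(self, text):
        self.turns.append(text)
        self._trim()

    def _trim(self):
        """Drop expired pins, then the oldest ordinary turns. Nothing is summarized."""
        now = time.time()
        self.pinned = [(exp, t) for exp, t in self.pinned if exp > now]
        while self._total_tokens() > self.max_tokens and self.turns:
            self.turns.popleft()  # rolling: the oldest unpinned turn falls off first

    def _total_tokens(self):
        texts = [t for _, t in self.pinned] + list(self.turns)
        return sum(self.count_tokens(t) for t in texts)
```

The point of this design is that trimming is purely positional: nothing gets paraphrased, so pinned requirements and recent turns reach the model verbatim.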
I’ve run into another issue with the summarization/context methodology. I have AIs doing deep dives into documentation sets and writing analyses of certain codebases to identify key systems. Every time it hits the summarizer, it completely loses track and writes garbage into the documents. This was not an issue with rolling context.
Now I have multiple codebase analyses, each 10+ pages long, that need to be scrapped entirely. This was never a problem before, and I just burned an egregious number of tokens because of it.
Frankly, I’m paying for these tokens—I should be able to decide how to use them to benefit my work. Summarization needs to be optional, and it’s unacceptable that it’s a forced feature.
I’m tired of having to babysit this tool. It’s making my work more difficult, and if this can’t be made optional—especially since I’m paying for the tokens—I’m seriously considering canceling my plan with you all.