Context limit suddenly restricts attachment length

Describe the Bug

I have been working with Cursor for the past 4 months and have been able to pack a ton of code into each request. I often pay 300-400 dollars in costs, and that’s reasonable; I use it a LOT, almost every day.
Of course, I don’t want to overpay or incur extreme costs to get it working, but I don’t see why the context should be limited this much.
More on the problem: it says “This file has been condensed to fit in the context limit.”
I don’t see why this should suddenly limit users’ workflows.

Steps to Reproduce

Try to attach a combined context of more than 150k tokens. It says “This file has been condensed to fit in the context limit.”

Expected Behavior

I should be able to add a LOT of context, as much as I want, up to the model’s limits, or at least roughly halfway there.

Screenshots / Screen Recordings

Untitled.png

Operating System

Windows 10/11

Current Cursor Version (Menu → About Cursor → Copy)

Version: 1.2.4 (user setup)
VSCode Version: 1.99.3
Commit: a8e95743c5268be73767c46944a71f4465d05c90
Date: 2025-07-10T17:09:01.383Z
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Windows_NT x64 10.0.22631

Additional Information

I am considering switching to a competitor if this does not get better. I really like the workflow (obviously!) but this can be a dealbreaker for me unfortunately.

Does this stop you from using Cursor?

Sometimes - I can sometimes use Cursor

Hi @g_j, thank you for the detailed post.

The message shows that a file is larger than a model’s context limit.

What happens is that once a model gets over a certain amount of context, it may start to hallucinate due to the amount of differing information contained within that context:

  • User request prompt
  • Files attached
  • Rules
  • MCPs
  • Tools available
  • System prompt
  • Tool calls and resulting information (code,…)

That info accumulates with each tool call and follow-up request; these are separate API calls to the AI provider, but they are combined in the same chat. It also adds up in the tokens used per API call and in the cost to you (see the rough sketch below).
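To make the accumulation concrete, here is a minimal sketch of how those pieces share one context window. The component sizes and the ~4-characters-per-token heuristic are assumptions for illustration, not Cursor’s actual accounting:

```python
# Illustrative context-budget estimate. Component sizes and the ~4
# characters-per-token heuristic are assumptions, not Cursor's real numbers.

CHARS_PER_TOKEN = 4  # common rule of thumb for English text and code

def est_tokens(text: str) -> int:
    """Very rough token estimate from character length."""
    return len(text) // CHARS_PER_TOKEN

# Hypothetical single request: all of these pieces share one window.
request = {
    "system prompt": "x" * 8_000,
    "rules":         "x" * 4_000,
    "user prompt":   "Refactor the parser module.",
    "attached file": "x" * 250_000,  # stand-in for a 250k-char file
    "tool results":  "x" * 40_000,   # grows with every tool call
}

total = 0
for name, text in request.items():
    t = est_tokens(text)
    print(f"{name:<14} ~{t:>7,} tokens")
    total += t

print(f"{'total':<14} ~{total:>7,} of e.g. a 200,000-token window")
```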

Could you share which model you are using and whether you use MAX mode?

You can find each model’s regular and MAX context sizes in the documentation.

Hi,

Thanks for the response.
I did not attach a lot of extra content; basically I only have a very short user prompt and the lengthy file. This used to work. My code file has been growing steadily and is now at 6,000 lines (250k characters). I am aware this is a lot and should be refactored, and I may do it. Also, Agent mode was not working well with this particular lengthy file, but Ask mode was still perfect.
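For reference, here is a quick way to check a file’s actual token count: a sketch using OpenAI’s tiktoken as an approximation (Cursor’s internal tokenizer may differ, and the file path is a placeholder):

```python
# Approximate token count of a file with tiktoken (pip install tiktoken).
# Uses an OpenAI encoding as a proxy; other providers tokenize differently.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # encoding used by recent OpenAI models
with open("my_big_file.py", encoding="utf-8") as f:  # placeholder path
    text = f.read()

print(f"{len(text):,} chars ≈ {len(enc.encode(text)):,} tokens")
# ~250k characters of code usually lands somewhere around 60-70k tokens,
# i.e. well under a 200k-token window on its own.
```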
Up until about a week or so ago, I was able to attach not only this file but several others, and the inference results were great, very useful. Now it is tangibly different, as these “compressed” files don’t seem to really reach the model’s decision making (overcompressed?).
To clarify, I don’t have significant:

  • Rules
  • MCPs
  • additional Tools
  • extra System prompt
  • no tools, so no tool call feedback loop either

I cannot even send this file in the first message of a chat conversation, so the accumulation issue can be ruled out, I think.
This happens with every model that I’ve been testing with, but to name a few, my main ones are o3 (MAX) and gemini-2.5-pro (MAX). I pretty much always use MAX mode.

Maybe, if I downgrade to Cursor 1.0, it will solve the issue?

To clarify, and before everyone starts laughing at me: this 6,000-line code file is one of the central files of my large project, and the project as a whole has 40k+ lines of code. The unfortunate situation is that this 6,000-line file has grown disproportionately (it should be refactored), but the issue also arises if I try to add multiple shorter files to the context. The limit seems to be around 50-100k tokens, very roughly.

I cannot guarantee that 1.0 will improve things, as there were several feature additions and changes to AI calls in more recent releases. You can try it out, of course, but only the latest version receives updates.

Gemini should be able to support a 1M context size in MAX mode, though some issues with Gemini have also been reported in the last few days.

It’s understandable that large files are sometimes used even if that’s not ideal, no worries.

I will pass your report through to Cursor team for review.

Thank you very much for passing this message along. It has been very reassuring to receive attention on this matter, so thank you. Hopefully this issue can get resolved soon and normal usage can resume 🙂 Have a great day!


You can use (with caution) my Agent Docstrings on that file and copy only the ToC into the prompt window. The agent will find the rest of the context on its own, through semantic search or by just reading the file.
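For illustration, a file-level ToC docstring might look something like this (a hypothetical sketch; the exact format Agent Docstrings generates may differ):

```python
"""Module ToC (hypothetical example, not the tool's exact output):

- class OrderPipeline: validates, prices, and routes incoming orders
    - validate(order): schema and business-rule checks
    - route(order): picks a fulfillment backend
- load_config(path): reads and merges settings files
- main(): CLI entry point
"""
```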


Thanks, this looks interesting; I will check it out. Sounds like a neat optimization trick.

I would probably use it to create roadmaps, or crossroad signs for the agents, so they have a general overview of the codebase, rather than to compress a single file’s context. I prefer having the full context of current files.

I’m planning to add an option that will generate a global ToC for the entire repo.

But I need to seriously rework Agent Docstrings first, and I’m currently waiting for a chance to get to it.

If you’ve got any more ideas, I’d love to see them as new issues!

And don’t forget that Cursor indexes the repository by itself. However, it is not very clear what information the agent gets from this index.

Hi again,

I have an update regarding the issue. We checked with other members of the organization I’m in, and at least one more team member has the same issue. Additionally, both of my computers suffer from this, so it’s not a local misconfiguration.

Hope this helps in diagnosing the issue faster. I think it narrows down the possible causes, which should be helpful.

I’m having this as well. A 36k-token file is getting condensed on the strongest models, and the result is horrible output. This didn’t happen yesterday on the same file. What is this ■■■■

Hi,

It seems that the issue was resolved. I’m not sure how; somebody on the Cursor dev team might have fixed the configs or pushed an update, but now it works great! Thanks again for the attention and the fix.

G.


Thanks for the update!

@condor unfortunately it did not! (It might have been resolved temporarily yesterday for a few hours.)
Using Claude Sonnet/Opus with MAX mode, I’m getting the issue on a 36k-token file (that, again, used to work great a few days ago).
If I switch to Gemini or gpt-4.1 with 1M tokens, it doesn’t show the error.

This is really unreliable. How can I depend on Cursor if I can’t even use a fraction of what the models offer (36k on a 200k model)?

Hi! It’s showing an outline of the attached file to the model, and then the model will pull in parts of the file (or read the entire file if it thinks it needs all of its context).

The model can use the entire 200k context; it’s just limited in what gets shown per user message, so that the model has plenty of room to pull in context via tool calls (and for future user messages in the conversation) without filling up the full context window too quickly (which would then force a summarization).
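To illustrate, a condensed view might keep the file’s structure and elide the bodies, like this (an assumed format; Cursor’s actual outline may differ):

```python
# Hypothetical condensed outline of an attached file: structure kept,
# bodies elided. The agent can then read specific sections on demand.
class OrderPipeline:
    def validate(self, order): ...  # body elided
    def price(self, order): ...     # body elided
    def route(self, order): ...     # body elided

def load_config(path): ...          # body elided
def main(): ...                     # body elided
```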

@Zackh1998 thanks. This wasn’t the behavior a few days ago. My fear (maybe solved in 1.3?) is that it’s unknown (to me) what gets included in that outline. I have instructions, specs, and the status of each. I don’t know whether what I need for the prompt to be successful is included or not. With this outlining, Cursor is saying “trust me bro”. It would be better to understand exactly what gets outlined, and the reasoning behind it. If it’s for context optimization, I’d want the option to not do the summarization, especially if what I include in the context is well below the model’s context window limit (if something gets clipped, that’s a different story, but I’d like control there too).

This topic was automatically closed 22 days after the last reply. New replies are no longer allowed.