I’ve encountered many situations where the model gets stuck in a loop trying to find portions of the code by reading 200 lines each time, until it exhausts the 25 tool call limit.
I think this could even be considered a bug, because:
- It makes the model hit the long-context limit faster (because of the verbose output each tool call produces and the input tokens spent on every call).
- We know long contexts degrade model performance.
I’ve noticed this happens more with Claude (which I guess is the most used model provider) than with the other models.
Why the Current Approach May Be Flawed
I know the Cursor dev team probably implemented the 200-line read limit to save token usage and increase capacity, but:
- I believe this implementation is doing the opposite.
- It is frustrating users instead of helping.
Suggestions to Fix This
Solution 1 (Medium Difficulty) — Programmatically Handle File Read Requests
Cursor could automatically:
- Detect that the agent used “read file from line x to x+200” five times in a row.
- Allow it to read more lines on the next attempts (see the sketch after this list).
- Replace the verbose tool call logs after the search is finished with a summarized version of what was found.
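To make Solution 1 concrete, here is a minimal sketch of the kind of heuristic I mean. All names and thresholds are hypothetical and purely illustrative; this is not how Cursor actually works:

```python
# Hypothetical sketch: expand the read window when the agent keeps paging
# through the same file. None of these names come from Cursor's codebase.

from dataclasses import dataclass, field

DEFAULT_WINDOW = 200     # current per-read line limit
EXPANSION_FACTOR = 2     # grow the window once the agent is clearly paging
REPEAT_THRESHOLD = 5     # consecutive reads of the same file that trigger expansion

@dataclass
class ReadTracker:
    consecutive_reads: dict[str, int] = field(default_factory=dict)

    def next_window(self, path: str) -> int:
        """Return how many lines the next read of `path` should be allowed."""
        count = self.consecutive_reads.get(path, 0) + 1
        self.consecutive_reads[path] = count
        if count >= REPEAT_THRESHOLD:
            # The agent is paging through the file; let it take bigger bites.
            return DEFAULT_WINDOW * EXPANSION_FACTOR ** (count - REPEAT_THRESHOLD + 1)
        return DEFAULT_WINDOW

    def reset(self, path: str) -> None:
        """Call when the agent moves to a different file or finishes searching."""
        self.consecutive_reads.pop(path, None)

# Example: the 5th, 6th and 7th consecutive reads of the same file would be
# allowed 400, 800 and 1600 lines instead of 200.
tracker = ReadTracker()
for _ in range(7):
    window = tracker.next_window("src/big_module.py")
print(window)  # 1600
```

The summarization part of the suggestion would sit on top of this: once the search ends, collapse the chain of read-tool logs into a short note about what was found, instead of keeping every 200-line dump in context.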
Solution 2 (Easy Difficulty) — Improve System Prompt Instructions
Tweak the system prompt to:
- Make the model aware of repeated 200-line reads.
- Instruct it to read more lines in those cases.
- Instruct the model to avoid being verbose when using the read tool multiple times.
- Instruct the model to ask the user where the code it is looking for lives if it spends more than X tokens trying to find it (see the example rule wording after this list).
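For illustration, this is roughly the kind of wording I have in mind; it is hypothetical rule text, not anything taken from Cursor’s actual system prompt:

```
When reading files:
- If you have already read the same file in 200-line chunks several times in a row,
  request a larger range on the next read instead of continuing to page.
- Do not restate or summarize each chunk you read; keep intermediate commentary minimal.
- If you have spent a large amount of context searching for a piece of code and still
  have not found it, stop and ask the user where it lives.
```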
Note:
I’ve already tried to adjust Cursor rules myself, but it never worked, probably because the master system prompt overrides everything very strictly, prioritizing tool calls and token usage savings.
Solution 3 (Hard Difficulty) — Introduce Better Codebase Indexing
Add a feature to improve the codebase indexing system. For example:
- The model tries to search for code and fails.
- The model uses a code_search tool that employs better indexing techniques (a rough sketch of such an index follows this list).
- The indexing provides better high-level and intermediate-level context about the related files the model is trying to explore.
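As a rough illustration of what better indexing could buy, here is a toy symbol index that a code_search-style tool could query. Everything here is hypothetical and far simpler than a real indexing system; the point is that one cheap lookup replaces a chain of 200-line reads:

```python
# Hypothetical sketch of a symbol-level index behind a code_search-style tool.
# Nothing here reflects Cursor's actual implementation.

import ast
from pathlib import Path

def build_symbol_index(root: str) -> dict[str, list[tuple[str, int]]]:
    """Map each function/class name to the (file, line) locations that define it."""
    index: dict[str, list[tuple[str, int]]] = {}
    for path in Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # skip files that don't parse
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                index.setdefault(node.name, []).append((str(path), node.lineno))
    return index

def code_search(index: dict[str, list[tuple[str, int]]], name: str) -> list[tuple[str, int]]:
    """Return definition sites for `name`, so the agent can jump straight to them."""
    return index.get(name, [])

# Usage: build the index once per workspace, then each lookup is a single call.
if __name__ == "__main__":
    idx = build_symbol_index(".")
    print(code_search(idx, "build_symbol_index"))
```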
Evidence
Here are some screenshots of this happening and the context getting longer and longer, making the model’s performance noticeably worse:
It would be cool if people seeing this also posted screenshots, so the devs can see that a feature that was meant to save token usage (and keep model availability high) might actually be increasing it and making the user experience more frustrating than it needs to be. Claude is already verbose as it is; if the IDE encourages its verbosity, it becomes an even bigger problem, since model providers haven’t solved the long-context problem yet.