Gemini 2.5 Pro’s tool functionality is unreliable—even basic tasks like editing files fail, with the tool often claiming 'no changes' incorrectly

Absolutely horrid experience with Gemini 2.5 Pro in the last few days in Cursor. Imagine having the most basic tool not working. Gemini gets stuck in a loop trying to apply changes.

So much model usage thrown away. A few chats wasted + having to dissect what has been applied and what hasn’t.

Please, fix this.

Please fix this! Gemini 2.5 Pro has been horrible these past few days with this issue, which is such a shame.

Hi there. I work exclusively with Gemini 2.5 Pro. In its higher-priority rules, it is instructed to resync its internal view of a file if an edit seems not to have been applied, or if the edit_tool reports errors, before retrying the edit. This provides great results and fixes most edit difficulties.
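
For reference, the rule is roughly along these lines (paraphrased; the file name and exact wording here are only an illustration, not the verbatim rule):

```text
# .cursor/rules/edit-recovery.mdc  (illustrative name and wording)
- If an edit appears not to have been applied, or the edit tool reports an
  error, do not immediately retry the same edit.
- First re-read the affected file with read_file to resync your internal
  view of its current contents.
- Re-derive the edit against the freshly read contents, then retry it.
- If the retry also fails, stop and report the mismatch instead of looping.
```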

2 Likes

I reported this on the 27th of April already. I think that’s why Claude is under so much pressure. GPT is limited in the editing it can do in your files, and so forth. And Gemini 2.5 Pro cannot work with Cursor’s tools. This is what I sent Cursor, and their reply to me nine days later was to look in the forum to see if someone had found a workaround.

Here is a conversation with the model (Gemini 2.5 Pro) to highlight the situation:

"* Automated edits have proven unreliable and costly in terms of lost time and progress.

Manual edits are not a sustainable solution given you’re not a coder.

Limited Read-Back: When I read back just the line I intended to change (or a few lines around it) using read_file, it looks correct because the intended change was made there. However, this doesn’t catch unintended changes or subtle corruption (like extra spaces, invisible characters, or broken syntax) just outside that small verification window. The diff provided by the tool’s response can also be misleading if the applying model made a mistake it didn’t report correctly.

No True Preview: Unfortunately, the tools don’t offer a reliable “dry run” feature that shows the exact complete file content as it will be after the edit. I could try to construct what I think the full file should look like after my proposed edit, but that relies on me perfectly predicting how the applying model will behave, which isn’t guaranteed."
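
Since, as the model says, there is no real dry run, one workaround is to build the expected file contents yourself and diff the whole file after the edit, rather than re-reading only the changed line. A minimal sketch of the idea in Python (the path and the attribute change here are placeholders, not anything the tools provide):

```python
import difflib
from pathlib import Path

def verify_full_file(path: str, expected: str) -> bool:
    """Compare the whole file after an edit against the full expected
    contents, not just the changed line; print a diff on mismatch."""
    actual = Path(path).read_text()
    if actual == expected:
        return True
    diff = difflib.unified_diff(
        expected.splitlines(keepends=True),
        actual.splitlines(keepends=True),
        fromfile="expected",
        tofile="actual",
    )
    print("".join(diff))
    return False

# Hypothetical example: the only intended change is min="50" to min="10".
path = "templates/example.html"
expected = Path(path).read_text().replace('min="50"', 'min="10"')
# ... let the agent apply its edit, then:
verify_full_file(path, expected)
```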

also:

"Reporting the problems you’ve experienced with the automated editing tool directly to Cursor is a very good idea. Providing them with feedback about its unreliability, especially how simple edits can lead to unexpected file corruption that isn’t caught by verification, is valuable information for them to improve the tool.

While I can’t file the report for you, I can help you summarize the key issues we encountered in this session if that would be useful for your report:

Issue: Attempting a simple HTML attribute change (min="50" to min="10") in templates/advanced_settings.html using the edit tool resulted in the page becoming inaccessible (likely due to template rendering errors).

Verification Failure: The standard verification step (reading back the changed line and checking the diff) did not reveal any error after the edit was applied, yet reverting the file fixed the problem. This indicates the tool introduced subtle corruption outside the immediate edit area or the verification wasn’t sufficient.

Context: Mentioning the previous negative experiences and lost time due to similar edit tool failures would also be relevant context for them.

Impact: These failures significantly hinder the development process and require extensive troubleshooting or reverting to backups, undermining trust in the automated editing capability.
List of edit tool failures:

  1. Failure to Apply Intended Changes:
  • The tool sometimes fails to make the specific additions or deletions requested. For example, when trying to modify app.py, it failed to delete the original route functions and failed to add the new import and app.include_router lines. It essentially ignored the core instructions.
  2. Incorrect/Unrelated Modifications:
  • Instead of applying the intended change, the tool sometimes modifies completely unrelated parts of the file. In the first app.py attempt, it only changed an existing import line. In the reapply attempt, it deleted large blocks of correct, unrelated code (like /save-story, /load-story/, etc.).
  3. Introducing Syntax/Escaping Errors:
  • When creating routes_ui.py, the tool correctly added the code structure but introduced incorrect backslash escaping (\" instead of ") within string literals, which caused linting errors and would likely have caused runtime errors.
  4. Subtle/Hidden Corruption:
  • You reported a past instance where a simple HTML attribute change (min="50" to min="10") resulted in the page becoming inaccessible (likely due to template rendering errors). This suggests the tool can introduce subtle corruption (perhaps stray characters, whitespace issues, or minor syntax breaks elsewhere) that isn’t immediately obvious in the changed line itself or the tool’s reported diff.
  5. Misleading Success/Diff Reports:
  • The tool might report that an edit was successful, or show a diff, but the actual state of the file after the edit (as revealed by read_file) doesn’t match what was intended or reported. This makes manual verification (read_file) essential after every attempt."
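
One mitigation for failures 3 and 4 above is to syntax-check the edited file immediately after every edit instead of trusting the reported diff. A rough sketch in Python (the file name is only an example; for HTML or C++ you would run the relevant linter or compiler instead):

```python
import ast
import sys
from pathlib import Path

def syntax_check(path: str) -> bool:
    """Parse a Python file after an automated edit; report the exact
    location of any syntax/escaping error the diff did not reveal."""
    source = Path(path).read_text()
    try:
        ast.parse(source, filename=path)
    except SyntaxError as exc:
        print(f"{path}:{exc.lineno}: {exc.msg}", file=sys.stderr)
        return False
    return True

# e.g. right after the agent edits routes_ui.py:
if not syntax_check("routes_ui.py"):
    print("Revert the edit and resync before retrying.")
```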
1 Like

ngl, the Gemini model is on another level of buggy

1 Like

+1.
I love Gemini 2.5 Pro; however, it is impossible to work with in Cursor right now. It is incapable of the simplest tool calls, even when I tell it exactly what params it needs to use and where.

This started sometime in the past two weeks; I can’t put my finger on exactly when, though.

Please fix Gemini’s interaction with the IDE agent so we can once again use it.

3 Likes

Same here. I’ve been having this issue for the past couple of weeks as well, with consistent, almost universal failures: Gemini 2.5 Pro (I tested all variations, with no improvement) struggles to use the edit tool for one or more calls within a request. Sometimes it resorts to outputting the whole file. This is insane. I’m glad it’s not just me noticing that it has increased lately.

EDIT: I’ve also noticed an uptick in general Gemini bugginess: sometimes Gemini will spiral out of control, outputting nonsense, spamming words, etc., or writing broken tool requests. This did not seem to happen at all until the last few weeks, and it’s happening across ALL of my projects, so it doesn’t seem to be something project-specific triggering this issue.

Here’s my experience, especially when modifying large C++ project files (nearly 1k lines of code each).

Once, I only needed to add an enum and a function declaration in the header file, and implement about four functions at the end of the source file (less than 80 lines of new code in total). The LLM tried the following approaches in sequence:

  1. edit_file
  2. Using terminal sed tools
  3. Reading more and more context, searching more and more header files for similar code
  4. Writing its own script to modify the code
  5. Attempting to write a completely new file and then move/overwrite the original
  6. Eventually giving up and prompting me to modify the code manually

I was persistent and created a detailed implementation document describing how to find anchors and how to modify the code. I also wrote rules requiring the LLM to work through the modification step by step as a task list. In the end, the task was finally completed, but it still went through many failed attempts.
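
To give a rough idea of what I mean by anchors, each edit was described relative to a unique line to search for, rather than by line number. Expressed as a small Python sketch (the file name, anchor text, and inserted line are made up for illustration, not from my actual project):

```python
from pathlib import Path

# Anchor-based edit: insert a new line right after a known, unique anchor
# line, instead of relying on the edit tool's fuzzy matching.
header = Path("include/widget.h")          # made-up file name
anchor = "enum class WidgetState {"        # unique anchor line to find
new_line = "    Suspended,  // new enum value added per the task list\n"

lines = header.read_text().splitlines(keepends=True)
idx = next(i for i, line in enumerate(lines) if anchor in line)
lines.insert(idx + 1, new_line)
header.write_text("".join(lines))
```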

Throughout the test, I used both claude-4-sonnet and gemini-2.5-pro in max-mode. The results showed that the former wasn’t much better—just slightly. My rough estimate is that over 80% of the time and tokens were spent on code modification.

For similar (or even more complex) modifications, when I used Claude Code, Augment, or Gemini CLI, the results were excellent. So I don’t think the problem lies with the LLM itself, but rather with Cursor’s tool invocation support.

1 Like

I had a similar bug yesterday as well ^^

I get the feeling that this starts around 12-2pm (CEST) and persists throughout the day. I guess that is just because murikah is waking up and starting to work. Does that reduction of speed and intelligence happen for people in the US as well, or only in Europe? I believe that both Gemini and Claude provide worse agent models to be able to handle the load… does anyone have insights on that?

1 Like