`edit_file` and `reapply` Tools Lead to Gemini-2.5-pro Misinterpretation and Task Failure

Describe the Bug

The Gemini 2.5 Pro model in Cursor frequently encounters issues when using the edit_file and reapply tools. These issues stem from two primary problems:

  1. edit_file Ambiguity: The diff provided by edit_file after an operation can be ambiguous or not clearly represent the actual change made to the file. While the file modification itself might be correct, the AI model tends to trust the diff at face value. If the diff is misleading, the AI proceeds with incorrect assumptions about the file’s state.
  2. reapply Unpredictable Behavior (Instruction Re-interpretation & Duplication): The reapply tool does not consistently behave as a simple “retry” of the last patch. Instead, it often appears to re-interpret the original instruction from the preceding edit_file call. This is particularly problematic with additive edits (e.g., prepending or appending lines), where reapply can duplicate the content if the initial edit_file call was already successful. This duplication then confuses the AI.

These behaviors cause the AI to misjudge the success or outcome of file operations, leading it to get stuck, make incorrect subsequent edits, or fail the overall task.

Steps to Reproduce

Scenario 1: reapply Duplication Bug

  1. Create a test file:
    • Use edit_file to create a file (e.g., test_bug.txt) with a few lines of initial content.
    Line 1
    Line 2
    
  2. Prepend a line using edit_file:
    • Instruction: “Prepend the line ‘NEW PREPENDED LINE’ to test_bug.txt.”
    • code_edit might look like: NEW PREPENDED LINE\n// ... existing code ...
    • At this point, test_bug.txt should be:
      NEW PREPENDED LINE
      Line 1
      Line 2
      
  3. Call reapply:
    • Execute reapply on test_bug.txt.
  4. Observe the bug:
    • Read test_bug.txt. The line “NEW PREPENDED LINE” will likely be duplicated:
      NEW PREPENDED LINE
      NEW PREPENDED LINE
      Line 1
      Line 2
      
    • An AI that saw the first edit_file call succeed has no reason to expect reapply to duplicate the line, so its picture of the file no longer matches the file on disk.
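The duplication above is what you would expect if reapply re-runs the “prepend” instruction instead of re-applying a patch. Here is a minimal sketch of the difference. The helpers `prepend_line` and `prepend_line_idempotent` are hypothetical names made up for illustration; I have no visibility into how Cursor actually implements reapply.

```python
def prepend_line(content: str, line: str) -> str:
    """Naive prepend: always adds the line, even if it is already there."""
    return f"{line}\n{content}"


def prepend_line_idempotent(content: str, line: str) -> str:
    """Prepend only if the content does not already start with the line."""
    lines = content.split("\n")
    if lines and lines[0] == line:
        return content  # intent already satisfied, so make no change
    return f"{line}\n{content}"


original = "Line 1\nLine 2\n"
new_line = "NEW PREPENDED LINE"

# First edit_file call: both variants behave the same.
once = prepend_line(original, new_line)

# Simulated reapply: the naive version duplicates the line,
# the idempotent version leaves the file as it was.
twice_naive = prepend_line(once, new_line)
twice_safe = prepend_line_idempotent(once, new_line)
```

A check as simple as “does the file already start with this line” would not cover every additive edit, but it shows the kind of guard that would stop this particular duplication.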

Scenario 2: edit_file Ambiguous Diff (Conceptual)

  1. Create a file with specific content:
    • edit_file to create test_ambiguity.txt with:
      IDENTICAL LINE
      IDENTICAL LINE
      Different Line
      
  2. Delete one specific identical line using edit_file:
    • Instruction: “Delete the second instance of ‘IDENTICAL LINE’.”
    • code_edit might be: IDENTICAL LINE\n// ... existing code ... (aiming to keep the first and remove the second).
  3. Observe potentially ambiguous diff:
    • The diff returned by edit_file might show the first “IDENTICAL LINE” as removed, or otherwise fail to confirm which of the two identical lines was removed, even though the file is now correct:
      IDENTICAL LINE
      Different Line
      
    • The AI might misinterpret this diff, leading to confusion if it needs to act based on which specific line was removed.
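To illustrate why this diff cannot be unambiguous, here is a small sketch using Python’s standard difflib (not Cursor’s diff generator, which I cannot inspect). Deleting the first or the second identical line produces the same file, so any line-based diff has to pick one of them arbitrarily.

```python
import difflib

before = ["IDENTICAL LINE", "IDENTICAL LINE", "Different Line"]

# Two different intents that end in the same file content.
delete_first = before[1:]
delete_second = before[:1] + before[2:]


def make_diff(a: list[str], b: list[str]) -> list[str]:
    """Unified diff with no context lines, returned as a list of strings."""
    return list(difflib.unified_diff(a, b, lineterm="", n=0))


diff_first = make_diff(before, delete_first)
diff_second = make_diff(before, delete_second)

same_result = delete_first == delete_second  # the resulting files are identical
same_diff = diff_first == diff_second        # so the diffs are identical too

for line in diff_second:
    print(line)
```

The diff alone therefore cannot tell the model which instance was targeted, which is the case where an explicit ambiguity flag or a follow-up read_file would help.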

Expected Behavior

  • For edit_file:

    • The diff returned should accurately and unambiguously reflect the precise changes made to the file.
    • If a diff cannot be made perfectly unambiguous (e.g., in cases like modifying code with many similar lines), an ambiguity flag should be returned in the tool’s response. This flag would signal to the AI that the diff might not tell the whole story and that extra verification (like a read_file) is advisable.
    • The tool should reliably apply the edit as per the code_edit and instructions.
  • For reapply:

    • Predictable Application: reapply should have a clearly defined and predictable behavior.
      • Ideally, it should intelligently attempt to achieve the original instruction’s goal without introducing new errors like duplication. It should be idempotent where possible (i.e., applying it multiple times to an already correct state results in no further changes and no errors).
      • If the previous edit_file was successful and the file state already reflects the intended change, reapply should consistently make no changes and report this clearly.
    • No Content Duplication: reapply should never duplicate content that was already correctly applied by the initial edit_file operation.
    • Clear Feedback: The output from reapply (diff or status message) must clearly indicate what reapply itself did (or didn’t do), rather than just echoing the previous diff or a generic success.
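As a rough idea of how the ambiguity flag proposed for edit_file could be computed, here is a sketch of one possible heuristic: flag any line whose count went down while identical copies of it remain in the file. The function name `ambiguous_removals` is hypothetical, and the heuristic is an assumption on my part. It ignores moved lines and near-duplicates.

```python
from collections import Counter


def ambiguous_removals(before: list[str], after: list[str]) -> list[str]:
    """Return lines that were removed while identical copies still remain.

    For such lines a diff cannot show which instance was deleted.
    """
    before_counts = Counter(before)
    after_counts = Counter(after)
    return sorted(
        line
        for line, count in before_counts.items()
        if 0 < after_counts[line] < count
    )


before = ["IDENTICAL LINE", "IDENTICAL LINE", "Different Line"]
after = ["IDENTICAL LINE", "Different Line"]

flagged = ambiguous_removals(before, after)
ambiguity_flag = bool(flagged)  # would be returned alongside the diff
```

With the Scenario 2 file this sets the flag, while removing a line that has no duplicates leaves it unset.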

Operating System

macOS

Current Cursor Version (Menu → About Cursor → Copy)

Version: 1.1.3
VSCode Version: 1.96.2
Commit: 979ba33804ac150108481c14e0b5cb970bda3260
Date: 2025-06-15T06:35:49.230Z
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Darwin arm64 23.4.0

Additional Information

The core issue affecting Gemini 2.5 Pro’s performance is its current tendency to:
a. Trust the edit_file diff output, even if it’s ambiguous.
b. Be unprepared for reapply’s behavior of re-interpreting instructions, especially the duplication side-effect.

Suggestions for the Cursor Team to consider for tool improvement:

  1. edit_file Tool:

    • Improve Diff Generation: Prioritize clarity and accuracy in the diff output.
    • Introduce an Ambiguity Flag: If the model applying the edit or generating the diff detects a high level of ambiguity in how the change could be interpreted from the diff alone, set a flag in the response. This signals the AI to be cautious.
    • (Optional) Richer Feedback: Consider an option for edit_file to return the modified section’s full content or the whole file (if small) alongside the diff, for definitive confirmation.
  2. reapply Tool:

    • Redefine/Refine Behavior:
      • Ensure its “smart” re-interpretation logic is robust against issues like content duplication. It must be better at detecting if the original instruction’s intent is already satisfied.
      • Consider offering distinct modes (if technically feasible for the AI to choose/understand):
        1. A “strict patch re-application” mode.
        2. The current “smart instruction re-interpretation” mode, but significantly improved to avoid errors.
    • Accurate Post-Reapply Feedback: The diff/status from reapply must reflect changes made by reapply itself.
  3. General (Both Tools):

    • Granular Feedback: Provide more detailed status messages that differentiate between various outcomes (e.g., “Edit applied successfully,” “Edit applied, ambiguity detected,” “Reapply: No change needed, state already correct,” “Reapply: Instruction re-applied, resulted in X changes (diff below)”).
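The granular statuses suggested above could be sketched as a small structured result. Everything here (`EditStatus`, `ToolResult`, `reapply_result`) is a hypothetical shape for illustration, not the actual response format of Cursor’s tools.

```python
import difflib
from dataclasses import dataclass
from enum import Enum


class EditStatus(Enum):
    APPLIED = "Edit applied successfully"
    APPLIED_AMBIGUOUS = "Edit applied, ambiguity detected"
    NO_CHANGE = "Reapply: No change needed, state already correct"
    REAPPLIED = "Reapply: Instruction re-applied"


@dataclass
class ToolResult:
    status: EditStatus
    diff: str = ""

    def needs_verification(self) -> bool:
        """True when the model should read the file before trusting the diff."""
        return self.status is EditStatus.APPLIED_AMBIGUOUS


def reapply_result(current: str, intended: str) -> ToolResult:
    """Report what a reapply would do, given the current and intended content."""
    if current == intended:
        return ToolResult(EditStatus.NO_CHANGE)
    diff_lines = difflib.unified_diff(
        current.splitlines(), intended.splitlines(), lineterm=""
    )
    return ToolResult(EditStatus.REAPPLIED, diff="\n".join(diff_lines))


target = "NEW PREPENDED LINE\nLine 1\nLine 2\n"
already_done = reapply_result(target, target)
still_needed = reapply_result("Line 1\nLine 2\n", target)
```

The point of the sketch is that the diff returned by reapply describes only what reapply itself changed, and that “nothing to do” is a distinct, explicit outcome.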

Addressing these behaviors will significantly improve the AI’s ability to reliably perform file modifications, especially in multi-step editing scenarios, and reduce user frustration.

Does this stop you from using Cursor

Sometimes - I can sometimes use Cursor


Honestly, if they opted to double raise Sonnet’s price, they could have at least fixed Gemini’s tool calls before doing that.

Two weeks ago the excuse was “we had to wait for the weekend to pass”. There have been two weekends since then.


Same problem here. I constantly send issue reports, but I don’t know if anyone reads them. Hopefully they do. Editing with Gemini models often has issues, and I even had to create an additional rule to work around it. I’ve noticed that if I approve all the code as the dialogue progresses, these errors occur much less frequently. Maybe this will help debug the problem.

I’d like to add that I’m using Linux Mint, and this problem has been occurring in some form for about a month now.

Here is an example of a bugged chat, this time with Gemini 2.5 Flash.
cursor_create_text_file_with_repeated_lines(bug).txt (1.5 KB)

The Request ID is c4c16c0b-bf8a-40c7-b8b0-3a8a541e5a0e

Cursor about

Version: 1.1.5
VSCode Version: 1.96.2
Commit: ef5eeb47a684b4c217dfaf0463aa7ea952f8ab90
Date: 2025-06-21T05:25:57.631Z
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Linux x64 5.15.0-136-generic

Same problem. Gemini is terrible at editing files: constant failures and corrupted files.

@danperks Could you please comment on the status of this and similar bugs?

Is this bug still not fixed?