I asked the agent to edit a few files, and after the edits some characters in the code are displayed incorrectly. The agent itself rewrote those characters into garbled sequences, but the Cursor diff is also inconsistent: the way it shows the original characters does not match how they actually were in the file. See the screenshot and compare the right pane (chat view) with the left pane (the file opened with the diff). The diff shows those characters incorrectly.
Steps to Reproduce
Have the AI edit a file that includes comments with characters like ‘-’ or ‘á’, and see whether it also breaks the Cursor diff view.
Expected Behavior
The ‘red’ part of the diff should show the characters that were there before the edit, not garbled character combinations.
This is the first time I’ve seen this, and it has already happened more than 3 times today. I had to direct it to go edit by edit instead of doing a full run, and then it behaved correctly, but at some point it derails and chooses chaos again.
Hey, thanks for the report and the screenshot. In the diff view, the corrupted characters are clearly visible.
This looks like an encoding mismatch. The agent may be reading your file as UTF-8 even though it’s actually saved in an older Windows encoding (for example, Windows-1252). That can break diacritics like á and also turn special dash characters into garbage.
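To make the mismatch concrete, here is a minimal Python sketch. It is an illustration only and not Cursor’s actual code path; the sample string is made up. It shows both directions: UTF-8 bytes decoded as Windows-1252 (which produces the typical two- and three-character garbage), and Windows-1252 bytes decoded as UTF-8 (which produces replacement characters).

```python
# Illustration only: how an encoding mismatch garbles non-ASCII text.
# The sample string is hypothetical, not taken from the reported file.
original = "c\u00e1lculo \u2013 nota"  # contains "á" and an en dash

# Direction 1: UTF-8 bytes wrongly decoded as Windows-1252 (classic mojibake).
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# Direction 2: Windows-1252 bytes wrongly decoded as UTF-8.
# Invalid byte sequences become U+FFFD replacement characters.
cp1252_bytes = original.encode("cp1252")
replaced = cp1252_bytes.decode("utf-8", errors="replace")
print(replaced)

# Reversing direction 1 recovers the original text when no bytes were lost.
recovered = garbled.encode("cp1252").decode("utf-8")
assert recovered == original
```

If the garbage in the diff resembles the first printed line, the text was probably UTF-8 that got decoded with a single-byte Windows codec somewhere along the way.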
A couple of questions to narrow it down:
What encoding is the file saved in? You can check this in Cursor’s status bar at the bottom right. It should say something like “UTF-8” or “Windows 1252”.
If it’s not UTF-8, try converting the file to UTF-8. Click the encoding in the status bar, then “Save with Encoding” and pick “UTF-8”. After that, check if the agent keeps the characters correctly.
This is a known risk area on Windows when files aren’t UTF-8. Let me know what encoding your files use.
And actually I didn’t give the full context: I’m using Cursor on Windows, but right now I’m working on a Debian machine over SSH, so the files there have nothing to do with Windows even though I’m accessing them from Windows.
I hope I haven’t confused you more; I’m sick with a fever and have been working too many hours for my brain to function properly.
Ok, that’s important context. UTF-8 files on Debian over SSH change things.
A few questions so I can understand:
Is it only broken in the diff view, or are the actual files corrupted too? If you open the file directly (not via diff), do the characters look normal or broken there too?
What locale is set on the Debian machine? Can you check by running locale in the terminal?
Can you share the chat Request ID where this happened? (Top right of the chat → Copy Request ID)
There are similar bug reports about encoding corruption during AI edits, but mostly for non-UTF-8 files. Your case with UTF-8 over SSH might be a separate bug. The request ID will help the team dig deeper.
Also, feel better and don’t overwork yourself while you’ve got a fever.
Okay, now I’m having a hard time reproducing it (I reverted all the weird character edits, to the point that I can no longer restore the corrupted code). I’m trying to guide the agent into making more changes so I can force the error again, but now it’s behaving correctly (why, clanker, why). But I can answer question 2:
I can’t properly answer question 3 either, because all the relevant request IDs are gone now, but if you want to check the current chat, req_id = 817da61b-fb0c-4ce0-9586-80af47f04b3d
My hypothesis is that, even though I’m connected over SSH to a Debian machine, Windows is still in the middle when text is passed to the agent and back, so there might be an encoding issue somewhere in that Windows middleware; and depending on how the agent chooses to edit the code, it either happens or not.
A bit more context that I just remembered: when I directed it to edit JUST a single part of a file, most of the time it did so properly, but SOMETIMES it tried to rewrite the whole file even if the changes were just 5 lines in a 500-line file, thereby replacing the semi-weird characters with those ultra-weird ones. Also, I’ve worked with this Debian SSH setup in Cursor every day for more than a year, and this is the first time I’ve seen it happen.
Locale looks fine. C.UTF-8 shouldn’t cause any issues.
Your idea about Windows middleware during SSH could be right. The most interesting part is that it breaks only when the agent rewrites the whole file instead of making small edits. That could mean the encoding layer handles the content differently on a full rewrite than it does with str_replace-style changes.
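As a purely hypothetical illustration of why a full rewrite could behave differently from a targeted edit, consider the Python sketch below. Both functions and the sample content are made up for this example and do not reflect how Cursor actually applies edits; they only show that a wrong decode on a whole-file rewrite damages every non-ASCII character, while a span-level replacement leaves untouched bytes alone.

```python
# Hypothetical illustration of the full-rewrite vs. targeted-edit hypothesis.
# Neither function reflects Cursor's real implementation.

def full_rewrite_via_wrong_codec(data: bytes, old: str, new: str) -> bytes:
    """Decode the whole file with the wrong codec, edit, write back as UTF-8."""
    text = data.decode("cp1252")  # wrong on purpose: data is really UTF-8
    return text.replace(old, new).encode("utf-8")

def targeted_byte_replace(data: bytes, old: str, new: str) -> bytes:
    """Replace only the matching span; all other bytes pass through untouched."""
    return data.replace(old.encode("utf-8"), new.encode("utf-8"), 1)

# Made-up file content: a comment with "á" followed by one line of code.
source = "# c\u00e1lculo del total\nvalue = 1\n".encode("utf-8")

rewritten = full_rewrite_via_wrong_codec(source, "value = 1", "value = 2")
patched = targeted_byte_replace(source, "value = 1", "value = 2")

print(rewritten.decode("utf-8"))  # the comment is now double-encoded
print(patched.decode("utf-8"))    # the comment is intact
```

In this sketch the edit itself only touches ASCII text, yet the full rewrite still corrupts the comment, which would match the observation that a 5-line change to a 500-line file replaced characters far away from the edit.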
Here’s what I suggest:
If it happens again, copy the Request ID from this chat right away and send it here. Also check whether the files on disk are actually corrupted or whether it’s only the diff view: open the file directly, not in the diff.
If you can reproduce it with a new ID, I’ll pass it to the team to investigate.
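For checking whether the files on disk are really affected, a small script run on the Debian machine can help. The following is a rough Python sketch, not an official tool: the function names and the marker list are assumptions chosen for this example, and the check is a heuristic (a legitimate “Ã” in the text would also be flagged).

```python
from pathlib import Path

# Sequences that commonly appear when UTF-8 text was decoded as Windows-1252,
# plus the U+FFFD replacement character. Heuristic list, assumed for this sketch.
SUSPECT_MARKERS = ("\u00c3", "\u00e2\u20ac", "\ufffd")

def find_suspect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that contain a suspect sequence."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(marker in line for marker in SUSPECT_MARKERS):
            hits.append((number, line))
    return hits

def scan_file(path: str) -> list[tuple[int, str]]:
    """Read a file from disk as UTF-8 and report suspect lines.

    Bytes that are not valid UTF-8 become U+FFFD and are reported too.
    """
    raw = Path(path).read_bytes()
    return find_suspect_lines(raw.decode("utf-8", errors="replace"))
```

Calling scan_file on an affected file and getting an empty list would suggest that the bytes on disk are fine and only the diff rendering is wrong.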