Where does the bug appear (feature/product)?
Cursor IDE
Describe the Bug
When Cursor AI applies edits to files encoded in Windows-1252, the € symbol (euro sign, byte 0x80) gets corrupted and replaced with “?”. This appears to be an architectural limitation where AI-generated text is always UTF-8 encoded before being inserted into files.
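For anyone who wants to see the byte-level difference, here is a minimal Python sketch of how the euro sign is represented in each encoding. The lossy `latin-1` conversion is only an assumption about one way a “?” could be produced; it is not a confirmed description of what Cursor does internally.

```python
# Sketch: how the euro sign is represented in Windows-1252 vs UTF-8.
EURO = "\u20ac"  # the € character

cp1252_bytes = EURO.encode("cp1252")  # single byte 0x80
utf8_bytes = EURO.encode("utf-8")     # three bytes E2 82 AC

# Assumption (illustrative only): a lossy conversion to a charset that has
# no euro sign, with errors="replace", substitutes "?" for the character.
lossy = EURO.encode("latin-1", errors="replace")

# The Windows-1252 byte 0x80 is not valid UTF-8 on its own, so a pipeline
# that assumes UTF-8 cannot read it back as €.
try:
    cp1252_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

print(cp1252_bytes, utf8_bytes, lossy, decodes_as_utf8)
```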
Steps to Reproduce
Create a workspace with files.encoding set to windows1252 in settings
Open a PHP file containing the € symbol (e.g., "Prezzo: € 1.000")
Ask Cursor AI to make any modification to that file
Accept/Apply the AI changes
All € symbols in the modified sections become “?”
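To make the repro easier to share, a test file can be generated with a short script. This is only a sketch: the helper names `write_fixture` and `count_euro_bytes` and the file contents are hypothetical, chosen to match the example above.

```python
from pathlib import Path


def write_fixture(path: Path) -> None:
    """Write a small PHP file encoded as Windows-1252 that contains one €."""
    source = '<?php\necho "Prezzo: \u20ac 1.000";\n'
    path.write_bytes(source.encode("cp1252"))


def count_euro_bytes(path: Path) -> int:
    """Count raw 0x80 bytes, i.e. euro signs in a Windows-1252 file.

    Only meaningful while the file is still Windows-1252: in a UTF-8 file,
    0x80 can also occur as a continuation byte of other characters.
    """
    return path.read_bytes().count(b"\x80")
```

Running `count_euro_bytes` before and after an AI edit gives a quick check of whether any € bytes were lost.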
What I’ve already tried (nothing worked)
Setting files.encoding: windows1252 in .vscode/settings.json
Setting files.autoGuessEncoding: false
Creating a multi-root workspace (.code-workspace) with per-folder encoding settings
Adding .cursorrules instructing AI to preserve € symbols
Setting files.eol to CRLF
Expected Behavior
Cursor should detect the target file’s encoding and convert AI-generated text accordingly before applying edits, preserving special characters like €.
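To illustrate what I mean, here is a rough Python sketch of an encoding-preserving edit. It is not how Cursor works internally; the function name `apply_edit_preserving_encoding` and the simple string replacement are assumptions made for the example.

```python
from pathlib import Path


def apply_edit_preserving_encoding(
    path: Path, old: str, new: str, encoding: str = "cp1252"
) -> None:
    """Apply a text replacement and write the file back in its own encoding."""
    # Work on raw bytes so line endings (e.g. CRLF) are left untouched.
    text = path.read_bytes().decode(encoding)
    updated = text.replace(old, new)
    # errors="strict" makes an unrepresentable character raise an error
    # instead of being silently replaced with "?".
    path.write_bytes(updated.encode(encoding, errors="strict"))
```

With this approach the edit is decoded from and re-encoded to the file's original encoding, so a € stays as byte 0x80 rather than being rewritten as UTF-8 or dropped.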
For AI issues: which model did you use?
All models (tested with Opus 4.5, same issue)
Additional Information
This makes Cursor AI unusable for legacy projects that cannot be migrated to UTF-8. Many enterprise/legacy codebases in Europe use Windows-1252 encoding and contain currency symbols (€) throughout the codebase.
Is this an architectural limitation? Is there any plan to support encoding conversion for AI edits?
Does this stop you from using Cursor?
Yes - Cursor is unusable (for this specific legacy project)
For AI issues: add Request ID with privacy disabled
Request ID: f9a7046a-279b-47e5-ab48-6e8dc12daba1
Hey, thanks for the detailed report. This is a known issue: the Agent in 2.4.x forces all files to be saved as UTF-8 and ignores the original encoding.
You’re not alone. There are already 8+ similar threads for Windows-1252, EUC-KR, and GB2312/GBK. The team is aware and the bug is logged.
Workarounds:
Manually re-save after every AI edit:
After the Agent changes the file: bottom-right corner of the editor → click the encoding → “Reopen with Encoding” → Windows-1252
Then “Save with Encoding” → Windows-1252
Annoying, but it works
Try Ctrl+K (inline edit) instead of Agent. It might break encoding less often, but that’s not guaranteed.
I see you tried .cursorrules, .code-workspace, and files.encoding. You’re right, this is an architectural issue in how Agent applies edits. Settings are ignored right now.
Thanks for confirming — that matches what I’m seeing.
Could you share the public issue ID / tracker link for this bug, and whether there’s an ETA (or at least the target version) for a fix? This is a hard blocker for legacy Windows-1252 codebases (common in EU enterprise projects).
Manual “Reopen/Save with Encoding” after every AI edit is not a viable workaround in real workflows.
Also: is the fix planned specifically for Agent mode only, or for all AI-applied edits (including inline edits / Ctrl+K)?
I’m sorry, but the workaround of reopening the file isn’t working for me. The problem is quite disruptive because it alters important, functional parts of the code: a character such as “-” inside a .split call may be treated as special and changed as well.
Thanks for the update. Yeah, you’re right. The “Reopen/Save with Encoding” workaround doesn’t help because the corruption happens on the server side before the code comes back to the editor. By the time you see the diffs, the characters have already been replaced with “?”.
What actually works right now (confirmed by other users):
Roll back to version 2.3.41 from the Cursor download page and turn off auto-update
A few users in similar threads confirmed the issue doesn’t happen on 2.3.x
I know it’s not ideal, but it’s the only option for legacy codebases until a fix is released.
I’ll roll back to 2.3.41 and disable auto-update as suggested, since this is currently a hard blocker for Windows-1252 legacy codebases.
Really appreciate the help and the confirmations from the community. Cursor is genuinely very valuable in my workflow, so I’m looking forward to a proper fix in 2.4+ when it lands.
If there’s any public tracker/issue link or release note I should watch, please share it — I’ll happily retest as soon as a build includes the fix. Thanks again!
I had the same problem. I solved it by adding a new user rule:
Overview: Always preserve UTF-8 encoding and special characters (accents, ç, ñ, etc.) when modifying code. Never replace special characters with question marks or other symbols. All file edits must maintain the original character encoding and preserve all Catalan characters (à, è, é, í, ò, ó, ú, ç, ñ, etc.) exactly as they appear in the original code.
I can confirm that for accents and many special characters, user rules help on my side too.
Unfortunately, for the € symbol (Windows-1252, byte 0x80) nothing has worked: as mentioned above, the corruption seems to happen server-side when applying edits (especially in Agent / 2.4.x). By the time the diff comes back to the editor, the character has already been replaced with “?”, so rules can’t really prevent it.
If you don’t mind, could you try a quick test on your side?
The exact repro steps are in my post above (a Windows-1252 file with something like Prezzo: € 1.000, then any AI edit and apply).
I’d love to know whether, with your user rule, the € stays intact or still gets turned into “?”.
My code was UTF-8 encoded to begin with, so it is not exactly the same issue you experienced. I tried with the € symbol and it works fine.
The second test I ran for you was to generate an ANSI-encoded file (which I think is the same as the Windows-1252 you mentioned) with accents and the ‘€’ symbol. In this case:
a) Before any Agent modification, the special characters are already displayed incorrectly in the Cursor IDE.
b) After an Agent modification, the file becomes UTF-8 encoded, and ALL the special characters are lost.
Both a) and b) occurred with my user rule active, so in this case the user rule does not seem to help.
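In case it helps others check for effect b), here is a small Python sketch that guesses whether a file has been converted to UTF-8. It is a heuristic under stated assumptions, and the function name `looks_like_utf8` is my own: non-ASCII content that decodes cleanly as UTF-8 is treated as UTF-8.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data contains non-ASCII bytes and is valid UTF-8."""
    if data.isascii():
        # Pure ASCII is byte-identical in UTF-8 and Windows-1252,
        # so it tells us nothing about a conversion.
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Typical for Windows-1252 text: bytes such as 0x80 (€) or 0xE9 (é)
        # are not valid UTF-8 sequences on their own.
        return False
    return True
```

Calling it on the file's raw bytes before and after an Agent edit shows whether the encoding changed.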
Thanks a lot for testing this so thoroughly — super helpful.
That confirms what I’m seeing: the user rule may help when the file is already UTF-8, but it doesn’t solve the real blocker for ANSI / Windows-1252 files. In your ANSI test, Cursor already mis-displays special chars before any Agent edit, and after an Agent modification the file gets converted to UTF-8 and the special characters are lost anyway — even with the rule enabled.
So it looks like this isn’t something user rules can fix; it’s likely an encoding-handling issue in the Agent/apply-edits pipeline (possibly server-side) rather than a matter of prompt guidance.
For now, the only viable workaround we’ve found is rolling back to Cursor 2.3.41 and disabling auto-update for legacy Windows-1252 codebases, until a proper fix lands. If you hear of any public issue/tracker link or a build that specifically mentions encoding preservation for Agent edits, please share — I’m happy to retest immediately.