Where does the bug appear (feature/product)?
Cursor IDE
Describe the Bug
When the agent edits a file encoded as UTF-8 with BOM, it silently strips the Byte Order Mark when rewriting the file. This changes the encoding to plain UTF-8 (without BOM), which corrupts projects that depend on BOM-encoded source files.
I’m the maintainer of msaccess-vcs-addin, a popular open-source add-in that builds Microsoft Access databases from source files. The add-in expects source files encoded as UTF-8 with BOM; if the BOM is missing, it misinterprets the encoding and the build fails. The repository has hundreds of source files, and even a single stripped BOM can break the entire build.
I’ve attempted to work around this with Cursor rules that instruct the agent to verify the BOM after every edit, plus a PowerShell script to re-insert missing BOMs. Even with these measures, the rules are not consistently followed and builds still get corrupted regularly.
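To illustrate what the repair step involves, here is a minimal Python sketch. It is not the PowerShell script mentioned above, and the function name `ensure_bom` is hypothetical. It assumes the file is already valid UTF-8 and only prepends the three BOM bytes when they are absent.

```python
import codecs
from pathlib import Path


def ensure_bom(path):
    """Prepend the UTF-8 BOM if it is missing. Return True if the file was changed."""
    path = Path(path)
    data = path.read_bytes()  # raw bytes, so line endings are left untouched
    if data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(codecs.BOM_UTF8 + data)
    return True
```

Running something like this after every agent edit restores the BOM, but it is a workaround that has to be triggered reliably, which is exactly what fails in practice.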
Steps to Reproduce
- Open a repository in Cursor containing files encoded as UTF-8 with BOM.
- Use the agent to edit one of these files.
- After the edit, inspect the file encoding (e.g., check the first three bytes for the BOM signature EF BB BF using a hex editor, or run Format-Hex -Path .\file.txt -Count 3 in PowerShell).
- Observe that the BOM has been stripped; the file is now plain UTF-8.
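The check in the last two steps can also be scripted across a whole repository. The following is a rough Python sketch; the function names and the glob pattern argument are assumptions made for illustration, not part of Cursor or msaccess-vcs-addin.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"  # the EF BB BF signature


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM."""
    with open(path, "rb") as f:
        return f.read(3) == UTF8_BOM


def files_missing_bom(root, pattern):
    """List files under root matching pattern (hypothetical, e.g. "*.txt") that lack a BOM."""
    return sorted(
        p for p in Path(root).rglob(pattern)
        if p.is_file() and not has_utf8_bom(p)
    )
```

Calling `files_missing_bom(".", "*.txt")` before and after an agent edit shows which files lost their BOM.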
Expected Behavior
Cursor should detect the file’s original encoding (including the presence of a BOM) before making edits, and preserve that encoding when writing the file back to disk. Edits should change only the content the agent was asked to modify, not the file encoding.
Operating System
Windows 10/11
Version Information
Version: 2.4.31 (user setup)
VSCode Version: 1.105.1
Commit: 3578107fdf149b00059ddad37048220e41681000
Date: 2026-02-08T07:42:24.999Z
Build Type: Stable
Release Track: Default
Electron: 39.2.7
Chromium: 142.0.7444.235
Node.js: 22.21.1
V8: 14.2.231.21-electron.0
OS: Windows_NT x64 10.0.26100
For AI issues: which model did you use?
Opus 4.5, Compose-1, and others. It does not seem to matter which model is used.
Additional Information
This likely affects any encoding-sensitive workflow on Windows, not just those that rely on a BOM. The suggested fix is straightforward: before writing a file, read and record its original encoding; after applying edits, write it back with the same encoding. This is standard behavior for any editor and would also address potential issues with other encodings (UTF-16, Latin-1, etc.).
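A minimal sketch of that read-record-write approach, in Python for brevity. This is not how Cursor is implemented; all function names are hypothetical. It detects the encoding only from a BOM, so BOM-less encodings such as Latin-1 cannot be recognized this way and fall back to an assumed default (`fallback`), which a real implementation would need to handle more carefully.

```python
import codecs

# (BOM bytes, codec name). UTF-32 comes first because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def read_with_encoding(path, fallback="utf-8"):
    """Return (text, codec name, BOM bytes) for the file at path."""
    with open(path, "rb") as f:
        raw = f.read()
    for bom, enc in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(enc), enc, bom
    # No BOM found: the fallback codec is an assumption, not a detection.
    return raw.decode(fallback), fallback, b""


def write_with_encoding(path, text, enc, bom):
    """Write text back using the recorded codec and the original BOM (if any)."""
    with open(path, "wb") as f:  # binary mode: no newline translation
        f.write(bom + text.encode(enc))


def edit_preserving_encoding(path, transform, fallback="utf-8"):
    """Apply transform(text) -> text while keeping the file's encoding and BOM."""
    text, enc, bom = read_with_encoding(path, fallback)
    write_with_encoding(path, transform(text), enc, bom)
```

For example, `edit_preserving_encoding("file.txt", lambda s: s.replace("old", "new"))` changes only the content; a file that started with EF BB BF still starts with EF BB BF afterwards.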
Does this stop you from using Cursor
Sometimes - I can sometimes use Cursor