Where does the bug appear (feature/product)?
Cursor IDE
Describe the Bug
On Windows, Cursor’s Agent (Write and StrReplace tools) often persists code files as UTF-16 LE when UTF-8 is expected. This is not fixed by standard VS Code/Cursor settings (files.encoding: utf8, files.autoGuessEncoding: false) or by .editorconfig with charset = utf-8.
The code looks correct in the chat, but on disk it breaks runtimes and tooling (Node.js, TypeScript, psql, etc.), causing rework, excessive token usage, and slower development.
Environment
OS: Windows 10/11 (e.g. Windows_NT x64 10.0.26200)
Cursor: recent version (Agent / Composer)
Stack: TypeScript, JavaScript, SQL, JSON (Node + PostgreSQL repos)
Workspace: local project on disk (not WSL-only)
Observed behavior
The Agent uses Write (new file or full overwrite) or StrReplace to edit code.
The file is saved as UTF-16 LE (each ASCII character is followed by a 0x00 byte, e.g. 69 00 6D 00 70 00 for the first three characters of import).
When running the project:
Node.js: SyntaxError: Invalid or unexpected token at line 1, often showing only i (first byte of import in UTF-16 read as UTF-8).
TypeScript: invalid character errors.
psql: syntax error at the start of SQL files.
Chat output is fine; the issue is only on-disk file encoding.
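The byte pattern described above can be checked with a small script. This is a minimal sketch in Python, assuming Python is available next to the Node toolchain. The function name looks_like_utf16le and the 512-byte sample size are my own choices for illustration. It is a heuristic for mostly-ASCII source files, not a general encoding detector.

```python
def looks_like_utf16le(data: bytes) -> bool:
    """Heuristic check for ASCII-heavy text stored as UTF-16 LE."""
    # An FF FE byte order mark is the clearest signal.
    if data.startswith(b"\xff\xfe"):
        return True
    sample = data[:512]  # arbitrary sample size (assumption)
    if len(sample) < 4:
        return False
    even = sample[0::2]  # low bytes: the actual ASCII characters
    odd = sample[1::2]   # high bytes: 0x00 for ASCII characters
    # ASCII source in UTF-16 LE has 0x00 in (almost) every odd position
    # and no 0x00 in the even positions.
    return odd.count(0) >= 0.9 * len(odd) and even.count(0) == 0


print(looks_like_utf16le("import x".encode("utf-16-le")))  # True
print(looks_like_utf16le(b"import x"))                     # False
```

Files with many non-ASCII characters would need a stricter check, so treat a positive result as a prompt to inspect the hex dump rather than as proof.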
What does not fix it (already tried)
"files.encoding": "utf8" and "files.autoGuessEncoding": false in .vscode/settings.json
.editorconfig with charset = utf-8
Manual save in the editor (after the Agent has already written UTF-16)
Project rules instructing “always UTF-8” — the Agent still writes UTF-16 for new files or large rewrites
Conclusion: the Agent’s write pipeline appears to bypass editor encoding settings and select UTF-16 LE on Windows.
Business impact (tokens and velocity)
Wasted tokens: each corrupted file triggers another Agent cycle to diagnose, explain, run conversion scripts, or rewrite the file — often 2–3× the tokens of the original task.
Extra verification step: after most Agent-created/edited files we must verify encoding (hex dump, node --check, tsc, psql, UTF-16 detection scripts).
False “done” state: the Agent reports success, but the app does not run until encoding is fixed.
Fragile workarounds: the community relies on afterFileEdit hooks, avoiding the Write tool, copying UTF-8 templates, writing files via shell or Python, and pre-commit fixes, all of which live outside the product.
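For reference, the kind of conversion script mentioned above can look like the following. It is only a sketch of one possible workaround, not something Cursor ships. The function name convert_utf16le_to_utf8 and the sample file name controller.js are hypothetical, and the "contains a 0x00 byte" test is a crude assumption that suits plain source files only.

```python
import tempfile
from pathlib import Path


def convert_utf16le_to_utf8(path: Path) -> bool:
    """Rewrite a UTF-16 LE file as UTF-8 without BOM. Returns True if changed."""
    raw = path.read_bytes()
    has_bom = raw.startswith(b"\xff\xfe")
    # Crude assumption: source files without a BOM and without any 0x00
    # byte are already single-byte friendly and are left untouched.
    if not has_bom and b"\x00" not in raw:
        return False
    payload = raw[2:] if has_bom else raw
    text = payload.decode("utf-16-le")  # raises UnicodeDecodeError if not valid UTF-16 LE
    path.write_bytes(text.encode("utf-8"))  # plain UTF-8, no BOM
    return True


# Demo on a temporary file that mimics an Agent-written controller.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "controller.js"
    sample.write_bytes("import x from 'y';\n".encode("utf-16-le"))
    changed = convert_utf16le_to_utf8(sample)
    result = sample.read_bytes()

print(changed, result)  # True b"import x from 'y';\n"
```

Writing with write_bytes avoids any platform newline translation, so the line endings in the file stay exactly as the Agent produced them.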
Community threads (same symptoms):
Steps to Reproduce
On Windows, open a workspace with a Node/TypeScript project.
Ask the Agent to create a new .js or .ts file with the Write tool (or fully rewrite an existing file).
Inspect the file on disk:
Hex: XX 00 XX 00 pattern or FF FE BOM
Or: node --check path/file.js → SyntaxError at line 1
Convert the same content to UTF-8 (no BOM) → error goes away.
Real example: a controller .js file starts with 69 00 6d 00 70 00 6f 00 72 00 74 00 instead of 69 6d 70 6f 72 74 (UTF-8 import).
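The two byte sequences in the real example can be reproduced directly, which also shows why Node stumbles right after the first character. A short Python illustration (the variable names are mine):

```python
text = "import"

utf8_hex = text.encode("utf-8").hex(" ")
utf16_hex = text.encode("utf-16-le").hex(" ")
print(utf8_hex)   # 69 6d 70 6f 72 74
print(utf16_hex)  # 69 00 6d 00 70 00 6f 00 72 00 74 00

# Reading the UTF-16 LE bytes as UTF-8 yields a NUL character after
# every letter, so a parser sees "i" followed by an unexpected token.
misread = text.encode("utf-16-le").decode("utf-8")
print(repr(misread))  # 'i\x00m\x00p\x00o\x00r\x00t\x00'
```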
Expected Behavior
All Agent writes on Windows should use UTF-8 (ideally without BOM), consistent with files.encoding and the ecosystem (Node, npm, Git, PostgreSQL).
Write and StrReplace should follow the workspace encoding policy.
Users should not need extra Agent turns just to fix encoding corruption caused by the Agent itself.
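To make the expectation concrete, this is roughly what "always UTF-8, no BOM" means for a write path. It is a hedged sketch in Python of the shell/Python style of workaround, not a description of how Cursor's Write tool is implemented. The helper name write_utf8 and the file name service.ts are hypothetical.

```python
import tempfile
from pathlib import Path


def write_utf8(path: Path, content: str) -> None:
    """Write text as UTF-8 without BOM, regardless of the OS default encoding."""
    # encoding="utf-8" (not "utf-8-sig") means no BOM is emitted.
    # newline="" keeps the line endings in `content` unchanged.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(content)


# Demo: the bytes on disk match the UTF-8 encoding of the text.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "service.ts"
    write_utf8(target, "export const ok = true;\n")
    on_disk = target.read_bytes()

print(on_disk)  # b'export const ok = true;\n'
```

The point of the explicit encoding argument is that nothing depends on the Windows locale or console code page, which is the behavior I would expect from the Agent's own writes.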
Operating System
Windows 10/11
Version Information
Version: 3.6.31 (user setup)
VS Code Extension API: 1.105.1
Commit: 81fcf2931d7687b4ff3f3017858d0c6dee7e2a60
Date: 2026-05-31T17:46:29.630Z
Layout: editor
Build Type: Stable
Release Track: Default
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
xterm.js: 6.1.0-beta.220
OS: Windows_NT x64 10.0.26200
For AI issues: which model did you use?
Agent (Composer) with model set to “Auto” on Windows.
For AI issues: add Request ID with privacy disabled
I am not disabling Privacy Mode and I do not have a Request ID to share. I am not sure what “Request ID with privacy disabled” means in practice, and I prefer to keep privacy settings as they are.
The issue is not occasional: it happens dozens of times per day in real development work. Almost every Agent session that creates or heavily edits files (.ts, .tsx, .js, .sql) on Windows risks writing UTF-16 LE instead of UTF-8. I then have to detect the corruption (Node/TypeScript/psql errors), run conversion scripts, or ask the Agent again — which wastes a large amount of tokens and time.
I can reproduce on demand: ask the Agent to Write a new .js/.ts file, then inspect the hex or run node --check. The Cursor version is listed under Version Information above.
Additional Information
This is a daily, high-volume problem for me — not a one-off. I use Agent all day with Auto. Each corrupted file often costs an extra Agent turn (diagnosis + fix), so the token and time impact is severe across a full workday.
Does this stop you from using Cursor
No - Cursor works, but with this issue