Agent Write/StrReplace tools output GBK-encoded files on Chinese Windows — persists in 3.0.13 even after ACP changed to 65001

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

Agent mode — Write tool and StrReplace tool

Cursor version: 3.0.13
OS: Windows 11 (build 26200), Simplified Chinese locale

Description
When Agent writes or edits a file containing Chinese characters (or any non-ASCII text), the file is saved in GBK encoding instead of UTF-8. This happens silently — the content looks correct in the chat window, but the file on disk is GBK-encoded, causing UnicodeDecodeError at runtime and garbled display in the editor.

This bug was first reported for 2.4.x (#150243, #149887). It is still present in 3.0.13.

Steps to Reproduce
1. Use Cursor on a Simplified Chinese Windows machine (system locale zh-CN).
2. Ask Agent to create a new Python file with Chinese comments or strings.
3. Open the generated file; it displays garbled characters (乱码).
4. Verify with:

with open("the_file.py", "rb") as f:
    raw = f.read()

raw.decode("utf-8")  # raises UnicodeDecodeError
raw.decode("gbk")    # succeeds, which confirms the file is GBK

What I tried (none of these fixed it)
| Attempted fix | Effect |
| --- | --- |
| .vscode/settings.json: "files.encoding": "utf8", "files.autoGuessEncoding": false | No effect on Agent writes |
| Windows Beta UTF-8 mode (changed system ACP from 936 → 65001, rebooted) | No effect; Agent still writes GBK |
| PYTHONUTF8=1 set as user-level env var | No effect |
| terminal.integrated.env.windows: PYTHONIOENCODING=UTF-8 | No effect |
| .editorconfig with charset = utf-8 | No effect |

The fact that changing the system ACP to 65001 has no effect suggests the Agent’s file write path does not use the system default encoding at all. It appears to have a hardcoded or cached encoding that maps to GBK on zh-CN Windows regardless of system settings.
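To confirm that the ACP change actually took effect before blaming the Agent, the values the OS and Python report can be printed. This is only a diagnostic sketch: it shows what Python sees on the machine, not what Cursor's write path does internally, and the function name `report_encodings` is made up for illustration.

```python
import locale
import sys


def report_encodings() -> dict:
    """Collect the encodings that Python and (on Windows) the OS report."""
    info = {
        # Locale-derived encoding that Python falls back to for text I/O.
        "preferred": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }
    if sys.platform == "win32":
        import ctypes

        # GetACP() returns the active ANSI code page: 936 is GBK, 65001 is UTF-8.
        info["acp"] = ctypes.windll.kernel32.GetACP()
    return info


if __name__ == "__main__":
    for name, value in report_encodings().items():
        print(f"{name}: {value}")
```

If `acp` prints 65001 and Agent-written files still decode only as GBK, that supports the observation that the write path is not reading the system default.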

Scope of impact
After scanning the entire project, I found that 12 text files (.py, .md, .json) had been silently corrupted to GBK by previous Agent sessions, including files I never manually edited.
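A project-wide scan like the one described can be sketched roughly as follows. The suffix set mirrors the file types named above; the function names `classify` and `find_gbk_files` are hypothetical, and "decodes as GBK" is only a heuristic, since bytes from other legacy encodings may also decode as GBK without error.

```python
from pathlib import Path

TEXT_SUFFIXES = {".py", ".md", ".json"}  # assumption: file types from the report


def classify(raw: bytes) -> str:
    """Return 'utf-8', 'gbk', or 'unknown' for a file's raw bytes."""
    for name in ("utf-8", "gbk"):
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return "unknown"


def find_gbk_files(root: str) -> list[Path]:
    """List text files under root that fail as UTF-8 but decode as GBK."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in TEXT_SUFFIXES:
            if classify(path.read_bytes()) == "gbk":
                hits.append(path)
    return hits


if __name__ == "__main__":
    for p in find_gbk_files("."):
        print(p)
```

UTF-8 is tried first on purpose: pure-ASCII files are valid in both encodings and should not be flagged.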

Expected behavior
Agent-written files should always be UTF-8, matching files.encoding and the .editorconfig charset declaration.

Workaround (for others hitting this)

Detect and fix GBK-encoded files back to UTF-8

python fix_encoding.py path/to/file.py

fix_encoding.py (minimal version)

import sys
from pathlib import Path

for arg in sys.argv[1:]:
    p = Path(arg)
    raw = p.read_bytes()
    try:
        raw.decode("utf-8")
        print(f"OK: {p}")
    except UnicodeDecodeError:
        # Write bytes, not text, so existing line endings are left untouched.
        p.write_bytes(raw.decode("gbk").encode("utf-8"))
        print(f"Fixed (GBK→UTF-8): {p}")

This is a silent data corruption bug. Because there’s no warning when a file is written with the wrong encoding, users may not notice until a runtime error occurs — or worse, may commit corrupted files to version control. Please prioritize this fix for CJK users.
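To keep corrupted files out of version control in the meantime, a check along these lines could be run on staged files before committing. It is a sketch, not an official hook: the names `non_utf8_files` and `main` are hypothetical, and how the script gets wired into a pre-commit hook or CI step is left as an assumption.

```python
import sys
from pathlib import Path


def non_utf8_files(paths: list[str]) -> list[str]:
    """Return the subset of paths whose bytes are not valid UTF-8."""
    bad = []
    for name in paths:
        try:
            Path(name).read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(name)
    return bad


def main(argv: list[str]) -> int:
    bad = non_utf8_files(argv)
    for name in bad:
        print(f"not valid UTF-8: {name}", file=sys.stderr)
    # A non-zero status lets a hook or CI step block the commit.
    return 1 if bad else 0


if __name__ == "__main__":
    status = main(sys.argv[1:])
    if status:
        sys.exit(status)
```

The script only reports; repairing the flagged files is still a job for the fix script above.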

Operating System

Windows 10/11

Version Information

Version: 3.0.13 (user setup)
VSCode Version: 1.105.1
Commit: 48a15759f53cd5fc9b5c20936ad7d79847d914b0
Date: 2026-04-07T03:05:17.114Z
Layout: editor
Build Type: Stable
Release Track: Default
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
OS: Windows_NT x64 10.0.26200

Does this stop you from using Cursor

Sometimes - I can sometimes use Cursor

This is a known issue with how Cursor’s Agent file write path handles encoding on Windows. Several fixes shipped in recent versions (most recently in March), but based on your report, the GBK-specific variant appears to persist in 3.0.13.

Other users have reported the same class of bug:

Your observation about ACP changes having no effect is consistent with how the encoding path works.

Your detailed reproduction case and workaround script are very helpful. I’ve shared this with our engineering team as an additional data point, specifically highlighting the GBK/zh-CN behavior that persists after the March encoding fixes.

For now, your Python fix script is the best workaround. You might also try creating a fresh workspace with no pre-existing GBK-encoded files to avoid the majority-detection feedback loop.

Actually, Opus 4.6 helped me identify the bug and wrote the fix script for me, which cost me at least $10 in tokens. So I really hope you can fix this SOON, thank you~

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

When using the Agent to create new code files, everything looks correct in the chat output (Chinese characters display normally). However, after the file is written to the workspace, the actual file content becomes garbled.

Steps to Reproduce

1. My workspace is configured to use UTF-8 encoding.
2. Use the Agent to create new code files that contain Chinese text.
3. The newly generated file contains unreadable characters.
4. To read the file correctly, I have to open it using GBK encoding and then manually re-save it as UTF-8.

It seems that the Agent is writing files using GBK encoding instead of UTF-8, despite the workspace being configured for UTF-8.
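The symptom is what one would expect when GBK bytes are read as UTF-8. The round trip below illustrates that with a sample string chosen for this example; it demonstrates the encoding mismatch in general and does not reproduce the Agent's write path.

```python
text = "中文注释"  # sample string, chosen only for this illustration

gbk_bytes = text.encode("gbk")     # what ends up on disk in the reported bug
utf8_bytes = text.encode("utf-8")  # what a UTF-8 workspace expects

# Reading GBK bytes as UTF-8 fails in strict mode; in lenient mode the
# invalid sequences show up as U+FFFD replacement characters.
garbled = gbk_bytes.decode("utf-8", errors="replace")
print(repr(garbled))

# Decoding with the right codec restores the text, so the content is intact
# and only needs re-encoding (the manual "open as GBK, save as UTF-8" step).
restored = gbk_bytes.decode("gbk")
assert restored == text
assert restored.encode("utf-8") == utf8_bytes
```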

Expected Behavior

The Agent should always write files using UTF-8 encoding, consistent with the workspace encoding. Chinese characters should be correctly preserved without requiring manual re-encoding.

Operating System

Windows 10/11

Version Information

Version: 3.1.14 (user setup)
VSCode Version: 1.105.1
Commit: d8673fb56ba50fda33ad78382000b519bb8acb70
Date: 2026-04-14T01:39:23.679Z
Layout: editor
Build Type: Stable
Release Track: Default
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
OS: Windows_NT x64 10.0.26200

For AI issues: which model did you use?

Composer2 gpt-5.4 Codex-5.3

For AI issues: add Request ID with privacy disabled

3e6bae18-8b10-4626-b5fc-e0135ae057ce

Does this stop you from using Cursor

No - Cursor works, but with this issue

If possible, I kindly request a quick resolution to this issue, as it has seriously impacted our normal work efficiency. The problem is not only related to GBK encoding, but also occasionally involves a large number of “?” characters that cannot be manually corrected. The only solution is to use an agent, which has resulted in me using a significant amount of additional tokens and time.
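The "?" case differs from the GBK case because it cannot be undone by re-decoding. One way text can end up like this is a lossy encode step; the snippet below shows that mechanism only, under the assumption that something similar happens somewhere in the pipeline, which this thread does not confirm.

```python
text = "中文"

# Assumption for illustration: some step encodes with errors="replace" into a
# codec that cannot represent the characters.
lossy = text.encode("ascii", errors="replace")
print(lossy)  # b'??'

# Every character collapsed to the same byte, so no decoder can tell the
# originals apart afterwards.
assert lossy == "汉字".encode("ascii", errors="replace")
```

Files in this state have to be regenerated or restored from version control, unlike GBK-encoded files, which can be converted back.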