Describe the Bug
I think there is a safety gate missing… The model worked nicely to add parameters to python to allow for “–test” vs. “–no-test” mode (production) and that flag controls if it actually posts to an API that affects the real world. After I had it do some refactoring, after I allowed it to run commands, it finished refactoring and then ran the “test” in non-test mode. That seems like a pretty basic violation.
Steps to Reproduce
I think it would be hard to reproduce, these things are non-deterministic.
Expected Behavior
It should not run things that obviously affect production if it is given permission to run commands.
Screenshots / Screen Recordings
Operating System
MacOS
Current Cursor Version (Menu → About Cursor → Copy)
Version: 1.2.1 (Universal)
VSCode Version: 1.99.3
Commit: 031e7e0ff1e2eda9c1a0f5df67d44053b059c5d0
Date: 2025-07-03T06:08:06.355Z
Electron: 34.5.1
Chromium: 132.0.6834.210
Node.js: 20.19.0
V8: 13.2.152.41-electron.0
OS: Darwin arm64 24.5.0
Does this stop you from using Cursor
Sometimes - I can sometimes use Cursor