I enabled yolo mode for the convenience of automatically running tests and fixing those that didn’t pass. Though this one caught be off guard, where the agent decided visual studio was interfering with its mission and just decided to kill it. I sort of got a glimpse of what Nick Bostrom is worried about.
Anyone had some gone wild experiences with yolo mode. What commands should I watch out for?
Yikes, that’s definitely not expected behavior!
The agent should never kill processes without explicit permission
You can prevent this by configuring command restrictions in Settings. I’d recommend adding kill
, taskkill
and similar process termination commands to the deny list
For running tests, you really only need to allow test-specific commands like npm test
, pytest
etc. Everything else should probably require manual approval
Check out our Agent docs for more details on configuring these safety guardrails