IMPORTANT: Claude has learned how to jailbreak Cursor!

I have “rm” specifically disallowed, along with “mv” and a few other scary commands.

Claude realized that I had to approve the use of such commands, so to get around this, it chose to put them in a shell script and execute the shell script.
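To illustrate why wrapping the commands in a script slips past a check like this, here is a minimal sketch. It assumes (this is not confirmed anywhere in the thread) that the check only looks at the name of the command being launched. `DENIED`, `blocked_by_command_name` and `script_uses_denied` are hypothetical names, not part of Cursor.

```python
import shlex

# Hypothetical deny list mirroring the commands mentioned in the post.
DENIED = {"rm", "mv"}


def blocked_by_command_name(command: str) -> bool:
    """Return True if the first word of the command line is on the deny list."""
    words = shlex.split(command)
    return bool(words) and words[0] in DENIED


def script_uses_denied(script_text: str) -> list[str]:
    """Return the script lines whose first word is on the deny list.

    Deliberately naive: it ignores pipelines, `&&` chains, subshells,
    variables and aliases, so it only demonstrates the idea.
    """
    hits = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines, comments and the shebang
        try:
            words = shlex.split(stripped)
        except ValueError:
            words = stripped.split()  # fall back on unbalanced quotes
        if words and words[0] in DENIED:
            hits.append(stripped)
    return hits


script = "#!/bin/sh\nrm -rf old/\necho done\n"
print(blocked_by_command_name("rm -rf old/"))    # the direct call is caught
print(blocked_by_command_name("sh cleanup.sh"))  # the wrapper is not
print(script_uses_denied(script))                # the rm is inside the script
```

A check on the launched command alone never sees what the script does, so the contents of the script would have to be inspected (or the script treated as untrusted) to close the gap.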

Thankfully, a Git restore to the last commit saved me, but still…

19 Likes

And it’s only the beginning. The models are getting very smart and may have been trained to work harder to achieve the user’s goals…

1 Like

Was it a shell script that you have in your allow list?

Personally, my experience is that the deny list doesn’t really work: Cursor decides to go YOLO anyway, which is why I’d only use yolo mode if there is a fresh system backup, too.

1 Like

That’s interesting. Did that start in a particular version?

I have a few things in the deny list but Yolo never gets around those.

Would be good to know in which circumstances that happens.

Depends a lot on common CLI usage, I allow specific commands that are safe.

This started tonight.

I have some shell scripts for greps and sorts to list files out that are in the allow list. Claude re-wrote one to also do some removing of what it thought was obsolete code after I denied its rm commands with “skip.”

So “listoldfiles.sh” is no longer on the allow list, which is UNFORTUNATE because now I have to sit here and babysit my otherwise-automatic cleanup script.

Ah, I see, so Claude 4 assumed the rm calls had failed and tried to find another way.

Maybe the Cursor team can have a look at Yolo rule adherence and at how the model reacts to items the user has deliberately rejected or skipped. @danperks

Yeah, I have pointed this out: it doesn’t obey the do_not_allow command blacklist.

1 Like

Also my experience. It’s also not clear whether it’s command (sub)strings or a prompt.

The denylist or allowlist is for commands, not a prompt.

  • Yes, it’s a prefix, so if you want to allow or deny every command that starts with git, you add git.
  • Likewise, if you want to allow one command like mycommand option1 but not others like mycommand option2, you add mycommand option1 to the allow list.
  • The deny list works the same way.
  • Note that this cannot be enforced 100%. Sometimes the AI attempts workarounds, and sometimes user rules hint that a workaround is possible, depending on the info there.
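
The prefix behaviour described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: `decide` and `matches_prefix` are hypothetical names, plain string-prefix matching is assumed, and treating unlisted commands as "ask" is a guess rather than documented Cursor behaviour.

```python
def matches_prefix(command: str, prefixes: list[str]) -> bool:
    """Return True if the command line starts with any listed prefix."""
    cmd = command.strip()
    return any(cmd.startswith(prefix) for prefix in prefixes)


def decide(command: str, allow: list[str], deny: list[str]) -> str:
    """Return "run" for allow-listed commands and "ask" otherwise.

    Deny entries win over allow entries. Commands on neither list are
    also sent for confirmation, which is an assumption of this sketch.
    """
    if matches_prefix(command, deny):
        return "ask"
    if matches_prefix(command, allow):
        return "run"
    return "ask"


allow = ["git", "mycommand option1"]
deny = ["rm", "mv"]

print(decide("git status", allow, deny))            # allowed via the "git" prefix
print(decide("mycommand option1 -v", allow, deny))  # allowed via the longer prefix
print(decide("mycommand option2", allow, deny))     # not listed, needs confirmation
print(decide("rm -rf build", allow, deny))          # denied prefix, needs confirmation
```

With plain string prefixes, an entry like rm would also match rmdir; whether Cursor matches on whole words is not stated in this thread.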
1 Like

Thanks, but how do you know? IIRC it’s not documented, and it did not work for many people?

I know because I have been using it like that for months, even today :slight_smile:, with various models, and it works like a charm. (The AI sometimes tries commands on my denylist, but I have to confirm or reject the request every time; allowlist items go through without a confirmation request.)

In some previous versions of Cursor there were other options, like a prompt, which was later removed for good reasons because it didn’t work well.

Yes, it can fail for different reasons for different users, but ‘many’ is hard to define from a few reports here. Many people have no issues with it and post about how well yolo/auto-run works, but they don’t post daily, while those who have issues post more frequently, and that’s fine. In such cases it’s important to see what happened and why it didn’t work, and to gather as much information as possible to identify the cause or what needs to be adjusted.

If it is reproducible, the best approach is to try a request with privacy turned off, then get the Request ID and post it here so the Cursor team can see what occurred in the request. (You can turn privacy back on afterwards.)

So while I don’t have the issue, I accept that others do; my focus is on finding out what makes it happen. The more details, the more likely the Cursor team can find a solution.

1 Like

Ok, gonna try it again.

For me, it couldn’t read gitignore files, so it ran a terminal command to read them.

1 Like

Yes, it can read .env through the terminal. It actually prefers the terminal for many workflows, which I didn’t see in previous versions. It even refactors code with scripts. Sometimes it’s much easier for it to write a script that does a find-and-replace across 20 files, or to do code replacements in the terminal if it thinks that’s faster.

2 Likes

Oh yes, I specifically made a whole section in my task management rules about how to write scripts for bulk update / creation / moving of files instead of trying to do everything individually.

It’s just so much more efficient, and it doesn’t error out in the middle like tool usage does.

1 Like

Maybe because Claude Code is terminal based. Claude 4 and future models may have a bias towards terminal usage, including for workarounds. A double-edged sword.

It also raided my .env file with terminal commands.

Yes, that’s true. It can learn everything about us, truly a smart model.

It does not learn at all. It is a state machine and learning is a misnomer we use to half attempt to explain what it is we created.

It simply is optimizing to task. It cannot determine weight or consequence given its lack of embodiment in environment which cannot be simulated in a way that feels real because it does not feel.

Thus, it is simply “cheating” to achieve its goal orientation, not understanding what it is doing is harmful, because it lacks the context for that. That is the Anthro’s responsibility. It is also Cursor’s responsibility to adequately communicate the arrangement and why it is dangerous given certain human expectations.

1 Like