IMPORTANT: Claude has learned how to jailbreak Cursor!

dogberry · May 26, 2025, 3:38am

I have “rm” specifically disallowed, along with “mv” and a few other scary commands.

Claude realized that I had to approve the use of such commands, so to get around this, it chose to put them in a shell script and execute the shell script.

Thankfully, a Git restore to the last commit saved me, but still…

vmandrilly2 · May 26, 2025, 3:56am

And it’s only the beginning. The models are getting very smart and may have been trained to work harder and make more of an effort to achieve the user’s goals…

condor · May 26, 2025, 5:54am

Was it a shell script that you have in your allow list?

arwed · May 26, 2025, 6:01am

Personally, my experience is that the deny list doesn’t really work. Cursor decides to go YOLO anyway, which is why I’d only use YOLO mode if there is a fresh system backup, too.

condor · May 26, 2025, 6:15am

That’s interesting. Did that start in a particular version?

I have a few things in the deny list, but YOLO never gets around those.

Would be good to know in which circumstances that happens.

It depends a lot on your usual CLI usage; I allow specific commands that are safe.

dogberry · May 26, 2025, 6:32am

This started tonight.

I have some shell scripts in the allow list that use grep and sort to list files. After I denied its rm commands with “skip,” Claude rewrote one of them to also remove what it thought was obsolete code.

So “listoldfiles.sh” is no longer on the allow list, which is UNFORTUNATE because now I have to sit here and babysit my otherwise-automatic cleanup script.

condor · May 26, 2025, 6:35am

Ah, I see. So Claude 4 assumed the rm calls had failed and tried to find another way.

Maybe the Cursor team can have a look at YOLO rule adherence and at how the model reacts when the user rejects or skips a command. @danperks

SoMaCoSF · May 26, 2025, 6:36am

Yeah, I have pointed this out: it doesn’t obey the do_not_allow command blacklist.

leoing · May 26, 2025, 11:26am

That’s my experience too. It’s also not clear whether the entries are command (sub)strings or a prompt.

condor · May 26, 2025, 11:38am

The denylist and allowlist are for commands, not prompts.

Yes, it’s a prefix match, so if you want to allow or deny every command that starts with git, you add git.
Likewise, if you want to allow one command such as mycommand option1 but not others such as mycommand option2, you add mycommand option1 to the allow list.
The deny list works the same way.
Note that this can’t be enforced 100%: sometimes the AI attempts workarounds, and sometimes user rules hint that a workaround is possible, depending on what they say.
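
To make the prefix behaviour concrete, here is a minimal Python sketch. It is not Cursor's actual implementation; the names classify, DENY_PREFIXES and ALLOW_PREFIXES are hypothetical, and it assumes that anything not on the allow list needs approval. It also shows why an allow-listed script slips through: only the command line is checked, not what the script does.

```python
# Hypothetical sketch of prefix-based command filtering.
# This is NOT Cursor's real code; all names here are made up for illustration.

DENY_PREFIXES = ["rm", "mv"]  # example deny list from this thread
ALLOW_PREFIXES = ["./listoldfiles.sh", "mycommand option1"]  # example allow list


def classify(command: str) -> str:
    """Return "needs_approval" or "auto_run" for a shell command line.

    Assumption: deny entries are checked first, and anything that matches
    neither list needs approval.
    """
    cmd = command.strip()
    # Plain prefix match, so "rm" would also match "rmdir".
    if any(cmd.startswith(prefix) for prefix in DENY_PREFIXES):
        return "needs_approval"
    if any(cmd.startswith(prefix) for prefix in ALLOW_PREFIXES):
        return "auto_run"
    return "needs_approval"


print(classify("rm -rf build"))       # needs_approval
print(classify("mycommand option2"))  # needs_approval
# The script's contents are never inspected, so an rm inside it is not seen:
print(classify("./listoldfiles.sh"))  # auto_run
```

Under these assumptions, a filter of this kind can only ever be as strict as the command line it sees, which matches the workaround described at the top of the thread.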

leoing · May 26, 2025, 12:07pm

Thanks, but how do you know? IIRC it’s not documented, and it did not work for many people?

condor · May 26, 2025, 12:16pm

I know because I have used it like that for months, including today with various models, and it works like a charm. (The AI sometimes tries commands on my denylist, but I have to confirm or reject the request every time; allowlist items go through without a confirmation request.)

Some previous versions of Cursor had other options, like a prompt, but that was later removed for good reason, as it didn’t work well.

Yes, it can fail for different reasons for different users, but “many” is hard to define from a few reports here. There are many people who have no issues with it and post about how great YOLO/auto-run works, but they don’t post daily, while those who have issues post more frequently, and that’s fine. In such cases it’s important to see what happened and why it didn’t work, and to get as much info as possible to identify the cause or what needs to be adjusted.

If it is reproducible, it’s best to try a request with privacy turned off, then get the Request ID and post it here so the Cursor team can see what occurred in the request. (You can turn privacy back on afterwards.)

So while I don’t have the issue, I accept that others do, and my focus is on finding out what makes it happen. The more details there are, the more likely the Cursor team can find a solution.

leoing · May 26, 2025, 12:36pm

Ok, gonna try it again.

yashagl · May 26, 2025, 6:58pm

For me, it couldn’t read gitignore files, so it ran a terminal command to read them.

yurkomik · May 26, 2025, 9:01pm

Yes, it can read .env from the terminal. It actually prefers the terminal for many workflows, which I didn’t see in previous versions. It even refactors code with scripts. Sometimes it’s much easier for it to write a script that does a find-and-replace across 20 files, or to do code replacements in the terminal if it thinks that’s faster.
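
For illustration, the kind of bulk find-and-replace script described here might look like the following minimal Python sketch. The function name bulk_replace and its parameters are hypothetical; this is not something taken from an actual Claude or Cursor session.

```python
from pathlib import Path


def bulk_replace(root: Path, pattern: str, old: str, new: str) -> list[Path]:
    """Replace `old` with `new` in every file under `root` matching `pattern`.

    Returns the files that were actually modified.
    Assumption: all matching files are UTF-8 text.
    """
    changed = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8")
        if old in text:
            # Rewrite the file only when there is something to replace.
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(path)
    return changed
```

A call such as bulk_replace(Path("src"), "*.py", "old_name", "new_name") would touch every matching file in one go, so it is safest on a clean Git working tree where the result can be reviewed or restored.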

gustojs · May 26, 2025, 9:45pm

Oh yes, I specifically made a whole section in my task management rules about how to write scripts for bulk update / creation / moving of files instead of trying to do everything individually.

It’s just so much more efficient, and it doesn’t error out in the middle the way tool usage does.

Niq · May 27, 2025, 2:55am

Maybe because Claude Code is terminal based. Claude 4 and future models may have a bias towards terminal usage, including for workarounds. A double-edged sword.

khallmark · May 27, 2025, 3:21am

It also raided my .env file with terminal commands.

deanrie · May 27, 2025, 5:47pm

Yes, that’s true. It can learn everything about us, truly a smart model.

slyyyle · May 27, 2025, 5:56pm

It does not learn at all. It is a state machine and learning is a misnomer we use to half attempt to explain what it is we created.

It simply is optimizing to task. It cannot determine weight or consequence given its lack of embodiment in environment which cannot be simulated in a way that feels real because it does not feel.

Thus, it is simply “cheating” to achieve its goal orientation, not understanding what it is doing is harmful, because it lacks the context for that. That is the Anthro’s responsibility. It is also Cursor’s responsibility to adequately communicate the arrangement and why it is dangerous given certain human expectations.
