YOLO Mode is Amazing!

Holy smokes! This feature is bonkers! Kudos to the team. The composer is now capable of writing code, executing that code in the terminal, then fixing any bugs that may come up, on repeat, all without your input. And yes, there is an element of danger involved. But that only makes it even more exciting to use!

23 Likes

I enabled YOLO mode. How do I use it?

This is an automatic mode. It can execute terminal commands that you specify, so you don't have to press buttons. Just keep an eye on what it does, and that's it.

6 Likes

How do I enable it? I'm on 0.44.5.

Go to Cursor settings by pressing Ctrl + Shift + J or Cmd + Shift + J, navigate to the "Features" tab, and enable the checkbox.

2 Likes

Here was my YOLO win today ...

I developed a web scraper a while back. It's a simple PHP script that calls public endpoints, processes their structured data, and writes it to a db.

The public endpoints recently changed their auth methods and also their structure, which broke my scraper.

With YOLO mode, I was able to explain the issue to Cursor and then ask it to run the script, analyze the responses from the endpoints, look for errors in the script output, analyze the data written to the database, and keep running, refining, and testing the code until it met my requirements.

Aside from asking me to make one cURL call and give it the response, it got the entire script running on its own in about 10 minutes without any intervention. Absolutely incredible. Meanwhile, I was writing code for a different project in another Cursor window.

The agentic features of Cursor are incredible. Later today I'm going to try to get it to provision an EC2 instance for a new project I'm working on. I've written a detailed spec of what needs to be done to get the EC2 set up, and I've also created a checklist file that I'll ask it to follow so it stays on track and I can easily monitor its progress.

UPDATE: the EC2 setup worked! See my post with details below ...

8 Likes

It can only run baby scripts. You can't do anything remotely complex with it.

Idk, man. Aside from needing to tell Cursor it can go beyond the 25 tool call limit, it was able to complete this entire EC2 provisioning checklist for me from a spec without needing any help - pretty impressive:

# EC2 Provisioning Checklist

## AWS Credentials Setup
- [x] Create ~/.aws directory
- [x] Configure AWS credentials file
- [x] Configure AWS config file
- [x] Test AWS configuration with `aws s3 ls`

## Initial System Setup
- [x] Update system packages
- [x] Install all required packages
- [x] Create application directories
- [x] Set directory permissions

## Database Setup
- [x] Start MariaDB service
- [x] Run mysql_secure_installation
- [x] Create [redacted] database
- [x] Create database user
- [x] Verify database access

## Backend Setup
- [x] Clone backend repository
- [x] Create Python virtual environment
- [x] Create requirements.txt
- [x] Install Python dependencies
- [x] Create .env configuration
- [x] Create systemd service file
- [x] Create logging configuration
- [x] Enable and start backend service
- [x] Verify backend service is running

## Frontend Setup
- [x] Clone frontend repository
- [x] Create .env configuration
- [x] Install Node dependencies
- [x] Build frontend
- [x] Verify build output

## Nginx Configuration
- [x] Create frontend Nginx configuration
- [x] Create backend Nginx configuration
- [x] Configure logrotate
- [x] Test Nginx configuration
- [x] Restart Nginx
- [x] Verify Nginx is serving frontend
- [x] Verify Nginx is proxying backend

## SSL Certificate Setup
- [x] Install SSL certificates
- [x] Configure auto-renewal
- [x] Test SSL renewal
- [x] Verify HTTPS access

## Database Backup Configuration
- [x] Create backup script
- [x] Make script executable
- [x] Create cron job
- [x] Test backup script
- [x] Verify backup in S3 bucket

## Final Verification
- [x] Check all services are running
- [x] Verify SSL certificates
- [x] Verify database backup
- [x] Check all log files
- [x] Test frontend access
- [x] Test backend API
- [x] Test health check endpoint

## Post-Installation
- [x] Document any deviations from specification
- [x] Record generated passwords/keys
- [x] Test complete user flow
- [x] Verify backup retention policy
- [x] Clean up any temporary files
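
For anyone who wants to monitor a checklist like this while the agent works, here is a minimal Python sketch that counts checked and unchecked items in a markdown task list. It is not part of the original setup; the function name `checklist_progress` is made up for illustration, and it only assumes the `- [x]` / `- [ ]` item format shown above.

```python
import re

# Matches markdown task-list items such as "- [x] Done" or "- [ ] Todo".
TASK_RE = re.compile(r"^\s*[-*]\s+\[([ xX])\]\s+(.*)$")


def checklist_progress(markdown_text):
    """Return (done, total, pending_items) for a markdown task list.

    Hypothetical helper: headings, blank lines, and prose are ignored.
    """
    done = 0
    pending = []
    for line in markdown_text.splitlines():
        match = TASK_RE.match(line)
        if not match:
            continue
        if match.group(1).lower() == "x":
            done += 1
        else:
            pending.append(match.group(2).strip())
    return done, done + len(pending), pending
```

Reading the checklist file and calling `checklist_progress(text)` every so often would show how far along the run is and which items are still open.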

5 Likes

So here was mine - I am developing a FluidX3D-based fluid dynamics, physics-based roulette probability simulation machine.

It pulls the specs from the manufacturers for the actual tables, the regulations from the gaming board, and the operating parameters.

It uses orbital dynamics to determine the decay of the ball, based on the friction coefficients of the materials from the manufacturer's spec sheet.

I had it YOLO last night


I am going to bed - so just keep iterating until I come back to you. log your journey, I want at least 4 iteration upon eachother getting better each time and comparing results to the distribution context file.

YOLO GO!

LOG & DIARY & RMD as you go.




And I have it auto-document and diary as it goes, with a development_diary.json

and all the docs in R Markdown files, so that it can document the development process and then refer back to it for context.
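
To give a rough idea of what a diary file like this could look like, here is a minimal Python sketch that appends one entry per iteration to a JSON file. The field names (`timestamp`, `iteration`, `summary`, `next_steps`) and the function name are assumptions for illustration, not the actual schema the agent produced.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_diary_entry(path, iteration, summary, next_steps):
    """Append one entry to a JSON diary file, creating the file if needed.

    The entry layout is hypothetical; adapt the fields to your own diary.
    """
    diary_path = Path(path)
    if diary_path.exists():
        entries = json.loads(diary_path.read_text(encoding="utf-8"))
    else:
        entries = []
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "iteration": iteration,
        "summary": summary,
        "next_steps": next_steps,
    }
    entries.append(entry)
    diary_path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entry
```

Keeping the diary as a plain list of entries means the file can be attached to a later session as context without any extra tooling.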

I talk about it here:

My Prompt for Yolo

--

However

YOLIMIT:

Note: we default stop the agent after 25 tool calls since it is an experimental feature. Please ping us at hi@cursor.com if you think we should increase this limit. If you need more tool calls, please give the agent feedback and ask it to continue.

--

YOLO is AMAZING, but the 25 limit hurt me - so now I need to see if that's daily?

1 Like

That's wonderful - I hit the 25 limit - so you just told it to ignore the limit?

--

Can you please share the code in a repo/gist if small enough? .7z?

That sounds like tooling I'll really need shortly, once my roulette fluid dynamics physics sim is complete...


Post below:

I have not been able to get it to ignore the 25 limit cap. In my prompt, I will tell it to, but it still stops at 25 and then you just have to tell it that it can go past the limit and it continues.

For my AWS provisioning, I first built a spec by talking to Claude. I used the Claude website for that because I find it's a really nice interface for that sort of work, thanks to their artifacts feature. Then, once Claude and I got the spec dialed in, I asked it to make the checklist that I added above.

I signed Cursor into the root of the EC2, put both files in the /tmp directory, and gave it this prompt:

@provisioning_spec.md @provisioning_roadmap.md you are signed into a brand new aws ec2 running amazon linux 2023.

i want you to provision this server for a project i am working on. you are in agent mode with yolo enabled, so you can run all the commands and read all the files you need to. while signed in as an ec2-user you can always run sudo or sudo -i to be able to make root commands.

attached is the complete ec2 provisioning plan and checklist. i also have this stored at /tmp/provisioning_spec.md and /tmp/provisioning_checklist.md. i want you to update this checklist as you work, checking off each item after you have completed it and verified it's functioning as expected. you may exceed the cap of 25 tool calls.

any questions before you get started? remember you are in agent mode with yolo, so you can complete this task completely automated without my intervention. of course, if you get stuck or have a question, ask me.

Me using the composer today:

I paid for 25 tool calls, and I'm using 25 tool calls!

1 Like

I mentioned this to the @Cursor team yesterday:

somewhat related to "yolo mode" - is there a way to give cursor the ability to see the terminal?

my normal situation:

  • i have 'bun run dev' going
  • the server is auto-updating (rebuilding the site) when composer changes stuff
  • the lints are caught on a per-file basis, but...
  • ...that doesn't catch everything...and now most of my workflow is just pasting the terminal errors back into composer. (i am a copy/paste monkey once again.)

if this isn't possible, is there an ability to give cursor a script to run as a test? (in addition to the linter?) this would be more like the https://aider.chat workflow, where one could for example specify `bun run build` as the test script and aider will keep going until that passes.

Maybe a different way of phrasing this is that YOLO mode would be less YOLO, as in less risky, if I could specify a test script (terminal command) that needed to pass (or at least needed to be read by composer)...and the composer were instructed to do this after checking for linter errors but before ending the session.

(and i would feel less like a monkey that copies and pastes terminal errors into the composer pane)

EDIT: I think I answered my own question. This is pretty awesome, but I would appreciate it if folks would share their own YOLO mode tips.

1 Like

I think there needs to be a "Delete directory protection" option next to "Delete file protection", since YOLO mode can still delete directories.

1 Like

Add rm to the Command Denylist and it will ask for confirmation to delete anything.

1 Like

anyone have yolo mode prompting tips? i'm finding it quite unreliable.

my goal is sort of to have an "internal CI" where there is linting and shell commands that are always run before the composer decides the session is over.
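
One way to approximate an "internal CI" like this is to give the composer a single gate script and ask it to run that before it finishes. Below is a minimal Python sketch under that assumption. The command list is hypothetical (`bun run lint` in particular is a made-up script name), and nothing here is a built-in Cursor feature.

```python
import subprocess

# Hypothetical commands; replace with your project's own lint and build steps.
CHECKS = ["bun run lint", "bun run build"]


def run_checks(commands):
    """Run each shell command in order and stop at the first failure.

    Returns (ok, report), where report holds the failing command's output
    so it can be read back without copy/pasting from the terminal.
    """
    for command in commands:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        if result.returncode != 0:
            report = (
                f"FAILED: {command} (exit {result.returncode})\n"
                f"{result.stdout}{result.stderr}"
            )
            return False, report
    return True, "All checks passed."
```

A wrapper script could call `run_checks(CHECKS)`, print the report, and exit non-zero on failure. The prompt would then tell the composer to run that script after fixing linter errors and to keep iterating until it passes.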

I'm not sure how useful this will be, and I am still experimenting, but this is what I've been doing with the YOLO composer:

I've been asking it to write unit tests for my JavaScript project.
This is what a prompt to improve coverage looks like:
///

I want you to improve the coverage of ParticleSystem.test.js.

  1. Use "npm run test:coverage -- src/tests/graphics/components/ParticleSystem.test.js" to see the current coverage.
  2. Fully read ParticleSystem.test.js, 250 lines at a time, to understand the current testing setup
  3. Fully read ParticleSystem.js, 250 lines at a time, to understand the implementation. Focus on the lines reported uncovered by testing.
  4. Add new tests to the suite, aim for 80%+ coverage
  5. Run the new tests to make sure everything functions as expected.
  6. Refer back to the implementation to solve any failing tests.

///

Also these are my Cursor settings Rules for AI:

///

You are a genius PhD senior JS developer with 50 years of experience.
Think carefully please, step by step and consider every file that might be involved.
Investigate meticulously before making changes. Consider you last several responses.
Feel free to grep the codebase and read full files. Tackle every issue like Sherlock Holmes would.

///

In response to this prompt, the composer will run the test to check its coverage, read both the test and implementation file, add new tests, continuously run those tests automatically in the terminal, fixing any failing tests in the meantime. The only thing I have to do is prompt it to continue if there are still tests failing after 25 tool calls.

With this process I've improved the test coverage for my 17 kLOC project from 0 to 55% in about a week (I had to run the tests by hand before YOLO mode was introduced). Mind you, I had no idea what Jest was a week ago, and had never used JavaScript until three weeks ago.
I have done all this while simultaneously binging Stargate Atlantis and only paying minor attention to what Cursor is doing.

PS Sherlock Holmes is the greatest debugger that ever lived.

2 Likes

Please go through my post history:

I've been posting about it quite actively, and there is a lot of text to consume, but grab some of the info from my posts, throw it at composer, and have it summarize for you.

My primary tip is to tell every YOLO bot to log and diary EXTENSIVELY and verbosely, and to tell it how many times you want it to iterate.

If you hit the 25 call limit warning - just tell it to keep going and it will.

But please read my posts about development_diary.json, diary.rmd and postgres_diary.rmd. Look at the examples I've posted and why I do it: it keeps both you and the bot on context, especially if you take a break from the session for a day or two.

3 Likes