Cloud agents just got a major upgrade! They can now use their own computers to test changes, iterate, and record artifacts that demo their work.
Each agent runs in its own isolated VM with a full development environment, so you can spin up many in parallel without worrying about git conflicts or overloading your laptop. After onboarding onto your codebase, agents produce merge-ready PRs with videos, screenshots, and logs so you can validate what they built before diving into the diff. You can also take over the agent’s remote desktop to try out changes yourself, without checking out the branch locally.
Agents don’t just generate code and hand it off: they open the browser, navigate to localhost, click through the UI, and verify that things actually work. If something’s broken, they iterate. When they’re done, they record a video walking through the feature so you can review in seconds.
This is already how Cursor builds Cursor. More than 30% of merged PRs are now created by agents running autonomously in cloud sandboxes. You can kick them off from the web, desktop, mobile, Slack, or GitHub.
Get started at cursor.com/onboard to watch the agent configure itself and record a demo.
We’d love your feedback!
What kinds of tasks are you delegating to cloud agents?
How are the video and screenshot artifacts working for your review workflow?
What would make onboarding smoother for your codebase?
Interesting workflow. I'm wondering: if I tell the cloud agent to work on my project and it needs to spin up an external project, for example a public GitHub repository containing some third-party tool my project uses, will it be able to find, download, configure, and run it on its own? Or does it need specific guidelines for how everything should be set up?
Also, how does running in a virtual machine affect pricing? I know it is charged at API costs, but how many tokens (approximately) are used by browsing the web, recording videos, and managing and monitoring the task at hand? Is it a substantial difference compared to the previous "regular" cloud agents, or something that can be overlooked?
Does anyone know if cloud agents pick up project-level `.cursor/rules/`? Curious whether the frontmatter (`alwaysApply`, `globs`) carries over to the VM environment, or if you need to configure rules differently for cloud runs.
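For context, a project rule file looks something like this (the filename, globs, and rule text here are made-up placeholders); the question is whether the cloud VM honors this frontmatter the same way the local editor does:

```
# .cursor/rules/api-conventions.mdc
---
description: Conventions for API route handlers
globs: ["src/api/**/*.ts"]
alwaysApply: false
---

- Validate request bodies before touching the database.
- Return errors as JSON with an appropriate status code.
```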
Pretty cool! Much like with regular agents, I can see worry about costs etc. preventing people from trying this out. However, Composer-1.5 has reduced my costs dramatically recently.
Frontend testing has always been the biggest pebble in the shoe for me as a software engineer. I use the Playwright MCP, but it's never ideal, and I think the new feature could help out a lot here.
I have a bug where redirects aren’t working for magic links, I’m going to try it this morning and report back how it gets on…!
I think the idea is very nice as a whole, and the execution is great. I love being able to check out the work via my own VM. The exported video is a nice to have, but I didn’t really try it.
Mixed/Meh/Auth
I still think that for a lot of standard full-stack work (e.g. a B2B or SME app) I can't see myself using it.
The main problem, as usual, is auth. I didn't really try to set up the .env file because I'm a .NET developer and assume it won't work. Perhaps I am being unduly harsh, and once you're logged in via Google SSO it is a one-and-done process and will work.
If you're just using Storybook it's not really a problem, but the reality is that Storybook itself requires a lot of care and attention to mock things, and takes time and effort to configure properly.
Perhaps you could instruct the agent to target a certain "seed" database setup containing a test account it can use. It'd be nice to have a field to give it some instructions on how to do that, or to point it at a certain skill file, in a tab called "Setup Instructions/Notes".
Would I trust it with my staging database on Azure? I'm not sure, really. While it's nice, I just don't see enough benefit to warrant the risk.
I'm not sure if this feedback is helpful or not, but I just wanted to express my thoughts after poking around with it.
It's nice, but I don't see myself using it on your standard rickety/legacy B2B app without having to spend a lot of time working out auth and mocks and that kind of stuff.
This is the most impressive and workflow-changing feature from Cursor in a long time. No discredit to the other updates, which were great as well, but this is one of those I will not forget.
Tasks:
Verify a new feature is working properly → fantastic. The smoothness and clarity of the video and screenshots capture exactly what I typically have to monitor live with the Playwright MCP.
Build a Stripe payment integration and Google OAuth login integration with my app → the results were great. I have been hoping for more MCP integration in the cloud agent for a long time. Stripe, Supabase, Render, and Vercel are the ones critical for my app, and the big challenge was that the cloud agent in the past could only run CLI/API calls based off the secrets I set in the environment. There are limitations.
Ask: I would like to see more integration there, so I can use all the MCPs I use locally in the cloud agent as well.
Question: does it actually view and analyze the video when things are off? What if something is not working? I haven't hit a failure yet, but if it is analyzing the video (or screenshots from it) and self-correcting, that would be reassuring. That's what I do with the Playwright MCP. If it's just showing me a demo, that's nice but only halfway there.
A few asks:
The side panel is great. I like the "unlock" part, where it tells me exactly what is needed and provides guidance. The secrets, git, and terminal prompts are helpful.
The diff view is helpful and I appreciate the transparency. I am almost tempted to code review, but realistically, in a long-running setting across multiple pieces of work, I'm not sure I will really go and check.
This one is a bit of a personal preference: when I'm monitoring an interactive session, scrolling up to see the chat and conversation is fine. But if I delegate and come back a few hours later, it's a pain to scroll up and read through it all. I hope the side panel, or somewhere, can surface the key highlights, decisions, and a summary. No strong preference on the "how": the unlock-and-guidance flow is exactly what I like to see, but it should go beyond just blocking technical issues. If this is to work as a long-running agent, it would be helpful for someone checking in to see when it's making a decision or pivoting, plus a few points on what was done and verified.
Make it faster. If this were faster I would be doing this all day long instead of working in the IDE, and it's less disruptive.
Does anyone know how to set up a db in this cloud environment? Most active coding repos have an associated dev-environment db that the codebase connects to. I am able to provide the credentials in the env, but unless the db somehow gets replicated in the cloud environment, the codebase will not be able to generate a demo. My db only holds a few records, so some way to set up a cloud db would be really useful.
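One workaround while there's no managed option, as a sketch: check a small fixture into the repo and have the agent seed a local database from it when the app starts in the VM. This assumes the VM can run Python and your app can be pointed at SQLite for demo purposes; the `users` table and fixture rows below are hypothetical.

```python
# Hypothetical sketch: seed a throwaway SQLite db inside the agent's VM
# from a small checked-in fixture, so the app can demo without reaching
# the remote dev database.
import sqlite3

# A few representative records, normally loaded from a fixture file in the repo.
FIXTURE = [
    {"id": 1, "email": "test@example.com"},
    {"id": 2, "email": "demo@example.com"},
]

def seed(conn, rows):
    """Create the (hypothetical) users table and upsert the fixture rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO users (id, email) VALUES (:id, :email)", rows
    )
    conn.commit()

if __name__ == "__main__":
    # Point your app's connection string at this file instead of the dev db.
    conn = sqlite3.connect("demo.db")
    seed(conn, FIXTURE)
    print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # → 2
```

The same idea works with a local Postgres instance if the VM image ships one; the point is that the data lives in the repo, so every fresh sandbox can rebuild it.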
A default out-of-the-box Next.js app struggles to be configured by the new cloud agents, because the Google Fonts domains (https://fonts.gstatic.com and https://fonts.googleapis.com) are not part of the default network allowlist. The agent just fought for a long time trying to figure it out.
Also, where do you configure the network allowlist at the team level? It only shows up on the personal settings tab and does not appear on the team page at all.
Tried this yesterday and it was super interesting. I'm trying to continue some of my conversations from yesterday, but I'm now seeing the "Could not connect to Desktop" error. Anyone else?
I'm on Windows, and when I click the agent dropdown, all I see is Agent, Plan, Debug, and Ask. If I go to my Cursor settings, I see Cloud Agents, and I've connected my GitHub.
Anybody know how to access the cloud agent capability in the Cursor Windows application? I have the Ultra plan, and my privacy settings are set to share data. I've been troubleshooting for the last 24 hours and still can't get it to work.
The onboarding process is completely broken on chromium browsers on Fedora Linux at the moment.
For example, on Vivaldi or Brave the page half-loads, the video starts auto-playing, and then the tab crashes with "out of memory."
This happens both with shields up and with shields down; it also happens in a guest tab with no extensions.
The video auto-playing appears to be part of the problem, and downloading the video separately shows the transcoding error mentioned earlier in the conversation.
So, unfortunately, we literally can't even try it, because we're stuck crashing on the onboarding.
One more quick piece of feedback: Composer struggles to use the desktop to generate a demo. Even when prompted to, it replied saying it is not able to (I also took a screenshot of this). I gave the same query and task to Codex and Opus, and they worked.
It was not a complex task, mainly verifying that a third-party app integration was working properly. I would like to use Composer for these tasks since it's higher ROI, but it's not working consistently.