I have been using Composer 2 for the past few days. While benchmarks suggest Opus 4.6-level intelligence, I feel it is closer to Opus 3.5.
I’ve noticed the following:
Will evade browser testing for the flimsiest of reasons.
Will straight up lie to the user about having done a “smoke check” by making a cURL request to localhost.
Is really not that smart. Makes bad decisions when refactoring code. Even Sonnet 4.5 does better.
Disobeys Plan mode. Will straight up edit files even in Plan mode if I don’t tell it to write a plan.
Doesn’t like tool calling? I have to specifically ask it to use the Figma MCP because otherwise it asks me to look up the components and their variants on Figma and convey the information.
I’ve only written the bad stuff above, but there are good things too, like its speed. And if you give it very detailed instructions, as if you were instructing a 10-year-old, it actually does well.
Unfortunately I'm disappointed with it, because it seems Composer 2 can't start subagents.
I have built a huge framework where all work is done by subagents, and I have a master agent which starts them. But with Composer 2 it doesn't work. It just can't start them, and asks me to start separate new chats manually…
Plan mode: This is a known issue. The team knows Composer 2 can sometimes ignore plan mode and start editing files directly. For now, you can explicitly add to your request: “only plan, do not edit any files”.
Browser tool and MCP avoidance: As a workaround, you can add rules to your Cursor rules, such as “always use available MCP tools before asking the user for information” and “use browser tool to verify changes”. It’s not perfect, but it helps.
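A minimal sketch of how those rules could be written as a project rule file, assuming the `.cursor/rules` directory and the `.mdc` frontmatter format. The file name `tool-usage.mdc` and the wording of each rule are hypothetical, and there is no guarantee Composer 2 will follow them every time:

```markdown
---
description: Prefer tools over asking the user (hypothetical example rule)
alwaysApply: true
---

- Always use available MCP tools (for example, the Figma MCP) before asking the user for information.
- Use the browser tool to verify changes.
- In Plan mode, only plan. Do not edit any files.
```

Saved as `.cursor/rules/tool-usage.mdc`, a rule with `alwaysApply: true` should be attached to every request in that project, so you don't have to repeat the instructions in each prompt.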
Overall quality: Composer 2 is being improved a lot right now. Feedback on specific cases with example requests and outputs helps a lot. If you have concrete examples of bad refactor results, please share them.
@flowmotion, we know about the subagents and Composer 2 issue. There’s a recent thread with a similar report: "Task" tool missing. Prevents sub-agent launch. Which Cursor version are you using? And which model do you set as the main one when subagents should launch?
My thread is about Opus and Sonnet not being able to spawn sub agents currently. Seems like a bug since that is one of the allures of using those much higher priced, highly capable models.
Additional info on browser tool and MCP avoidance: It’s not just those two tools. It also happens with the ask question tool. I have a skill that has the following text in it: “Proceed after user confirms. Use the ask question tool to confirm instead of waiting for the user to respond with a new message”. Composer 2 basically ignores this and proceeds with the plan that I hadn’t approved. This was in agent mode, and it was supposed to present a small plan and, once approved, follow through.
It seems worse to me in every aspect than Composer 1 / 1.5. My stack is NodeJS and NextJS, pretty standard stuff, and Composer 2 can’t even handle basic tasks on large repos, where Composer 1 and 1.5 were fine. It seems to me that the intelligence of Composer 2 is very limited.