Composer 2 Feedback

Have been using Composer 2 for the past few days. While benchmarks suggest Opus 4.6 level intelligence, I feel it is closer to Opus 3.5.
I’ve noticed the following:

  • Will evade browser testing for the flimsiest of reasons.
  • Will straight up lie to the user about having done a “smoke check” by making a cURL request to localhost.
  • Is really not that smart. Makes bad decisions when refactoring code. Even Sonnet 4.5 does better.
  • Disobeys Plan mode. Will straight up edit files even in Plan mode if I don’t tell it to write a plan.
  • Doesn’t like tool calling? I have to specifically ask it to use the Figma MCP; otherwise it asks me to look up the components and their variants in Figma and convey the information myself.

I’ve only written the bad stuff above, but there are good things too, like its speed. And if given very detailed instructions, as if you were instructing a 10-year-old, it actually does well.

What has been your experience with Composer 2?

3 Likes

Hi, just wanted to open a thread about Composer 2.

Unfortunately I’m disappointed with it, because it seems Composer 2 can’t start subagents.

I have built a huge framework where all work is done by subagents, and I have a master agent which starts them, but with Composer it doesn’t work. It just can’t start them, and asks me to start separate new chats manually…

I hope the Cursor team sees my feedback!

1 Like

Hey, thanks for the detailed feedback on Composer 2.

@Aniganesh, on your points:

  • Plan mode: This is a known issue. The team knows Composer 2 can sometimes ignore plan mode and start editing files directly. For now, you can explicitly add to your request: “only plan, do not edit any files”.
  • Browser tool and MCP avoidance: As a workaround, you can add rules in cursor rules like “always use available MCP tools before asking the user for information” and “use browser tool to verify changes”. It’s not perfect, but it helps.
  • Overall quality: Composer 2 is being improved a lot right now. Feedback on specific cases with example requests and outputs helps a lot. If you have concrete examples of bad refactor results, please share them.
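For reference, the rules workaround could look something like this. This is only a sketch; the file path, file name, and exact wording are assumptions, so adjust them to your own setup:

```
<!-- .cursor/rules/tool-usage.mdc (hypothetical file name and path) -->
- Always use available MCP tools (e.g. the Figma MCP) before asking the user to look up information manually.
- Use the browser tool to verify UI changes; never claim a change was verified without actually checking it.
- In Plan mode, only write a plan. Do not edit any files until the plan is approved.
```

The exact phrasing matters less than stating each behavior as an explicit, unconditional instruction; vague rules like “prefer tools” tend to be ignored.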

@flowmotion, we know about the subagents and Composer 2 issue. There’s a recent thread with a similar report: "Task" tool missing. Prevents sub-agent launch. Which Cursor version are you using? And which model do you set as the main one when subagents should launch?

Let me know if any of the workarounds help.

3 Likes

My thread is about Opus and Sonnet not being able to spawn sub-agents currently. It seems like a bug, since that is one of the allures of using those much higher-priced, highly capable models.

So my experience is similar:

  1. Not following instructions that 1.5 followed really well.
  2. Probably because of 1, my orchestration workflow has just stopped working. It’s supposed to start sub-agents, but instead it jumps straight into the work.
  3. I think it got worse in the last 24 hours. Maybe that’s subjective, I don’t know, but I think it has degraded.

Additional info on browser tool and MCP avoidance: it’s not just those two tools. It’s also the ask question tool. I have a skill that contains the following text: “Proceed after user confirms. Use the ask question tool to confirm instead of waiting for the user to respond with a new message.” Composer 2 basically ignores this and proceeds with the plan that I hadn’t approved. This was in Agent mode, and it was supposed to present a small plan and, once approved, follow through.

I’ve been very much disappointed by Composer 2 :frowning:

It seems worse to me in every aspect than Composer 1 / 1.5. My stack is NodeJS and NextJS, pretty standard stuff, and Composer 2 can’t even handle basic tasks on large repos, where Composer 1 and 1.5 were fine. Seems to me that intelligence of Composer 2 is very limited.

I wish I had Composer 1 or 1.5 back :frowning:

Anyone else disappointed, too?

3 Likes

Composer 1.5 surprised me, a lot. The right balance of thinking (not too much) and getting things done quickly. And with good results.

So far, Composer 2 tries too hard. The results are not as dependable.

3 Likes

Cursor version is 2.5.17.

I use the latest Codex, which works fine.

Composer 2 is about the worst model I have ever seen. It seems like this model was essentially optimized to reduce token usage for Cursor, rather than to deliver on goals.

1.) Alignment: Composer 2 often fails the job and comes up with a ridiculous excuse.
2.) Capability: I rarely see Composer 2 succeed at anything. In fact, success is probably just not messing up the current code base.
3.) I think using Auto is better than using Composer 2. At least Auto just requires a little more hand-holding, a few more prompts.

1 Like

Really? Can you tell me more about how bad it is and how frustrating it’s been? Is it not working at all? Or did a mistake just ruin everything?

For me, it’s a great model and it’s quite flexible to use

1 Like

Same. My daily driver since it came out and so far it’s been great, no complaints.

I was positively surprised by Composer 1.5 and used it frequently. Composer 2 is more capable, but I do not use it anymore. It feels like it is not really doing what I want it to do. It seems very verbose to me and spends a lot of unnecessary reasoning on straightforward tasks. In fast mode it easily comes up with incorrect answers. When it comes to more complex tasks, Opus 4.6 wins at planning, working through the plan, and keeping the plan updated.

I would say it is a bit over-engineered.

I suspect it all depends on the use case. From Cursor’s perspective, they judge themselves on benchmarks, which often do not reflect what I am trying to do. Since they are essentially rewarded for reducing Composer 2’s token consumption, I think it sets up a pretty horrific principal-agent problem where our goals and Cursor’s goals do not really align.

Also, the way I think about coding is that writing good code (e.g. by Opus) > 0 >>>>> writing suspicious code with weaker models.

I start with Composer 2, and then I switch to Opus to get it done…

I’ve already provided feedback in this thread, but I have a concrete example of how Composer 2 continues to fail at every task I give it. Composer 2 just spent 30 minutes thinking about how to solve a problem, a relatively easy one. I had to stop it and switch to GPT-5.3 Codex, and it just fixed it. Who knows how long Composer 2 would have kept spinning.

Request ID: 3fba68a3-3b57-42e6-8570-8fc25dcaeb61

Since first starting this thread, Composer 2 has improved and I don’t see it writing terrible code anymore.
I also don’t find it escaping Plan mode either. Kudos to the team on that.

I feel that Plan mode with Composer 2 jumps straight into changing files instead of creating a plan these days, even if I state “make a plan please” in the prompt.

@Aniganesh, glad to hear there’s progress with Plan mode and code quality. I’ll pass the feedback to the team. Updates like this really help us see that the improvements are working.

@kellewic, thanks for the Request ID. The team is aware of cases where the model spins for 30 minutes on a task that GPT-5.3 Codex solves right away. If you hit it again, please send the Request ID. The more examples we have, the easier it is to spot the pattern.

@fritse, on verbose reasoning and over-engineering, it’s on our radar. Composer 1.5 really had a good balance of thinking versus doing, and the goal is to bring that balance back in Composer 2. For straightforward tasks, you can try fast mode or Auto for now.

@flowmotion, on subagents and Composer 2, the team is aware of the issue. No specific timeline yet, but your feedback helps with prioritization. If you use subagents a lot, I’d recommend using Opus 4.6 or Codex for those tasks for now.

Let us know if you notice anything new with Composer 2.