Share your Thoughts on Composer 2.5!

Main announcement · Blog


Now that Composer 2.5 is available, we’d love to hear how it’s working for you.

Some things we’re especially curious about:

  • What types of tasks are you using it for?
  • How does it compare to Composer 2 in your day-to-day workflows — any specific improvements in long-running tasks, instruction-following, or communication style?
  • Anything that feels noticeably better or worse than Composer 2?

We’re particularly interested in the behavior side of this release: the work we did on communication, effort calibration, and tool use is the kind of thing that’s hard to capture in a benchmark, so your real-world feedback genuinely shapes where Composer goes next.

Drop your thoughts below — we read everything. :folded_hands:

I’m really excited to use and test Composer 2.5. I really hope that we have improved the quality.

Thanks @gabriel-filincowsky! Excited to hear your thoughts as you get a chance to explore the model.

Hey Kevin, just a quick update after the first few minutes of using 2.5.

I’ve noticed some improvement. I can’t say for sure yet how significant the improvement is in terms of the model’s overall behavior and responses, but that was one of the main weaknesses of 2.0. It often felt similar to ChatGPT 5.2 or 5.5: very dry and sometimes antagonistic. Version 2.5 seems better in that regard, but I need to use it more before drawing conclusions.

One thing I can say for sure is that the adaptive thinking and the comprehensiveness of responses, depending on the input, are still very hit-or-miss. I had to modify my prompt three or four times for a more complex request before I finally got the model to think longer and provide a comprehensive response.

The previous three attempts produced answers that were too generic and simplistic. The only changes I made were instructions such as “please think harder,” “please take more time,” or similar phrasing. The core body of the prompt itself remained the same. The request was related to engineering and coding. Still, the first three responses were lightweight and generic, while the fourth one was finally a proper, comprehensive answer.

Composer 2, as a subagent under GPT-5.4, was able to perform better than GPT-5.1, 5.2, and GPT-5.4-mini. Considering the comparison chart from the announcement, I’m very curious to see what you’ve achieved in version 2.5 :fire:

I’ve got some long-running tasks I want to test. How can I select the “non-fast” variant of Composer 2.5? I don’t see it in the model list. The dropdown only shows a fast variant, but the model sheet shows that a normal version of 2.5 exists (Composer 2.5 | Cursor Docs).
Well, I found the switch in the dropdown, so never mind! I’ll report 2.5 feedback later lol.

After a couple of hours working with Composer 2.5, I can say with a high degree of certainty that this is a meaningful update. I cannot shake the feeling that Composer 2.0 has been performing poorly over the last few weeks to a month.

I do not know whether Cursor is using the same (not nice) strategy that other large labs use, where performance is degraded before the launch of a new version. What I can say is that in version 2.5, there are meaningful differences in tone and in how it responds.

Most of my work involves problem-solving, discussing ideas, and developing specs. It is less coding and more engineering, including mechanical and chemical engineering, which is my primary focus. In that context, it is very helpful to have a model that is willing to think with you and is not antagonistic.

I still have some doubts. I noticed that Composer 2.5 seems to outperform GPT-5.5 High. Have you switched the base model (Kimi K2.5)? Or is it really possible to achieve such a large improvement without changing the base model?

I am enjoying 2.5; it’s much better than 2.

Why not Kimi K2.6, which is inherently a more powerful model than K2.5?

But I will say I have been grinding through my office tasks with Composer 2 for a while now, and a decent dev plus Composer 2 can work wonders. It’s not like Opus Max, where you can prompt like a caveman and still get results, but Composer 2 gets it done for me when I have sufficient knowledge.

love it, looking forward

Training a model takes time and computational power. K2.6 hasn’t been out for long, so it still requires adaptation, training, and other work. But Composer 2 is just too terrible… I think Composer 2.5 is a temporary version and will hopefully be updated soon. I just tried it and I’m excited: it really has improved a lot.

Their checkpoint is still Kimi; that’s the magic of further scaling post-training.

I didn’t fully understand the plans for the future, but it seems their next model will be trained from scratch? This is very plausible since they have Elon’s compute now.

I tried using it today and I get about 5 tokens per second on the fast variant, which is not really usable.

Would it be possible to raise the context window limit to 240k or 250k tokens? The base model has a 256k window :thinking:

In some tasks, the OpenAI model is preferable to Claude because of the larger context window. And you declare your model as their competitor.

Sadly, it looks like I won’t get the chance to use Composer 2.5 this month. Even though I purchased the Ultra plan, I hit the usage limit after just five or six days. I later registered another account and bought the Ultra plan again, but the same thing happened — I used up all the quota within a few days.

Cursor really should offer users a higher-tier, better-value plan to choose from!

If you don’t use GPT-5.5 and Claude, you can stretch the quota out to two weeks.

So Double Ultra looks reasonable.
And, if Composer 2.5 is really as good as the blog post claims, then we can use paid models even less now.

Update after using it for a while: the model makes beginner mistakes, e.g. it is very happy to create N+1 queries. It has a good process (it analyzes first before making changes and verifies them afterwards) but is lacking in intelligence.
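
For anyone unfamiliar with the term, here is a minimal sketch of the N+1 query pattern mentioned above, next to a single-JOIN alternative. The `authors`/`posts` schema, the sample rows, and all function names are hypothetical, chosen only for illustration; this is not code the model produced.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ana'), (2, 'ben'), (3, 'cy');
        INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
        """
    )
    return conn


def titles_n_plus_one(conn):
    """N+1 pattern: one query for the authors, then one more per author."""
    queries = 0
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries += 1
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1  # grows linearly with the number of authors
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_single_join(conn):
    """Same result from one query, using a LEFT JOIN and grouping in Python."""
    rows = conn.execute(
        """
        SELECT a.name, p.title
        FROM authors AS a
        LEFT JOIN posts AS p ON p.author_id = a.id
        ORDER BY a.id, p.id
        """
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors without posts yield a NULL title
            result[name].append(title)
    return result, 1


if __name__ == "__main__":
    conn = make_db()
    print(titles_n_plus_one(conn))
    print(titles_single_join(conn))
```

Both functions return the same mapping, but the first issues one query per author on top of the initial one, so the cost grows with the table size. That is the kind of thing worth checking for when reviewing generated data-access code.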

Apparently Composer 2.5 can’t make a pie chart. Nice :skull:

After trying Composer 2.5 on several tasks, I can say around 80% of my work now uses Composer 2.5.

Yes, I still run into some problems, especially when asking it to rewrite code or make the code more human-friendly using principles like:

  • KISS
  • DRY

Sometimes, the plan from Composer 2.5 fails. So I revert the changes, use GPT-5.5 Medium to fix the plan, then use Composer 2.5 again to execute it.

And it works.

So, my conclusion:

With the pricing still relatively cheap, Composer 2.5 is a great model, especially compared to other models at this cost. It is a strong option for people on the $20 or $60 plan, as long as they are not relying too much on Fast Mode.

For daily tasks such as:

  • Implementing features from tickets

  • Bug fixing

  • Turning Figma designs into code

It works as expected. I do not see many problems there.

Hopefully, with Composer 3 and SpaceXAI, Cursor gives us even more choice for models, especially now that we can compare performance across the Artificial Analysis Coding Agent Index.