So far so good.
However, I have a question about cached tokens: how much do they cost? Most of my usage is cached, but I can't find the pricing for it anywhere in the original post.
Looks great at many things. It has a lot of the UI "slop patterns" I see in many models, i.e. bands on the side of cards and random unnecessary animations, but it's great at logic and implementation. The speed of fast mode is mind-blowing. Not sure if I'm ready for it to be my daily driver, but it's a fantastic subagent model. I'd like to have it in the automations page, and possibly a thinking budget (i.e. more thinking for plans!).
The previous poster's question is a very good one, and after reviewing the information, the cache-read rate looks quite expensive compared to everyone else's. Everyone else charges input/10; you are charging input/2.5. This makes a big difference.
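To make that ratio concrete, here is a back-of-the-envelope sketch. The $10 per 1M-token input price is purely hypothetical (not Composer's actual pricing); only the input/10 vs. input/2.5 ratios come from the discussion above.

```typescript
// Hypothetical input price: $10 per 1M tokens (illustrative only).
const inputPricePerM = 10;

// Cache-read rate expressed as a fraction of the input price.
const othersCachedPerM = inputPricePerM / 10;  // "input/10"  -> $1.00 per 1M
const hereCachedPerM = inputPricePerM / 2.5;   // "input/2.5" -> $4.00 per 1M

// Cost of reading 100M cached tokens under each scheme.
const costOthers = 100 * othersCachedPerM; // $100
const costHere = 100 * hereCachedPerM;     // $400

console.log(costHere / costOthers); // cache reads cost 4x more at input/2.5
```

For cache-heavy workloads, where cached reads dominate total tokens, that 4x factor on the cached rate can matter more than the headline input price.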
Composer acts like it cannot see MCP servers that were working with Opus or Sonnet moments before. Is there a limitation/restriction on which tools are available to Composer 2 now? Do I have to restart Cursor for the model to "know" the MCPs exist?
I genuinely don't know what it is safe to use Composer 2 for. I saw it land while I was working on something simple and repetitive, so instead of having Claude do the work itself, I asked it to write a prompt for Composer. This was for a very, very simple getLabel(...) function that replaced 23 call-sites. It just takes a data object and returns a string or undefined by falling back through several property checks. That is it: a simple value fall-back pattern.
Composer did it quickly but badly: it created multiple overloads of the function arguments and made incredibly dumb mistakes at about a third of the call-sites, like replacing a call with getLabel(data) ?? data.id when it had just written the function to include id in the fall-back chain!
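For context, a minimal sketch of the kind of fall-back helper described above. The data shape and property names (label, name, id) are assumptions for illustration; the original function wasn't shown.

```typescript
interface Data {
  label?: string; // property names here are hypothetical
  name?: string;
  id?: string;
}

// Return the first defined property, or undefined if none are set:
// a plain value fall-back, no overloads needed.
function getLabel(data: Data): string | undefined {
  return data.label ?? data.name ?? data.id;
}

// With id already inside the fall-back chain, a call-site like
// `getLabel(data) ?? data.id` is redundant: `getLabel(data)` suffices.
const label = getLabel({ name: "Widget", id: "w-1" }); // "Widget"
```

The whole point of such a helper is that the fall-back order lives in exactly one place, which is what makes re-appending `?? data.id` at the call-site a mistake.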
I asked Claude to review the work and I’ve never heard a model sound so confused. I asked it to give me a code review prompt for the agent to try and fix everything.
Composer fixed the function design but left several bad implementations in place, and it also randomly added new typing. This is the first time I have heard the phrase "this is a lie cast", but it fits.
These prompts were highly detailed and written by a model, far more thorough than what a human would write, and it still drifted off.
That's 5 large prompts across two agents for a task Opus could have one-shot.
| Date & Time      | Model      | Tokens |
|------------------|------------|--------|
| Mar 19, 02:43 PM | composer-2 | 2.3M   |
| Mar 19, 01:52 PM | composer-2 | 1.6M   |
| Mar 19, 01:52 PM | composer-2 | 72.5K  |
| Mar 19, 01:04 PM | composer-2 | 1.8M   |
That is a LOT of tokens for this task, and oddly the most came in the last turn, which was just a few fixes.
I really want to love this thing for the speed, but the total time spent ends up so much higher when I try it. Keep iterating though! Once it can execute cleanly, it’ll be a monster.
I observed it too: it cannot see the MCPs I installed. Maybe a restart is needed, as one of its replies suggested, but what I do instead is point it at my .cursor\projects directory. I noticed Opus 4.6 looks in that directory for my MCPs, so I tell Composer about it, and from then on it knows I have MCPs available. It should have worked out of the box, though.
Following the recent update to Model Composer 2, I frequently encounter loading errors on my dashboard page when attempting to check usage statistics. I have experienced similar issues in the past, but the problem is much more pronounced this time.
According to the published chart (https://cursor.com/blog/composer-2), either the chart/score for Composer 2 is totally wrong, or the benchmarks are useless.
Composer 2 is so much worse than Opus 6 that it's ridiculous. It's not even close. The lower capability is super evident in just a few prompts. I saw the chart and was super excited, but was brought back down to earth within 5 minutes of trying it out. Yes, it was too good to be true.
The hell? Did it actually go and modify the damn translation types manually? 4.5k unnecessary line changes; no other model so far has ever touched that file.
I just went to check how much it actually wasted. For the first time ever, a single run used 6.4M tokens. Total waste.