The only thing I’d add is that it’s not really fair to compare “expensive” and “cheap” models head-to-head. We’ve got to keep something like a hedonic index in mind: paid GPT-5 and Sonnet 4.5 can handle certain tasks that free GPT-5-mini and Grok Code Fast will never manage, or will manage far worse.
Right now I’ve pretty much switched entirely to GPT-5-high, mainly because it’s really smart and not that pricey. But some super simple tasks are perfectly safe to run through GCF, and some actually need Sonnet 4.5 (or Haiku 4.5).
Can’t wait to see your take on Composer 1 if/when you get around to testing how it compares. If you start a new version of this thread, please leave a link; much appreciated!
■■■■ right, man. Grok is completely broken.
None of my rules/Agents instructions work.
I don’t know why Cursor seems to be nerfing this model.
It’s a completely different story when I use Continue.dev or Cline in VS Code: there, Grok 4 and Grok 4 Fast outplay almost everything.
Even when calling the API from my external app for a specific roleplay workflow, Grok is still ■■■■ good in terms of responses/results.
(I switch back and forth to try out new models on OpenRouter.)
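For anyone who wants to try the same thing: calling a Grok model through OpenRouter is just a standard OpenAI-style chat completions request. Here’s a minimal sketch; the model slug and system prompt are placeholders of mine, so check OpenRouter’s model catalog for the exact ID:

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, system_prompt: str, user_msg: str) -> dict:
    """Build an OpenAI-style chat completions payload (OpenRouter accepts this shape)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg},
        ],
    }

def call_openrouter(payload: dict) -> str:
    """POST the payload to OpenRouter and return the first completion's text."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Only fire a real request if a key is configured.
# "x-ai/grok-4-fast" is an assumed slug; verify it against the OpenRouter catalog.
if __name__ == "__main__" and "OPENROUTER_API_KEY" in os.environ:
    payload = build_request(
        "x-ai/grok-4-fast",
        "You are the narrator of a roleplay session.",
        "Set the opening scene.",
    )
    print(call_openrouter(payload))
```

The nice part is that the same payload shape works whether you point it at OpenRouter, a local proxy, or the provider directly, so swapping models is just changing the slug.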
IMO the best model for daily coding right now, without being expensive, is Kimi K2 (but it breaks the display formatting in the chat window; you might be able to fix this by hard-injecting a rule into your Rules or Agents instructions that names the model on the first line).
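For reference, the kind of hard-injected rule I mean would look something like this at the top of an AGENTS.md or Cursor rules file (the exact wording here is just my illustration, not something Cursor documents):

```markdown
Model: Kimi K2
Always respond in plain GitHub-flavored Markdown. Wrap every code snippet in a
fenced code block with a language tag, and never emit raw HTML in chat output.
```

Naming the model on the first line is the trick: it seems to make the formatting instruction stick for that model specifically.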
The most balanced model in Cursor right now is Gemini Pro 2. It’s good at CoT and tool calling, and it works pretty well with Agents.md instructions.
I’ve used Composer. It felt fast and accurate for the first 5–6 turns, but it keeps getting stuck in reasoning loops when a hard problem comes up.