How To Optimize Your Usage: The Best AI Models to Use, version 2.2

Previous version of the guide:


Gemini 2.5 Pro + Claude 4 Sonnet Thinking > Grok 4 >? GPT-5 >> Auto > GPT-5 Mini > Sonic

  • Grok 4 is broken - I like its approach, but it is constantly unable to continue the chat.
  • Gemini 2.5 Pro is less broken than Grok 4 - it can also interrupt execution in the middle of a task, like Grok, but less often.
  • Claude Sonnet 4 Thinking is very good when it is smart enough for the task, but expensive. Very good as a QA engineer. I usually use it when other models have annoyed me with their stupidity or when I'm fed up with the errors in Cursor Agent Chat.
  • I tried Claude Opus 4.1 Thinking and it was absurdly expensive considering what I got in return.
  • GPT-5: I’m still trying it out. I switch between GPT-5-high and GPT-5-low.
  • Sonic (Grok 4 Coder Lite?) is lightning fast but dumb. I really hope the unnamed provider supplied a lightweight version of the model, and not the full one.
  • I previously recommended o4-mini as a budget option. Now it seems slow, and it’s also pretty lazy. So for very simple tasks, use Auto, GPT-5 Mini, or Sonic. Or GPT-5-low - it’s pretty cheap and powerful.

I also recommend giving my tools a try (though I've had to put their development on hold for now :melting_face:).

Turn on the bell in Watching to get updates about the guide!

@condor Reddit tells me that the Reddit version in r/Cursor was automatically deleted by Reddit's autofilters, even after I removed my GitHub links from it. Can you help me somehow?

I sent it to the Reddit mods. They mentioned that autofilters are not managed by mods, so this is automatic. Also, it seems your account there was suspended, which may be another reason why it's not showing.

Edit: Moved Gemini to the top of the food chain. Still not 100% stable, but for complex tasks it’s the best in terms of cost/quality ratio.
Grok 4 also approaches tasks comprehensively and in some cases is better than Gemini, but it is chaotically good.

I decided to recommend both Claude and Gemini.

  • Gemini is good for writing code. For debugging too.
  • Claude is worse at code architecture. That may not be exactly the right term, but that is the feeling.
  • Gemini sometimes breaks right in the middle of execution. Claude is absolutely stable.
  • The main thing: they have different approaches. When one works poorly, let the other one have a look.
  • Claude is the least lazy of all the models. Sometimes this is bad. But if you launch it deliberately for that reason, it turns out great.
  • Grok 4 is sometimes smarter and can sometimes work for tens of minutes without interruption. But I can't say that it's better than Gemini or Claude. At least I couldn't get that result, although I had high hopes for Grok.
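
The "when one works poorly, let the other one have a look" habit from the list above can be sketched as a small fallback loop. This is only an illustration and not something Cursor provides: `first_acceptable`, `fake_run`, and the model labels are hypothetical names, and the `run` callback stands in for however you actually send the task to a model.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_acceptable(
    models: Sequence[str],
    run: Callable[[str], str],
    accept: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Try each model in order; return (model, answer) for the first accepted answer."""
    for model in models:
        try:
            answer = run(model)
        except RuntimeError:
            # Assumption: a mid-task interruption surfaces as RuntimeError.
            # Treat it as a failed attempt and hand the task to the next model.
            continue
        if accept(answer):
            return model, answer
    return None  # every model either broke or produced a rejected answer


# Hypothetical stand-in for a real model call: the first model "breaks".
def fake_run(model: str) -> str:
    if model == "gemini-2.5-pro":
        raise RuntimeError("interrupted in the middle of execution")
    return f"patch from {model}"


result = first_acceptable(
    ["gemini-2.5-pro", "claude-4-sonnet-thinking"],
    fake_run,
    accept=lambda answer: answer.startswith("patch"),
)
```

Here the first model is interrupted, so the task falls through to the second one. In practice the `accept` check would be whatever you already do by hand, for example running the tests.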



I thought Sonnet 4 was more expensive than Grok 4 :thinking:

If you use Agent Compass, the Agent outputs a report at the end of its work. GPT-5 lies in these reports, even when it has evidence that what it says is false. Neither Gemini nor Claude allows itself to do this.

I jumped to conclusions - all models (GPT-5, Gemini, Grok 4) have problems noticing an incorrect message in the console output :man_facepalming:


My choice of models now: Gemini 2.5 Pro, GPT-5-high, Grok 4, Grok Coder Fast

  • Gemini 2.5 Pro and GPT-5-high: GPT-5-high is smarter, but Gemini is less lazy. I thought Gemini took a more thorough approach because it was smarter, but in reality GPT-5 is just a soulless machine that does only what it was asked to do in the last request. You can't even give it hints along the way through Send to Queue - it will just ignore the previous commands and do only what was in that short hint.
  • Grok 4 is expensive and still broken. Occasionally I run it for a different look at the problem.
  • Grok Coder Fast - I use it to quickly collect information about something in the repository. But sometimes it's too dumb even for that. It can be replaced with Auto or GPT-5-mini (I haven't tried -nano, as mini is already quite far behind in intelligence).
  • Claude Sonnet 4 Thinking - I like it, but it's too expensive relative to everyone else.

I forgot to say that Gemini is the best in terms of cost-quality-speed. But GPT-5-high is more trouble-free.
