I’ve been testing different models, especially Gemini 2.5 Pro and Claude Sonnet 4, and they seem to give very similar, if not identical, responses or solutions. So, I’d like to know which one is the best as an agent, or at least which one is the best in each case.
Hi, I use Cursor a lot. Like 12 hours or more.
“Claude Sonnet 4 Thinking” is my favorite. My project is pretty complex and cs4t has saved my A.S.S. many times.
For agent mode, Claude 4 is the best, and its annotations are very standardized.
Gemini performs better if a very large context is required.
Grok 3 is now my favorite
Agreed, Grok-3 is the best, even though Cursor still hasn’t gotten Grok-3 to use reasoning despite stating it can, and has halved its context to 60k.
Yeah, thanks for your comment. It fixed one of my WordPress plugin issues immediately, and the other one in two tries. Thank you. I was getting really mad last night.
Grok-3 even uses the user_prompt MCP, which I sometimes use on tasks I want to control better or when I’m working step by step through UI changes, meaning I can fix a lot of issues in one request instead of 5.