What am I doing wrong with claude?

exoder · June 19, 2025, 9:00am

Hello! Everywhere I find info that claude-4 is best at coding. Here is my story.

For my app I asked 3 models the same thing: to analyze my code and give me a plan to implement a new feature.

Tested models:
o3, gemini-2.5-pro and claude-4-sonnet (all thinking variants)

The prompt was the same and every chat was new. But I forgot to mention in the prompt one requirement that is crucial for the right architecture.

Out of the 3 models, only gemini gave me a good architectural plan, because it pointed out the possible requirement that I had not mentioned.

o3 and claude gave me plans with the wrong architecture; they did not foresee the potential problem like gemini did. I had to add the missing requirement for them to redraw their plans.

Then I gave all three plans to every model and asked it to compare them in terms of robustness, logic, good practices, etc.

As a result, all 3 models gave me practically the same response:
o3 - best, most robust and clever plan
gemini - almost the same, but with some minor flaws
claude-4 - the plan is the worst (and interestingly, claude itself gave the most negative review of its own plan).

I understand that it is only one case. But how come the “best” coding model, according to maaaany opinions out there, makes the worst coding decisions?

Second: when I presented the same ■■■■■ plan to claude, but told it that it was its own plan and asked it to analyze it, claude answered that it is a very good and very smart plan!

What am I doing wrong?

davedev · June 19, 2025, 9:51am

You’re using claude to structure your plan, I personally don’t do that!

the best ones to structure are ChatGPT o1/o3/deepseek.

but to EXECUTE THE PLAN!

oh my friend… then claude gets stupidly better imo haha

exoder · June 19, 2025, 10:20am

Hey, thank you for the reply. I did not like how claude-3.7 did coding: it almost always ruined code that had nothing to do with my task. I have not tested claude-4 in this regard yet. o3, on the other hand, almost never ruined my code. Gemini was 50/50.

condor · June 19, 2025, 10:23am

Hi @exoder and welcome to Cursor Forum

I also did not like how Claude 3.7 Sonnet coded; it made mistakes or wrote too much code for features I never asked for.

Claude 4 Sonnet works well with coding for me.

It’s definitely good to try different models to see how they perform.

condor · June 19, 2025, 10:24am

Hi @davedev, welcome to Cursor Forum to you as well, and thanks for contributing. It’s always good to see what works for others.
