Is Claude 4 Opus good as an agent?

What are the results, the impressions, the size of the hole in your pocket?

Has anyone tried writing a lot of code through it?

I don't find any huge advantage compared to Sonnet 4. I've used it many times, and I always ran Sonnet 4 on the same task to compare. The results are nearly identical, just not the costs. If Sonnet 4 can't solve it, neither can Opus; if Opus can, so can Sonnet 4.

2 Likes

With the right prompts, Auto is really the only thing you need.

6 Likes

Share your prompts, cuz I'm about to lose my mind with this Auto mode :smiley:

1 Like

From my side, Sonnet works well wherever Opus also works, and so far Opus doesn't provide a significant benefit for the main tasks of implementation, testing, or debugging. There are likely tasks that need Opus, but those probably have higher complexity or require better reasoning quality.

1 Like

It goes like this:

"<problem statement>

<actual implementation written by you>

…and then the secret sauce at the end:

Take my implementation and copy paste it into the project. Do nothing else."

This prompt strategy has never let me down, even on Auto.

PS. I’m not serious.

3 Likes

Bruh, you’re the laziest dev I ever met :joy:

1 Like

My take: Opus is solid for really complex problems when other agents get stuck - it’s great at identifying root causes and giving detailed analysis. But man, it’s expensive as hell.

I’ve found a better workflow using o3 for debugging and Sonnet 4 for implementing, but honestly it depends on your experience and prompting skills. I’m still learning a lot of this stuff and not great at prompting yet, so Opus actually helps way more since it can work with my mediocre prompts better than smaller agents.

The one place Opus really shines is app architecture reviews - minimal prompting needed for solid feedback and reasoning. For building small/medium apps I rarely use Opus unless I wanna do something fancy or need to compare and contrast approaches.

When I do need architecture brainstorming/feedback, I prefer using it in Claude over Cursor. In Cursor it drains my usage fast and the refresh wait is brutal, but in Claude I wait 5 hours and know exactly how many credits I have left.
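
For what it's worth, the split described above (o3 for debugging, Sonnet 4 for implementing, Opus for architecture reviews) could be written down as a tiny lookup. This is only a sketch: the task categories, the model labels, and the `pick_model` helper are made up for illustration and are not real API identifiers.

```python
# Hypothetical task categories and model labels (placeholders, not real API names).
# The mapping just mirrors the workflow described in the post.
ROUTING = {
    "debugging": "o3",
    "implementation": "sonnet-4",
    "architecture_review": "opus-4",
}


def pick_model(task_type: str, default: str = "sonnet-4") -> str:
    """Return the model label for a task type, falling back to the cheaper default."""
    return ROUTING.get(task_type, default)


if __name__ == "__main__":
    for task in ("debugging", "implementation", "architecture_review", "docs"):
        print(task, "->", pick_model(task))
```

Anything not in the table falls back to the cheaper model, which matches the "rarely use Opus unless I need it" approach.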

1 Like

Give me your prompts and I'll tell you how to make them better.

The more examples, the better.