I don’t know if it’s just my impression, but it seems like Claude Sonnet 4.5 Thinking has dropped in performance: it can’t quite follow my instructions for creating a section of my software. I also tried Claude Opus 4.5 Thinking, but it didn’t do it well either.
I wanted to try Gemini 3 Flash Mini (High), which instead understood the instructions immediately and created the section in a flash and exactly as I wanted.
Is it just my impression, or have you also noticed that Gemini 3 Flash Mini is now superior to Claude Sonnet 4.5 Thinking?
What do you think?
Hey! This is an interesting observation. Model performance can vary depending on the type of task, the prompt, and the context.
A few points:
- Different models really are better at different kinds of tasks. Gemini Flash is good for quick, simple tasks, while Claude is better for complex reasoning and large contexts.
- There are known issues where Claude 4.5 sometimes doesn’t follow instructions properly, and the team is working on improvements.
- If Claude Thinking isn’t giving the results you need, try:
  - Rewriting the instructions more explicitly
  - Using shorter, clearer prompts
  - Breaking the task into smaller steps
  - Using Cursor rules to set preferences
I’d be interested to hear from other users about their experience with different models. What kinds of tasks are you working on?
Hi Dean,
I’m building a WordPress plugin and have been using Claude Sonnet 4.5 so far, and I have to say I’ve always been happy with it. Recently, however, I’ve noticed a decline in its performance: on tasks (more or less similar) that it used to perform flawlessly, I now see it having problems, as if it were a little “less intelligent”. On a graphical interface (HTML + JS + CSS), it couldn’t make the changes I asked for: something was always wrong, and the interface never ended up how I wanted it. When I gave the same instructions to Gemini 3 Flash Mini (High), it immediately understood them and executed them perfectly, creating what I asked for. Now I’m unsure which AI to continue developing with…
Thanks for the extra details! I get your situation. For UI tasks (HTML + JS + CSS), different models can give different results.
About picking a model, if Gemini 3 Flash Mini works better for your WordPress plugin tasks, it’s totally fine to use it. Different models are better at different kinds of work. You can always switch models depending on what you’re doing.
The team is working on improving Claude 4.5’s behavior, but for now I’d recommend using the model that gives the best results for your tasks.
Thanks for your reply, Dean.
Bye
The Sonnet version of 4.5 is varying wildly. It used to be really good no matter what you threw at it - within reason. But sometimes it feels like it’s a completely different model - almost getting 4.0 vibes lately.
When I tried using 3.0 Flash I was surprised how badly it performs. I do not use Flash 3.0 at all and likely never will, at least for coding tasks. It can also get rather expensive due to its extended thinking.
There was a task I needed done; Opus 4.5 one-shot it and it cost me around 1.5 USD.
Before that, I tried the same prompts with 3.0 Flash: it took a really long time, the result was broken, it failed to understand what I wanted, and the request cost me 1.2 USD. I just reverted and went to Opus.
Your experience might vary.
Hi everyone,
after testing Gemini 3 Flash for a whole day, I can say it’s fast but not very precise: it doesn’t analyze the context and existing code well. It seems to rush to execute instructions without much thought. I’ve gone back to using Claude Opus 4.5, which now seems to respond better to instructions; let’s hope it lasts. What do you think?
I think it’s not wise to compare Gemini 3 Flash with Opus 4.5. It’s better to compare:
- Gemini 3 Pro vs Opus 4.5
- Gemini 3 Flash vs Sonnet 4.5
Opus and Gemini Pro are both frontier models and have almost the same intelligence.
So use Gemini Flash for simple tasks or, even better, for execution. Keep in mind that I don’t use Gemini 3 Pro for planning, since I haven’t had a good experience with it. I usually plan with GPT-5.2 or Auto, then execute with Gemini 3 Flash or Composer-1.