What's going on with Claude 4 Sonnet?

I don’t understand why this particular model CONSTANTLY hallucinates: it reports things in the code that are not there and offers solutions so erroneous that I am shocked. The previous Claude 3.7 Sonnet works perfectly. Fellow developers, have you encountered this?

5 Likes

Same here, Sonnet 4 cannot be trusted at all. Some might argue that no AI can be trusted, but 4 is ridiculous: it is lying, cheating, skipping, at least from my point of view. I switched back to 3.7, and it will take some convincing to bring me back to that mess.

3 Likes

I notice the same thing, but only during the day. At night it is perfect, not even comparable. I’m not sure how, but during the day it is quite unusable for various reasons, and every night, boom, it magically works like a charm.

1 Like

It’s literally the load on a specific regional server center and how the connection from your internet to that area is routed.

In my area, lots of people complain about having so many issues with certain models, while I’m prompting the same model and it’s super smooth (but I use a VPN, which improves my connection and likely re-routes the request to a better regional hub).

As for Sonnet 4 hallucinations: each model requires some prompt adjustments and getting a feel for how to instruct it. It’s best to start small and build up in complexity.

It happens to me often that prompts which work well on one model are completely useless when I switch to another. So for some tasks I still use Claude 3.5 Sonnet, because it just performs so well.

We might need a library of well-working prompts and notes on what makes responses better or worse.

2 Likes

Something special happened between last night and this morning (NY time). Claude 4.0 started fixing issues without a blink… I am sure something happened in the backend after all the ranting in this forum.

Thank you!

No, the ranting didn’t particularly help, but users who reported issues helped identify causes 🙂

Still ■■■■■, actually imagining stuff that doesn’t exist and not following clear instructions. Don’t recommend it; 3.7 is still much better.

It’s been a long time, and eventually I pretty much gave up on Claude 4 Sonnet; I only use Gemini 2.5 Pro and o3. I don’t understand how you can use Claude 4 in real tasks that require accuracy; it’s only good for jokes. ☹️

2 Likes

Exactly the same problem, and it just hangs for hours. You give it a prod and it apologises, continues with a few more lines, and does the same thing again. Or it’s halfway through doing something and says it’s busy and advises using auto mode and a different host. Or it just gives you the VPN message. It’s beyond a joke now.

Yes, I have expressed this concern before with @condor. Claude 4, as an “enhanced” LLM which supposedly utilizes more “LAYERS” of information and processing, certainly seems lacking in the intelligence department. When an older model exceeds you by “3x”, clearly an issue is at hand, especially when it’s everyone and not a single user.

As stated before, Claude 3.7 Thinking (non-max) works much better than Claude 4. I can’t even test with Claude 4 because the very first “code” it outputs is a literal string of emojis and “OH MY!!! YOU WERE RIGHT!!!”

It’s more CHATGPT than CLAUDE.

Interesting, I found quite the opposite.
Previously I was using GPT 4.1 over 3.7 because 3.7 either hallucinated or went off on its own mission all the time.
With Sonnet 4 I get very little of this, and it’s generally only if I have been in the same window for ages.

Interesting. Normal Claude 4 works for you? No excessive emojis or third-grade-level sentences?

Generally Claude 4 has worked really well for me. It sticks to what I want it to do and doesn’t hallucinate (much).
I do have to tell it to disagree with me, or to tell me when I’m wrong, instead of trying to make me seem right all the time, but I need to do that with all the models.
I do use a memory bank with a bunch of rules and custom agent instructions; perhaps this has something to do with it?

“I do use a memory bank with a bunch of rules and custom agent instructions”

Yes, that would most likely contribute to it. It really depends on how much contextual awareness the model is allowed and can utilize. It’s basically like raising a child in your own way.

1 Like