Has Anyone Noticed Model Output Quality Is Dropping as Usage-Based Billing Rises?

From what I can tell, they changed the system prompt about a day after Grok 4 launched. It was noticeable not just because Grok 4 started working properly, but because Gemini 2.5 Pro's behavior shifted as well. Honestly, I was pretty happy with these changes. My personal user rules have stayed the same this whole time.

Regarding your point about the output becoming more verbose: my experience has been the complete opposite. Since that first update, my Grok 4's behavior has changed significantly. Now it first spends a very long time studying the context without saying a single unnecessary word, then makes its adjustments, and only once the work is completely finished does it report back on what it did. It basically only "talks" at the very end of the process.

If we’re talking about verbosity, Gemini has always been wordy, and Claude’s behavior seems to be pretty much unchanged.

I haven’t seen any behavior that suggests it’s intentionally padding its responses. In fact, I more often deal with the opposite problem: the agent stops working earlier than it’s supposed to. And I don’t just mean the typical Grok 4 errors where it stops after the first action, but a more general “laziness.”

As for the slowdown in response time, I do occasionally see the model taking a moment before it starts working. I’m not sure when that started, but it hasn’t really been a big issue for me.
