Since the release of the new Gemini 05-06 model just a few days ago, I’ve noticed a dramatic slowdown in the Gemini-2.5-pro-exp-03-25 model when running in the slow queue. This used to be my go-to because of its near-instant responses, but now it often takes 30 seconds to 2 minutes to reply.
The Flash model still responds quickly, but the quality feels noticeably worse: it struggles with complex tasks and image handling. Auto Mode hasn’t been much help either, since it doesn’t show which model is active, and the response quality has been underwhelming.
Is anyone else seeing this? Could it be rate-limiting from Cursor or Google? Or is it just a traffic spike from users flocking to test the new model? I haven’t found any official updates yet, so I’m hoping it’s a temporary hiccup.
Would love to hear if others are experiencing this, and whether you’ve found any workarounds.
Yes, me too. Previously it still responded quite quickly even on slow requests, but now it takes about as long as Claude 3.7 Sonnet does on a slow request.
Massive slowdown across all models. I’m trying to use the agent right now, and with any model I choose (I’m on a paid subscription) it just shows ‘generating…’ endlessly.
After about 15 minutes I get the message: “Your conversation is too long. Please try creating a new conversation or shortening your messages.”
The conversation is not too long… it’s maybe 4-5 messages back and forth.
Yesterday and today I got pretty slow responses from gemini-2.5-pro-03-25 compared to the weeks before, but wow, it’s now outputting about three times more text and analysis when I ask it to review something in my codebase. I think the wait is worth it on this new updated model! I heard gemini-2.5-pro-03-25 has actually been rerouting to gemini-2.5-pro-05-06 for a few days now, right? So yeah, very different from before: the waiting time is now on par with 3.7 Sonnet, but the output quality, and the fact that it goes very deep into the “last mile” of code analysis when it needs to, is pretty impressive. I hope this new behavior on Gemini 2.5 Pro will last!
For quick responses, I guess I’ll just switch to GPT-4.1; it’s good enough for most simple to medium-complexity requests.