I reported this and asked for someone to tell us how summarize is supposed to work here: How is `/summarize` supposed to work? - #3 by RafeSacks. So far no answers.
I have not found the right use case for Gemini. I was excited to give it a try after seeing the benchmarks, but it seems "too creative" to write code that both follows direction and is consistent with the codebase. I've found that undoing its work and using either GPT-5.1 or Sonnet is more likely to produce code that looks like something I would write (or is at least consistent with the code that is already there).
I am sure that "creativity" is because it is more likely to hallucinate and choose unexpected paths to a solution… it does seem to have a very high chance of making something that works, but also something that may need to be reworked/cleaned up.
I suspect Gemini is better for more abstract use cases, like new code problems or planning… maybe complex debugging without edit access(?), but I haven't had the opportunity to test this theory yet.
Anyone else finding this?
It's not as good as Sonnet at following instructions.
I have a set of agents (in Opencode) with the last step including a formatting rule for the output. Sonnet always follows this, whilst Gemini simply ignores it.
Try to use commands as well, for me they work better than rules.
Last week I might have said something else, but since this message of mine I have not been able to do anything sensible with Sonnet 4.5 anymore. It ignores what I say, forgets parts of plans, doesn't seem to notice obvious mistakes, and takes weird paths to making things happen. I'm very afraid to even try it, although I have to when Gemini 3 is overloaded, because most of the time it will screw up the code more than it manages to do good.
Not sure why, but every time it runs a terminal command, it won't detect when the command has finished, and I have to interrupt the command manually.
It's only 184 for the sentences you see. The rest is input tokens, so your files, etc.
Despite the hype and the "1-shot skills", you have to be really careful with Gemini.
You can give it a plan and tell it "do steps 1 and 2" and it will try to build the entire thing, whereas Sonnet, Haiku, Codex, etc. will only do what you ask. That's fine for simple tasks, but if you've broken down a big task, get ready for Gemini to build slop on top of slop…
The solution is to remove from your plans anything you don't want it to do in that run.
Whenever I use any AI model, I give it a plan to follow, whether it's for research or something to accomplish. A model, any model, should never be allowed to act on its own without knowing the details.
Gemini 3.0 is great because it doesn't make up solutions, and when well guided it's very powerful. Something I've noticed here is that many times the model is given very poor direction and there's no follow-up, and that's why the model creates whatever it wants and becomes overly creative. But if you set the rules of the game using commands and a structured, reviewed plan, it doesn't fail, or fails very little. This model is far superior when used correctly.
Fix the connection problems already, for crying out loud. Why do I keep getting errors all the time, ████ it? Why can't you test things before releasing something?
Okay, the model is now working, but someone made it super dumb in the meantime. It now reaches conclusions that are far from the superior thinking it originally had.
After a few days of getting ██████ off, I went back to Sonnet 4.5. Gemini 3 Pro is completely unusable: it thinks forever, gets overloaded all the time, and even though it fixed a few things in my code, Sonnet still had to finish them, because halfway through Gemini would throw an overloaded-model error. For me, it's unusable on the Ultra plan.
One common bug in Gemini 3 is that it emits its thinking as the output of the prompt, so you end up with a huge answer where the model is just thinking out loud.
It's a new model; it needs some time for them to fix the bugs. I'm sticking with Sonnet 4.5. This month I'm already on my third Ultra package and I'm at 63%, but it's still way cheaper than Coding House. And I don't even want to think about how long this would take through Coding House. This is my third project that actually feels like it might finally work out. I'm roughly 15k euros down, but I don't regret any of it. People have bigger problems anyway xD
I don't think it beat Claude 4.5 on SWE, or maybe it's one percent better?
Gemini didn't:
No idea what this output is, but it's going around in circles. Not great; it seems the thinking part is wheel-spinning: wait, let me think, oh yeah I need to do this, oh wait I missed that…
Does Gemini 3 keep context between requests?
The dashboard shows that the entire context is loaded as Input, not as a Cache Read.
I didn't have this issue on my end, but it annoys me that I have to confirm its action plan every moment. Sonnet makes more decisions on its own; not always the right ones, but it's easier to type "stop" once than "OK" ten times. With Sonnet I can work for 5 hours without resetting the chat, while Gemini has no idea what it's doing after 30 minutes.
Cursor should really think about creating a higher package than Ultra.
I'm also on my 2nd Ultra.
