Gemini 3.0 Pro - Out Now!

I reported this and asked for someone to tell us how summarize is supposed to work here: How is `/summarize` supposed to work? - #3 by RafeSacks. So far no answers.

I have not found the right use case for Gemini yet. I was so excited to give it a try after seeing the benchmarks, but it seems “too creative” to write code that both follows direction and stays consistent with the codebase. I’ve found that undoing its work and switching to either GPT-5.1 or Sonnet is more likely to produce code that looks like something I would write (or is at least consistent with the code that’s already there).

I am sure that “creativity” is because it is more likely to hallucinate and choose unexpected paths to a solution… it does seem to have a very high chance of making something that works, but also something that may need to be reworked/cleaned up.

I suspect Gemini is better for more abstract use cases, like novel coding problems or planning… maybe complex debugging without edit access(?), but I haven’t had the opportunity to test this theory yet.

Anyone else finding this?

2 Likes

It’s not as good as Sonnet at following instructions

I have a set of agents (in Opencode) with the last step including a formatting rule for the output. Sonnet always follows this, whilst Gemini simply ignores it :sob:

Try using commands as well; for me they work better than rules.

Last week I might have said something else, but since that message of mine I haven’t been able to do anything sensible with Sonnet 4.5. It ignores what I say, forgets parts of plans, doesn’t seem to notice obvious mistakes, and takes weird paths to make things happen. I’m wary of even trying it, although I have to when Gemini 3 is overloaded, because most of the time it screws up the code more than it does good.

Not sure why, but every time it runs a terminal command, it won’t know when the command has finished, and I have to interrupt the command manually.

1 Like

It’s only 184 for the sentences you see. The rest is input tokens: your files, etc.

Despite the hype and the “1-shot skills”, you have to be really careful with Gemini.

You can give it a plan and tell it “do steps 1 and 2” and it will try to build the entire thing, whereas Sonnet, Haiku, Codex, etc. will only do what you ask. That’s fine for simple tasks, but if you’ve broken a big task down, get ready for Gemini to build slop on top of slop…

The workaround is to remove anything from your plan that you don’t want it to do in that run.

2 Likes

Whenever I use any AI model, I give it a plan to follow, whether it’s for research or something to accomplish. A model, any model, should never be allowed to act on its own without knowing the details.
Gemini 3.0 is great because it doesn’t make up solutions, and when well guided it’s very powerful. Something I’ve noticed here is that the model is often given very poor direction with no follow-up, and that’s why it creates whatever it wants and becomes overly creative. But if you set the rules of the game with commands and a structured, reviewed plan, it doesn’t fail, or fails very little. This model is far superior when used correctly.

3 Likes

Fix the connection problems already, for crying out loud. Why do I keep getting errors all the time, ■■■■ it? Why can’t you test things before releasing something?

Okay, the model is working now, but someone made it super dumb in the meantime. It now draws conclusions that are far from the superior reasoning it originally had.

After a few days of getting ■■■■■■ off, I went back to Sonnet 4.5. Gemini 3 Pro is completely unusable: it thinks forever, gets overloaded all the time, and even though it fixed a few things in my code, Sonnet still had to finish them because halfway through Gemini would throw an overloaded-model error. For me, it’s unusable on the Ultra plan.

2 Likes

One common bug in Gemini 3 is that it emits its thinking as the prompt output, so you end up with a huge answer where the model is just thinking out loud.

It’s a new model; it needs some time for them to fix the bugs. I’m sticking with Sonnet 4.5. This month I’m already on my third Ultra package and I’m at 63%, but it’s still way cheaper than Coding House, and I don’t even want to think about how long this would take through Coding House. This is my third project that actually feels like it might finally work out. I’m roughly 15k euros down, but I don’t regret any of it. People have bigger problems anyway xD

1 Like

I don’t think it beat Claude 4.5 on SWE-bench, or maybe it’s one percent better?

Gemini didn’t:

1 Like

No idea what this output is, but it’s going around in circles. Not great. The thinking part seems to be wheel-spinning: “wait, let me think… oh yeah, I need to do this… oh wait, I missed that…”

2 Likes

Does Gemini 3 keep context between requests?
The dashboard shows the entire context being loaded as Input rather than as a Cache Read.

I didn’t have this issue on my end, but it annoys me that I have to confirm its action plan at every step. Sonnet makes more decisions on its own (not always the right ones), but it’s easier to type “stop” once than “OK” ten times. With Sonnet I can work for 5 hours without resetting the chat, while Gemini has no idea what it’s doing after 30 minutes.

Cursor should really think about creating a higher package than Ultra.

I’m also on my 2nd Ultra.

1 Like