Discussion of gemini-2.5-pro-05-06

I’m having a rough experience with the new model. It exhibits longer thinking times and seems slightly worse than its predecessor. Occasionally the model pauses for a long time, and when it starts outputting tokens it returns a boatload of them, usually gibberish spread out over many new lines.

I would love to try it out properly, but Gemini constantly stops mid-flow and I can’t get it restarted. How are people getting it to actually finish a large set of changes?

Any updates on when the new version will be available in Cursor? (Are you sure it auto-routes to the new version?)

1 Like

Gemini 2.5 Pro appears twice… which one is the correct one?

1 Like

My understanding is 200k with the long-context setting on (which supposedly is more expensive in terms of fast requests used).

The docs around long context are unclear how it exactly works sadly, so it’s a guessing game.

1 Like

I greatly dislike Cursor’s model naming decisions. o4-mini is not o4-mini, gemini-0325 is not gemini-0325. If you are going to update it every time, just name it “Gemini 2.5 Pro”, not the specific model number.

1 Like

I saw a video on the new Gemini model today. They ran the 03-25 version next to the 05-06 model that came out yesterday. The new model takes about 15% longer to think than the older version, but it looked like an improvement.

What made this image?

Amazing model so far.

Yeah, he’s great.

1 Like

Retro comic panel: hero figure hugging a shining ‘Gemini’ device, speech bubble saying ‘Goodbye Claude!’, another small panel below shows ‘Claude 3.7 Sonnet’ box crashing, bold halftone dots, vibrant primary palette

It’s AI-generated, using those cue words as the prompt.

1 Like

Similar here.
I got this error the first time it tried to write code.


Its next attempt took a long time, then all the code appeared at once.

That was the first and only time it ever worked, then it was just error after error.

I’m giving up on it for now.

It’s a bit ironic Google replaced their best model with a ‘broken’ one for their developer conference… :melting_face:

1 Like

which LLM/platform?

So far it has beaten every other model I have recently used.

Really capable.

2 Likes

The AI that generated the image is Doubao, a Chinese app, but it doesn’t seem to have an English version.

豆包 (Doubao) - ByteDance’s AI assistant

1 Like

SWE-bench is my go-to.

Do you guys know if Gemini Pro MAX is using the newest 05-06 model?

Yes. I’ve noticed that the new version produces a lot fewer linter errors. I was doing the URL React work and found that after it wrote 10 files with 1000+ lines of code in one go, the linter was error-free and the code was significantly more capable (though it took twice as long or more to think).

I didn’t know it was the new model.

I was using GPT 4.1, which got stuck on solving a problem, so I switched to Gemini, thinking it was still the old version.

It solved the issue with a single line edit. On other tasks as well, it made very smart edits and comments. I also asked it to review some code and was impressed by the suggestions. It didn’t try to reinvent the wheel — it said the code was mostly fine, aside from a few minor issues. The suggestions were all relevant, and it caught some things I had completely missed.

Now I understand why it feels so good. From these early tests, I’d say it’s a step above the rest.

I definitely noticed the longer thinking time, though.

4 Likes

I did some testing today, and the new 2.5 is definitely better. I compared it with Claude 3.7; this is the first time I got much better results from Gemini than from Claude. It does think longer but is much less error-prone. This also means it is “potentially” cheaper than Claude because it makes fewer mistakes and requires much less back-and-forth.

In my opinion, the more extended thinking is an acceptable tradeoff if it means fewer prompts to get it to where it works.

Much more testing is required, of course, but so far I am actually impressed.

5 Likes