o4-mini / Claude 3.7 / Gemini 2.5 Pro

ArtAndrew · May 7, 2025, 7:44am

Which is better? Please share your experience.

VibesUnbound · May 7, 2025, 9:30pm

I can’t stand o4-mini’s lack of context… it’s like 50 tool calls and no information about what it is doing.

In a very early chat during my testing, Claude 3.7 trivially changed a unit test in a TDD workflow just to make the test case pass, and I lost trust. Its tool calling was better than 2.5 pro exp 3/25, and it gave more context to the human overseer.

Gemini 2.5 pro is my favorite but has issues with tool calling (edit_file issues are covered in another post as well). It does a great job of keeping me in the loop and hasn’t taken any trust-breaking shortcuts, although it does tend to apply the simplest fix, with explanations. I’d rather wrestle with tools than with bad or trivial code. Can’t wait for the new 2.5 pro exp 5/6/25 model to be in Cursor!

asim1801 · May 8, 2025, 7:07am

Claude has a better tool calling system; it doesn’t fail as often as Gemini. Gemini is great, but it struggles with tool calls. Check the image below: this is a new chat, and it still keeps failing. Currently, I use Claude for UI/UX and Gemini for everything else.

asim1801 · May 8, 2025, 7:10am

Based on what I have read on the forums, the current model gemini-2.5-pro-exp-03-25 is actually the latest one under the hood.

VibesUnbound · May 8, 2025, 3:45pm

Thanks for the model context! Unfortunately, it sounds like Gemini still needs to get better at tool calls and to stop generating the termination token when a response is unfinished.
