Gemini-2.5-pro / Will Lie and Deceive

Burnett · June 1, 2025, 8:11pm

I used Gemini Pro for the first time yesterday.
I began with small UI tasks, and it handled them without issues.

I started feeling comfortable enough to give it a complex task:
"We need to create a window that will allow the user to select an Icon."
In short, I watched Gemini fill my screen with code until the server would not start. After backtracking, Gemini got the server running.

I opened my project to see the UI a complete mess.
Gemini explained how I didn’t need any of the other components to be viewable and how I could just use this new component I had never seen. The lie was elaborate and meant to convince me. This happened twice.

In short, I will never use Gemini again.

gustojs · June 2, 2025, 1:17am

By all means give it a second try later.

Gemini 2.5 pro is a great model as long as there are enough project rules and documentation for it to work with. The more it knows about your requirements, the better decisions it makes.

It also works very nicely for discussions and planning. I like to treat it as a coworker and discuss the solutions and the whole project with it.

Also, it’s good at doing multiple tasks in the same response (as long as it doesn’t error out on tool usage), which allows us to save fast requests.

Finally, what really helps is making it write down what it wants to do in markdown files that you can then both iterate on, for anything from little tasks to epic-scale new features it builds from scratch.

The main downsides of Gemini 2.5 pro:

- terrible issues with tool usage (it either fails when trying to use tools, or says it will now code and then stops, probably erroring out under the hood)
- obfuscated thinking process (another model summarizes it, so we can’t use the thinking process as a source of feedback for making our prompts better, unless we use a hack)

Burnett · June 2, 2025, 4:01am

I documented my project for 1 year, then found Cursor as I was looking for a programmer. I have images and documentation galore.
I’ll give Gemini another try. Also, just prior to my session with Gemini, I cleared the conversation and deleted the index. I started fresh, so it’s not that either. I’ll try again and see how it goes. Claude 4 is dope!

XhakaTech · June 2, 2025, 4:34am

All LLMs lie, and they don’t even notice.

gustojs · June 3, 2025, 3:59pm

That is a surprisingly human behavior, btw.

tiz.io · June 3, 2025, 9:29pm

Humans never lie.

Burnett · June 3, 2025, 9:31pm

I gave Gemini another session and I ended up reverting 50% of what I did.
It’s my opinion that Claude-3.7-thinking is better than Gemini.
