My opinion about Gemini 3.5 Flash

After trying out the Gemini 3.5 Flash model for a while, I wanted to share some thoughts. Sure, it’s incredibly fast, but when given the exact same prompt, it burns through way more tokens than other models. For instance, solving the same issue took Codex 5.3 about 200k tokens, whereas Gemini 3.5 Flash used 1.32 million. That’s roughly 6.6 times as much!

Because of this massive token consumption, Flash basically loses its two main selling points: being cheap and fast. It seems like Flash tries to brute-force solutions by exploring everything, while Codex is better at deep reasoning and only checking the most relevant files.

Hey, thanks for the feedback, that’s an interesting observation.

What you’re seeing is generally expected. Flash is optimized for speed, and it often pays for that with broader exploration, such as more tool calls and more files read. Codex and Sonnet usually reason more deeply and take more targeted actions. On simple or medium tasks, Flash wins on latency and cost, but on harder tasks with lots of context, that strategy can end up costing more.
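
To put rough numbers on that trade-off, here is a minimal Python sketch using the token counts from the post above. The per-million-token prices are made-up placeholders, not real pricing for either model, and the helper names are just for illustration. It also treats every token as billed at one flat rate, which ignores any difference between input, output, and cached tokens.

```python
# Token counts reported in the original post.
FLASH_TOKENS = 1_320_000
CODEX_TOKENS = 200_000


def token_ratio(tokens_a: int, tokens_b: int) -> float:
    """Return how many times more tokens run A used than run B."""
    return tokens_a / tokens_b


def run_cost(tokens: int, price_per_million: float) -> float:
    """Cost of a run at a single flat price per million tokens (a simplification)."""
    return tokens / 1_000_000 * price_per_million


def break_even_price(other_price_per_million: float, ratio: float) -> float:
    """Price per million tokens below which the higher-token model is still cheaper."""
    return other_price_per_million / ratio


ratio = token_ratio(FLASH_TOKENS, CODEX_TOKENS)

# Hypothetical prices, chosen only to illustrate the arithmetic.
codex_price = 10.0  # placeholder: dollars per million tokens
flash_price = 2.0   # placeholder: dollars per million tokens

print(f"Flash used {ratio:.1f}x the tokens")
print(f"Codex run cost: ${run_cost(CODEX_TOKENS, codex_price):.2f}")
print(f"Flash run cost: ${run_cost(FLASH_TOKENS, flash_price):.2f}")
print(f"Flash break-even price: ${break_even_price(codex_price, ratio):.2f} per million")
```

The point is that at a 6.6x token ratio, Flash only comes out cheaper on a task if its per-token price is less than about 15% of the other model's. A 5x cheaper per-token price, as in the placeholder numbers, would not be enough.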

If you want the team to look into a specific case, send the Request ID for both runs (Chat > three dots > Copy Request ID) plus a short description of the task. It’s hard to tell from the raw numbers whether this was a typical pattern or just a case where this task didn’t fit Flash well.