GPT-5-mini is more than usable now!

I’m obsessed with how gpt-5-mini balances speed and price against quality. It is my new go-to model and a great replacement for unlimited auto.

It struggled with tool calls and the like for a while after it became available, but today it works almost perfectly. No editing errors, no wrong MCP calls, nothing!

Even the todo list feature now works flawlessly (and what I really like is that it doesn’t use it until there’s a reason). Sonnet / Gemini / others often create todos even when I only ask for a discussion. GPT-5-mini, on the other hand, didn’t create anything, didn’t touch anything, just researched the codebase and gave me its analysis. And the moment I asked it to turn that into an actionable todo list, it did just that.

I then asked it to act on the last item first, and it did. After that we had a discussion, threw away a few items, and proceeded with the others with some changes. It didn’t do anything I didn’t ask for, used all tool calls flawlessly, and respected all the changes I asked for and all the rules I have in the project.

(note: I didn’t even wanna work on improving said entities pattern, but its suggestions were actually pretty decent, and after slightly tuning them I decided to implement them. IT WAS SUPPOSED TO BE A TEST OF HOW IT HANDLES TOOL CALLS NOW LMAO)

I think people don’t appreciate enough how obedient this model is with instructions.

Great post, thank you. What do you usually use gpt-5-mini for?

Mainly for brainstorming, researching, refining plans, etc. I rarely let LLMs generate meaningful code, only boilerplate and existing patterns. GPT-5 in general is really good at analysis and technical writing, and the mini version is quite decent at it as well, especially for the price, so it works wonders for me.

I wonder if we could even use it to replicate in Cursor some of the things Alibaba’s Qoder introduced, like automatically documenting a project for agents to use, or refining user requests before sending them to an expensive “main” model (maybe even something like “quests” in Qoder), since it’s really cheap to use…
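
To make the request-refining idea a bit more concrete, here is a rough sketch of the shape it could take. Everything in it is an assumption for illustration: `Model`, `REFINE_TEMPLATE` and `refine_then_run` are made-up names, and the two models are plain functions rather than any real Cursor or Qoder API.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's text.
Model = Callable[[str], str]

# Assumed prompt wording; tune it to your own project and rules.
REFINE_TEMPLATE = (
    "Rewrite the request below as a precise, self-contained task "
    "for a coding agent. Do not solve it.\n\nRequest: {request}"
)


def refine_then_run(request: str, cheap: Model, main: Model) -> str:
    """Let the cheap model sharpen the request, then hand it to the main model."""
    refined = cheap(REFINE_TEMPLATE.format(request=request)).strip()
    # Fall back to the original request if the cheap model returns nothing useful.
    return main(refined or request)
```

In practice `cheap` would wrap a call to something like gpt-5-mini and `main` the expensive model, so the expensive one only ever sees the cleaned-up request.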