Agent behaves very differently with o4-mini and o3

The Agent seems to behave much differently when preparing context for o4-mini and o3, as compared to Sonnet 3.7.
Sonnet 3.7 (thinking) does some quick visible thinking with the context I provided and gets to work immediately, sometimes with no initial tool calls.

With o4-mini and o3, the agent makes 10-60 tool calls just grepping and reading files, and it takes 3-20 minutes before I see any visible thinking. The model may then forget the context I provided at the beginning and launch into its own plan based on the context it assembled itself.

Am I missing something here? Is your agent code for Sonnet just that much better tuned? The new OpenAI models have great instincts for coding, but something seems wrong with how you have them wired into the agent.

The difference in agent behavior is the most confusing part. I feel like I must have configured something incorrectly.
Why does it take many MINUTES to scan a bunch of local files? Could my file indexing be broken? It shows 1000+ files indexed.

Same here. o4-mini is painfully slow, but it’s more accurate than Gemini 2.5 Pro in my case.

Apart from the questions I raised above, this feature request would help the situation:
