"420-bug": The agentic o3-mini often gets stuck just before actions it announces

It happens in Cursor 0.45.9, e.g.:

“Now I am going to change file XYZ” says o3-mini, but nothing happens.

or

I say “run tests” and o3-mini answers “I am going to run tests now”, but not even a CLI box appears asking me to confirm.

Switching back to claude-3.5-sonnet helps.

I guess this kind of bug in agents will be named the 420-bug :joy:

Hey, could you share a screenshot of the error?

I have the same thing. After a few messages, the model says what it did, but it doesn’t actually do it. It works fine with a fresh Composer session.

As an immediate reply, I can only share a non-English screenshot right now.
(I’ve become too lazy to write in English, now that LLMs handle other languages so well.)

I’m going to switch the language to English and will come back with an English screenshot.

Same experience with o3-mini. It says it will use the tools, but never does.

I often have to say “Proceed” to make it work.

(This is for applying code changes to the file.)

I’ve had the same.

Yeah, it’s terrible at knowing when to call its tools. You can usually edit your previous message, add something like “use your tools and edit the files” or “use your cli tool to run the command”, and re-run it.

I suggest editing the previous message, rather than sending a new one, because I get the sense that if you leave the “bad” responses in the history it can get stuck, unable to edit.

Even that sometimes doesn’t work… but I found another trick that seems a lot more stable. EVERY TIME you write a message, you can make the last line “use your tools and edit the files like you did before”, assuming you got it to work at least once initially.
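If you draft your prompts in a script before pasting them into the chat, the "reminder as the last line" trick can be automated. Here is a minimal sketch in Python. The names `with_tool_reminder` and `TOOL_REMINDER` are made up for illustration and are not part of Cursor; the reminder text is the one quoted above.

```python
# Hypothetical helper, not part of Cursor: makes the reminder line
# suggested in this thread the last line of a prompt.
TOOL_REMINDER = "use your tools and edit the files like you did before"


def with_tool_reminder(message: str, reminder: str = TOOL_REMINDER) -> str:
    """Return `message` with `reminder` as its last line (added only once)."""
    body = message.rstrip()
    if body.endswith(reminder):
        return body  # reminder already present; avoid duplicating it
    # Separate the reminder from the request with a blank line.
    return f"{body}\n\n{reminder}" if body else reminder


print(with_tool_reminder("Rename the helper in utils.py and update its callers."))
```

This only saves typing. Whether the model actually calls its tools afterwards still depends on the model and on the chat history.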

Same issue.

Image with some redactions below:

Same issue here. I imagine this is an issue with how tool invocation is set up. It would make sense for Cursor to fix this ASAP, since I, like probably most of us, am having to make 2-3 calls before the change is actually applied, which is likely costly on the backend.

Yeah, o3-mini can be a bit flaky with tool calls sometimes. The best fix is to start a fresh Composer window when this happens. You can also try explicitly telling it to “use tools” or “edit the file” in your prompt, but starting fresh usually works better.

If you need more reliable tool usage, Claude 3 Sonnet or GPT-4 are probably better options; they’re just a bit more expensive.