It’s funny how you get used to the quirks of a model. If I switch, I have to get to know a whole new pair-programmer style. It’s a very organic and weird thing.
Where did you get that chart from? I don’t think I’m navigating SWE-bench correctly.
Sonnet seems to be getting slower recently. I don’t know if that’s because the compute is being used to train new models. Also, GPT’s context handling in Cursor doesn’t perform as well as it does on the web.
I’m surprised Anthropic didn’t release a new model; they were probably getting complacent.
proof?
I wonder why it’s so difficult to get an answer from devs though
I see it’s true, there are crybabies everywhere.
+1 to the thanks to the devs for the fast and free implementation! But I would also like to know which version of o3-mini this is, pretty please.
Does the o3-mini model work well for you in Agent mode? Mine is terrible… I don’t understand what the problem is… Or do you mainly use it in Chat mode?
There are many issues with it currently, but the devs are working on it. Is o3-mini not agentic?
I’m noticing that it won’t give a summary of what it did in Agent mode. Very fast, though.
I put up a hot take on my experience with it today on YouTube: Is 03 Mini The Best For AI Dev - Cursor AI Feb 25 - YouTube
TL;DR:
- I’m sticking with Sonnet 3.5
- o3 works with Agent, unlike DeepSeek, which is nice
- It thinks but does not implement; I have to push it to implement
- It seemed to have issues pulling context and having project awareness in Agent mode, unlike Claude
- The reason I’m not using it more has more to do with its current implementation in Cursor than with testing it independently at the native level. I’m not expecting the Cursor team to have the bugs ironed out at this point; it literally just came out.
- When I switch models, even with new Composers, it keeps going back to o3. Is this a bug?
Very good
It happens to me too. Both in Agent and in Chat, I sometimes have to “insist” several times before it applies the code, or at least shows what should be changed. At most I see a diff in the Composer, but if you try to apply it, it targets the wrong file, so it fails to apply.
For the moment it ends up working more as a chat to talk things through than as a way to apply code, but without the additional reasoning capacity that ChatGPT offers on the web, which is working better for me at the moment.
It’s still very early; they are surely improving the implementation in Cursor.
Edit:
This often happens to me too when using o3-mini, randomly
The “personalities” between the two models are quite distinct
My impression is that Claude Sonnet is chummy, chipper, and infinitely more helpful as an assistant. Sometimes TOO helpful, to a fault, for sure; but it’s a reliable go-getter and problem-solver.
O3-mini is kind of the grouchy know-it-all developer who likes to drop knowledge-bombs, but does the bare minimum amount of work. It doesn’t feel like it actually goes out of its way to be helpful…
o3-mini can write quite nicely. I have a few short notes about role-play in my global pre-prompt and I kind of like it:
BTW, I also tested the same prompt on ChatGPT, which on the free tier should be o3-mini at medium reasoning effort, and it also didn’t manage to get the number of letters right.
I’ve upgraded to the latest version, 0.45.8, but unfortunately it still happens, even after deleting and reindexing the codebase.
You have to beg o3-mini to apply the changes, or even just to show some code as a diff. It often does so only after several attempts of answering “yes, apply it”, “yes, show me the code”, “yes, please do it!”
It would be awesome to be able to set it to o3-mini high with your own LLM key, or even to set arbitrary completion kwargs when using your own LLM keys.
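For context on what that would involve: when you call o3-mini directly with your own OpenAI key, the reasoning level is just one extra field (`reasoning_effort`) in the Chat Completions request body. A minimal sketch of the payload, assuming the standard REST endpoint (the prompt is a placeholder):

```python
import json

# Sketch: the JSON body you would POST to
# https://api.openai.com/v1/chat/completions with your own key.
# "reasoning_effort" ("low" | "medium" | "high") is the knob that
# controls how long o3-mini thinks before answering.

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a Chat Completions request pinned to o3-mini at a given effort."""
    return {
        "model": "o3-mini",
        "reasoning_effort": effort,  # accepted by reasoning models only
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Refactor this function to be tail-recursive.")
body = json.dumps(payload)
# Send `body` with headers:
#   Authorization: Bearer $OPENAI_API_KEY
#   Content-Type: application/json
```

Arbitrary completion kwargs would just mean letting users merge extra keys into that dict before it’s sent.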
Confirming that o3-mini has been updated to use the high reasoning level in Cursor.
Please let us know what you think
Thank you!
you guys are awesome! <3
Thank you so much!