o3-mini is LIVE! Which version are we getting?

First off, shoutout to the Cursor team for pushing this out so quickly!

The team stated that their devs still prefer Sonnet for most tasks (which surprised them). (source: x.com)

According to the OpenAI post, o3-mini only outperforms o1 when it is pushed to ‘high’. (source: https://openai.com/index/openai-o3-mini/)

I’m curious which version Cursor defaults to, or whether it dynamically picks the version with the level of reasoning each request requires.

Thanks!

20 Likes

Can we have an option to use the high version? If not, please give guidance on how we can set this up manually.

10 Likes

I don’t know exactly which version they use but it already works amazingly well. I’ve been using deepseek-r1 for 3 days and was starting to really master it and make some pretty nice things. I just tried o3 for 30 minutes and it’s even more impressive. Coding has become so enjoyable for me :slight_smile:

5 Likes

It’s live in GitHub Copilot as well.

4 Likes

glorious capitalism :star_struck:

Can’t wait to see how well it works.

1 Like

I really hope DeepSeek V3, DeepSeek R1 and o3-mini will take some pressure off Claude 3.5 Sonnet.

7 Likes

Yes, we definitely need full control over this, and clarification about it.

What’s the pricing model for o3-mini? From what I see it is cheaper than Sonnet (Sonnet is $3 / MTok input and $15 / MTok output, whereas o3-mini is $1.10 / 1M input tokens and $4.40 / 1M output tokens, and even less with the Batch API). I think it would be unfair to count this as a fast request.

14 Likes

Actually, there are additional reasoning tokens in every response, so while the price per token may be lower, the cost per request is likely similar.
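To put rough numbers on that, here is a minimal sketch using the per-token prices quoted above. The request sizes (2,000 input tokens, 500 visible output tokens) and the reasoning-token counts are made up for illustration, and the sketch assumes reasoning tokens are billed at the output rate. The helper names `request_cost` and `break_even_reasoning_tokens` are hypothetical, not part of any API.

```python
# Prices in USD per 1M tokens, as quoted in this thread.
PRICES = {
    "sonnet": {"input": 3.00, "output": 15.00},
    "o3-mini": {"input": 1.10, "output": 4.40},
}


def request_cost(model, input_tokens, output_tokens, reasoning_tokens=0):
    """Cost in USD of one request.

    Assumption: reasoning tokens are billed at the output-token rate.
    """
    price = PRICES[model]
    billed_output = output_tokens + reasoning_tokens
    total = input_tokens * price["input"] + billed_output * price["output"]
    return total / 1_000_000


def break_even_reasoning_tokens(input_tokens, output_tokens):
    """Reasoning tokens at which an o3-mini request costs as much as Sonnet."""
    sonnet = request_cost("sonnet", input_tokens, output_tokens)
    o3_base = request_cost("o3-mini", input_tokens, output_tokens)
    return (sonnet - o3_base) * 1_000_000 / PRICES["o3-mini"]["output"]


# Hypothetical request size, chosen only for illustration.
INPUT_TOKENS, OUTPUT_TOKENS = 2_000, 500

print(f"sonnet: ${request_cost('sonnet', INPUT_TOKENS, OUTPUT_TOKENS):.4f}")
for reasoning in (0, 1_000, 5_000):
    cost = request_cost("o3-mini", INPUT_TOKENS, OUTPUT_TOKENS, reasoning)
    print(f"o3-mini, {reasoning} reasoning tokens: ${cost:.4f}")

break_even = break_even_reasoning_tokens(INPUT_TOKENS, OUTPUT_TOKENS)
print(f"break-even: about {break_even:.0f} reasoning tokens")
```

With these made-up sizes, the o3-mini request stays cheaper until it spends roughly 2,000 reasoning tokens, after which it costs more than the Sonnet request. Whether real requests land above or below that point depends on the reasoning level and the task, which this thread does not tell us.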

It doesn’t work very well yet, or there might be a problem with the version Cursor offers us. The answers are too short, it is difficult to establish context, and it misunderstands sentences. Sonnet is still better…

It doesn’t feel like o3-mini, to be honest. I tried it on ChatGPT and it behaved differently.

In Cursor, there seems to be no reasoning at all; responses are very immediate. Isn’t it all about reasoning for this model?!

7 Likes

The Cursor team has not been clear about which o3-mini (low, medium, high)…

7 Likes

Hello community,

The conversation around the differences between the low, medium, and high reasoning models has been very enlightening. I’d like to know which of these models is currently deployed in Cursor. Additionally, are there any plans to eventually offer all three options, perhaps based on the complexity of the tasks? This flexibility would be a great asset.

Thanks in advance for any clarification!

2 Likes

Cursor is being cheap again: it’s the o3-mini low version, not o3-mini high. It’s useless; don’t use it.

They did the same with R1, serving a cheap version.

7 Likes

Yes, I have the same impression. The question remains whether they will offer o3-mini high as a choice. I still can’t understand why Sonnet is soooo dumb in Cursor, making trivial mistakes. When I switch to Aider or Cline, Sonnet is much smarter.
Currently I can only use DeepSeek in Cursor to get good enough solutions.

1 Like

We urgently need access to o3-mini-high; it’s by far the best coding model out there. Why does the Cursor team get to decide for us which models we should be using? Why is it better to use chatgpt.com than Cursor? Why is it better to use Aider than Cursor? Seriously? I’m so mad.

11 Likes

My assumption is they’re getting subsidized by Anthropic and OpenAI, which is why we get Claude and 4o included.

Whatever version of R1 they’re using is terrible. We really need o3 high ASAP.

3 Likes

It looks like o3-mini outperforms o1 on software engineering benchmarks such as SWE-bench, but mostly in the higher modes.

Also, I just tried to compare it to the other IDE involving watersports, and the forum blocked me from using the name. Is there some kind of “He who shall not be named” rule in this forum that I need to be aware of?

8 Likes

“Watersports”, haha. Same happened to me, way stoopid.

2 Likes

I can’t believe I’d see the day when Sonnet is worse than other models. For the longest time I’ve been dependent on Claude Sonnet. Oh my god, it feels like ChatGPT 3.5 Turbo now.

2 Likes