The performance was unmatched in my entire experience with Vibe Coding.
It refactored a complicated piece of code, modularized it, raised the abstraction level, and added 44 tests—all of which passed.
It took about 3 hours of work. I didn’t check the app in the browser even once, I promise.
Today, when I opened the app, it was working flawlessly.
The only downside is the cost: 178 o3 requests × 30 cents per request = $53.40.
If I wanted to use this every day, I’d end up with a bill of almost $2,000 a month, which means I’d have to raise my fees. It shows that quality and speed come at a price.
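As a quick sanity check on that math (the per-request price and request count are from the comment; the one-session-per-day monthly extrapolation is my own assumption):

```python
# Back-of-envelope cost check for the session described above.
REQUESTS = 178
PRICE_PER_REQUEST = 0.30  # dollars per o3 request

session_cost = REQUESTS * PRICE_PER_REQUEST
monthly_cost = session_cost * 30  # assumption: one such session per day

print(f"Session: ${session_cost:.2f}")   # Session: $53.40
print(f"Monthly: ${monthly_cost:.2f}")   # Monthly: $1602.00
```

At exactly one session like this per day it comes out around $1,600; a few heavier days push it toward the $2,000 figure.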
Final note: I supervised its every move, every single step of the way, and didn’t let it just code away. I forced it to adapt to my pace and follow my requirements. It wasn’t always willing to adhere to the rules in .cursor/rules.
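For anyone unfamiliar: Cursor reads rules from files under `.cursor/rules/`. A minimal sketch of the kind of pacing rule being described (the contents here are invented for illustration, not the commenter’s actual rules):

```markdown
---
description: Pace and supervision rules for agent sessions
alwaysApply: true
---

- Propose a plan and wait for my approval before editing any file.
- Touch one module per step; no sweeping multi-file refactors.
- Run the test suite after every change and report the results.
```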
My experience hasn’t been so great, frankly! It’s hit or miss.
It did build and refactor a whole new feature, but it also struggles with things that Sonnet 3.7 does better.
How did you make it so autonomous? When I use Gemini 2.5 Pro, it makes a lot of assumptions, sometimes it just tells me what’s going on without carrying out the task, it can get sidetracked, and it can also ignore existing files and recreate solutions instead of looking them up… maybe I’m prompting it wrong.
Thanks for that video tip. The prompt injection problem is not solved; curious what the latest is. A bit bummed that they switched off the comments on that video on YouTube.
Yeah, well, the oldest and most persistent rule still applies:
Price, Speed, Quality
But you can only choose two
Joking aside, my experience with the new OpenAI models has been exactly the opposite.
I couldn’t get it to clean up the comments in a single file without intervening like 10 times. No exaggeration.
However, Gemini 2.5 Flash is crushing it.
It did a similarly complex task to the one you mentioned, in one sitting.
For me, o3 is extremely slow (it does a lot of code reading and tool calling all around before it starts coding, and that takes a lot of time), but the final outputs are pretty good. The price is not nice either: the other models are not so much worse for me that it’s worth paying that and waiting on top of it, I guess… At least for now.
Yes, agreed: Gemini very often does not do the task, it just tells me what should be done. I have to ask it to actually do the task, but the outputs are very good in some areas.
I was prompting the tests, over 50 of them for one script, and I realized that automated tests are much more trustworthy than manual testing. I mean, when vibe coding, one needs to adopt TDD.
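A minimal sketch of that test-first workflow. The `slugify` function and its behavior are invented for illustration, not taken from the commenter’s script:

```python
# TDD sketch: the tests are written first and act as the contract;
# in a vibe-coding session you would commit them and ask the model
# to write code until they pass.

def slugify(title: str) -> str:
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())

def test_slugify_basic():
    assert slugify("Hello World") == "hello-world"

def test_slugify_extra_spaces():
    assert slugify("  Vibe   Coding  ") == "vibe-coding"

if __name__ == "__main__":
    test_slugify_basic()
    test_slugify_extra_spaces()
    print("all tests passed")
```

The point of test-first here is that the assertions survive the model’s rewrites: you rerun them after every agent step instead of eyeballing the app in the browser.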
Hmm, if you had it on Auto mode and it cost $53, it would still be quite pricey.
When I’m doing similarly complex stuff with Claude 3.5 in auto mode, with good planning steps, implementation steps, clear rules, etc., it’s a tenth of the price, with the occasional step-in to give more info or adjust direction.
Assuming the old code is not removed, any reasonably capable model could create a new module and transfer features over. The complexity of doing that depends on the programming language, the framework, and naturally the actual complexity of the original code.
It does somewhat feel like there’s no need to use o3 then. Or could you explain what, for example, Claude 3.5 or similar-level models can’t do that o3 achieved, and why? Sincerely curious about the difference in your usage.