I'm finding that when Cursor gets stuck on something using claude-3.5-sonnet or gpt-4o, I can usually switch to o1-preview and it will figure out the issue.
I use this technique in those situations where the model starts suggesting code edits that are already in the file, or keeps switching back and forth between the same two failing edits while trying to debug or build something. In those cases, I find that switching to o1-preview and making the request again will usually produce an output that fixes the issue.
I'm finding o1-preview costs me about $0.40 per message, so I use it sparingly. But I think it's worth every penny to get past those times when the other models get stuck or hallucinate.
If you are on a high enough OpenAI API tier, you can use the "Toggle OpenAI key" feature and supply your own key for o1-preview requests. I use it this way when it's time to consult o1 on problems that Sonnet can't solve. Depending on your particular context size, the flat $0.40 might be cheaper, but I use o1 sparingly with my own key, and a single request can easily cost less than $0.10.
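To see why bring-your-own-key can come in under $0.10, here's a rough cost sketch. The per-token prices below are my assumptions based on OpenAI's published o1-preview rates at the time ($15 per 1M input tokens, $60 per 1M output tokens); check the current pricing page before relying on them, and the token counts are just an illustrative small request.

```python
# Rough per-request cost estimate for o1-preview via your own OpenAI API key.
# Prices are ASSUMED from OpenAI's published o1-preview rates at the time:
#   $15 / 1M input (prompt) tokens, $60 / 1M output (completion) tokens.
INPUT_PRICE_PER_M = 15.00   # USD per 1M input tokens (assumed)
OUTPUT_PRICE_PER_M = 60.00  # USD per 1M output tokens (assumed)

def o1_preview_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request at the assumed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A modest request: ~3k tokens of context, ~500 tokens of output
print(o1_preview_cost(3_000, 500))  # 0.075 -- under the $0.10 mark
```

Of course, a request that stuffs a large chunk of your codebase into the context will scale the input side accordingly, which is why the flat $0.40 can still win for very large prompts.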
EDIT: perhaps one could use OpenRouter if direct OpenAI use isn't an option.
Same. I don’t even bother using anything except o1-preview anymore, aside from minor edits I know Claude can handle. It costs me around $100 a month, but it saves a lot of time.
What I’m wondering is whether o1-mini could be better in some instances. So far, o1-preview has worked best in my testing (though I haven’t tested it extensively against o1-mini, only against Sonnet).
I’m confused by the benchmarks that say Sonnet is still better for coding (e.g. LiveBench). In most of my use cases (large codebases, large data analyses), o1-preview has been superior.
Perhaps the only time Sonnet was preferable involved a recent API: Sonnet’s training included newer information about that API (thanks to a more recent cutoff date) that o1-preview lacked.
Update: this holds true for me even after the new Sonnet release.