New benchmark shows gpt-4-0125-preview is lazier than the previous version

I wanted to share with you folks, mainly with the Cursor team, this new benchmark that just came out: Lazy coding benchmark for gpt-4-0125-preview | aider

Basically, the benchmark indicates that the new model is actually lazier than the previous one.

Also, if you check the OpenAI Discord, X, Reddit, or any other social media, you will see similar complaints: the Turbo models are lazy, hallucinate a lot, give wrong answers, and don't follow instructions.

At my company, I was considering getting a Cursor license for the whole team, but I gave up on the idea because we are being forced to use the new Turbo models. It's much easier to just get an OpenAI API key and use the good old model directly, though honestly I'd rather give my money to Cursor than to OpenAI.
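For reference, here is roughly what that direct-API fallback looks like with the OpenAI Python SDK. Note that gpt-4-0613 is my assumption for "the good old model" (the last pre-Turbo snapshot I'm aware of); swap in whichever pinned snapshot you prefer:

```python
# Minimal sketch: calling a pinned pre-Turbo snapshot directly through the
# OpenAI Python SDK. gpt-4-0613 is an assumption for "the good old model".
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-0613",  # pinned snapshot, not the floating gpt-4-turbo alias
    messages=[
        {"role": "system", "content": "You are a coding assistant. Write complete code, no placeholders."},
        {"role": "user", "content": "Implement a function that deduplicates a list while preserving order."},
    ],
)
print(response.choices[0].message.content)
```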

Someone here told me that fast GPT-4 requests still use the old model, but I don't think that's true. At least on my account it was using the Turbo model while I was a Pro subscriber, since the model's knowledge was updated through April 2023.
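If you want to sanity-check which snapshot you're actually talking to, here's a rough sketch. For direct API calls, the response's model field reports the resolved snapshot; inside a product like Cursor you can only ask the model about its cutoff, which is a weak heuristic since models often misreport their own training data:

```python
# Rough sanity check for which snapshot is serving your requests.
# response.model is authoritative for direct API calls only, not for
# requests proxied through a product like Cursor; self-reported cutoff
# dates are just a heuristic.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",  # floating alias; the response reveals the resolved snapshot
    messages=[{"role": "user", "content": "What is your knowledge cutoff date?"}],
)
print("resolved model:", response.model)
print("self-reported cutoff:", response.choices[0].message.content)
```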

"Fast GPT-4" just means "priority GPT-4 in our queue", which is a misleading name.

Edit: It's getting confusing; there is also the "cursor-fast" model, a fine-tuned GPT-4, IIRC…

Is there any benchmark on the implementation side, i.e., comparing different code-copilot platforms running the same underlying model?
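A toy version wouldn't be hard to build: run identical prompts through each platform and count placeholder comments in the generated code, roughly in the spirit of aider's laziness metric. The patterns below are illustrative assumptions, not aider's actual list:

```python
# Toy sketch: flag "lazy" placeholder comments where a model skipped real
# work, so the same prompt set can be compared across platforms.
# The patterns are illustrative assumptions, not aider's actual list.
import re

LAZY_PATTERNS = [
    r"#\s*\.\.\.",                              # "# ..."
    r"#\s*(rest|remainder) of (the )?code",     # "# rest of the code"
    r"#\s*implement(ation)?( goes| left)? here",
    r"#\s*TODO",
    r"//\s*\.\.\.",                             # C-style equivalent
]

def count_lazy_placeholders(code: str) -> int:
    """Count placeholder comments suggesting the model elided real work."""
    return sum(
        len(re.findall(pattern, code, flags=re.IGNORECASE))
        for pattern in LAZY_PATTERNS
    )

# Usage: collect completions for the same prompt from each platform, then compare.
samples = {
    "platform_a": "def dedupe(xs):\n    # ... rest of the code ...\n",
    "platform_b": "def dedupe(xs):\n    seen = set()\n    return [x for x in xs if not (x in seen or seen.add(x))]\n",
}
for name, code in samples.items():
    print(name, count_lazy_placeholders(code))  # platform_a: 1, platform_b: 0
```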