GPT-5-Codex is hesitant and stops before the task you give it is truly done

My first impression of GPT-5 Codex in Cursor isn’t great. I kept hearing all this hype about how Codex runs nonstop until the job’s really done. But what I’m seeing in Cursor feels like the total opposite — it keeps pausing, waiting for me to tell it what to do next. It even ignores the Auto-Fix Lints rule. And all I was doing was a simple SwiftUI refactor.

Anyone else running into the same thing?
(Request ID: 6f71f7d0-18ef-47e2-b8b8-21011a3aa28b)


Do you have any info on its thinking effort? I can't seem to find the option to choose it.

It completely explodes if you accept changes while it's running. It starts re-doing and undoing the code that you just accepted. This isn't something gpt-5 non-codex ever had an issue with.


Not just that. Tonight I've been experiencing a lot of hostile tricks from Cursor with GPT-5: playing dumb, excessive syntax errors, misleading debugging judgments, suddenly changing unrelated code and then lying about it, and so on. It's burning a lot of my time and tokens.

I am feeling the exact same. GPT-5-high went 0-100, but Codex stopped three times.

It's been a very difficult night. Cursor has been trying a lot of tricks against the user, getting less done while burning more tokens and time for nothing.

Thinking effort is dynamic; that's supposed to be the value of using this model. You can't select it.
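For anyone calling the model directly instead of through Cursor: GPT-5-family reasoning models expose a `reasoning.effort` field in the OpenAI Responses API. Whether gpt-5-codex honors an explicit value, or always chooses effort dynamically as described above, is an assumption worth checking against OpenAI's docs. A minimal sketch of what the request body would look like (no network call, just the payload shape):

```python
# Sketch of a Responses API request payload with an explicit reasoning
# effort. Whether gpt-5-codex accepts this field (vs. always picking
# effort dynamically) is an assumption here -- verify in OpenAI's docs.
payload = {
    "model": "gpt-5-codex",
    "reasoning": {"effort": "high"},  # "low" | "medium" | "high"
    "input": "Refactor this SwiftUI view to extract a subview.",
}

# In Cursor there is no UI for this, which matches the "you can't
# select it" behavior people are reporting in this thread.
print(payload["reasoning"]["effort"])
```

If the field is ignored for this model, the dynamic behavior would explain why no effort selector appears in Cursor's model dropdown.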

It's been a few days, and today it's just super obvious: all the behaviors have changed, making it very tricky to use. Debugging accuracy is lower, it doesn't continue fixing the problem, and it gives lazy, useless solutions without studying the logs carefully, or just doesn't surface the right log to determine the problem. What's the point of using it if they're chasing profit like that?

As OpenAI states on their cookbook page for the Codex model:

This model is not a drop-in replacement for GPT-5, as it requires significantly different prompting.

So I guess we have to wait for the Cursor devs to optimize their developer prompt for it.


17 minutes, 1.8M tokens, and $1.18 :smiling_face_with_sunglasses:

Task: “Through Test-Driven Development, check and refine according to the documentation.md”
By the way, Codex lied: it actually wrote a new script, a test for it, and made a few minor edits in other places.


It added the code, erased the code, changed nothing twice, and never ran the build-and-test script.
Another time, Codex couldn't fix a PowerShell script call from another PS script, and started testing with strange commands that PowerShell doesn't even recognize. Gemini 2.5 Pro solved the problem in two script edits. Strange model.

same experience on my end haha

Yup, this model behaves very differently from vanilla gpt-5. It needs serious tuning on the Cursor backend!



gpt-5 did it inaccurately, but at least it did the full job.

This was my experience with GPT-5 in general. I was CONSTANTLY telling it: stop telling me what you're going to do, and just do it. Then it would be like, "okay, I will do this next…" YOU BISH! lol. So infuriating. I also noticed that if I would undo/revert to a certain spot in the chat, it would undo ALL of the chat. And that, of course, happened when Cursor removed the reapply feature… which is still missing…

Same exact issue here. gpt-5 never did this; gpt-5-codex stops after writing a single paragraph, without even editing code! cc: @mntruell

Man, they are twisting the AI into not getting things done.


There are a lot of tricks this Cursor meta-prompted LLM is pulling against my goals. It's like playing FIFA: you need to dribble around the goalkeeper in Division 1 online mode.

All the GPT-5s do this to me. As soon as you fill the context and it summarizes once or twice, it starts doing this. Making a new chat fixes it, but that's unacceptable, since the other models can go on for hours, even summarizing 10-20 times, no problem. I had to stop using GPT-5. It has done this since the beginning on Cursor; I reported it the first week it was up, and they still haven't fixed it, so I don't know when I can use it again. It's a really good model and understands some of the complex things I'm working on better than other models do.

Worst of all, the Cursor meta-prompted LLM plays a syntax-error loop with you: double quote, single quote. Even after I googled the guide for it, it refuses to follow or remember it, and just keeps making the wrong choice, seemingly intentionally.