Despite strict instructions to pause after each phase of development, in both the rules and the specific implementation plan, Codex 5.3 ignores them and attempts to implement the whole plan every time. 5.2 was arguably the best at adhering to rules, but this behaviour makes working with 5.3 extremely difficult.
The model reports that it has an instruction to ‘complete all tasks’, which overrides everything I tell it to do. Is this in the harness for the model? Is it possible to fix this?
Steps to Reproduce
Create a plan with phases in it and include an instruction to check in with the user after each phase.
Hey, thanks for the report. This is a known issue. Some models in Agent mode try to finish everything on their own, even when the rules clearly say they should stop.
Here are a few things you can try:
Use Plan mode (Shift+Tab in the input box). It creates a plan first, you review it, and only then does it start implementing. This gives you natural checkpoints between steps.
Make the rules as explicit as possible. Instead of “pause after each step”, try something like:
CRITICAL: You MUST fully stop after completing each step.
DO NOT move on to the next step. DO NOT implement anything beyond the current step.
Wait for the user's explicit approval before continuing.
Make the prompt stricter. Instead of “implement the plan”, say “IMPLEMENT ONLY STEP 1. Stop after STEP 1. DO NOT touch STEP 2 or anything after that.”
Try a different model. Can you test with Claude Sonnet 4.5 or another model to confirm this is specific to Codex 5.3?
About the “complete all tasks” instruction you said the model mentions: could you send a screenshot? That would be really helpful for the team to investigate.
Also, if you can, please send:
The exact contents of your rules file
A request ID from one of those sessions (Chat menu > Copy Request ID, with privacy mode turned off)
That will help us dig deeper. The team is aware of issues with rule-following, and your report helps raise visibility of this issue for Codex 5.3 specifically.
one thing that helped in my testing: how you word the rule matters more than you’d expect. “do NOT continue past phase 1” gets ignored way more often than “ONLY proceed to phase 2 when the user explicitly says continue.” positive framing with a specific trigger word tends to stick better.
also worth checking: if you’re using .mdc files, make sure alwaysApply: true is in the frontmatter. without it the rule might not even load depending on the context.
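for reference, here’s a minimal sketch of what that frontmatter looks like in an `.mdc` rule file, combining it with the positive-framing wording from above. the `description` text and the exact rule wording are just placeholders, adjust them to your setup:

```markdown
---
description: Pause for user approval between plan phases
alwaysApply: true
---

CRITICAL: You MUST fully stop after completing each phase.
ONLY proceed to the next phase when the user explicitly says "continue".
Wait for the user's explicit approval before continuing.
```

with `alwaysApply: true` the rule is attached to every request instead of being loaded conditionally, which removes one variable when you’re debugging compliance.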
are your rules in .cursorrules or .mdc files? and are you running this in agent mode or plan mode? the model’s compliance can vary a lot depending on the setup.
Thank you both for your responses. I can confirm I did have some pretty explicit rules set, and they were correctly set to always apply. The great news is that this now seems to be working as intended, so whatever update has happened appears to have fixed it. Happy to close the issue.