I don’t have the energy left to describe exact reproduction steps, so I’ll just describe the general set of frustrations. I’ve already burned a lot of nerves over the last 2 days.
I have always used Auto mode. If I wanted to use only the Claude model, I would use Claude Code (perhaps now I will). The concept of Auto mode seemed sound to me: I did not have to think about which model would cope better with a particular task. But now it has become simply unbearable to use.
Endless “edit attempted”
The agent has gotten dumb. If I use planning mode, it creates a plan, but afterwards it can’t edit the plan at my request; all I get is an “edit attempted”.
It stops early. Previously, I could create a plan for a batch of tasks and fixes, launch it, and the agent would complete all of it. Now it completes one item and stops.
It has started ignoring my rules and requirements, even when I write them directly in the chat along with the task.
Overall, my trust in it is gone, and I can no longer do my job in peace.
I tried clearing the cache and downgrading the version, thinking the problem might be local, but none of it made any difference.
Operating System
Windows 10/11
Hey, we’re aware of these issues. All three things you’re describing are being tracked: edit attempted errors in plan mode, the agent stopping early, and rules being ignored.
A couple of things that might help in the meantime:
Instead of Auto, try selecting a specific model, like Claude Sonnet or GPT-5. Auto routing can sometimes make suboptimal choices for complex multi-step plans.
For the rules issue, double-check that your rules have alwaysApply: true if you need them in every conversation.
If you want us to investigate a specific case more closely, grab a Request ID. Click the three dots at the top of a chat, then Copy Request ID, and share it here. That would help the team pinpoint what’s going wrong on our end.
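For the rules check mentioned above, a quick script can confirm which rule files actually set the flag. This is only a minimal sketch under assumptions: it assumes rules are `.mdc` files under `.cursor/rules` with a `---`-delimited frontmatter block, and the function names `always_apply_enabled` and `rules_missing_always_apply` are hypothetical helpers, not part of any tool.

```python
from pathlib import Path


def always_apply_enabled(rule_text: str) -> bool:
    """Return True if the frontmatter block sets alwaysApply to true."""
    lines = rule_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no frontmatter at all
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of the frontmatter block
        key, sep, value = stripped.partition(":")
        if sep and key.strip() == "alwaysApply":
            return value.strip().lower() == "true"
    return False


def rules_missing_always_apply(rules_dir: str = ".cursor/rules") -> list[str]:
    """List rule files (assumed *.mdc) that do not set alwaysApply: true."""
    root = Path(rules_dir)
    if not root.is_dir():
        return []  # assumed directory layout not present
    return sorted(
        str(path)
        for path in root.rglob("*.mdc")
        if not always_apply_enabled(path.read_text(encoding="utf-8"))
    )


if __name__ == "__main__":
    for path in rules_missing_always_apply():
        print(f"alwaysApply is not true in: {path}")
```

Run it from the project root. If it prints nothing, every rule file it found has the flag set, and the problem lies elsewhere.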
No, unfortunately your recommendations didn’t help. As for “alwaysApply: true”, it is always set.
It feels like the agent has been replaced.
It’s like I have to find approaches and wording for my tasks all over again, and even when I work on the wording to make it as clear as possible, there is still a share of cases where the agent does something different.
For example, I often used to ask for a review of the code for some problem, with suggested fixes, and asked for the results to be put in the plan, and it worked. Today, when I ask for the same thing, the agent creates a plan whose tasks are to perform the review; that is, it plans an investigation of the issue. An extra step has appeared, and no matter how specific I try to be, I can no longer control it.
Previously
“do a review and suggest changes, put them in a plan” → a plan with a review and ways to fix → execution
Now
“do a review and suggest changes, put them in the plan” → “plan for review” → “execution of review” → then a plan for the fix → the fix
And all this happens even with planning mode enabled.
And just like that, it gets hung up on the little things.
Switching to GPT did not change the situation much.
Hey, I see the model selection tip didn’t help, that’s a bummer.
What you described in your last message, the agent breaking a simple task into extra intermediate steps instead of doing it directly, is a useful detail. If you can catch it happening again and send the Request ID, it’ll really help us figure out what’s going on server-side. Click the three dots in the top right of the chat, then Copy Request ID.
The team is aware of all three issues: edit attempted, early stopping, and ignoring rules. Your report helps us prioritize it, but I can’t promise a specific fix timeline yet.
As a temporary workaround for complex tasks, try splitting the plan into smaller chunks manually and running them one by one instead of as a batch. It’s not ideal, but it can reduce cases where the agent gets lost.
Let me know if you manage to share the Request ID.