I don’t have the strength anymore to describe exact reproduction steps; I’ll just describe the general sets of frustrations. I’ve already spent a lot of nerves over the last two days.
I have always used Auto mode; if I wanted to use only the Claude model, I would use Claude Code (perhaps now I will). The concept of Auto mode seemed successful to me: I didn’t have to think about which model would handle a particular task better. But now it has become simply unbearable to use.
Endless “edit attempted” errors
The agent has become unreliable. If I use plan mode, it creates a plan, but after that it can’t edit the plan at my request; every attempt ends in an “edit attempted” error.
It stops early. Previously, I could create a plan for a pool of tasks and fixes, launch the plan, and it would complete it fully. Now it completes one item and stops.
It started ignoring my rules and requirements, even when I write them directly in the dialog alongside the task.
In general, my trust in it is gone, and I can no longer do my work in peace.
I tried clearing the cache and downgrading the version, thinking the problem might be local on my end, but none of it made any difference.
Operating System
Windows 10/11
Hey, we’re aware of these issues. All three things you’re describing are being tracked: “edit attempted” errors in plan mode, the agent stopping early, and rules being ignored.
A couple of things that might help in the meantime:
Instead of Auto, try selecting a specific model, like Claude Sonnet or GPT-5. Auto routing can sometimes make suboptimal choices for complex multi-step plans.
For the rules issue, double-check that your rules have `alwaysApply: true` set if you need them applied in every conversation.
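For reference, a project rule in Cursor lives under `.cursor/rules/` as an `.mdc` file with YAML frontmatter. A minimal sketch is below; the description and rule text are placeholders, only the `alwaysApply: true` flag is the part that matters here:

```
---
description: Conventions that should apply to every chat
globs:
alwaysApply: true
---

- Follow the task exactly as written; do not add extra steps.
- When asked to put review findings into a plan, include the proposed fixes in the plan itself.
```

Rules without `alwaysApply: true` are only attached when their `description` or `globs` match, which is a common reason they seem to be ignored.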
If you want us to investigate a specific case more closely, grab a Request ID. Click the three dots at the top of a chat, then Copy Request ID, and share it here. That would help the team pinpoint what’s going wrong on our end.
No, unfortunately your recommendations didn’t help. As for “alwaysApply: true”, it is always set.
It feels like the agent has been replaced.
It’s like I have to search for approaches and phrasings for setting tasks all over again, and even when I work on the prompt, making it as clear as possible, there is still some percentage of cases where the agent does something different.
For example, I previously often asked it to review the code for some problem, suggest corrections, and put the findings into a plan, and that worked. Today, when I ask for the same thing, the agent instead creates a plan whose items are the review tasks themselves, i.e. it schedules an investigation of the issue. An extra intermediate step has appeared, and no matter how specific I try to be, I can no longer control it.
Previously
“do a review and suggest changes, put them in a plan” → a plan with the review and the fixes → execution
Now
“make a review and suggest changes, put them in the plan” → “plan for review” → “execution of review” → and then the plan for correction → correction
and all this even with planning mode enabled.
And on top of that, it gets hung up on the little things.
GPT did not change the situation much.
Exactly. Until recently, you could compensate for the degraded model by using planning mode, so I used planning mode for everything. But even that workaround is now gone, and with it, my subscription. Auto mode is now worse than a brutally quantized open-source model.
Hey, I see the model selection tip didn’t help, that’s a bummer.
What you described in your last message, the agent breaking a simple task into extra intermediate steps instead of doing it directly, is a useful detail. If you can catch it happening again and send the Request ID, it’ll really help us figure out what’s going on server-side. Click the three dots in the top right of the chat, then Copy Request ID.
The team is aware of all three issues: “edit attempted” errors, early stopping, and ignored rules. Your report helps us prioritize them, but I can’t promise a specific fix timeline yet.
As a temporary workaround for complex tasks, try splitting the plan into smaller chunks manually and running them one by one instead of as a batch. It’s not ideal, but it can reduce cases where the agent gets lost.
Let me know if you manage to share the Request ID.