Agent model change around the weekend? Frustration rising

Mark_Klimp · May 13, 2026, 7:47pm

This is my second month. In my first month, I tried a lot of different agents. I really liked Opus but realized it was burning my API bucket too quickly. I stepped back to Sonnet 4.6 which helped me finish my month in a productive fashion. From 5/8 through 5/10 I had not used Sonnet. I started back on Monday and it feels like a completely different agent.

I have a small browser extension project and a workspace for a web site and my droplet. Both projects have been super time consuming/frustrating with AI.

Some examples:

In my first message I tell the agent not to update any documents. Two exchanges later it’s updating multiple files in my repo. Previously, it would ask before updating.
I have user-level rules and one hook that says if any files are updated, the project plan and readme should be updated. This is not happening. The agent used to keep things aligned, and now it’s quickly fragmenting.
It seems to get confused and lose track of the entire agent discussion/activity pretty quickly. I have 15 simple rules and one hook. There are so many missteps.

At this point, I’m doing things on my own. Before, I thought we had a good balance of my thinking/doing and the agent filling in as needed.

edit: I would like to echo some cost concerns I saw in some discussions. Last week $10 of API got me a week of heavy Sonnet coding. I’ve only used it two days and have backed off a good bit because of the results and I’ve already spent $5. IMO, 60% of it is wasted money this week.

edit: Here is a good time-wasting example. Hoping one request ID can point you to the whole chain: 36144ee3-1db0-4201-acd4-6a5435d79629. I still had to make updates after this long exchange. The result of “I’ll make the new entry look just like the certbot entry” was not a copy with the relevant cache purge changes. I had to update, clean up, and QA every file it touched. Overall, what would have been a 30-40 minute activity last week is now 90 minutes.

deanrie · May 21, 2026, 9:55am

Hey, thanks for the detailed report and the Request ID, it really helps us look up the exact chain.

On the main point: we did not swap out Sonnet 4.6 on the server side during that period, and we did not change routing. But the feeling of it getting worse usually comes down to a few real things worth checking:

1. Rules and how they are set up. If rules are set to Agent Requested or Manual, the agent often ignores them. For strict constraints like “don’t touch files without confirmation”, you need Always. Open each rule and check the mode in the header. More details: Rules | Cursor Docs
2. Number of rules. 15 rules is already a lot, and in long chats some of them can get pushed out of context due to auto-summary. If some are critical, merge them and put them into a single Always rule at the top.
3. The hook for updating the readme or project plan. Can you confirm whether this is an afterFileEdit hook in .cursor/hooks/? Please paste the contents or the hooks.json structure and we can check that it’s valid and actually firing. We’re also seeing reports that some hooks don’t emit for certain tools. Example: Bug: preToolUse / Agent hooks do not emit for built-in Web search in Cursor Agent (works in Claude Code)
4. Your Cursor version and OS. What are you on right now? The last couple of updates included a few agent changes, so the difference vs last week could be from that.

On cost: Sonnet 4.6 token pricing didn’t change, but if the agent is doing extra steps because rules aren’t being applied, usage can go up. After checking item 1, it usually improves a lot.

Send your version and how your rules are set up, the type for each one, and we’ll dig deeper.

Mark_Klimp · May 22, 2026, 4:57pm

Thanks for the response. My rules are all user rules. You can’t see them on your end?

deanrie · May 23, 2026, 9:57am

User Rules are tied to the account and sync via the cloud, but we can’t access them.

A couple of clarifications to my previous message:

The Always / Auto Attached / Agent Requested / Manual modes apply only to Project Rules .cursor/rules/*.mdc. User Rules don’t have these modes, they apply globally to all chats. Docs: https://cursor.com/docs/rules
User Rules can still get pushed out of context in long chats because of auto-summary. If you have hard constraints like “don’t edit files without confirmation”, it’s more reliable to put them into a Project Rule with alwaysApply: true in the project where it matters.
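As a rough illustration of that second point, here is a minimal sketch that writes such a Project Rule from the shell. The file name confirm-edits.mdc, the helper name write_confirm_rule, and the rule wording are made up for this example, and the description key is included on the assumption that it is a supported frontmatter field (worth checking against the Rules docs). Only the .cursor/rules/ location and alwaysApply: true come from the points above.

```shell
#!/bin/bash
# Hypothetical helper: write an always-applied Project Rule into the
# project directory given as $1. File name and rule text are examples only.
write_confirm_rule() {
  local project_dir="$1"
  mkdir -p "$project_dir/.cursor/rules"
  cat > "$project_dir/.cursor/rules/confirm-edits.mdc" <<'EOF'
---
description: Ask before editing files
alwaysApply: true
---

- Do not edit, create, or delete files without explicit confirmation in chat.
- When asked to only discuss something, reply in chat and leave files untouched.
EOF
}
```

Calling write_confirm_rule . from a project root would create the file in that project.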

Separately, on “I tell it to not update any documents. Two exchanges later it’s updating multiple files”: this might not be only about rules, but also about tool authorization. Check Cursor Settings > Agent > Tool Permissions. If file edits are set to auto-approve, the agent will edit without asking even if a rule says to ask. It’s worth switching that back to Ask.

To move forward, please share:

Your Cursor version and OS in Help > About
Your hook contents in .cursor/hooks/ or hooks.json
Whether Privacy Mode is enabled in Cursor Settings > General > Privacy Mode. That affects how deep we can go into logs using your Request ID

Mark_Klimp · June 1, 2026, 12:25am

Thanks for the clarifications.

Cursor version

Version: 3.6.21
VS Code Extension API: 1.105.1
Commit: e7a7e93f4d75f8272503ecf33cedbaae10114a10
Date: 2026-05-28T21:45:36.072Z
Layout: editor
Build Type: Stable
Release Track: Default
Electron: 39.8.1
Chromium: 142.0.7444.265
Node.js: 22.22.1
V8: 14.2.231.22-electron.0
xterm.js: 6.1.0-beta.220
OS: Linux x64 6.17.0-29-generic

I run Software Update before I begin development, so I’m certain it was an earlier version when this post was created.

I am only rarely in privacy mode and was not at the time I submitted this post.

Some recent Cursor update caused Cursor to start opening in Agent mode and the hook was running continuously. So I had the agent try to fix the runaway hook. It was supposed to be a hook that, on Stop, checked whether any project files were updated and then confirmed that the project plan.md and readme.md were accurate.

Here is the hook after the agent tried to fix the continual firing:

#!/bin/bash
# Stop hook: check whether README.md and Project Plan.md were updated
# when extension source files were modified during the session.
#
# Uses git status to detect which files changed. If any file under
# iherb-extension/ was modified but README.md or Project Plan.md
# was not, sends the agent back with a followup message.

cd "$(git rev-parse --show-toplevel 2>/dev/null)" || exit 0

changed_files=$(git status --porcelain 2>/dev/null)
if [ -z "$changed_files" ]; then
  exit 0
fi

extension_changed=$(echo "$changed_files" | grep -c "iherb-extension/")
readme_changed=$(echo "$changed_files" | grep -c "README.md")
plan_changed=$(echo "$changed_files" | grep -c "Project Plan.md")

if [ "$extension_changed" -eq 0 ]; then
  exit 0
fi

# Skip if we already reviewed this exact set of changes this session.
# The signature covers only extension files so adding README.md or
# Project Plan.md naturally produces a new signature and re-enables the hook.
signature=$(echo "$changed_files" | grep "iherb-extension/" | md5sum | cut -d' ' -f1)
state_file="/tmp/.cursor-hook-iherb-$signature"
if [ -f "$state_file" ]; then
  exit 0
fi
touch "$state_file"

missing=""
if [ "$readme_changed" -eq 0 ]; then
  missing="README.md"
fi
if [ "$plan_changed" -eq 0 ]; then
  if [ -n "$missing" ]; then
    missing="$missing and Project Plan.md"
  else
    missing="Project Plan.md"
  fi
fi

if [ -z "$missing" ]; then
  exit 0
fi

cat <<EOF
{
  "followup_message": "Extension source files were modified but $missing was not updated. Review whether $missing needs changes to stay accurate, and update if so."
}
EOF
exit 0

For now, I have removed the hook.

My Rules are all User Rules in Cursor settings. I don’t have any in files. Here they are:

Anytime curl for an install is considered, first discuss the need for the install, the downsides of curl installs (manual uninstall) and possible alternatives.

Do not use em dash in visible content.

Any SSH key should be named for its purpose, not the algorithm (e.g. deploy_mklimp, not id_ed25519).

Prefer asking a few targeted questions when requirements are ambiguous, instead of listing many possible approaches. For design and architecture, present a small set of options with tradeoffs. For explanations, be brief and on point. If the user’s request is already specific, implement or answer without extra clarification.

Whenever possible, there should only be one source of truth. Some examples include putting server settings in a script instead of documenting them; application version should be in one variable/object and the rest of the application should then reference that variable/object.

Never switch to or suggest Plan mode. Always stay in Agent mode.

Prefer single entry and exit point per code unit (function, etc). Guard clauses and continue are acceptable only in functions under ~10 lines.

Prefer direct names like start, scanProducts, setPriceTier, renderBadge

Make sure README.md is accurate after any significant code update

Every application will have a project plan.md. This document should guide all work that is done on the project.

Keep implementations simple and avoid unnecessary abstraction.

Prefer the smallest working design.
Avoid introducing a class unless at least one is true:
- Shared mutable state must persist across operations.
- Multiple instances are required.
- Explicit lifecycle management is required.
Do not introduce new adapter/registry/helper abstractions until the immediate phase requires them.
- If a domain model exists in Project plan.md, implement against it directly before creating parallel temporary models.
- If a layer is introduced early, include a one-line comment explaining the immediate requirement it satisfies.
Keep control flow shallow:
- Max 2 nesting levels per function.
- Prefer guard clauses and early returns.
Keep functions small and single-purpose (target ~20 lines, soft limit).
If two designs are valid, choose the one with fewer abstractions.
Refactor for clarity before adding new patterns or layers.

Comment the code

OOP preferred

At this point, I am sure some of these things are in settings and some were redundant with the hook. I am learning.

I took a break from Cursor development for a bit so I cannot say if things are better or worse. I will be active again this week.

deanrie · June 1, 2026, 9:13am

Thanks, this makes things clearer now. A few notes on what you sent:

Rules. You have 13 rules, and they are all User Rules. That’s a lot for permanent context, and in long chats some of them get dropped due to auto-summary. That’s why the agent starts forgetting agreements. I’d suggest this:

In User Rules, keep only truly global preferences like code style, no long dash, naming, and so on.
Move strict constraints that can’t be broken like don’t edit files without confirmation, README.md must stay up to date into Project Rules in .cursor/rules/*.mdc with alwaysApply: true in each project. Those rules stick in context more reliably than User Rules. Docs: Rules | Cursor Docs

About “you said don’t touch docs, then two turns later it edits files”: the key point is that this limitation isn’t in any of your rules. It was a one-off line in chat, so it’s not persistent. Also, if Cursor Settings > Agent > Tool Permissions has file edits set to auto-approve, the agent will edit without asking even if a rule says it should ask. Set file edits to Ask and add this as a Project Rule with alwaysApply: true if you want it to stick.

About the hook. Your Stop hook returns followup_message, and by design that sends the agent into another pass. With auto-run, that can feel like it’s firing continuously. Since you removed it, that’s fine. If you bring it back, keep that behavior in mind and keep Tool Permissions on Ask so each pass is under control.
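If the hook does come back, one way to keep it from chaining passes is to cap how many follow-ups it may emit. A minimal sketch, assuming a counter file in /tmp is acceptable state and that something clears it between sessions; the helper name followup_allowed, the counter path, and the cap of 2 are hypothetical and not part of Cursor:

```shell
#!/bin/bash
# Hypothetical helper: succeed while fewer than $2 follow-ups have been
# recorded in counter file $1, and record one more each time it succeeds.
followup_allowed() {
  local counter_file="$1"
  local max="$2"
  local count=0
  if [ -f "$counter_file" ]; then
    count=$(cat "$counter_file")
  fi
  if [ "$count" -ge "$max" ]; then
    return 1
  fi
  echo $((count + 1)) > "$counter_file"
}

# Sketch of how the end of a Stop hook might use it (path is an assumption):
#   if followup_allowed "/tmp/.cursor-hook-followups" 2; then
#     echo '{"followup_message": "Check README.md and Project Plan.md."}'
#   fi
#   exit 0
```

Once the cap is reached the hook prints nothing, so the agent is not sent into another pass until the counter file is removed.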

On cost: Sonnet 4.6 pricing hasn’t changed. Extra spend is usually from rules not being applied and the agent taking extra steps. After the setup above, it usually becomes more stable.

When you get back to work this week, try with consolidated rules and Tool Permissions = Ask, then let me know how it goes. If the “hook fires by itself” behavior happens again without the hook, send the steps and the Request ID and we’ll dig in.

Mark_Klimp · June 1, 2026, 5:40pm

Thanks for taking the time to understand the problem and provide feedback. Some of this broader guidance would be useful at Rules | Cursor Docs. The user rules section is the least detailed but seems to have some constraints.

The broader theme of “30-40 minutes is now 90 minutes” was not really addressed here. My specific example about a statement that pretty consistently worked, “don’t update any documents”, was no longer working in the same way.

The one thing that may have happened around that same time was adding the last couple user rules (curl and em dash).

Again, I am still learning. That being said, this topic still generates a lot of thoughts:

There seems to be a sweet spot for user rule quantity. I will experiment. Not many of these rules are project level but that seems to be the only solution. This means I need to create a repository for project rules then ensure all projects use the same set when I update rules.
These rules are an attempt to train/guide the AI. For a code-specialized AI, I am shocked by how sloppy it is. Is this a Cursor topic or a Claude topic? This is a small learning project with maybe 1k lines of code. I have no idea how enterprises use this at scale (yet they do).
To piggyback on the prior topic: while I believe that development with Cursor can create working results quickly, when digging into the details of the code I am certain it is not the most maintainable. One can say that since AI works so quickly, it’s fine if it’s a bit convoluted. Alternatively, this can be seen as AI selling more AI.
If the User Rule limit is important for consistency, the user settings should minimally warn the user or limit the total.

Again, thanks for all the help.

deanrie · June 1, 2026, 6:04pm

Thanks for coming back with the results and the detailed thoughts. I’ll go through the open items.

On the docs, fair point. The User Rules section is really thinner than the rest, I’ll pass this to the docs team so there’s more guidance on the difference between User vs Project Rules and what’s best to keep where.

The main thing I didn’t fully cover before is “don’t update any documents” and why 30 to 40 minutes turned into 90. Honestly, that line was a one-off chat message, not a rule, so it isn’t persistent. But it’s not only that. Negative constraints like “don’t do X” are generally harder for models to follow than positive instructions, and in long chats, auto-summary tends to wash them out of context first. We haven’t changed Sonnet 4.6 on the server and we didn’t touch routing, but adherence to these in-chat constraints can really vary from session to session. The most reliable way to lock this in is a Project Rule with alwaysApply: true plus Tool Permissions = Ask, so file edits always require confirmation. That removes most of the “it went and started editing again” cases.

On the number of User Rules, there’s no hard numeric limit in the product right now, so the sweet spot is empirical. The best-practice guideline is to keep rules focused, split big rules into smaller ones, and avoid duplicating them. The docs mention a limit of about 500 lines per rule. The more you keep in permanent context, the higher the chance some of it drops out in a long chat. So yeah, your intuition is right: fewer but sharper rules.
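For the roughly 500-line guideline, a quick check can flag oversized rule files. A minimal sketch, assuming the rules live as .mdc files in a single directory; the helper name long_rules is made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: print every .mdc file in directory $1 that has more
# lines than the limit in $2 (defaults to 500).
long_rules() {
  local rules_dir="$1"
  local limit="${2:-500}"
  local file lines
  for file in "$rules_dir"/*.mdc; do
    # Skip the unexpanded pattern when the directory has no .mdc files.
    [ -f "$file" ] || continue
    # tr strips the padding some wc implementations put before the count.
    lines=$(wc -l < "$file" | tr -d ' ')
    if [ "$lines" -gt "$limit" ]; then
      echo "$file: $lines lines"
    fi
  done
}
```

Running long_rules .cursor/rules from a project root lists the files worth splitting; no output means every rule file is within the limit.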

On sharing Project Rules across projects, you’re right. There isn’t a built-in mechanism for “one rules library for all projects” today, and that’s friction. Your idea about a warning or a limit on the number of User Rules in settings makes sense.
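One manual workaround is a small script that copies a shared set of rules into each project. A minimal sketch, assuming the shared rules sit in one local folder (for example a cloned rules repository) and that plain copies are acceptable; sync_rules and all paths are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: copy every .mdc file from the shared folder in $1
# into the .cursor/rules directory of each project listed after it.
sync_rules() {
  local rules_src="$1"
  shift
  local project
  for project in "$@"; do
    mkdir -p "$project/.cursor/rules"
    cp "$rules_src"/*.mdc "$project/.cursor/rules/"
  done
}
```

Something like sync_rules ~/rules ~/code/site ~/code/extension would refresh both projects. Copies overwrite files of the same name, so the script needs rerunning after each change to the shared folder.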

Is it Cursor or Claude? Honestly, it’s both. The model reasons and writes code, but Cursor controls context, the harness, and what gets sent to the model and when. So “set up the rules correctly plus Ask for edits” usually gives the biggest stability improvement, regardless of the model.

If you try this week with consolidated rules and Tool Permissions = Ask, let me know how it goes. If “don’t touch documents” still breaks with that setup, send the steps and the Request ID, and we’ll dig into the specific chain.

Mark_Klimp · June 1, 2026, 6:35pm

Rules | Cursor Docs seems to have a solution for sharing rules across projects.

We seem to be on different pages with the “don’t update” example. It was only two statements later that it went and started updating, not some long chat. I start nearly every chat with “let’s talk about X but please don’t update any documents”. It’s been good about asking if it should go and update. At the time of the original post, it was hastily running ahead, ignoring the “don’t update” request and making many updates. So for nearly 5 weeks I experienced the chat request working and then had multiple experiences where this no longer worked.

Again, maybe this is User Rule pollution confusing the agent. Maybe the runaway hook was corrupting the agent. I’ll try the things we discussed and see if it helps.

Mark_Klimp · June 5, 2026, 7:59pm

@deanrie I thought I would provide an update after a couple days of using the updated rule structure.

I put all of the rules into:

I am importing them for each project. Based on some research, it seemed like fewer .mdc files with more rules were better than many small files. I did keep one user rule and that is “Do not use em dash in visible content.”

I added a rule that states “If a request would cause a significant update to one or more files, ask before updating the files.”

I am using Sonnet 4.6. Overall, sessions are productive and the update rule seems to be holding.

If my experience changes I’ll provide another update.

In summary, I think the Rules documentation could use the following updates:

Update User Rules to explain that they should be minimally used and Project Rules are preferred. Explain that using more than X User Rules may cause context loss.
Better highlight the best practice of a rules repository and importing these rules into projects.
Explain how project rules are used for workspaces and that importing rules into each project in the workspace is the best practice (because the agent then only uses one instance of the rules). Other options, such as local .mdc files in multiple project repos, all come with costs/risks.
Provide guidance on .mdc file structure (e.g. fewer files with more rules).

I used Auto agent to review my changes. Anecdotally, the agent only agreed that I was doing the right thing once I pointed it towards this forum thread. To me, that alone highlights the need for updates to the Rules document.

Thanks for relieving my frustration.

deanrie · June 6, 2026, 11:28am

Glad it’s working reliably now. Consolidating in Project Rules plus a rule that asks for confirmation before meaningful edits is the most dependable way to keep those constraints in context, and it’s nice to see 90 minutes turn back into a normal pace.

Your docs list is spot on. I’ll pass it to the docs team exactly like this:

User Rules should be used minimally. Project Rules are preferred, plus a warning about context loss when there are too many rules
a best practice for a rules repository and importing into projects
how rules work in a workspace, import into each project so the agent uses one consistent set
a guide for .mdc structure, fewer files, more rules

That’s exactly the gap we talked about at the start of the thread. The User Rules section is thinner than the others.

And thanks for the GitHub link to the rules repo, it’s a great example setup for anyone who finds the thread.

If the “don’t update” pattern breaks again somewhere even with alwaysApply: true and Tool Permissions = Ask, send the repro steps and the Request ID and we’ll look at the exact chain.
