"Supervisory" agent to guide "worker" agent

The Problem: The Claude 3.5 Sonnet model, used in Composer Agent mode, is incredibly capable, but still needs human supervision. Not because it can’t do the work, but because it needs simple reminders like “check your work” or “what did I say earlier?” to stay on track. Even with features like auto-fix lints, it often needs an explicit prompt to address them.

Key Insight: The most striking thing about these interventions is how minimal they are. We’re not providing new information or complex guidance—we’re just acting as a basic checkpoint system, asking questions like:

  • “Did we fix the problem?”
  • “Are all lints resolved?”
  • “What was our original goal?”

The Solution: A Supervisory Agent Layer in Cursor that would:

  1. Receive the user’s objective upfront
  2. Monitor the worker agent’s progress
  3. Provide those same simple prompts that humans currently do

Implementation Note: While Claude 3.5 Sonnet could be used for the supervisor role, an even simpler model could likely handle these oversight tasks effectively. The supervisor’s job is fundamentally about maintaining context and performing basic checks—not complex reasoning.
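To make the idea concrete, here is a minimal sketch of the loop described above. It is only an illustration under stated assumptions, not how Cursor works internally: `Supervisor`, `worker`, `is_done`, and `max_nudges` are hypothetical names, and the worker and the completion check are plain callables standing in for whatever models would actually be used.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# The checkpoint questions quoted earlier in this post.
CHECKPOINTS = [
    "Did we fix the problem?",
    "Are all lints resolved?",
    "What was our original goal?",
]


@dataclass
class Supervisor:
    """Hypothetical supervisor: holds the objective and nudges the worker."""

    objective: str
    # Assumption: sends one prompt to the worker agent and returns its reply.
    worker: Callable[[str], str]
    # Assumption: a cheap check (possibly a smaller model) that decides
    # whether the reply shows the objective has been met.
    is_done: Callable[[str, str], bool]
    max_nudges: int = 3
    transcript: List[Tuple[str, str]] = field(default_factory=list)

    def run(self) -> bool:
        prompt = self.objective  # 1. receive the objective upfront
        for nudge in range(self.max_nudges + 1):
            reply = self.worker(prompt)  # 2. monitor the worker's progress
            self.transcript.append((prompt, reply))
            if self.is_done(self.objective, reply):
                return True
            # 3. send the same simple reminder a human would type
            prompt = CHECKPOINTS[nudge % len(CHECKPOINTS)]
        return False  # give up and hand control back to the human


if __name__ == "__main__":
    # Canned replies stand in for a real worker agent.
    replies = iter([
        "I edited two files.",
        "Two lint errors remain.",
        "DONE: lints clean, goal met.",
    ])
    supervisor = Supervisor(
        objective="Fix the failing login test",
        worker=lambda prompt: next(replies),
        is_done=lambda objective, reply: reply.startswith("DONE"),
    )
    print(supervisor.run())
    for prompt, reply in supervisor.transcript:
        print(f"supervisor: {prompt}\nworker: {reply}")
```

The supervisor never adds new information here; it only repeats the objective and cycles through the fixed reminders, which is why a smaller model (or even simple rules) might be enough for the `is_done` check.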

Impact: This would transform Cursor from a powerful-but-supervised tool into a truly autonomous coding assistant, freeing developers to focus on higher-level decisions rather than basic oversight.

P.S. I’ve already tried Devin, which presumably has some level of this capability built in, and I was unimpressed. Cursor is already way ahead, and this proposed enhancement would put it well beyond everything else.

P.P.S. Please remember to “vote” above if you want to see this feature as well!

21 Likes

Only after posting this did the forum suggest some other actually-similar posts:

@lukemmtt Totally agree, this would be a game changer for Cursor!

3 Likes

@saketsarin @maxritter @funkenstrahlen @alejoh90 @dglewis @emmanuelkuebu

Thanks for all the positive reactions, everyone! Please consider hitting the “vote” button as well to give this post more visibility ❤️

2 Likes

Yes. Great

1 Like

Yes, there are definitely cases where Claude thinks it has found an issue that is actually a non-issue. A more structured approach to guiding the evaluation process helps. Cursor rule files, for example, or local notes/details/requirements in regular .md files help to guide it.

Sure, eventually there will be a manager for agents. And while Cursor has lots of those rules baked into its request processing, that does not help when an AI gets confused by excessive context full of unrelated info, or falls for traps like low-hanging-fruit “fixes” for things that are not actually issues.

A good example is when you use ‘non-standard’ ports for any third-party tools in your code. When it finds the config file while checking for issues, it suddenly says that the wrong port is an issue, even though it proved just a bit earlier that the port actually works. But if you tell it that it’s a custom port that works, that helps steer it back.

1 Like

Aider showed that this is how you achieve the best AI coder agent.

Also, an agent that implements this flow would be an absolute game changer for Cursor, I think.

So yeah, great suggestion!

1 Like

@danperks Since this idea seems to have gotten some traction, I would be grateful to hear your take on it. Any internal discussion about this idea or similar?

Thanks for the suggestion!

It’s a really interesting idea and we have some experiments in this area that we’re working on, but nothing I can share just yet. Really appreciate you taking the time to write this up - it’s great to see the community thinking about ways to improve the product.

4 Likes

It’s a good idea, and it can be developed further.

Cursor could be very different. Generating code with AI can become even better. Otherwise, as you said, we are constantly trying to keep the system contextually awake. This is frankly very difficult; after a while it gets confused anyway and we have to open new chats.

This controller could even record conversations locally, so that when we open a new chat, we can select the chat record we want and continue the same topic.
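A rough sketch of what that local recording could look like, purely as an illustration: `ChatArchive`, its methods, and the JSON layout are all made up for this example and are not an existing Cursor feature. It assumes each topic name is safe to use as a file name.

```python
import json
from pathlib import Path
from typing import Dict, List


class ChatArchive:
    """Hypothetical local store: one JSON file per conversation topic."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, topic: str, messages: List[Dict[str, str]]) -> Path:
        # Assumption: `topic` is a filename-safe string such as "auth-refactor".
        path = self.root / f"{topic}.json"
        record = {"topic": topic, "messages": messages}
        path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        return path

    def topics(self) -> List[str]:
        # The list a user would pick from when opening a new chat.
        return sorted(p.stem for p in self.root.glob("*.json"))

    def resume_prompt(self, topic: str, last_n: int = 5) -> str:
        # Build a short opening message for the new chat from the
        # last few recorded messages of the chosen topic.
        path = self.root / f"{topic}.json"
        record = json.loads(path.read_text(encoding="utf-8"))
        recent = record["messages"][-last_n:]
        lines = [f"{m['role']}: {m['content']}" for m in recent]
        return f"Continuing topic '{topic}'. Recent history:\n" + "\n".join(lines)
```

Only the last few messages are replayed into the new chat, so the continued conversation starts with a small context instead of the full history that caused the confusion in the first place.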

1 Like

This layer of (overseer) orchestration is the only thing keeping me from trusting YOLO mode. I still wouldn’t 100% trust it, but it would be a huge promotion, taking the junior-level mindset of the agent, regardless of model used, to that of a more senior team member. The layers and layers of rules that we have to put in place as guardrails continually compete for context space, and that balancing act is a large part of our labor. Adding hierarchical supervisory roles (yes, I’m suggesting more than one) is necessary to enable chunking the context. Immediate upvote.

1 Like

Great points @dglewis, this nails it.

In time, larger context windows (i.e. “attention spans”) and better adherence of models to provided guidelines & rules might make this supervisory feature redundant or irrelevant.

But until then, rather than playing this game of reminding the agent, trying to be clever with .cursorrules and prompts, and waiting for Anthropic and OpenAI to bless us with infinite context, the practical solution to unlock Cursor’s potential is right here and 100% feasible today with existing models.