"Supervisory" agent to guide "worker" agent

lukemmtt · February 9, 2025, 7:08pm

The Problem: The Claude 3.5 Sonnet model, used in Composer Agent mode, is incredibly capable, but still needs human supervision. Not because it can’t do the work, but because it needs simple reminders like “check your work” or “what did I say earlier?” to stay on track. Even with features like auto-fix lints, it often needs an explicit prompt to address them.

Key Insight: The most striking thing about these interventions is how minimal they are. We’re not providing new information or complex guidance—we’re just acting as a basic checkpoint system, asking questions like:

“Did we fix the problem?”
“Are all lints resolved?”
“What was our original goal?”

The Solution: A Supervisory Agent Layer in Cursor that would:

Receive the user’s objective upfront
Monitor the worker agent’s progress
Provide those same simple prompts that humans currently do

Implementation Note: While Claude 3.5 Sonnet could be used for the supervisor role, an even simpler model could likely handle these oversight tasks effectively. The supervisor’s job is fundamentally about maintaining context and performing basic checks—not complex reasoning.
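For illustration only, here is a minimal sketch of what such a supervisor loop might look like. It is not how Cursor works internally. The `Supervisor` class, the `worker` callable (send a prompt, get the worker agent's reply) and the `is_done` check are all hypothetical names assumed for this example. The checkpoint questions are the ones quoted above.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# The three checkpoint questions quoted earlier in this post.
CHECKPOINTS = [
    "Did we fix the problem?",
    "Are all lints resolved?",
    "What was our original goal?",
]


@dataclass
class Supervisor:
    """Hypothetical supervisor: holds the objective and nudges the worker."""

    objective: str
    worker: Callable[[str], str]    # assumption: sends a prompt, returns the reply
    is_done: Callable[[str], bool]  # assumption: a cheap check on the reply
    max_rounds: int = 5
    transcript: List[str] = field(default_factory=list)

    def run(self) -> bool:
        # Hand the worker the objective once, up front.
        reply = self.worker(self.objective)
        self.transcript.append(reply)
        for round_no in range(self.max_rounds):
            if self.is_done(reply):
                return True
            # Otherwise restate the objective and ask the next checkpoint question.
            question = CHECKPOINTS[round_no % len(CHECKPOINTS)]
            reply = self.worker(f"Objective: {self.objective}\n{question}")
            self.transcript.append(reply)
        return self.is_done(reply)
```

The point of the sketch is that the supervisor adds no new information: it only restates the objective and cycles through fixed questions, which is why a smaller model (or even plain code) might be enough for this role.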

Impact: This would transform Cursor from a powerful-but-supervised tool into a truly autonomous coding assistant, freeing developers to focus on higher-level decisions rather than basic oversight.

P.S. I’ve already tried Devin, which is presumed to have some level of this capability built in—and I was unimpressed. Cursor is already way ahead, and this proposed enhancement would leapfrog it beyond all else.

P.P.S. Please remember to “vote” above if you want to see this feature as well!

lukemmtt · February 9, 2025, 7:20pm

Only after posting this did the forum suggest some other actually-similar posts:

maxritter · February 10, 2025, 10:19am

@lukemmtt Totally agree, this would be a game changer for Cursor!

lukemmtt · February 10, 2025, 2:39pm

@saketsarin @maxritter @funkenstrahlen @alejoh90 @dglewis @emmanuelkuebu

Thanks for all the positive reactions, everyone! Please consider hitting the “vote” button as well to give this post more visibility.

ktidev · February 10, 2025, 2:50pm

Yes. Great

condor · February 10, 2025, 6:35pm

Yes, there are definitely cases where Claude thinks it has found an issue that is actually a non-issue. A more structured approach to guiding the evaluation process helps: Cursor rule files, for example, or local notes, details, and requirements in regular .md files.

Sure, eventually there will be a manager for agents. And while Cursor has lots of those rules baked into its request processing, that does not help when an AI gets confused by excessive context full of unrelated info, or falls into traps like low-hanging-fruit “fixes” for things that are not actually issues.

A good example is using ‘non-standard’ ports for any 3rd-party tools in your code. When it finds the config file while checking for issues, it suddenly says the port is wrong, even though it proved just a bit earlier that the port actually works. But if you tell it that it’s a custom port that works, that helps steer it back.

irian-codes · February 13, 2025, 4:42pm

Aider showed that this is how you achieve the best AI coder agent.

Also, I think an agent that implements this flow would be an absolute game changer for Cursor.

So yeah, great suggestion!

lukemmtt · February 16, 2025, 4:24pm

@danperks Since this idea seems to have gotten some traction, I would be grateful to hear your take on it. Any internal discussion about this idea or similar?

danperks · February 16, 2025, 5:00pm

Thanks for the suggestion!

It’s a really interesting idea, and we have some experiments in this area that we’re working on, but nothing I can share just yet. Really appreciate you taking the time to write this up. It’s great to see the community thinking about ways to improve the product.

mehmet-py · February 17, 2025, 6:27am

It’s a good idea, and it can be developed further.

Cursor could be very different with this. Writing code with AI could become even better. Otherwise, as you said, we are constantly trying to keep the system contextually awake. Frankly, this is very difficult, and after a while it gets confused anyway and we have to open new chats.

This controller could even record conversations locally, so that when we open a new chat we can select the chat record we want and continue the same topic.

dglewis · February 19, 2025, 6:59pm

This layer of (overseer) orchestration is the only thing keeping me from trusting YOLO mode. I still wouldn’t 100% trust it, but it would be a huge promotion, taking the agent from a junior-level mindset, regardless of the model used, to that of a more senior team member. The layers and layers of rules that we have to put in place as guardrails continually compete for context space, and that balancing act is a large part of our labor. Adding hierarchical supervisory roles (yes, I’m suggesting more than one) is necessary to enable chunking the context. Immediate upvote.

lukemmtt · February 19, 2025, 7:08pm

Great points @dglewis, this nails it.

In time, larger context windows (i.e. “attention spans”) and better adherence of models to provided guidelines & rules might make this supervisory feature redundant or irrelevant.

But until then, rather than playing this game of reminding the agent and trying to be clever with .cursorrules and prompts and waiting for Anthropic and OpenAI to bless us with infinite context, the practical solution to unlock Cursor’s potential is right here, and 100% feasible today with existing models.

dglewis · March 1, 2025, 4:29pm

I got the bright idea that by calling out, via MCP, to another model (attempted locally with Ollama running Mistral) I could improve on this, but I was not successful.

What gave me the idea was OpenTools (the open MCP server registry), but I couldn’t get it working. At least, not yet.

Anyway, figured I’d update the thread in case someone finds merit in this and can take the idea across the finish line.

dglewis · March 4, 2025, 8:35pm

Another bonus to having an intermediary supervisor is that you can converse in parallel while other tasks are going on. Right now, when the agent is processing and I need to add more context or steer it mid-stride, it causes an interrupt. A supervisory layer would smooth this out and keep the flow going.

dglewis · March 11, 2025, 12:47pm

In Manus, they named their meta agent, with the supervisory role, the executor agent.

Multi-agent implementation is one of Manus’s key features. When messaging with Manus, you only communicate with the executor agent, which itself doesn’t know the details of knowledge, planner, or other agents. This really helps to control context length.
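To make the context-length point concrete, here is a rough sketch of the general pattern. It is not Manus’s actual design. The `Executor` class, the sub-agent callables, and the crude "truncate to a summary" step are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


class Executor:
    """Hypothetical front agent: the only one the user talks to."""

    def __init__(self, agents: Dict[str, Callable[[str], str]], summary_chars: int = 80):
        self.agents = agents                # assumption: name -> sub-agent callable
        self.summary_chars = summary_chars  # cap on what enters the shared context
        self.context: List[str] = []        # what the user-facing conversation carries

    def delegate(self, agent_name: str, task: str) -> str:
        # The sub-agent may produce a long output (plans, retrieved knowledge, ...).
        full_output = self.agents[agent_name](task)
        # Only a short slice is kept. A real system would presumably summarize
        # properly; truncation is a stand-in here.
        summary = full_output[: self.summary_chars]
        self.context.append(f"{agent_name}: {summary}")
        return summary
```

However verbose the planner or knowledge agent is, each delegation adds at most `summary_chars` characters (plus the agent name) to the executor’s context.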

foxman · August 17, 2025, 4:58pm

Ask the agent to create a plan file and implement the whole plan. It will follow its own plan file. No need for a supervisor.

dglewis · August 21, 2025, 8:47pm

Creating a plan in the form of a file or set of files for guiding context (e.g. prd.md, tech-stack.md, project-plan.md, standards.md, etc.) is a long-standing and well-normalized workflow with AI across all IDEs. Here, the notion of a supervisor extends that oversight: it continually steers execution and ensures it meets the outcome-based goals predefined in the documentation set. Sufficiently simple task lists would not require a supervisor.
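As a small illustration of checking execution against predefined goals, here is a sketch that assumes the plan file lists its goals as Markdown task-list items (`- [ ]` / `- [x]`). That format, and the `unmet_goals` helper, are assumptions for this example and not something the files named above are required to follow.

```python
import re
from typing import List

# Matches Markdown task-list items such as "- [ ] Write tests" or "* [x] Add login".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def unmet_goals(plan_markdown: str) -> List[str]:
    """Return the text of every unchecked task-list item in the plan."""
    unmet = []
    for line in plan_markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            unmet.append(match.group(2).strip())
    return unmet
```

A supervisor could run a check like this after each worker step and feed any remaining items back as the next reminder, rather than relying on the worker to reread its own plan.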
