Cursor 2.2: Multi-Agent Judging

Colin · December 10, 2025, 6:30pm

New in Cursor 2.2! · Full changelog · Main announcement

When running multiple agents, Cursor now automatically evaluates all runs and recommends the best solution.

How it works

After all parallel agents finish, Cursor evaluates each solution and picks a winner. The selected agent gets a comment explaining why it was chosen.

This helps when you’re exploring different approaches to the same problem. Instead of manually comparing outputs, you get a recommendation with reasoning.

Judging only happens after all parallel agents have completed.

We’d love your feedback!

Is the reasoning offered by the “judge” agent helpful?
Does this change how you use parallel agents? Are you more likely to use them?
What improvements would you suggest?

If you’ve found a bug, please post it in Bug Reports instead, so we can track and address it properly, but also feel free to drop a link to it in this thread for visibility.

jeff.willert.se · December 10, 2025, 6:55pm

Glad this was added! I commented to Cursor on X that I’d love to see this taken a step further and have the judge actually review each of the independent implementations and pick and choose pieces from each. There’s no reason to think that Plan A would be better than Plan B across all dimensions. It would be great to critique each one and come up with a “best of both worlds approach”, especially if we do this during the planning phase prior to implementation.

StergiosDikos · December 10, 2025, 9:45pm

It doesn’t work when Bedrock is toggled on.

altrusl · December 11, 2025, 6:24am

I do the following:

Run the same task with 2–3 LLMs and save the results in separate MD files. Then, run a strong LLM on those results, asking it to extract the best parts from all of them.

This approach seems to work better than that multi-agent judging.

joeo · December 11, 2025, 3:34pm

Hey @altrusl , could you share your prompt for this?

user616 · December 11, 2025, 3:58pm

I nerver used Multi-Agent mode , I always choose the best agent model .

charles · December 11, 2025, 8:02pm

This is really interesting, but how do you set this up/do this? I don’t see any controls or options. Would be great in the chat window to run a prompt and instead of enter be able to just a alternate key to launch the prompt against multiple models all at once. (maybe this is hidden somewhere?)

ymoisan · December 11, 2025, 9:28pm

Same question. How does one do multi agents ?

stfnfrnk · December 12, 2025, 12:14pm

This is great. Could you share a bit more what reasoning is used to evaluate the different approaches, Colin?

E.g., does it look at scope of changes (and prefers minimal change?), coding style, solution design, maybe cost of the model (so we learn for the next time), …

David_Golverdingen · December 12, 2025, 12:35pm

I often make a plan with my main agent. Then review by 2 or 3 other agents if am not fully sure and give all feedback back to main agents to pick wat is valid and best. So please make a multi review judge pick best option, and apply this also for multi implementation judge pick best. For example even when Opus comes up with a goo plan, almost always Codex or Gemini has some useful feedback on it. Now I have to do to much copy page action, other option would be to drag other agent chat into you main chat like you can with files

eric1 · December 12, 2025, 5:28pm

That’s a great new feature. I do something similar but more manual with codex, four implementations/four PRs, then give an AI parameters for judging.

I also sometimes orchestrate different agents pick apart and debate prospective approaches, using gh issue threads

Colin · December 12, 2025, 5:33pm

Hey @charles and @ymoisan

Great question, and one I had myself trying it out today.

After some time, a +1 appears on the chat that has been chosen as the best.

We are aware there are some improvements to be made here. I’ve sent this feedback on (along with my own)

Colin · December 12, 2025, 5:36pm

The judge analyzes the logic behind each proposed solution and explores the codebase to confirm they’re correct. It doesn’t specifically optimize for code size, style, architectural choices, or cost.

If you disagree with the reasoning (“best” solution can be a little subjective), we’d love to hear feedback!

charles · December 12, 2025, 7:01pm

No, more basic @Colin , how do I even use multiple agents? I see no way to use this or tell an agent to create multiple sub agents. The basic functionality is hidden?

Wbaker7702 · December 13, 2025, 4:09am

Glad to see what the app is doing after I made it

Colin · December 13, 2025, 12:01pm

You should see a 1x appear next to the model name if you move your mouse into the chat window (just try hovering over the model name, you should see it appear)

meebox · December 13, 2025, 2:23am

Where does the bug appear (feature/product)?

Cursor IDE

Describe the Bug

The multi-agent judging not shown. It does work on my Mac but not on Windows.

Steps to Reproduce

Just run any multi-model agent.

Expected Behavior

Show the multi-agent judging thumb-up icon in one of the model.

Screenshots / Screen Recordings

Operating System

Windows 10/11

Current Cursor Version (Menu → About Cursor → Copy)

Version: 2.2.20 (user setup)
VSCode Version: 1.105.1
Commit: b3573281c4775bfc6bba466bf6563d3d498d1070
Date: 2025-12-12T06:29:26.017Z
Electron: 37.7.0
Chromium: 138.0.7204.251
Node.js: 22.20.0
V8: 13.8.258.32-electron.0
OS: Windows_NT x64 10.0.26200

For AI issues: which model did you use?

Composer1 and Grok Code

For AI issues: add Request ID with privacy disabled

7ad8c2bd-62a3-4538-87f3-3a356eca1311

Does this stop you from using Cursor

No - Cursor works, but with this issue

Jannie_Gill · December 13, 2025, 6:49am

I hit this too - for me it was just a UI refresh issue. Reloading the app (or restarting Cursor) made the multi-agent judging panel show up again. Also worth double-checking you’re on the latest version, since this seems a bit flaky right now.

charles · December 14, 2025, 10:22pm

Nope, no such pulldown ?

vibe-qa · December 15, 2025, 1:26am

A few issues with the process of multi-agent workflow.

sometimes they generate wildly different file names and structures
sometimes they generate the same filename
comparison is difficult for a human reviewer
presenting the result from one model to other models is difficult as the changes flow only back into the original folder
sometimes the models have different ideas, findings and knowledge, it is hard to manually steal their homework and share it between the models so they can copy and remix
sending a new query to all the agents at once after the initial query, is unstable and difficult
what happens to the forks after we close the chat session, are they going to use up space forever?

Topic		Replies	Views
How to access multi-agent judging? Help	2	76	January 11, 2026
Cursor 2.0.x - what are parallel agents doing? Help	11	3055	November 17, 2025
Multi Agent Planning (LLM Council) Discussions	3	248	January 7, 2026
Multi-Agent "Developer + Reviewer Mode" in CLI Feature Requests cli	2	235	December 13, 2025
What’s the point of multiple agents? Discussions	16	3218	November 17, 2025