I’m having multiple agents/LLMs audit my code. Using the “Plan” mode, however, seems to complicate things further:
While most agents file the plan in ~/.cursor/plans, others file them in their subtree.
Asking agents to rename files also has mixed results: some rename the file in place, others delete the original and create a new one, and GPT-5.3 Codex says it renamed the file but actually didn’t!
Opening the plans in the Cursor IDE is a nightmare: you have to select each model card on the right (to open that specific model’s chat), then click the file reference. Sometimes that opens the file in a new tab, but most of the time it replaces the previously opened file, which defeats the purpose, since the whole point is to compare files.
Does anyone know of a more efficient and less maddening way to do this?
Hey, interesting workflow. A couple of things that might help:
For comparing plans across models: Have you tried the built-in “Use Multiple Models” feature? It sends the same prompt to multiple models at once and shows the results side by side in separate worktrees. It requires an open Git repo as the workspace root. This may work better than manually running separate chats.
About tabs being replaced: This is standard VS Code behavior with “preview” tabs. To disable it, open Settings and uncheck Workbench > Editor: Enable Preview (the workbench.editor.enablePreview setting). After that, each click opens the file in its own persistent tab instead of replacing the preview. Alternatively, double-click a file link to open it in a persistent tab.
About where plan files are saved: The docs confirm that by default plans are saved in your home directory at ~/.cursor/plans. Different models may not always follow the same convention. To keep it consistent, try adding an explicit instruction to the prompt, like “Save the plan to ~/.cursor/plans/[model-name]-plan.md”. Cursor Rules (in .cursor/rules) can also help standardize this across sessions.
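If some plans still end up scattered, a small script can gather them into one folder for comparison. This is only a rough sketch, not a Cursor feature: the function name collect_plans, the labels, and the "*.md" pattern are assumptions, so adjust them to whatever your plan files are actually called.

```python
from pathlib import Path
import shutil


def collect_plans(sources, dest, pattern="*.md"):
    """Copy plan files from several directories into one folder.

    sources maps a label (for example a model name) to a directory.
    Each copy is prefixed with its label so that plans from different
    models do not overwrite each other. Directories that do not exist
    are skipped. The default pattern "*.md" is an assumption.
    """
    dest = Path(dest).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for label, directory in sources.items():
        directory = Path(directory).expanduser()
        if not directory.is_dir():
            continue
        # rglob also searches subfolders of each source directory
        for plan in sorted(directory.rglob(pattern)):
            target = dest / f"{label}-{plan.name}"
            shutil.copy2(plan, target)
            copied.append(target)
    return copied


# Example call (the second path is hypothetical):
# collect_plans(
#     {"home": "~/.cursor/plans", "model-a": "./worktrees/model-a"},
#     "./plan-compare",
# )
```

Keep the destination folder outside the source directories, otherwise a second run would pick up the earlier copies as well.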
About the renaming issue in GPT-5.3 Codex: There are known issues with GPT Codex models in Plan mode. The team is aware of various quirks with this model during planning.
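Until that is sorted out, it may be worth checking the file system yourself rather than trusting the model’s report. A minimal sketch, assuming you know the old and new paths; rename_status is a hypothetical helper, not part of Cursor:

```python
from pathlib import Path


def rename_status(old_path, new_path):
    """Report what actually happened on disk after a claimed rename.

    Returns one of:
      "renamed"   - the old path is gone and the new path exists
      "copied"    - both paths exist (new file created, original kept)
      "unchanged" - only the old path exists (nothing was renamed)
      "missing"   - neither path exists
    """
    old_exists = Path(old_path).expanduser().exists()
    new_exists = Path(new_path).expanduser().exists()
    if new_exists and not old_exists:
        return "renamed"
    if new_exists and old_exists:
        return "copied"
    if old_exists:
        return "unchanged"
    return "missing"
```

A result of "unchanged" matches the case where the model says it renamed the file but did not.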
What Cursor version are you on, and which specific models are you comparing? That can help narrow this down.
fwiw i ran a small comparison between plan-first and just letting the agent go directly. plan-first used fewer tokens and caught more edge cases. the tricky part is exactly what you’re hitting though, comparing outputs across models is painful when the IDE keeps replacing your tabs.
the preview tab tip from @deanrie is the fix for the tab replacement. i had the same problem until i turned that off in settings.
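if you’d rather compare two plans outside the IDE, a plain text diff is another option. rough sketch using Python’s standard difflib; diff_plans is just a name i made up, and the file paths are whatever your plans are called.

```python
import difflib
from pathlib import Path


def diff_plans(path_a, path_b):
    """Return a unified diff of two plan files as a single string.

    An empty string means the two files have identical lines.
    """
    lines_a = Path(path_a).read_text(encoding="utf-8").splitlines()
    lines_b = Path(path_b).read_text(encoding="utf-8").splitlines()
    diff = difflib.unified_diff(
        lines_a,
        lines_b,
        fromfile=str(path_a),
        tofile=str(path_b),
        lineterm="",  # lines were split without newlines, so add none back
    )
    return "\n".join(diff)
```

lines starting with "-" appear only in the first plan and lines starting with "+" only in the second.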