I tried building a search agent based on Grok Code Fast. The primary agent was GPT-5.2 XHigh.
Conclusion: GPT-5.2 XHigh fills its context more efficiently when it performs the search itself than when it reads context gathered on instructions from others. Still, a search agent based on GCF, Gemini 3 Flash, or Claude Haiku can be worth using for the second search pass.
I’ll let you know if I have any further details on this subagent.
My main model is GPT-5.2 XHigh. I’ll be trying the following subagents:
repo-searcher
---
name: repo-searcher
description: Use as a second pass for collecting context on a repository
model: grok-code-fast-1
readonly: true
---
You are a search subagent (search AI).
Your task is to perform a deep search of the repository for the query of the Senior AI that summoned you.
Response rules:
* For files longer than 200 lines, reference them as "relative_path lines [start_line]-[end_line]". Including the full code snippets is also recommended.
* For smaller files, reference them as "relative_path"
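For illustration, a reply that follows these rules might look like this (the paths, line range, and snippet placeholder below are hypothetical):

```
src/trace/writer.rs lines 210-262
    <full snippet of the matched function>
src/config.rs
```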
web-searcher
---
name: web-searcher
description: Use when web_search_tool could not gather enough information, to run the search via Browser so the Senior AI doesn’t waste tokens on it
model: gpt-5-mini
readonly: true
---
You are a search subagent (search AI).
Your task is to perform a deep search on the internet for the query of the Senior AI that summoned you.
Use Browser_tool to gather information by browsing the internet.
Response format:
- Analytical report
another-opinion
---
name: another-opinion
description: Use to get another perspective on the problem (only for critically hard or critically important tasks)
model: claude-4.5-opus-high-thinking
readonly: true
---
You are a subagent.
Your task is to help the Senior AI make decisions or solve a complex problem.
Response format:
- Analytical report
- Several solution options for the problem
verifier
---
name: verifier
description: Use at the end of work on a complex task to verify completion.
model: gemini-3-flash-preview
readonly: true
---
You are a subagent.
Your task is to check whether the Senior AI that summoned you actually completed its tasks.
Response format:
- Short analytical report
@Colin, would it be convenient for you if I wrote everything related to subagents in this thread, or should I create separate bug reports and feature requests?
One thread = one issue tends to work best for our team, but if you want to publish a compilation of feedback/reflections linking to those threads you’re welcome to!
First run: Senior SWE (agent id: ec90e619-a3c0-477d-b62c-61ef97d99b8e).
Check: verifier (agent id: 1a80f265-cb9d-4040-9ec4-83c99adce677) found an issue:
JSON traces were renamed/moved, but .metadata.ron didn’t actually appear.
Classification per your rule: not an architectural problem, but an execution/criteria context issue (ECS exit ordering).
Escalation/fix: I brought it back to the same level (Senior SWE) with a clarified requirement:
write metadata after app.run(), not via ECS Last/AppExit.
Senior SWE started fixing it, but its report cut off at “clippy fixed, rerun CI” (agent id: c1f49e38-b1b0-45af-afed-5d50d9a56237 with no final status).
My pre-verification:
I read the changes myself and saw that finalize_trace_artifacts_after_run() now writes .metadata.ron after app.run() (so the guarantee improved).
I ran verifier again (agent id: 40e1ab19-0a0e-4e06-8782-5a00c4853935) → OK.
CI incident (non-code):
.\run.ps1 failed because Windows had locked [censored] (“failed to remove file”).
I force-stopped the process (just in case) and deleted the exe; after that, .\run.ps1 -Fast -SkipLaunch passed.
Conclusion for the package: the “trace JSON + sidecar metadata” requirement is technically satisfied.
Interesting. It’s nice of you to keep this thread updated with links to your other comments; it makes it easy for me to weigh whether or not to switch to nightly. Thank you. It seems I’m better off waiting for subagents to reach the stable channel.
To be honest, I’m already absolutely delighted.
I use Rust, have a rigorous local CI system, and a well-developed task list. During the first test, I simply threw in a fragment of the task list, and the tasks that didn’t require manual intervention were completed with just one error, one the agent might not have detected without me.
I’m currently trying to optimize the Senior AI tech-lead prompt to minimize costs, and I’m starting to think about how to replace myself with AI even more.