I did not set it to Auto!
I chose Opus 4.6, so why did the IDE choose Composer 2, which is a Chinese model?
Steps to Reproduce
In Chat, choose Opus 4.6:
Token counting is only a working constraint, not a project key point or goal. AGENT-CONSTRAINTS: if the total token count is greater than 70% of the window, you must split the plan or task; if the predicted total token count is greater than 80% of the window, you must start a new sub-agent of `Viber Coding`! The project is a migration whose context is described in @.cursor/plans/label-x_migration_build_c656181b.plan.md and @label-x-work/docs/MIGRATION_SCAN.md @label-x-work/docs/FRONTEND_API.md @label-x-work/docs/BACKEND_API.md @label-x-work/docs/DATA_MODELS.md ; then iterate module by module, function by function! In each iteration, plan very carefully, covering module / function / completion / unit test / integration test, and record them in @label-x-work/docs/GAP_ANALYSIS.md as GAP, and the `integration tests` in @label-x-work/docs/INTEGRATION_TESTS_FUNCTIONS_GAP.md as GAP!
This is expected behavior. When you select Opus 4.6, that model is used for the main agent. Subagents (Explore, Browser, Bash) that get automatically spawned during agent tasks use Composer 2 by default. Your Opus 4.6 requests are routing correctly for the primary agent work.
To have subagents use your selected model instead of Composer, enable Max Mode in the chat settings. More details in this related thread where a staff member covers the workaround options.
Same issue: it is using Composer for planning instead of Opus. If it's doing this, it should also be reflected in the request counts.
To add to this: it now appears that when you start a plan with Opus, this takes its normal request. Then, if the agent starts a Composer 2 subagent, this takes an additional 2 requests for some reason? This is new behaviour; is it intended or not? If so, this is basically a 2x price increase.
@ibutab - built-in subagents (Explore, Bash, Browser) use Composer by default for speed and cost efficiency. Each subagent invocation consumes tokens independently from your main agent request.
How this affects billing depends on your plan:
Current usage-based plans: Composer usage draws from the separate Auto + Composer pool (on individual plans) at much lower rates than Opus.
Legacy request-based plans: Subagent invocations count against your request quota. This is what typically causes the “extra requests” you’re seeing.
To have subagents use your selected model instead of Composer, enable Max Mode in chat settings. You can also create custom subagents with model: inherit to always match the parent model.
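For reference, a custom subagent that follows the parent model might look like the sketch below. Only `model: inherit` comes from the reply above; the file location (something like `.cursor/agents/reviewer.md`), the `name` and `description` fields, and the prompt text are assumptions for illustration, so check the current Cursor subagent documentation before relying on them.

```markdown
---
# Hypothetical example; only `model: inherit` is taken from this thread.
name: reviewer
description: Reviews one migrated module at a time for missing tests.
model: inherit
---

You review a single module per invocation and report any missing unit or integration tests.
```

With a definition along these lines, the subagent would be expected to run on whatever model the parent chat uses (for example Opus 4.6) rather than on Composer.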