Plan Mode - Subagents - Reduce costs and save time

Feature request for product/service

Chat

Describe the request

Allow the Agent to spawn Subagents to perform subtasks.
For large requests that split cleanly into isolated subtasks, this could cut input-token costs by roughly 75% (see the estimate below).

The plan mode works great, and if prompted, the Agent will make a plan that is divided into subtasks along with an order of operations for those tasks. I find that large requests like “Convert this project from Python to Node.js” really benefit from being broken into small parts. The Agent makes a great plan and divides it into roles that separate agents could work on in parallel, but I still have to start and manage each of those agents myself.

I would like the Agent to be able to create subtasks and start additional agents to work on them in parallel. Ideally it would check on the outcome, but that might be an issue with context size. It would still work if the subagents were started and there was no check on their eventual output (depending on the task, and if appropriate testing was in place).

This would help keep down costs for large requests, and speed up results because of Agents working in parallel. Each Agent only needs the context for its individual part.
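
To make the idea concrete, here is a minimal Python sketch of the orchestration I have in mind. Everything in it is hypothetical: `Subtask`, `run_subagent`, and `run_plan` are illustrative names, not an existing API, and the "subagent" is just a short sleep standing in for real work. It only shows the shape of the flow: one subagent per subtask, each given only its own slice of context, all started concurrently, with an optional check on the results.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    context: str  # only the slice of the project this subtask needs


async def run_subagent(task: Subtask) -> str:
    """Hypothetical stand-in for a real subagent call (not an existing API)."""
    await asyncio.sleep(0.01)  # simulate the subagent doing its work
    return f"{task.name}: done"


async def run_plan(subtasks: list[Subtask], check_results: bool = True) -> list[str]:
    """Start one subagent per subtask and run them all concurrently."""
    results = await asyncio.gather(*(run_subagent(t) for t in subtasks))
    if not check_results:
        # Variant without review: the main agent never reads the outputs,
        # so its context stays small and testing has to catch problems.
        return []
    return list(results)


plan = [
    Subtask("convert models", "models.py source"),
    Subtask("convert routes", "routes.py source"),
    Subtask("convert tests", "tests/ source"),
]
results = asyncio.run(run_plan(plan))
for line in results:
    print(line)
```

Passing `check_results=False` corresponds to the case where the subagents are started and their output is never checked.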

Here is some math courtesy of Sonnet 4.5:

Cost Savings Analysis

Current Approach (Single Agent):

  • Each subtask requires the full project context
  • For a project with 100K tokens of context and 10 subtasks:
    • Total token usage: 10 × 100K = 1,000K tokens (input)
    • At $3/million tokens (Sonnet): $3.00

Proposed Approach (Parallel Subagents):

  • Main agent creates plan: 100K tokens (context)
  • Each subagent only needs relevant subset: ~15K tokens average
  • 10 subagents working in parallel:
    • Main agent: 100K tokens
    • Subagents: 10 × 15K = 150K tokens
    • Total: 250K tokens (input)
    • At $3/million tokens: $0.75

Cost Savings: 75% reduction (for tasks that can be well-isolated)
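
The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the same estimate: it counts input tokens only, uses the $3/million figure quoted above, and assumes every subagent needs the same 15K-token slice. Output tokens and any caching discounts are ignored, so real bills would differ.

```python
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, figure quoted in the post
FULL_CONTEXT = 100_000     # tokens of full project context
SUBTASK_CONTEXT = 15_000   # tokens per subagent (assumed average)
NUM_SUBTASKS = 10


def cost_usd(tokens: int) -> float:
    """Input-token cost in USD at the quoted price."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS


# Single agent: the full context is paid for once per subtask.
single_agent_tokens = NUM_SUBTASKS * FULL_CONTEXT
# Subagents: full context once for planning, then one small slice per subtask.
subagent_tokens = FULL_CONTEXT + NUM_SUBTASKS * SUBTASK_CONTEXT

single_cost = cost_usd(single_agent_tokens)
subagent_cost = cost_usd(subagent_tokens)
savings = 1 - subagent_cost / single_cost

print(f"single agent: {single_agent_tokens:,} tokens -> ${single_cost:.2f}")
print(f"subagents:    {subagent_tokens:,} tokens -> ${subagent_cost:.2f}")
print(f"savings:      {savings:.0%}")
```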

Time Savings Analysis

Current Approach (Sequential):

Task 1 → Task 2 → Task 3 → ... → Task 10
Time = T₁ + T₂ + T₃ + ... + T₁₀
If average task = 2 minutes: 10 × 2 = 20 minutes

Proposed Approach (Parallel):

Planning: 1 minute
Parallel execution: max(T₁, T₂, ..., T₁₀) ≈ 2 minutes
Total: 3 minutes

Time Savings: 85% reduction (20 min → 3 min)

Mathematical Model

Cost Formula

Sequential:

Cost_sequential = N × C_total × P

Parallel with Subagents:

Cost_parallel = C_total × P + Σ(C_i × P)
where C_i = context for subtask i (typically C_i ≈ 0.1-0.2 × C_total)

Savings Ratio:

Savings = 1 - (C_total + Σ C_i) / (N × C_total)
       = 1 - (1 + Σ(C_i/C_total)) / N

For N=10 subtasks with C_i ≈ 0.15 × C_total:
Savings = 1 - (1 + 10×0.15) / 10 = 1 - 2.5/10 = 0.75 = 75%
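
It is worth seeing how this ratio behaves for other values. The sketch below implements the savings formula as a function, assuming every subtask needs the same fraction f = C_i / C_total of the context (the names `savings_ratio` and `context_fraction` are mine). Since the formula simplifies to 1 - 1/N - f, the savings can never exceed 1 - f, and with only a few subtasks the planning pass eats most of the benefit.

```python
def savings_ratio(n_subtasks: int, context_fraction: float) -> float:
    """Savings = 1 - (1 + N*f) / N, with f = C_i / C_total equal for all subtasks."""
    if n_subtasks < 1:
        raise ValueError("need at least one subtask")
    return 1 - (1 + n_subtasks * context_fraction) / n_subtasks


# Sweep the number of subtasks for a few context fractions.
for f in (0.10, 0.15, 0.20):
    row = ", ".join(f"N={n}: {savings_ratio(n, f):.0%}" for n in (2, 5, 10, 20))
    print(f"f={f:.2f} -> {row}")
```

A negative result (for example with N=1) means the subagent approach would cost more than a single agent, because the full context is still paid for once during planning.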

Time Formula

Sequential:

Time_sequential = Σ T_i (sum of all task times)

Parallel:

Time_parallel = T_planning + max(T₁, T₂, ..., T_N) + T_overhead

Speedup Factor:

Speedup = Σ T_i / (T_planning + max(T_i) + T_overhead)

Assuming uniform task times (T_i ≈ T_avg) and negligible overhead (T_overhead ≈ 0):
Speedup = N × T_avg / (T_planning + T_avg)

For N=10, T_avg=2min, T_planning=1min:
Speedup = 20 / 3 ≈ 6.7x faster
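
The 6.7x figure depends on the tasks being about the same length. Here is a small Python sketch of the time formula (function names are mine, and the task times are made-up inputs, not measurements) that shows what happens when one task dominates or when coordination overhead is not zero.

```python
def parallel_time(task_minutes, planning=1.0, overhead=0.0):
    """Time_parallel = T_planning + max(T_i) + T_overhead."""
    return planning + max(task_minutes) + overhead


def speedup(task_minutes, planning=1.0, overhead=0.0):
    """Sequential time divided by parallel time."""
    return sum(task_minutes) / parallel_time(task_minutes, planning, overhead)


uniform = [2.0] * 10          # ten 2-minute tasks, 20 minutes total
skewed = [1.0] * 9 + [11.0]   # same 20-minute total, but one slow task

print(f"uniform tasks:          {speedup(uniform):.1f}x")
print(f"one slow task:          {speedup(skewed):.1f}x")
print(f"uniform + 1 min overhead: {speedup(uniform, overhead=1.0):.1f}x")
```

The slowest subtask sets the floor for the parallel run, so the plan matters: evenly sized subtasks get close to the ideal speedup, while a single long one pulls it down sharply.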

Example Scenarios

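| Project Type         | Tasks | Context Reduction | Cost Savings | Time Savings |
|----------------------|-------|-------------------|--------------|--------------|
| Python → Node.js     | 10    | 85%               | 74%          | 85%          |
| Large Refactor       | 15    | 80%               | 68%          | 88%          |
| Multi-service Update | 8     | 90%               | 80%          | 82%          |
| Database Migration   | 12    | 75%               | 64%          | 87%          |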
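
As a sanity check, the scenarios can be run through the cost and time formulas from the Mathematical Model section. This sketch assumes that "context reduction" means each subagent needs (1 - reduction) of the full context, and it reuses the 2-minute task and 1-minute planning figures from the time example, which the table itself does not state. Under those assumptions the results land within a few points of the table but do not match it exactly, so the table is best read as rough estimates.

```python
SCENARIOS = [
    # (project type, number of subtasks, context reduction per subagent)
    ("Python -> Node.js", 10, 0.85),
    ("Large Refactor", 15, 0.80),
    ("Multi-service Update", 8, 0.90),
    ("Database Migration", 12, 0.75),
]
T_AVG = 2.0       # minutes per task (assumption carried over from the time example)
T_PLANNING = 1.0  # minutes of planning (same assumption)


def cost_savings(n: int, reduction: float) -> float:
    """Savings = 1 - (1 + N*f) / N, where f = 1 - reduction."""
    fraction = 1 - reduction
    return 1 - (1 + n * fraction) / n


def time_savings(n: int) -> float:
    """1 - Time_parallel / Time_sequential for uniform tasks and no overhead."""
    return 1 - (T_PLANNING + T_AVG) / (n * T_AVG)


rows = [(name, cost_savings(n, r), time_savings(n)) for name, n, r in SCENARIOS]
for name, cost, time in rows:
    print(f"{name:<22} cost {cost:6.1%}  time {time:6.1%}")
```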

Key Variables

  • N: Number of subtasks
  • C_total: Full project context (tokens)
  • C_i: Context per subtask (typically 10-20% of total)
  • P: Price per token
  • T_i: Time per task
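  • T_planning: Time for the main agent to produce the plan
  • T_overhead: Time spent starting subagents and collecting their results
  • Parallelization factor: How many subtasks can actually run at once; limited by the number of subtasks and how independent they are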

Under these assumptions, the math suggests significant cost and time savings for large, decomposable tasks, which makes a strong case for the Subagent feature.