Sub-agent ROI: When Spawning Pays Back and When It Doesn't
Abstract
Sub-agent invocation — spawning a child agent to handle a bounded subtask — is increasingly common in solo-operator AI workflows, yet the conditions under which it produces a net benefit remain poorly characterized. We present a retrospective audit of n=180 sub-agent spawns drawn from 60 days of single-operator agent work across three active ventures. Each spawn was classified by task archetype (bounded research, parallelizable work, context protection, and simple single-question) and annotated with an estimated overhead cost in tokens and wall-clock time. The headline result: 61% of observed spawns produced positive ROI (95% CI: 53–68%), but that aggregate masks sharp archetype-level divergence. Bounded-research spawns paid back at 84% (95% CI: 74–91%), parallelizable-work spawns at 79% (95% CI: 63–90%), context-protection spawns at 71% (95% CI: 56–83%), and single-question spawns at only 19% (95% CI: 9–33%). The dominant failure mode across negative-ROI spawns was the sub-agent re-paying priming cost already borne by the parent: the child reconstructs context the parent had already assembled, producing a total cost that exceeds the task's value. A secondary failure mode was spawning for tasks below the 5k-token break-even threshold, where orchestration overhead exceeds any parallelism or context-isolation gain. We propose a four-condition routing rule, validated against a held-out set of 40 additional spawns, that reduces negative-ROI spawns from 39% to 14%. The routing rule and classification instrument are released under the MIT license.
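The abstract does not state which interval method produced the reported CIs; as a plausibility check, the Wilson score interval (one common choice for binomial proportions at this sample size) can be recomputed from the aggregate figures. Note that 61% of 180 spawns corresponds to roughly 110 successes; the exact count is an assumption here.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 -> ~95% CI)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Aggregate result: ~110 of 180 spawns positive-ROI (61%).
lo, hi = wilson_interval(110, 180)
```

With these inputs the interval lands near 54–68%, close to the reported 53–68%; the small discrepancy suggests the paper may use a slightly different method (e.g. Clopper-Pearson) or a slightly different success count.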
1. Introduction
Hierarchical decomposition of complex tasks into bounded subtasks is a long-studied strategy in both human organizations and autonomous systems. In reinforcement learning, the options framework [1] and related hierarchical machine architectures [2] formalize when it is advantageous to delegate to a sub-policy: broadly, when the subtask is long-horizon relative to the primary task and when the sub-policy can be executed with limited information about the parent’s global state. The same logic applies, imprecisely but usefully, to sub-agent spawning in operator-AI workflows.
In practice, however, single operators running agent-mediated engineering work tend to spawn sub-agents based on intuition rather than explicit criteria. The result is a characteristic pattern: over-spawning on small tasks where the coordination overhead exceeds any benefit, and under-spawning on genuinely independent research tasks where a bounded sub-agent would have prevented context contamination. The hidden cost structure of this over-spawning parallels the “glue code” and pipeline-complexity debt documented by Sculley et al. [4] — a tax that accrues invisibly until it becomes large enough to diagnose.
We audited n=180 sub-agent spawns drawn from 60 days of single-operator agent work across three active ventures, classifying each spawn by task archetype and annotating it with an estimated overhead cost. Overall, 61% of observed spawns produced positive ROI (95% CI: 53–68%), but the aggregate masks sharp archetype-level divergence, from 84% for bounded-research tasks down to 19% for single-question tasks. A four-condition routing rule, validated on a held-out set of 40 additional spawns, reduced the rate of negative-ROI spawns from 39% to 14%.
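The four conditions themselves are not reproduced in this excerpt, so the gate below is an illustrative sketch only: each check is derived from a failure mode or archetype result that the abstract does state (the ~5k-token break-even threshold, the 19% payback rate for single-question spawns, the priming-cost re-payment failure mode, and the options-framework intuition that the subtask should run on limited parent state). All names (`SpawnCandidate`, `should_spawn`, field names) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Archetype(Enum):
    BOUNDED_RESEARCH = "bounded_research"
    PARALLELIZABLE = "parallelizable"
    CONTEXT_PROTECTION = "context_protection"
    SINGLE_QUESTION = "single_question"

BREAK_EVEN_TOKENS = 5_000  # break-even threshold reported in the abstract

@dataclass
class SpawnCandidate:
    archetype: Archetype
    est_task_tokens: int         # estimated tokens the subtask will consume
    priming_tokens: int          # parent context the child would have to rebuild
    independent_of_parent: bool  # can the child run without the parent's live state?

def should_spawn(c: SpawnCandidate) -> bool:
    """Illustrative routing gate; NOT the paper's actual four conditions."""
    # 1. Task must clear the ~5k-token break-even threshold.
    if c.est_task_tokens < BREAK_EVEN_TOKENS:
        return False
    # 2. Skip the archetype with poor observed payback (19% for single-question).
    if c.archetype is Archetype.SINGLE_QUESTION:
        return False
    # 3. Child must not re-pay more priming cost than the task itself is worth.
    if c.priming_tokens > c.est_task_tokens:
        return False
    # 4. Subtask must be executable with limited knowledge of the parent's state.
    return c.independent_of_parent
```

A 20k-token bounded-research task with modest priming cost passes the gate; a single-question lookup, or any task under the break-even threshold, does not.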
The rest of TPL-2026-006 is for subscribers.