TPL-2026-009 · preprint · 2026-04-30

Plan-Mode Efficacy on Time-to-Merge: A Cross-Venture Study (n=42 tasks)

operations · plan-mode · productivity · methodology · cross-venture

Abstract

Plan mode — the agent-harness convention of producing and approving a written implementation plan before any code is written — is widely advocated as a discipline for non-trivial coding work, but its quantitative effect on time-to-merge has rarely been measured outside anecdote. We instrument 42 tasks shipped over a 10-week window across four ventures (Quantum Caddy, Mile High Golf, Parley, and TruPath cross-portfolio work), pair-matched by file-touch count and stack, and compare plan-mode tasks against direct-edit tasks on time-to-merge, post-merge rework, and operator-rated quality. Plan mode reduces median time-to-merge by 31% on tasks touching more than three files (n=22; p=0.011) but shows a null effect on tasks touching one to three files (n=20; p=0.74). The crossover point is between three and four files touched. Operator-rated post-merge rework drops from 38% of tasks to 14% in the plan-mode group on the >3-file bucket. We argue that plan mode is a structural intervention against context-window thrash, not a general-purpose productivity ritual, and that operators applying it uniformly to small tasks pay a real overhead with no measurable return.

1. Introduction

Plan mode — the agent-harness convention of producing and approving an explicit, written implementation plan before any code is written — is one of the most-advocated, least-measured disciplines in agent-mediated software work. The Anthropic Claude Code product documentation describes plan mode as a tool for “non-trivial coding work” [8], but offers no quantitative threshold. Operators who adopt the discipline tend to apply it either uniformly (a costly habit on small tasks) or never (forfeiting the benefit on large ones). Both regimes are inefficient if the true cost-benefit curve has a non-trivial threshold.

We are interested in two questions. First, does plan mode reduce time-to-merge in practice, and by how much? Second, is the effect uniform across task sizes, or does a crossover exist below which the planning step is net-negative? The second question matters because every minute of planning is a minute of operator and agent time that could be spent on other work; if the benefit is concentrated in a particular task-size regime, operators can target the discipline rather than apply it uniformly.

The literature on planning before coding is old and largely qualitative. Brooks [1] argued that throwing away the first design is itself a form of planning. Parnas and Clements [2] proposed faking a rational design process (writing, after the fact, the plan that the work could have followed) as a response to the difficulty of planning ahead. Kahneman’s work on intuition vs. explicit decision rules [3] predicts that explicit rules outperform intuition in high-uncertainty regimes and underperform it in low-uncertainty regimes, a structural argument for a crossover threshold of the kind we observe.

This paper measures the threshold quantitatively for one specific class of work: agent-mediated coding tasks shipped through a small portfolio (Quantum Caddy, Mile High Golf, Parley, and TruPath cross-portfolio work) over 10 weeks. The within-organization design controls for many process and cultural variables that confound multi-organization studies of coding-tool effectiveness [5]. The cost is external validity: a single operator’s habits are baked into the data, and the result must be replicated elsewhere before being treated as a property of plan mode rather than a property of this operator’s working style.
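As a rough illustration of the size-bucketed comparison behind the two questions above, the following Python sketch computes the fractional reduction in median time-to-merge for plan-mode versus direct-edit tasks within a file-count bucket. The record layout (`files`, `mode`, `hours`), the function name `median_reduction`, and every value in `tasks` are hypothetical placeholders, not data or code from the study. The sketch also ignores the pair-matching and significance testing that the study reports.

```python
from statistics import median

def median_reduction(tasks, min_files, max_files=None):
    """Fractional reduction in median time-to-merge (plan vs. direct)
    for tasks whose file-touch count lies in [min_files, max_files]."""
    def in_bucket(task):
        upper_ok = max_files is None or task["files"] <= max_files
        return task["files"] >= min_files and upper_ok

    plan = [t["hours"] for t in tasks if in_bucket(t) and t["mode"] == "plan"]
    direct = [t["hours"] for t in tasks if in_bucket(t) and t["mode"] == "direct"]
    if not plan or not direct:
        raise ValueError("bucket needs at least one task in each group")
    return 1 - median(plan) / median(direct)

# Made-up illustrative records; NOT the study's data.
tasks = [
    {"files": 1, "mode": "plan", "hours": 2.0},
    {"files": 1, "mode": "direct", "hours": 2.0},
    {"files": 2, "mode": "plan", "hours": 3.0},
    {"files": 2, "mode": "direct", "hours": 3.0},
    {"files": 5, "mode": "plan", "hours": 6.0},
    {"files": 5, "mode": "direct", "hours": 8.0},
    {"files": 7, "mode": "plan", "hours": 9.0},
    {"files": 7, "mode": "direct", "hours": 12.0},
]

small_bucket = median_reduction(tasks, 1, 3)  # tasks touching 1 to 3 files
large_bucket = median_reduction(tasks, 4)     # tasks touching more than 3 files
print(f"1-3 files: {small_bucket:.0%}, >3 files: {large_bucket:.0%}")
```

With these placeholder values the small bucket shows no reduction and the large bucket shows a 25% reduction, which mirrors the shape of the question (a threshold in task size) rather than the reported result.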

Cite as: TruPath Labs Research (2026). Plan-Mode Efficacy on Time-to-Merge: A Cross-Venture Study (n=42 tasks). TruPath Labs Preprint TPL-2026-009. trupathventures.net/labs/research/plan-mode-efficacy