Token Economics of Agent-Mediated Engineering: Per-Artifact Cost Distributions
Abstract
Operators running agent-mediated engineering workflows routinely lack visibility into the per-artifact token cost of shipped work, treating the monthly API bill as a single aggregate figure. We instrument 92 days of single-operator agent usage across three concurrent ventures, computing per-shipped-artifact token cost, cache hit ratio, and the rate of "uncommitted-output days" (sessions consuming >5,000 tokens with no shipped artifact). Median cost per shipped artifact was 6,840 tokens (IQR 3,210-14,520) and showed strong dependence on cache hit ratio (Spearman ρ = -0.71, p < 0.001). Three discrete behavioral interventions — session batching, brief-drift detection, and a hard 30,000-token session cap — reduced median per-artifact cost by 38% over a 4-week intervention period.
1. Introduction
Operators of agent-mediated engineering workflows pay for tokens but rarely instrument what they get for them. The monthly bill from a model provider is a single aggregate that masks substantial heterogeneity: some sessions ship artifacts efficiently, others consume thousands of tokens with nothing committed. The same operator on the same project can have a 4× difference in per-artifact cost across different weeks, and without instrumentation cannot identify which behaviors drive the gap.
We instrument 92 days of single-operator usage across three concurrent ventures. The operator’s plan provides cached and uncached input tokens at known per-token rates [7], and the agent harness logs every session’s input/output/cached token counts. We compute three diagnostic metrics — cost per shipped artifact, cache hit ratio, and uncommitted-output day rate — over the period, identify a strong relationship between cache hit ratio and per-artifact cost, and run a 4-week intervention period to measure the effect of three discrete behavioral changes.
Our contributions:
- Per-artifact token cost as a measurement primitive — distinct from gross usage and from cache-hit ratio.
- An empirical demonstration that cache hit ratio correlates strongly with per-artifact cost (Spearman ρ = -0.71), making cache hygiene the highest-leverage cost lever.
- An ablation showing that three discrete interventions (session batching, brief-drift detection, hard session cap) each contribute meaningfully to the 38% cost reduction, with batching the largest single component.
The rest of TPL-2026-003 is for subscribers.
Token Economics of Agent-Mediated Engineering: Per-Artifact Cost Distributions
- Every Expert-tier lesson — diagnostic prompts, transcripts, prompt kits, full homework
- Every research paper — methodology, figures, tables, reproducibility appendices
- New Expert lessons + papers as they ship (quarterly cadence)
- Foundations + Operating lessons stay free; bundles on GitHub stay free; this tier is the deep stuff
Free while the early catalog ships. Paid tier comes later — subscribe now and you’re grandfathered in.