
How I track token spend and find the waste
Cache hit ratio, tokens per shipped artifact, uncommitted-output days. Three ratios I run weekly.
Ratios beat the bill
For a long time I just looked at the monthly API bill. Under my threshold I felt fine. Over it I panicked. That's the wrong signal. Two people with the same bill can be in completely different states: one shipping efficiently, one bleeding tokens. The bill won't tell you which.
What I actually use now are ratios. Cache hit ratio tells me whether my sessions are warm or cold. Tokens per shipped artifact tells me whether the spend is producing anything. Uncommitted-output days tell me whether my briefs are drifting. None of these show up on the bill.
I run all three weekly. Ten minutes. They catch waste 2-4 weeks before the bill does.
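Here is a minimal sketch of how the three numbers could be computed from a week of usage logs. The definitions are my working assumptions, not a standard: cache hit ratio as cached input tokens over all input tokens, tokens per shipped artifact as total tokens over artifacts shipped, and an uncommitted-output day as any day with token spend and nothing shipped. The `DayLog` record and its field names are hypothetical; map them onto whatever your provider's usage export actually contains.

```python
from dataclasses import dataclass


@dataclass
class DayLog:
    # Hypothetical per-day record; field names are illustrative only.
    input_tokens: int         # all input tokens that day (cached + uncached)
    cached_input_tokens: int  # portion of input tokens served from cache
    output_tokens: int
    artifacts_shipped: int    # commits, merged PRs, published docs, etc.


def cache_hit_ratio(days):
    """Assumed definition: cached input tokens / total input tokens."""
    total_in = sum(d.input_tokens for d in days)
    cached = sum(d.cached_input_tokens for d in days)
    return cached / total_in if total_in else 0.0


def tokens_per_artifact(days):
    """Assumed definition: all tokens spent / artifacts shipped."""
    total = sum(d.input_tokens + d.output_tokens for d in days)
    shipped = sum(d.artifacts_shipped for d in days)
    # Nothing shipped means the per-unit cost is unbounded.
    return total / shipped if shipped else float("inf")


def uncommitted_output_days(days):
    """Count days that spent tokens but shipped nothing."""
    return sum(
        1
        for d in days
        if (d.input_tokens + d.output_tokens) > 0 and d.artifacts_shipped == 0
    )


# Made-up example week, purely for illustration.
week = [
    DayLog(100_000, 70_000, 10_000, 2),
    DayLog(50_000, 10_000, 5_000, 0),   # spend, nothing shipped
    DayLog(0, 0, 0, 0),                 # day off: no spend, not counted
    DayLog(50_000, 40_000, 5_000, 1),
]

print(f"cache hit ratio: {cache_hit_ratio(week):.0%}")
print(f"tokens per artifact: {tokens_per_artifact(week):,.0f}")
print(f"uncommitted-output days: {uncommitted_output_days(week)}")
```

Days with zero spend are deliberately excluded from the uncommitted-output count, since a day off is not a drifting brief.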
Three diagnostic metrics
Each catches a different waste pattern. Skip one and you miss the matching pattern entirely.
| Metric | Healthy range | What it catches |
|---|---|---|
| Cache hit ratio | >60% | Cold-cache thrash — sessions too fragmented to reuse context |
| Tokens per shipped artifact | Trending flat or down vs baseline | Per-unit cost drift — agents getting less efficient over time |
| Uncommitted-output days | <1 per week | Brief drift — sessions that consume tokens but ship nothing |
Cache hit ratio is the cheapest one to fix. Brief drift is the hardest. Tokens-per-artifact drift sits in between. I run them in cost-of-fix order so I bank the easy wins first.
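The weekly check against the healthy ranges in the table can be sketched roughly as follows. The function name `weekly_flags`, the message strings, and the `drift_tolerance` parameter are all hypothetical. The table only says "trending flat or down vs baseline", so the 10% tolerance used to decide what counts as "flat" is an assumption you should tune.

```python
def weekly_flags(cache_hit, tokens_per_artifact, baseline_tokens_per_artifact,
                 uncommitted_days, drift_tolerance=0.10):
    """Return warnings in cost-of-fix order: cheapest fix first.

    drift_tolerance is an assumed allowance for "flat": per-artifact cost
    may sit up to 10% above baseline before it is flagged.
    """
    flags = []

    # 1. Cheapest to fix: healthy range is a cache hit ratio above 60%.
    if cache_hit <= 0.60:
        flags.append("cache: hit ratio at or below 60% (cold-cache thrash)")

    # 2. Middle: per-artifact cost should be flat or down vs baseline.
    if tokens_per_artifact > baseline_tokens_per_artifact * (1 + drift_tolerance):
        flags.append("per-artifact: tokens per shipped artifact above baseline")

    # 3. Hardest to fix: healthy range is fewer than 1 such day per week.
    if uncommitted_days >= 1:
        flags.append("brief drift: days with token spend and nothing shipped")

    return flags


# Made-up numbers, purely for illustration.
for flag in weekly_flags(0.45, 90_000, 75_000, 2):
    print(flag)
```

Returning the flags as an ordered list keeps the easy wins at the top, so the first line printed is always the cheapest problem to address.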
The rest of Expert · Lesson 03 is for subscribers.