Architecture● live

Agent Hierarchies

Authority levels, the mothership pattern, generation/evaluation separation. The structural decisions most multi-agent setups skip until they break.

Five authority levels from Observer to CEO — each with distinct write-access rules
Generation/evaluation separation: why the agent that built it can't grade it

GitHub publication pending

I ran 14 agents for four days in March before cutting to 6. The cut made the system faster, not slower. The 8 agents I removed weren't bad — they were too narrow, and narrow agents in a poorly structured hierarchy create routing overhead that compounds.

The structural decisions I got wrong are predictable in hindsight. No authority levels — every agent had implicit full write access. No generation/evaluation separation — the agent that built the thing was the same agent that declared it done. No mothership — everything routed to me as the de facto integrator.

These three decisions are independent, and each is worth building before you hit the failure.

Authority levels

Not every agent should be able to write everywhere. The difference between a well-scoped specialist and a runaway agent is often a single "Never" line that wasn't written.

Five levels:

Observer — read-only. No writes. The right pattern for QA agents, audit agents, evaluators. An evaluator that can edit code isn't evaluating — it's cleaning up after itself, which changes what it reports.

Scoped writer — writes to specific paths only. The right starting point for every specialist. SCRIBE in my system writes only to 01-TruPath-Labs/. AXIOM writes to src/ and scripts/, never to .claude/hooks/ or agent files.

Director — writes broadly within domain, escalates on cost thresholds, legal risk, and architecture decisions. The appropriate level for mature, battle-tested specialists.

Chief of Staff — routes across all domains. Handles routing, synthesis, and brief. Escalates to CEO for anything irreversible or cross-venture.

CEO — no restrictions. That's you.

Start at scoped writer. Promote when the agent has demonstrated it won't exceed its scope.

The "Never" row in every agent file is the most important line in the file: never edit agent files, hooks files, or CLAUDE.md without explicit CEO instruction. An agent that can edit its own constraints can remove its own enforcement.

The mothership pattern

For a multi-venture portfolio, one agent holds cross-venture context: the routing table, the morning brief, the escalation chain, cross-venture coordination. In my system that's APEX.

What the mothership owns: routing, synthesis, escalation, coordination.

What the mothership doesn't own: domain expertise, execution, specialist memory. The specialists hold their own domains. The mothership connects them.

The common failure is making the mothership a god-agent — give it everything, because it's the smart one. The practical result is that the mothership becomes slow and unreliable on domain-specific work, and specialists feel redundant. The mothership's value is routing and synthesis, not expertise.

Generation/evaluation separation

The agent that builds the thing should not evaluate the thing.

This isn't about trust. It's about blind spots. An agent that built a schema migration has a mental model of how it's supposed to work. That mental model filters what it tests. Conditions that contradict the model get rationalized or skipped.

An independent evaluator — fresh to the output, without the builder's assumptions — tests differently. It encounters the output the way the next agent will: without prior context, without shortcuts.

In QC: AXIOM builds. BEACON evaluates. BEACON is read-only. It cannot edit code, cannot approve its own findings. AXIOM reads the BEACON report and implements changes. BEACON re-runs the criteria independently before signing off.

For single-operator setups without a dedicated evaluator: start a fresh session. Give the output to the new session without telling it what the builder intended. Ask it to evaluate against the sprint criteria. The gaps it finds are the gaps a downstream agent would find.

This is the single highest-leverage structural change in any multi-agent system. The first time you run an independent evaluation and it surfaces a gap the builder missed, the principle will never need arguing again.

— Michael, from the lab

← All bundles Subscribe to Field Notes