Operating · Lesson 10 — The routing model — one chief of staff, N specialists

The routing model

One chief of staff, N specialists. Why six is right and fourteen is too many.

14 min read · 30 min apply · closes Operating tier

What routing is for

Most multi-agent setups feel slower than they should. The reason is rarely the agents — it’s the routing. With no clear default agent and no explicit triggers, every request requires a who-handles-this decision before any productive work happens. The mental overhead scales linearly with agent count and dominates the specialization payoff after ~6 agents.

The routing model is the structure that makes the multi-agent system fast: one chief-of-staff handles routing by default. Specialists fire on observable triggers OR when the chief explicitly hands off. The operator routes only when neither path applies.

This lesson closes the Operating tier. Foundations 02 taught how to define one specialist; this lesson teaches how to compose specialists into a system that scales without routing overhead.

The 14-to-6 story

When I first set up multi-agent routing, I went big. 14 agents. One per venture, plus specialists per domain, plus a chief-of-staff. Every reasoning mode I could name got its own agent.

The system felt slower than my old one-agent setup. Not because the agents were worse — they were better-scoped — but because every request had a who-handles-this decision attached. Half my mental cycles went to routing instead of work.

I cut to 6. The same work happened faster. The remaining 8 agents weren’t bad ideas; they were too narrow. They handled too few requests to justify the routing decision. The chief-of-staff effectively handled most of their work anyway, just with extra overhead.

The lesson: more specialists ≠ more capacity. The routing overhead dominates after ~6 agents for a solo operator. The right roster is small, dense, distinct.

Three routing failures

The patterns that make multi-agent setups feel slower than they should.

01

The flat army

claim looks like: All agents at the same level. Operator routes manually each time. No chief-of-staff.
what’s missing: Operator becomes the routing layer. Every request requires a who-handles-this decision before work starts. Mental overhead scales linearly with agent count.
the move: One default agent (chief-of-staff) handles routing. Specialists fire on triggers OR when the chief explicitly hands off. Operator routes only when chief doesn't.
02

Split on company, not on reasoning style

claim looks like: One agent per venture. "VENTURE-A handles all venture A work. VENTURE-B handles all venture B work."
what’s missing: Each venture-agent has too much surface area: engineering AND fundraising AND ops AND legal. The agent's reasoning style has to switch every turn. Memory bleeds across reasoning modes.
the move: Split agents on the seam where reasoning style changes, not on the seam where company changes. Engineering specialist for technical reasoning across ventures. Legal/IP specialist for legal reasoning across ventures.
03

Authority ambiguity

claim looks like: Specialist drafts a fix. Doesn't know whether to ship it or surface for approval. Either freezes (asks for every micro-decision) or overreaches (commits to things it shouldn't).
what’s missing: L3 vs L4 vs L5 isn't documented. No written rule for what each agent can decide. The first time it bites, you lose either time (over-asking) or trust (over-acting).
the move: Authority levels in the agent.md file. L5 chief routes; L4 specialists own their domain end-to-end with operator approval on big decisions; L3 specialists advise but don't ship. Three levels are usually enough.

The fix in all three: split agents on reasoning style, route through a default chief-of-staff, document authority explicitly. Each fix is small; together they make a 6-agent system feel faster than a 1-agent system.

The split-test

The diagnostic for whether your roster is the right size and shape:

The split-test
Look at my agent roster.
For each agent, ask:
- What's the ONE reasoning style this agent embodies?
  (engineering, legal, ops, content, etc.)
- What share of recent requests have routed to this agent?
- If any agent gets >40% of requests, it's too broad — split it.
- If any agent gets <5% of requests, it's too narrow — fold it into
  another or kill it.

Output a table with: agent, reasoning style, recent share %,
recommendation (KEEP, SPLIT, FOLD, KILL).

Healthy roster: 5-7 agents, each at 10-25% of routes, no agent
above 40% (except the chief-of-staff handling fall-through).

Surfaces over-broad agents (>40% of routes) and under-used agents (<5%) — two signs the roster needs adjustment.

Healthy roster signals:

  • 5-7 total agents for a solo operator. 9-11 for a small team.
  • Each agent embodies one reasoning style (engineering, legal, content, etc.) — not one venture.
  • No agent >40% of routes except the chief-of-staff handling fall-through.
  • No agent <5% of routes. Below that threshold, the agent is too narrow — fold or kill.

Run the split-test monthly during the first six months. The roster shape stabilizes after that, but new ventures or new reasoning styles can shift the balance.
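
If you keep a log of which agent handled each request, the split-test thresholds above can be checked mechanically. The following is a minimal sketch, not a prescribed tool: the `split_test` function, the `route_log` list format, and the sample numbers are all hypothetical, and only the 40% and 5% thresholds and the chief-of-staff exemption come from the lesson.

```python
from collections import Counter

# Thresholds taken from the split-test prompt.
TOO_BROAD = 0.40   # above this share, the agent is too broad
TOO_NARROW = 0.05  # below this share, the agent is too narrow


def split_test(route_log, roster, chief="APEX"):
    """Return {agent: (share, recommendation)} for every agent in the roster.

    route_log is a list with one agent name per handled request
    (an assumed format; adapt it to however you record routes).
    """
    counts = Counter(route_log)
    total = len(route_log)
    report = {}
    for agent in roster:
        share = counts[agent] / total if total else 0.0
        if share > TOO_BROAD and agent != chief:
            verdict = "SPLIT"          # too broad
        elif share < TOO_NARROW:
            verdict = "FOLD or KILL"   # too narrow
        else:
            verdict = "KEEP"           # includes the chief handling fall-through
        report[agent] = (round(share, 2), verdict)
    return report


# Hypothetical data: 100 requests across a six-agent roster.
roster = ["APEX", "ENGINEERING", "VELOCITY", "SUMMIT", "CIPHER", "SCRIBE"]
log = (["APEX"] * 45 + ["ENGINEERING"] * 25 + ["VELOCITY"] * 15
       + ["CIPHER"] * 12 + ["SCRIBE"] * 3)

for agent, (share, verdict) in split_test(log, roster).items():
    print(f"{agent:12} {share:>5.0%}  {verdict}")
```

Passing the roster separately from the log means an agent that received no requests at all still shows up in the report, which is the clearest fold-or-kill case. The reasoning-style column from the prompt is left out because it requires judgment, not counting.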

Annotated transcript

A real conversation: operator coming off a 14-agent setup, working through what to keep.

From 14 agents to 6

Four operator turns, four agent turns. Result: the structural insight (split on reasoning, not on company) and the authority levels.

  1. operator

    I had 14 agents. The system felt slower than 1 agent did. I cut to 6 and it sped up. Why?

  2. agent

    Routing overhead. With 14 agents, every request had a who-handles-this decision in front of the work. Some routes were obvious; many were ambiguous. The ambiguous ones cost mental cycles before any productive work happened.

    With 6 agents, the agent-per-domain mapping is dense enough that 80%+ of requests have an obvious home. The remaining 20% land on the chief-of-staff by default. The cost of routing drops to near-zero.

  3. operator

    How did you decide which 6 to keep?

  4. agent

    I split on the seam where reasoning style changes, not on the seam where company changes. The reasoning styles in my work:

    1. Coordination + routing — chief-of-staff.
    2. Engineering reasoning — technical, code, architecture. Cross-venture.
    3. Business + investor reasoning — narrative, pitch, fundraising.
    4. Operational reasoning — venue, logistics, vendors.
    5. Legal/compliance reasoning. Cross-venture.
    6. Content/voice reasoning — newsletter, brand.

    Every request maps to one of those reasoning styles. Cross-venture specialists are valuable because legal reasoning is the same shape regardless of which venture surfaces it. The same lawyer-brain works across all of them.

  5. operator

    How does the chief-of-staff know when to route to who?

  6. agent

    Trigger words in the routing table. CLAUDE.md has a section like:

    | Agent | Activate when |
    |-------|---------------|
    | APEX | Default. "morning brief", "catch me up", general |
    | ENGINEERING | code, debug, build, architecture, deploy |
    | VELOCITY | pitch, investor, fundraising, narrative |
    | SUMMIT | venue, lease, SBA, logistics |
    | CIPHER | contract, agreement, compliance |
    | SCRIBE | newsletter, blog, content, voice |

    Chief reads this on every session. When you say "draft a pitch deck for VC," the trigger word "pitch" routes to VELOCITY. When you say "morning brief," no specialist trigger fires; default APEX handles.

    For edge cases the chief escalates: "This touches engineering AND legal — do you want ENGINEERING or CIPHER first?" Operator decides; chief routes; work proceeds.

  7. operator

    Authority levels — how do you set them?

  8. agent

    Three levels are usually enough.

    L5 — Chief of Staff (APEX). Routes everything. Drafts most outputs. Never ships without operator approval. Authority to call "this needs ENGINEERING" without asking, but not to commit to anything irreversible.

    L4 — Specialist Director (CIPHER, SUMMIT, VELOCITY). Owns their domain end-to-end. Can decide most things within domain. Surfaces big decisions (>$10K, irreversible, brand-level) for operator approval. Veto authority within their domain — can refuse work that violates their domain rules.

    L3 — Advising Specialist (ENGINEERING in some shops, SCRIBE in mine). Reads, drafts, surfaces. Doesn't ship without explicit operator OK. Used when the cost of a wrong decision is high enough that you want all proposals reviewed.

    Document the authority in each agent.md file. Make it explicit: "can decide X. cannot decide Y. surfaces Z for operator review." Three lines. Resolves 90% of authority confusion.
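
The routing behavior described in the transcript (trigger words, a default chief, and escalation on multi-specialist matches) can be sketched in code to make the rules concrete. This is an illustration under assumptions, not how the chief-of-staff actually reads CLAUDE.md: the `route` function and whole-word matching are hypothetical choices, while the agent names and trigger words are copied from the table in the transcript.

```python
import re

# Specialist trigger words copied from the routing table in the transcript.
ROUTING_TABLE = {
    "ENGINEERING": ["code", "debug", "build", "architecture", "deploy"],
    "VELOCITY": ["pitch", "investor", "fundraising", "narrative"],
    "SUMMIT": ["venue", "lease", "SBA", "logistics"],
    "CIPHER": ["contract", "agreement", "compliance"],
    "SCRIBE": ["newsletter", "blog", "content", "voice"],
}
DEFAULT_AGENT = "APEX"  # chief-of-staff handles fall-through


def route(request):
    """Return (agent, reason) for a request.

    Assumption: triggers match whole words, case-insensitively,
    so "building" does not fire the "build" trigger.
    """
    words = set(re.findall(r"[a-z]+", request.lower()))
    hits = [agent for agent, triggers in ROUTING_TABLE.items()
            if any(t.lower() in words for t in triggers)]
    if not hits:
        return DEFAULT_AGENT, "no specialist trigger fired"
    if len(hits) == 1:
        return hits[0], "single trigger match"
    # Several specialists matched: the chief asks the operator to choose.
    return DEFAULT_AGENT, "ambiguous: ask operator to pick " + " or ".join(hits)


print(route("draft a pitch deck for VC"))   # ('VELOCITY', 'single trigger match')
print(route("morning brief"))               # ('APEX', 'no specialist trigger fired')
print(route("debug the contract parser"))
# ('APEX', 'ambiguous: ask operator to pick ENGINEERING or CIPHER')
```

The third case is the edge case from the transcript: the request touches engineering and legal, so the chief keeps it and surfaces the choice instead of guessing.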

The structure — five questions every routing model answers

  1. Who’s the default? One agent (chief-of-staff). Handles every request unless a specialist’s trigger fires. Owns routing by default.
  2. What are the reasoning styles? The seam where agents split. Engineering. Legal. Content. Ops. Investor. List 4-6 distinct ones for your work.
  3. What triggers each specialist? Observable phrases or keywords: a prefix such as “pixel:” for design, or “cipher:” and words like “patent” / “NDA” for legal. Triggers go in the CLAUDE.md routing table.
  4. What’s the authority level? L5 chief (routes, drafts, no ship). L4 specialist (owns domain, decides most things, surfaces big calls). L3 advisor (drafts only, never ships without explicit OK). Three levels usually cover it.
  5. What’s the escalation path? Specialist surfaces big decisions to chief-of-staff. Chief surfaces ambiguous routes to operator. Operator is the final escalation. Two hops max from any specialist to the operator.

Document all five in your CLAUDE.md routing table + each agent.md file. Total writing time: ~30 minutes for a 6-agent setup. Saves hours of routing-overhead per week.
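
Questions 4 and 5 can be written down as data plus two small rules, which makes the authority section of each agent.md easy to check for gaps. The sketch below is one possible encoding, assuming a hypothetical `Decision` record and function names; the levels, the big-decision criteria (>$10K, irreversible, brand-level), and the two-hop escalation limit come from the lesson.

```python
from dataclasses import dataclass
from enum import IntEnum


class Authority(IntEnum):
    L3 = 3  # advisor: drafts only, never ships without explicit OK
    L4 = 4  # specialist: owns domain, surfaces big calls
    L5 = 5  # chief: routes and drafts, does not ship without approval


@dataclass
class Decision:
    """Hypothetical description of something an agent wants to ship."""
    cost_usd: float = 0.0
    irreversible: bool = False
    brand_level: bool = False


def needs_operator_approval_to_ship(level, decision):
    """True when the operator must sign off before the agent ships."""
    if level in (Authority.L3, Authority.L5):
        return True  # advisors and the chief never ship on their own
    # L4 decides most things in-domain and surfaces only the big calls.
    return decision.cost_usd > 10_000 or decision.irreversible or decision.brand_level


def escalation_path(level):
    """Ordered list of who a decision passes through on its way up."""
    if level == Authority.L5:
        return ["chief", "operator"]
    return ["specialist", "chief", "operator"]


print(needs_operator_approval_to_ship(Authority.L4, Decision(cost_usd=500)))     # False
print(needs_operator_approval_to_ship(Authority.L4, Decision(cost_usd=25_000)))  # True
print(escalation_path(Authority.L4))  # ['specialist', 'chief', 'operator']
```

Note that this only models shipping decisions. The chief's authority to hand a request to a specialist without asking is a routing action and is not gated here.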

Three diagrams

[Diagram: The default-routing flow]
[Diagram: Reasoning-style splits (6 agents)]
[Diagram: The escalation path (specialist → chief → operator)]

Prompt kit

Three prompts for designing and maintaining a routing model. Save in your CLAUDE.md or a personal snippets file.

The split-test
(Same prompt as in “The split-test” section above.)
Audit a routing decision after the fact
Walk the last 5 requests in this session.
For each, tell me:
- Which agent should have handled it (per the routing table)?
- Which agent actually did?
- If they differ, was it a routing miss or a routing-table gap?

If the chief-of-staff routed wrong, the routing table needs
clearer triggers. If the chief routed correctly but the specialist
took the wrong action, the agent.md authority section needs
clarification.
Find duplicate or overlapping agents
Read the agent.md files for all agents.
For any two agents, ask:
- Do their scopes overlap? List the overlap.
- Do their triggers overlap? Could a single request fire both?
- If yes, the routing table will produce ambiguous routes.

Suggest: merge the agents, redraw the scope boundary, or add
explicit precedence ("if both fire, X wins").
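
The trigger half of the overlap check can also be done mechanically before asking an agent to read the agent.md files. A minimal sketch, assuming the routing table is available as a dictionary of agent name to trigger words; the `trigger_overlaps` function and the sample table (with a deliberate collision) are hypothetical. Scope overlap still needs a read of the agent.md files, since it is not visible in trigger words alone.

```python
from itertools import combinations


def trigger_overlaps(routing_table):
    """Return {(agent_a, agent_b): shared_triggers} for every overlapping pair."""
    overlaps = {}
    for (a, triggers_a), (b, triggers_b) in combinations(routing_table.items(), 2):
        # Compare case-insensitively so "SBA" and "sba" count as the same trigger.
        shared = {t.lower() for t in triggers_a} & {t.lower() for t in triggers_b}
        if shared:
            overlaps[(a, b)] = sorted(shared)
    return overlaps


# Hypothetical table with a deliberate collision on "contract".
table = {
    "CIPHER": ["contract", "agreement", "compliance"],
    "SUMMIT": ["venue", "lease", "contract"],
    "SCRIBE": ["newsletter", "blog"],
}
print(trigger_overlaps(table))  # {('CIPHER', 'SUMMIT'): ['contract']}
```

Each pair in the result is a request that could fire both agents, so it needs one of the three remedies from the prompt: merge, redraw the boundary, or add explicit precedence.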

Apply this — design your roster

30-minute exercise. List your reasoning styles. Apply the split-test. Aim for 5-7. Document authority.

Design or audit your agent roster

Each step takes 5-10 minutes.

  1. List your current agents (or, if you have just one, list the reasoning styles you switch between manually). Engineering. Legal. Content. Ops. Investor narrative. Whatever your work actually has.
  2. Apply the split-test: does any reasoning style appear in 2+ agents? Or get >40% of routes? Either case is a sign your roster needs adjustment.
  3. Aim for 5-7 agents covering distinct reasoning styles. Cut anything redundant; merge anything narrow. If you've never had agents, start by defining 3: chief-of-staff + 2 specialists in your highest-volume reasoning styles. Add more later.
  4. Add the routing table to your CLAUDE.md (Foundations 01) with explicit trigger words per agent. Triggers should be observable phrases or keywords. "design questions" is too vague. "pixel:", "audit the design", "check spacing" are concrete.
  5. Document authority levels (L3/L4/L5) in each agent.md file. Three lines per agent: can decide X, cannot decide Y, surfaces Z for review. Most authority confusion ends here.
Operating tier · what's next

After this lesson