Operating · Lesson 04 — Hooks — mechanical enforcement vs. promised future discipline

Hooks — mechanical enforcement

Why hooks beat “agents must remember to.”

18 min read · 60 min apply · prereq: Operating 01

What hooks are

A hook is a script that fires automatically at a specific moment in your workflow. Pre-commit hooks fire before a git commit completes. Pre-tool-use hooks fire before Claude Code runs a particular tool. Session-start hooks fire when a new agent session begins.

The reason hooks matter: they enforce mechanically what would otherwise depend on agent or human discipline. A rule like “don’t commit secrets” can either live as a memory entry (which the agent reads, then sometimes forgets) or as a pre-commit hook (which scans the diff and blocks the commit if it finds a secret pattern). The hook works every time. The memory entry works most of the time.

Mechanical enforcement is always cheaper than promised future discipline, for both agents and humans. This lesson is about how to convert promises into hooks, and how to keep the hook layer healthy.

The story

Early in this work I had a memory entry: “Cipher (the legal/IP agent) reviews every commit that touches authentication code.” The rule was correct. The agent read it on every session start.

It worked maybe 60% of the time.

The other 40%: I’d be deep in a session, refactor an auth helper, commit, and only later notice — Cipher hadn’t flagged the commit. The agent had been heads-down on the actual code work and the memory entry didn’t fire as a precondition; it just sat there as background context.

The fix took five minutes: a pre-commit hook that scanned staged file names for an auth pattern and blocked the commit if any matched, with a message saying “Cipher review required.” From that point forward, the rule fired 100% of the time.

That ratio — ~60% with discipline-based enforcement, ~100% with mechanical enforcement — is the entire argument for hooks. The structural fix is always cheaper to maintain than ongoing vigilance.

Three ways hooks fail

Hooks are simple in concept and easy to get wrong in practice.

01 · The promise hook

claim looks like: Memory entry: "User prefers atomic commits." Or: "Cipher reviews all auth changes."
what’s missing: Nothing fires. The agent reads the preference at session start, then proceeds to make the same kind of mistake the rule was supposed to prevent. A promise without enforcement is noise.
the move: If you can express the rule as a precondition ("don't proceed if X"), write it as a PreToolUse hook. If you can express it as an audit ("flag if Y after the fact"), write it as a PostToolUse hook. Otherwise delete the promise.

02 · The over-eager block

claim looks like: Pre-commit hook blocks every commit that touches /agents/, including yours.
what’s missing: The hook fires too aggressively. Now legitimate work routes through manual override. Within a week the override IS the workflow and the hook is a tax, not a guardrail.
the move: Hooks should fire on rare, specific patterns. If you find yourself overriding a hook >1x/week, the hook's predicate is wrong. Tighten the condition or kill the hook.

03 · The unscoped pattern

claim looks like: PostToolUse hook routes file changes to the right specialist, but the file glob is too broad and routes everything to one agent.
what’s missing: The hook becomes the chief-of-staff. The actual chief-of-staff stops getting routed work. Specialists never see their domain because the hook captures everything.
the move: Scope every hook predicate to the narrowest specific pattern that captures the intent. If you can't write the exact glob/file pattern, you don't yet know what the hook should catch.

The fix in all three cases: scope the hook to the narrowest, most specific predicate that captures the intent. If the hook fires too often, you’ll override it. If it fires too rarely, it’s noise. Tight predicates produce reliable hooks.
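To make "tight predicate" concrete, here is a minimal sketch that contrasts a substring match with a path-component match. The function names and sample paths are hypothetical, chosen only for illustration; the right pattern depends on your own repository layout.

```shell
#!/usr/bin/env bash
# Sketch: two predicates for "this path is auth code".
# Function names and sample paths are illustrative assumptions.

# Loose: fires on any path containing the substring "auth".
# It also matches docs/author-guide.md, which is not auth code.
matches_loose() {
    [[ "$1" == *auth* ]]
}

# Tight: fires only when a whole directory name is "auth", or the
# file name starts with "auth." or "auth_".
matches_tight() {
    local re='(^|/)auth(/|[._][^/]*$)'
    [[ "$1" =~ $re ]]
}

# Small demo: print how each predicate treats each sample path.
for path in src/auth/login.py lib/auth_helpers.py docs/author-guide.md; do
    loose=no
    tight=no
    if matches_loose "$path"; then loose=yes; fi
    if matches_tight "$path"; then tight=yes; fi
    printf '%-22s loose=%-3s tight=%s\n' "$path" "$loose" "$tight"
done
```

The loose version is the kind of predicate that ends up overridden every week; the tight one only fires where a review is plausibly needed.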

The slippage test

The diagnostic for evaluating any existing hook (or proposed one):

The slippage test
Look at this hook.
- What's the specific failure mode it prevents?
- If the hook were removed today, what would slip through tomorrow?
- Is the predicate (the condition that makes it fire) tight enough
  that it almost never false-positives?
- Is the predicate loose enough that the failure mode actually
  triggers it reliably?

If you can't answer all four, the hook needs revision.

The test forces every hook to justify itself in terms of a specific failure mode it prevents.

Four questions every hook must answer:

  • What does it prevent? If you can’t name the specific failure mode in one sentence, the hook is too vague.
  • What slips through if removed? If nothing observably bad happens, the hook is overhead.
  • Is the predicate tight enough? Override rate > once per week means the predicate catches too much.
  • Is the predicate loose enough? If the failure mode keeps slipping through despite the hook existing, the predicate doesn’t catch what it should.

Hooks are most valuable when they enforce specific, recurring, costly failure modes. They’re overhead when they’re aspirational guardrails for things that haven’t actually broken.
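The override-rate question can only be answered if hooks record their own firings. The following is a minimal sketch under stated assumptions: the tab-separated log format (epoch seconds, hook name, outcome) and both function names are invented for illustration, and neither git nor Claude Code writes such a log for you.

```shell
#!/usr/bin/env bash
# Sketch: record hook firings and count overrides for one hook.
# ASSUMPTION: a hypothetical log with lines "epoch<TAB>hook<TAB>outcome".

# Append one event. Outcome is "block" or "override".
log_hook_event() {
    local log_file=$1 hook=$2 outcome=$3
    printf '%s\t%s\t%s\n' "$(date +%s)" "$hook" "$outcome" >> "$log_file"
}

# Print the number of overrides recorded for a hook in the last N days.
overrides_in_window() {
    local log_file=$1 hook=$2 days=$3
    local cutoff=$(( $(date +%s) - days * 86400 ))
    awk -F'\t' -v hook="$hook" -v cutoff="$cutoff" \
        '$1 >= cutoff && $2 == hook && $3 == "override" { n++ } END { print n + 0 }' \
        "$log_file"
}
```

A hook would call `log_hook_event` in both its blocking branch and its bypass branch. If `overrides_in_window <log> <hook> 7` prints more than 1, the predicate is probably catching too much.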

Annotated transcript

A real conversation that converts a chronically-violated memory rule into a working pre-commit hook.

Promise rule → enforced hook

Four operator turns, four agent turns. Result: a 12-line bash hook that fires 100% of the time.

  1. operator

    Cipher (legal/IP agent) is supposed to review every commit that touches authentication code. Half the time he doesn't. I keep adding the rule to memory and it doesn't stick.

  2. agent

    That's a promise hook. Memory entries don't enforce — they describe. The rule "Cipher reviews auth changes" lives in memory but nothing gates on it. The fix is to convert the promise into a hook: a pre-commit script that fires when staged files match an auth pattern, blocks the commit, and routes to Cipher.

  3. operator

    OK. What's the simplest version of that hook?

  4. agent

    A short bash script at `.claude/hooks/pre-commit-cipher.sh`:

    ```
    #!/usr/bin/env bash
    staged=$(git diff --cached --name-only)
    if echo "$staged" | grep -qE '(auth|token|session|password)'; then
      echo "❌ Auth changes detected: Cipher review required."
      echo "   Files: $staged"
      exit 1
    fi
    exit 0
    ```

    Make it executable and copy or symlink it to `.git/hooks/pre-commit`. Now any commit touching a file whose name matches the auth pattern gets blocked at the git layer, not in conversation.

  5. operator

    But what if I'm legitimately editing auth code and Cipher already reviewed it? Do I have to delete the hook every time?

  6. agent

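    Add an opt-out clause. Conventional pattern: a tag in the commit message lets you bypass. One catch: git runs `pre-commit` before it has the commit message, so a check that reads the message belongs in the `commit-msg` hook, which receives the path of the message file as `$1`.

    ```
    msg=$(cat "$1" 2>/dev/null)
    if echo "$msg" | grep -q 'cipher-reviewed'; then
      exit 0 # bypass: operator confirmed Cipher reviewed
    fi
    ```

    Put this before the auth check and install the script as `.git/hooks/commit-msg` instead of `pre-commit`. Now you commit with `git commit -m "refactor auth - cipher-reviewed"` and the hook honors the operator's explicit bypass. The tag stays in the commit history as evidence.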

  7. operator

    What if Cipher needs to AUDIT after the fact instead of blocking before?

  8. agent

    PostToolUse hook. Same pattern, different timing. It fires after a tool call (or commit) succeeds. It doesn't block; it logs and surfaces. Useful for things like: flag every test that took >30s for a slowness audit, track every external API call, notify when a file in a watched directory changes. The hook's job is signal, not blocking.

    Rule of thumb: PreToolUse if a wrong action is unrecoverable (commit pushed, money spent, customer notified). PostToolUse if all you need is detectability (track this, surface that). Use PreToolUse sparingly. PostToolUse can be cheaper because it never breaks the workflow.
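For the "(or commit)" case in the last turn, an audit-style hook might look like the sketch below. It assumes installation as git's `post-commit` hook, which runs after the commit exists and cannot block it. The log path and function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of an audit-style hook: it records, it never blocks.
# ASSUMPTIONS: installed as .git/hooks/post-commit; the log path and
# the function names are illustrative.

AUDIT_LOG=${AUDIT_LOG:-.claude/hooks/auth-audit.log}

# Read file names on stdin, print only the auth-looking ones.
filter_auth_files() {
    grep -E '(auth|token|session|password)' || true
}

# Read file names on stdin, append "<commit> <file>" lines to the log.
record_auth_changes() {
    local commit=$1 file
    mkdir -p "$(dirname "$AUDIT_LOG")"
    while IFS= read -r file; do
        printf '%s %s\n' "$commit" "$file" >> "$AUDIT_LOG"
    done
}

# Hook entry point: inspect the commit that was just created.
main() {
    local commit
    commit=$(git rev-parse --short HEAD)
    git diff-tree --no-commit-id --name-only -r --root HEAD \
        | filter_auth_files \
        | record_auth_changes "$commit"
    return 0   # post-commit cannot block, so always succeed
}

# When installed as a hook, end the file with a call to:  main
```

The same file-name pattern as the blocking hook is reused on purpose: only the timing and the consequence change, not the predicate.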

The hook lifecycle — pick the right moment

Picking the wrong hook moment is the single most common bug. Each fires at a distinct point. Each serves a different purpose.

| Hook | Fires | Use for | Example |
| --- | --- | --- | --- |
| SessionStart | When a new agent session begins | Loading context, printing the day’s plan, validating the env | Print today’s sprint contract + uncommitted changes |
| UserPromptSubmit | Before the agent processes operator input | Adding context, redacting secrets, blocking dangerous prompts | Block prompts that contain raw API tokens |
| PreToolUse | Before a specific tool runs (Bash, Edit, etc.) | Hard guardrails: actions whose wrong outcome is costly or unrecoverable | Block any commit that contains sk-, rpa_, or other secret patterns |
| PostToolUse | After a tool call succeeds | Audit, logging, downstream notifications, cleanup | Log every external API call to logs/network.log |
| Stop | When a session ends | Cleanup, checkpoint save, summary generation | Write a session summary to 10-Session-Logs/ |

Rule of thumb: PreToolUse for blocking, PostToolUse for auditing. Use PreToolUse sparingly because it interrupts workflow. Use PostToolUse generously because it’s nearly free.
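A blocking PreToolUse guard for the Bash tool could be sketched as follows. This rests on assumptions about the Claude Code hook interface that you should check against the hooks documentation for your version: that the hook receives a JSON payload on stdin, that the shell command sits at `.tool_input.command`, and that exit code 2 blocks the call and returns stderr to the agent. It needs `jq`, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a PreToolUse guard for the Bash tool.
# ASSUMPTIONS (verify against the Claude Code hooks docs):
#   - the hook receives a JSON payload on stdin
#   - the shell command is at .tool_input.command
#   - exit code 2 blocks the tool call and shows stderr to the agent
# Requires jq. Function names are illustrative.

# True when the text contains a secret-shaped string.
has_secret_shape() {
    grep -qE -e 'sk-[A-Za-z0-9]{20,}' -e 'ghp_[A-Za-z0-9]{30,}' <<<"$1"
}

# Decide on one payload: return 2 to block, 0 to allow.
check_payload() {
    local payload=$1 cmd
    cmd=$(jq -r '.tool_input.command // empty' <<<"$payload")
    if has_secret_shape "$cmd"; then
        echo "Blocked: command contains a secret-shaped string." >&2
        return 2
    fi
    return 0
}

# When installed as a hook, end the file with:
#   check_payload "$(cat)"; exit $?
```

Keeping the predicate (`has_secret_shape`) separate from the wiring makes it possible to test the condition without a live agent session.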

Your first hook — the secret scanner

The single highest-value first hook for any project: a pre-commit secret scanner. Catches secrets before they end up in git history (where they’re much harder to remove).

#!/usr/bin/env bash
# .claude/hooks/pre-commit-secret-scan.sh
# Blocks commits containing secret-shaped strings.

set -euo pipefail

SECRET_PATTERNS=(
    'sk-[A-Za-z0-9]{20,}'                  # OpenAI-style keys
    'sk-(ant|proj)-[A-Za-z0-9_-]{20,}'     # Anthropic (sk-ant-) and OpenAI project keys
    'rpa_[A-Za-z0-9]{30,}'                 # rpa_-prefixed keys (e.g. RunPod)
    'hf_[A-Za-z0-9]{30,}'                  # HuggingFace keys
    'ghp_[A-Za-z0-9]{30,}'                 # GitHub tokens
    'eyJ[A-Za-z0-9_-]{20,}\.eyJ'           # JWT tokens
)

# Scan ONLY newly-added lines in the staged diff.
# (Removals don't matter: those secrets are leaving the repo.)
# The second grep drops the "+++ b/path" file headers.
diff_added=$(git diff --cached -U0 | grep -E '^\+' | grep -vE '^\+\+\+ ' || true)

# Allow opt-out via pragma comment on the same line.
diff_added=$(grep -v 'pragma: allowlist secret' <<<"$diff_added" || true)

for pattern in "${SECRET_PATTERNS[@]}"; do
    # Here-string rather than `echo | grep -q`: under pipefail, grep -q
    # exiting early on a large diff can fail the pipeline and hide a match.
    if grep -qE -e "$pattern" <<<"$diff_added"; then
        echo "❌ Possible secret detected matching: $pattern"
        echo "   If this is a test fixture, add: # pragma: allowlist secret"
        echo "   Otherwise: scrub the secret and re-stage."
        exit 1
    fi
done

exit 0

Save at .claude/hooks/pre-commit-secret-scan.sh, make executable (chmod +x), then symlink or copy to .git/hooks/pre-commit.

Now any commit that includes a secret-shaped string blocks at the git layer. The pragma allowlist lets you commit deliberate test fixtures. The structure scales: add patterns to the array as you discover new secret formats.
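The install step can itself be scripted so it is repeatable after a fresh clone. This is a minimal sketch, not the lesson's own installer: the function name and argument order are hypothetical, and it assumes a regular checkout where `.git` is a directory (not a worktree or submodule, where `.git` is a file).

```shell
#!/usr/bin/env bash
# Sketch: link a versioned hook script into .git/hooks.
# ASSUMPTIONS: .git is a directory; function name and arguments are illustrative.

# Usage: install_git_hook <repo-root> <script-path-relative-to-root> <git-hook-name>
install_git_hook() {
    local repo_root=$1 script_rel=$2 hook_name=$3
    if [[ ! -f "$repo_root/$script_rel" ]]; then
        echo "No such hook script: $repo_root/$script_rel" >&2
        return 1
    fi
    chmod +x "$repo_root/$script_rel"
    mkdir -p "$repo_root/.git/hooks"
    # The link target is resolved relative to .git/hooks, hence the "../../".
    ln -sf "../../$script_rel" "$repo_root/.git/hooks/$hook_name"
}
```

For example, `install_git_hook . .claude/hooks/pre-commit-secret-scan.sh pre-commit` run from the repository root. A symlink keeps the installed hook in step with the versioned script, whereas a copy would need re-copying after every edit.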

Three diagrams

[Diagram: Promise → hook conversion]
[Diagram: Where each hook moment fires in the lifecycle]
[Diagram: The Pre vs Post decision tree]

Prompt kit

Three prompts for designing, auditing, and maintaining hooks. Save in your CLAUDE.md or a personal snippets file.

The slippage test
Look at this hook.
- What's the specific failure mode it prevents?
- If the hook were removed today, what would slip through tomorrow?
- Is the predicate (the condition that makes it fire) tight enough
  that it almost never false-positives?
- Is the predicate loose enough that the failure mode actually
  triggers it reliably?

If you can't answer all four, the hook needs revision.

Convert a memory rule into a hook
I have this rule in memory: "<paste the rule>"

Walk through:
1. What concrete event would this rule prevent?
2. Can I detect that event observably (file pattern, command, output)?
3. Is the right moment Pre (before the action) or Post (after)?
4. What's the smallest hook script that enforces it?

Output the hook script + the install instructions.
If the rule isn't enforceable as a hook, tell me why and suggest
either rewriting the rule or moving it to CLAUDE.md as a project
rule (which IS load-bearing on agent behavior, even if not enforced
mechanically).

Audit existing hooks for over-block / over-scope
Read the hooks in .claude/hooks/ and .git/hooks/.
For each hook, tell me:
- How often it has fired in the last 30 days (check logs if available)
- How many of those firings were legitimate blocks vs operator overrides
- If override rate > once per week, the hook's predicate is wrong

Suggest tightening the predicate, narrowing the file scope, or
killing the hook entirely.

Apply this — write your first hook

60-minute exercise. The first hook is the slowest. Subsequent ones take 10-15 minutes each.

Build your first hook

Each step takes ~10 minutes.

  1. Pick one rule you keep having to remind Claude (or yourself) about.
     Examples: "don't commit secrets," "check CV licensing before merging," "don't auto-bump dependencies."
  2. Write the smallest possible hook that enforces it. Five lines of bash usually suffice.
     Use the pattern from the transcript: detect → echo → exit 0 (allow) or exit 1 (block).
  3. Install at .claude/hooks/<name>.sh and (if it's a git hook) symlink to .git/hooks/.
     The script is versioned in the repo, reviewable in code, and survives a fresh clone. The symlink under .git/ is not versioned, so re-create it after cloning.
  4. Test it: trigger the failure case deliberately. Watch the hook fire.
     If it doesn't fire, the predicate is wrong. If it fires too aggressively, the predicate is too broad. Iterate.
  5. Add an opt-out clause if the hook will sometimes need legitimate bypass.
     Use a commit-message tag or environment variable that's auditable. Never just delete the hook.
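Step 4 can be done without touching your real repository. The sketch below tries a commit in a throwaway repo with the hook installed and reports whether it was blocked. The function name and arguments are hypothetical, and it assumes no global git setting (such as commit signing or a custom `core.hooksPath`) interferes with the test commit.

```shell
#!/usr/bin/env bash
# Sketch: exercise a pre-commit hook in a throwaway repository.
# ASSUMPTIONS: git is installed; no global config makes commits fail or
# redirects hooks. Function name and arguments are illustrative.

# Usage: hook_blocks <hook-script> <file-content>
# Returns 0 if the commit was blocked, 1 if it went through.
hook_blocks() {
    local hook content=$2 repo
    hook=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    repo=$(mktemp -d)
    (
        cd "$repo" || exit 2
        git init -q .
        git config user.email "hook-test@example.invalid"
        git config user.name "Hook Test"
        cp "$hook" .git/hooks/pre-commit
        chmod +x .git/hooks/pre-commit
        printf '%s\n' "$content" > sample.txt
        git add sample.txt
        # A blocking hook makes git commit exit non-zero.
        ! git commit -q -m "hook test" >/dev/null 2>&1
    )
    local status=$?
    rm -rf "$repo"
    return "$status"
}
```

Run it twice per hook: once with content that should trip the predicate and once with content that should not. A hook that blocks both is too broad, and one that blocks neither is not firing.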
Operating tier · what's next

After this lesson