Codex — Production Guide
Audience: the operator (Tyler), or a future Claude session running the production swarm.
Purpose: how to advance the Fast Track curriculum efficiently, without per-batch hand-holding.
TL;DR
Three slash commands and four scripts. You invoke /produce-next (or /audit-next when the queue is empty), Claude runs the swarm, the integrator handles validation and commits. Repeat until done.
The system
Source of truth: manifests/production/plan.json
Built by scripts/build_production_plan.py. Parses every plans/fasttrack/*.md audit, dedups gap units, marks them shipped or queued by inspecting content/, and topologically sorts by prereqs.
Key fields:
units[<id>]— every unit (shipped + queued) with status, source audits, prereqs.queue_topo_order— the list of queued unit ids in dependency order.audits_done— completed audits with P1/P2/P3/P4 counts.books_unaudited— FT booklist rows that lack an audit file.counts— top-line burn-down numbers.
Re-run build_production_plan.py whenever a new audit lands or a unit ships, to refresh the queue.
Producer: scripts/produce_next.py
python3 scripts/produce_next.py # 1 brief
python3 scripts/produce_next.py --batch 4 # 4 briefs (parallel-safe)
python3 scripts/produce_next.py --id 05.09.03 # target a specific unit
Picks the next queued unit (topo order; with --batch N, picks N units from distinct chapters to avoid file collisions) and prints a complete agent brief: target file path, frontmatter scaffold, peer references, hard constraints, continuity-metric requirements, and a math-content seed pulled from the audit's punch-list excerpt.
The brief is meant to be pasted directly into an Agent({subagent_type: 'general-purpose', prompt: ..., run_in_background: true}) call.
Integrator: scripts/integrate_unit.py
python3 scripts/integrate_unit.py 05.09.03 # validate + register
python3 scripts/integrate_unit.py 05.09.03 --commit # + commit
python3 scripts/integrate_unit.py 05.09.03 --commit --push # + push to Forgejo
After an agent writes the unit file, the integrator:
- Locates the file by id.
- Runs the validator. Aborts with exit 1 if not 27/27.
- Adds a stub catalog entry to
docs/catalogs/CONCEPT_CATALOG.mdif missing. - Adds prereq + successor edges to
manifests/deps.json(idempotent). - Marks the unit shipped in the production plan.
- Runs
gen_placeholders.pyto back-fill any missing SVGs. - Runs
measure_continuity.pyand confirms thresholds hold. - Optionally commits + pushes.
Auditor: scripts/audit_next_book.py
python3 scripts/audit_next_book.py # next unaudited book
python3 scripts/audit_next_book.py --ft-entry 1.04
Picks the next book from the unaudited list and emits an audit-agent brief in the same shape as the existing 9 audits. After the agent writes plans/fasttrack/<slug>.md, re-run build_production_plan.py to ingest the new gap units.
Dashboard: scripts/dashboard.py
python3 scripts/dashboard.py
Prints burn-down, next 5 queued units, last 5 commits, validator + continuity headlines, and a session-count estimate.
Slash commands
/produce-next— full produce-validate-integrate-commit-push cycle. Pass--batch 4as$ARGUMENTSfor parallel runs./audit-next— generate the next audit, run the agent, refresh the plan./dashboard— one-screen status.
These live at .claude/commands/*.md.
Recommended cadence
Once you trust the swarm:
- Daily (or every other day): one
/produce-next --batch 4cycle. Adds ~4 units. ~30-45 min wall-clock if everything goes smoothly. ~1-1.5M agent tokens. - Weekly or when the queue runs low: one
/audit-nextcycle to add a new book's gap units to the queue. - After every 5-10 produced units: spot-check the §M Master prose for templated voice. If you see Cannas-style boilerplate creeping in, schedule a deepening pass.
For autonomous-style operation (you're not at the keyboard), use the loop skill:
/loop 30m /produce-next
This runs /produce-next every 30 minutes for as long as the session is alive. Caveats: each loop iteration consumes meaningful compute; agent failures still need manual recovery; and the session-end happens when you close it. Treat this as "supervised autopilot," not "set and forget."
Quality gates
The integrator enforces:
- 27/27 validator — automatic. Failed integration is rolled back.
- All 6 continuity thresholds — measured on every integrate. A regression triggers a warning (not a hard fail).
- Catalog entry exists — auto-stub if missing; refine manually later.
- Deps.json updated — auto.
The integrator does NOT enforce:
- Master prose §M standard — these are the human-judgment items in
docs/specs/QUALITY_RUBRIC.md§M. Templated voice creeps in occasionally; review periodically. - Mathematical correctness — the validator catches phrasing slop, not math errors. Spot-check by reading.
- Pedagogical coherence — cross-references should make sense beyond just resolving syntactically.
Cost expectations (rough)
Per produce-batch of 4 agents:
- Token cost: ~1-1.5M (each agent ~250-400k).
- Wall-clock: ~5-15 min for agents + ~2-3 min orchestrator overhead per integration.
Per audit:
- Token cost: ~250-500k.
- Wall-clock: ~5-15 min agent.
Per session (a 4-batch run + 1 audit + integration):
- Token cost: ~5-7M agent + ~500k orchestrator.
- Wall-clock: 45-90 min.
To finish the FT punch-list (estimate 400-600 net-new units + 80 audits):
- ~100-150 produce-batches + 80 audit batches.
- Sustained at one session/day: ~6-8 months of calendar time.
- Total agent compute: roughly $5-15K depending on prevailing rates.
Failure modes and recovery
Agent timeout / stream-idle
Has happened in this session. Recovery: launch a manual replacement agent with the same brief, OR write the unit by hand. After 3+ failures in a row, stop launching — infrastructure is degraded.
Validator fails on a fresh unit
The most common failures (in order of frequency):
- Prohibited phrasing (
trivial,non-trivial,obviously,clearly). Rewrite using vanishes / identity / vanishing / etc. - Beginner has
\partial— rewrite as "the boundary" in prose. - Paragraph >250 words — split.
- Catalog id missing — integrator stubs it; re-validate.
- Pending prereq — set
pending_prereqs: trueif peers are in flight.
Continuity threshold regression
Usually the new unit's Bridge or Synthesis is missing the required forward-promise / synthesis-claim phrasings. Inspect the metrics with python3 scripts/measure_continuity.py and add the missing phrases.
Numbering collision
Two audits suggested the same unit id. Re-run build_production_plan.py (it dedups) and pick distinct ids manually for the agents.
Catalog drift
If the plan reports queued units that are actually shipped, the catalog id has drifted from the file id. Run build_production_plan.py — it rebuilds from the file system, not the catalog.
Backlog (future scaffolding work)
Things this guide doesn't yet automate:
- Per-chapter editorial polish queue — identify units with templated Bridge/Synthesis (regex-detect) and schedule deepening agents per chapter.
- Lean re-build automation —
scripts/lean_health.pywould catch Mathlib-rename breakages and propose fixes. - Real-diagram prioritisation — most units still have abstract auto-generated SVGs; a queue ranked by view-count (once site analytics ship) would prioritise the highest-impact diagrams.
- Reviewer-attestation tracking — REVIEWER_PLAN.md flags this as the actual scaling bottleneck. The integrator should write
human_reviewer: TBDrows to a queue, and an attestation tool should let the reviewer mark units approved.
These are separate scaffolding sessions when the present queue gets exhausted.
File map
scripts/
build_production_plan.py # build/refresh the master plan
produce_next.py # next-unit-brief generator
integrate_unit.py # validate + register + commit
audit_next_book.py # next-audit-brief generator
dashboard.py # one-screen status
validate_unit.py # (existing) per-unit validator
validate_all.py # (existing) corpus validator
measure_continuity.py # (existing) continuity scoring
gen_placeholders.py # (existing) placeholder SVG filler
add_continuity_layer.py # (existing) one-time continuity-layer pass
manifests/
production/plan.json # master plan (machine-readable)
deps.json # unit DAG
units/ # per-unit pipeline-status manifests
campaign.json # multi-strand orchestration ledger
docs/
plans/PRODUCTION_PLAN.md # human-readable rollup of plan.json
plans/PRODUCTION_GUIDE.md # this file
catalogs/CONCEPT_CATALOG.md # canonical concept registry
catalogs/FASTTRACK_BOOKLIST.md # 92-row book-list
plans/fasttrack/ # audit files (one per book)
.claude/commands/ # slash commands