Factory Forensic: Why Ralph Can't Build Productively

Autonomous agent pipeline audit — April 9, 2026

This page documents a forensic analysis of the Forge autonomous agent factory (Ralph). The audit found 5 systemic defects that explain why the factory produces work but cannot reliably deliver value. All numbers were verified against live Supabase queries and filesystem checks.

The Numbers

| Metric | Value | What It Means |
| --- | --- | --- |
| Ralph tasks total | 639 | Factory has been busy |
| Failed tasks | 158 (25%) | 1 in 4 tasks fails |
| Failed with NO outcome | 158/158 (100%) | Every failure is silent: no record of what went wrong |
| Completed with NO outcome | 304/339 (90%) | No proof of what was built |
| Verify items needs_work | 6 | Jason flagged as not actually working |
| Unmerged branches | 235 | Work stuck in branches nobody merges |
| Pending tasks | 28 | Queue loaded, nobody sure what's real |
| Ghost tasks in queue | ~12 of 28 (43%) | Already done or duplicates |
| Reconcile timer | INACTIVE | Self-awareness system wasn't running |
| SESSION-QUEUE staleness | 39 days oldest | Nobody cleans the queue |

The 5 Factory Defects

1 Blind Factory — Ralph has no eyes

304/339 completed = no outcome • 158/158 failed = no outcome • 89% unauditable

The verification loop EXISTS (SQL table with 63 rows, generate-test-guide.sh at 390 lines, dashboard verify page live) but only 56 of 639 tasks have verification entries. Tasks are marked “complete” with no proof artifact, no commit hash, no test result.

FIX: ralph.sh MUST write outcome before marking complete/failed.
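
The rule "outcome before status" can be sketched as a guard on the status change. This is a minimal illustration in Python, not the actual ralph.sh logic: the `Task` shape, the `set_status` helper, and the `running` status used in the usage lines are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Statuses that must never be reached without a recorded outcome.
TERMINAL_STATUSES = {"complete", "failed"}


@dataclass
class Task:
    # Hypothetical task record; the real queue schema may differ.
    task_id: int
    status: str = "pending"
    outcome: Optional[str] = None


class MissingOutcomeError(ValueError):
    """Raised when a terminal status is requested without an outcome."""


def set_status(task: Task, new_status: str, outcome: Optional[str] = None) -> Task:
    """Write the outcome first, then flip the status.

    A terminal status with an empty outcome is rejected, so the task
    keeps its previous status instead of failing or completing silently.
    """
    if new_status in TERMINAL_STATUSES:
        text = (outcome or "").strip()
        if not text:
            raise MissingOutcomeError(
                f"task {task.task_id}: refusing '{new_status}' without an outcome"
            )
        task.outcome = text  # outcome is recorded before the status changes
    task.status = new_status
    return task


if __name__ == "__main__":
    t = Task(task_id=1)
    set_status(t, "running")
    set_status(t, "failed", outcome="tests timed out")
    print(t)
```

Raising instead of silently defaulting the outcome is a deliberate choice in this sketch: a placeholder string would satisfy the check while keeping the failure just as unauditable.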

2 Dead Letters — Work stuck in branches

235 unmerged branches • Work exists but isn't on main

Ralph builds on feature branches (correct per governance), but merge/deploy is a separate step nobody consistently does. Each branch is a dead letter — written, sealed, never delivered.

FIX: Auto-merge on verify — “Verified” click triggers deploy.sh for that branch.
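
One way to wire the "Verified" click to a deploy is a small handler that runs deploy.sh only on a positive verdict. The sketch below is an assumption-heavy illustration: the `on_verdict` and `deploy_command` names are hypothetical, and passing the branch name as the first argument to deploy.sh is assumed, since the source does not describe the script's interface.

```python
import subprocess
from typing import Callable, List, Optional


def deploy_command(branch: str, script: str = "./deploy.sh") -> List[str]:
    """Build the deploy invocation as an argument list (no shell interpolation)."""
    # Assumption: deploy.sh takes the branch name as its first argument.
    if not branch or branch.startswith("-"):
        raise ValueError(f"suspicious branch name: {branch!r}")
    return [script, branch]


def on_verdict(
    branch: str,
    verdict: str,
    run: Optional[Callable[[List[str]], object]] = None,
) -> Optional[List[str]]:
    """Trigger a deploy only when the human verdict is 'verified'.

    Any other verdict (for example 'needs_work') leaves the branch alone.
    The runner is injectable so the handler can be exercised without
    actually executing the script.
    """
    if verdict != "verified":
        return None
    cmd = deploy_command(branch)
    runner = run or (lambda c: subprocess.run(c, check=True))
    runner(cmd)
    return cmd


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    on_verdict("feature/example", "verified", run=print)
```

Building the command as a list rather than a shell string keeps branch names from being interpreted by a shell, which matters when the input ultimately comes from a dashboard click.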

3 Stale State — Map doesn't match territory

SESSION-QUEUE: 39-day items • Reconcile timer: was INACTIVE • MEMORY.md: truncating

Any check of “what to build next” returns a stale answer: the system almost rebuilt the Human Verification Loop, which was 100% built while the plan still said “not started”.

FIX: Reconcile timer restarted. Add freshness checks to queue reads.
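
A freshness check on queue reads could look like the following sketch, which separates items that were touched recently from those that should be re-validated before anyone builds from them. The `updated_at` field, the `split_by_freshness` name, and the 14-day threshold are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple


def split_by_freshness(
    items: List[Dict],
    now: datetime,
    max_age_days: int = 14,  # assumed threshold, tune to the queue's cadence
) -> Tuple[List[Dict], List[Dict]]:
    """Return (fresh, stale) lists based on each item's 'updated_at' timestamp."""
    cutoff = now - timedelta(days=max_age_days)
    fresh, stale = [], []
    for item in items:
        (fresh if item["updated_at"] >= cutoff else stale).append(item)
    return fresh, stale


if __name__ == "__main__":
    now = datetime(2026, 4, 9, tzinfo=timezone.utc)
    queue = [
        {"id": 1, "updated_at": now - timedelta(days=39)},  # as old as the oldest audited item
        {"id": 2, "updated_at": now - timedelta(days=2)},
    ]
    fresh, stale = split_by_freshness(queue, now)
    print("fresh:", [i["id"] for i in fresh])  # fresh: [2]
    print("stale:", [i["id"] for i in stale])  # stale: [1]
```

Stale items are returned rather than dropped, so a reader of the queue can decide whether to re-verify, close, or re-queue them.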

4 Silent Failures — No feedback loop

56/639 tasks have verify entries • 583 in the dark • Zero auto-rework from failures

Ralph fails → no outcome → no rework card → task stays “failed” forever → nobody learns why. The correction loop exists but only fires on 56 tasks.

FIX: generate-test-guide.sh fires on ALL completions, not just some.
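
The coverage gap is a set difference between all task IDs and the task IDs that have a verify entry. The sketch below assumes both ID lists have already been fetched; the `coverage_gap` name and the contiguous ID ranges in the usage lines are illustrative only, chosen to reproduce the 56-of-639 figures.

```python
from typing import Iterable, List, Tuple


def coverage_gap(
    task_ids: Iterable[int], verified_ids: Iterable[int]
) -> Tuple[List[int], float]:
    """Return (task IDs with no verify entry, fraction of tasks covered)."""
    all_ids = set(task_ids)
    covered = all_ids & set(verified_ids)  # ignore verify rows for unknown tasks
    missing = sorted(all_ids - covered)
    ratio = len(covered) / len(all_ids) if all_ids else 1.0
    return missing, ratio


if __name__ == "__main__":
    # Illustrative IDs only: 639 tasks, 56 of them with verify entries.
    missing, ratio = coverage_gap(range(1, 640), range(1, 57))
    print(len(missing))     # 583
    print(f"{ratio:.1%}")   # 8.8%
```

The `missing` list is the backlog that a run of generate-test-guide.sh over all completions would need to cover.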

5 Ghost Queue — Dead work clogs pipeline

12 of 28 pending = ghosts • 6 perceive tasks done on main • 6+ duplicates

Ralph picks up done/duplicate tasks, sees work exists, cycles. Burns compute on nothing.

FIX: Queue hygiene + already-done detection in ralph.sh.
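
Already-done detection can be approximated with two checks per pending task: is its branch already merged, and does its title duplicate an earlier task. The sketch below is hypothetical throughout: the `id`, `title`, and `branch` fields and the `find_ghosts` name are assumptions, and the set of merged branches is expected to be supplied by the caller (for example, parsed from `git branch --merged main`).

```python
import re
from typing import Dict, Iterable, List, Set, Tuple


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return re.sub(r"\s+", " ", title.strip().lower())


def find_ghosts(
    pending: Iterable[Dict], merged_branches: Set[str]
) -> List[Tuple[int, str]]:
    """Return (task id, reason) for tasks that should not be picked up.

    Tasks are scanned in ID order, so the lowest ID in a duplicate group
    is kept and later ones are flagged.
    """
    first_seen: Dict[str, int] = {}
    ghosts: List[Tuple[int, str]] = []
    for task in sorted(pending, key=lambda t: t["id"]):
        key = normalize(task["title"])
        if task.get("branch") in merged_branches:
            ghosts.append((task["id"], "already on main"))
        elif key in first_seen:
            ghosts.append((task["id"], f"duplicate of #{first_seen[key]}"))
        else:
            first_seen[key] = task["id"]
    return ghosts


if __name__ == "__main__":
    # Illustrative queue entries, not real task data.
    pending = [
        {"id": 2, "title": "Build ingest service", "branch": "ingest"},
        {"id": 3, "title": "build  ingest service ", "branch": "ingest-2"},
        {"id": 1, "title": "Merge example feature", "branch": "example"},
    ]
    print(find_ghosts(pending, merged_branches={"example"}))
    # [(1, 'already on main'), (3, 'duplicate of #2')]
```

Title matching is a crude duplicate signal; it would miss reworded duplicates, so the output is best treated as a list of candidates to cancel, not an automatic verdict.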

Queue Snapshot (Pre-Cleanup)

6 perceive merge — DONE on main, marked complete during audit
7 IHub Phase 1 (#766-771) — legitimate, freshly queued
6 knowledge service (#803-815) — 3 duplicate pairs, cancelled during audit
3 KFS standalone (#660-662) — legitimate Phase 2-3
3 /design tasks (#506, 508, 509) — weeks old, still valid
3 clawdrouter (#800-801, 809-810) — 1 pair cancelled during audit

Cleanup Executed During Audit

5 perceive tasks → completed. 5 duplicates → cancelled. Queue: 29 → 19 real tasks. Reconcile timer restarted. Backup at /opt/forge/backups/queue/pending-snapshot-20260409.json

Stale JASON-DEPs

Neural Registry: “blocked on Jason running SQL”
  3 of 7 tables already exist. Real blocker: expired Supabase access token (March 10).
  One 5-minute token refresh at supabase.com/dashboard/account/tokens unblocks everything.

Expert Knowledge Graph: “blocked on Brian's docs”
  PRD is “Ready to Build.” Brian's docs are needed for TEST data, not code.
  Infrastructure pipeline is buildable now without the docs.

NowPage MCP: “blocked on env vars”
  MCP is deployed, working, and in active daily use.
  Queue entry is 30 days stale. Move to Done.

Proposed Solution: 5 Mechanical Fixes

  1. Queue cleanup: mark done tasks complete, cancel duplicates [DONE]
  2. Restart reconcile timer: systemd timer installed and enabled [DONE]
  3. Ralph outcome enforcement: write outcome before status change in ralph.sh [READY NOW]
  4. Verify coverage: generate-test-guide.sh on ALL completions [READY NOW]
  5. Supabase token refresh: new token at supabase.com/dashboard/account/tokens [JASON-DEP]

The Governance Paradox

“We can't build productively until we fix the factory” — but fixing the factory IS building.

The paradox resolves when factory fixes are MECHANICAL (code changes to ralph.sh, queue PATCH calls, timer restarts) not DOCUMENTARY (more rules, more governance docs, more alignment briefs).

The 5 fixes above are wrenches, not blueprints.

Methodology: All numbers verified via live Supabase API (ralph_queue: 639 rows, verification_checklist: 56 rows) and filesystem (git branch --no-merged: 235, scripts/ and services/ reads) during session 346754ca.