jpicklyk edited this page May 4, 2026 · 2 revisions

Autonomous Drain — Ralph Loop

Prerequisites: Plugin: Skills and Hooks installed and working · Node 18+ available on the host · Recommended: Note Schemas configured for the work types you intend to drain

Cross-references: Quick Start · Workflow Guide §10 — Claim Mechanism · Consumer-Mode Deployment · API Reference — claim_item · Self-Improving Workflow

What You Get

A queue-drain pattern that walks unclaimed work items end-to-end without human steering between items
Fresh context per iteration — each item runs in its own claude -p process with its own git worktree
Circuit breakers on two axes (consecutive gate failures, consecutive errors) plus a hard iteration cap
Schema-driven termination — what "done" means is whatever the item's schema declares
Smart worktree cleanup — runs that produced no commits get auto-removed; runs with real diffs are preserved for inspection
The first claim-mode skill shipped in the public plugin

What Ralph Loop Is

Ralph is a queue-drain pattern named after Geoff Huntley's canonical loop. The shape:

while queue has items:
 spawn fresh process → claim one item → drive it through its schema → exit

The value comes from carving work into independent context windows. Each iteration is a new OS process — fresh memory, fresh worktree, isolated logs, clean exit code. Context never accumulates across iterations.

This is fundamentally different from a slash-command loop running inside a single Claude session. That pattern accumulates context across iterations even with compaction, and it is the variant Huntley publicly criticized when other Ralph implementations shipped it. The Ralph loop in this plugin takes the other form: a script driver that spawns a fresh process per iteration.

The loop driver is a launcher, not a planner. It does not decide which items to work on — that's the filter you give it. It does not decide what "done" looks like — that's the item's schema. It does not orchestrate dependencies — claim-time contention handles concurrency. The driver's only job is: spawn iteration, parse outcome, decide whether to continue.
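
The "spawn iteration, parse outcome, decide whether to continue" job can be sketched in a few lines. This is a minimal illustration and not the shipped `ralph-loop.mjs`: `drain` and the `runIteration` callback are hypothetical names, and `runIteration` stands in for spawning `claude -p`, waiting for it to exit, and parsing its outcome. Circuit breakers are left out to keep the shape visible.

```javascript
// Hypothetical sketch of the driver's control flow, not the shipped script.
// `runIteration` is an assumed callback: spawn one process, return its outcome.
async function drain(runIteration, { max = 10 } = {}) {
  const statuses = [];
  for (let iter = 1; iter <= max; iter++) {
    const outcome = await runIteration(iter); // one fresh process per iteration
    statuses.push(outcome.status);
    if (outcome.status === 'no-item') break;  // queue drained: stop cleanly
  }
  return statuses;
}
```

Nothing carries over between calls to `runIteration` except the list of statuses, which mirrors the fresh-context property described above.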

Architecture

Four artifacts cooperate to make one iteration work:

Artifact	Role
`scripts/ralph-loop.mjs`	Loop control: spawns `claude -p` per iteration, parses outcomes, runs circuit breakers, manages worktree cleanup. Schema-agnostic — has no opinion on what items contain
`skills/ralph/SKILL.md`	Launcher: invoked as `/task-orchestrator:ralph` from a Claude Code session. Helps you pick filter and bounds, previews the queue, emits the exact `node ralph-loop.mjs` command
`skills/ralph/iteration-prompt.md`	Per-iteration agent workflow: claim → invoke `/schema-workflow` → commit → emit `RALPH_OUTCOME` marker
`output-styles/ralph-iteration.md`	Per-iteration mode (passed to each `claude -p` via `--settings`): suppresses orchestrator chrome, encodes iteration discipline (schema is contract, no auto-memory writes, no further dispatch)

The launcher skill and the iteration prompt are decoupled. The skill helps a human assemble a command; the prompt is what the spawned iteration agent reads. Either can be invoked directly without the other.

Quick Start

Drain the bug-fix backlog with defaults:

/task-orchestrator:ralph tag=bug-fix

The skill walks you through filter, bounds, and queue preview, then emits the command to run. Copy it into a separate terminal:

node claude-plugins/task-orchestrator/scripts/ralph-loop.mjs --filter "tag=bug-fix"

Each iteration spawns a fresh claude -p --worktree=ralph-<id> process and works one item end-to-end. The loop exits when the queue is empty, the iteration cap is hit, or a circuit breaker trips.

To preview the iteration command without running it:

node claude-plugins/task-orchestrator/scripts/ralph-loop.mjs --dry-run --filter "tag=bug-fix"

Configuration

Filter expressions

Filter expressions narrow which queue items are eligible. Keys combine with AND (space-separated):

Key	Meaning
`tag=<value>`	Items whose `tags` field contains `<value>` (substring match)
`type=<value>`	Items with this exact `type`
`priority=<value>`	Items at this priority — `high`, `medium`, or `low`
`parentId=<uuid-or-prefix>`	Only descendants of this container; full UUID or 4+ char hex prefix

Examples:

# All high-priority bug fixes
--filter "tag=bug-fix priority=high"
# Quick fixes only
--filter "type=quick-fix"
# Everything in a specific container
--filter "parentId=89d02e32"
# Anything claimable (no filter — drain whatever is at the top)
# Omit the --filter flag entirely

If the filter is empty, the loop will pick up any unclaimed queue item. Filter scoping is the primary safety mechanism — set it precisely.
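
To make the filter grammar concrete, here is a rough sketch of how a space-separated expression might be split and validated. `parseFilter` is a hypothetical helper, not the driver's actual parser; rejecting unknown keys is an assumption, and values containing spaces are not handled.

```javascript
// Hypothetical parser for the --filter expression; key names come from the table above.
const FILTER_KEYS = new Set(['tag', 'type', 'priority', 'parentId']);
const PRIORITIES = new Set(['high', 'medium', 'low']);

function parseFilter(expr = '') {
  const filter = {};
  for (const token of expr.trim().split(/\s+/).filter(Boolean)) {
    const eq = token.indexOf('=');
    if (eq <= 0 || eq === token.length - 1) {
      throw new Error(`malformed filter token: ${token}`);
    }
    const key = token.slice(0, eq);
    const value = token.slice(eq + 1);
    // Assumption: unknown keys are rejected rather than silently ignored.
    if (!FILTER_KEYS.has(key)) throw new Error(`unknown filter key: ${key}`);
    if (key === 'priority' && !PRIORITIES.has(value)) {
      throw new Error(`priority must be high, medium, or low: ${value}`);
    }
    // Full UUID or a hex prefix of at least four characters.
    if (key === 'parentId' && !/^[0-9a-f]{4}[0-9a-f-]*$/i.test(value)) {
      throw new Error(`parentId must be a UUID or 4+ char hex prefix: ${value}`);
    }
    filter[key] = value; // every key present must match: AND semantics
  }
  return filter;
}
```

An empty expression yields an empty object, which corresponds to the "anything claimable" case.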

Loop bounds

Flag	Default	Purpose
`--max <n>`	`10`	Hard iteration cap; loop exits after this many regardless of outcome
`--gate-budget <n>`	`3`	Consecutive `gate-blocked` outcomes before the loop exits
`--error-budget <n>`	`2`	Consecutive `error` outcomes before the loop exits
`--budget <usd>`	`5`	Per-iteration USD cap, passed to `claude --max-budget-usd`
`--ttl <seconds>`	`1800` (30 min)	Claim TTL per iteration; range 60–86400
`--model <name>`	`sonnet`	Model for the iteration agent
`--actor <id>`	`ralph-<pid>-<timestamp>`	Actor id used for `claim_item`
`--base-ref <ref>`	`origin/main`	Upstream ref for cleanup "ahead of base" detection — set to `origin/master`, `origin/develop`, etc. for projects with a different default branch
`--cleanup-on-terminal`	`true`	Smart-cleanup worktrees after `terminal` and `no-item` outcomes
`--no-cleanup`	—	Disable smart cleanup; preserve all worktrees regardless of state

The dual circuit breakers are critical — single-condition exits are the most-warned-against Ralph pitfall. Heuristic exit (queue empty) and explicit budget exits combine to cap blast radius even when the heuristic misfires.
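
As an illustration of how the defaults and the TTL range in the table fit together, the sketch below merges caller overrides onto the documented defaults and rejects out-of-range values. `resolveBounds` and its camel-cased option names are hypothetical; requiring positive integers for the counters is an assumption, not documented behavior.

```javascript
// Defaults copied from the table above; the helper itself is a hypothetical sketch.
const DEFAULTS = Object.freeze({
  max: 10,          // --max
  gateBudget: 3,    // --gate-budget
  errorBudget: 2,   // --error-budget
  budget: 5,        // --budget (USD per iteration)
  ttl: 1800,        // --ttl (seconds)
  model: 'sonnet',  // --model
  baseRef: 'origin/main', // --base-ref
});

function resolveBounds(overrides = {}) {
  const bounds = { ...DEFAULTS, ...overrides };
  // Assumption: counters and TTL must be positive integers.
  for (const key of ['max', 'gateBudget', 'errorBudget', 'ttl']) {
    if (!Number.isInteger(bounds[key]) || bounds[key] < 1) {
      throw new RangeError(`${key} must be a positive integer`);
    }
  }
  // Documented range for the claim TTL.
  if (bounds.ttl < 60 || bounds.ttl > 86400) {
    throw new RangeError('ttl must be between 60 and 86400 seconds');
  }
  return bounds;
}
```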

Iteration Lifecycle

Each iteration is exactly one item's journey through its schema. The driver script does not understand the schema — it just spawns the process and parses the outcome.

Driver spawns iteration:
 claude -p --worktree=ralph-<pid>-<ts>-<iter> \
 --settings '{"outputStyle":"task-orchestrator:ralph-iteration"}' \
 --permission-mode bypassPermissions \
 --max-budget-usd <budget> \
 --output-format json \
 --model <model> \
 "<rendered iteration prompt>"
 |
 v
Iteration agent (single claude -p):
 1. Query queue items matching filter (priority-ordered)
 2. claim_item on top candidate (try next on already_claimed)
 3. Invoke /schema-workflow with claimed UUID
 — fills required notes per schema guidance
 — does the work each note describes (code changes, research, etc.)
 — runs verification the schema specifies
 — advances item through phase gates
 4. Commit any file changes
 5. Emit final message: RALPH_OUTCOME: {"status":"<status>","itemId":"<uuid>",...}
 |
 v
Driver parses outcome:
 — Updates counters and circuit breakers
 — Renames worktree to ralph-<short-uuid>-<iter> if outcome carried itemId
 — Runs smart cleanup if outcome is terminal or no-item
 — Decides whether to continue or exit

The iteration agent never calls complete_tree or advances beyond what the schema declares. If the schema's terminal phase prescribes git push or gh pr create, the iteration follows it. If the schema's terminal phase is just "fill the final note", that's the entire end state.
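
The spawn step in the diagram above amounts to assembling an argument list. The sketch below shows one way to build it; `buildIterationArgs` is a hypothetical helper, and only the flags shown in the diagram are used. Passing the result as an array to `child_process.spawn` would avoid shell quoting of the JSON settings and the prompt.

```javascript
// Hypothetical helper assembling the argv from the lifecycle diagram.
function buildIterationArgs({ worktree, budget, model, prompt }) {
  return [
    '-p',
    `--worktree=${worktree}`,
    '--settings', JSON.stringify({ outputStyle: 'task-orchestrator:ralph-iteration' }),
    '--permission-mode', 'bypassPermissions',
    '--max-budget-usd', String(budget),
    '--output-format', 'json',
    '--model', model,
    prompt, // the rendered iteration prompt goes last, as a single argument
  ];
}
```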

Outcomes & Circuit Breakers

The iteration agent emits RALPH_OUTCOME: {...} as its final message. The driver maps each status to circuit-breaker behavior:

Status	Meaning	Circuit-breaker effect
`terminal`	Item reached terminal role per its schema	Counter `✓`; resets gate-failure and error counters
`gate-blocked`	A required note couldn't be filled autonomously (e.g., needs external input the iteration can't get)	Counter `⊘`; increments consecutive gate-failure counter
`error`	Tool error, build failure, claim error, budget cap hit	Counter `✗`; increments consecutive error counter
`skip`	All candidates already claimed (contention) or item became terminal during a race	Counter `—`; no counter changes
`no-item`	No items match the filter — queue drained	Loop exits cleanly

Circuit breakers are consecutive, not cumulative. A single error followed by a successful iteration resets the error counter. This prevents an old failure from haunting the rest of the drain — but back-to-back failures still trip the breaker as intended.
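
The status table can be read as a small state machine. The following is a sketch under stated assumptions, not the driver's code: `applyOutcome`, the state shape, and the `stop` reason strings are hypothetical. It follows the table literally, so only `terminal` resets the counters.

```javascript
// Hypothetical reducer over iteration outcomes.
// State shape (assumed): { gateFailures, errors, stop }.
function applyOutcome(state, status, { gateBudget = 3, errorBudget = 2 } = {}) {
  const next = { ...state, stop: null };
  switch (status) {
    case 'terminal':
      next.gateFailures = 0; // success resets both consecutive counters
      next.errors = 0;
      break;
    case 'gate-blocked':
      next.gateFailures += 1;
      if (next.gateFailures >= gateBudget) next.stop = 'gate-budget';
      break;
    case 'error':
      next.errors += 1;
      if (next.errors >= errorBudget) next.stop = 'error-budget';
      break;
    case 'skip':
      break; // contention: no counter changes
    case 'no-item':
      next.stop = 'queue-empty';
      break;
    default:
      throw new Error(`unknown status: ${status}`);
  }
  return next;
}
```

With the default budgets, the sequence `error`, `terminal`, `error` leaves the error counter at 1 and the loop running, while `error`, `error` sets `stop` to `error-budget`.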

Driver exit codes

Code	Meaning
`0`	Loop completed normally (queue empty, iteration cap reached, or gate-failure budget exhausted)
`2`	Loop aborted because the consecutive-error budget was exhausted
`64`	CLI argument error
`70`	Could not read iteration prompt
`130`	Interrupted (SIGINT received and forwarded to iteration)
`143`	Terminated (SIGTERM received and forwarded to iteration)

Individual errored iterations during an otherwise healthy drain do not change the loop-level exit code — they are visible in the summary instead. Exit 2 is reserved for the aborted-on-error-budget case so callers (CI jobs, /loop invocations, fleet supervisors) can distinguish "completed with some errors" from "loop gave up".
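
The mapping from "why the loop stopped" to a process exit code could look like the sketch below. The numeric codes come from the table above; `exitCodeFor` and the stop-reason strings are hypothetical names chosen for illustration.

```javascript
// Exit codes from the table above; names are illustrative.
const EXIT = Object.freeze({
  OK: 0,
  ERROR_BUDGET: 2,
  USAGE: 64,
  PROMPT_UNREADABLE: 70,
  SIGINT: 130,
  SIGTERM: 143,
});

// Hypothetical mapping from a loop stop reason to the loop-level exit code.
function exitCodeFor(stopReason) {
  switch (stopReason) {
    case 'queue-empty':
    case 'max-iterations':
    case 'gate-budget':
      return EXIT.OK; // completed normally, even if some iterations errored
    case 'error-budget':
      return EXIT.ERROR_BUDGET; // the loop gave up
    default:
      throw new Error(`unknown stop reason: ${stopReason}`);
  }
}
```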

Worktree Lifecycle

Each iteration creates a fresh git worktree under .claude/worktrees/. Naming and cleanup are designed to make preserved worktrees self-explanatory.

Naming

The temporary worktree name during iteration is ralph-<pid>-<timestamp>-<iter> — guaranteed unique even under concurrent ralph-loops. After the iteration completes and the outcome carries an itemId, the driver renames it to ralph-<short-uuid>-<iter> so preserved worktrees are traceable to the item they were working on.

Renaming runs after the iteration's claude process has exited, so file locks aren't a concern (including on Windows). On rename failure the temp name is kept and the loop continues — the rename is purely cosmetic.
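
A minimal sketch of the two naming schemes follows. The function names are hypothetical, and the length of `<short-uuid>` is not stated in this guide; the sketch assumes the first 8 characters of the item id.

```javascript
// Temporary name used while the iteration runs.
function tempWorktreeName(pid, timestamp, iter) {
  return `ralph-${pid}-${timestamp}-${iter}`;
}

// Final name once the outcome carries an itemId.
// Assumption: <short-uuid> is the first 8 characters of the item id.
function finalWorktreeName(itemId, iter, shortLen = 8) {
  return `ralph-${itemId.slice(0, shortLen)}-${iter}`;
}

// Keep the temp name when the outcome has no itemId (for example, no-item).
function nameAfterIteration(tempName, outcome, iter) {
  return outcome && outcome.itemId
    ? finalWorktreeName(outcome.itemId, iter)
    : tempName;
}
```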

Smart cleanup

After each terminal or no-item outcome, the driver evaluates whether to remove the worktree. The heuristic preserves worktrees that have something worth inspecting:

Worktree state	Action
Uncommitted changes present	Preserved — `uncommitted changes present`
Commits ahead of `--base-ref`	Preserved — `<n> commit(s) ahead of <base-ref>`
Base ref not resolvable	Preserved — `could not compare against <base-ref>` (fails safe when the comparison cannot run)
Clean: no uncommitted changes, zero commits ahead of base	Removed

gate-blocked, error, and skip outcomes always preserve their worktree regardless — debugging context matters. Pass --no-cleanup to opt out of cleanup entirely.
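
The decision table above can be expressed as a pure function. This is a sketch with hypothetical names: it assumes the caller has already gathered `dirty` (whether uncommitted changes exist) and `ahead` (commit count ahead of the base ref, or `null` when the comparison failed), for example from `git status --porcelain` and `git rev-list --count`.

```javascript
// Hypothetical cleanup decision; inputs are assumed to be collected by the caller.
function decideCleanup({ status, dirty, ahead, noCleanup = false }, baseRef = 'origin/main') {
  if (noCleanup) return { remove: false, reason: 'cleanup disabled' };
  if (status !== 'terminal' && status !== 'no-item') {
    // gate-blocked, error, and skip always keep their worktree
    return { remove: false, reason: `${status} outcome preserved for debugging` };
  }
  if (dirty) return { remove: false, reason: 'uncommitted changes present' };
  if (ahead === null) return { remove: false, reason: `could not compare against ${baseRef}` };
  if (ahead > 0) return { remove: false, reason: `${ahead} commit(s) ahead of ${baseRef}` };
  return { remove: true, reason: 'clean' };
}
```

Every uncertain case resolves to "preserve", so a worktree is removed only when all checks pass.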

The --base-ref default is origin/main. For projects whose default branch is master, develop, or lives on a non-origin remote, set it explicitly:

node ralph-loop.mjs --filter "tag=bug-fix" --base-ref origin/develop

Composition

With `/loop` for autonomous cadence

/loop schedules the driver to run on a recurring interval. Each invocation is independent — the driver exits cleanly, /loop waits, then re-runs. If the queue is empty, the run takes seconds and exits with no-item.

/loop 30m node claude-plugins/task-orchestrator/scripts/ralph-loop.mjs --filter "tag=bug-fix priority=high"

This is the canonical Ralph deployment for "drain whatever appears in the queue, indefinitely". CI pipelines, scheduled tasks, and other agents can post items to the queue throughout the day; /loop handles them on the next tick.

With note schemas

The iteration agent invokes /schema-workflow to drive note-fill and phase advancement. Each note's guidance field tells the agent what to write. This means schemas configured in .taskorchestrator/config.yaml directly shape what an iteration does:

A bug-fix schema with code-change notes and a review phase yields iterations that edit files, run tests, fill review notes, and commit
A research-note schema with a single findings note yields iterations that gather context and fill that one note
An agent-observation schema with a single observation-detail note yields iterations that record observations and exit

The driver doesn't need to know any of this — /schema-workflow reads the schema at runtime and adapts.

With claim mode

Ralph is the first claim-mode skill in the public plugin. Each iteration calls claim_item with a fresh actor id (ralph-<pid>-<timestamp> by default) and holds the claim under a TTL. If the iteration crashes, the claim is released when the TTL expires. Multiple ralph-loops can run concurrently against the same queue; claim contention is handled at MCP write time, not in the driver.

For deployments with identity verification (multi-tenant or cross-org), set --actor explicitly so iteration writes carry a stable actor identity. See Consumer-Mode Deployment for actor_authentication configuration.

When to Use Ralph

Ralph is one valid mode of operation alongside default orchestration (Tier 5) and self-improving workflow (Tier 6). Pick by the work shape, not by sophistication:

Scenario	Mode
Single feature, multiple coordinated tasks, dependencies between them	Tier 5: Orchestrator — needs cross-task coordination
Backlog of independent items, no cross-item dependencies, want fresh context per item	Ralph — fresh-context isolation matters; cross-item context would be noise
Need to drain the queue overnight or on a schedule	Ralph + `/loop` — the canonical autonomous deployment
Single complex problem requiring deep collaborative iteration	Tier 5 or interactive — Ralph's per-iteration scope is wrong for one big problem
Bulk research or note-fill across many items, no code changes	Ralph — schema decides what each iteration does; could be all note-fill
Multi-agent fleet with identity verification, capacity tuning, audit logging	Consumer-Mode Deployment — Ralph is one possible client of the consumer-pattern contract

The signal that Ralph is the wrong tool: you find yourself wanting iterations to share state. That means the work has cross-item context that doesn't fit the fresh-context-per-iteration model. Switch to Tier 5.

Operational Notes

Permission mode

Each iteration runs with --permission-mode bypassPermissions. In claude -p (non-interactive) mode there is no UI prompt to approve MCP or tool calls — unpermitted calls auto-deny and the iteration aborts. Without bypass, no iteration can complete its work autonomously.

The risk surface is bounded by four layers working together:

Worktree boundary — --worktree confines file edits to a single isolated tree
MCP ACL — the MCP server's own permission model still controls which Task Orchestrator operations are valid
Budget cap — --max-budget-usd per iteration limits API spend
Schema-scoped iteration prompt — the iteration agent cannot dispatch subagents, cannot enter plan mode, must emit RALPH_OUTCOME as its final message

For deployments needing stricter permission control, swap --permission-mode bypassPermissions for --allowed-tools with an explicit allowlist in the script.

Signal forwarding

The driver forwards SIGINT and SIGTERM to the currently running iteration child, then exits with codes 130 (SIGINT) or 143 (SIGTERM). This ensures Ctrl+C at the loop driver doesn't leave an orphaned claude -p process running to its budget cap.
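
One way to wire this up is sketched below. It is an illustration under assumptions, not the driver's implementation: `forwardSignals` is a hypothetical helper, `child` is expected to behave like a `ChildProcess` returned by `spawn`, and the child is assumed to still be running when the signal arrives.

```javascript
// Exit codes documented above for each forwarded signal.
const SIGNAL_EXIT_CODES = { SIGINT: 130, SIGTERM: 143 };

// Hypothetical helper: forward a termination signal to the running iteration,
// then exit with the matching code once the child has exited.
function forwardSignals(child, proc = process) {
  for (const [signal, code] of Object.entries(SIGNAL_EXIT_CODES)) {
    proc.once(signal, () => {
      child.once('exit', () => proc.exit(code)); // leave only after the child is gone
      child.kill(signal);                        // pass the same signal down
    });
  }
}
```

The `proc` parameter defaults to the real `process` and exists only so the wiring can be exercised with a stand-in.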

Streaming output

The driver streams iteration stdout to its own stdout in real time, so you see the iteration's progress as it works. The final RALPH_OUTCOME: marker is parsed from the last buffered chunk after the iteration exits.

Claim TTL

The default TTL of 1800 seconds (30 min) gives a non-trivial iteration headroom. For longer iterations, raise --ttl; the cap is 86400 (24 hours). On crash recovery, claims expire automatically — no manual cleanup needed.

Troubleshooting

`error: could not read iteration prompt`

The script expects iteration-prompt.md at ../skills/ralph/iteration-prompt.md relative to its own location. If the plugin layout shifted or the script was copied elsewhere, both files must move together. Verify both exist at:

claude-plugins/task-orchestrator/scripts/ralph-loop.mjs
claude-plugins/task-orchestrator/skills/ralph/iteration-prompt.md

Every iteration: `iteration agent exited cleanly without RALPH_OUTCOME marker`

The iteration agent isn't following the prompt — it's exiting without emitting the outcome marker. Likely the agent ran out of budget, hit a tool restriction, or ignored the prompt's exit instructions.

Run with --dry-run and inspect the rendered prompt
If the prompt looks correct, raise --budget (some iterations need more headroom)

If it persists, run one iteration manually outside the loop to debug interactively:

claude -p --worktree=test-1 --permission-mode bypassPermissions "$(cat skills/ralph/iteration-prompt.md)"

Every iteration: `skip` with "all candidates already claimed"

Either stale claims remain from a crashed previous run, or another worker is already draining the queue. The TTL is the recovery mechanism: wait for expiry, or release explicitly:

query_items(operation="search", claimStatus="claimed")
# Identify stale claims by actor pattern
claim_item(releases=[{itemId: "<uuid>", actor: {...}}])

Cleanup never removes a worktree even on clean terminal

Likely cause: --base-ref doesn't resolve. If origin/main is your branch but the script can't reach origin (offline, no remote), the rev-list comparison fails and the worktree is preserved by safety policy.

Fix: ensure git fetch origin succeeds before the loop runs, or set --base-ref to a locally-resolvable ref.

Worktrees accumulate over long deployments

gate-blocked and error outcomes always preserve their worktree. Over a multi-day deployment, these accumulate. Periodic cleanup is manual:

git worktree list | grep ralph-
# inspect, then
git worktree remove .claude/worktrees/ralph-<id>

There is currently no flag to force cleanup of failed iterations. --no-cleanup does the opposite: it preserves every worktree. The assumption is that failures are worth keeping. If your deployment disagrees, schedule a separate cleanup task.
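
For a scheduled cleanup task, the listing step could be scripted along these lines. The sketch parses the text produced by `git worktree list --porcelain`, where each entry begins with a `worktree <path>` line; `listRalphWorktrees` is a hypothetical helper, and deciding which of the returned paths to remove is left to the operator.

```javascript
// Hypothetical helper: extract ralph-* worktree paths from
// `git worktree list --porcelain` output passed in as a string.
function listRalphWorktrees(porcelain) {
  return porcelain
    .split(/\r?\n/)
    .filter((line) => line.startsWith('worktree '))
    .map((line) => line.slice('worktree '.length))
    .filter((path) => path.split('/').pop().startsWith('ralph-'));
}
```

Each returned path can then be inspected and passed to `git worktree remove` as shown above.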

Design Notes

Why a script driver, not a slash-command loop

The fresh-context-per-iteration property is the entire point of Ralph. A slash-command loop running inside one Claude session accumulates context across iterations even with compaction — that's the variant Geoff Huntley publicly criticized. The script-driver pattern preserves fresh context cleanly: each iteration is a new OS process with no memory of previous ones.

Why a Node script, not bash

Cross-platform: every Claude Code install has Node. Stdlib-only — no npm install, no plugin dependencies. The script uses node:util.parseArgs, node:child_process.spawn, and node:fs/promises. Targets Node 18+ to match Claude Code's own minimum.

Why a dedicated `ralph-iteration` output style

Iterations run under their own output style passed via claude --settings. The default workflow-orchestrator output style is shaped for interactive orchestration — tier classification, delegation tables, plan-mode discipline, the workflow-analyst footer — all of which are wrong for a single-item per-iteration agent. The Ralph style suppresses that chrome and authoritatively encodes per-iteration rules: schema is the contract, no auto-memory writes, no further dispatch, RALPH_OUTCOME as the final message.

Why no PR creation in the script

The end state of an iteration is determined by the item's schema, not by Ralph. A bug-fix schema's terminal phase might prescribe push and PR creation; an agent-observation schema's "done" might be filling a single note. The script doesn't assume a code-change workflow — it just runs iterations and captures outcomes. Workflow logic lives in the schema.

Why the `RALPH_OUTCOME:` marker, not exit codes

An agent inside claude -p doesn't directly control the parent process's exit code — that's Claude Code's harness. Structured stdout output is the cleanest channel: the script extracts the marker from the agent's final message via balanced-brace JSON scanning, parses it, and decides loop control. This also makes outcomes inspectable in iteration logs after the fact.
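
Balanced-brace scanning can be sketched as follows. This is an illustrative version, not the script's code: `extractOutcome` is a hypothetical name, and the input is assumed to be the agent's final message as plain text, already unwrapped from any JSON envelope.

```javascript
// Hypothetical extractor: find the last RALPH_OUTCOME marker, then scan
// forward to the brace that closes the object, ignoring braces in strings.
function extractOutcome(text) {
  const marker = 'RALPH_OUTCOME:';
  const at = text.lastIndexOf(marker);
  if (at === -1) return null;
  const start = text.indexOf('{', at + marker.length);
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;       // character after a backslash
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth += 1;
    else if (ch === '}') {
      depth -= 1;
      if (depth === 0) {
        try {
          return JSON.parse(text.slice(start, i + 1));
        } catch {
          return null; // balanced braces but not valid JSON
        }
      }
    }
  }
  return null; // marker found but the object never closed
}
```

Scanning for the matching brace, rather than taking everything up to the end of the line, tolerates trailing text after the marker and braces inside string values.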

Why `bypassPermissions` was chosen over `--allowed-tools`

The first end-to-end smoke test failed at claim_item because claude -p auto-denies MCP tool calls without a permission prompt. Bypass is currently the simplest path to autonomous operation given the four-layer risk-bounding (worktree, MCP ACL, budget, schema-scoped prompt). Deployments needing tighter control can swap in --allowed-tools with an explicit allowlist — the script's args array is the integration point.
