integration guides ralph loop
Prerequisites: Plugin: Skills and Hooks installed and working · Node 18+ available on the host · Recommended: Note Schemas configured for the work types you intend to drain
Cross-references: Quick Start · Workflow Guide §10 — Claim Mechanism · Consumer-Mode Deployment · API Reference — claim_item · Self-Improving Workflow
- A queue-drain pattern that walks unclaimed work items end-to-end without human steering between items
- Fresh context per iteration: each item runs in its own `claude -p` process with its own git worktree
- Circuit breakers on two axes (consecutive gate failures, consecutive errors) plus a hard iteration cap
- Schema-driven termination — what "done" means is whatever the item's schema declares
- Smart worktree cleanup — runs that produced no commits get auto-removed; runs with real diffs are preserved for inspection
- The first claim-mode skill shipped in the public plugin
Ralph is a queue-drain pattern named after Geoff Huntley's canonical loop. The shape:
while queue has items:
spawn fresh process → claim one item → drive it through its schema → exit
The value comes from carving work into independent context windows. Each iteration is a new OS process — fresh memory, fresh worktree, isolated logs, clean exit code. Context never accumulates across iterations.
This is fundamentally different from a slash-command loop running inside a single Claude session. That pattern accumulates context across iterations even with compaction, which is the variant Huntley publicly criticized when other Ralph implementations shipped it. The Ralph loop in this plugin is the script-driver-spawning-fresh-processes form.
The loop driver is a launcher, not a planner. It does not decide which items to work on — that's the filter you give it. It does not decide what "done" looks like — that's the item's schema. It does not orchestrate dependencies — claim-time contention handles concurrency. The driver's only job is: spawn iteration, parse outcome, decide whether to continue.
Four artifacts cooperate to make one iteration work:
| Artifact | Role |
|---|---|
| `scripts/ralph-loop.mjs` | Loop control: spawns `claude -p` per iteration, parses outcomes, runs circuit breakers, manages worktree cleanup. Schema-agnostic: has no opinion on what items contain |
| `skills/ralph/SKILL.md` | Launcher: invoked as `/task-orchestrator:ralph` from a Claude Code session. Helps you pick filter and bounds, previews the queue, emits the exact `node ralph-loop.mjs` command |
| `skills/ralph/iteration-prompt.md` | Per-iteration agent workflow: claim → invoke `/schema-workflow` → commit → emit `RALPH_OUTCOME` marker |
| `output-styles/ralph-iteration.md` | Per-iteration mode (passed to each `claude -p` via `--settings`): suppresses orchestrator chrome, encodes iteration discipline (schema is contract, no auto-memory writes, no further dispatch) |
The launcher skill and the iteration prompt are decoupled. The skill helps a human assemble a command; the prompt is what the spawned iteration agent reads. Either can be invoked directly without the other.
Drain the bug-fix backlog with defaults:
/task-orchestrator:ralph tag=bug-fix
The skill walks you through filter, bounds, and queue preview, then emits the command to run. Copy it into a separate terminal:
node claude-plugins/task-orchestrator/scripts/ralph-loop.mjs --filter "tag=bug-fix"
Each iteration spawns a fresh claude -p --worktree=ralph-<id> process and works one item end-to-end. The loop exits when the queue is empty, the iteration cap is hit, or a circuit breaker trips.
To preview the iteration command without running it:
node claude-plugins/task-orchestrator/scripts/ralph-loop.mjs --dry-run --filter "tag=bug-fix"
Filter expressions narrow which queue items are eligible. Keys combine with AND (space-separated):
| Key | Meaning |
|---|---|
| `tag=<value>` | Items whose `tags` field contains `<value>` (substring match) |
| `type=<value>` | Items with this exact type |
| `priority=<value>` | Items at this priority: `high`, `medium`, or `low` |
| `parentId=<uuid-or-prefix>` | Only descendants of this container; full UUID or 4+ char hex prefix |
Examples:
# All high-priority bug fixes
--filter "tag=bug-fix priority=high"
# Quick fixes only
--filter "type=quick-fix"
# Everything in a specific container
--filter "parentId=89d02e32"
# Anything claimable (no filter: drain whatever is at the top)
# Omit the --filter flag entirely
If the filter is empty, the loop will pick up any unclaimed queue item. Filter scoping is the primary safety mechanism — set it precisely.
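The matching rules in the filter table can be sketched as a small parser plus a predicate. This is only an illustration, not the driver's actual code: `parseFilter`, `matchesFilter`, and the item shape (`tags` as a string, `ancestorIds` as an array of UUIDs) are hypothetical names chosen for the example.

```javascript
// Hypothetical sketch of the documented filter semantics.
// The real driver's parsing and item shape may differ.
const FILTER_KEYS = new Set(["tag", "type", "priority", "parentId"]);

function parseFilter(expression) {
  const filter = {};
  // Terms are space-separated; an empty expression yields an empty filter.
  for (const term of expression.trim().split(/\s+/).filter(Boolean)) {
    const eq = term.indexOf("=");
    const key = eq === -1 ? term : term.slice(0, eq);
    const value = eq === -1 ? "" : term.slice(eq + 1);
    if (!FILTER_KEYS.has(key) || value === "") {
      throw new Error(`Invalid filter term: ${term}`);
    }
    if (key === "parentId" && value.length < 4) {
      throw new Error("parentId needs a full UUID or a 4+ char prefix");
    }
    filter[key] = value;
  }
  return filter;
}

function matchesFilter(item, filter) {
  // Keys combine with AND: every key that is present must match.
  if (filter.tag !== undefined && !(item.tags ?? "").includes(filter.tag)) {
    return false; // substring match on the tags field
  }
  if (filter.type !== undefined && item.type !== filter.type) return false;
  if (filter.priority !== undefined && item.priority !== filter.priority) {
    return false;
  }
  if (filter.parentId !== undefined) {
    // Assumption: each item lists its ancestor container UUIDs.
    const ancestors = item.ancestorIds ?? [];
    if (!ancestors.some((id) => id.startsWith(filter.parentId))) return false;
  }
  return true;
}
```

Under these assumptions an empty filter object matches every item, which is why an unscoped run can pick up anything in the queue.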
| Flag | Default | Purpose |
|---|---|---|
| `--max <n>` | 10 | Hard iteration cap; loop exits after this many regardless of outcome |
| `--gate-budget <n>` | 3 | Consecutive gate-blocked outcomes before the loop exits |
| `--error-budget <n>` | 2 | Consecutive error outcomes before the loop exits |
| `--budget <usd>` | 5 | Per-iteration USD cap, passed to `claude --max-budget-usd` |
| `--ttl <seconds>` | 1800 (30 min) | Claim TTL per iteration; range 60–86400 |
| `--model <name>` | sonnet | Model for the iteration agent |
| `--actor <id>` | `ralph-<pid>-<timestamp>` | Actor id used for `claim_item` |
| `--base-ref <ref>` | origin/main | Upstream ref for cleanup "ahead of base" detection; set to origin/master, origin/develop, etc. for projects with a different default branch |
| `--cleanup-on-terminal` | true | Smart-cleanup worktrees after terminal and no-item outcomes |
| `--no-cleanup` | (none) | Disable smart cleanup; preserve all worktrees regardless of state |
The dual circuit breakers are critical — single-condition exits are the most-warned-against Ralph pitfall. Heuristic exit (queue empty) and explicit budget exits combine to cap blast radius even when the heuristic misfires.
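As a rough sketch of how the documented defaults and the TTL range might be applied after argument parsing, consider the function below. `resolveOptions` and its option names are hypothetical, the lower bound of 1 on the iteration and budget counters is an assumption, and the exact timestamp format in the default actor id is not specified by this guide. The sketch throws on invalid input, whereas the driver reports CLI argument errors through exit code 64.

```javascript
// Defaults as documented in the flag table above.
const DEFAULTS = Object.freeze({
  max: 10,
  gateBudget: 3,
  errorBudget: 2,
  budget: 5,
  ttl: 1800,
  model: "sonnet",
  baseRef: "origin/main",
  cleanup: true,
});

// Hypothetical helper: merges parsed flags over the defaults and validates.
function resolveOptions(raw = {}, pid = process.pid, now = Date.now()) {
  // Drop undefined values so they do not mask the defaults.
  const given = Object.fromEntries(
    Object.entries(raw).filter(([, value]) => value !== undefined),
  );
  const opts = { ...DEFAULTS, ...given };

  // Assumption: each counter-style bound must be at least 1.
  for (const key of ["max", "gateBudget", "errorBudget"]) {
    if (!Number.isInteger(opts[key]) || opts[key] < 1) {
      throw new RangeError(`${key} must be a positive integer`);
    }
  }
  // Documented claim TTL range: 60 to 86400 seconds.
  if (!Number.isInteger(opts.ttl) || opts.ttl < 60 || opts.ttl > 86400) {
    throw new RangeError("ttl must be between 60 and 86400 seconds");
  }
  if (!(Number(opts.budget) > 0)) {
    throw new RangeError("budget must be a positive USD amount");
  }
  // Default actor id follows the documented ralph-<pid>-<timestamp> shape.
  if (opts.actor === undefined) opts.actor = `ralph-${pid}-${now}`;
  return opts;
}
```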
Each iteration is exactly one item's journey through its schema. The driver script does not understand the schema — it just spawns the process and parses the outcome.
Driver spawns iteration:
claude -p --worktree=ralph-<pid>-<ts>-<iter> \
--settings '{"outputStyle":"task-orchestrator:ralph-iteration"}' \
--permission-mode bypassPermissions \
--max-budget-usd <budget> \
--output-format json \
--model <model> \
"<rendered iteration prompt>"
|
v
Iteration agent (single claude -p):
1. Query queue items matching filter (priority-ordered)
2. claim_item on top candidate (try next on already_claimed)
3. Invoke /schema-workflow with claimed UUID
— fills required notes per schema guidance
— does the work each note describes (code changes, research, etc.)
— runs verification the schema specifies
— advances item through phase gates
4. Commit any file changes
5. Emit final message: RALPH_OUTCOME: {"status":"<status>","itemId":"<uuid>",...}
|
v
Driver parses outcome:
— Updates counters and circuit breakers
— Renames worktree to ralph-<short-uuid>-<iter> if outcome carried itemId
— Runs smart cleanup if outcome is terminal or no-item
— Decides whether to continue or exit
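The spawn step at the top of the diagram amounts to assembling an argument array for `claude`. A minimal sketch, using only the flags shown above, follows; `buildIterationArgs` is a hypothetical name and the driver's real array may be ordered or built differently.

```javascript
// Hypothetical sketch: builds the argv for one iteration from the flags
// shown in the diagram. Pass the result to a process spawner of your choice.
function buildIterationArgs({ worktree, budget, model, prompt }) {
  return [
    "-p",
    `--worktree=${worktree}`,
    "--settings",
    JSON.stringify({ outputStyle: "task-orchestrator:ralph-iteration" }),
    "--permission-mode",
    "bypassPermissions",
    "--max-budget-usd",
    String(budget),
    "--output-format",
    "json",
    "--model",
    model,
    prompt, // the rendered iteration prompt is the final positional argument
  ];
}
```

Keeping the arguments as an array (rather than a shell string) avoids quoting problems when the rendered prompt contains quotes or newlines.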
The iteration agent never calls complete_tree or advances beyond what the schema declares. If the schema's terminal phase prescribes git push or gh pr create, the iteration follows it. If the schema's terminal phase is just "fill the final note", that's the entire end state.
The iteration agent emits RALPH_OUTCOME: {...} as its final message. The driver maps each status to circuit-breaker behavior:
| Status | Meaning | Circuit-breaker effect |
|---|---|---|
| `terminal` | Item reached terminal role per its schema | Counter ✓; resets gate-failure and error counters |
| `gate-blocked` | A required note couldn't be filled autonomously (e.g., needs external input the iteration can't get) | Counter ⊘; increments consecutive gate-failure counter |
| `error` | Tool error, build failure, claim error, budget cap hit | Counter ✗; increments consecutive error counter |
| `skip` | All candidates already claimed (contention) or item became terminal during a race | Counter `—`; no counter changes |
| `no-item` | No items match the filter, so the queue is drained | Loop exits cleanly |
Circuit breakers are consecutive, not cumulative. A single error followed by a successful iteration resets the error counter. This prevents an old failure from haunting the rest of the drain — but back-to-back failures still trip the breaker as intended.
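The counter rules above can be sketched as a pure state transition. This is an illustration under stated assumptions: `applyOutcome` and `simulate` are hypothetical names, counters reset only on `terminal` (as the status table says), unknown statuses are rejected, and the exit codes follow the exit-code table in this guide.

```javascript
function createBreakerState() {
  return { iterations: 0, gateFailures: 0, errors: 0 };
}

// Hypothetical sketch: applies one outcome and reports whether to stop.
// limits = { max, gateBudget, errorBudget }
function applyOutcome(state, status, limits) {
  const next = { ...state, iterations: state.iterations + 1 };
  switch (status) {
    case "terminal": // success resets both consecutive counters
      next.gateFailures = 0;
      next.errors = 0;
      break;
    case "gate-blocked":
      next.gateFailures += 1;
      break;
    case "error":
      next.errors += 1;
      break;
    case "skip": // contention: no counter changes
      break;
    case "no-item":
      return { state: next, stop: true, exitCode: 0, reason: "queue drained" };
    default:
      throw new Error(`Unknown status: ${status}`);
  }
  if (next.errors >= limits.errorBudget) {
    return { state: next, stop: true, exitCode: 2, reason: "error budget exhausted" };
  }
  if (next.gateFailures >= limits.gateBudget) {
    return { state: next, stop: true, exitCode: 0, reason: "gate budget exhausted" };
  }
  if (next.iterations >= limits.max) {
    return { state: next, stop: true, exitCode: 0, reason: "iteration cap reached" };
  }
  return { state: next, stop: false };
}

// Replays a list of statuses to show how a drain would end.
function simulate(statuses, limits) {
  let state = createBreakerState();
  for (const status of statuses) {
    const result = applyOutcome(state, status, limits);
    if (result.stop) return result;
    state = result.state;
  }
  return { state, stop: false };
}
```

With the default budgets, `simulate(["error", "terminal", "error"], limits)` keeps going because the success in the middle clears the error counter, while two errors in a row stop the loop with exit code 2.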
| Code | Meaning |
|---|---|
| 0 | Loop completed normally (queue empty, iteration cap reached, or gate-failure budget exhausted) |
| 2 | Loop aborted because the consecutive-error budget was exhausted |
| 64 | CLI argument error |
| 70 | Could not read iteration prompt |
| 130 | Interrupted (SIGINT received and forwarded to iteration) |
| 143 | Terminated (SIGTERM received and forwarded to iteration) |
Individual errored iterations during an otherwise healthy drain do not change the loop-level exit code — they are visible in the summary instead. Exit 2 is reserved for the aborted-on-error-budget case so callers (CI jobs, /loop invocations, fleet supervisors) can distinguish "completed with some errors" from "loop gave up".
Each iteration creates a fresh git worktree under .claude/worktrees/. Naming and cleanup are designed to make preserved worktrees self-explanatory.
The temporary worktree name during iteration is ralph-<pid>-<timestamp>-<iter> — guaranteed unique even under concurrent ralph-loops. After the iteration completes and the outcome carries an itemId, the driver renames it to ralph-<short-uuid>-<iter> so preserved worktrees are traceable to the item they were working on.
Renaming runs after the iteration's claude process has exited, so file locks aren't a concern (including on Windows). On rename failure the temp name is kept and the loop continues — the rename is purely cosmetic.
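The naming scheme can be illustrated with three small helpers. These are hypothetical functions, and the length of `<short-uuid>` is not stated in this guide; the sketch assumes the first 8 hex characters of the item UUID.

```javascript
// Temporary name used while the iteration runs.
function tempWorktreeName(pid, timestamp, iteration) {
  return `ralph-${pid}-${timestamp}-${iteration}`;
}

// Final name once the outcome carries an itemId.
// Assumption: the short uuid is the first 8 hex characters.
function finalWorktreeName(itemId, iteration, shortLength = 8) {
  const short = itemId.replace(/-/g, "").slice(0, shortLength);
  return `ralph-${short}-${iteration}`;
}

// Chooses the name to keep: rename only when an itemId is available.
function worktreeNameAfterIteration(tempName, outcome, iteration) {
  const itemId = outcome && outcome.itemId;
  return typeof itemId === "string" && itemId.length > 0
    ? finalWorktreeName(itemId, iteration)
    : tempName;
}
```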
After each terminal or no-item outcome, the driver evaluates whether to remove the worktree. The heuristic preserves worktrees that have something worth inspecting:
| Worktree state | Action |
|---|---|
| Uncommitted changes present | Preserved: uncommitted changes present |
| Commits ahead of `--base-ref` | Preserved: `<n>` commit(s) ahead of `<base-ref>` |
| Base ref not resolvable | Preserved: could not compare against `<base-ref>` (safer error) |
| Clean: no uncommitted changes, zero commits ahead of base | Removed |
gate-blocked, error, and skip outcomes always preserve their worktree regardless — debugging context matters. Pass --no-cleanup to opt out of cleanup entirely.
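The cleanup heuristic reduces to a short decision function once the git facts are known. In this sketch `cleanupDecision` and its input fields are hypothetical; it assumes the caller has already gathered the worktree state (for example from `git status --porcelain` and a `git rev-list --count` comparison against the base ref).

```javascript
// Hypothetical sketch of the preserve-or-remove decision described above.
function cleanupDecision({
  status,
  noCleanup = false,
  hasUncommittedChanges,
  baseRefResolved,
  commitsAhead,
  baseRef = "origin/main",
}) {
  if (noCleanup) {
    return { remove: false, reason: "cleanup disabled" };
  }
  // Only terminal and no-item outcomes are candidates for removal.
  if (status !== "terminal" && status !== "no-item") {
    return { remove: false, reason: `${status} outcome kept for debugging` };
  }
  if (hasUncommittedChanges) {
    return { remove: false, reason: "uncommitted changes present" };
  }
  // If the comparison cannot be made, err on the side of keeping the tree.
  if (!baseRefResolved) {
    return { remove: false, reason: `could not compare against ${baseRef}` };
  }
  if (commitsAhead > 0) {
    return { remove: false, reason: `${commitsAhead} commit(s) ahead of ${baseRef}` };
  }
  return { remove: true, reason: "clean" };
}
```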
The --base-ref default is origin/main. For projects whose default branch is master, develop, or lives on a non-origin remote, set it explicitly:
node ralph-loop.mjs --filter "tag=bug-fix" --base-ref origin/develop
/loop schedules the driver to run on a recurring interval. Each invocation is independent: the driver exits cleanly, /loop waits, then re-runs. If the queue is empty, the run takes seconds and exits with no-item.
/loop 30m node claude-plugins/task-orchestrator/scripts/ralph-loop.mjs --filter "tag=bug-fix priority=high"
This is the canonical Ralph deployment for "drain whatever appears in the queue, indefinitely". CI pipelines, scheduled tasks, and other agents can post items to the queue throughout the day; /loop handles them on the next tick.
The iteration agent invokes /schema-workflow to drive note-fill and phase advancement. Each note's guidance field tells the agent what to write. This means schemas configured in .taskorchestrator/config.yaml directly shape what an iteration does:
- A `bug-fix` schema with code-change notes and a review phase yields iterations that edit files, run tests, fill review notes, and commit
- A `research-note` schema with a single `findings` note yields iterations that gather context and fill that one note
- An `agent-observation` schema with a single `observation-detail` note yields iterations that record observations and exit
The driver doesn't need to know any of this — /schema-workflow reads the schema at runtime and adapts.
Ralph is the first claim-mode skill in the public plugin. Each iteration calls claim_item with a fresh actor id (ralph-<pid>-<timestamp> by default) and holds the claim under a TTL; if the iteration crashes, the claim is recovered automatically when the TTL expires. Multiple ralph-loops can run concurrently against the same queue: claim contention is handled at MCP write time, not in the driver.
For deployments with identity verification (multi-tenant or cross-org), set --actor explicitly so iteration writes carry a stable actor identity. See Consumer-Mode Deployment for actor_authentication configuration.
Ralph is one valid mode of operation alongside default orchestration (Tier 5) and self-improving workflow (Tier 6). Pick by the work shape, not by sophistication:
| Scenario | Mode |
|---|---|
| Single feature, multiple coordinated tasks, dependencies between them | Tier 5: Orchestrator — needs cross-task coordination |
| Backlog of independent items, no cross-item dependencies, want fresh context per item | Ralph — fresh-context isolation matters; cross-item context would be noise |
| Need to drain the queue overnight or on a schedule | Ralph + /loop: the canonical autonomous deployment |
| Single complex problem requiring deep collaborative iteration | Tier 5 or interactive — Ralph's per-iteration scope is wrong for one big problem |
| Bulk research or note-fill across many items, no code changes | Ralph — schema decides what each iteration does; could be all note-fill |
| Multi-agent fleet with identity verification, capacity tuning, audit logging | Consumer-Mode Deployment — Ralph is one possible client of the consumer-pattern contract |
The signal that Ralph is the wrong tool: you find yourself wanting iterations to share state. That means the work has cross-item context that doesn't fit the fresh-context-per-iteration model. Switch to Tier 5.
Each iteration runs with --permission-mode bypassPermissions. In claude -p (non-interactive) mode there is no UI prompt to approve MCP or tool calls — unpermitted calls auto-deny and the iteration aborts. Without bypass, no iteration can complete its work autonomously.
The risk surface is bounded by four layers working together:
- Worktree boundary: `--worktree` confines file edits to a single isolated tree
- MCP ACL: the MCP server's own permission model still controls valid TO operations
- Budget cap: `--max-budget-usd` per iteration limits API spend
- Schema-scoped iteration prompt: the iteration agent cannot dispatch subagents, cannot enter plan mode, and must emit `RALPH_OUTCOME` as its final message
For deployments needing stricter permission control, swap --permission-mode bypassPermissions for --allowed-tools with an explicit allowlist in the script.
The driver forwards SIGINT and SIGTERM to the currently running iteration child, then exits with codes 130 (SIGINT) or 143 (SIGTERM). This ensures Ctrl+C at the loop driver doesn't leave an orphaned claude -p process running to its budget cap.
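A minimal sketch of that forwarding logic is shown below. `forwardSignals` is a hypothetical helper, and waiting for the child's `exit` event before exiting is an assumption about ordering. In a real driver the arguments would presumably be the global `process`, the `ChildProcess` returned by `spawn`, and `process.exit`.

```javascript
const SIGNAL_EXIT_CODES = { SIGINT: 130, SIGTERM: 143 };

// proc:  an emitter that receives signals (e.g. the global process object)
// child: the running iteration (e.g. a ChildProcess from spawn)
// exit:  callback that terminates the driver with the given code
function forwardSignals(proc, child, exit) {
  for (const [signal, code] of Object.entries(SIGNAL_EXIT_CODES)) {
    proc.once(signal, () => {
      // Child already finished: nothing to forward.
      if (child.exitCode !== null || child.signalCode !== null) {
        exit(code);
        return;
      }
      // Exit only after the child has gone, so no orphan is left behind.
      child.once("exit", () => exit(code));
      child.kill(signal);
    });
  }
}
```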
The driver streams iteration stdout to its own stdout in real time, so you see the iteration's progress as it works. The final RALPH_OUTCOME: marker is parsed from the last buffered chunk after the iteration exits.
The default TTL of 1800 seconds (30 min) gives a non-trivial iteration headroom. For longer iterations, raise --ttl; the cap is 86400 (24 hours). On crash recovery, claims expire automatically — no manual cleanup needed.
The script expects iteration-prompt.md at ../skills/ralph/iteration-prompt.md relative to its own location. If the plugin layout shifted or the script was copied elsewhere, both files must move together. Verify both exist at:
- claude-plugins/task-orchestrator/scripts/ralph-loop.mjs
- claude-plugins/task-orchestrator/skills/ralph/iteration-prompt.md
The iteration agent isn't following the prompt — it's exiting without emitting the outcome marker. Likely the agent ran out of budget, hit a tool restriction, or ignored the prompt's exit instructions.
- Run with `--dry-run` and inspect the rendered prompt
- If the prompt looks correct, raise `--budget` (some iterations need more headroom)
- If it persists, run one iteration manually outside the loop to debug interactively:
claude -p --worktree=test-1 --permission-mode bypassPermissions "$(cat skills/ralph/iteration-prompt.md)"
Stale claims from a crashed previous run, or another worker is already draining. The TTL is the recovery mechanism — wait for expiry, or release explicitly:
query_items(operation="search", claimStatus="claimed")
# Identify stale claims by actor pattern
claim_item(releases=[{itemId: "<uuid>", actor: {...}}])
Likely cause: --base-ref doesn't resolve. If origin/main is your branch but the script can't reach origin (offline, no remote), the rev-list comparison fails and the worktree is preserved by safety policy.
Fix: ensure git fetch origin succeeds before the loop runs, or set --base-ref to a locally-resolvable ref.
gate-blocked and error outcomes always preserve their worktree. Over a multi-day deployment, these accumulate. Periodic cleanup is manual:
git worktree list | grep ralph-
# inspect, then
git worktree remove .claude/worktrees/ralph-<id>
Note that --no-cleanup does the opposite of what such deployments need: it preserves every worktree. There is currently no flag for "always clean up, including failure cases"; the assumption is that failures are worth keeping. If your deployment does not need preserved-for-debugging worktrees, schedule a separate cleanup task.
The fresh-context-per-iteration property is the entire point of Ralph. A slash-command loop running inside one Claude session accumulates context across iterations even with compaction — that's the variant Geoff Huntley publicly criticized. The script-driver pattern preserves fresh context cleanly: each iteration is a new OS process with no memory of previous ones.
Cross-platform: every Claude Code install has Node. Stdlib-only — no npm install, no plugin dependencies. The script uses node:util.parseArgs, node:child_process.spawn, and node:fs/promises. Targets Node 18+ to match Claude Code's own minimum.
Iterations run under their own output style passed via claude --settings. The default workflow-orchestrator output style is shaped for interactive orchestration — tier classification, delegation tables, plan-mode discipline, the workflow-analyst footer — all of which are wrong for a single-item per-iteration agent. The Ralph style suppresses that chrome and authoritatively encodes per-iteration rules: schema is the contract, no auto-memory writes, no further dispatch, RALPH_OUTCOME as the final message.
The end state of an iteration is determined by the item's schema, not by Ralph. A bug-fix schema's terminal phase might prescribe push and PR creation; an agent-observation schema's "done" might be filling a single note. The script doesn't assume a code-change workflow — it just runs iterations and captures outcomes. Workflow logic lives in the schema.
An agent inside claude -p doesn't directly control the parent process's exit code — that's Claude Code's harness. Structured stdout output is the cleanest channel: the script extracts the marker from the agent's final message via balanced-brace JSON scanning, parses it, and decides loop control. This also makes outcomes inspectable in iteration logs after the fact.
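Balanced-brace scanning can be sketched as follows. This is an illustrative implementation, not the script's own: `extractOutcome` is a hypothetical name, and it assumes the text passed in is the agent's final message already unwrapped from any JSON envelope produced by `--output-format json`.

```javascript
const MARKER = "RALPH_OUTCOME:";

// Finds the last marker, then scans forward for the matching closing brace,
// ignoring braces that appear inside JSON string literals.
function extractOutcome(text) {
  const markerAt = text.lastIndexOf(MARKER);
  if (markerAt === -1) return null;
  const start = text.indexOf("{", markerAt + MARKER.length);
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) {
        try {
          return JSON.parse(text.slice(start, i + 1));
        } catch {
          return null; // balanced braces but not valid JSON
        }
      }
    }
  }
  return null; // braces never balanced: marker was truncated
}
```

Scanning for balance rather than using a regular expression lets the payload contain nested objects and text after the closing brace. A `null` result corresponds to the missing-marker case covered in troubleshooting.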
The first end-to-end smoke test failed at claim_item because claude -p auto-denies MCP tool calls without a permission prompt. Bypass is currently the simplest path to autonomous operation given the four-layer risk-bounding (worktree, MCP ACL, budget, schema-scoped prompt). Deployments needing tighter control can swap in --allowed-tools with an explicit allowlist — the script's args array is the integration point.