
Pipeline Plan 46

Seth Ford edited this page Feb 13, 2026 · 3 revisions


Implementation Plan: Dynamic Team Scaling (Issue #46)

Overview

Add mid-pipeline team scaling to Shipwright: spawn additional agents when a pipeline stage is struggling, and dismiss idle agents when the work is done. This builds on the existing daemon_auto_scale() infrastructure, which scales the number of pipelines, and extends it to scale the agents within a pipeline.

Key distinction: daemon_auto_scale() controls how many concurrent pipelines run. This feature controls how many agents work within a single pipeline's build stage.

Important: Previous Iterations Introduced Regressions

The 12 prior loop iterations on this branch made no progress on the actual feature and introduced regressions by removing safety code:

  • Removed unset CLAUDECODE (needed for spawning Claude CLI from within a session)
  • Removed trap '' HUP (prevents tmux detach from killing pipelines)
  • Reverted printf-based state writes to heredoc (vulnerable to delimiter injection)
  • Changed daemon_log from stderr to stdout (corrupts $() captures)
  • Removed CPU-based agent health detection (without it, agents that are still thinking get killed prematurely)
  • Removed nudge system, deterministic log rotation, overflow-safe backoff

These regressions must be reverted before any new code is added.
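To illustrate why one of these regressions matters, here is a minimal sketch of the stderr logging rule. The function bodies are hypothetical stand-ins, not the project's actual daemon_log; get_active_count is invented for the example.

```shell
# Hypothetical stand-ins for the project's real helpers.
daemon_log() {
    # Log lines go to stderr so they never mix with a function's stdout result.
    printf '[%s] %s\n' "$1" "$2" >&2
}

get_active_count() {
    daemon_log INFO "counting active pipelines"
    printf '%s\n' "3"
}

# Only the number is captured; the log line still reaches the terminal.
count=$(get_active_count)
printf 'active=%s\n' "$count"
```

If daemon_log wrote to stdout instead, the captured value would contain the log line as well as the number, and any arithmetic on it would fail.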


Files to Modify

New Files

  1. scripts/sw-scaling.sh (~400 lines) — Team scaling engine: trigger evaluation, spawn/dismiss orchestration, agent registry, cooldown, budget checks
  2. scripts/sw-scaling-test.sh (~600 lines) — Test suite

Modified Files

  1. scripts/sw-daemon.sh — Source scaling engine, add scaling_evaluate() call in health check loop, add scaling config to load_config()
  2. scripts/sw-loop.sh — Write scaling-signals.json after each iteration with structured progress data
  3. scripts/sw-pipeline.sh — Export scaling-relevant state from stage_build()
  4. templates/pipelines/autonomous.json — Add scaling config section
  5. templates/pipelines/full.json — Add scaling config section
  6. templates/pipelines/enterprise.json — Add scaling config section
  7. dashboard/server.ts — Handle pipeline.agent_spawned / pipeline.agent_dismissed events
  8. package.json — Register sw-scaling-test.sh
  9. .claude/CLAUDE.md — Document the scaling system

Implementation Steps

Step 1: Revert regressions (sw-daemon.sh, sw-pipeline.sh, sw-loop.sh)

Restore: unset CLAUDECODE, trap '' HUP, printf-based state writes, daemon_log stderr routing, CPU health detection, nudge system, DAEMON_LOG_WRITE_COUNT, overflow-safe backoff, exec in daemon_spawn_pipeline.

Step 2: Create sw-scaling.sh core engine

Functions: scaling_load_config(), scaling_evaluate(), scaling_check_triggers(), scaling_cooldown_ok(), scaling_budget_check(), scaling_get_active_agents()

Agent registry at ~/.shipwright/scaling/<issue>/agents.json. Triggers:

| Trigger             | Condition                    | Action           |
|---------------------|------------------------------|------------------|
| iteration_threshold | Build iter > 5, no test pass | Spawn debugger   |
| repeated_errors     | Same error 3+ times          | Spawn specialist |
| multi_module        | 5+ directories touched       | Spawn builder    |
| idle_dismiss        | No output for 5+ min         | Dismiss agent    |
| stage_complete      | Stage finished               | Dismiss extras   |
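The spawn triggers in the table could be evaluated roughly as follows. This is a sketch under assumptions: the argument order, the output strings, and the priority between triggers are all invented for the example, since the plan does not define the real signature of scaling_check_triggers().

```shell
# Sketch of spawn-trigger evaluation. Argument order, output format and
# trigger priority are assumptions made for this example.
# Usage: scaling_check_triggers ITERATION TESTS_PASSED SAME_ERROR_COUNT DIRS_TOUCHED
scaling_check_triggers() {
    local iteration="$1" tests_passed="$2" same_error_count="$3" dirs_touched="$4"

    if [ "$iteration" -gt 5 ] && [ "$tests_passed" != "true" ]; then
        echo "spawn:debugger"      # iteration_threshold
    elif [ "$same_error_count" -ge 3 ]; then
        echo "spawn:specialist"    # repeated_errors
    elif [ "$dirs_touched" -ge 5 ]; then
        echo "spawn:builder"       # multi_module
    else
        echo "none"
    fi
}
```

Returning a single action per evaluation keeps the caller simple and lets the cooldown decide when the next trigger may fire.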

Step 3: Implement scaling_spawn_agent()

Uses tmux split-window, sets pane title, writes context file, launches Claude, registers in agent registry, emits pipeline.agent_spawned event.
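A minimal sketch of the tmux part of this step, assuming the context file has already been written and that the caller supplies the command that launches the agent. The parameter order and the "agent-<role>" pane title are hypothetical; the registry update and event emission are left as a comment because their interfaces are not specified here.

```shell
# Sketch only. Parameter order, the role-based pane title and the agent
# command are assumptions for this example, not the project's interface.
# Usage: scaling_spawn_agent TARGET_PANE ROLE CONTEXT_FILE AGENT_CMD
scaling_spawn_agent() {
    local target="$1" role="$2" context_file="$3" agent_cmd="$4"
    local pane_id

    # Refuse to spawn an agent that would start without context.
    [ -s "$context_file" ] || { echo "missing context: $context_file" >&2; return 1; }

    # -d keeps focus on the current pane; -P -F prints the new pane's id.
    pane_id=$(tmux split-window -d -t "$target" -P -F '#{pane_id}' "$agent_cmd") || return 1

    # Title the pane so the agent is recognizable in the session.
    tmux select-pane -t "$pane_id" -T "agent-$role"

    # Registry update and the pipeline.agent_spawned event would follow here.
    printf '%s\n' "$pane_id"
}
```

Printing the pane id lets the caller store it in the agent registry and use it later as the dismissal target.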

Step 4: Implement scaling_dismiss_agent()

Sends C-c to pane, waits 30s, force-kills if needed, removes from registry, emits pipeline.agent_dismissed, re-tiles.
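The graceful-then-forced shutdown could look roughly like this. It is a sketch, assuming agents are tracked by tmux pane id; scaling_pane_alive is a hypothetical helper, and the optional timeout argument exists only so the 30s wait can be shortened in tests.

```shell
# Sketch only. The pane-id argument and the optional timeout are assumptions.
# Usage: scaling_dismiss_agent PANE_ID [TIMEOUT_SECONDS]
scaling_pane_alive() {
    # True while the pane id still appears in tmux's list of panes.
    tmux list-panes -a -F '#{pane_id}' 2>/dev/null | grep -qx -- "$1"
}

scaling_dismiss_agent() {
    local pane_id="$1" timeout="${2:-30}" waited=0

    # Ask the agent to stop, as Ctrl-C would.
    tmux send-keys -t "$pane_id" C-c

    # Poll once per second until the pane exits or the timeout is reached.
    while scaling_pane_alive "$pane_id" && [ "$waited" -lt "$timeout" ]; do
        sleep 1
        waited=$((waited + 1))
    done

    # Force-kill only if the graceful path did not finish in time.
    if scaling_pane_alive "$pane_id"; then
        tmux kill-pane -t "$pane_id"
    fi

    # Registry removal and the pipeline.agent_dismissed event would go here.
    tmux select-layout tiled >/dev/null 2>&1 || true
}
```

Polling instead of a fixed sleep means an agent that exits promptly is dismissed without waiting out the full timeout.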

Step 5: Implement scaling_build_context()

Assembles: pipeline state, recent errors, memory patterns, file assignments (non-overlapping), team task list.

Step 6: Integrate into daemon health check loop

Add scaling_evaluate() call after existing health check logic. Add config vars to load_config().

Step 7: Enhance sw-loop.sh progress reporting

Write scaling-signals.json after each iteration with iteration count, consecutive failures, error signature, modules touched, test pass/fail.
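A possible shape for this write, in line with the printf-based state writes restored in Step 1. The field names, the argument order, and the choice to record modules touched as a count are assumptions; the plan lists the data to record but not the JSON schema. The error signature is assumed to be a single line.

```shell
# Sketch only. Field names and argument order are assumptions.
# Usage: write_scaling_signals FILE ITERATION FAILURES ERROR_SIG MODULES TESTS_PASSED
write_scaling_signals() {
    local file="$1" iteration="$2" failures="$3" error_sig="$4" modules="$5" tests_passed="$6"
    local tmp="${file}.tmp.$$"

    # tests_passed is written unquoted, so accept only JSON booleans.
    case "$tests_passed" in
        true|false) ;;
        *) return 1 ;;
    esac

    # Escape backslashes and double quotes so the signature stays valid JSON.
    error_sig=$(printf '%s' "$error_sig" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')

    # Fixed format string, then an atomic rename so readers never see a
    # half-written file.
    printf '{"iteration":%d,"consecutive_failures":%d,"error_signature":"%s","modules_touched":%d,"tests_passed":%s}\n' \
        "$iteration" "$failures" "$error_sig" "$modules" "$tests_passed" > "$tmp" \
        && mv "$tmp" "$file"
}
```

Because the daemon reads this file from a separate process, writing to a temporary file and renaming it avoids the reader seeing partial JSON.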

Step 8: Add scaling config to pipeline templates

autonomous.json, full.json, enterprise.json get "scaling": { "enabled": true, "agents_max": 3, "cooldown_s": 120 }.
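The cooldown_s setting could gate scaling decisions along these lines. This is a sketch, assuming the time of the last scaling action is kept as epoch seconds in a plain file; the stamp file, scaling_mark_scaled, and the optional "now" arguments are hypothetical and exist mainly to make the logic testable.

```shell
# Sketch only. Storing the last scale time as epoch seconds in a plain file
# is an assumption made for this example.
# Usage: scaling_cooldown_ok STAMP_FILE [COOLDOWN_S] [NOW]
scaling_cooldown_ok() {
    local stamp_file="$1" cooldown_s="${2:-120}" now="${3:-$(date +%s)}"
    local last=0

    # A missing or empty stamp file means no scaling action has happened yet.
    if [ -f "$stamp_file" ]; then
        last=$(cat "$stamp_file")
    fi
    last=${last:-0}

    # Succeeds only when at least cooldown_s seconds have elapsed.
    [ $((now - last)) -ge "$cooldown_s" ]
}

# Usage: scaling_mark_scaled STAMP_FILE [NOW]
scaling_mark_scaled() {
    # Record the time of the scaling action.
    printf '%s\n' "${2:-$(date +%s)}" > "$1"
}
```

With the default of 120 seconds, a spawn at time t blocks further spawns or dismissals until t + 120, which is what prevents thrashing.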

Step 9: Dashboard event handling

Add pipeline.agent_spawned, pipeline.agent_dismissed, pipeline.team_scaled to event processing in dashboard/server.ts.

Step 10: Write test suite

15 tests following project harness pattern with mock tmux/claude/sw-cost.sh binaries.
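One common way to mock a binary such as tmux is to place a recording script first on PATH. The sketch below shows the idea; the MOCK_DIR and MOCK_LOG names and the PASS/FAIL output are assumptions, and the project's actual harness may arrange its mocks differently.

```shell
# Sketch only. The mock layout and variable names are assumptions.
MOCK_DIR=$(mktemp -d)
MOCK_LOG="$MOCK_DIR/calls.log"
export MOCK_LOG

# A fake tmux that records its arguments instead of touching a real session.
printf '%s\n' '#!/bin/sh' 'echo "tmux $*" >> "$MOCK_LOG"' > "$MOCK_DIR/tmux"
chmod +x "$MOCK_DIR/tmux"

# Put the mock first on PATH so the code under test picks it up.
PATH="$MOCK_DIR:$PATH"
export PATH

# The code under test would run here; this call stands in for it.
tmux send-keys -t %1 C-c

# Assert on what the mock recorded.
if grep -q '^tmux send-keys -t %1 C-c$' "$MOCK_LOG"; then
    echo "PASS: dismiss sends C-c"
else
    echo "FAIL: dismiss sends C-c"
fi
```

The same pattern extends to claude and sw-cost.sh: each mock records its invocation and prints whatever canned output the test case needs.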

Step 11: Register tests and update docs

Step 12: Run full test suite — all 23 suites pass


Task Checklist

  • Task 1: Revert regressions from previous iterations
  • Task 2: Create sw-scaling.sh with core scaling functions
  • Task 3: Implement scaling_spawn_agent() with tmux + context + registry
  • Task 4: Implement scaling_dismiss_agent() with graceful shutdown
  • Task 5: Implement scaling_build_context() for new agent context transfer
  • Task 6: Integrate scaling into daemon health check loop
  • Task 7: Add scaling-signals.json output to sw-loop.sh
  • Task 8: Add scaling config to pipeline templates
  • Task 9: Add dashboard event handling for scaling events
  • Task 10: Create sw-scaling-test.sh with 15 unit tests
  • Task 11: Register test suite in package.json, update CLAUDE.md
  • Task 12: Run full test suite — all 23 suites pass

Testing Approach

Unit tests: Mock tmux/claude/cost binaries; test each trigger, cooldown, budget, registry CRUD, context generation.

Integration: Run existing daemon + pipeline test suites to verify no regressions.

Manual: Start daemon with scaling.enabled: true, run pipeline with failing tests, verify agent spawns/dismisses, check events.jsonl + dashboard.

Definition of Done

  • sw-scaling.sh follows project conventions (pipefail, Bash 3.2, VERSION, emit_event)
  • Pipeline can add/dismiss agents mid-stage
  • New agents receive full context
  • Budget limits constrain scaling
  • 120s cooldown prevents thrashing
  • Scale events in events.jsonl and dashboard
  • All regressions reverted
  • 15+ scaling tests pass
  • Full npm test (23 suites) passes
