Pipeline Design 305

ezigus edited this page Apr 4, 2026 · 1 revision

Architecture Decision Record

feat(ruflo): 01 of 08 - adapter detection, scaffolding, and MCP lifecycle management


Context

Shipwright is a shell-based orchestration framework where all new capabilities integrate as library modules. Ruflo is an optional MCP (Model Context Protocol) server that the pipeline should leverage when available, but the system must function identically when Ruflo is absent—no pre-installation, no breaking changes.

Constraints:

  • Shell-only integration (all other lib/ modules are bash)
  • Backwards compatible: existing installations without Ruflo must work unchanged
  • Detection must be reasonably fast (command -v before the slow npx fallback)
  • Error handling: detection/startup failure should disable Ruflo gracefully, not block the pipeline
  • Module pattern: must follow Shipwright's existing guard pattern (_MODULE_LOADED sentinel)

Decision

Implement a non-invasive, optional adapter module that detects Ruflo availability, manages MCP server lifecycle, and disables itself silently if Ruflo is unavailable or fails to start.

Core Design

  1. Detection Strategy (Two-Phase)

    • Fast path: command -v ruflo (local binary check, ~1ms)
    • Fallback: npx -y ruflo@latest mcp status (npm fallback, ~5-10s if needed)
    • Result: Sets RUFLO_AVAILABLE=true|false for rest of pipeline
  2. Lifecycle Integration

    • Init: Called once at run_pipeline() start → detects, starts MCP server in background, exports RUFLO_AVAILABLE
    • Cleanup: Called in cleanup_on_exit() trap → exports memory state, kills MCP PID
    • Memory import/export are stubs for Issue #2
  3. Error Handling (Fail-Open Pattern)

    • Circuit Breaker: ruflo_with_timeout() disables Ruflo for the remainder of the pipeline on timeout; other command errors are returned to the caller without disabling Ruflo
    • All guards: Source line, function calls, and cleanup use || true or || return 0 — zero pipeline breakage
    • Exported state: RUFLO_AVAILABLE is exported so subshells inherit the availability decision
  4. Integration Points (Minimal)

    • Source adapter module after other lib imports (~line 135 in sw-pipeline.sh)
    • Call ruflo_init inside run_pipeline() after audit_init (~line 1562)
    • Call ruflo_cleanup early in cleanup_on_exit() (~line 675)
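The two-phase detection described above could be sketched roughly as follows. This is a minimal illustration, not the shipped module: RUFLO_DETECTION_TIMEOUT is taken from the implementation plan, and bounding the npx call with an external timeout command is an assumption (that command may be missing, for example on macOS).

```bash
# Sketch only: two-phase detection with a safe default.
RUFLO_AVAILABLE=false
RUFLO_DETECTION_TIMEOUT="${RUFLO_DETECTION_TIMEOUT:-30}"

ruflo_detect() {
    RUFLO_AVAILABLE=false

    # Phase 1: local binary on PATH (shell builtin lookup, no process spawn).
    if command -v ruflo >/dev/null 2>&1; then
        RUFLO_AVAILABLE=true
    # Phase 2: npx fallback, attempted only when npx itself exists.
    elif command -v npx >/dev/null 2>&1; then
        if command -v timeout >/dev/null 2>&1; then
            # Assumption: `timeout` is present to bound the slow path.
            timeout "$RUFLO_DETECTION_TIMEOUT" \
                npx -y ruflo@latest mcp status >/dev/null 2>&1 \
                && RUFLO_AVAILABLE=true
        else
            npx -y ruflo@latest mcp status >/dev/null 2>&1 \
                && RUFLO_AVAILABLE=true
        fi
    fi

    export RUFLO_AVAILABLE
    # Exit status mirrors the flag: 0 when available, 1 otherwise.
    [ "$RUFLO_AVAILABLE" = "true" ]
}
```

Callers that must never fail can invoke it as `ruflo_detect || true` and then read RUFLO_AVAILABLE.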

Module Structure

scripts/lib/ruflo-adapter.sh:
 ├─ _RUFLO_ADAPTER_LOADED guard (prevent double-source)
 ├─ RUFLO_AVAILABLE=false (default, safe state)
 ├─ Fallback helpers (warn, info, emit_event for bootstrap)
 ├─ ruflo_detect() → two-phase detection, sets RUFLO_AVAILABLE
 ├─ ruflo_init() → start MCP server, export state
 ├─ ruflo_cleanup() → export memory, kill PID
 ├─ ruflo_with_timeout() → circuit-breaker for long operations
 ├─ ruflo_available() → boolean check for use in conditionals
 └─ Memory stubs (for Issue #2)
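As a rough illustration of the top of such a module, the sketch below shows a double-source guard, fallback helpers, and the boolean check. The helper bodies and message formats are assumptions for illustration; only the names (_RUFLO_ADAPTER_LOADED, warn, info, emit_event, ruflo_available) come from the structure above.

```bash
# Sketch only: module guard, bootstrap fallbacks, and the boolean check.
# When this file is sourced a second time, `return` stops it early.
if [ -n "${_RUFLO_ADAPTER_LOADED:-}" ]; then
    return 0 2>/dev/null || true
fi
_RUFLO_ADAPTER_LOADED=1

# Define helpers only if no shell function of that name exists yet.
# `declare -F` checks functions only; a plain `type info` could match
# the texinfo `info` binary and wrongly skip the fallback.
declare -F warn >/dev/null 2>&1 || warn() { printf 'WARN: %s\n' "$*" >&2; }
declare -F info >/dev/null 2>&1 || info() { printf 'INFO: %s\n' "$*" >&2; }
declare -F emit_event >/dev/null 2>&1 || emit_event() { :; }

# Safe default: Ruflo stays disabled until detection says otherwise.
RUFLO_AVAILABLE=false

ruflo_available() {
    [ "${RUFLO_AVAILABLE:-false}" = "true" ]
}
```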

Alternatives Considered

Alternative A: Blocking Detection (Fail-Closed)

Fail the pipeline if Ruflo is not available.

  • Pros: Clear dependency contract, prevents "Ruflo not running" surprises later
  • Cons: Breaks all existing installations without pre-install step, adds mandatory setup burden
  • Decision: Rejected — violates non-invasive principle. Ruflo is optional, not required.

Alternative B: Synchronous MCP Start (Wait for Readiness)

Block pipeline start until MCP server is fully ready and health-checks pass.

  • Pros: Guaranteed MCP availability before pipeline proceeds
  • Cons: Adds 2-5s overhead per pipeline run even when Ruflo is disabled or not installed
  • Decision: Rejected in favor of background start + health check:
    • MCP starts in background during pipeline stages
    • ruflo_available() can be called per-operation for health check
    • No blocking overhead for pipelines that don't use Ruflo features

Alternative C: NPX-First Detection (No Local Binary Check)

Always try npx first, ignore command -v fast path.

  • Pros: Always finds Ruflo if npm is available, simpler code
  • Cons: 5-10s timeout on every pipeline start even when Ruflo is locally installed
  • Decision: Rejected — two-phase detection (fast → fallback) optimizes common case (local install) while supporting npm-only environments.

Alternative D: Node.js Wrapper for Integration

Wrap Ruflo detection and startup in a Node.js script instead of bash.

  • Pros: Better npm/npx integration, easier error handling
  • Cons: Breaks shell-first convention, adds runtime dependency on Node.js in bash context, inconsistent with all other lib/ modules
  • Decision: Rejected — Shipwright is a bash-first system. Shell adapter matches the codebase paradigm.

Component Diagram

┌──────────────────────────────────────────────────────────────────┐
│ sw-pipeline.sh                                                   │
│ (orchestrator: sources libs, runs stages, manages lifecycle)     │
└────────────────┬─────────────────────────────────────────────────┘
                 │
                 │ source (with guards)
                 ├─ lib/github-api.sh
                 ├─ lib/github-api-v2.sh
                 └─ lib/ruflo-adapter.sh ◄─────────────────────────┐
                 │                                                 │
                 ├─ run_pipeline()                                 │
                 │   └─ ruflo_init()    ┐                          │
                 │       ├─ detect()    ├─ Sets RUFLO_AVAILABLE    │
                 │       └─ start MCP   │  Exports for subshells   │
                 │                                                 │
                 └─ cleanup_on_exit()                              │
                     └─ ruflo_cleanup()                            │
                         ├─ export memory (stub for Issue #2)      │
                         └─ kill MCP PID                           │
                                                                   │
┌──────────────────────────────────────────────────────────────────┤
│ scripts/lib/ruflo-adapter.sh                                     │
│ (adapter: detection, lifecycle, circuit-breaker)                 │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ COMPONENT: Detection Layer                                 │  │
│  │ ────────────────────────────────────────────────────────── │  │
│  │ ruflo_detect()                                             │  │
│  │   ├─ command -v ruflo (fast path, ~1ms)                    │  │
│  │   └─ else: npx -y ruflo@latest mcp status (slow, 5-10s)    │  │
│  │      → Sets RUFLO_AVAILABLE=true/false                     │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ COMPONENT: Lifecycle Management                            │  │
│  │ ────────────────────────────────────────────────────────── │  │
│  │ ruflo_init()                                               │  │
│  │   ├─ calls detect()                                        │  │
│  │   ├─ if available: start MCP in background (&)             │  │
│  │   ├─ sleep 2 (wait for readiness)                          │  │
│  │   ├─ save PID as RUFLO_MCP_PID                             │  │
│  │   └─ export RUFLO_AVAILABLE (for subshells)                │  │
│  │                                                            │  │
│  │ ruflo_cleanup()                                            │  │
│  │   ├─ check if available                                    │  │
│  │   ├─ export memory state (stub)                            │  │
│  │   └─ kill $RUFLO_MCP_PID (guarded with || true)            │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ COMPONENT: Circuit-Breaker & Utilities                     │  │
│  │ ────────────────────────────────────────────────────────── │  │
│  │ ruflo_with_timeout(timeout_s, ...cmd)                      │  │
│  │   ├─ runs command with timeout wrapper                     │  │
│  │   └─ on timeout: sets RUFLO_AVAILABLE=false (disables)     │  │
│  │                                                            │  │
│  │ ruflo_available()                                          │  │
│  │   └─ returns 0 (true) if available, 1 (false) else         │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ scripts/sw-ruflo-adapter-test.sh                                 │
│ (test harness: mock binaries, no real ruflo needed)              │
│  ├─ test 1: module guard prevents double-source                  │
│  ├─ test 2-3: detection with/without ruflo                       │
│  ├─ test 4-5: init with/without ruflo                            │
│  ├─ test 6-7: cleanup behavior                                   │
│  ├─ test 8: circuit-breaker on timeout                           │
│  ├─ test 9: ruflo_available() exit codes                         │
│  ├─ test 10: RUFLO_AVAILABLE exported to subshell                │
│  └─ test 11: compatible with set -euo pipefail                   │
└──────────────────────────────────────────────────────────────────┘

Interface Contracts

# ============================================================================
# PUBLIC INTERFACE: Ruflo Adapter Module
# ============================================================================
# DETECTION FUNCTION
# Detects if Ruflo is available (local binary or via npx)
# Idempotent: safe to call multiple times
# Side effect: Sets RUFLO_AVAILABLE (true|false)
# Exit code: 0 if available, 1 if not available
# 
# Usage:
#   ruflo_detect
#   if [[ $RUFLO_AVAILABLE == "true" ]]; then
#     # Ruflo is available
#   fi
#
# Errors: Always returns gracefully (no crash); sets RUFLO_AVAILABLE=false on failure
ruflo_detect() 
 -> Exit: 0 (RUFLO_AVAILABLE=true) | 1 (RUFLO_AVAILABLE=false)
 -> Side effect: export RUFLO_AVAILABLE
# INITIALIZATION FUNCTION
# Initializes Ruflo adapter: detects, starts MCP server, exports state
# Idempotent within a pipeline run (second call is no-op)
# Called once at pipeline start inside run_pipeline()
# Exit code: Always 0 (never fails the pipeline)
#
# Usage:
#   ruflo_init
#   # At this point: RUFLO_AVAILABLE is set, MCP is running in background (if available)
#   if ruflo_available; then
#     # Can now use Ruflo features
#   fi
#
# Errors: All errors caught and suppressed; RUFLO_AVAILABLE=false on any failure
ruflo_init()
 -> Exit: 0 (always; fail-safe)
 -> Side effects:
 - export RUFLO_AVAILABLE (true|false)
 - export RUFLO_MCP_PID (PID if started, empty if not available)
 - Background process: Ruflo MCP server (if available)
# CLEANUP FUNCTION
# Shuts down Ruflo: exports memory state, kills MCP process
# Called in cleanup_on_exit() trap
# Exit code: Always 0 (never fails cleanup)
#
# Usage:
#   cleanup_on_exit() {
#     ruflo_cleanup
#     # ... other cleanup tasks
#   }
#
# Errors: Gracefully handles missing PID, already-dead processes
ruflo_cleanup()
 -> Exit: 0 (always; fail-safe)
 -> Side effects:
 - Calls ruflo_export_memory() (stub for Issue #2)
 - Kills process $RUFLO_MCP_PID (safe with stale PID)
 - Clears RUFLO_MCP_PID
# TIMEOUT WRAPPER (Circuit-Breaker)
# Runs a command with timeout; disables Ruflo if timeout exceeded
# Used to protect long-running Ruflo operations from hanging the pipeline
# 
# Usage:
#   if ruflo_with_timeout 30 ruflo-command arg1 arg2; then
#     # Command succeeded
#   else
#     # Timeout or error: RUFLO_AVAILABLE now set to false (circuit broken)
#   fi
#
# Parameters:
#   $1: timeout in seconds (default: 30)
#   $@: command and arguments to run
#
# Side effect on timeout: Sets RUFLO_AVAILABLE=false, disables Ruflo for remainder of pipeline
#
# Exit code: 0 (success) | 1 (timeout or command error)
#
# Errors: Timeout sets RUFLO_AVAILABLE=false (disables); command errors propagate
ruflo_with_timeout(timeout_s [default 30], ...cmd)
 -> Exit: 0 (success) | 1 (timeout or error)
 -> Side effect on timeout: export RUFLO_AVAILABLE=false
# AVAILABILITY CHECK (Boolean)
# Checks if Ruflo is available and running
# Use in conditionals to guard Ruflo-dependent code
#
# Usage:
#   if ruflo_available; then
#     # Ruflo is available, safe to use
#   else
#     # Skip Ruflo features
#   fi
#
# Exit code: 0 (true, Ruflo available) | 1 (false, Ruflo unavailable)
# No side effects
ruflo_available()
 -> Exit: 0 (available) | 1 (unavailable)
 -> No side effects
# MEMORY EXPORT (Stub for Issue #2)
# Exports current memory state to Ruflo
# Currently a no-op stub; will be implemented in Issue #2
# Called in ruflo_cleanup() before MCP shutdown
#
# Future signature:
# ruflo_export_memory() -> Exit: 0 always, Side effect: saves state to Ruflo
#
# Current behavior: Always returns 0 (no-op)
ruflo_export_memory()
 -> Exit: 0 (always)
 -> Current: no-op stub
# MEMORY IMPORT (Stub for Issue #2)
# Loads previous memory state from Ruflo
# Currently a no-op stub; will be implemented in Issue #2
# Called in ruflo_init() after MCP startup
#
# Future signature:
# ruflo_import_memory() -> Exit: 0 always, Side effect: loads state from Ruflo
#
# Current behavior: Always returns 0 (no-op)
ruflo_import_memory()
 -> Exit: 0 (always)
 -> Current: no-op stub
# ============================================================================
# ERROR CONTRACTS
# ============================================================================
# Detection Errors (handled silently):
# - Ruflo binary not found → return 1, RUFLO_AVAILABLE=false
# - NPX command not found → return 1, RUFLO_AVAILABLE=false
# - NPX timeout (>30s) → return 1, RUFLO_AVAILABLE=false
# - stderr from detect → discarded (>&/dev/null)
# Init Errors (never crash pipeline):
# - detect() fails → silent no-op, continue with RUFLO_AVAILABLE=false
# - MCP start fails → log warning, continue with RUFLO_AVAILABLE=false
# - PID capture fails → continue without MCP running
# Cleanup Errors (never crash exit):
# - RUFLO_MCP_PID empty/stale → silent no-op (kill $PID 2>/dev/null)
# - kill fails → suppressed (|| true)
# - export memory fails → suppressed (|| true)
# Preconditions:
# - Module must be sourced before any function call
# - RUFLO_AVAILABLE must be exported (done by ruflo_init)
# Postconditions:
# - After ruflo_init(): RUFLO_AVAILABLE is set and exported (visible in subshells)
# - After ruflo_cleanup(): MCP process is dead, no orphaned processes
# - Pipeline continues even if all Ruflo operations fail
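
One way the circuit-breaker contract above might be realized is sketched below. It assumes a GNU coreutils timeout command, which exits with status 124 when the limit is hit; on systems without it the sketch simply runs the command unbounded. The body is illustrative and not taken from the module itself.

```bash
# Sketch only: timeout wrapper that trips the breaker on timeout.
RUFLO_AVAILABLE=true

ruflo_with_timeout() {
    local timeout_s="${1:-30}"
    [ "$#" -gt 0 ] && shift
    local rc=0

    if command -v timeout >/dev/null 2>&1; then
        timeout "$timeout_s" "$@" || rc=$?
    else
        # Assumption: without `timeout`, run the command unbounded.
        "$@" || rc=$?
    fi

    # GNU timeout reports an expired limit as exit status 124.
    if [ "$rc" -eq 124 ]; then
        RUFLO_AVAILABLE=false
        export RUFLO_AVAILABLE
        return 1
    fi

    # Ordinary command errors return 1 but leave the breaker closed.
    [ "$rc" -eq 0 ]
}
```

Because the `|| rc=$?` form captures the status inside an OR list, the function should not abort a caller running under `set -e`.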

Data Flow

PHASE 1: Pipeline Startup
──────────────────────────
sw-pipeline.sh (main)
  ↓
source lib/ruflo-adapter.sh (line ~135)
  ├─ Check: [[ -f ... ]] && source ... 2>/dev/null || true
  ├─ Load module: sets RUFLO_AVAILABLE=false (safe default)
  └─ Define functions: ruflo_detect, ruflo_init, ruflo_cleanup, etc.
  ↓
run_pipeline()
  ├─ ... other init steps ...
  ├─ audit_init()
  └─ ruflo_init()
      ├─ if type ruflo_init >/dev/null 2>&1 (guard check)
      ├─ ruflo_detect()
      │   ├─ Try: command -v ruflo >/dev/null (fast path)
      │   │   → Found: RUFLO_AVAILABLE=true, return 0
      │   │   → Not found: try fallback
      │   └─ Try: npx -y ruflo@latest mcp status (slow path)
      │       → Success: RUFLO_AVAILABLE=true, return 0
      │       → Timeout/Error: RUFLO_AVAILABLE=false, return 1
      │
      ├─ if [[ $RUFLO_AVAILABLE == "true" ]]
      │   ├─ Start MCP server in background: ruflo mcp &
      │   ├─ Capture PID: RUFLO_MCP_PID=$!
      │   └─ sleep 2 (wait for MCP readiness)
      │
      ├─ ruflo_import_memory() (stub for Issue #2)
      │   → Currently: return 0 (no-op)
      │
      └─ export RUFLO_AVAILABLE (make visible to subshells)
  ↓
  (pipeline continues with RUFLO_AVAILABLE set)

PHASE 2: Pipeline Execution (Stages)
──────────────────────────────────────
Each stage can call: if ruflo_available; then ... fi
  ├─ Check: ruflo_available() → exit 0 (true) or 1 (false)
  ├─ If available: Can use Ruflo features
  └─ If unavailable: Skip Ruflo, continue with fallback

PHASE 3: Timeout Protection (Circuit-Breaker)
──────────────────────────────────────────────
Anywhere in pipeline:
  ├─ ruflo_with_timeout 30 long-running-command
  ├─ If timeout exceeded:
  │   ├─ Kill command
  │   └─ Set RUFLO_AVAILABLE=false (disables Ruflo for rest of pipeline)
  └─ Else: Continue normally

PHASE 4: Cleanup (Exit Trap)
──────────────────────────────
cleanup_on_exit()
  ├─ ruflo_cleanup() (called early, before heartbeat stop)
  │   ├─ if [[ $RUFLO_AVAILABLE == "true" ]]
  │   ├─ ruflo_export_memory() (stub for Issue #2)
  │   │   → Currently: return 0 (no-op)
  │   └─ kill $RUFLO_MCP_PID (guarded: 2>/dev/null || true)
  │       → Safe even if PID is stale/empty
  │
  └─ ... other cleanup (heartbeat, tmpdir cleanup) ...
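
The startup and cleanup phases above could be sketched as a pair of functions like the ones below. This is an illustration under stated assumptions: detection is taken to have already set RUFLO_AVAILABLE, the memory stubs are omitted, and the `kill -0` liveness check after the sleep is an addition that the flow above does not specify.

```bash
# Sketch only: background start on init, guarded kill on cleanup.
RUFLO_AVAILABLE=false
RUFLO_MCP_PID=""

ruflo_init() {
    # Assumes detection has already populated RUFLO_AVAILABLE.
    if [ "$RUFLO_AVAILABLE" = "true" ]; then
        ruflo mcp >/dev/null 2>&1 &
        RUFLO_MCP_PID=$!
        sleep 2
        # Assumption: if the server already exited, fall back to disabled.
        if ! kill -0 "$RUFLO_MCP_PID" 2>/dev/null; then
            RUFLO_AVAILABLE=false
            RUFLO_MCP_PID=""
        fi
    fi
    export RUFLO_AVAILABLE RUFLO_MCP_PID
    return 0
}

ruflo_cleanup() {
    # An empty PID is skipped; a stale PID makes kill fail harmlessly.
    if [ -n "${RUFLO_MCP_PID:-}" ]; then
        kill "$RUFLO_MCP_PID" 2>/dev/null || true
        wait "$RUFLO_MCP_PID" 2>/dev/null || true
    fi
    RUFLO_MCP_PID=""
    return 0
}
```

Both functions always return 0, matching the fail-open contract: a failed start leaves the pipeline running with Ruflo disabled.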

Error Boundaries

Each component is listed with the errors it handles, its error mode, how errors propagate, and how it is tested.

Detection (ruflo_detect)
  • Errors it handles: Binary not found; NPX timeout; Command fails
  • Error mode: Silently set RUFLO_AVAILABLE=false
  • Propagation: Never propagates; returns 1
  • Testing: Test 2-3: mock absent/present ruflo

Init (ruflo_init)
  • Errors it handles: Detection fails; MCP start fails; PID capture fails
  • Error mode: Log warning, continue with disabled state
  • Propagation: Never blocks pipeline; || return 0 guard
  • Testing: Test 4-5: init with/without ruflo

Cleanup (ruflo_cleanup)
  • Errors it handles: RUFLO_MCP_PID stale/empty; Kill command fails; Memory export fails
  • Error mode: Suppress all errors with || true
  • Propagation: Never blocks exit trap
  • Testing: Test 6-7: cleanup behavior

Timeout Wrapper (ruflo_with_timeout)
  • Errors it handles: Timeout exceeded; Command error
  • Error mode: Set RUFLO_AVAILABLE=false on timeout (circuit-break), propagate command exit on error
  • Propagation: Timeout disables Ruflo; command errors bubble up
  • Testing: Test 8: circuit-breaker on timeout

Source Line (sw-pipeline.sh)
  • Errors it handles: File not found; Syntax error in module; Source fails
  • Error mode: Suppress with 2>/dev/null || true
  • Propagation: Never blocks pipeline startup
  • Testing: Test 11: pipefail safety

Function Calls (sw-pipeline.sh)
  • Errors it handles: Function not defined; Function exits non-zero
  • Error mode: Guarded with type ... >/dev/null 2>&1 + || true
  • Propagation: Never blocks pipeline
  • Testing: N/A (guards prevent error)

Subshell Export (RUFLO_AVAILABLE)
  • Errors it handles: Variable not inherited in subshell; Export timing issue
  • Error mode: Must be export (not just local) after ruflo_init
  • Propagation: Tested by spawning subshell in test 10
  • Testing: Test 10: export to subshell

Error Propagation Rules:

  • Detection errors → Silent disable (RUFLO_AVAILABLE=false)
  • Init errors → Silent disable (RUFLO_AVAILABLE=false)
  • Cleanup errors → Logged but never crash exit trap
  • Timeout errors → Circuit-break (disable Ruflo, log, continue)
  • Module source errors → Silent skip (module not loaded, RUFLO_AVAILABLE stays false)

Implementation Plan

Files to Create

1. scripts/lib/ruflo-adapter.sh (~180 lines)

  • Module guard: _RUFLO_ADAPTER_LOADED sentinel
  • Fallback helpers: warn(), info(), emit_event() for cases where helpers.sh isn't yet sourced
  • Constants: RUFLO_AVAILABLE=false (safe default), RUFLO_DETECTION_TIMEOUT=30
  • Core functions:
    • ruflo_detect() — Two-phase detection (command-v → npx fallback)
    • ruflo_init() — Start MCP, export state
    • ruflo_cleanup() — Export memory, kill MCP
    • ruflo_with_timeout() — Circuit-breaker wrapper
    • ruflo_available() — Boolean check
  • Stubs:
    • ruflo_import_memory() — No-op for Issue #2
    • ruflo_export_memory() — No-op for Issue #2

2. scripts/sw-ruflo-adapter-test.sh (~350 lines)

  • Test framework: Mock binaries in temp directory, colored output, pass/fail counters
  • Test 1: Module guard prevents double-sourcing
  • Test 2-3: Detection with/without ruflo binary present
  • Test 4-5: Init behavior with/without ruflo
  • Test 6-7: Cleanup lifecycle (kill PID, no-op when unavailable)
  • Test 8: Circuit-breaker (ruflo_with_timeout on timeout)
  • Test 9: ruflo_available() exit codes
  • Test 10: RUFLO_AVAILABLE exported and visible in subshell
  • Test 11: Compatible with set -euo pipefail (all functions return safely)
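
A mock-based harness of the kind listed above might start from something like the following sketch. The helper names (assert_eq, make_mock, detect_fast_path) are hypothetical, and detect_fast_path is only a stand-in for the fast path of ruflo_detect so the example runs without the real module.

```bash
# Sketch only: pass/fail counters plus a mock ruflo in a temp directory.
PASS=0
FAIL=0

assert_eq() {
    # usage: assert_eq <description> <expected> <actual>
    if [ "$2" = "$3" ]; then
        PASS=$((PASS + 1)); printf 'ok   %s\n' "$1"
    else
        FAIL=$((FAIL + 1)); printf 'FAIL %s (expected %s, got %s)\n' "$1" "$2" "$3"
    fi
}

MOCK_BIN="$(mktemp -d)"
trap 'rm -rf "$MOCK_BIN"' EXIT

make_mock() {
    # usage: make_mock <name> <exit-code>
    printf '#!/bin/sh\nexit %s\n' "$2" > "$MOCK_BIN/$1"
    chmod +x "$MOCK_BIN/$1"
}

detect_fast_path() {
    # The inner subshell keeps the PATH override out of the harness.
    if ( PATH="$MOCK_BIN"; command -v ruflo >/dev/null 2>&1 ); then
        echo true
    else
        echo false
    fi
}

assert_eq "absent ruflo is not detected" false "$(detect_fast_path)"
make_mock ruflo 0
assert_eq "mock ruflo is detected" true "$(detect_fast_path)"

printf '%s passed, %s failed\n' "$PASS" "$FAIL"
```

Restricting PATH to the mock directory means no real ruflo installation can influence the result.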

Files to Modify

3. scripts/sw-pipeline.sh (3 minimal edits)

Edit 1 (~line 135, after GitHub API modules):

# --- Ruflo Adapter (optional) ---
# shellcheck source=lib/ruflo-adapter.sh
[[ -f "$SCRIPT_DIR/lib/ruflo-adapter.sh" ]] && source "$SCRIPT_DIR/lib/ruflo-adapter.sh" 2>/dev/null || true

Edit 2 (~line 1562, inside run_pipeline(), after audit_init):

    # Initialize ruflo adapter (no-op if unavailable)
    if type ruflo_init >/dev/null 2>&1; then
        ruflo_init || true
    fi

Edit 3 (~line 675, inside cleanup_on_exit(), early in function before heartbeat stop):

    # Cleanup ruflo MCP server
    if type ruflo_cleanup >/dev/null 2>&1; then
        ruflo_cleanup || true
    fi
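
To illustrate why these three guards are fail-open, the sketch below runs them under `set -eu` with the adapter file deliberately absent. SCRIPT_DIR here is a throwaway stand-in directory rather than the real pipeline path, and the final echo only marks that execution reached the end.

```bash
# Sketch only: the three guards, exercised with no adapter file present.
SCRIPT_DIR="$(mktemp -d)"   # stand-in; contains no lib/ruflo-adapter.sh

result="$(
    set -eu
    # Edit 1: a missing file short-circuits to `|| true`.
    [ -f "$SCRIPT_DIR/lib/ruflo-adapter.sh" ] \
        && source "$SCRIPT_DIR/lib/ruflo-adapter.sh" 2>/dev/null || true

    # Edits 2 and 3: undefined functions are skipped by the type check.
    if type ruflo_init >/dev/null 2>&1; then ruflo_init || true; fi
    if type ruflo_cleanup >/dev/null 2>&1; then ruflo_cleanup || true; fi

    echo "pipeline continued"
)"

rmdir "$SCRIPT_DIR"
echo "$result"
```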

Dependencies

  • New external deps: None
  • Optional runtime deps: ruflo CLI (local binary) or npx + node (fallback)
  • Shipwright existing deps: Uses warn(), info(), emit_event() (fallback definitions provided in module)
  • Bash version: Bash 3.2+ compatible (no associative arrays, no readarray, etc.)

Risk Areas & Mitigations

Each risk is listed with its impact, mitigation, and verification.

Detection timeout (npx 5-10s)
  • Impact: Slows pipeline startup if ruflo not locally installed
  • Mitigation: Fast path (command -v) checked first; npx only fallback. Consider adding RUFLO_DETECTION_TIMEOUT=30 env var
  • Verification: Test 2-3: mock slow npx, verify fast path used first

MCP start fails
  • Impact: Ruflo unavailable mid-pipeline, unexpected behavior
  • Mitigation: All subsequent calls guarded with ruflo_available() check
  • Verification: Test 4-5: mock broken MCP, verify RUFLO_AVAILABLE=false

Stale MCP PID after crash
  • Impact: Orphaned process not cleaned up
  • Mitigation: PID-based cleanup with kill ... 2>/dev/null || true handles stale PIDs safely
  • Verification: Test 6: cleanup with stale PID

RUFLO_AVAILABLE not exported to subshells
  • Impact: Subshells see empty variable
  • Mitigation: Explicit export RUFLO_AVAILABLE in ruflo_init()
  • Verification: Test 10: spawn subshell, verify $RUFLO_AVAILABLE set

Source error crashes pipeline
  • Impact: Module syntax error breaks pipeline startup
  • Mitigation: Source guarded: 2>/dev/null || true suppresses any source failure
  • Verification: Test 11: source broken module, verify pipeline continues

set -euo pipefail incompatibility
  • Impact: Pipeline crashes on first error in module
  • Mitigation: All functions use || true or || return 0 guards
  • Verification: Test 11: run tests under set -euo pipefail

Timeout wrapper not available on macOS
  • Impact: timeout command may not exist
  • Mitigation: Use Shipwright's _timeout helper from helpers.sh if available, else fall back to timeout
  • Verification: Verify on macOS in testing
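
The last mitigation could be approached with a small dispatch function such as the sketch below. The name ruflo_run_bounded is hypothetical, the calling convention assumed for _timeout (seconds first, then the command) is a guess about helpers.sh, and the gtimeout branch is an extra fallback covering Homebrew's coreutils naming that the plan does not mention.

```bash
# Sketch only: pick whichever timeout mechanism exists.
ruflo_run_bounded() {
    # usage: ruflo_run_bounded <seconds> <command> [args...]
    local secs="$1"
    shift
    if type _timeout >/dev/null 2>&1; then
        # Assumption: _timeout takes seconds followed by the command.
        _timeout "$secs" "$@"
    elif command -v timeout >/dev/null 2>&1; then
        timeout "$secs" "$@"
    elif command -v gtimeout >/dev/null 2>&1; then
        gtimeout "$secs" "$@"
    else
        # No timeout mechanism found: run unbounded rather than fail.
        "$@"
    fi
}
```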

Validation Criteria

Unit Test Coverage (11 tests in sw-ruflo-adapter-test.sh)

  • Test 1: Module guard works — source twice, functions defined only once
  • Test 2: Detection with ruflo present (mock binary) → RUFLO_AVAILABLE=true
  • Test 3: Detection without ruflo (no binary) → RUFLO_AVAILABLE=false
  • Test 4: Init with ruflo available → starts MCP, sets RUFLO_AVAILABLE=true, exports PID
  • Test 5: Init without ruflo → silent no-op, RUFLO_AVAILABLE=false
  • Test 6: Cleanup with PID set → kills process, no errors
  • Test 7: Cleanup without PID → silent no-op, no errors
  • Test 8: Circuit-breaker timeout → ruflo_with_timeout 0.1 sleep 2 sets RUFLO_AVAILABLE=false
  • Test 9: ruflo_available() returns 0 when true, 1 when false
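  • Test 10: RUFLO_AVAILABLE exported to child processes → bash -c 'echo "$RUFLO_AVAILABLE"' shows correct value (a ( ... ) subshell also inherits unexported variables, so it cannot verify the export)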
  • Test 11: All functions safe under set -euo pipefail → no crashes
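
Test 10 depends on a detail of shell semantics: a subshell inherits every shell variable, exported or not, while a separately executed child process sees only exported ones. The sketch below demonstrates the difference; the variable names holding the results are illustrative.

```bash
# Sketch only: subshell vs. child-process visibility of a variable.
unset RUFLO_AVAILABLE          # also clears any inherited export attribute
RUFLO_AVAILABLE=true           # set, but not exported

# Command substitution runs in a subshell: the value is visible.
unexported_subshell="$(echo "${RUFLO_AVAILABLE:-}")"

# A child process only sees exported variables: the value is empty.
unexported_child="$(sh -c 'echo "${RUFLO_AVAILABLE:-}"')"

export RUFLO_AVAILABLE
exported_child="$(sh -c 'echo "${RUFLO_AVAILABLE:-}"')"

printf 'subshell=%s child_before=%s child_after=%s\n' \
    "$unexported_subshell" "$unexported_child" "$exported_child"
# prints: subshell=true child_before= child_after=true
```

An export test therefore needs a real child process, such as `sh -c` or `bash -c`, to distinguish an exported variable from a merely assigned one.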

Integration Tests

  • Pipeline loads adapter: sw-pipeline.sh sources adapter without error, RUFLO_AVAILABLE set before stages
  • Pipeline continues without ruflo: Full pipeline run with RUFLO_AVAILABLE=false produces identical output/behavior as before adapter existed
  • Cleanup called: Verify ruflo_cleanup() is called in exit trap (check logs or test with spy function)

Manual Acceptance Criteria

  • Detection works both ways:

    • With ruflo installed: source scripts/lib/ruflo-adapter.sh && ruflo_detect && [[ $RUFLO_AVAILABLE == true ]]
    • Without ruflo: source scripts/lib/ruflo-adapter.sh && ! ruflo_detect && [[ $RUFLO_AVAILABLE == false ]]
  • MCP lifecycle works (with mock):

    • ruflo_init starts background process with correct PID ✓
    • ruflo_cleanup kills process cleanly ✓
  • Circuit-breaker disables ruflo:

    • RUFLO_AVAILABLE=true; ! ruflo_with_timeout 0.1 sleep 2 && [[ $RUFLO_AVAILABLE == false ]]
  • Exported variable visible in subshells:

    • After ruflo_init(): bash -c '[[ -n $RUFLO_AVAILABLE ]] && echo EXPORTED || echo NOT_EXPORTED' → EXPORTED
  • No regressions:

    • npm test passes with no new failures ✓
    • Full pipeline run identical with/without adapter ✓

Definition of Done

All items must be verified:

  • Module Creation: scripts/lib/ruflo-adapter.sh written with all 7 functions (5 core + 2 stubs)
  • Test Creation: scripts/sw-ruflo-adapter-test.sh with 11 mock-based tests, all passing
  • Pipeline Integration: 3 edits to sw-pipeline.sh (source line, ruflo_init call, ruflo_cleanup call) — minimal, non-invasive
  • Detection Works: Fast path (command -v) tried before slow path (npx); both return correct exit codes
  • Lifecycle Works: MCP starts in background during init, stops during cleanup, no orphaned processes
  • Circuit-Breaker Works: Timeout in ruflo_with_timeout() sets RUFLO_AVAILABLE=false for remainder of pipeline
  • Export Works: RUFLO_AVAILABLE is exported and visible in subshells after ruflo_init()
  • Pipefail Safe: All functions compatible with set -euo pipefail — no crashes on errors
  • Backwards Compatible: Pipeline runs identically when ruflo is absent (RUFLO_AVAILABLE=false path)
  • Unit Tests Pass: bash scripts/sw-ruflo-adapter-test.sh → 11/11 tests pass
  • No Regressions: npm test passes with no new failures
  • Follows Conventions: Uses Shipwright module guard pattern, function naming, error handling guards

Systematic Debugging: Root Cause Analysis

Status: Design phase (no previous failure to debug). This section documents preemptive failure analysis for the implementation phase.

Potential Failure Modes & Preemption

Hypothesis 1: Detection hangs on npx fallback (High likelihood)

  • Evidence that would confirm: Pipeline startup adds 10+ seconds even with ruflo installed
  • Root cause if true: npx fallback runs even though fast path succeeded
  • Preemption: Always check fast path first; only run npx if command -v ruflo fails
  • Verification in testing: Test 2-3 uses mocks to verify fast path is tried first

Hypothesis 2: MCP not ready before stages run (Medium likelihood)

  • Evidence: Stages fail with "MCP not responding" even though RUFLO_AVAILABLE=true
  • Root cause: Background start doesn't guarantee readiness
  • Preemption: sleep 2 after start; ruflo_available() checks health before use
  • Verification: Test 4 mocks MCP startup and verifies sleep delay

Hypothesis 3: RUFLO_AVAILABLE not visible in subshells (Medium likelihood)

  • Evidence: Stages spawn subshells and see empty $RUFLO_AVAILABLE
  • Root cause: Variable not exported (only set locally)
  • Preemption: Explicit export RUFLO_AVAILABLE in ruflo_init()
  • Verification: Test 10 spawns subshell and checks variable visibility

Hypothesis 4: Cleanup crashes when PID is stale/empty (Low likelihood)

  • Evidence: Pipeline hang on exit due to kill command error
  • Root cause: kill $RUFLO_MCP_PID fails if PID doesn't exist
  • Preemption: Guard with kill ... 2>/dev/null || true
  • Verification: Test 6 cleans up with stale PID

Hypothesis 5: Module source breaks pipeline startup (Low likelihood)

  • Evidence: Pipeline fails to start when adapter file exists
  • Root cause: Syntax error in module, or source failure propagates
  • Preemption: Source guarded with 2>/dev/null || true, module guard with _RUFLO_ADAPTER_LOADED
  • Verification: Test 11 runs module under set -euo pipefail

Evidence Gathered

Artifacts reviewed:

  • Implementation plan (.claude/pipeline-artifacts/plan.md) — Valid: Design decisions are sound, all error boundaries covered, module pattern matches existing code
  • Component diagram in plan — Valid: Accurately represents integration points (3 edits to sw-pipeline.sh)
  • Interface contracts in plan — Valid: Specifies error modes (always return 0 from init/cleanup) and state (RUFLO_AVAILABLE exported)

Code review (hypothetical, pre-implementation):

  • Guard pattern in existing lib modules (e.g., helpers.sh) — Uses _HELPERS_LOADED sentinel ✓
  • Function naming convention (e.g., validate_json()) — Matches pattern, so ruflo_* naming is consistent ✓
  • Error handling pattern (e.g., type ... >/dev/null 2>&1 || true) — Matches proposed guards ✓

Fix Strategy

No fix needed at design phase — The plan is sound. Implementation should follow the 8 step checklist above. Preemptive testing (11 unit tests) will catch failure modes before they reach CI.

If implementation fails, use this recovery strategy:

  1. Check which test failed (e.g., Test 4: init with ruflo)
  2. Isolate the failure to a specific function (e.g., ruflo_detect vs sleep 2 readiness)
  3. Review the corresponding code against the interface contract
  4. Fix the code, NOT the test
  5. Re-run the specific test, then full suite

Verification Plan

Before build stage:

  • Unit tests: bash scripts/sw-ruflo-adapter-test.sh → all 11 pass
  • Module loads: bash -c 'source scripts/lib/ruflo-adapter.sh && echo LOADED' → success
  • Functions exist: bash -c 'source scripts/lib/ruflo-adapter.sh && type ruflo_init' → function

During build stage:

  • Pipeline runs with adapter: ./scripts/sw-pipeline.sh --dry-run (if available)
  • No regressions: npm test passes

After merge:

  • Verify in next pipeline run that MCP starts/stops cleanly
  • Monitor logs for timeout or PID errors

Document Date: 2026-04-04
Status: Ready for Implementation
Assigned: Design Review Complete
