Pipeline Plan 448

ezigus edited this page Apr 30, 2026 · 1 revision


Implementation Plan: Pipeline Cycling Halt (`stuck_cycling`)

Files to Modify

| File | Change |
| --- | --- |
| `scripts/sw-pipeline.sh` | Add `count_consecutive_test_failures()`, add cycling halt check in `self_healing_build_test()`, expose env var default, add `stuck_cycling` to status display |
| `scripts/sw-pipeline-test.sh` | Add unit test for counter function + E2E test for `stuck_cycling` exit |

Root Cause Analysis

What's happening: self_healing_build_test() runs N build→test cycles (BUILD_TEST_RETRIES=3 default). When it exhausts cycles and returns 1, external automation (daemon, autonomous pipeline) re-invokes the pipeline fresh — resetting STUCKNESS_COUNT, RESTART_COUNT, and EXTENSION_COUNT inside sw-loop.sh. The pipeline-state.md log persists but nothing reads it to count cumulative failures.

Why convergence detection doesn't save us: The same-error ×3 detector (consecutive_same_error) resets when the error signature changes (e.g., different timestamps, changed assertion counts). Plateau detection resets when prev_fail_count varies. Neither tracks failures across separate self_healing_build_test invocations.

Minimum viable fix: Add a persistent counter that reads pipeline-state.md log history before each build attempt. Since the log survives restarts, it catches cycling across invocations.

Alternatives Considered

Option A — Persistent state file counter (chosen) Parse ### test (ts)\nfailed (...) entries from pipeline-state.md before each build attempt. No new files, reuses existing log grammar.

Trade-offs: + Survives daemon restarts (persistent); + No new file I/O path; + Counter resets naturally when test passes; − Requires parsing the state file on each cycle start (cheap: <1ms sequential read).
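To make the log grammar concrete, here is a minimal sketch that counts trailing test failures with awk. It is shown only to illustrate the `### test (ts)` / `failed (...)` grammar, not as the chosen implementation. The sample entries, their timestamps, and the helper name `trailing_test_failures_awk` are illustrative assumptions; a real pipeline-state.md may carry more content.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the trailing run of failed test entries
# in the "## Log" section of a state file.
trailing_test_failures_awk() {
    awk '
        /^## Log$/ { in_log = 1; next }
        !in_log { next }
        # Stage header such as "### test (ts)": remember the stage name.
        /^###[ \t]+[a-z_]+[ \t]+/ { stage = $2; next }
        stage == "test" && /^complete/ { count = 0; stage = ""; next }
        stage == "test" && /^failed/   { count++;   stage = ""; next }
        END { print count + 0 }
    ' "$1"
}

# Synthetic log (made-up timestamps and details).
sample_state="$(mktemp)"
cat > "$sample_state" <<'EOF'
## Log

### build (2026-04-30T10:00:00Z)
complete (12s)

### test (2026-04-30T10:01:00Z)
complete (40s)

### test (2026-04-30T10:05:00Z)
failed (3 assertions)

### test (2026-04-30T10:09:00Z)
failed (2 assertions)
EOF

trailing_test_failures_awk "$sample_state"   # prints 2
```

The earlier `complete` entry does not count against the cap because only the trailing run of failures matters.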

Option B — External counter file Write a consecutive-test-failures.txt file in ARTIFACTS_DIR. Increment on test fail, reset on test pass.

Trade-offs: + Simple read/write; − Not automatically cleaned on fresh start; − New artifact that _cleanup_run_artifacts() would need to preserve; − Doesn't help if ARTIFACTS_DIR is wiped on restart.
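For contrast, the rejected Option B could look roughly like the sketch below. The helper name `record_test_outcome` and its argument order are assumptions made for illustration; only the file name consecutive-test-failures.txt comes from the plan.

```shell
#!/usr/bin/env bash
# Hypothetical helper for Option B: keep the counter in its own file.
# Usage: record_test_outcome <counter_file> <pass|fail>
record_test_outcome() {
    local counter_file="$1" outcome="$2" current=0
    if [[ -f "$counter_file" ]]; then
        current="$(cat "$counter_file")"
    fi
    if [[ "$outcome" == "fail" ]]; then
        current=$((current + 1))   # increment on test failure
    else
        current=0                  # reset on test pass
    fi
    echo "$current" > "$counter_file"
}

demo_dir="$(mktemp -d)"            # stands in for ARTIFACTS_DIR
counter="$demo_dir/consecutive-test-failures.txt"
record_test_outcome "$counter" fail
record_test_outcome "$counter" fail
cat "$counter"                     # prints 2
```

The count lives only in that one file, so wiping the directory on restart silently resets it to zero. That is the weakness listed in the trade-offs.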

Option C — Add counter to state file frontmatter Add consecutive_test_failures: N field to pipeline state.

Trade-offs: + Clean data model; − Requires modifying both initialize_state() and write_state() in pipeline-state.sh; − More surface area, higher blast radius.

Decision: Option A is the minimum viable change — no new files, no new state fields, leverages existing log format already parsed elsewhere (see resume_state() recovery logic at line 769).

Risk Analysis

| Risk | What Could Break | Mitigation |
| --- | --- | --- |
| Bash regex for `BASH_REMATCH` | Fails on bash 3.2 if pattern has groups | Pattern `^###[[:space:]]+([a-z_]+)[[:space:]]+` is POSIX ERE and works in bash 3.2+ |
| State file not yet written | First cycle, state file missing or empty | Guard: `[[ -f "$state_file" ]]` before reading; count returns 0 |
| Review self-healing re-uses `self_healing_build_test` | Second call path at line 1758 also gets the check | Check fires on both paths; this is the correct behavior and causes no regressions |
| `SW_PIPELINE_MAX_BUILD_RETRIES=0` disables check | Daemon running with explicit 0 will cycle indefinitely | This is the documented override; the user opted in |
| `stuck_cycling` state blocks `pipeline resume` | Automated restart fails even with `SW_PIPELINE_MAX_BUILD_RETRIES=0` | Resume does not treat `stuck_cycling` as terminal: the user sets the env var to 0, resume proceeds, and the check is bypassed |
| Log format change in future | Parser stops counting correctly | Parser reuses the regex already used by `resume_state()` at line 774, which is stable |
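One common way to reduce the regex risk is to keep the pattern in a variable and expand it unquoted on the right-hand side of `=~`, since quoting that side makes bash (3.2 and later) match it as a literal string. The sketch below assumes that approach; `stage_header_re` and `parse_stage_header` are hypothetical names, not part of the plan.

```shell
#!/usr/bin/env bash
# The ERE from the risk table, stored once so it is never quoted at the match site.
stage_header_re='^###[[:space:]]+([a-z_]+)[[:space:]]+'

# Hypothetical helper: print the stage name if the line is a stage header.
parse_stage_header() {
    local line="$1"
    if [[ "$line" =~ $stage_header_re ]]; then
        echo "${BASH_REMATCH[1]}"   # first capture group = stage name
        return 0
    fi
    return 1
}

parse_stage_header "### test (2026-04-30T10:05:00Z)"                      # prints test
parse_stage_header "failed (3 assertions)" || echo "not a stage header"
```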

Data Flow and Architecture

sw-pipeline.sh::self_healing_build_test()
 │
 ├─► [TOP OF WHILE LOOP] count_consecutive_test_failures($STATE_FILE)
 │     │
 │     └─► reads pipeline-state.md §## Log
 │         parses: ### test (ts)\ncomplete|failed
 │         returns: N (trailing consecutive "failed" count)
 │
 ├─ if N >= SW_PIPELINE_MAX_BUILD_RETRIES (and > 0):
 │     update_status("stuck_cycling", "build")
 │     log_stage("pipeline", "stuck_cycling: ...")
 │     emit_event("pipeline.stuck_cycling", ...)
 │     return 1
 │
 └─ else: run build → run test → mark result → loop
       │
       └─► mark_stage_failed("test")
           log_stage("test", "failed (...)")
           write_state()
           ← persists to pipeline-state.md

Counter reset mechanism: mark_stage_complete("test") calls log_stage("test", "complete (...)"). Parser sees complete → resets trailing count to 0. No explicit reset needed.
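The reset rule can be seen in isolation by reducing an ordered list of outcomes to its trailing failure count. This is a minimal sketch; `trailing_failures` is a hypothetical name used only here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given outcomes in log order, print the length of the
# trailing run of "fail". Any "pass" resets the run to zero.
trailing_failures() {
    local count=0 outcome
    for outcome in "$@"; do
        case "$outcome" in
            fail) count=$((count + 1)) ;;
            pass) count=0 ;;
        esac
    done
    echo "$count"
}

trailing_failures fail fail pass fail        # prints 1
trailing_failures pass pass fail fail fail   # prints 3
trailing_failures fail fail pass             # prints 0
```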

Implementation Steps

Step 1 — Add SW_PIPELINE_MAX_BUILD_RETRIES default near line 812 (where BUILD_TEST_RETRIES is set):

SW_PIPELINE_MAX_BUILD_RETRIES=${SW_PIPELINE_MAX_BUILD_RETRIES:-3}
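Because the value is later compared with `-gt` and `-ge`, a non-numeric setting would make those comparisons error out. One possible guard is sketched below. Falling back to the default of 3 on bad input is an assumption, not something the plan specifies, and `resolve_max_build_retries` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a safe integer for SW_PIPELINE_MAX_BUILD_RETRIES.
resolve_max_build_retries() {
    local raw="${SW_PIPELINE_MAX_BUILD_RETRIES:-3}"
    case "$raw" in
        ''|*[!0-9]*) echo 3 ;;               # not a plain non-negative integer
        *)           echo "$((10#$raw))" ;;  # force base 10 so "08" is not read as octal
    esac
}

SW_PIPELINE_MAX_BUILD_RETRIES=abc resolve_max_build_retries   # prints 3
SW_PIPELINE_MAX_BUILD_RETRIES=0   resolve_max_build_retries   # prints 0
SW_PIPELINE_MAX_BUILD_RETRIES=08  resolve_max_build_retries   # prints 8
```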

Step 2 — Add count_consecutive_test_failures() function in sw-pipeline.sh, immediately before self_healing_build_test() (around line 1422). The function must be Bash 3.2 compatible (no associative arrays, no ${var,,}):

count_consecutive_test_failures() {
    local state_file="${1:-${STATE_FILE:-}}"
    # No state file yet (first cycle): nothing has failed.
    if [[ -z "$state_file" || ! -f "$state_file" ]]; then
        echo 0
        return 0
    fi

    local in_log=0 current_stage="" outcomes="" line
    # Walk the "## Log" section and record each test outcome in order.
    # The "|| [[ -n ... ]]" keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [[ -n "$line" ]]; do
        if [[ "$line" == "## Log" ]]; then in_log=1; continue; fi
        [[ "$in_log" -eq 0 ]] && continue
        # Stage header, e.g. "### test (ts)"
        if [[ "$line" =~ ^###[[:space:]]+([a-z_]+)[[:space:]]+ ]]; then
            current_stage="${BASH_REMATCH[1]}"; continue
        fi
        if [[ "$current_stage" == "test" ]]; then
            if [[ "$line" =~ ^complete ]]; then
                outcomes="$outcomes pass"; current_stage=""
            elif [[ "$line" =~ ^failed ]]; then
                outcomes="$outcomes fail"; current_stage=""
            fi
        fi
    done < "$state_file"

    # Count the trailing run of failures; any pass resets the run.
    local count=0 word
    for word in $outcomes; do
        if [[ "$word" == "fail" ]]; then count=$((count + 1))
        elif [[ "$word" == "pass" ]]; then count=0; fi
    done
    echo "$count"
}

Step 3 — Add cycling halt check in self_healing_build_test() at the top of the while loop body (after cycle=$((cycle + 1)), before the build runs, ~line 1483):

# Outer cycling halt: persistent consecutive test failure cap
local _max_build_retries="${SW_PIPELINE_MAX_BUILD_RETRIES:-3}"
if [[ "$_max_build_retries" -gt 0 ]]; then
    local _consec_failures
    _consec_failures=$(count_consecutive_test_failures)
    if [[ "$_consec_failures" -ge "$_max_build_retries" ]]; then
        update_status "stuck_cycling" "build"
        log_stage "pipeline" "stuck_cycling: ${_consec_failures} consecutive test failures (cap=${_max_build_retries}). Override: SW_PIPELINE_MAX_BUILD_RETRIES=0"
        write_state
        error "Pipeline halted: ${_consec_failures} consecutive test failures reached cap of ${_max_build_retries}"
        warn "Override: SW_PIPELINE_MAX_BUILD_RETRIES=0 shipwright pipeline resume"
        emit_event "pipeline.stuck_cycling" \
            "issue=${ISSUE_NUMBER:-0}" \
            "consecutive_failures=${_consec_failures}" \
            "cap=${_max_build_retries}" || true
        return 1
    fi
fi
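The decision logic of the halt check can be exercised on its own with stand-in helpers. In the sketch below, `check_cycling_halt`, the `FAKE_CONSEC_FAILURES` variable, and the stub bodies are all assumptions made for the demo; the real `update_status`, `log_stage`, and `count_consecutive_test_failures` live in the pipeline scripts.

```shell
#!/usr/bin/env bash
# Stubs standing in for the real pipeline helpers (demo only).
update_status() { echo "status=$1 stage=$2"; }
log_stage()     { echo "log[$1]: $2"; }
count_consecutive_test_failures() { echo "${FAKE_CONSEC_FAILURES:-0}"; }

# Hypothetical wrapper around the Step 3 logic: returns 1 when the cap is hit.
check_cycling_halt() {
    local max="${SW_PIPELINE_MAX_BUILD_RETRIES:-3}" consec
    [[ "$max" -gt 0 ]] || return 0        # a cap of 0 disables the check
    consec="$(count_consecutive_test_failures)"
    if [[ "$consec" -ge "$max" ]]; then
        update_status "stuck_cycling" "build"
        log_stage "pipeline" "stuck_cycling: ${consec} consecutive test failures (cap=${max})"
        return 1
    fi
    return 0
}

SW_PIPELINE_MAX_BUILD_RETRIES=3 FAKE_CONSEC_FAILURES=3 check_cycling_halt || echo "halted"
SW_PIPELINE_MAX_BUILD_RETRIES=3 FAKE_CONSEC_FAILURES=2 check_cycling_halt && echo "continue"
SW_PIPELINE_MAX_BUILD_RETRIES=0 FAKE_CONSEC_FAILURES=9 check_cycling_halt && echo "cap disabled"
```

The first call prints the two stub lines and then "halted"; the other two fall through without changing status.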

Step 4 — Add stuck_cycling to status display in pipeline_status() around line 3407:

stuck_cycling) status_icon="${YELLOW}⚠${RESET}" ;;

Step 5 — Add E2E and unit tests to sw-pipeline-test.sh:

Test A (test_count_consecutive_test_failures_parsing): Extract count_consecutive_test_failures into a temp script, feed synthetic state files with known failure patterns (0 failures, 3 failures, pass-then-2-fails, etc.), assert correct counts.

Test B (test_stuck_cycling_halts_after_max_build_retries): Full E2E. Set SW_PIPELINE_MAX_BUILD_RETRIES=2. Use a mock sw-loop that always commits, paired with a test command that always fails. Pre-seed the state file with 1 prior test failure. Run the pipeline; assert that stuck_cycling appears in the state file after the second test failure.

Step 6 — Register both tests in the main() tests array in sw-pipeline-test.sh.

Task Checklist

Task 1: Add SW_PIPELINE_MAX_BUILD_RETRIES default near line 812 in sw-pipeline.sh
Task 2: Implement count_consecutive_test_failures() function in sw-pipeline.sh before self_healing_build_test()
Task 3: Add cycling halt check inside self_healing_build_test() while loop, before each build attempt
Task 4: Emit pipeline.stuck_cycling event with consecutive count and cap
Task 5: Write diagnostic to state file via log_stage("pipeline", "stuck_cycling: ...")
Task 6: Add stuck_cycling case to pipeline_status() display
Task 7: Write unit test test_count_consecutive_test_failures_parsing — function extraction + synthetic state files
Task 8: Write E2E test test_stuck_cycling_halts_after_max_build_retries — mock always-failing test, pre-seeded state, verify halt
Task 9: Register both tests in main() tests array
Task 10: Run npm test and verify all tests pass

Testing Approach

Test Pyramid:

1 unit test: extracts the parsing function, tests it against 5+ synthetic state files
1 E2E test: runs real sw-pipeline.sh with mocked build/test environment

Unit test coverage targets:

Empty/missing state file → 0 (edge case)
State file with no test entries → 0
1 failure → 1
3 consecutive failures → 3
2 passes then 3 failures → 3 (counter resets on pass)
Pass after failures → 0
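The unit test needs the function without running the rest of the pipeline script. One way to extract it is a sed address range, sketched below against a stand-in script. `extract_function` is a hypothetical helper, and it assumes the function opens with `name() {` at column 0 and closes with a `}` on its own line at column 0.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one top-level function definition from a script.
extract_function() {
    local name="$1" script="$2"
    sed -n "/^${name}() {/,/^}/p" "$script"
}

# Stand-in script (not the real sw-pipeline.sh) with a main that must not run.
demo_script="$(mktemp)"
cat > "$demo_script" <<'EOF'
#!/usr/bin/env bash
main() {
    echo "would run the whole pipeline"
}
double_it() {
    echo $(( $1 * 2 ))
}
main "$@"
EOF

extracted="$(mktemp)"
extract_function double_it "$demo_script" > "$extracted"
source "$extracted"   # defines double_it only; main is never invoked
double_it 21          # prints 42
```

With the function sourced this way, each coverage target becomes one synthetic state file plus one equality assertion on the printed count.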

E2E critical path: SW_PIPELINE_MAX_BUILD_RETRIES=2 + 1 pre-seeded failure + always-failing test → stuck_cycling status after 1 additional failure (total=2).
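Pre-seeding for the E2E test could be done with a small fixture writer like the sketch below. `seed_test_failures` is a hypothetical name, the `seeded-N` labels stand in for timestamps, and whether the real pipeline accepts a state file containing only a `## Log` section is an assumption that would need checking.

```shell
#!/usr/bin/env bash
# Hypothetical fixture helper: write a state file whose log holds N failed test entries.
seed_test_failures() {
    local state_file="$1" n="$2" i=0
    printf '## Log\n' > "$state_file"
    while [[ "$i" -lt "$n" ]]; do
        i=$((i + 1))
        printf '\n### test (seeded-%d)\nfailed (seeded failure %d)\n' "$i" "$i" >> "$state_file"
    done
}

state_file="$(mktemp)"
seed_test_failures "$state_file" 1

cap=2
seeded="$(grep -c '^failed' "$state_file")"
echo "additional failures needed to reach cap: $((cap - seeded))"   # prints 1
```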

Definition of Done

After SW_PIPELINE_MAX_BUILD_RETRIES (default 3) consecutive test failures, pipeline exits with status: stuck_cycling in pipeline-state.md
A diagnostic log entry explains the halt and how to override
SW_PIPELINE_MAX_BUILD_RETRIES=0 disables the cap (loop runs unbounded)
Counter resets to 0 when any test stage succeeds
New unit test in sw-pipeline-test.sh validates the parsing function
New E2E test in sw-pipeline-test.sh validates the full stuck_cycling exit path
npm test passes with no regressions
Both the self_healing_build_test path (line 1959) and the self_healing_review_build_test path (line 1758) are covered (both call self_healing_build_test, so the check fires on both)
