Pipeline Plan 448

ezigus edited this page Apr 30, 2026 · 1 revision



Implementation Plan: Pipeline Cycling Halt (stuck_cycling)

Files to Modify

| File | Change |
| --- | --- |
| scripts/sw-pipeline.sh | Add count_consecutive_test_failures(), add cycling halt check in self_healing_build_test(), expose env var default, add stuck_cycling to status display |
| scripts/sw-pipeline-test.sh | Add unit test for counter function + E2E test for stuck_cycling exit |

Root Cause Analysis

What's happening: self_healing_build_test() runs N build→test cycles (BUILD_TEST_RETRIES=3 default). When it exhausts cycles and returns 1, external automation (daemon, autonomous pipeline) re-invokes the pipeline fresh — resetting STUCKNESS_COUNT, RESTART_COUNT, and EXTENSION_COUNT inside sw-loop.sh. The pipeline-state.md log persists but nothing reads it to count cumulative failures.

Why convergence detection doesn't save us: The same-error ×3 detector (consecutive_same_error) resets when the error signature changes (e.g., different timestamps, changed assertion counts). Plateau detection resets when prev_fail_count varies. Neither tracks failures across separate self_healing_build_test invocations.

Minimum viable fix: Add a persistent counter that reads pipeline-state.md log history before each build attempt. Since the log survives restarts, it catches cycling across invocations.
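The difference between an in-process counter and a count derived from a persisted log can be illustrated with a minimal sketch. Everything here is hypothetical: run_pipeline_once stands in for one fresh pipeline invocation, and the temp file stands in for pipeline-state.md.

```bash
#!/usr/bin/env bash
log_file=$(mktemp)
trap 'rm -f "$log_file"' EXIT

# Hypothetical stand-in for one pipeline invocation whose test stage fails.
run_pipeline_once() {
  # In-memory counter: starts from zero on every fresh invocation.
  local in_memory_failures=0
  in_memory_failures=$((in_memory_failures + 1))
  # The log entry outlives the invocation.
  echo "failed (synthetic)" >> "$log_file"
  echo "in_memory=$in_memory_failures persisted=$(grep -c '^failed' "$log_file")"
}

run_pipeline_once   # in_memory=1 persisted=1
run_pipeline_once   # in_memory=1 persisted=2
run_pipeline_once   # in_memory=1 persisted=3
```

The in-memory value never exceeds 1, so a threshold on it never trips, while the persisted count keeps growing across invocations.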


Alternatives Considered

Option A: Persistent state file counter (chosen). Parse ### test (ts)\nfailed (...) entries from pipeline-state.md before each build attempt. No new files; reuses the existing log grammar.

Trade-offs: + Survives daemon restarts (persistent); + No new file I/O path; + Counter resets naturally when test passes; − Requires parsing the state file on each cycle start (cheap: <1ms sequential read).

Option B: External counter file. Write a consecutive-test-failures.txt file in ARTIFACTS_DIR. Increment on test fail, reset on test pass.

Trade-offs: + Simple read/write; − Not automatically cleaned on fresh start; − New artifact that _cleanup_run_artifacts() would need to preserve; − Doesn't help if ARTIFACTS_DIR is wiped on restart.

Option C: Add counter to state file frontmatter. Add a consecutive_test_failures: N field to pipeline state.

Trade-offs: + Clean data model; − Requires modifying both initialize_state() and write_state() in pipeline-state.sh; − More surface area, higher blast radius.

Decision: Option A is the minimum viable change — no new files, no new state fields, leverages existing log format already parsed elsewhere (see resume_state() recovery logic at line 769).


Risk Analysis

| Risk | What Could Break | Mitigation |
| --- | --- | --- |
| Bash regex for BASH_REMATCH | Fails on bash 3.2 if pattern has groups | Pattern ^###[[:space:]]+([a-z_]+)[[:space:]]+ is POSIX ERE, works in bash 3.2+ |
| State file not yet written | First cycle, state file empty; count returns 0 | Guard: [[ -f "$state_file" ]] before reading |
| Review self-healing re-uses self_healing_build_test | Second call path at line 1758 also gets the check | Check fires on BOTH paths: correct behavior, no regressions |
| SW_PIPELINE_MAX_BUILD_RETRIES=0 disables check | Daemon running with explicit 0 will cycle indefinitely | This is the documented override; user opted in |
| stuck_cycling state blocks pipeline resume | Automated restart fails even with SW_PIPELINE_MAX_BUILD_RETRIES=0 | Resume state does NOT treat stuck_cycling as terminal; user overrides via env var and resume proceeds, where the check is bypassed |
| Log format change in future | Parser stops counting correctly | Parser uses same regex already in resume_state() at line 774, so it is stable |

Data Flow and Architecture

sw-pipeline.sh::self_healing_build_test()
 │
 ├─► [TOP OF WHILE LOOP] count_consecutive_test_failures($STATE_FILE)
 │     │
 │     └─► reads pipeline-state.md §## Log
 │         parses: ### test (ts)\ncomplete|failed
 │         returns: N (trailing consecutive "failed" count)
 │
 ├─ if N >= SW_PIPELINE_MAX_BUILD_RETRIES (and > 0):
 │     update_status("stuck_cycling", "build")
 │     log_stage("pipeline", "stuck_cycling: ...")
 │     emit_event("pipeline.stuck_cycling", ...)
 │     return 1
 │
 └─ else: run build → run test → mark result → loop
       │
       └─► mark_stage_failed("test")
           log_stage("test", "failed (...)")
           write_state()
           ← persists to pipeline-state.md

Counter reset mechanism: mark_stage_complete("test") calls log_stage("test", "complete (...)"). Parser sees complete → resets trailing count to 0. No explicit reset needed.
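The reset behavior can be checked in isolation with a small sketch. The helper name trailing_failures and the pass/fail tokens are illustrative only; they mirror the trailing-count idea rather than the real parser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: counts trailing "fail" outcomes in a sequence,
# dropping back to 0 whenever a "pass" appears.
trailing_failures() {
  local count=0 outcome
  for outcome in "$@"; do
    case "$outcome" in
      fail) count=$((count + 1)) ;;
      pass) count=0 ;;
    esac
  done
  echo "$count"
}

trailing_failures fail fail pass fail   # prints 1
trailing_failures pass fail fail fail   # prints 3
trailing_failures fail fail pass        # prints 0
```

Because only the tail of the sequence matters, a single successful test entry in the log is enough to clear any earlier run of failures.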


Implementation Steps

Step 1 — Add SW_PIPELINE_MAX_BUILD_RETRIES default near line 812 (where BUILD_TEST_RETRIES is set):

SW_PIPELINE_MAX_BUILD_RETRIES=${SW_PIPELINE_MAX_BUILD_RETRIES:-3}
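As a sketch of how the default resolves, the ${VAR:-3} expansion yields 3 when the variable is unset or empty and preserves an explicit 0. The helper name resolve_max_build_retries and the fallback for non-numeric values are assumptions added for illustration; the plan itself does not specify any validation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: resolve the cap from the environment.
resolve_max_build_retries() {
  local raw="${SW_PIPELINE_MAX_BUILD_RETRIES:-3}"
  case "$raw" in
    *[!0-9]*) echo 3 ;;        # assumption: non-numeric input falls back to 3
    *)        echo "$raw" ;;   # includes an explicit 0, which disables the cap
  esac
}

unset SW_PIPELINE_MAX_BUILD_RETRIES
resolve_max_build_retries                                    # prints 3
SW_PIPELINE_MAX_BUILD_RETRIES=0 resolve_max_build_retries    # prints 0
SW_PIPELINE_MAX_BUILD_RETRIES=abc resolve_max_build_retries  # prints 3
```

Validating the value up front would keep a later numeric comparison such as -gt from raising an error on malformed input.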

Step 2 — Add count_consecutive_test_failures() function in sw-pipeline.sh, immediately before self_healing_build_test() (around line 1422). The function must be Bash 3.2 compatible (no associative arrays, no ${var,,}):

count_consecutive_test_failures() {
  local state_file="${1:-${STATE_FILE:-}}"
  # No state file yet (first cycle): nothing has failed.
  [[ -z "$state_file" || ! -f "$state_file" ]] && echo 0 && return 0
  local in_log=0 current_stage="" outcomes="" line
  # The "|| [[ -n ... ]]" clause keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [[ -n "$line" ]]; do
    # Ignore everything before the "## Log" section.
    if [[ "$line" == "## Log" ]]; then in_log=1; continue; fi
    [[ "$in_log" -eq 0 ]] && continue
    # A "### <stage> (ts)" header starts a new log entry.
    if [[ "$line" =~ ^###[[:space:]]+([a-z_]+)[[:space:]]+ ]]; then
      current_stage="${BASH_REMATCH[1]}"; continue
    fi
    # Record the outcome line that follows a "test" header.
    if [[ "$current_stage" == "test" ]]; then
      if [[ "$line" =~ ^complete ]]; then
        outcomes="$outcomes pass"; current_stage=""
      elif [[ "$line" =~ ^failed ]]; then
        outcomes="$outcomes fail"; current_stage=""
      fi
    fi
  done < "$state_file"
  # Count trailing failures; any pass resets the count to 0.
  local count=0 word
  for word in $outcomes; do
    if [[ "$word" == "fail" ]]; then count=$((count + 1))
    elif [[ "$word" == "pass" ]]; then count=0; fi
  done
  echo "$count"
}

Step 3 — Add cycling halt check in self_healing_build_test() at the top of the while loop body (after cycle=$((cycle + 1)), before the build runs, ~line 1483):

# Outer cycling halt: persistent consecutive test failure cap
local _max_build_retries="${SW_PIPELINE_MAX_BUILD_RETRIES:-3}"
if [[ "$_max_build_retries" -gt 0 ]]; then
  local _consec_failures
  _consec_failures=$(count_consecutive_test_failures)
  if [[ "$_consec_failures" -ge "$_max_build_retries" ]]; then
    update_status "stuck_cycling" "build"
    log_stage "pipeline" "stuck_cycling: ${_consec_failures} consecutive test failures (cap=${_max_build_retries}). Override: SW_PIPELINE_MAX_BUILD_RETRIES=0"
    write_state
    error "Pipeline halted: ${_consec_failures} consecutive test failures reached cap of ${_max_build_retries}"
    warn "Override: SW_PIPELINE_MAX_BUILD_RETRIES=0 shipwright pipeline resume"
    emit_event "pipeline.stuck_cycling" \
      "issue=${ISSUE_NUMBER:-0}" \
      "consecutive_failures=${_consec_failures}" \
      "cap=${_max_build_retries}" || true
    return 1
  fi
fi
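The halt condition in Step 3 reduces to a two-part predicate: the cap must be positive, and the failure count must have reached it. The sketch below isolates that decision in a hypothetical should_halt helper (not part of the plan) so the boundary cases are easy to see.

```bash
#!/usr/bin/env bash
# Hypothetical predicate: exit status 0 means "halt", non-zero means "continue".
should_halt() {
  local failures="$1" cap="$2"
  [ "$cap" -gt 0 ] && [ "$failures" -ge "$cap" ]
}

# Each pair is "<consecutive failures> <cap>".
for pair in "0 3" "2 3" "3 3" "5 3" "9 0"; do
  set -- $pair
  if should_halt "$1" "$2"; then verdict="halt"; else verdict="continue"; fi
  echo "failures=$1 cap=$2 -> $verdict"
done
```

Only the "3 3" and "5 3" pairs report halt. The last pair shows that a cap of 0 never halts, whatever the failure count.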

Step 4 — Add stuck_cycling to status display in pipeline_status() around line 3407:

stuck_cycling) status_icon="${YELLOW}${RESET}" ;;

Step 5 — Add E2E and unit tests to sw-pipeline-test.sh:

Test A (test_count_consecutive_test_failures_parsing): Extract count_consecutive_test_failures into a temp script, feed synthetic state files with known failure patterns (0 failures, 3 failures, pass-then-2-fails, etc.), assert correct counts.

Test B (test_stuck_cycling_halts_after_max_build_retries): Full E2E. Set SW_PIPELINE_MAX_BUILD_RETRIES=2. Use a mock sw-loop that always commits, paired with a test command that always fails. Pre-seed the state file with 1 prior test failure. Run the pipeline and assert that stuck_cycling appears in the state file after the second test failure.
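A rough sketch of how Test A could be structured follows. To stay runnable on its own it writes a stand-in script containing a condensed copy of the Step 2 function instead of reading scripts/sw-pipeline.sh. The helper names make_state and check, the frontmatter lines, and the timestamps are all hypothetical.

```bash
#!/usr/bin/env bash
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Stand-in for scripts/sw-pipeline.sh; its last line is why we extract, not source.
cat > "$workdir/pipeline.sh" <<'EOF'
count_consecutive_test_failures() {
  local state_file="${1:-${STATE_FILE:-}}"
  [[ -z "$state_file" || ! -f "$state_file" ]] && echo 0 && return 0
  local in_log=0 current_stage="" outcomes="" line
  while IFS= read -r line; do
    if [[ "$line" == "## Log" ]]; then in_log=1; continue; fi
    [[ "$in_log" -eq 0 ]] && continue
    if [[ "$line" =~ ^###[[:space:]]+([a-z_]+)[[:space:]]+ ]]; then
      current_stage="${BASH_REMATCH[1]}"; continue
    fi
    if [[ "$current_stage" == "test" ]]; then
      if [[ "$line" =~ ^complete ]]; then outcomes="$outcomes pass"; current_stage=""
      elif [[ "$line" =~ ^failed ]]; then outcomes="$outcomes fail"; current_stage=""
      fi
    fi
  done < "$state_file"
  local count=0 word
  for word in $outcomes; do
    if [[ "$word" == "fail" ]]; then count=$((count + 1))
    elif [[ "$word" == "pass" ]]; then count=0; fi
  done
  echo "$count"
}
echo "pipeline main would run here"; exit 99
EOF

# Extract only the function: from its opening line to the first line that is exactly "}".
sed -n '/^count_consecutive_test_failures() {$/,/^}$/p' "$workdir/pipeline.sh" > "$workdir/fn.sh"
. "$workdir/fn.sh"

# Build a synthetic state file from a list of test outcomes (complete|failed).
make_state() {
  local file="$1" outcome; shift
  printf '%s\n' '---' 'status: running' '---' '' '## Log' > "$file"
  for outcome in "$@"; do
    printf '### test (2026-01-01T00:00:00Z)\n%s (synthetic)\n\n' "$outcome" >> "$file"
  done
}

failures=0
check() {
  local expected="$1" actual; shift
  make_state "$workdir/state.md" "$@"
  actual=$(count_consecutive_test_failures "$workdir/state.md")
  if [[ "$actual" == "$expected" ]]; then echo "ok: [$*] -> $actual"
  else echo "FAIL: [$*] expected $expected, got $actual"; failures=$((failures + 1)); fi
}

check 0                                        # no test entries
check 1 failed
check 3 failed failed failed
check 3 complete complete failed failed failed # passes first, then 3 failures
check 0 failed failed complete                 # pass after failures resets
echo "failures=$failures"
```

In the real test the sed range would point at scripts/sw-pipeline.sh, which assumes the function's opening line and closing brace both start in column 0.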

Step 6 — Register both tests in the main() tests array in sw-pipeline-test.sh.


Task Checklist

  • Task 1: Add SW_PIPELINE_MAX_BUILD_RETRIES default near line 812 in sw-pipeline.sh
  • Task 2: Implement count_consecutive_test_failures() function in sw-pipeline.sh before self_healing_build_test()
  • Task 3: Add cycling halt check inside self_healing_build_test() while loop, before each build attempt
  • Task 4: Emit pipeline.stuck_cycling event with consecutive count and cap
  • Task 5: Write diagnostic to state file via log_stage("pipeline", "stuck_cycling: ...")
  • Task 6: Add stuck_cycling case to pipeline_status() display
  • Task 7: Write unit test test_count_consecutive_test_failures_parsing — function extraction + synthetic state files
  • Task 8: Write E2E test test_stuck_cycling_halts_after_max_build_retries — mock always-failing test, pre-seeded state, verify halt
  • Task 9: Register both tests in main() tests array
  • Task 10: Run npm test and verify all tests pass

Testing Approach

Test Pyramid:

  • 1 unit test: extracts the parsing function, tests it against 5+ synthetic state files
  • 1 E2E test: runs real sw-pipeline.sh with mocked build/test environment

Unit test coverage targets:

  • Empty/missing state file → 0 (edge case)
  • State file with no test entries → 0
  • 1 failure → 1
  • 3 consecutive failures → 3
  • 2 passes then 3 failures → 3 (counter resets on pass)
  • Pass after failures → 0

E2E critical path: SW_PIPELINE_MAX_BUILD_RETRIES=2 + 1 pre-seeded failure + always-failing test → stuck_cycling status after 1 additional failure (total=2).


Definition of Done

  • After SW_PIPELINE_MAX_BUILD_RETRIES (default 3) consecutive test failures, pipeline exits with status: stuck_cycling in pipeline-state.md
  • A diagnostic log entry explains the halt and how to override
  • SW_PIPELINE_MAX_BUILD_RETRIES=0 disables the cap (loop runs unbounded)
  • Counter resets to 0 when any test stage succeeds
  • New unit test in sw-pipeline-test.sh validates the parsing function
  • New E2E test in sw-pipeline-test.sh validates the full stuck_cycling exit path
  • npm test passes with no regressions
  • Both the self_healing_build_test path (line 1959) and the self_healing_review_build_test path (line 1758) are covered (both call self_healing_build_test, so the check fires on both)
