Pipeline Design 448
Design: Pipeline re-enters build indefinitely after consecutive test-stage failures (no cycling halt)
`self_healing_build_test()` in `scripts/sw-pipeline.sh` runs up to `BUILD_TEST_RETRIES` (default 3) build→test cycles per invocation. When all cycles are exhausted and the function returns 1, external orchestration (the autonomous pipeline runner and daemon) re-invokes the pipeline from scratch, resetting all in-memory counters (`STUCKNESS_COUNT`, `RESTART_COUNT`, `EXTENSION_COUNT` in `sw-loop.sh`). The existing `pipeline-state.md` log persists across restarts, but nothing reads it to detect cumulative failure patterns.
Two existing convergence detectors fail here:
- Same-error detector (`consecutive_same_error`): resets when error signatures differ across cycles (timestamps, assertion counts change)
- Plateau detector: resets when `prev_fail_count` varies

Neither tracks failures across separate `self_healing_build_test` invocations. The state file log already records `### test (ts)\nfailed (...)` entries on every failure; this is the only durable signal that survives restarts.
Constraints:
- Bash 3.2 compatible (no `declare -A`, no `${var,,}`, no `readarray`)
- No new files: `_cleanup_run_artifacts()` must not need updating
- No new state schema fields: blast radius on `initialize_state()`/`write_state()` is too high
- Counter must reset automatically when a test stage passes (no explicit reset path)
- `SW_PIPELINE_MAX_BUILD_RETRIES=0` must be a valid escape hatch for automation

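
The Bash 3.2 constraint rules out three conveniences that later Bash versions offer. The sketch below shows one portable substitute for each; the helper names (`to_lower`, `read_lines`, `stage_order`) and the `LINES_OUT` variable are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Portable substitutes for features missing in Bash 3.2 (illustrative only).

# Instead of ${var,,}: lowercase through tr.
to_lower() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Instead of readarray/mapfile: fill an indexed array with a read loop.
# The "|| [[ -n ... ]]" keeps a final line that lacks a trailing newline.
read_lines() {
    LINES_OUT=()
    local line
    while IFS= read -r line || [[ -n "$line" ]]; do
        LINES_OUT+=("$line")
    done < "$1"
}

# Instead of declare -A: a case statement works as a small static map.
stage_order() {
    case "$1" in
        build) echo 1 ;;
        test)  echo 2 ;;
        *)     echo 0 ;;
    esac
}

to_lower "FAILED"      # prints: failed
stage_order "test"     # prints: 2
```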
Parse the existing `pipeline-state.md` log section at the top of each `self_healing_build_test()` while-loop iteration to count trailing consecutive test stage failures. If the count reaches `SW_PIPELINE_MAX_BUILD_RETRIES` (default: 3), set `status: stuck_cycling`, write a diagnostic log entry, emit a structured event, and return 1. No new files, no new state schema fields.
Key design choices:
- Counter lives in the log, not in a variable or a separate counter file, so it survives daemon restarts by construction
- Reset is implicit: `mark_stage_complete("test")` writes `complete` to the log; the parser sees `complete` and resets the trailing count to 0
- The check fires before each build attempt, so it catches cycling on the next entry after the cap is reached, not after N+1 failures
- Both call sites (`self_healing_build_test` at line ~1483 and `self_healing_review_build_test` at line ~1758) get the guard automatically since both invoke `self_healing_build_test`

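
The log-parsing approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: it assumes a `## Log` header, entry lines of the form `### <stage> (<timestamp>)`, and an outcome line that starts with `complete` or `failed`. It also keeps a running count rather than building an outcomes string, which yields the same trailing count.

```shell
#!/usr/bin/env bash
# Sketch: count trailing consecutive test-stage failures in a state file.
# Assumed log grammar: "## Log", then "### <stage> (<ts>)" followed by an
# outcome line beginning with "complete" or "failed".

count_consecutive_test_failures() {
    local state_file="${1:-${STATE_FILE:-}}"
    local header_re='^###[[:space:]]+([a-z_]+)[[:space:]]+'
    local in_log=0 current_stage="" count=0 line

    # Missing or unreadable file: report 0 instead of failing the caller.
    if [[ -z "$state_file" || ! -f "$state_file" || ! -r "$state_file" ]]; then
        echo 0
        return 0
    fi

    while IFS= read -r line || [[ -n "$line" ]]; do
        # Only lines inside the "## Log" section are considered.
        case "$line" in
            "## Log"*) in_log=1; continue ;;
            "## "*)    in_log=0; continue ;;
        esac
        [[ "$in_log" -eq 1 ]] || continue

        if [[ "$line" =~ $header_re ]]; then
            current_stage="${BASH_REMATCH[1]}"
        elif [[ "$current_stage" == "test" ]]; then
            case "$line" in
                failed*)   count=$((count + 1)); current_stage="" ;;
                complete*) count=0;              current_stage="" ;;  # implicit reset
            esac
        fi
    done < "$state_file"

    echo "$count"
    return 0
}

# Usage: two test failures separated by a successful build still count as 2,
# because only test-stage outcomes move the counter.
demo_state="$(mktemp)"
cat > "$demo_state" <<'EOF'
---
status: running
---
## Log
### build (2026-04-30T09:55:00Z)
complete
### test (2026-04-30T10:00:00Z)
failed (exit 1)
### build (2026-04-30T10:03:00Z)
complete
### test (2026-04-30T10:05:00Z)
failed (exit 1)
EOF
count_consecutive_test_failures "$demo_state"   # prints: 2
rm -f "$demo_state"
```

Because the count is derived from the file on every call, a fresh process after a daemon restart sees the same value as the process that wrote the entries.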
Alternatives considered:
- External counter file (`ARTIFACTS_DIR/consecutive-test-failures.txt`). Pros: trivially simple read/write. Cons: not automatically cleaned on fresh pipeline start; requires `_cleanup_run_artifacts()` to preserve it across restarts (contradicts its purpose); silently wrong if `ARTIFACTS_DIR` is wiped.
- State file frontmatter field (`consecutive_test_failures: N`). Pros: clean data model, first-class field. Cons: requires modifying `initialize_state()`, `write_state()`, and `resume_state()` in `pipeline-state.sh`; higher blast radius; field must be manually reset on test pass.
- In-memory counter passed as argument. Pros: no I/O. Cons: dies on any restart; doesn't solve the cross-invocation cycling problem at all.

| File | Change |
|---|---|
| `scripts/sw-pipeline.sh` | Add env default (~line 812), add `count_consecutive_test_failures()` before `self_healing_build_test()` (~line 1422), add cycling halt check inside while loop (~line 1483), add `stuck_cycling` to `pipeline_status()` display (~line 3407) |
| `scripts/sw-pipeline-test.sh` | Add `test_count_consecutive_test_failures_parsing` (unit), `test_stuck_cycling_halts_after_max_build_retries` (E2E), register both in `main()` |

None.
None — parser uses only Bash builtins and [[ =~ ]].
| Area | Risk | Mitigation |
|---|---|---|
| `BASH_REMATCH` regex | Pattern must be POSIX ERE for bash 3.2 | Use `^###[[:space:]]+([a-z_]+)[[:space:]]+` (validated POSIX ERE) |
| State file absent | First cycle: file doesn't exist yet | Guard: `[[ -z "$state_file" …` |
| Log section boundary | Parser must not count entries outside `## Log` | `in_log` flag gated on `## Log` header line |
| `resume_state()` compatibility | Parser uses same log grammar as line ~774 | Reuse identical regex, so no divergence risk |
| `stuck_cycling` blocking pipeline resume | Automated restart fails with env override | Resume does NOT treat `stuck_cycling` as terminal; `SW_PIPELINE_MAX_BUILD_RETRIES=0` bypasses check |

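
The regex mitigation can be illustrated with a small sketch. Keeping the pattern in a variable and leaving it unquoted on the right-hand side of `=~` is the form that behaves consistently on Bash 3.2 and later. The helper name `stage_from_header` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: extract the stage name from a log header line with a POSIX ERE.
STAGE_HEADER_RE='^###[[:space:]]+([a-z_]+)[[:space:]]+'

# Prints the stage name and returns 0 for a stage header line;
# returns 1 for anything else.
stage_from_header() {
    if [[ "$1" =~ $STAGE_HEADER_RE ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
        return 0
    fi
    return 1
}

stage_from_header "### test (2026-04-30T10:00:00Z)"         # prints: test
stage_from_header "### build_test (2026-04-30T10:00:00Z)"   # prints: build_test
stage_from_header "## Log"   || echo "not a stage header"
stage_from_header "### test" || echo "no match without trailing whitespace"
```

Note that the pattern requires whitespace after the stage name, so a bare `### test` line with no timestamp does not match.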

Component diagram:

    ┌─────────────────────────────────────────────────────────┐
    │                     sw-pipeline.sh                      │
    │                                                         │
    │  ┌──────────────────────────────────────────────────┐   │
    │  │  self_healing_build_test()                       │   │
    │  │                                                  │   │
    │  │  while [cycle < BUILD_TEST_RETRIES]:             │   │
    │  │    ┌──────────────────────┐                      │   │
    │  │    │ [NEW] cycling guard  │                      │   │
    │  │    │ count_consecutive_   │◄── reads ──┐         │   │
    │  │    │ test_failures()      │            │         │   │
    │  │    └──────┬───────────────┘            │         │   │
    │  │           │ N >= cap?                  │         │   │
    │  │           ├─ YES → stuck_cycling       │         │   │
    │  │           │         ↓                  │         │   │
    │  │           │   update_status()          │         │   │
    │  │           │   log_stage()              │         │   │
    │  │           │   emit_event()             │         │   │
    │  │           │   return 1                 │         │   │
    │  │           │                            │         │   │
    │  │           └─ NO → run build → run test │         │   │
    │  │                                  │     │         │   │
    │  │                    mark_stage_failed() │         │   │
    │  │                    write_state() ──────┘         │   │
    │  └─────────────────────────────────────────────┬────┘   │
    │                                                │        │
    │  ┌──────────────────────────────────────────┐  │        │
    │  │  [NEW] count_consecutive_test_           │  │        │
    │  │  failures(state_file)                    │──┘        │
    │  │                                          │           │
    │  │  reads:   pipeline-state.md §## Log      │           │
    │  │  parses:  ### test → complete|failed     │           │
    │  │  returns: N (trailing fail count)        │           │
    │  └──────────────────────────────────────────┘           │
    │                                                         │
    │  ┌──────────────────────────────────────────┐           │
    │  │  pipeline_status()                       │           │
    │  │  stuck_cycling → ⚠ yellow icon           │           │
    │  └──────────────────────────────────────────┘           │
    └─────────────────────────────────────────────────────────┘
                   ▼ persists to / reads from ▼
    ┌─────────────────────────────────────────────────────────┐
    │                    pipeline-state.md                    │
    │  ---                                                    │
    │  status: stuck_cycling                                  │
    │  ...                                                    │
    │  ## Log                                                 │
    │  ### test (2026-04-30T10:00:00Z)                        │
    │  failed (exit 1)                                        │
    │  ### test (2026-04-30T10:05:00Z)                        │
    │  failed (exit 1)        ← parser counts these           │
    │  ### test (2026-04-30T10:10:00Z)                        │
    │  failed (exit 1)                                        │
    └─────────────────────────────────────────────────────────┘


Interface contracts:

    # count_consecutive_test_failures
    # Input:  state_file (path), optional; defaults to ${STATE_FILE:-}
    # Output: integer N printed to stdout (0 if file absent/empty/no test entries)
    # Errors: none; always returns 0 on any read failure
    # Pre:    state_file may not exist (handled gracefully)
    # Post:   N = count of trailing consecutive "failed" test entries in ## Log
    #         N resets to 0 on any "complete" test entry
    count_consecutive_test_failures() { ... }
    # Returns: 0 always (stdout carries the count)

    # self_healing_build_test (modified)
    # Input:  (no change to existing signature)
    # New behavior: calls count_consecutive_test_failures() at top of while loop
    #               halts with status=stuck_cycling if count >= SW_PIPELINE_MAX_BUILD_RETRIES
    # Errors: returns 1 on stuck_cycling (same as existing exhaustion return)
    # Events emitted: pipeline.stuck_cycling { issue, consecutive_failures, cap }

    # Environment contract:
    #   SW_PIPELINE_MAX_BUILD_RETRIES (int, default 3)
    #   0 → guard disabled, loop runs unbounded
    #   N → halt after N consecutive test stage failures across all invocations

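
The environment contract can be sketched as two small helpers. Both names (`resolve_retry_cap`, `should_halt_cycling`) are hypothetical, and falling back to the default of 3 when the variable holds a non-numeric value is an assumption; the contract itself only specifies the default, the meaning of 0, and the meaning of N.

```shell
#!/usr/bin/env bash
# Sketch of the SW_PIPELINE_MAX_BUILD_RETRIES contract (illustrative only).

# Prints the effective cap. Unset, empty, or non-numeric values fall back to 3
# (the fallback for non-numeric input is an assumption of this sketch).
resolve_retry_cap() {
    local cap="${SW_PIPELINE_MAX_BUILD_RETRIES:-3}"
    case "$cap" in
        ''|*[!0-9]*) cap=3 ;;
    esac
    # Force base 10 so a value such as "08" is not parsed as octal.
    printf '%s\n' "$((10#$cap))"
}

# Returns 0 (halt) only when the guard is enabled (cap > 0)
# and the failure count has reached the cap.
should_halt_cycling() {
    local count="$1" cap="$2"
    [[ "$cap" -gt 0 && "$count" -ge "$cap" ]]
}

should_halt_cycling 3 3  && echo "halt"             # prints: halt
should_halt_cycling 2 3  || echo "continue"         # prints: continue
should_halt_cycling 99 0 || echo "guard disabled"   # prints: guard disabled
```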

Data flow:

    [daemon / autonomous runner]
            │
            ▼
    sw-pipeline.sh → self_healing_build_test()
            │
            ▼  (top of while loop, each iteration)
    count_consecutive_test_failures(pipeline-state.md)
            │
            ├─── reads §## Log section line-by-line
            │    tracks in_log flag, current_stage, outcomes string
            │    appends "pass" or "fail" per test entry
            │
            └─── returns N (trailing consecutive fail count)
                          │
           N < cap ───────┤───────── N >= cap (and cap > 0)
              │                          │
          run build              update_status("stuck_cycling")
          run test               log_stage("pipeline", "stuck_cycling: ...")
              │                  write_state() → pipeline-state.md
          test fails             emit_event("pipeline.stuck_cycling", ...)
              │                  error() + warn() to terminal
      mark_stage_failed("test")  return 1
      log_stage("test","failed")         │
      write_state()              [daemon sees return 1, does not re-invoke]
              │
        loop continues

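
The halt branch of this flow can be sketched with stubs standing in for the real helpers. The stub bodies, the argument shapes of `update_status`, `log_stage`, `write_state` and `emit_event`, the wrapper name `halt_stuck_cycling`, and the exact diagnostic wording are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Stubs that only echo what they were asked to do (the real helpers live in
# sw-pipeline.sh and are not reproduced here).
update_status() { echo "status: $1"; }
log_stage()     { echo "log: ### $1 -> $2"; }
write_state()   { echo "state written"; }
emit_event()    { echo "event: $*"; }

# Hypothetical wrapper for the N >= cap branch. The caller is expected to
# have checked the condition already; this only performs the halt steps.
halt_stuck_cycling() {
    local count="$1" cap="$2" issue="${3:-}"
    update_status "stuck_cycling"
    log_stage "pipeline" "stuck_cycling: $count consecutive test failures (cap $cap); set SW_PIPELINE_MAX_BUILD_RETRIES=0 to override"
    write_state
    # Assumption: a failed event emission should not change the halt result.
    emit_event "pipeline.stuck_cycling" "issue=$issue" "consecutive_failures=$count" "cap=$cap" || true
    return 1
}

# The non-zero return is what self_healing_build_test would pass to its caller.
halt_stuck_cycling 3 3 448 || echo "halted with exit status 1"
```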
| Component | Errors It Handles | Propagation |
|---|---|---|
| `count_consecutive_test_failures()` | Missing/unreadable state file, empty log section, malformed lines | Returns 0 (safe default), never propagates; caller always gets an integer |
| Cycling halt check | `_consec_failures >= cap` | Sets `stuck_cycling` status, calls `return 1`; propagates as normal build failure to caller |
| `pipeline_status()` | Unknown status values | `stuck_cycling` case added; unknown values fall through to existing default |
| `emit_event()` | Event emission failure | `…` |

- After exactly `SW_PIPELINE_MAX_BUILD_RETRIES` (default 3) consecutive test-stage failures logged in `pipeline-state.md`, pipeline exits with `status: stuck_cycling`
- `stuck_cycling` is present in `pipeline-state.md` after halt
- Diagnostic log entry in `## Log` section names failure count and override command
- `pipeline.stuck_cycling` event emitted with `consecutive_failures`, `cap`, `issue` fields
- `SW_PIPELINE_MAX_BUILD_RETRIES=0` disables the guard entirely; loop runs unbounded
- Counter resets to 0 after any `test` stage `complete` entry in the log
- `count_consecutive_test_failures` returns 0 for: missing file, empty file, no test entries, pass after failures
- `shipwright pipeline resume` with `SW_PIPELINE_MAX_BUILD_RETRIES=0` proceeds past `stuck_cycling` state
- `npm test` passes with no regressions

Unit tests: 6 — all in test_count_consecutive_test_failures_parsing
| Case | Input | Expected |
|---|---|---|
| Missing state file | `/dev/null/nonexistent` | 0 |
| No test entries in log | log with only build entries | 0 |
| Single failure | `### test\nfailed (exit 1)` | 1 |
| Three consecutive failures | 3× failed entries | 3 |
| Pass resets counter | 2 failures → 1 pass → 2 failures | 2 |
| Pass after failures | 3 failures → 1 pass | 0 |

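
The inputs in this table could be generated with a small fixture helper. The sketch below is hypothetical (`write_state_fixture` is not an existing function), and the frontmatter it writes is a placeholder rather than the real state schema. Each argument produces one test entry, in order, with a timestamp so that the header matches the stage regex.

```shell
#!/usr/bin/env bash
# Hypothetical fixture helper: writes a state file whose "## Log" section
# holds one test entry per argument ("pass" or anything else for a failure).
write_state_fixture() {
    local file="$1" outcome n=0
    shift
    {
        printf '%s\n' '---' 'status: running' '---' '## Log'
        for outcome in "$@"; do
            n=$((n + 1))
            printf '### test (2026-04-30T10:%02d:00Z)\n' "$n"
            case "$outcome" in
                pass) printf 'complete\n' ;;
                *)    printf 'failed (exit 1)\n' ;;
            esac
        done
    } > "$file"
}

# "Pass resets counter" row: 2 failures, 1 pass, 2 failures (expected count 2).
fixture="$(mktemp)"
write_state_fixture "$fixture" fail fail pass fail fail
cat "$fixture"
rm -f "$fixture"
```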
E2E tests: 1 — test_stuck_cycling_halts_after_max_build_retries
Setup: SW_PIPELINE_MAX_BUILD_RETRIES=2, mock sw-loop commits but exits 0, mock test always exits 1. Pre-seed state file with 1 prior test failure. Run pipeline. Assert stuck_cycling in state file after 1 additional failure (total=2 = cap).
Coverage targets:
- Parsing function: 100% branch coverage across the 6 unit cases above
- Cycling halt check: E2E covers the cap-reached path; existing tests cover the cap-not-reached path implicitly
- `pipeline_status()` display: covered by existing status display tests once the `stuck_cycling` case is added

Critical paths:
- Happy path: test passes on cycle 2 → counter resets → no halt
- Error case 1: test fails N times in one invocation → halt at cap
- Error case 2: test fails across multiple daemon re-invocations → halt at cap (cross-restart persistence)
- Edge case 1: `SW_PIPELINE_MAX_BUILD_RETRIES=0` → no halt, loop continues
- Edge case 2: state file absent on first cycle → count=0, loop proceeds normally