Pipeline Plan 172
The core extraction work is already complete (commit 1d85a6a). sw-pipeline.sh was reduced from 3,171 → 708 lines (78% reduction). Both target modules exist:
- scripts/lib/pipeline-state.sh (612 lines): state read/write/validate
- scripts/lib/pipeline-stage-executor.sh (645 lines): stage execution with hooks
Remaining work: Test coverage gaps need closing to meet the >80% criterion.
┌─────────────────────────────────────────────────────┐
│ sw-pipeline.sh (708 lines) │
│ CLI dispatch, sourcing, signal setup │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌───────────┐ │
│ │ commands │ │ orchestration│ │ stages-* │ │
│ │ (836 ln) │→ │ (813 ln) │→ │ (various) │ │
│ └─────────────┘ └──────────────┘ └───────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────────────┐ │
│ │ pipeline-stage-executor.sh │ ← TARGET MODULE │
│ │ (645 lines) │ │
│ │ • classify_error() │ │
│ │ • run_stage_with_retry() │ │
│ │ • self_healing_build_test()│ │
│ │ • self_healing_review_*() │ │
│ └──────────────┬──────────────┘ │
│ ▼ │
│ ┌──────────────────────────────┐ │
│ │ pipeline-state.sh │ ← TARGET MODULE │
│ │ (612 lines) │ │
│ │ • save_artifact() │ │
│ │ • get/set_stage_status() │ │
│ │ • mark_stage_complete() │ │
│ │ • mark_stage_failed() │ │
│ │ • write_state() │ │
│ │ • resume_state() │ │
│ │ • initialize_state() │ │
│ └──────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
Dependency direction: orchestration → executor → state (inward only, no cycles).
// Artifact persistence
save_artifact(name: string, content: string): void  // writes to ARTIFACTS_DIR

// Stage status CRUD
get_stage_status(stage_id: string): string  // "pending"|"running"|"complete"|"failed"|""
set_stage_status(stage_id: string, status: string): void

// Timing
record_stage_start(stage_id: string): void
record_stage_end(stage_id: string): void
get_stage_timing(stage_id: string): string  // formatted "1m30s"
get_stage_timing_seconds(stage_id: string): number  // raw seconds, 0 if unknown
get_slowest_stage(): string  // stage_id or ""

// Descriptions & progress
get_stage_description(stage_id: string): string
build_stage_progress(): string  // "intake:complete plan:running test:pending"

// State transitions (side effects: writes state, emits events, updates GitHub)
update_status(status: string, stage: string): void
mark_stage_complete(stage_id: string): void  // Error: event emit failure (non-fatal)
mark_stage_failed(stage_id: string): void  // Error: event emit failure (non-fatal)

// Persistence
initialize_state(): void  // Resets all state, calls write_state()
write_state(): void  // Error: disk space check failure → return 1
resume_state(): void  // Error: missing state file → exit 1, missing goal → exit 1

// Validation
verify_stage_artifacts(stage_id: string): boolean  // 0=ok, 1=missing artifacts
persist_artifacts(stage: string, ...files: string[]): void  // CI-only, non-fatal

// Meta-cognition
record_stage_effectiveness(stage_id: string, outcome: string): void
get_stage_self_awareness_hint(stage_id: string): string

// Logging
log_stage(stage_id: string, message: string): void
// Error classification
classify_error(stage_id: string): "infrastructure"|"configuration"|"logic"|"unknown"

// Execution
run_stage_with_retry(stage_id: string): boolean  // 0=success, 1=failure
  // Error: configuration errors → immediate failure (no retry)
  // Error: repeated logic errors → immediate failure

// Self-healing loops
self_healing_build_test(): boolean  // 0=tests pass, 1=exhausted
  // Error: infrastructure error (simulator not found) → immediate exit
  // Error: convergence stuck (same error 3x) → early exit
  // Error: plateau (no progress 2x) → early exit
self_healing_review_build_test(): boolean  // 0=review passes, 1=exhausted
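To make the classification contract concrete, here is a minimal sketch of a log-based classifier. It is an assumption-heavy stand-in, not the real classify_error: the name classify_error_demo is hypothetical, it takes a log file path rather than a stage id, and it recognizes only the two patterns this plan mentions ("simulator not found" and "MODULE_NOT_FOUND"). The "logic" category is deliberately left out because the plan does not say how it is detected.

```bash
# Hypothetical classifier: the name, the log-path argument, and the matching
# rules are assumptions. The real function takes a stage id.
classify_error_demo() {
  local log_file=$1
  if [ ! -r "$log_file" ]; then
    echo "unknown"                      # no readable log to inspect
  elif grep -qi "simulator not found" "$log_file"; then
    echo "infrastructure"               # environment problem, not a code problem
  elif grep -q "MODULE_NOT_FOUND" "$log_file"; then
    echo "configuration"                # retrying would not help
  else
    echo "unknown"
  fi
}
```

A caller could branch on the printed category, for example skipping retries when the result is "configuration".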
CLI args → parse_args → pipeline_start()
→ initialize_state() [state module]
→ run_pipeline() [orchestration]
→ for each stage:
→ record_stage_start() [state]
→ run_stage_with_retry() [executor]
→ stage_<id>() [stage modules]
→ classify_error() on failure [executor]
→ mark_stage_complete/failed() [state]
→ write_state() → STATE_FILE
→ emit_event() → events.jsonl
→ gh_update_progress() → GitHub
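The flow above can be sketched as a small loop. Only the function names come from this plan. Every body is a stub, the STAGES list and PROGRESS variable are illustrative, and stopping at the first failed stage is an assumption about the orchestration.

```bash
# Hypothetical sketch of the per-stage flow; every function body is a stub.
STAGES="intake plan build"   # illustrative stage list
PROGRESS=""                  # accumulates "id:status" pairs for inspection

record_stage_start()   { :; }   # state module (stub)
write_state()          { :; }   # state module (stub); the real one writes STATE_FILE
mark_stage_complete()  { PROGRESS="${PROGRESS:+$PROGRESS }$1:complete"; write_state; }
mark_stage_failed()    { PROGRESS="${PROGRESS:+$PROGRESS }$1:failed"; write_state; }
run_stage_with_retry() { "stage_$1"; }   # executor (stub); no retry or classification

stage_intake() { return 0; }
stage_plan()   { return 0; }
stage_build()  { return 1; }    # simulated failure

run_pipeline() {
  local id
  for id in $STAGES; do
    record_stage_start "$id"
    if run_stage_with_retry "$id"; then
      mark_stage_complete "$id"
    else
      mark_stage_failed "$id"
      return 1                  # assumed: stop at the first failed stage
    fi
  done
}
```

The point of the sketch is the call direction: orchestration calls the executor, and only the mark_stage_* functions touch state.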
| Component | Handles | Propagates |
|---|---|---|
| state | Disk space (write_state), missing artifacts (verify), CI push failures (persist_artifacts) | Missing state file → exit 1 |
| executor | Infrastructure/config/logic classification, convergence detection, retry decisions | Stage failure → return 1 to orchestration |
| orchestration | Stage sequencing, gate enforcement, self-healing loop control | Pipeline failure → write final state, exit |
| Function | Tested | Lines | Priority |
|---|---|---|---|
| get_slowest_stage | No | 15 | Medium |
| build_stage_progress | No | 20 | Medium |
| update_status | No | 5 | Low (simple) |
| write_state | No | 68 | High (core persistence) |
| resume_state | No | 42 | High (crash recovery) |
| mark_stage_complete | No | 94 | High (many side effects) |
| mark_stage_failed | No | 68 | High (many side effects) |
| Function | Tested | Lines | Priority |
|---|---|---|---|
| self_healing_build_test | No | 287 | High (core loop) |
| self_healing_review_build_test | No | 69 | Medium |
| run_stage_with_retry (edge cases) | Partial | 157 | Medium |
- Pros: Smaller units, easier to test
- Cons: More files, more source calls, higher complexity for bash
- Decision: Rejected. The current module boundaries are clean. Adding tests for existing functions is simpler and lower-risk than restructuring.
- Pros: Better bash testing framework, TAP output
- Cons: New dependency, rewrite all existing tests, learning curve
- Decision: Rejected. Existing test-helpers.sh framework works well and is already used by 9 test files. Consistency matters more.
- Pros: Fewer tests to write
- Cons: Won't reach >80% coverage target on complex functions like mark_stage_complete
- Decision: Rejected. Key functions have complex side-effect chains that need verification.
| Risk | Impact | Mitigation |
|---|---|---|
| Tests for mark_stage_complete/failed need many stubs | Medium — fragile test setup | Use existing stub pattern from state test file; stub only external calls |
| self_healing_build_test test could be slow (sleep calls) | Low — test isolation | Override sleep in test, mock stage functions to fail/succeed deterministically |
| write_state test could corrupt real state | Low | Tests already use TEST_TEMP_DIR isolation |
| resume_state test needs valid state file format | Low | Generate state file content in test using known-good format |
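The sleep and mock mitigations can be illustrated with a short sketch. A shell function named sleep shadows the external command, so a retry loop runs instantly, and a counter-driven stage_test fails a fixed number of times before succeeding. retry_demo is a hypothetical stand-in for the real loop, and the counter names are assumptions.

```bash
# Hypothetical stub pattern; retry_demo stands in for the real executor loop.
SLEEP_CALLS=0
sleep() { SLEEP_CALLS=$((SLEEP_CALLS + 1)); }   # shadows the external sleep, returns at once

TEST_ATTEMPTS=0
stage_test() {                                   # mock: fails twice, then succeeds
  TEST_ATTEMPTS=$((TEST_ATTEMPTS + 1))
  [ "$TEST_ATTEMPTS" -ge 3 ]
}

retry_demo() {
  local max=$1 i=1
  while [ "$i" -le "$max" ]; do
    stage_test && return 0
    sleep 5                                      # would block 5s without the override
    i=$((i + 1))
  done
  return 1
}
```

Because the mock is driven by a counter rather than timing or randomness, the number of attempts and sleep calls is fixed and can be asserted directly.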
| File | Action |
|---|---|
| scripts/sw-lib-pipeline-state-test.sh | Modify: add tests for write_state, resume_state, get_slowest_stage, build_stage_progress, mark_stage_complete, mark_stage_failed |
| scripts/sw-lib-pipeline-stage-executor-test.sh | Modify: add tests for self_healing_build_test convergence detection, run_stage_with_retry edge cases |
No new files needed. No production code changes required.
Add to sw-lib-pipeline-state-test.sh:
- get_slowest_stage: Set up STAGE_TIMINGS with multiple stages, verify correct stage returned. Test empty case returns "".
- build_stage_progress: Create minimal PIPELINE_CONFIG JSON, set various stage statuses, verify progress string format.
- update_status: Call update_status, verify PIPELINE_STATUS and CURRENT_STAGE are set. Verify write_state was called.
- write_state: Set up all state variables, call write_state, read STATE_FILE and verify YAML frontmatter structure contains all expected fields (pipeline, goal, status, issue, stages).
- resume_state: Write a known state file, call resume_state (with stubs for gh_init, load_pipeline_config, git), verify all variables are restored correctly. Test error cases: missing file (exit 1), missing goal (exit 1), already-complete pipeline (exit 0).
- mark_stage_complete: Stub all external calls (emit_event, gh_*, checkpoint, etc.). Call mark_stage_complete, verify: stage status set to "complete", timing recorded, log entry added, write_state called.
- mark_stage_failed: Same pattern as mark_stage_complete but verify "failed" status and error comment format.
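As a rough illustration of the write_state field check, the sketch below writes a state file into a temporary directory and asserts that each expected top-level key is present. The write_state body here is a stand-in whose exact layout is assumed. state_has_fields and test_write_state_fields are hypothetical helpers, and the variable names PIPELINE_NAME, GOAL, and ISSUE_NUMBER are assumptions (STATE_FILE and PIPELINE_STATUS appear in this plan).

```bash
# Stand-in for the real write_state; the exact file layout is an assumption.
write_state() {
  {
    echo "---"
    echo "pipeline: ${PIPELINE_NAME}"
    echo "goal: \"${GOAL}\""
    echo "status: ${PIPELINE_STATUS}"
    echo "issue: ${ISSUE_NUMBER}"
    echo "stages:"
    echo "  intake: complete"
    echo "---"
  } > "$STATE_FILE"
}

# Hypothetical helper: succeeds only if every named top-level key is present.
state_has_fields() {
  local file=$1 key
  shift
  for key in "$@"; do
    grep -q "^${key}:" "$file" || { echo "missing field: $key" >&2; return 1; }
  done
}

test_write_state_fields() {
  local tmp
  tmp=$(mktemp -d) || return 1          # isolate from any real state file
  STATE_FILE="$tmp/pipeline-state.md"
  PIPELINE_NAME="standard" GOAL="demo goal" PIPELINE_STATUS="running" ISSUE_NUMBER="172"
  write_state
  state_has_fields "$STATE_FILE" pipeline goal status issue stages
  local rc=$?
  rm -rf "$tmp"
  return $rc
}
```

In the real test file the stand-in would be dropped and the module's own write_state sourced instead. The helper and the temp-dir isolation are the parts meant to carry over.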
Add to sw-lib-pipeline-stage-executor-test.sh:
- run_stage_with_retry (plan artifact skip): Create a plan.md with >10 lines, make stage_plan fail. Verify it returns 0 (skip retry because artifact exists).
- run_stage_with_retry (configuration error escalation): Create a log file with "MODULE_NOT_FOUND" error, make stage fail. Verify it returns 1 immediately (no retry).
- self_healing_build_test (happy path): Mock stage_build and stage_test to succeed on first try. Verify returns 0.
- self_healing_build_test (convergence stuck): Mock stage_test to fail with same error 3 times. Verify early exit with return 1.
- self_healing_build_test (plateau detection): Mock stage_test to fail with same failure count for 2 iterations. Verify early exit.
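The convergence-stuck case can be sketched with a toy loop that exits early once the same error has been seen three times in a row. This is not the real self_healing_build_test: healing_loop_demo, TEST_ERROR, and the counters are hypothetical, and treating "same error 3x" as three consecutive identical errors is an assumption.

```bash
# Illustrative loop only; the names and the "consecutive" rule are assumptions.
LAST_ERROR=""
SAME_ERROR_COUNT=0
ITERATIONS=0

# Mock: always fails with the same error signature.
stage_test() { TEST_ERROR="TypeError: x is undefined"; return 1; }

healing_loop_demo() {
  local max_cycles=$1
  while [ "$ITERATIONS" -lt "$max_cycles" ]; do
    ITERATIONS=$((ITERATIONS + 1))
    stage_test && return 0
    if [ "$TEST_ERROR" = "$LAST_ERROR" ]; then
      SAME_ERROR_COUNT=$((SAME_ERROR_COUNT + 1))
    else
      LAST_ERROR=$TEST_ERROR
      SAME_ERROR_COUNT=1
    fi
    if [ "$SAME_ERROR_COUNT" -ge 3 ]; then
      return 1                 # stuck: stop before exhausting max_cycles
    fi
  done
  return 1                     # cycles exhausted
}
```

A test for the real function would assert the same two things as this toy: the return code is 1, and the number of iterations is smaller than the configured maximum.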
- Run npm test to verify no regressions.
- Run individual test files to verify new tests pass in isolation.
- Task 1: Add get_slowest_stage and build_stage_progress tests to state test file
- Task 2: Add write_state test — verify YAML output format and all fields
- Task 3: Add resume_state tests — happy path and error cases (missing file, missing goal, complete pipeline)
- Task 4: Add mark_stage_complete test with stubbed externals
- Task 5: Add mark_stage_failed test with stubbed externals
- Task 6: Add update_status test
- Task 7: Add run_stage_with_retry edge case tests (plan artifact skip, config error escalation)
- Task 8: Add self_healing_build_test happy path test
- Task 9: Add self_healing_build_test convergence detection tests (stuck + plateau)
- Task 10: Run full test suite — verify all tests pass with no regressions
- Unit tests (target: ~25 new tests): Test each function in isolation with mocked dependencies
  - State module: ~15 new tests (write_state, resume_state, mark_stage_complete/failed, get_slowest_stage, build_stage_progress, update_status)
  - Executor module: ~10 new tests (self_healing_build_test paths, run_stage_with_retry edge cases)
- Integration tests (existing): The 12 tests in sw-pipeline-test.sh cover end-to-end pipeline flows
- E2E tests: Not applicable (shell scripts, no deployment)
- pipeline-state.sh: 80%+ function coverage (from ~60% → ~95%)
- pipeline-stage-executor.sh: 80%+ line coverage (from ~40% → ~80%)
- Overall pipeline module coverage: >80%
- Happy path: write_state produces valid YAML → resume_state restores it correctly (round-trip)
- Error case 1: resume_state with missing/corrupt state file → exits with error
- Error case 2: self_healing_build_test stuck on same error → early convergence exit
- Edge case 1: get_slowest_stage with no timing data → returns ""
- Edge case 2: mark_stage_complete with all optional integrations absent → still succeeds
- sw-pipeline.sh < 1500 lines (already 708)
- pipeline-state.sh exists as separate module (already exists, 612 lines)
- pipeline-stage-executor.sh exists as separate module (already exists, 645 lines)
- All existing tests pass (npm test green)
- New tests added: 20+ additional unit tests across both modules
- Function coverage >80% for both modules
- No production code changes (test-only additions)
- Each module can be sourced independently without side effects (include guards verified)
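One way to check the include-guard criterion is to source a module twice and confirm its top-level code ran once. The sketch below uses a throwaway module written to a temp file. The guard variable _DEMO_MODULE_LOADED, the counter DEMO_LOAD_COUNT, and demo_fn are all hypothetical, and the real modules may name or structure their guards differently.

```bash
# Hypothetical module written to a temp file so the example is self-contained.
tmp_module=$(mktemp)
cat > "$tmp_module" <<'EOF'
# Include guard: a second sourcing returns before re-running anything below.
[ -n "${_DEMO_MODULE_LOADED:-}" ] && return 0
_DEMO_MODULE_LOADED=1
DEMO_LOAD_COUNT=$(( ${DEMO_LOAD_COUNT:-0} + 1 ))
demo_fn() { echo "ok"; }
EOF

. "$tmp_module"
. "$tmp_module"    # no-op thanks to the guard
rm -f "$tmp_module"
```

After both calls, DEMO_LOAD_COUNT records how many times the module body actually executed, which is the value a guard test would assert on.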