Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Pipeline Plan 232

ezigus edited this page Mar 27, 2026 · 2 revisions

I'll provide the comprehensive implementation plan here. This analyzes the root cause, proposes solutions, and structures the work:


Implementation Plan: Fix Stale Tasks Injected on Pipeline Resume (Issue #232)

Root Cause Analysis (Systematic Debugging)

Evidence Gathered

Primary Cause - Confirmed:

  • pipeline-tasks.md is written by intake stage (line 451-462 in pipeline-stages-intake.sh) with metadata: Issue: ${GITHUB_ISSUE}
  • File is NEVER removed in initialize_state() (line 461 in pipeline-state.sh) — only clears model-routing.log and .plan-failure-sig.txt
  • File is NEVER validated in resume_state() (line 534-610) — blindly restores state without checking metadata
  • File is NEVER validated in stage_build() before injection (line 172 in pipeline-stages-build.sh) — injects without checking issue number

Secondary Cause - Suspected:

  • TEST_CMD is set to echo FAIL; exit 1 (mentioned in issue as existing in test harness at sw-lib-pipeline-stages-test.sh:251)
  • This value somehow leaks into pipeline TEST_CMD instead of being npm test
  • Causes tests to always fail, triggering unnecessary self-healing cycles

Root Cause Hypothesis (Ranked)

  1. VERY HIGH (99%): Task file not cleared/validated on resume — STATE MANAGEMENT BUG

    • Evidence: Three independent validation gaps (initialize, resume, inject) all missing
    • Fix: Add cleanup + validation at each point
  2. HIGH (85%): TEST_CMD environment variable scoping issue

    • Evidence: Test harness defines echo FAIL; exit 1 at line 251
    • Fix: Isolate test command in local variable, not global
  3. MEDIUM (70%): Resume flow doesn't restore clean environment

    • Evidence: resume_state() never validates metadata matches
    • Fix: Add issue metadata check before restoring

Alternatives Considered

Alternative A: "Clear on Initialize + Validate on Resume" ✓ CHOSEN

Approach:

  1. Add rm -f "$TASKS_FILE" in initialize_state()
  2. Add issue validation in resume_state() — extract issue from file, compare to GITHUB_ISSUE
  3. Add safe injection check in stage_build() before using TASKS_FILE

Pros:

  • Minimal changes (~15 production lines across 2 files)
  • Defense-in-depth: three independent validation points catch stale tasks
  • Very low risk: clear, targeted fixes with no refactoring
  • Easy to test and verify

Cons:

  • Validation logic scattered across 2 files (not centralized)
  • Happens at different times in flow

Trade-offs: Simplicity wins over consolidation; perfect for isolated bug


Alternative B: "Metadata-Scoped Task Validation"

Approach:

  • Store commit hash + tree hash in task file
  • On resume, verify code hasn't changed drastically
  • Selectively inject only relevant tasks

Pros:

  • Intelligent: understands which tasks apply to current code
  • Future-proof for multi-issue scenarios

Cons:

  • Complex (~50+ lines)
  • Overkill: pipelines are single-issue
  • Hard to maintain and test

Trade-offs: Over-engineering; not chosen


Alternative C: "Helper Function Consolidation"

Approach:

  • Create validate_tasks_file() helper
  • Called from initialize_state(), resume_state(), stage_build()

Pros:

  • Single source of truth for validation
  • Easier to extend validation later

Cons:

  • More code (~30 lines for helper)
  • Still adds complexity for simple fix

Trade-offs: Good design, but overkill for localized bug


Task Decomposition (with Dependencies)

Production Code Tasks

  1. Task 1: Clear tasks on initialize (no deps)

    • File: scripts/lib/pipeline-state.sh line 461
    • Add: rm -f "$TASKS_FILE" after existing cleanup
    • Lines: 1
    • Blocks: None (independent)
  2. Task 2: Extract issue from task metadata (dep: Task 1)

    • File: scripts/lib/pipeline-state.sh after line 609 in resume_state()
    • Parse: Extract Issue: #NNN from TASKS_FILE second header section
    • Lines: ~6
    • Blocks: Task 3
  3. Task 3: Validate issue matches on resume (dep: Task 2)

    • File: scripts/lib/pipeline-state.sh after Task 2
    • Logic: If extracted issue ≠ current GITHUB_ISSUE, remove file + warn
    • Lines: ~5
    • Blocks: Task 4
  4. Task 4: Validate before task injection (dep: Task 3)

    • File: scripts/lib/pipeline-stages-build.sh line 172
    • Logic: Before injecting TASKS_FILE, verify issue metadata matches $GITHUB_ISSUE
    • Lines: ~6
    • Blocks: None (defensive check)
  5. Task 5: Investigate and fix TEST_CMD leak (no deps)

    • Files: pipeline-stages-build.sh line 447, sw-lib-pipeline-stages-test.sh line 251
    • Investigate: How does echo FAIL; exit 1 get into TEST_CMD?
    • Fix: Isolate test harness value in local variable
    • Lines: ~3-5
    • Blocks: None (independent)

Test Tasks

  1. Task 6: Unit test — initialize clears tasks (dep: Task 1)

    • File: scripts/sw-lib-pipeline-stages-test.sh
    • Test: Create TASKS_FILE, call initialize_state(), verify removed
    • Lines: ~15
  2. Task 7: Unit test — resume validates issue (dep: Task 3)

    • File: scripts/sw-lib-pipeline-stages-test.sh
    • Test: Create tasks with Issue #154, resume with #232, verify removed
    • Lines: ~25
  3. Task 8: Unit test — injection validates issue (dep: Task 4)

    • File: scripts/sw-lib-pipeline-stages-test.sh
    • Test: Create TASKS_FILE with wrong issue, verify not injected
    • Lines: ~20
  4. Task 9: Integration test — full resume flow (dep: Tasks 1-4)

    • File: scripts/sw-lib-pipeline-stages-test.sh
    • Test: Complete scenario with issue change + resume + build
    • Lines: ~35
  5. Task 10: Run full test suite (dep: Tasks 1-9)

    • Command: ./scripts/sw-pipeline-test.sh
    • Verify: All 50+ existing tests pass (no regressions)

Risk Analysis

Risk Impact Prob. Mitigation
Clear valid tasks Build loses context Very Low Issue mismatch only triggers clear; normal flow leaves tasks intact
Validation logic breaks Pipeline crashes Low Use safe string ops, `
Tasks still injected Stale tasks still visible Very Low Three independent guards make bypass impossible
TEST_CMD leak not fixed Tests keep failing Medium Direct code trace in test harness; verify in pipeline test
Existing tests break Regression Low Full test suite before commit; validation only removes mismatched tasks
Performance impact Slow operations Very Low Only string comparisons + file unlink; no loops/API calls

Mitigation Strategy: Each validation point is independent; one broken check doesn't disable others. Tests catch regressions immediately.


Files to Modify

  1. scripts/lib/pipeline-state.sh (~1200 LOC)

    • initialize_state(): +1 line (rm -f)
    • resume_state(): +11 lines (parse + validate)
    • Impact: 15 lines, very low risk
  2. scripts/lib/pipeline-stages-build.sh (~500 LOC)

    • stage_build(): +6 lines validation before injection
    • Impact: 6 lines, low risk
  3. scripts/lib/pipeline-stages.sh OR pipeline-stages-build.sh

    • TEST_CMD handling: +3-5 lines (if leak found)
    • Impact: Minimal if needed
  4. scripts/sw-lib-pipeline-stages-test.sh (~200 LOC)

    • Add 4 new test cases: ~95 lines total
    • Impact: Test-only, no production risk

Testing Approach

Critical Paths to Test

Happy Path (No Issue Change):

  • Resume pipeline for same issue → tasks remain and are injected ✓

Error Case 1 (Issue Changed):

  • Resume for issue #232 with tasks from #154 → stale tasks removed ✓

Error Case 2 (Malformed Task File):

  • Resume with corrupted TASKS_FILE → gracefully skip without crashing ✓

Edge Case 1 (Missing Task File):

  • Resume when TASKS_FILE doesn't exist → continue normally ✓

Edge Case 2 (Multiple Resumes):

  • Resume → modify code → resume again → no accumulation of stale tasks ✓

Test Coverage Targets

  • Unit tests: 4 focused tests (~70 lines) covering each validation point
  • Integration test: 1 scenario test (~35 lines) covering full flow
  • Regression tests: Full existing test suite (50+ tests) must pass

Definition of Done

  • Code Complete

    • Task 1: initialize_state() removes pipeline-tasks.md
    • Task 2: resume_state() extracts issue from metadata
    • Task 3: resume_state() validates and removes stale tasks
    • Task 4: stage_build() validates before injection
    • Task 5: TEST_CMD leak investigated and fixed
  • Tests Pass

    • Task 6: Initialize clear test passes
    • Task 7: Resume validation test passes
    • Task 8: Injection validation test passes
    • Task 9: Full integration test passes
    • Task 10: All 50+ existing tests pass (no regressions)
  • Acceptance Verified

    • Old tasks not injected when issue changes
    • New tasks ARE injected when issue matches
    • Malformed files don't crash pipeline
    • Resume flow unchanged for valid cases
    • TEST_CMD properly isolated (if issue found)
  • Quality Checks

    • Bash 3.2 compatible (no associative arrays, etc.)
    • Proper error handling (all checks use safe patterns)
    • Logging appropriate (info/warn used correctly)
    • No debug code left in

Execution Order

  1. Task 1 — Clear on initialize (1 line, lowest risk, validates immediately)
  2. Task 2-3 — Validate on resume (11 lines, core fix)
  3. Task 4 — Defensive check in build (6 lines, final guard)
  4. Task 5 — Investigate TEST_CMD (parallel, may be small or larger)
  5. Task 6-9 — Tests in order of dependencies (15+25+20+35 = 95 lines)
  6. Task 10 — Full test suite verification (pass/fail decision point)

Estimated Implementation Time: ~60 lines code + 95 lines tests = 155 total lines. Low complexity, high confidence in solution.

Clone this wiki locally

AltStyle によって変換されたページ (->オリジナル) /