-
Notifications
You must be signed in to change notification settings - Fork 0
Pipeline Design 232
ADR written to .claude/pipeline-artifacts/design.md.
Key decisions in the ADR:
- Defense-in-depth with 3 independent validation gates — initialize_state (unconditional rm), resume_state (issue-match check), stage_build (injection gate). Each layer independently prevents stale task injection.
-
Test-only approach — production fixes are already committed across 5 PRs. This phase adds a dedicated regression test suite (
scripts/sw-pipeline-tasks-md-test.sh) with 9 tests. -
Conservative error handling — malformed task files (missing
- Issue:header) are treated as stale and removed rather than injected. - Rejected alternatives: single cleanup point (doesn't protect resume path), centralized manager (over-engineering), completion-time cleanup (fragile to crashes). the issue number from the task file's metadata header.
Constraints: Production fixes are already committed across 5 PRs (1a64c9d0, 4a2f315d, 8bfeb411, 70a8aaeb, a730e469). This design focuses exclusively on verification and test coverage — no production code changes.
-
Most likely — Missing validation on resume path (CONFIRMED FIXED):
resume_state()previously did not validate the task file's issue against the currentGITHUB_ISSUE. Stale files survived resume and were injected bystage_build(). Fixed in commits referenced above. -
Issue number format inconsistency:
GITHUB_ISSUEcould contain#232or232, while the task file might store either format. If normalization differed across paths, comparisons would fail silently. Fixed by consistenttr -d '#' | xargsnormalization in all three paths. -
Malformed metadata header: A task file missing the
- Issue:line would bypass all issue-based checks and get injected regardless. Fixed by treating extraction failure as grounds for file removal.
All three cleanup paths are confirmed present with correct logic:
-
pipeline-state.sh:463:
[[ -n "${TASKS_FILE:-}" ]] && rm -f "$TASKS_FILE"— unconditional cleanup on fresh start -
pipeline-state.sh:607-620: Issue mismatch detection via
extract_issue_from_tasks_file(), withrm -fon mismatch or malformed file -
pipeline-stages-build.sh:172-189: Defense-in-depth gate — only injects tasks when
tasks_issue == current_issue, removes file otherwise -
helpers.sh:429-435:
extract_issue_from_tasks_file()usesgrep -m1 -i "^-\{0,1\} *Issue:", strips#, trims whitespace, returns exit 1 on failure -
Normalization: All three paths use identical
echo "${GITHUB_ISSUE:-}" | tr -d '#' | xargs -
Existing tests:
sw-lib-pipeline-stages-test.sh:385-655already covers initialize cleanup, resume mismatch, stage_build injection gate, and normalization variants
Test-only verification approach — validate the existing three-layer defense without modifying production code.
The fix uses a defense-in-depth pattern with three independent validation gates:
initialize_state() -> unconditional rm (layer 1: fresh start)
resume_state() -> issue-match check + rm on mismatch (layer 2: resume)
stage_build() -> issue-match check + rm on mismatch (layer 3: injection point)
Each layer independently prevents stale task injection. Even if one layer is bypassed (e.g., resume skipped), the next layer catches the mismatch.
Issue normalization is centralized in extract_issue_from_tasks_file() for the file side, and uses tr -d '#' | xargs inline for GITHUB_ISSUE at each comparison point. This is intentionally not DRY'd into a helper — the inline normalization is 6 characters and keeping it visible at each comparison point makes the validation logic self-documenting.
Error handling: Malformed files (missing - Issue: header) are treated as stale and removed. This is the correct conservative choice — a task file that can't be attributed to an issue should never be injected.
No new production fix needed. The existing commits implement the correct three-layer defense. The strategy is:
- Extend test coverage in
sw-lib-pipeline-stages-test.sh(tests already exist at lines 385-655) - Add a dedicated test file
scripts/sw-pipeline-tasks-md-test.shfor focused regression coverage - Cover the exact reproduction scenario (issue #154 tasks leaking into #232 pipeline)
- Cover edge cases: unset TASKS_FILE, missing file, 5+ metadata format variations
This differs from a "rewrite" approach because the fixes are already correct — we just need confidence through comprehensive test coverage.
-
Single cleanup point (only in initialize_state) — Pros: simpler, one place to maintain / Cons: doesn't protect resume path, which is the exact bug scenario. Resume skips initialize_state entirely, so stale files would survive.
-
Centralized task file manager class/module — Pros: DRY, single responsibility / Cons: over-engineering for a flat file with 3 consumers. The current inline checks are 5-8 lines each and self-documenting. Adding an abstraction layer would obscure the validation logic at the point where it matters most.
-
Delete task file at pipeline completion instead of checking on resume — Pros: prevents accumulation / Cons: doesn't handle crashes, kill -9, or context exhaustion where cleanup code never runs. The current approach (validate on read) is resilient to incomplete shutdown.
- Files to create:
scripts/sw-pipeline-tasks-md-test.sh— dedicated regression test suite with 9 tests covering the exact bug scenario, metadata format variations, edge cases (unset variable, concurrent access, corrupted file, missing file) - Files to modify: None (production code already fixed)
- Dependencies: None new
- Risk areas:
- Test isolation: each test must use
$TMPDIR-based temp files, never live.claude/directory - Sourcing production functions in test context requires careful mock setup (write_state, warn, etc.)
-
extract_issue_from_tasks_fileregex relies ongrep -m1 -i— test must verify case-insensitive matching
- Test isolation: each test must use
- Bug reproduction test passes: issue #154 tasks file is deleted when resuming for issue #232
- All 3 cleanup paths verified: initialize_state (unconditional rm), resume_state (issue mismatch rm), stage_build (injection gate)
- extract_issue_from_tasks_file handles 5+ metadata formats:
- Issue: #232,- Issue: 232,Issue: #232,- Issue: #232, and returns error for missing header - Issue normalization consistent: all paths produce bare number (no
#, no whitespace) before comparison - Edge cases pass: unset TASKS_FILE, missing file, malformed file, concurrent initialization
- Existing test suite (
sw-lib-pipeline-stages-test.sh) passes with zero regressions - No production code changes in the final diff
- Run existing tests:
bash scripts/sw-lib-pipeline-stages-test.sh— confirms no regressions - Run new test suite:
bash scripts/sw-pipeline-tasks-md-test.sh— confirms all 9 scenarios pass - Run full suite:
npm test— confirms no cross-contamination - Manual verification: inspect
git diffto confirm zero production file changes