Pipeline Plan 417

ezigus edited this page Apr 23, 2026 · 1 revision

I'll provide the comprehensive implementation plan directly below:

Implementation Plan: Cross-Stage Drift Detector — plan.md vs git diff

Problem Analysis

Current State

When the build loop completes, the pipeline immediately enters stage_review() function. However, there is no verification that the code changes actually match what was planned. A file listed in the plan could be completely untouched, or a file could be modified that was never planned—but neither is caught before burning review/audit/CQ quota.

Root Cause

Missing detection: stage_review() (line 6 in pipeline-stages-review.sh) runs immediately after build with zero consistency checking
Silent drift: Plan-to-build divergence goes undetected until full review cycle
Class of failure: Repeatable pipeline failure mode: planned work is incomplete, or out-of-scope changes introduce unplanned complexity

Impact

Wasted AI review cycles on unplanned changes
Missed detection of incomplete builds (planned files untouched)
No early warning before heavy processing

Files to Modify

scripts/lib/pipeline-stages-review.sh — Add drift detection helper + integration (~38 lines total)

Implementation Steps

Phase 1: Build the Drift Detection Helper (Lines 50–80)

Add detect_plan_drift() function before stage_review()
- Input: 1ドル=artifacts_dir, 2ドル=project_root
- Output: String of drift warnings (or empty if no drift)
- Logic:
  - Read $artifacts_dir/plan.md
  - Extract filenames from ## Files to Modify section (parse markdown bullets)
  - Get git diff --name-only HEAD (actual changed files)
  - Find planned files NOT in actual changes
  - Format as [DRIFT-WARNING] Planned file not modified: <filepath> for each untouched file
- Error handling: Return empty string (fail-open) if plan.md missing, parse fails, or git fails
Handle edge cases
- plan.md missing → return "" (no warnings, continue review)
- ## Files to Modify not found → return "" (no warnings)
- git diff fails → return "" with warning logged
- All planned files modified → return "" (no drift)

Phase 2: Integrate Drift Warnings into Review Prompt (Line ~60)

Call detect_plan_drift() from stage_review()
- Location: Between ruflo review completion (line 79) and native review prompt construction (line 82)
- Code: local drift_warnings; drift_warnings=$(detect_plan_drift "$ARTIFACTS_DIR" "$PROJECT_ROOT" 2>/dev/null || true)

Inject warnings into review_prompt

If [[ -n "$drift_warnings" ]], append section before diff:

review_prompt+="
## Cross-Stage Drift Detected
The following files were planned but not modified by the build:
${drift_warnings}

Reviewer: Verify whether these files were intentionally skipped or represent incomplete implementation.
"

Add event logging
- When drift detected: emit_event "review.drift_detected" "issue=${ISSUE_NUMBER:-0}"
- Allows tracking of plan-build divergence frequency

Phase 3: Testing

Create unit test file scripts/sw-lib-pipeline-stages-review-test.sh
- Test case 1: plan.md lists 2 files, git diff shows 1 → expect 1 drift warning
- Test case 2: plan.md lists 2 files, git diff shows 2 → expect no warnings
- Test case 3: plan.md missing → expect no warnings (fail-open)
- Test case 4: git diff fails → expect no warnings (fail-open)
- Test case 5: plan.md has no ## Files to Modify → expect no warnings
Create integration test
- Setup real git repo with commits
- Create plan.md with 3 files
- Create diff showing only 2 files modified
- Run stage_review()
- Verify review.md contains [DRIFT-WARNING] markers
Run npm test
- Verify no regressions in existing tests
- Ensure new tests discovered and pass

Task Checklist

Task 1: Implement detect_plan_drift() helper function with error handling
Task 2: Unit test — 5 test cases for detect_plan_drift() covering happy path and edge cases
Task 3: Call detect_plan_drift() from stage_review() around line 60
Task 4: Inject drift warnings into review_prompt variable
Task 5: Add event logging for drift detection (emit_event "review.drift_detected")
Task 6: Create integration test with full review stage
Task 7: Run npm test and verify all tests pass
Task 8: Add inline documentation to helper function
Task 9: Verify fail-open behavior with corrupted plan.md (manual test)
Task 10: Final validation — test suite passes, no regressions

Testing Approach

Test Pyramid

Unit tests (70%): 5 tests for detect_plan_drift() function
- Correct filename extraction from plan.md
- Accurate git diff comparison
- Missing plan.md handling
- Git command failures
- Parse error handling
Integration tests (20%): 2 tests with full review stage
- Drift warnings appear in final review prompt
- No warnings when all planned files changed
E2E tests (10%): 1 test for complete pipeline
- Full pipeline: plan → build with partial changes → review detects drift

Critical Paths to Test

Happy Path:

plan.md: src/a.js, src/b.js, src/c.js
git diff: src/a.js, src/b.js (changed)
Expected: [DRIFT-WARNING] Planned file not modified: src/c.js in review prompt

Error Case 1: Missing plan.md

No plan.md in artifacts
Expected: Empty drift warnings, review continues unaffected

Error Case 2: Git failure

git diff command fails
Expected: Fails safely, review stage doesn't break

Edge Case: All planned files modified

plan.md: 2 files
git diff: Both files + extras modified
Expected: No drift warnings (all planned work done)

Definition of Done

✓ detect_plan_drift() function implemented in pipeline-stages-review.sh ✓ Filenames extracted from ## Files to Modify markdown section ✓ Comparison against git diff --name-only HEAD works correctly ✓ [DRIFT-WARNING] injected for each unmodified planned file ✓ Fail-open: returns empty string if plan.md missing or git fails ✓ Integrated into stage_review() before native review prompt ✓ Drift warnings visible in final review output ✓ Event logging: emit_event "review.drift_detected" ✓ Unit tests pass (5 test cases) ✓ Integration tests pass (2 test cases) ✓ npm test passes (no regressions) ✓ Inline documentation added ✓ Manual fail-open verification completed ✓ Issue #417 resolved, Issue #128 superseded

Design Alternatives Considered

Alternative A: Extract from `## Files to Modify` (CHOSEN)

Approach: Parse markdown section for filenames, compare against git diff

Pros:

Simple: straightforward regex/grep pattern matching
Reusable: Works with standardized plan.md format
Low risk: ~20 lines of code
Already validated: plan.md format is stable

Cons:

Fragile to format changes (but low risk)
Doesn't detect deleted planned files

Chosen because: Simplicity + low risk + proven plan.md structure

Alternative B: Use git metadata

Approach: Extract from git log/blame to infer planned vs actual

Pros:

Format-independent
Automatically adapts

Cons:

Complex implementation
Slow (git operations)
Uncertain inference
Higher failure modes

Rejected: Too complex without proportional benefit

Alternative C: Embed in pipeline state

Approach: Save planned file list in pipeline-state.md or JSON

Pros:

Decoupled from plan.md parsing

Cons:

Requires state management changes
More infrastructure
No simplicity gain

Rejected: More infrastructure without benefit

Risk Analysis

Risk	Impact	Mitigation
plan.md parsing breaks on unexpected format	Drift detection fails silently	Fail-open design: return empty string, log warning, review continues
git diff performance on large repos	Slow review stage	Already called in review (line 37), no additional overhead
False positives: flagging legitimate skips	Reviewer confusion	`[DRIFT-WARNING]` provides context, not automatic failure
Review prompt becomes too large	Prompt truncation hides warnings	`guard_prompt_size()` already handles prompt inflation
Git failures (network, permissions)	Review stage breaks	Catch errors, return empty string, don't break review

Success Metrics

Plan-to-build drift is now detectable before review/audit/CQ burn time
Drift warnings help reviewers understand scope completeness
Zero breaking changes to existing review behavior
Issue #417 closes, Issue #128 superseded
Test suite passes without regressions

Pipeline Plan 417

Implementation Plan: Cross-Stage Drift Detector — plan.md vs git diff

Problem Analysis

Current State

Root Cause

Impact

Files to Modify

Implementation Steps

Phase 1: Build the Drift Detection Helper (Lines 50–80)

Phase 2: Integrate Drift Warnings into Review Prompt (Line ~60)

Phase 3: Testing

Task Checklist

Testing Approach

Test Pyramid

Critical Paths to Test

Definition of Done

Design Alternatives Considered

Alternative A: Extract from ## Files to Modify (CHOSEN)

Alternative B: Use git metadata

Alternative C: Embed in pipeline state

Risk Analysis

Success Metrics

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Alternative A: Extract from `## Files to Modify` (CHOSEN)