Pipeline Plan 417

ezigus edited this page Apr 23, 2026 · 1 revision



Implementation Plan: Cross-Stage Drift Detector — plan.md vs git diff

Problem Analysis

Current State

When the build loop completes, the pipeline immediately enters stage_review(). Nothing verifies that the code changes match what was planned: a file listed in the plan could be completely untouched, or a file that was never planned could be modified, and neither case is caught before review/audit/CQ quota is spent.

Root Cause

  • Missing detection: stage_review() (line 6 in pipeline-stages-review.sh) runs immediately after build with zero consistency checking
  • Silent drift: Plan-to-build divergence goes undetected until full review cycle
  • Class of failure: A repeatable pipeline failure mode in which planned work is left incomplete, or out-of-scope changes introduce unplanned complexity

Impact

  • Wasted AI review cycles on unplanned changes
  • Missed detection of incomplete builds (planned files untouched)
  • No early warning before heavy processing

Files to Modify

  • scripts/lib/pipeline-stages-review.sh — Add drift detection helper + integration (~38 lines total)

Implementation Steps

Phase 1: Build the Drift Detection Helper (Lines 50–80)

  1. Add detect_plan_drift() function before stage_review()

    • Input: $1=artifacts_dir, $2=project_root
    • Output: String of drift warnings (or empty if no drift)
    • Logic:
      • Read $artifacts_dir/plan.md
      • Extract filenames from ## Files to Modify section (parse markdown bullets)
      • Get git diff --name-only HEAD (actual changed files)
      • Find planned files NOT in actual changes
      • Format as [DRIFT-WARNING] Planned file not modified: <filepath> for each untouched file
    • Error handling: Return empty string (fail-open) if plan.md missing, parse fails, or git fails
  2. Handle edge cases

    • plan.md missing → return "" (no warnings, continue review)
    • ## Files to Modify not found → return "" (no warnings)
    • git diff fails → return "" with warning logged
    • All planned files modified → return "" (no drift)
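
The helper described in Phase 1 might look roughly like the following. This is a minimal sketch, not the project's actual code. It assumes plan.md lists each file as a `-` or `*` bullet whose first token is the path (optionally wrapped in backticks), and that the section ends at the next `## ` heading. Those parsing rules are assumptions and would need to match the real plan.md format.

```bash
#!/usr/bin/env bash
# Sketch only: the bullet-parsing rules are assumptions about plan.md.

# detect_plan_drift ARTIFACTS_DIR PROJECT_ROOT
# Prints one "[DRIFT-WARNING] ..." line per planned file that git does not
# report as changed. Prints nothing (fail-open) when plan.md is missing,
# has no "## Files to Modify" section, or git fails.
detect_plan_drift() {
  local artifacts_dir="$1" project_root="$2"
  local plan_file="$artifacts_dir/plan.md"
  local planned changed file

  [[ -f "$plan_file" ]] || return 0

  # Take bullet lines between "## Files to Modify" and the next "## " heading,
  # strip the bullet marker and backticks, and keep the first token as the path.
  planned=$(awk '
    /^## Files to Modify/ { in_section = 1; next }
    /^## /                { in_section = 0 }
    in_section && /^[ \t]*[-*][ \t]+/ {
      sub(/^[ \t]*[-*][ \t]+/, "")
      gsub(/`/, "")
      print $1
    }
  ' "$plan_file" 2>/dev/null) || return 0
  [[ -n "$planned" ]] || return 0

  # Fail open if git cannot produce a list of changed files.
  if ! changed=$(git -C "$project_root" diff --name-only HEAD 2>/dev/null); then
    echo "detect_plan_drift: git diff failed, skipping drift check" >&2
    return 0
  fi

  # Report every planned path that is absent from the changed-file list.
  while IFS= read -r file; do
    [[ -n "$file" ]] || continue
    if ! grep -Fxq -- "$file" <<<"$changed"; then
      echo "[DRIFT-WARNING] Planned file not modified: $file"
    fi
  done <<<"$planned"
}
```

Note that `git diff --name-only HEAD` compares the working tree against the last commit, so this sketch only sees uncommitted changes. If the build stage commits its work, a different base ref would be needed.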

Phase 2: Integrate Drift Warnings into Review Prompt (Line ~60)

  1. Call detect_plan_drift() from stage_review()

    • Location: Between ruflo review completion (line 79) and native review prompt construction (line 82)
    • Code: local drift_warnings; drift_warnings=$(detect_plan_drift "$ARTIFACTS_DIR" "$PROJECT_ROOT" 2>/dev/null || true)
  2. Inject warnings into review_prompt

    • If [[ -n "$drift_warnings" ]], append section before diff:
      review_prompt+="
      ## Cross-Stage Drift Detected
      The following files were planned but not modified by the build:
      ${drift_warnings}
      
      Reviewer: Verify whether these files were intentionally skipped or represent incomplete implementation.
      "
  3. Add event logging

    • When drift detected: emit_event "review.drift_detected" "issue=${ISSUE_NUMBER:-0}"
    • Allows tracking of plan-build divergence frequency
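
The prompt injection and event logging steps in Phase 2 could be sketched as below. `append_drift_section` is a hypothetical helper name introduced only for illustration (the plan inlines this logic in stage_review()), and the `emit_event` stand-in exists only so the sketch runs on its own; the real pipeline is assumed to provide it.

```bash
#!/usr/bin/env bash
# Sketch only. append_drift_section is a hypothetical helper name.

# Stand-in so the sketch runs by itself; the real pipeline supplies emit_event.
if ! declare -F emit_event >/dev/null; then
  emit_event() { printf 'event: %s\n' "$*" >&2; }
fi

# append_drift_section PROMPT DRIFT_WARNINGS
# Prints PROMPT unchanged when there are no warnings. Otherwise prints PROMPT
# followed by the drift section and records a review.drift_detected event.
append_drift_section() {
  local prompt="$1" drift_warnings="$2"
  if [[ -n "$drift_warnings" ]]; then
    prompt+="
## Cross-Stage Drift Detected
The following files were planned but not modified by the build:
${drift_warnings}

Reviewer: Verify whether these files were intentionally skipped or represent incomplete implementation.
"
    emit_event "review.drift_detected" "issue=${ISSUE_NUMBER:-0}"
  fi
  printf '%s' "$prompt"
}

# Possible use inside stage_review(), after drift_warnings has been computed:
#   review_prompt=$(append_drift_section "$review_prompt" "$drift_warnings")
```

Keeping the empty-warnings case a pure pass-through is what preserves existing review behavior when no drift is found.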

Phase 3: Testing

  1. Create unit test file scripts/sw-lib-pipeline-stages-review-test.sh

    • Test case 1: plan.md lists 2 files, git diff shows 1 → expect 1 drift warning
    • Test case 2: plan.md lists 2 files, git diff shows 2 → expect no warnings
    • Test case 3: plan.md missing → expect no warnings (fail-open)
    • Test case 4: git diff fails → expect no warnings (fail-open)
    • Test case 5: plan.md has no ## Files to Modify → expect no warnings
  2. Create integration test

    • Setup real git repo with commits
    • Create plan.md with 3 files
    • Create diff showing only 2 files modified
    • Run stage_review()
    • Verify review.md contains [DRIFT-WARNING] markers
  3. Run npm test

    • Verify no regressions in existing tests
    • Ensure new tests discovered and pass
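
One way to structure the five unit test cases is sketched below. The helper names (`assert_eq`, `write_plan`, `run_drift_tests`) and the `FAKE_GIT_*` variables are hypothetical, not existing project helpers. The sketch shadows `git` with a shell function so the cases run without a real repository, and it assumes `detect_plan_drift` has already been sourced from the script under test.

```bash
#!/usr/bin/env bash
# Sketch of a test harness; all helper names here are hypothetical.

PASS=0
FAIL=0

# assert_eq DESCRIPTION EXPECTED ACTUAL
assert_eq() {
  if [[ "$2" == "$3" ]]; then
    PASS=$((PASS + 1)); echo "ok - $1"
  else
    FAIL=$((FAIL + 1)); echo "not ok - $1 (expected '$2', got '$3')"
  fi
}

# write_plan DIR FILE...: create DIR/plan.md listing FILEs under "## Files to Modify".
write_plan() {
  local dir="$1" f
  shift
  {
    printf '# Plan\n\n## Files to Modify\n\n'
    for f in "$@"; do printf -- '- %s\n' "$f"; done
    printf '\n## Implementation Steps\n'
  } > "$dir/plan.md"
}

# Shadow the real git: print FAKE_GIT_CHANGED, or fail with FAKE_GIT_STATUS.
git() {
  [[ "${FAKE_GIT_STATUS:-0}" -eq 0 ]] || return "$FAKE_GIT_STATUS"
  printf '%s\n' "${FAKE_GIT_CHANGED:-}"
}

run_drift_tests() {
  local dir
  dir=$(mktemp -d)

  write_plan "$dir" src/a.js src/b.js
  FAKE_GIT_STATUS=0
  FAKE_GIT_CHANGED="src/a.js"
  assert_eq "case 1: one untouched file yields one warning" \
    "[DRIFT-WARNING] Planned file not modified: src/b.js" \
    "$(detect_plan_drift "$dir" "$dir" 2>/dev/null)"

  FAKE_GIT_CHANGED=$'src/a.js\nsrc/b.js'
  assert_eq "case 2: all planned files changed" "" \
    "$(detect_plan_drift "$dir" "$dir" 2>/dev/null)"

  FAKE_GIT_STATUS=1
  assert_eq "case 4: git failure fails open" "" \
    "$(detect_plan_drift "$dir" "$dir" 2>/dev/null)"

  FAKE_GIT_STATUS=0
  printf '# Plan\n\n- src/a.js\n' > "$dir/plan.md"
  assert_eq "case 5: no Files to Modify section" "" \
    "$(detect_plan_drift "$dir" "$dir" 2>/dev/null)"

  rm -f "$dir/plan.md"
  assert_eq "case 3: missing plan.md fails open" "" \
    "$(detect_plan_drift "$dir" "$dir" 2>/dev/null)"

  rm -rf "$dir"
}

# Only run when the function under test has been sourced.
if declare -F detect_plan_drift >/dev/null; then
  run_drift_tests
  echo "passed=$PASS failed=$FAIL"
fi
```

A real test file would likely exit non-zero when `FAIL` is greater than zero so that the test runner reports the failure.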

Task Checklist

  • Task 1: Implement detect_plan_drift() helper function with error handling
  • Task 2: Unit test — 5 test cases for detect_plan_drift() covering happy path and edge cases
  • Task 3: Call detect_plan_drift() from stage_review() around line 60
  • Task 4: Inject drift warnings into review_prompt variable
  • Task 5: Add event logging for drift detection (emit_event "review.drift_detected")
  • Task 6: Create integration test with full review stage
  • Task 7: Run npm test and verify all tests pass
  • Task 8: Add inline documentation to helper function
  • Task 9: Verify fail-open behavior with corrupted plan.md (manual test)
  • Task 10: Final validation — test suite passes, no regressions

Testing Approach

Test Pyramid

  • Unit tests (70%): 5 tests for detect_plan_drift() function

    • Correct filename extraction from plan.md
    • Accurate git diff comparison
    • Missing plan.md handling
    • Git command failures
    • Parse error handling
  • Integration tests (20%): 2 tests with full review stage

    • Drift warnings appear in final review prompt
    • No warnings when all planned files changed
  • E2E tests (10%): 1 test for complete pipeline

    • Full pipeline: plan → build with partial changes → review detects drift

Critical Paths to Test

Happy Path:

  • plan.md: src/a.js, src/b.js, src/c.js
  • git diff: src/a.js, src/b.js (changed)
  • Expected: [DRIFT-WARNING] Planned file not modified: src/c.js in review prompt

Error Case 1: Missing plan.md

  • No plan.md in artifacts
  • Expected: Empty drift warnings, review continues unaffected

Error Case 2: Git failure

  • git diff command fails
  • Expected: Fails safely, review stage doesn't break

Edge Case: All planned files modified

  • plan.md: 2 files
  • git diff: Both files + extras modified
  • Expected: No drift warnings (all planned work done)
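
The comparison behind these paths is a set difference: planned files minus changed files. As a worked illustration using the file names from the paths above, the sketch below computes it with `comm`. This is an alternative formulation for checking expectations by hand, not necessarily how the helper is implemented, and `planned_not_changed` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch only: planned_not_changed is a hypothetical helper name.

# planned_not_changed PLANNED CHANGED
# Both arguments are newline-separated path lists. comm -23 prints lines that
# appear only in the first sorted input, i.e. planned but not changed.
planned_not_changed() {
  comm -23 <(sort -u <<<"$1") <(sort -u <<<"$2")
}

# Happy path: src/c.js was planned but never touched.
planned=$'src/a.js\nsrc/b.js\nsrc/c.js'
changed=$'src/a.js\nsrc/b.js'
planned_not_changed "$planned" "$changed" |
  sed 's/^/[DRIFT-WARNING] Planned file not modified: /'
# prints: [DRIFT-WARNING] Planned file not modified: src/c.js
```

Extra entries in the changed list never appear in the output, which matches the edge case where all planned files plus some unplanned ones are modified.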

Definition of Done

  • detect_plan_drift() function implemented in pipeline-stages-review.sh
  • Filenames extracted from ## Files to Modify markdown section
  • Comparison against git diff --name-only HEAD works correctly
  • [DRIFT-WARNING] injected for each unmodified planned file
  • Fail-open: returns empty string if plan.md missing or git fails
  • Integrated into stage_review() before native review prompt
  • Drift warnings visible in final review output
  • Event logging: emit_event "review.drift_detected"
  • Unit tests pass (5 test cases)
  • Integration tests pass (2 test cases)
  • npm test passes (no regressions)
  • Inline documentation added
  • Manual fail-open verification completed
  • Issue #417 resolved, Issue #128 superseded

Design Alternatives Considered

Alternative A: Extract from ## Files to Modify (CHOSEN)

Approach: Parse markdown section for filenames, compare against git diff

Pros:

  • Simple: straightforward regex/grep pattern matching
  • Reusable: Works with standardized plan.md format
  • Low risk: ~20 lines of code
  • Already validated: plan.md format is stable

Cons:

  • Fragile to format changes (but low risk)
  • Doesn't detect deleted planned files

Chosen because: Simplicity + low risk + proven plan.md structure


Alternative B: Use git metadata

Approach: Extract from git log/blame to infer planned vs actual

Pros:

  • Format-independent
  • Automatically adapts

Cons:

  • Complex implementation
  • Slow (git operations)
  • Uncertain inference
  • More failure modes

Rejected: Too complex without proportional benefit


Alternative C: Embed in pipeline state

Approach: Save planned file list in pipeline-state.md or JSON

Pros:

  • Decoupled from plan.md parsing

Cons:

  • Requires state management changes
  • More infrastructure
  • No simplicity gain

Rejected: More infrastructure without benefit

Risk Analysis

| Risk | Impact | Mitigation |
| --- | --- | --- |
| plan.md parsing breaks on unexpected format | Drift detection fails silently | Fail-open design: return empty string, log warning, review continues |
| git diff performance on large repos | Slow review stage | Already called in review (line 37), no additional overhead |
| False positives: flagging legitimate skips | Reviewer confusion | [DRIFT-WARNING] provides context, not automatic failure |
| Review prompt becomes too large | Prompt truncation hides warnings | guard_prompt_size() already handles prompt inflation |
| Git failures (network, permissions) | Review stage breaks | Catch errors, return empty string, don't break review |

Success Metrics

  • Plan-to-build drift is detectable before the review, audit, and CQ stages consume quota
  • Drift warnings help reviewers understand scope completeness
  • Zero breaking changes to existing review behavior
  • Issue #417 closes, Issue #128 superseded
  • Test suite passes without regressions
