Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Pipeline Plan 175

ezigus edited this page Mar 17, 2026 · 1 revision

Implementation Plan: Issue Pre-flight Validation with Actionability Scoring

Socratic Design Refinement

Requirements Clarity

Minimum viable change: A validate_issue_quality() function in the intake stage that scores issue content on heuristics (description length, code references, acceptance criteria presence, vagueness detection) and blocks issues scoring below a configurable threshold (default: 60). Emits validation results to events.jsonl.

Implicit requirements:

  • Must not break existing pipelines that use --goal instead of --issue (goal-only pipelines skip validation)
  • Must integrate cleanly with the existing stage_intake() flow in pipeline-stages-intake.sh
  • Feedback must be posted back to the GitHub issue so the author knows what to fix
  • Must work without gh CLI (graceful degradation when GitHub is unavailable)

Acceptance criteria (from issue + inferred):

  1. Issues with minimal content (< 50 chars body) score low and are blocked
  2. Bug issues claiming code problems are checked for file/code references
  3. Vague phrases ("make it better", "improve performance", "fix things") are detected and penalized
  4. Score 0-100 with configurable threshold (default: 60)
  5. Actionable feedback posted to GitHub issue explaining what's missing
  6. Validation results emitted to events.jsonl
  7. Bypass mechanism for urgent/override situations (label or flag)

Design Alternatives

Approach A: Standalone validation module (CHOSEN)

  • New file scripts/lib/issue-validation.sh with pure scoring functions
  • Called from stage_intake() after issue metadata is fetched
  • Pros: Clean separation, independently testable, minimal blast radius
  • Cons: One more file to source

Approach B: Inline in stage_intake()

  • Add validation logic directly into pipeline-stages-intake.sh
  • Pros: No new files
  • Cons: Bloats an already 116-line function, harder to test individual heuristics

Approach C: AI-powered validation (call Claude to score)

  • Use ai_run_json() to have Claude evaluate issue quality
  • Pros: More nuanced understanding of requirements
  • Cons: Adds latency + cost to every pipeline start, defeats the purpose of preventing wasted budget

Decision: Approach A — standalone module. Minimal blast radius, independently testable, follows the existing scripts/lib/*.sh module pattern with double-source guards.

Risk Assessment

  • False positives blocking good issues: Mitigated by tunable threshold and bypass label (skip-validation)
  • Breaking existing --goal pipelines: Goal-only pipelines skip validation entirely (no ISSUE_BODY to validate)
  • Breaking daemon auto-processing: Daemon spawns pipelines with --issue; validation will apply. If an issue is blocked, the daemon should mark it and move on (not retry endlessly). We'll emit a specific event type for this.
  • Bash 3.2 compatibility: No associative arrays, no ${var,,} — use tr '[:upper:]' '[:lower:]' for case conversion

Dependency Analysis

  • Depends on: helpers.sh (emit_event, output helpers), pipeline-github.sh (gh_comment_issue for feedback)
  • Called by: pipeline-stages-intake.sh (stage_intake function)
  • No circular dependency risk — new module is leaf-level, only depends on helpers

Files to Modify

New Files

  1. scripts/lib/issue-validation.sh — Core validation + scoring logic
  2. scripts/sw-issue-validation-test.sh — Test suite for validation

Modified Files

  1. scripts/lib/pipeline-stages-intake.sh — Call validation after issue fetch
  2. config/event-schema.json — Add new event types for validation
  3. scripts/sw-pipeline-test.sh — Add integration test for validation in pipeline flow

Implementation Steps

Step 1: Create scripts/lib/issue-validation.sh

New module with double-source guard pattern. Functions:

validate_issue_quality(issue_body, issue_title, issue_labels) → returns 0 (pass) or 2 (fail)
 Sets globals: VALIDATION_SCORE, VALIDATION_FEEDBACK
_score_description_length(body) → 0-25 points
_score_structure(body) → 0-25 points
_score_specificity(body) → 0-25 points
_score_code_references(body, title) → 0-25 points
_detect_vague_phrases(body) → penalty (0 to -20)

Scoring breakdown (100 points max):

  • Description length (0-25): < 50 chars = 0, 50-150 = 10, 150-500 = 20, 500+ = 25
  • Structure (0-25): Has headings = +5, has bullet points/numbered lists = +5, has acceptance criteria section = +10, has code blocks = +5
  • Specificity (0-25): References specific files/paths = +10, mentions function/class names = +5, has error messages/stack traces = +5, has reproduction steps = +5
  • Code references (0-25): Contains file extensions (.js, .sh, .ts, etc.) = +10, references line numbers = +5, has code fences = +5, mentions specific directories = +5
  • Vagueness penalty (-20 to 0): Each vague phrase detected = -5 (capped at -20). Vague phrases: "make it better", "improve performance", "fix things", "clean up", "refactor everything", "make it work", "update stuff", "do something about"

Bypass conditions (skip validation, return score=100):

  • Issue has label skip-validation or hotfix or urgent
  • Pipeline started with --skip-gates flag
  • SHIPWRIGHT_SKIP_VALIDATION=1 env var

Threshold: Read from pipeline config intake.validation_threshold, default 60.

Step 2: Integrate into stage_intake()

After issue metadata is fetched (line 48 in current pipeline-stages-intake.sh) and before task type detection (line 51), insert:

# Issue quality validation
if [[ -n "$ISSUE_BODY" ]]; then
 if ! validate_issue_quality "$ISSUE_BODY" "$GOAL" "${ISSUE_LABELS:-}"; then
 # Post feedback to GitHub issue
 if [[ -n "$ISSUE_NUMBER" ]]; then
 local feedback_body="## Issue Validation Failed

**Actionability Score: ${VALIDATION_SCORE}/100** (threshold: ${VALIDATION_THRESHOLD})

${VALIDATION_FEEDBACK}

---
_Please update this issue with the requested information and re-apply the pipeline label._
_Generated by \`shipwright pipeline\` intake validation at $(now_iso)_"
 gh_comment_issue "$ISSUE_NUMBER" "$feedback_body"
 gh_add_labels "$ISSUE_NUMBER" "needs-info"
 fi
 emit_event "intake.validation_failed" \
 "issue=${ISSUE_NUMBER:-0}" \
 "score=$VALIDATION_SCORE" \
 "threshold=$VALIDATION_THRESHOLD"
 error "Issue #${ISSUE_NUMBER} failed actionability check (score: ${VALIDATION_SCORE}/${VALIDATION_THRESHOLD})"
 return 2
 fi
 emit_event "intake.validation_passed" \
 "issue=${ISSUE_NUMBER:-0}" \
 "score=$VALIDATION_SCORE"
fi

Step 3: Update event schema

Add to config/event-schema.json:

"intake.validation_passed": {
 "required": ["issue", "score"],
 "optional": ["threshold"]
},
"intake.validation_failed": {
 "required": ["issue", "score"],
 "optional": ["threshold", "feedback"]
}

Step 4: Write tests in scripts/sw-issue-validation-test.sh

Test cases:

  1. Empty body → score 0, validation fails
  2. Minimal body (< 50 chars) → low score, fails
  3. Well-structured issue with acceptance criteria → high score, passes
  4. Vague issue ("make it better") → penalized, likely fails
  5. Bug issue with code references → scores well on specificity
  6. Issue with skip-validation label → bypasses, returns 100
  7. Threshold override via config works
  8. Feedback message contains actionable suggestions

Step 5: Add pipeline integration test

In sw-pipeline-test.sh, add test_intake_validation_blocks_low_quality that:

  1. Sets up mock gh to return an issue with minimal body
  2. Runs pipeline with --issue 999
  3. Asserts exit code 2 and output contains "failed actionability check"

Task Checklist

  • Task 1: Create scripts/lib/issue-validation.sh with double-source guard, scoring functions, and validate_issue_quality() entry point
  • Task 2: Implement _score_description_length() — length-based scoring (0-25 points)
  • Task 3: Implement _score_structure() — structural quality detection (headings, lists, acceptance criteria, code blocks)
  • Task 4: Implement _score_specificity() — file references, function names, error messages, repro steps
  • Task 5: Implement _score_code_references() — file extensions, line numbers, code fences, directories
  • Task 6: Implement _detect_vague_phrases() — pattern matching for vague/non-actionable language
  • Task 7: Implement bypass logic (labels, flags, env vars) and threshold configuration
  • Task 8: Implement _build_validation_feedback() — generates human-readable feedback listing what's missing
  • Task 9: Integrate validate_issue_quality() into stage_intake() in pipeline-stages-intake.sh
  • Task 10: Add intake.validation_passed and intake.validation_failed to config/event-schema.json
  • Task 11: Create scripts/sw-issue-validation-test.sh with unit tests for all scoring functions
  • Task 12: Add integration test test_intake_validation_blocks_low_quality to sw-pipeline-test.sh
  • Task 13: Run full test suite (npm test) and fix any regressions

Testing Approach

Unit tests (sw-issue-validation-test.sh):

  • Source issue-validation.sh directly
  • Test each _score_* function with known inputs
  • Test composite scoring with realistic issue bodies
  • Test bypass conditions
  • Test feedback generation

Integration tests (sw-pipeline-test.sh):

  • Modify mock gh to return issue bodies with varying quality
  • Verify pipeline blocks on low-quality issues (exit 2)
  • Verify pipeline proceeds on high-quality issues
  • Verify events are emitted

Manual verification:

  • Run npm test to ensure no regressions
  • Verify with grep that events appear in correct format

Definition of Done

  • validate_issue_quality() scores issues 0-100 using heuristics
  • Issues scoring below threshold (default 60) are blocked in intake
  • Actionable feedback is posted to GitHub issue explaining deficiencies
  • Vague phrases are detected and penalized
  • Code/file references are rewarded in scoring
  • Bypass via skip-validation label, --skip-gates flag, or env var works
  • Events intake.validation_passed and intake.validation_failed emitted to events.jsonl
  • All new tests pass
  • All existing tests pass (npm test)
  • Bash 3.2 compatible (no associative arrays, no ${var,,})

Endpoint Specification

Not applicable — this is an internal pipeline stage, not an API endpoint.

Error Codes

  • Exit 0: Validation passed (score >= threshold)
  • Exit 2: Validation failed (score < threshold) — uses exit 2 per helpers.sh convention for "check condition failed"

Rate Limiting

Not applicable — runs once per pipeline invocation during intake.

Versioning

No API versioning needed. Event schema versioned via config/event-schema.json version field (currently "1").

User Stories

Primary: As a pipeline operator, I want issues to be validated before burning pipeline budget, so that low-quality issues are caught early with actionable feedback instead of wasting 12+ minutes of build time.

Secondary: As an issue author, I want clear feedback when my issue lacks actionable detail, so that I can improve it and re-submit without guessing what's missing.

Edge Cases from User Perspective

  1. Empty state: Issue has no body at all (just a title) → Score 0, feedback says "Issue has no description"
  2. Goal-only pipeline: --goal "Fix the thing" with no --issue → Validation skipped entirely (no issue body to validate)
  3. Override state: Urgent hotfix needs to bypass → hotfix label or --skip-gates flag bypasses validation
  4. Rich issue that scores perfectly: Issue with acceptance criteria, code references, reproduction steps → Score 90+, passes immediately
  5. Borderline issue (score ~60): Issue with some structure but missing specifics → Passes at default threshold, but operator can raise threshold

Baseline Metrics

Not applicable — this is a new feature, not a performance optimization. No existing metrics to baseline against.

Optimization Targets

  • Validation should complete in < 1 second (pure bash string matching, no AI calls)
  • Zero additional API calls beyond existing GitHub issue fetch

Profiling Strategy

Not applicable — pure bash heuristics with no performance concerns.

Benchmark Plan

Not applicable — validation is instantaneous string processing.

Alternatives Considered

  1. AI-powered validation (Claude scores the issue): More nuanced but adds ~30s latency and ~0ドル.05 per validation. Defeats the goal of preventing wasted budget. Rejected.
  2. Inline validation in stage_intake(): Simpler but creates a 200+ line function, harder to test. Rejected in favor of standalone module.
  3. External webhook/GitHub Action: Validates before pipeline starts. More decoupled but requires infrastructure setup and doesn't integrate with pipeline event system. Rejected.

Risk Analysis

Risk Impact Mitigation
False positives blocking good issues Medium — blocks valid work Configurable threshold + bypass labels + --skip-gates
Daemon retry loops on blocked issues High — wastes daemon slots Emit specific event; daemon should skip (not retry) blocked issues
Bash 3.2 incompatibility High — breaks on macOS No associative arrays; use tr for case conversion; test on Bash 3.2
Breaking existing tests Medium — CI failure Run full suite before PR

Task Decomposition with Dependencies

  1. Create issue-validation.sh module skeleton (blocks all others)
  2. Implement _score_description_length() (blocked by 1)
  3. Implement _score_structure() (blocked by 1)
  4. Implement _score_specificity() (blocked by 1)
  5. Implement _score_code_references() (blocked by 1)
  6. Implement _detect_vague_phrases() (blocked by 1)
  7. Implement bypass logic + threshold config (blocked by 1)
  8. Implement _build_validation_feedback() (blocked by 2-6)
  9. Implement validate_issue_quality() entry point (blocked by 2-8)
  10. Update event schema (independent)
  11. Integrate into stage_intake() (blocked by 9, 10)
  12. Write unit tests (blocked by 9)
  13. Write integration test (blocked by 11)
  14. Run full test suite (blocked by 12, 13)

Clone this wiki locally

AltStyle によって変換されたページ (->オリジナル) /