Pipeline Design 179

ezigus edited this page Mar 17, 2026 · 2 revisions

Design: Success pattern learning and template recommendation engine

Context

Shipwright pipelines use templates (fast, standard, full, hotfix, autonomous, cost-aware) that determine which stages run and how gates are handled. Today, template selection is manual or daemon-configured — there's no learning from historical outcomes.

The core success pattern engine (scripts/sw-success-patterns.sh, 586 lines) and 26 tests already exist. It captures patterns on pipeline completion, computes TF-IDF keyword similarity, and recommends templates. However, four gaps prevent it from being a closed-loop learning system:

1. Acceptance tracking is dead code: sw-pipeline.sh:2894 reads recommended_template: from the state file, but nothing writes it after intake.
2. Cost is always 0: success_capture_pattern() hardcodes cost_usd: 0 (line 242) even though the pipeline computes total_cost at lines 2680/2921.
3. No success correlation: the engine tracks acceptance/rejection but never records whether accepted recommendations led to successful pipelines.
4. No issue_type field: patterns have labels and complexity but lack an explicit issue type (bug/feature/etc.).

Constraints: Bash 3.2 compatibility required. All JSON manipulation via jq --arg (no string interpolation). Atomic writes via tmp+mv. The success-patterns.json schema must remain backwards compatible (additive fields only, jq // 0 defaults).
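
The two write constraints above (values passed through jq --arg, and tmp+mv for atomicity) can be sketched together. This is a minimal illustration, not code from sw-success-patterns.sh; json_set_string is a hypothetical helper name.

```shell
# Sketch: set one string field in a JSON file, atomically.
# json_set_string is a hypothetical name used only for illustration.
json_set_string() {
  local file="$1" key="$2" value="$3"
  local tmp="${file}.tmp.$$"
  # --arg binds the values as jq variables, so quotes or backslashes in
  # $value cannot alter the jq program (no string interpolation).
  if jq --arg k "$key" --arg v "$value" '.[$k] = $v' "$file" > "$tmp"; then
    # mv within one filesystem is a rename, so readers see either the
    # old file or the new one, never a partial write.
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Because the temp file sits next to the target, the final mv stays on the same filesystem, which is what makes the replacement atomic.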

Decision

Minimal targeted fixes (~80 lines across 3 files). All changes are additive JSON fields — no schema migration, no new dependencies.

Component Diagram

┌──────────────────────────────────────────────────────────────────┐
│ sw-pipeline.sh │
│ │
│ ┌─────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Intake │───▶│ Display Rec │───▶│ Persist recommended_ │ │
│ │ Stage │ │ (existing) │ │ template to state file │ │
│ └─────────┘ └──────────────┘ └────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Pipeline Completion │ │
│ │ 1. Compute total_cost → export PIPELINE_COST_USD │ │
│ │ 2. memory_finalize_pipeline() → capture_pattern() │ │
│ │ 3. success_track_acceptance(recommended, actual) │ │
│ │ 4. success_track_correlation(recommended, actual, pass?) │ │
│ └──────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
 │
 ▼
┌──────────────────────────────────────────────────────────────────┐
│ sw-success-patterns.sh │
│ │
│ ┌──────────────────┐ ┌─────────────────────┐ │
│ │ capture_pattern() │ │ recommend_template() │ │
│ │ +issue_type field │ │ (TF-IDF similarity) │ │
│ │ +PIPELINE_COST_USD│ └─────────────────────┘ │
│ └──────────────────┘ │
│ ┌──────────────────────┐ ┌──────────────────┐ │
│ │track_correlation() │ │ show_stats() │ │
│ │ NEW: accepted+pass → │ │ +correlation_rate │ │
│ │ increment succeeded │ └──────────────────┘ │
│ └──────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
 │
 ▼
┌──────────────────────────────────────────────────────────────────┐
│ ~/.shipwright/memory/<repo-hash>/ │
│ success-patterns.json │
│ │
│ { "version": 1, │
│ "patterns": [ { ..., "cost_usd": 5.23, │
│ "issue_type": "feature" } ], │
│ "stats": { ..., "recommendations_succeeded": 12 } } │
└──────────────────────────────────────────────────────────────────┘

Interface Contracts

// Gap 1: State file persistence (in sw-pipeline.sh, after intake stage)
//   Write "recommended_template: <template>" to STATE_FILE
//   Read by existing grep at line 2894

// Gap 2: Cost capture (sw-success-patterns.sh)
success_capture_pattern(state_file: string, artifacts_dir: string): void
//   Reads $PIPELINE_COST_USD env var (set by sw-pipeline.sh before calling)
//   Falls back to 0 if unset (backwards compatible)

// Gap 3: Correlation tracking (NEW function)
success_track_correlation(
  recommended_template: string,   // what was recommended
  actual_template: string,        // what was used
  outcome: "success" | "failure"  // pipeline result
): void
//   If recommended == actual AND outcome == "success":
//     stats.recommendations_succeeded += 1
//   Emits success.correlation event

// Gap 4: Issue type extraction (within existing capture_pattern)
//   Reads .issue_type from intake-metadata.json
//   Adds "issue_type": "feature"|"bug"|... to pattern JSON
//   Defaults to null if not present

// Updated stats display
success_show_stats(): void
//   Now includes:
//   Correlation rate: succeeded/accepted * 100%
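
The Gap 3 contract could look roughly like the following in shell. This is a sketch under assumptions: the SP_FILE variable is a stand-in for however the real script locates success-patterns.json, and the success.correlation event emission is left out because the event helper's name is not given here.

```shell
# Sketch of success_track_correlation(recommended, actual, outcome).
# SP_FILE is an assumed variable naming the success-patterns.json path.
success_track_correlation() {
  local recommended="$1" actual="$2" outcome="$3"
  local sp_file="${SP_FILE:-success-patterns.json}"
  local tmp="${sp_file}.tmp.$$"

  # Only an accepted recommendation that also succeeded counts.
  [ -n "$recommended" ] || return 0
  [ "$recommended" = "$actual" ] || return 0
  [ "$outcome" = "success" ] || return 0

  # "// 0" lets files written before this field existed start from zero.
  if jq '.stats.recommendations_succeeded =
           ((.stats.recommendations_succeeded // 0) + 1)' \
        "$sp_file" > "$tmp"; then
    mv "$tmp" "$sp_file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

At the call site the design wraps the call in 2>/dev/null || true, so a non-zero return here would not block pipeline completion.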

Data Flow

Pipeline Start:
 intake completes
 → success_recommend_template(goal, labels, complexity)
 → returns {template, confidence, rationale} (or empty)
 → display recommendation box (existing)
 → NEW: write "recommended_template: <template>" to state file
Pipeline Completion:
 compute total_cost (line ~2680 or ~2921)
 → NEW: export PIPELINE_COST_USD=$total_cost
 memory_finalize_pipeline()
 → success_capture_pattern(state, artifacts)
 → reads PIPELINE_COST_USD (replaces hardcoded 0)
 → reads issue_type from intake-metadata.json
 success_track_acceptance(recommended, actual) — existing, now works
 → NEW: success_track_correlation(recommended, actual, outcome)
 → if accepted AND succeeded: stats.recommendations_succeeded += 1
 → emit success.correlation event
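
The first NEW step in the flow above (persisting the recommendation so the existing read can find it) might be sketched as below. Both function names are hypothetical. The sketch uses printf and grep rather than the sed append the design mentions, and it replaces any earlier entry, which is an extra assumption and not something the design requires.

```shell
# Hypothetical helper: record the recommendation in the state file.
persist_recommended_template() {
  local state_file="$1" template="$2"
  local tmp="${state_file}.tmp.$$"
  [ -n "$template" ] || return 0          # nothing recommended, nothing to write
  if [ -f "$state_file" ]; then
    # Drop any earlier entry so a re-run leaves exactly one line.
    grep -v '^recommended_template:' "$state_file" > "$tmp" || true
  else
    : > "$tmp"
  fi
  printf 'recommended_template: %s\n' "$template" >> "$tmp"
  mv "$tmp" "$state_file"
}

# Hypothetical reader: first match, with trailing whitespace trimmed so
# comparisons against the actual template are not thrown off.
read_recommended_template() {
  sed -n 's/^recommended_template:[[:space:]]*//p' "$1" 2>/dev/null \
    | head -n 1 | sed 's/[[:space:]]*$//'
}
```

A call such as persist_recommended_template "$STATE_FILE" "$template" || true keeps the write non-blocking, in line with the error boundaries described for Gap 1.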

Error Boundaries

State file write failure (Gap 1): sed append wrapped in || true — recommendation display still works, acceptance tracking degrades gracefully to no-op (same as current behavior).
Missing PIPELINE_COST_USD (Gap 2): ${PIPELINE_COST_USD:-0} — falls back to current behavior.
Correlation function failure (Gap 3): Called with 2>/dev/null || true — pipeline completion is never blocked.
Missing intake-metadata.json (Gap 4): jq -r '.issue_type // ""' with 2>/dev/null || true returns an empty string, so the field is stored as null.

All error handling follows the existing pattern: non-critical operations never block the pipeline.
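
As an illustration of the Gap 2 and Gap 4 boundaries, the sketch below resolves both values defensively and builds the two new pattern fields. All three function names are hypothetical; the real success_capture_pattern() may structure this differently.

```shell
# Gap 2: use the exported cost when present, otherwise 0 (current behavior).
resolve_cost_usd() {
  printf '%s\n' "${PIPELINE_COST_USD:-0}"
}

# Gap 4: a missing or unreadable metadata file yields an empty string,
# never an error.
read_issue_type() {
  jq -r '.issue_type // ""' "$1" 2>/dev/null || true
}

# Hypothetical helper: emit the two new fields as compact JSON.
# An empty issue type becomes JSON null, as the contract specifies.
build_pattern_fields() {
  local cost issue_type
  cost="$(resolve_cost_usd)"
  issue_type="$(read_issue_type "$1")"
  jq -n -c --arg cost "$cost" --arg t "$issue_type" \
    '{cost_usd: ($cost | tonumber),
      issue_type: (if $t == "" then null else $t end)}'
}
```

Note that tonumber fails on a non-numeric PIPELINE_COST_USD, so a real implementation would probably want the same || true guard around this step.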

Alternatives Considered

Embedding-based similarity matching — Pros: better semantic understanding of issue descriptions / Cons: adds model dependency (need an embedding model at query time), overkill for categorical+keyword matching, violates the <100ms query constraint for local-only operation. Shell implementation would require an external service call.
SQLite storage instead of JSON — Pros: proper indexing, better query performance at scale / Cons: adds binary dependency, current 200-pattern FIFO cap keeps JSON fast enough, jq queries are <50ms on 200 records. Would revisit if cap increases to 1000+.
Full rewrite in Node/TypeScript — Pros: matches project's Node toolchain, better testability / Cons: shell integration is the natural fit for pipeline scripts, existing 26 tests pass, rewrite risk for no functional gain. The pattern engine is a pipeline-internal concern, not a user-facing API.

Implementation Plan

Files to create: None
Files to modify:
- scripts/sw-success-patterns.sh — Add issue_type extraction in success_capture_pattern(), read $PIPELINE_COST_USD instead of hardcoded 0, add success_track_correlation() function, update success_show_stats() with correlation rate
- scripts/sw-pipeline.sh — Persist recommended_template: to state file after intake, export PIPELINE_COST_USD before finalize, call success_track_correlation() at completion
- scripts/sw-success-patterns-test.sh — 5 new tests (correlation accepted+success, accepted+failure, rejected+success, cost capture, issue_type extraction)
- config/event-schema.json — Add success.correlation event type
Dependencies: None (all existing: jq, sed, grep)
Risk areas:
- Cost timing: memory_finalize_pipeline (line 2888) currently runs before total_cost is computed (line 2913+), so success_capture_pattern would still see an unset PIPELINE_COST_USD and record 0. Either compute total_cost and export PIPELINE_COST_USD before the finalize call, or move the finalize call after the cost computation.
- State file format: Adding recommended_template: is additive and grep-based — low risk, but must ensure no trailing whitespace or quoting issues.
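
The cost-timing risk comes down to call order. The sketch below uses stub functions with the real names to show the intended sequence; compute_total_cost, finish_pipeline, and CAPTURED_COST are hypothetical, and the cost value is a placeholder.

```shell
# Stubs standing in for the real functions, for illustration only.
success_capture_pattern() { CAPTURED_COST="${PIPELINE_COST_USD:-0}"; }
memory_finalize_pipeline() { success_capture_pattern; }
compute_total_cost() { printf '5.23\n'; }   # placeholder value

# Hypothetical completion step: cost first, export second, finalize last.
finish_pipeline() {
  local total_cost
  total_cost="$(compute_total_cost)"
  # The export must happen before finalize, otherwise the capture
  # falls back to 0 exactly as it does today.
  export PIPELINE_COST_USD="$total_cost"
  memory_finalize_pipeline
}
```

The export matters if the capture runs in a child process; for functions sourced into the same shell a plain assignment would also be visible.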

Validation Criteria

After intake, grep 'recommended_template:' .claude/pipeline-state.md returns the recommended template (when patterns exist)
success_track_acceptance is called with a non-empty _rec_template (existing dead code path now executes)
Captured patterns have cost_usd > 0 when PIPELINE_COST_USD is set
Captured patterns have issue_type populated when intake-metadata.json contains it
success_track_correlation increments recommendations_succeeded only when accepted AND outcome is success
success_show_stats displays correlation rate (succeeded/accepted)
success.correlation event emitted at pipeline completion
All 26 existing tests still pass
5 new tests pass (correlation x3, cost, issue_type)
npm test shows no regressions

Schema Changes

Forward (additive only):

// success-patterns.json — stats object gains:
"recommendations_succeeded": 0
// success-patterns.json — each pattern object gains:
"issue_type": "feature" // nullable, from intake-metadata.json
// success-patterns.json — each pattern object changes:
"cost_usd": 5.23 // was always 0, now reads PIPELINE_COST_USD

Rollback: git revert. Old code ignores new fields via jq // 0 and // null defaults. No data migration needed in either direction.
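
To show how the // 0 default keeps old and new files interchangeable, here is a sketch of the correlation-rate calculation. The function name success_correlation_rate and the field name recommendations_accepted are assumptions; only recommendations_succeeded is named in this design.

```shell
# Hypothetical helper: print succeeded/accepted as a whole percentage.
# recommendations_accepted is an assumed name for the existing counter.
success_correlation_rate() {
  jq -r '
    (.stats.recommendations_succeeded // 0) as $s
    | (.stats.recommendations_accepted // 0) as $a
    | if $a > 0 then ($s * 100 / $a | floor) else 0 end
  ' "$1"
}
```

A file written before this change has no recommendations_succeeded key, so the rate reads as 0 instead of failing, and the zero-accepted case avoids a division by zero.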

Idempotency Strategy

All JSON writes use atomic tmp+mv (${sp_file}.tmp.$$ + mv)
Stats counters are incremented with += 1 via jq, so they are not idempotent: every invocation adds one. This is acceptable because pipeline completion runs exactly once.
Pattern IDs are sha256(goal+template+timestamp) — duplicates from re-runs produce distinct IDs, FIFO cap prevents unbounded growth
Correlation tracking is stateless per call — repeated calls with same args increment counters (pipeline completion should only call once)
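
The FIFO cap mentioned above is what bounds growth when re-runs add near-duplicate patterns. A possible shape for it, assuming a hypothetical append_pattern_capped helper and the same tmp+mv write:

```shell
# Hypothetical helper: append one pattern and keep only the newest $cap.
# $2 must be a JSON object; $3 defaults to the 200-pattern cap.
append_pattern_capped() {
  local sp_file="$1" pattern_json="$2" cap="${3:-200}"
  local tmp="${sp_file}.tmp.$$"
  if jq --argjson p "$pattern_json" --argjson cap "$cap" '
        .patterns = (((.patterns // []) + [$p])
          | if length > $cap then .[(length - $cap):] else . end)
      ' "$sp_file" > "$tmp"; then
    mv "$tmp" "$sp_file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

The oldest entries fall off the front of the array, so a duplicate from a re-run costs one slot and does not grow the file past the cap.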

Rollback Plan

git revert <commit> — removes all changes
Existing success-patterns.json files remain valid — new fields ignored by old code
recommendations_succeeded counter becomes orphaned but harmless (jq // 0 in old stats display)
No external state to clean up
