Pipeline Design 179
Shipwright pipelines use templates (fast, standard, full, hotfix, autonomous, cost-aware) that determine which stages run and how gates are handled. Today, template selection is manual or daemon-configured — there's no learning from historical outcomes.
The core success pattern engine (scripts/sw-success-patterns.sh, 586 lines) and 26 tests already exist. It captures patterns on pipeline completion, computes TF-IDF keyword similarity, and recommends templates. However, four gaps prevent it from being a closed-loop learning system:
- Acceptance tracking is dead code: `sw-pipeline.sh:2894` reads `recommended_template:` from the state file, but nothing writes it after intake.
- Cost always 0: `success_capture_pattern()` hardcodes `cost_usd: 0` (line 242) despite the pipeline computing `total_cost` at lines 2680/2921.
- No success correlation: tracks acceptance/rejection but never records whether accepted recommendations led to successful pipelines.
- No issue_type field: patterns have labels and complexity but lack an explicit issue type (bug/feature/etc).
Constraints: Bash 3.2 compatibility required. All JSON manipulation via jq --arg (no string interpolation). Atomic writes via tmp+mv. The success-patterns.json schema must remain backwards compatible (additive fields only, jq // 0 defaults).
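The two write-side constraints (values passed through `jq --arg` rather than interpolated into the filter, and atomic replacement via a temp file plus `mv`) can be sketched as a single helper. This is a minimal illustration, not code from the repository: the helper name `sp_atomic_update` and its argument order are assumptions.

```bash
# Hypothetical helper (not part of sw-success-patterns.sh):
# apply a jq filter to a JSON file and replace the file atomically.
# Usage: sp_atomic_update <file> <jq-filter> [jq options such as --arg name value]
sp_atomic_update() {
    local sp_file="$1" filter="$2"
    shift 2
    # Temp file lives next to the target so mv is a same-filesystem rename.
    local tmp="${sp_file}.tmp.$$"
    if jq "$@" "$filter" "$sp_file" > "$tmp" 2>/dev/null; then
        mv "$tmp" "$sp_file"
    else
        # On any jq failure, discard the temp file and leave the original untouched.
        rm -f "$tmp"
        return 1
    fi
}
```

A call such as `sp_atomic_update "$sp_file" '.stats.note = $n' --arg n "$value"` keeps `$value` out of the filter text, so quotes or backslashes in the value cannot break the JSON.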
Minimal targeted fixes (~80 lines across 3 files). All changes are additive JSON fields — no schema migration, no new dependencies.
┌──────────────────────────────────────────────────────────────────┐
│ sw-pipeline.sh │
│ │
│ ┌─────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Intake │───▶│ Display Rec │───▶│ Persist recommended_ │ │
│ │ Stage │ │ (existing) │ │ template to state file │ │
│ └─────────┘ └──────────────┘ └────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Pipeline Completion │ │
│ │ 1. Compute total_cost → export PIPELINE_COST_USD │ │
│ │ 2. memory_finalize_pipeline() → capture_pattern() │ │
│ │ 3. success_track_acceptance(recommended, actual) │ │
│ │ 4. success_track_correlation(recommended, actual, pass?) │ │
│ └──────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ sw-success-patterns.sh │
│ │
│ ┌──────────────────┐ ┌─────────────────────┐ │
│ │ capture_pattern() │ │ recommend_template() │ │
│ │ +issue_type field │ │ (TF-IDF similarity) │ │
│ │ +PIPELINE_COST_USD│ └─────────────────────┘ │
│ └──────────────────┘ │
│ ┌──────────────────────┐ ┌──────────────────┐ │
│ │track_correlation() │ │ show_stats() │ │
│ │ NEW: accepted+pass → │ │ +correlation_rate │ │
│ │ increment succeeded │ └──────────────────┘ │
│ └──────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ ~/.shipwright/memory/<repo-hash>/ │
│ success-patterns.json │
│ │
│ { "version": 1, │
│ "patterns": [ { ..., "cost_usd": 5.23, │
│ "issue_type": "feature" } ], │
│ "stats": { ..., "recommendations_succeeded": 12 } } │
└──────────────────────────────────────────────────────────────────┘
// Gap 1: State file persistence (in sw-pipeline.sh, after intake stage)
// Write "recommended_template: <template>" to STATE_FILE
// Read by existing grep at line 2894

// Gap 2: Cost capture (sw-success-patterns.sh)
success_capture_pattern(state_file: string, artifacts_dir: string): void
// Reads $PIPELINE_COST_USD env var (set by sw-pipeline.sh before calling)
// Falls back to 0 if unset (backwards compatible)

// Gap 3: Correlation tracking (NEW function)
success_track_correlation(
  recommended_template: string,  // what was recommended
  actual_template: string,       // what was used
  outcome: "success" | "failure" // pipeline result
): void
// If recommended == actual AND outcome == "success":
//   stats.recommendations_succeeded += 1
// Emits success.correlation event

// Gap 4: Issue type extraction (within existing capture_pattern)
// Reads .issue_type from intake-metadata.json
// Adds "issue_type": "feature"|"bug"|... to pattern JSON
// Defaults to null if not present

// Updated stats display
success_show_stats(): void
// Now includes:
//   Correlation rate: succeeded/accepted * 100%
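The Gap 3 contract could look roughly like the following in Bash. This is a sketch under stated assumptions: the `SP_FILE` variable and its default path are hypothetical, and event emission is left out because the event helper is not shown in this document.

```bash
# Sketch of the Gap 3 contract. SP_FILE is an assumed variable pointing at
# success-patterns.json; the real script may locate the file differently.
SP_FILE="${SP_FILE:-./success-patterns.json}"

success_track_correlation() {
    local recommended="$1" actual="$2" outcome="$3"

    # Only an accepted recommendation that ended in success counts.
    [ -n "$recommended" ] || return 0
    [ "$recommended" = "$actual" ] || return 0
    [ "$outcome" = "success" ] || return 0
    [ -f "$SP_FILE" ] || return 0

    local tmp="${SP_FILE}.tmp.$$"
    # `// 0` lets files written before this change (no counter yet) start at zero.
    if jq '.stats.recommendations_succeeded = ((.stats.recommendations_succeeded // 0) + 1)' \
        "$SP_FILE" > "$tmp" 2>/dev/null; then
        mv "$tmp" "$SP_FILE"
    else
        rm -f "$tmp"
        return 1
    fi
    # Emitting the success.correlation event would go here (omitted in this sketch).
}
```

At the call site the ADR's error-handling rule would apply, for example `success_track_correlation "$rec" "$actual" "$outcome" 2>/dev/null || true`, so a failure here never blocks pipeline completion.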
Pipeline Start:
intake completes
→ success_recommend_template(goal, labels, complexity)
→ returns {template, confidence, rationale} (or empty)
→ display recommendation box (existing)
→ NEW: write "recommended_template: <template>" to state file
Pipeline Completion:
compute total_cost (line ~2680 or ~2921)
→ NEW: export PIPELINE_COST_USD=$total_cost
memory_finalize_pipeline()
→ success_capture_pattern(state, artifacts)
→ reads PIPELINE_COST_USD (replaces hardcoded 0)
→ reads issue_type from intake-metadata.json
success_track_acceptance(recommended, actual) — existing, now works
→ NEW: success_track_correlation(recommended, actual, outcome)
→ if accepted AND succeeded: stats.recommendations_succeeded += 1
→ emit success.correlation event
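The first new step in the flow above, writing `recommended_template:` to the state file so the existing reader finds it, might be sketched like this. Both function names are hypothetical, and the sketch appends with `printf` for brevity where the ADR describes a `sed` append; the existing grep at line 2894 is not reproduced here, so the reader below is only an approximation of it.

```bash
# Hypothetical helpers; names and exact parsing are assumptions for illustration.
persist_recommended_template() {
    local state_file="$1" template="$2"
    [ -n "$template" ] || return 0
    # Drop any earlier entry so re-running intake leaves exactly one line.
    if [ -f "$state_file" ]; then
        grep -v '^recommended_template:' "$state_file" > "${state_file}.tmp.$$" || true
        mv "${state_file}.tmp.$$" "$state_file"
    fi
    printf 'recommended_template: %s\n' "$template" >> "$state_file"
}

read_recommended_template() {
    local state_file="$1"
    # Take the value after the key, then strip surrounding whitespace and quotes.
    grep '^recommended_template:' "$state_file" 2>/dev/null \
        | head -n 1 \
        | sed -e 's/^recommended_template:[[:space:]]*//' \
              -e 's/[[:space:]]*$//' \
              -e 's/^"\(.*\)"$/\1/'
}
```

Stripping whitespace and quotes on the read side is one way to make the acceptance comparison robust against the trailing-whitespace and quoting problems a grep-based format can pick up.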
- State file write failure (Gap 1): `sed` append wrapped in `|| true`; recommendation display still works, and acceptance tracking degrades gracefully to a no-op (same as current behavior).
- Missing PIPELINE_COST_USD (Gap 2): `${PIPELINE_COST_USD:-0}` falls back to current behavior.
- Correlation function failure (Gap 3): called with `2>/dev/null || true`; pipeline completion is never blocked.
- Missing intake-metadata.json (Gap 4): `jq -r '.issue_type // ""'` with `2>/dev/null || true`; field remains null.
All error handling follows the existing pattern: non-critical operations never block the pipeline.
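The Gap 2 and Gap 4 fallbacks can be combined in one small function that builds the additive pattern fields. This is a sketch only: `build_pattern_extras` is a hypothetical name, and the mapping of an empty `issue_type` string to JSON `null` is an assumption about how "field remains null" would be realized.

```bash
# Hypothetical helper: build the additive pattern fields with safe fallbacks.
# Prints a compact JSON object with cost_usd and issue_type.
build_pattern_extras() {
    local artifacts_dir="$1"
    local cost="${PIPELINE_COST_USD:-0}"   # unset or empty falls back to 0
    local issue_type=""

    # Missing or malformed metadata leaves issue_type empty instead of failing.
    if [ -f "$artifacts_dir/intake-metadata.json" ]; then
        issue_type="$(jq -r '.issue_type // ""' \
            "$artifacts_dir/intake-metadata.json" 2>/dev/null || true)"
    fi

    # --arg passes both values as strings; `tonumber?` yields nothing on bad
    # input, so `// 0` supplies the default. Empty issue_type becomes null.
    jq -n -c --arg cost "$cost" --arg issue_type "$issue_type" '{
        cost_usd: (($cost | tonumber?) // 0),
        issue_type: (if $issue_type == "" then null else $issue_type end)
    }'
}
```

Because every failure path collapses to a default value, the function never returns an error for missing inputs, which matches the rule that non-critical operations do not block the pipeline.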
- Embedding-based similarity matching. Pros: better semantic understanding of issue descriptions. Cons: adds a model dependency (an embedding model is needed at query time), is overkill for categorical+keyword matching, and violates the <100ms query constraint for local-only operation. A shell implementation would require an external service call.
- SQLite storage instead of JSON. Pros: proper indexing, better query performance at scale. Cons: adds a binary dependency; the current 200-pattern FIFO cap keeps JSON fast enough (`jq` queries are <50ms on 200 records). Would revisit if the cap increases to 1000+.
- Full rewrite in Node/TypeScript. Pros: matches the project's Node toolchain, better testability. Cons: shell integration is the natural fit for pipeline scripts, the existing 26 tests pass, and a rewrite adds risk for no functional gain. The pattern engine is a pipeline-internal concern, not a user-facing API.
- Files to create: None
- Files to modify:
  - `scripts/sw-success-patterns.sh`: add `issue_type` extraction in `success_capture_pattern()`, read `$PIPELINE_COST_USD` instead of the hardcoded 0, add the `success_track_correlation()` function, update `success_show_stats()` with the correlation rate
  - `scripts/sw-pipeline.sh`: persist `recommended_template:` to the state file after intake, export `PIPELINE_COST_USD` before finalize, call `success_track_correlation()` at completion
  - `scripts/sw-success-patterns-test.sh`: 5 new tests (correlation accepted+success, accepted+failure, rejected+success, cost capture, issue_type extraction)
  - `config/event-schema.json`: add the `success.correlation` event type
- Dependencies: None (all existing: `jq`, `sed`, `grep`)
- Risk areas:
  - Cost timing: `memory_finalize_pipeline` (line 2888) runs before `total_cost` is computed (line 2913+). Must either move the export before finalize, or reorder the calls. The env var approach (`export PIPELINE_COST_USD`) requires the export to happen before `memory_finalize_pipeline` calls `success_capture_pattern`.
  - State file format: adding `recommended_template:` is additive and grep-based, so the risk is low, but the written line must have no trailing whitespace or quoting issues.
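The cost-timing risk comes down to call order, which the following sketch makes concrete. The three stub functions stand in for the real ones and are assumptions for illustration only; `compute_total_cost` in particular is a hypothetical placeholder for whatever computes `total_cost` in `sw-pipeline.sh`.

```bash
# Stubs standing in for the real functions; only the call order matters here.
success_capture_pattern() { CAPTURED_COST="${PIPELINE_COST_USD:-0}"; }
memory_finalize_pipeline() { success_capture_pattern; }
compute_total_cost() { echo "5.23"; }   # hypothetical placeholder value

complete_pipeline() {
    local total_cost
    total_cost="$(compute_total_cost)"
    # Export first, so the capture that runs inside finalize sees the real cost.
    export PIPELINE_COST_USD="$total_cost"
    memory_finalize_pipeline
}
```

If `memory_finalize_pipeline` ran before the export, `CAPTURED_COST` would silently be 0, which is exactly the current behavior the change is meant to fix.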
- After intake, `grep 'recommended_template:' .claude/pipeline-state.md` returns the recommended template (when patterns exist)
- `success_track_acceptance` is called with a non-empty `_rec_template` (existing dead code path now executes)
- Captured patterns have `cost_usd > 0` when `PIPELINE_COST_USD` is set
- Captured patterns have `issue_type` populated when `intake-metadata.json` contains it
- `success_track_correlation` increments `recommendations_succeeded` only when accepted AND outcome is success
- `success_show_stats` displays correlation rate (succeeded/accepted)
- `success.correlation` event emitted at pipeline completion
- All 26 existing tests still pass
- 5 new tests pass (correlation x3, cost, issue_type)
- `npm test` shows no regressions
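The correlation-rate criterion (succeeded/accepted) could be computed along these lines. This is a sketch with one loud assumption: the accepted-count field is called `recommendations_accepted` here, but the existing stats schema is not shown in this document and the real field name may differ. The function name `correlation_rate` is also hypothetical.

```bash
# Hypothetical helper for the stats display.
# ASSUMPTION: the accepted counter is stored as .stats.recommendations_accepted.
correlation_rate() {
    local sp_file="$1"
    jq -r '
        (.stats.recommendations_accepted // 0) as $acc
        | (.stats.recommendations_succeeded // 0) as $ok
        | if $acc > 0
          then ((($ok * 100) / $acc) | floor | tostring) + "%"
          else "n/a"
          end
    ' "$sp_file"
}
```

The `// 0` defaults keep files without the new counter readable, and the `$acc > 0` guard avoids a division by zero before any recommendation has been accepted.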
Forward (additive only):
// success-patterns.json: stats object gains
"recommendations_succeeded": 0

// success-patterns.json: each pattern object gains
"issue_type": "feature" // nullable, from intake-metadata.json

// success-patterns.json: each pattern object changes
"cost_usd": 5.23 // was always 0, now reads PIPELINE_COST_USD
Rollback: git revert. Old code ignores new fields via jq // 0 and // null defaults. No data migration needed in either direction.
- All JSON writes use atomic tmp+mv (`${sp_file}.tmp.$$` + `mv`)
- Stats use `+= 1` via jq: each invocation increments exactly once, so the counters are not idempotent across repeated invocations for the same event. This is acceptable because pipeline completion runs exactly once.
- Pattern IDs are sha256(goal+template+timestamp), so duplicates from re-runs produce distinct IDs; the FIFO cap prevents unbounded growth
- Correlation tracking is stateless per call: repeated calls with the same args increment counters (pipeline completion should only call once)
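The pattern-ID scheme can be illustrated with a short sketch. The exact concatenation (order, separators) used by the real script is not shown in this document, so plain concatenation below is an assumption, as is the function name `pattern_id`.

```bash
# Hypothetical helper: derive a pattern ID as sha256(goal+template+timestamp).
# ASSUMPTION: the three values are concatenated with no separator.
pattern_id() {
    local goal="$1" template="$2" timestamp="$3"
    # sha256sum is typical on Linux; shasum -a 256 is the macOS equivalent.
    if command -v sha256sum >/dev/null 2>&1; then
        printf '%s%s%s' "$goal" "$template" "$timestamp" | sha256sum | cut -d' ' -f1
    else
        printf '%s%s%s' "$goal" "$template" "$timestamp" | shasum -a 256 | cut -d' ' -f1
    fi
}
```

Because the timestamp is part of the hash input, re-running the same goal with the same template yields a new ID each time, which is why the FIFO cap rather than ID-based deduplication bounds the file size.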
- `git revert <commit>` removes all changes
- Existing `success-patterns.json` files remain valid; new fields are ignored by old code
- `recommendations_succeeded` counter becomes orphaned but harmless (jq `// 0` in old stats display)
- No external state to clean up