Pipeline Plan 179

ezigus edited this page Mar 16, 2026 · 3 revisions

Implementation Plan — Success Pattern Learning & Template Recommendation Engine

Status: Implementation Complete (Iteration 1)

The full feature was implemented in commit a212a34. All 26 tests pass. This plan documents the architecture and validates completeness against acceptance criteria.

Brainstorming: Design Decisions

Requirements Clarity

Minimum viable change: A standalone bash module (sw-success-patterns.sh) that captures success patterns, matches by keyword similarity, and recommends templates — integrated into existing pipeline/daemon hooks.

Implicit requirements answered:

Cold-start handling: requires minimum 3 patterns before making recommendations
Confidence threshold: 60% minimum prevents noisy recommendations
Storage cap: 200 patterns with FIFO eviction prevents unbounded growth

Alternatives Considered

Approach	Complexity	Performance	Maintainability	Blast Radius
A: Standalone bash module with TF-IDF keywords	Low	Good (jq-based)	High (single file)	Minimal (3 integration points)
B: SQLite-backed with full-text search	Medium	Better at scale	Lower (SQLite dep)	Medium (new dependency)
C: Embedding-based vector similarity	High	Best matching	Low (Python/Node dep)	Large (new runtime)

Chosen: Approach A. TF-IDF keyword matching with domain expansion is sufficient for the pattern volumes shipwright handles (capped at 200). It stays in bash, consistent with the entire codebase, and requires zero new dependencies. SQLite or embeddings would be over-engineering for a 200-item dataset.

Risk Analysis

Risk	Impact	Mitigation
jq not installed	Capture/recommend silently fails	All jq calls wrapped in `
Corrupt success-patterns.json	Lost pattern data	Atomic writes (tmp file + mv); init on missing file
False recommendations	Wrong template chosen	60% confidence threshold; minimum 2 keyword overlap; minimum 3 patterns
Performance on large pattern files	Slow recommendations	200 pattern cap; single jq invocation for scoring

Component Diagram

┌─────────────────────┐ ┌──────────────────────┐
│ sw-pipeline.sh │ │ daemon-triage.sh │
│ (display + track) │ │ (recommend) │
└────────┬────────────┘ └──────────┬────────────┘
 │ │
 ▼ ▼
┌─────────────────────────────────────────────────┐
│ sw-success-patterns.sh │
│ ┌─────────────┐ ┌──────────────┐ │
│ │ Capture │ │ Recommend │ │
│ │ (pattern │ │ (keyword │ │
│ │ extraction │ │ matching) │ │
│ └──────┬──────┘ └──────┬───────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────────────┐ │
│ │ success-patterns.json │ │
│ │ (200 patterns, FIFO) │ │
│ └─────────────────────────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Display │ │ Track │ │ Stats │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────┘
 │
 ▼
┌─────────────────────┐ ┌──────────────────────┐
│ sw-memory.sh │ │ sw-eventbus.sh │
│ (finalize hook) │ │ (event logging) │
└─────────────────────┘ └──────────────────────┘

Interface Contracts

// Capture — called by memory_finalize_pipeline() on success
success_capture_pattern(state_file: string, artifacts_dir: string): void
// Input: pipeline-state.md (YAML frontmatter), intake-metadata.json
// Output: Appends pattern to success-patterns.json
// Errors: Returns 0 silently on any failure (graceful degradation)
// Recommend — called by daemon-triage.sh at pipeline start
success_recommend_template(goal: string, labels: string, complexity: string): JSON | empty
// Input: goal text, comma-separated labels, "low"|"medium"|"high"
// Output: JSON { template, confidence, similar_count, avg_duration_secs, rationale, matched_ids }
// Returns empty if < 3 patterns, no keyword overlap, or confidence < threshold
// Display — called by sw-pipeline.sh after intake stage
success_display_recommendation(goal: string, labels: string, complexity: string): void
// Output: Formatted terminal box with recommendation details
// Track — called by sw-pipeline.sh at pipeline completion
success_track_acceptance(recommended_template: string, actual_template: string): void
// Increments recommendations_accepted if templates match
// Stats — CLI command
success_show_stats(): void
// Output: Pattern count, recommendation stats, template breakdown

Data Flow

Pipeline completes successfully
 → memory_finalize_pipeline() [sw-memory.sh:681]
 → success_capture_pattern(state_file, artifacts_dir) [sw-success-patterns.sh:127]
 → Parse state file (goal, template, issue, timestamps, stages)
 → Extract keywords with domain expansion
 → Append pattern to success-patterns.json (atomic write)
 → Emit "success.pattern_captured" event
 → FIFO cap at 200 patterns
New pipeline starts
 → daemon triage [daemon-triage.sh:404]
 → success_recommend_template(goal, labels, complexity)
 → Extract keywords from goal
 → Score each pattern by keyword overlap (Jaccard-like)
 → Filter: min 2 matches, min 0.2 score
 → Group by template, pick winner by count + avg_score
 → Return recommendation if confidence >= 60%
 → pipeline intake completes [sw-pipeline.sh:1803]
 → success_display_recommendation() — show formatted box
 → pipeline completes [sw-pipeline.sh:2892]
 → success_track_acceptance() — record if recommendation was followed

Error Boundaries

sw-success-patterns.sh: All public functions return 0 on error. Never blocks pipeline execution. All jq calls have || true fallbacks.
sw-pipeline.sh: Integration calls guarded by type ... >/dev/null 2>&1 checks and 2>/dev/null || true.
daemon-triage.sh: Recommendation is advisory only; template selection falls back to complexity-based default if no recommendation.
sw-memory.sh: Capture call is in a || true block; memory finalization succeeds even if pattern capture fails.

Schema (Data Pipeline)

success-patterns.json

{
 "version": 1,
 "patterns": [{
 "id": "sha256-16-chars",
 "goal": "issue description",
 "labels": ["bug", "enhancement"],
 "template": "standard",
 "complexity": "medium",
 "duration_secs": 1800,
 "cost_usd": 0,
 "stage_results": {"intake": "complete", "build": "complete"},
 "keywords": ["auth", "authentication", "middleware"],
 "issue_number": 42,
 "repo": "owner/repo",
 "captured_at": "2026年03月16日T10:00:00Z"
 }],
 "stats": {
 "total_captured": 15,
 "recommendations_made": 8,
 "recommendations_accepted": 6,
 "last_updated": "2026年03月16日T10:00:00Z"
 }
}

Idempotency Strategy

Pattern IDs are SHA-256 of goal + template + timestamp. Duplicate captures produce different IDs (timestamp differs), which is acceptable — the 200-cap FIFO eviction handles cleanup. No deduplication needed since each pipeline run is unique.

Rollback Plan

Delete scripts/sw-success-patterns.sh and scripts/sw-success-patterns-test.sh
Revert integration lines in sw-memory.sh, sw-pipeline.sh, daemon-triage.sh
Remove event types from config/event-schema.json
Remove test entry from package.json
Pattern data in ~/.shipwright/memory/*/success-patterns.json is inert — safe to leave or delete

Files Modified

File	Action	Purpose
`scripts/sw-success-patterns.sh`	Created	Core module: capture, recommend, display, track, stats
`scripts/sw-success-patterns-test.sh`	Created	26 test cases covering all functions
`scripts/sw-memory.sh`	Modified	Hook capture into `memory_finalize_pipeline()`
`scripts/sw-pipeline.sh`	Modified	Display recommendation after intake; track acceptance at completion
`scripts/lib/daemon-triage.sh`	Modified	Use recommendation in template selection
`config/event-schema.json`	Modified	Add success.* event type definitions
`package.json`	Modified	Register test suite in `npm test`

Task Decomposition

Create sw-success-patterns.sh — storage helpers, keyword extraction, init (no deps)
Implement capture — depends on Task 1
Implement recommend — depends on Task 1 (keyword extraction)
Implement display — depends on Task 3
Implement tracking + stats — depends on Task 1
Add CLI dispatcher — depends on Tasks 2-5
Add event schema — independent
Integrate into sw-memory.sh — depends on Task 2 (capture)
Integrate into daemon-triage.sh — depends on Task 3 (recommend)
Integrate into sw-pipeline.sh — depends on Tasks 4, 5 (display + track)
Write test suite — depends on Tasks 1-6
Register in package.json — depends on Task 11
Run full test suite — depends on all above

Task 7 is independent. Tasks 8-10 are independent of each other but depend on core module. Task 11 depends on core module being complete.

Testing Approach

Unit tests (sw-success-patterns-test.sh): 26 tests covering init, keyword extraction, capture, skip-failed, cold-start, recommendation, no-match, pattern cap, acceptance tracking, stats display, CLI subcommands, display, events
Integration: Capture is called from memory_finalize_pipeline() — tested via existing memory tests
Regression: Full npm test suite validates no regressions in 100+ test scripts

Definition of Done

Success patterns captured after every successful pipeline completion
Patterns stored in ~/.shipwright/memory/<repo>/success-patterns.json
TF-IDF keyword matching with domain expansion recommends template
Confidence score calculated from match count and keyword overlap
Recommendation displayed at pipeline start with rationale string
60% confidence threshold prevents low-quality recommendations
Acceptance rate tracked (accepted vs rejected)
Stats available via success_show_stats() and CLI stats command
200-pattern cap with FIFO eviction
All 26 tests pass
No regressions in full test suite
Events emitted for capture, recommendation, acceptance/rejection

Endpoint Specification

Not applicable — this is an internal bash module, not an API. All interfaces are shell functions documented in Interface Contracts above.

Rate Limiting / Versioning

Not applicable — internal module with no external-facing API surface. Schema version field ("version": 1) in success-patterns.json enables future migration if structure changes.

Pipeline Plan 179

Implementation Plan — Success Pattern Learning & Template Recommendation Engine

Status: Implementation Complete (Iteration 1)

Brainstorming: Design Decisions

Requirements Clarity

Alternatives Considered

Risk Analysis

Component Diagram

Interface Contracts

Data Flow

Error Boundaries

Schema (Data Pipeline)

success-patterns.json

Idempotency Strategy

Rollback Plan

Files Modified

Task Decomposition

Testing Approach

Definition of Done

Endpoint Specification

Rate Limiting / Versioning

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!