Pipeline Plan 178

ezigus edited this page Mar 16, 2026 · 4 revisions

Implementation Plan: Pipeline Cost Forecast and Budget Gate (#178)

Socratic Design Refinement

Requirements Clarity

Minimum viable change: Add cost estimation before pipeline start using template stages, historical durations, and model pricing. Display forecast, gate on budget, emit variance events.

Implicit requirements: Cold-start behavior when no historical data exists; graceful degradation with confidence levels; CLI and dashboard parity.

Acceptance criteria (from issue):

Estimate cost using: template stage count x avg duration x model tier cost
Display forecast before pipeline start in CLI and dashboard
Block start if forecast exceeds remaining budget (configurable, --force-start override)
Emit forecast vs actual cost variance to events.jsonl after pipeline completes
Show cost forecast in dashboard when pipeline is queued
Include confidence level (low/medium/high) based on historical data quality
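
The first criterion's formula can be made concrete with a small sketch. The per-second tier rates below are made-up placeholders rather than real model pricing, and `StageInput`, `TIER_COST_PER_SECOND`, and `estimateCost` are hypothetical names used only for illustration. The real engine is planned for sw-cost.sh and also applies token-rate heuristics.

```typescript
// Hypothetical per-second cost for each model tier (placeholder values, not real pricing).
const TIER_COST_PER_SECOND: Record<string, number> = {
  small: 0.001,
  medium: 0.002,
  large: 0.01,
};

interface StageInput {
  stage: string;
  tier: string;         // key into TIER_COST_PER_SECOND
  avgDurationS: number; // average historical duration in seconds
}

// Sum of (avg duration x tier cost) over every stage in the template.
function estimateCost(stages: StageInput[]): number {
  let total = 0;
  for (const s of stages) {
    const rate = TIER_COST_PER_SECOND[s.tier];
    if (rate === undefined) {
      throw new Error("unknown model tier: " + s.tier);
    }
    total += s.avgDurationS * rate;
  }
  // Round to cents so the figure is stable for display and comparison.
  return Math.round(total * 100) / 100;
}

// Three stages: 60s small + 600s large + 120s medium.
const example = estimateCost([
  { stage: "intake", tier: "small", avgDurationS: 60 },
  { stage: "build", tier: "large", avgDurationS: 600 },
  { stage: "review", tier: "medium", avgDurationS: 120 },
]);
```

With these placeholder rates the three stages contribute 0.06, 6.00, and 0.24 USD, so `example` comes out at 6.30 USD.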

Alternatives Considered

Approach A: Inline calculation in sw-pipeline.sh - Simple but mixes concerns, harder to test standalone.
Approach B: Dedicated functions in sw-cost.sh with CLI subcommand - Clean separation, independently testable, reusable from daemon and CLI. Chosen.
Approach C: Separate forecast script - Over-engineering for the scope; cost functions belong with cost module.

Trade-offs: Approach B adds ~270 lines to sw-cost.sh but keeps all cost logic co-located. The pipeline integration is minimal (~42 lines), maintaining separation of concerns.

Risk Assessment

Cold-start estimates may be wildly inaccurate: Mitigated by confidence levels (low when <4 data points) and default duration constants.
Budget gate could block legitimate pipelines: Mitigated by --force-start override flag.
Historical data query could be slow: Mitigated by reading only last 1000 events.jsonl lines.

Architecture

Component Diagram

                  CLI / Daemon
                        |
                 sw-pipeline.sh
                  (budget gate)
                        |
             +----------+----------+
             |                     |
        sw-cost.sh            dashboard/
     (forecast engine)    (forecast display)
             |                     |
    +--------+--------+     server.ts (API)
    |        |        |            |
forecast  display variance    metrics.ts
function function function      (view)
             |
       events.jsonl
     (historical data)

Components

Forecast Engine (sw-cost.sh) - Calculates estimated cost from template stages, historical durations, model pricing
Budget Gate (sw-pipeline.sh) - Checks forecast against remaining budget at pipeline start
Variance Tracker (sw-cost.sh) - Records forecast vs actual after pipeline completion
CLI Interface (sw-cost.sh case statement) - shipwright cost forecast subcommand
Dashboard Display (dashboard/) - API endpoint + metrics view for forecast data

Interface Contracts

// cost_forecast(template_config_path, complexity) → JSON
interface ForecastResult {
 total_usd: number;
 stages: Array<{
 stage: string;
 model: string;
 duration_s: number;
 estimated_cost_usd: number;
 }>;
 confidence: "low" | "medium" | "high";
 data_points: number;
 complexity_multiplier: number;
}
// cost_forecast_display(forecast_json) → formatted CLI output
// cost_record_variance(forecast_usd, actual_usd, template, issue) → event emission
// cost_check_budget(estimated_cost) → exit code: 0=ok, 1=warning, 2=blocked
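
As an illustration of the `cost_check_budget` contract, here is a minimal TypeScript sketch. It uses the status numbering from the comment above (0 = ok, 1 = warning, 2 = blocked). The `checkBudget` and `shouldStart` names, the explicit `spentUsd`/`budgetUsd` parameters, and the definition of utilization as (spent + estimate) / budget are assumptions made for the example. The 80% warning threshold is taken from the Error Codes section.

```typescript
type BudgetStatus = 0 | 1 | 2; // 0 = ok, 1 = warning, 2 = blocked

// Compare a forecast against the remaining budget.
// Assumption: utilization is measured after adding the forecast to current spend.
function checkBudget(
  estimatedUsd: number,
  spentUsd: number,
  budgetUsd: number,
  warnAt: number = 0.8
): BudgetStatus {
  const remaining = budgetUsd - spentUsd;
  if (estimatedUsd > remaining) {
    return 2; // forecast exceeds remaining budget
  }
  // Guard against division by zero when no budget amount is set.
  const utilization = budgetUsd > 0 ? (spentUsd + estimatedUsd) / budgetUsd : 0;
  if (utilization >= warnAt) {
    return 1; // within budget, but close to the limit
  }
  return 0;
}

// Pipeline-side decision: a blocked status stops the start unless --force-start was given.
function shouldStart(status: BudgetStatus, forceStart: boolean): boolean {
  return status !== 2 || forceStart;
}
```

Keeping the status calculation separate from the start decision means the override flag never has to be passed into the cost module.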

Data Flow

Pipeline start → load template JSON → call cost_forecast()
cost_forecast() → query events.jsonl for historical stage.completed durations → compute per-stage cost → return JSON
Display forecast via cost_forecast_display()
Check cost_check_budget(total_usd) → block or proceed
Emit cost.forecast event
Pipeline runs...
Pipeline completes → cost_record_variance(forecast, actual, template, issue) → emit cost.forecast_variance event
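
The historical lookup in the second step can be sketched as follows. The event field names (`type`, `stage`, `duration_s`) are assumptions about the events.jsonl schema, and `averageStageDurations` is a hypothetical helper. The 1000-line window and the 120s default come from this plan.

```typescript
const DEFAULT_DURATION_S = 120; // cold-start baseline from the plan
const MAX_LINES = 1000;         // only the tail of events.jsonl is read

interface HistoryResult {
  durations: Record<string, number>; // average seconds per stage
  dataPoints: number;                // matching stage.completed events found
}

// Average the durations of stage.completed events for the given stages,
// falling back to the default when a stage has no history.
function averageStageDurations(jsonl: string, stages: string[]): HistoryResult {
  const lines = jsonl.split("\n").filter((l) => l.trim() !== "");
  const recent = lines.slice(-MAX_LINES);
  const sums: Record<string, number> = {};
  const counts: Record<string, number> = {};
  let dataPoints = 0;

  for (const line of recent) {
    let ev: any;
    try {
      ev = JSON.parse(line);
    } catch (e) {
      continue; // skip malformed lines instead of failing the forecast
    }
    if (!ev || ev.type !== "stage.completed") continue;
    if (typeof ev.stage !== "string" || typeof ev.duration_s !== "number") continue;
    if (stages.indexOf(ev.stage) === -1) continue;
    sums[ev.stage] = (sums[ev.stage] || 0) + ev.duration_s;
    counts[ev.stage] = (counts[ev.stage] || 0) + 1;
    dataPoints++;
  }

  const durations: Record<string, number> = {};
  for (const stage of stages) {
    durations[stage] = counts[stage] ? sums[stage] / counts[stage] : DEFAULT_DURATION_S;
  }
  return { durations, dataPoints };
}
```

Returning `dataPoints` alongside the averages lets the caller derive the confidence level without reading the file a second time.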

Error Boundaries

Forecast engine: returns defaults on missing data (no hard failure)
Budget gate: exits with code 1 if over budget (overridable)
Variance tracker: best-effort, failure doesn't block pipeline completion
Dashboard: returns empty arrays if no forecast data exists
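
The best-effort boundary for the variance tracker could look like the sketch below. `bestEffort` is a hypothetical wrapper, not a function from the codebase. It only shows the idea that a failure is logged and reported to the caller rather than propagated.

```typescript
// Run a non-critical step; log and swallow any error so the caller can continue.
// Returns true when the step succeeded, false when it failed.
function bestEffort(
  label: string,
  fn: () => void,
  log: (msg: string) => void
): boolean {
  try {
    fn();
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log("warning: " + label + " failed: " + reason);
    return false;
  }
}
```

In a shell script the same effect is commonly achieved by appending `|| true` to the call, so a non-zero status does not abort the pipeline.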

Files to Modify

File	Action	Purpose
`scripts/sw-cost.sh`	Modify (+271 lines)	Add `cost_forecast()`, `cost_forecast_display()`, `cost_record_variance()`, `forecast` CLI subcommand
`scripts/sw-pipeline.sh`	Modify (+42 lines)	Add `--force-start` flag, forecast display, budget gate in `pipeline_start()`, variance recording at completion
`scripts/sw-cost-test.sh`	Modify (+165 lines)	Tests for forecast, display, variance, and budget gate functions
`dashboard/server.ts`	Modify (+50 lines)	`/api/costs/forecast` endpoint serving recent forecasts and variance history
`dashboard/src/core/api.ts`	Modify (+19 lines)	`fetchCostForecast()` client function
`dashboard/src/views/metrics.ts`	Modify (+66 lines)	`renderCostForecast()` component showing forecast table and variance chart

Implementation Steps

Add default stage duration constants to sw-cost.sh — JSON map of stage → default seconds (120s baseline)
Add token rate heuristics by stage category — intake/review are read-heavy (high input), build is write-heavy (high output)
Implement cost_forecast() — reads template stages, queries historical durations from events.jsonl, applies complexity multiplier, computes per-stage cost using model pricing
Implement confidence calculation — high (>20 data points), medium (4-20), low (<4)
Implement cost_forecast_display() — formatted table with stage, model, duration, estimated cost, total, confidence, budget status
Implement cost_record_variance() — computes variance USD and percentage, emits cost.forecast_variance event
Add forecast CLI subcommand — shipwright cost forecast [--pipeline <template>] [--complexity <N>] [--json]
Parse --force-start flag in sw-pipeline.sh argument handling
Integrate forecast gate into pipeline_start() — forecast, display, check budget, emit event or exit
Record variance at pipeline end — capture actual cost, call cost_record_variance()
Add dashboard API endpoint — /api/costs/forecast returns recent forecasts and variance history from events.jsonl
Add dashboard forecast view — table of recent forecasts, variance trend visualization
Write tests — unit tests for forecast calculation, display output, variance computation, budget gate interaction
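
The confidence thresholds and the variance computation from the steps above are small enough to sketch directly. The function names are hypothetical. The sign convention (actual minus forecast, so a positive value means the run cost more than predicted) and the rounding are assumptions, chosen to agree with the sample values in the Endpoint Specification.

```typescript
type Confidence = "low" | "medium" | "high";

// high: more than 20 data points, medium: 4 to 20, low: fewer than 4.
function confidenceFor(dataPoints: number): Confidence {
  if (dataPoints > 20) return "high";
  if (dataPoints >= 4) return "medium";
  return "low";
}

interface Variance {
  variance_usd: number;
  variance_pct: number;
}

// Variance of actual cost relative to the forecast.
function computeVariance(forecastUsd: number, actualUsd: number): Variance {
  const diff = actualUsd - forecastUsd;
  // Avoid dividing by zero when the forecast was 0.
  const pct = forecastUsd > 0 ? (diff / forecastUsd) * 100 : 0;
  return {
    variance_usd: Math.round(diff * 100) / 100, // cents
    variance_pct: Math.round(pct * 10) / 10,    // one decimal place
  };
}
```

For a forecast of 5.42 USD and an actual cost of 6.12 USD this yields a variance of 0.70 USD, or 12.9%.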

Task Checklist

Task 1: Add default stage duration constants and token-rate heuristics to sw-cost.sh
Task 2: Implement cost_forecast() function with historical lookup and cold-start defaults
Task 3: Implement cost_forecast_display() formatted CLI output
Task 4: Implement cost_record_variance() with event emission
Task 5: Add forecast CLI subcommand to sw-cost.sh case statement
Task 6: Add --force-start flag parsing to sw-pipeline.sh
Task 7: Integrate forecast display and budget gate into pipeline_start()
Task 8: Record forecast variance at pipeline completion in sw-pipeline.sh
Task 9: Add dashboard API endpoint for forecast data
Task 10: Add dashboard forecast display component
Task 11: Write tests for all forecast functions in sw-cost-test.sh
Task 12: Run full test suite and fix any failures

Testing Approach

Unit tests (sw-cost-test.sh): Test forecast calculation with mock events.jsonl data, display output format, variance computation accuracy, budget gate return codes
Integration: Verify shipwright cost forecast --pipeline standard produces valid JSON output
Regression: Run full npm test suite to ensure no existing functionality is broken
Manual validation: shipwright cost forecast --json with and without historical data

Performance

Baseline Metrics

Pipeline start time: <2s (should not add noticeable latency)
Historical query: reads last 1000 lines of events.jsonl (typically <100KB)

Optimization Targets

Forecast calculation: <500ms added to pipeline start
No blocking I/O on the critical path beyond the single events.jsonl read

Profiling Strategy

Not applicable — the forecast is a one-time calculation at pipeline start, not a hot path

Benchmark Plan

Not applicable for this scope — the operation is bounded by a single tail -1000 | jq call

Definition of Done

cost_forecast() returns JSON with total_usd, stages[], confidence, data_points for any template
Forecast displayed before pipeline start with formatted table
Pipeline blocked when forecast exceeds remaining budget (exit 1)
--force-start overrides budget gate with warning
cost.forecast event emitted to events.jsonl
cost.forecast_variance event emitted after pipeline completes
Dashboard shows cost forecast data
Confidence level shown as low/medium/high
All tests pass (npm test)

Endpoint Specification

`GET /api/costs/forecast?period=30`

Response 200:

{
 "recent_forecasts": [{"issue": "178", "template": "standard", "forecast_usd": 5.42, "confidence": "medium", "ts": "..."}],
 "variance_history": [{"forecast_usd": 5.42, "actual_usd": 6.12, "variance_pct": 12.9, "template": "standard", "ts": "..."}]
}

Error 500: {"error": "Failed to read forecast data"}
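
One way the handler could assemble this response from already-parsed events is sketched below. It assumes `period` is a number of days and that each event carries a `type` and an ISO 8601 `ts` field. Both are assumptions, as are the `CostEvent` and `buildForecastResponse` names. The event type strings are the ones named in this plan.

```typescript
interface CostEvent {
  type: string;
  ts: string; // ISO 8601 timestamp
  [key: string]: unknown;
}

interface ForecastResponse {
  recent_forecasts: CostEvent[];
  variance_history: CostEvent[];
}

// Keep events inside the requested period and split them by event type.
// nowMs is passed in so the function stays deterministic and easy to test.
function buildForecastResponse(
  events: CostEvent[],
  periodDays: number,
  nowMs: number
): ForecastResponse {
  const cutoff = nowMs - periodDays * 24 * 60 * 60 * 1000;
  const inPeriod = events.filter((e) => {
    const t = Date.parse(e.ts);
    return !isNaN(t) && t >= cutoff; // drop events with unparseable timestamps
  });
  return {
    recent_forecasts: inPeriod.filter((e) => e.type === "cost.forecast"),
    variance_history: inPeriod.filter((e) => e.type === "cost.forecast_variance"),
  };
}
```

When no events match, both arrays are empty, which is the behavior the Error Boundaries section asks of the dashboard.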

Rate Limiting

Not applicable — internal dashboard API, single-user access.

Versioning

No API versioning needed — internal tooling, not public API.

Error Codes

Exit 0: Forecast within budget, pipeline proceeds
Exit 1: Forecast exceeds budget, pipeline blocked (override with --force-start)
Exit 2: Budget check warning (>=80% utilization), pipeline proceeds with warning

Responsive Breakpoints

Not applicable — dashboard uses terminal-width-aware rendering, not responsive CSS breakpoints.

Accessibility Checklist

Not applicable — CLI output and terminal-based dashboard, no web accessibility requirements.

Component Hierarchy

Dashboard cost forecast is a leaf component within the metrics view — no complex state management needed.

State Management Approach

Forecast data flows: events.jsonl → server API → client fetch → render
No client-side state persistence needed; data is read-only from server
