Pipeline Plan 178
Minimum viable change: Add cost estimation before pipeline start using template stages, historical durations, and model pricing. Display forecast, gate on budget, emit variance events.
Implicit requirements: Cold-start behavior when no historical data exists; graceful degradation with confidence levels; CLI and dashboard parity.
Acceptance criteria (from issue):
- Estimate cost using: template stage count x avg duration x model tier cost
- Display forecast before pipeline start in CLI and dashboard
- Block start if forecast exceeds remaining budget (configurable, `--force-start` override)
- Emit forecast vs actual cost variance to events.jsonl after pipeline completes
- Show cost forecast in dashboard when pipeline is queued
- Include confidence interval (low/medium/high) based on historical data quality
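The first criterion's formula can be made concrete with a small worked sketch. It assumes the model tier cost is expressed in USD per second of stage runtime; the tier names, rates, and durations below are illustrative placeholders, not values taken from sw-cost.sh.

```typescript
// Illustrative rates only: assumed USD per second of stage runtime, by model tier.
const TIER_RATE_USD_PER_S: Record<string, number> = { fast: 0.001, deep: 0.005 };

interface StagePlan {
  stage: string;
  tier: string;
  avgDurationS: number;
}

// Sum of (average duration x tier rate) over the template's stages.
function estimateTotalUsd(stages: StagePlan[]): number {
  let total = 0;
  for (const s of stages) {
    const rate = TIER_RATE_USD_PER_S[s.tier];
    if (rate === undefined) throw new Error(`unknown tier: ${s.tier}`);
    total += s.avgDurationS * rate;
  }
  return Math.round(total * 100) / 100; // round to cents
}

// Hypothetical three-stage template.
const exampleStages: StagePlan[] = [
  { stage: "intake", tier: "fast", avgDurationS: 120 },
  { stage: "build", tier: "deep", avgDurationS: 600 },
  { stage: "review", tier: "fast", avgDurationS: 180 },
];

// 120*0.001 + 600*0.005 + 180*0.001 = 3.30
console.log(estimateTotalUsd(exampleStages));
```

When every stage shares one tier and one average duration, the sum reduces to the issue's wording: stage count x avg duration x tier cost.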
- Approach A: Inline calculation in sw-pipeline.sh. Simple, but mixes concerns and is harder to test standalone.
- Approach B (chosen): Dedicated functions in sw-cost.sh with a CLI subcommand. Clean separation, independently testable, reusable from daemon and CLI.
- Approach C: Separate forecast script. Over-engineering for the scope; cost functions belong with the cost module.
Trade-offs: Approach B adds ~270 lines to sw-cost.sh but keeps all cost logic co-located. The pipeline integration is minimal (~42 lines), maintaining separation of concerns.
- Cold-start estimates may be wildly inaccurate: Mitigated by confidence levels (low when <4 data points) and default duration constants.
- Budget gate could block legitimate pipelines: Mitigated by the `--force-start` override flag.
- Historical data query could be slow: Mitigated by reading only the last 1000 lines of events.jsonl.
```
                CLI / Daemon
                     |
               sw-pipeline.sh
                (budget gate)
                     |
          +----------+----------+
          |                     |
      sw-cost.sh            dashboard/
   (forecast engine)    (forecast display)
          |                     |
 +--------+--------+     server.ts (API)
 |        |        |            |
forecast display variance   metrics.ts
function function function    (view)
          |
     events.jsonl
  (historical data)
```
- Forecast Engine (`sw-cost.sh`) - Calculates estimated cost from template stages, historical durations, model pricing
- Budget Gate (`sw-pipeline.sh`) - Checks forecast against remaining budget at pipeline start
- Variance Tracker (`sw-cost.sh`) - Records forecast vs actual after pipeline completion
- CLI Interface (`sw-cost.sh` case statement) - `shipwright cost forecast` subcommand
- Dashboard Display (`dashboard/`) - API endpoint + metrics view for forecast data
```typescript
// cost_forecast(template_config_path, complexity) → JSON
interface ForecastResult {
  total_usd: number;
  stages: Array<{
    stage: string;
    model: string;
    duration_s: number;
    estimated_cost_usd: number;
  }>;
  confidence: "low" | "medium" | "high";
  data_points: number;
  complexity_multiplier: number;
}

// cost_forecast_display(forecast_json) → formatted CLI output
// cost_record_variance(forecast_usd, actual_usd, template, issue) → event emission
// cost_check_budget(estimated_cost) → exit code: 0=ok, 1=blocked, 2=warning
```
- Pipeline start → load template JSON → call `cost_forecast()`
- `cost_forecast()` → query events.jsonl for historical `stage.completed` durations → compute per-stage cost → return JSON
- Display forecast via `cost_forecast_display()`
- Check `cost_check_budget(total_usd)` → block or proceed
- Emit `cost.forecast` event
- Pipeline runs...
- Pipeline completes → `cost_record_variance(forecast, actual, template, issue)` → emit `cost.forecast_variance` event
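The final step of the flow, turning a forecast and an actual cost into a `cost.forecast_variance` record, might look roughly like the sketch below. The real implementation is a shell function in sw-cost.sh; this TypeScript version only illustrates the arithmetic. The `type`, `variance_usd`, and `issue` field names are assumptions, and percentage variance is assumed to be relative to the forecast.

```typescript
interface VarianceEvent {
  type: "cost.forecast_variance"; // assumed event discriminator field
  forecast_usd: number;
  actual_usd: number;
  variance_usd: number; // assumed field name
  variance_pct: number;
  template: string;
  issue: string;
  ts: string;
}

// Round to a fixed number of decimal places.
function roundTo(value: number, decimals: number): number {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

function buildVarianceEvent(
  forecastUsd: number,
  actualUsd: number,
  template: string,
  issue: string,
  now: Date = new Date(),
): VarianceEvent {
  const varianceUsd = actualUsd - forecastUsd;
  // Guard against a zero forecast so the percentage stays finite.
  const variancePct = forecastUsd > 0 ? (varianceUsd / forecastUsd) * 100 : 0;
  return {
    type: "cost.forecast_variance",
    forecast_usd: forecastUsd,
    actual_usd: actualUsd,
    variance_usd: roundTo(varianceUsd, 2),
    variance_pct: roundTo(variancePct, 1),
    template,
    issue,
    ts: now.toISOString(),
  };
}

// One JSON object per line, the shape a .jsonl file expects.
const varianceLine = JSON.stringify(buildVarianceEvent(5.42, 6.12, "standard", "178"));
console.log(varianceLine);
```

A positive `variance_pct` means the pipeline cost more than forecast; a negative one means it came in under.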
- Forecast engine: returns defaults on missing data (no hard failure)
- Budget gate: exits with code 1 if over budget (overridable)
- Variance tracker: best-effort, failure doesn't block pipeline completion
- Dashboard: returns empty arrays if no forecast data exists
| File | Action | Purpose |
|---|---|---|
| `scripts/sw-cost.sh` | Modify (+271 lines) | Add `cost_forecast()`, `cost_forecast_display()`, `cost_record_variance()`, `forecast` CLI subcommand |
| `scripts/sw-pipeline.sh` | Modify (+42 lines) | Add `--force-start` flag, forecast display, budget gate in `pipeline_start()`, variance recording at completion |
| `scripts/sw-cost-test.sh` | Modify (+165 lines) | Tests for forecast, display, variance, and budget gate functions |
| `dashboard/server.ts` | Modify (+50 lines) | `/api/costs/forecast` endpoint serving recent forecasts and variance history |
| `dashboard/src/core/api.ts` | Modify (+19 lines) | `fetchCostForecast()` client function |
| `dashboard/src/views/metrics.ts` | Modify (+66 lines) | `renderCostForecast()` component showing forecast table and variance chart |
- Add default stage duration constants to `sw-cost.sh`: JSON map of stage → default seconds (120s baseline)
- Add token rate heuristics by stage category: intake/review are read-heavy (high input), build is write-heavy (high output)
- Implement `cost_forecast()`: reads template stages, queries historical durations from events.jsonl, applies complexity multiplier, computes per-stage cost using model pricing
- Implement confidence calculation: high (>20 data points), medium (4-20), low (<4)
- Implement `cost_forecast_display()`: formatted table with stage, model, duration, estimated cost, total, confidence, budget status
- Implement `cost_record_variance()`: computes variance USD and percentage, emits `cost.forecast_variance` event
- Add `forecast` CLI subcommand: `shipwright cost forecast [--pipeline <template>] [--complexity <N>] [--json]`
- Parse `--force-start` flag in `sw-pipeline.sh` argument handling
- Integrate forecast gate into `pipeline_start()`: forecast, display, check budget, emit event or exit
- Record variance at pipeline end: capture actual cost, call `cost_record_variance()`
- Add dashboard API endpoint: `/api/costs/forecast` returns recent forecasts and variance history from events.jsonl
- Add dashboard forecast view: table of recent forecasts, variance trend visualization
- Write tests: unit tests for forecast calculation, display output, variance computation, budget gate interaction
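The confidence thresholds and the cold-start fallback from the steps above can be sketched as follows. This is an illustration in TypeScript rather than the shell implementation; the `StageCompleted` record shape and the choice to count only positive durations as data points are assumptions.

```typescript
type Confidence = "low" | "medium" | "high";

// 120s baseline from the plan, used when a stage has no history.
const DEFAULT_STAGE_DURATION_S = 120;

// high (>20 data points), medium (4-20), low (<4)
function confidenceFor(dataPoints: number): Confidence {
  if (dataPoints > 20) return "high";
  if (dataPoints >= 4) return "medium";
  return "low";
}

// Assumed minimal shape of a historical stage.completed record.
interface StageCompleted {
  stage: string;
  duration_s: number;
}

// Average historical duration for a stage, falling back to the default
// when there are no usable samples (cold start).
function averageDurationS(
  stage: string,
  history: StageCompleted[],
): { durationS: number; dataPoints: number } {
  const samples = history.filter((e) => e.stage === stage && e.duration_s > 0);
  if (samples.length === 0) {
    return { durationS: DEFAULT_STAGE_DURATION_S, dataPoints: 0 };
  }
  const sum = samples.reduce((acc, e) => acc + e.duration_s, 0);
  return { durationS: Math.round(sum / samples.length), dataPoints: samples.length };
}

const sampleHistory: StageCompleted[] = [
  { stage: "build", duration_s: 500 },
  { stage: "build", duration_s: 700 },
];
const buildEstimate = averageDurationS("build", sampleHistory);
console.log(buildEstimate, confidenceFor(buildEstimate.dataPoints));
```

With only two samples the build stage still gets a data-driven duration, but the confidence stays low, which is the graceful degradation the plan calls for.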
- Task 1: Add default stage duration constants and token-rate heuristics to `sw-cost.sh`
- Task 2: Implement `cost_forecast()` function with historical lookup and cold-start defaults
- Task 3: Implement `cost_forecast_display()` formatted CLI output
- Task 4: Implement `cost_record_variance()` with event emission
- Task 5: Add `forecast` CLI subcommand to `sw-cost.sh` case statement
- Task 6: Add `--force-start` flag parsing to `sw-pipeline.sh`
- Task 7: Integrate forecast display and budget gate into `pipeline_start()`
- Task 8: Record forecast variance at pipeline completion in `sw-pipeline.sh`
- Task 9: Add dashboard API endpoint for forecast data
- Task 10: Add dashboard forecast display component
- Task 11: Write tests for all forecast functions in `sw-cost-test.sh`
- Task 12: Run full test suite and fix any failures
- Unit tests (`sw-cost-test.sh`): Test forecast calculation with mock events.jsonl data, display output format, variance computation accuracy, budget gate return codes
- Integration: Verify `shipwright cost forecast --pipeline standard` produces valid JSON output
- Regression: Run full `npm test` suite to ensure no existing functionality is broken
- Manual validation: `shipwright cost forecast --json` with and without historical data
- Pipeline start time: <2s (should not add noticeable latency)
- Historical query: reads last 1000 lines of events.jsonl (typically <100KB)
- Forecast calculation: <500ms added to pipeline start
- No blocking I/O on the critical path beyond the single events.jsonl read
- Not applicable — the forecast is a one-time calculation at pipeline start, not a hot path
- Not applicable for this scope: the operation is bounded by a single `tail -1000 | jq` call
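For code that cannot shell out to `tail` and `jq` (for example, the dashboard server), the same bounded read could be approximated as in the sketch below. It is an illustration only: the function name is hypothetical, and unlike `tail` it takes text that is already in memory and drops blank lines before counting.

```typescript
// Parse at most the last `maxLines` non-empty lines of JSONL text,
// skipping any line that is not a JSON object.
function parseRecentEvents(
  text: string,
  maxLines: number = 1000,
): Record<string, unknown>[] {
  const lines = text.split("\n").filter((l) => l.trim() !== "");
  const events: Record<string, unknown>[] = [];
  for (const line of lines.slice(-maxLines)) {
    try {
      const parsed: unknown = JSON.parse(line);
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
        events.push(parsed as Record<string, unknown>);
      }
    } catch {
      // A malformed line (e.g. a partial write) is ignored rather than failing the read.
    }
  }
  return events;
}

const sampleLog = '{"type":"stage.completed","stage":"build"}\nnot json\n\n{"type":"cost.forecast"}\n';
console.log(parseRecentEvents(sampleLog).length);
```

Capping the line count keeps the work bounded no matter how large events.jsonl grows, which is the point of the 1000-line limit.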
- `cost_forecast()` returns JSON with `total_usd`, `stages[]`, `confidence`, `data_points` for any template
- Forecast displayed before pipeline start with formatted table
- Pipeline blocked when forecast exceeds remaining budget (exit 1)
- `--force-start` overrides budget gate with warning
- `cost.forecast` event emitted to events.jsonl
- `cost.forecast_variance` event emitted after pipeline completes
- Dashboard shows cost forecast data
- Confidence interval shown as low/medium/high
- All tests pass (`npm test`)
- Response 200:

  ```json
  {
    "recent_forecasts": [
      {"issue": "178", "template": "standard", "forecast_usd": 5.42, "confidence": "medium", "ts": "..."}
    ],
    "variance_history": [
      {"forecast_usd": 5.42, "actual_usd": 6.12, "variance_pct": 12.9, "template": "standard", "ts": "..."}
    ]
  }
  ```

- Error 500: `{"error": "Failed to read forecast data"}`
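One way the server could assemble the 200 response from already-parsed events is sketched below. The response field names come from the example above; the `type` discriminator on each event, the `limit` of 20, and the fallback values for missing fields are assumptions.

```typescript
interface ForecastEntry {
  issue: string;
  template: string;
  forecast_usd: number;
  confidence: string;
  ts: string;
}

interface VarianceEntry {
  forecast_usd: number;
  actual_usd: number;
  variance_pct: number;
  template: string;
  ts: string;
}

interface ForecastResponse {
  recent_forecasts: ForecastEntry[];
  variance_history: VarianceEntry[];
}

type RawEvent = Record<string, unknown>;

// Keep the most recent `limit` events of each kind; with no matching
// events both arrays are simply empty.
function buildForecastResponse(events: RawEvent[], limit: number = 20): ForecastResponse {
  const forecasts = events.filter((e) => e.type === "cost.forecast").slice(-limit);
  const variances = events.filter((e) => e.type === "cost.forecast_variance").slice(-limit);
  return {
    recent_forecasts: forecasts.map((e) => ({
      issue: String(e.issue ?? ""),
      template: String(e.template ?? ""),
      forecast_usd: Number(e.forecast_usd ?? 0),
      confidence: String(e.confidence ?? "low"),
      ts: String(e.ts ?? ""),
    })),
    variance_history: variances.map((e) => ({
      forecast_usd: Number(e.forecast_usd ?? 0),
      actual_usd: Number(e.actual_usd ?? 0),
      variance_pct: Number(e.variance_pct ?? 0),
      template: String(e.template ?? ""),
      ts: String(e.ts ?? ""),
    })),
  };
}

console.log(JSON.stringify(buildForecastResponse([])));
```

Returning empty arrays for an empty log lets the client render an empty state without special-casing errors.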
Not applicable — internal dashboard API, single-user access.
No API versioning needed — internal tooling, not public API.
- Exit 0: Forecast within budget, pipeline proceeds
- Exit 1: Forecast exceeds budget, pipeline blocked (override with `--force-start`)
- Exit 2: Budget check warning (>=80% utilization), pipeline proceeds with warning
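The three exit codes map onto a small decision function, sketched here in TypeScript for illustration (the real gate is shell). It assumes "utilization" means the forecast as a fraction of the remaining budget; the function and constant names are hypothetical.

```typescript
const EXIT_OK = 0;
const EXIT_BLOCKED = 1;
const EXIT_WARNING = 2;

// Classify a forecast against the remaining budget.
// Assumption: utilization = forecastUsd / remainingUsd.
function checkBudget(forecastUsd: number, remainingUsd: number, warnRatio: number = 0.8): number {
  if (forecastUsd > remainingUsd) return EXIT_BLOCKED;
  if (remainingUsd > 0 && forecastUsd / remainingUsd >= warnRatio) return EXIT_WARNING;
  return EXIT_OK;
}

// Only a blocked result stops the pipeline, and --force-start overrides it.
function shouldProceed(code: number, forceStart: boolean): boolean {
  return code !== EXIT_BLOCKED || forceStart;
}

console.log(checkBudget(5, 10), checkBudget(8, 10), checkBudget(11, 10));
```

Keeping the classification separate from the proceed decision means `--force-start` never has to change what the budget check itself reports.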
Not applicable — dashboard uses terminal-width-aware rendering, not responsive CSS breakpoints.
Not applicable — CLI output and terminal-based dashboard, no web accessibility requirements.
Dashboard cost forecast is a leaf component within the metrics view — no complex state management needed.
- Forecast data flows: events.jsonl → server API → client fetch → render
- No client-side state persistence needed; data is read-only from server