Pipeline Plan 178

ezigus edited this page Mar 16, 2026 · 4 revisions

Implementation Plan: Pipeline Cost Forecast and Budget Gate (#178)

Socratic Design Refinement

Requirements Clarity

Minimum viable change: Add cost estimation before pipeline start using template stages, historical durations, and model pricing. Display forecast, gate on budget, emit variance events.

Implicit requirements: Cold-start behavior when no historical data exists; graceful degradation with confidence levels; CLI and dashboard parity.

Acceptance criteria (from issue):

  1. Estimate cost using: template stage count x avg duration x model tier cost
  2. Display forecast before pipeline start in CLI and dashboard
  3. Block start if forecast exceeds remaining budget (configurable, --force-start override)
  4. Emit forecast vs actual cost variance to events.jsonl after pipeline completes
  5. Show cost forecast in dashboard when pipeline is queued
  6. Include confidence interval (low/medium/high) based on historical data quality

Alternatives Considered

  • Approach A: Inline calculation in sw-pipeline.sh. Simple, but mixes concerns and is harder to test standalone.
  • Approach B: Dedicated functions in sw-cost.sh with a CLI subcommand. Clean separation, independently testable, reusable from daemon and CLI. Chosen.
  • Approach C: Separate forecast script. Over-engineering for the scope; cost functions belong with the cost module.

Trade-offs: Approach B adds ~270 lines to sw-cost.sh but keeps all cost logic co-located. The pipeline integration is minimal (~42 lines), maintaining separation of concerns.

Risk Assessment

  • Cold-start estimates may be wildly inaccurate: Mitigated by confidence levels (low when <4 data points) and default duration constants.
  • Budget gate could block legitimate pipelines: Mitigated by --force-start override flag.
  • Historical data query could be slow: Mitigated by reading only last 1000 events.jsonl lines.
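
The bounded historical lookup in the last risk item could look roughly like the sketch below. It is written in TypeScript for illustration only (the plan implements this in sw-cost.sh), and the event field names `type`, `stage`, and `duration_s` are assumptions about the events.jsonl schema, not something this page confirms.

```typescript
interface StageStats {
  avgSeconds: number;
  samples: number;
}

// Average duration per stage, using only the last `maxLines` non-empty lines
// of JSONL text. Field names (type, stage, duration_s) are assumed.
function averageStageDurations(
  jsonl: string,
  maxLines: number = 1000,
): Record<string, StageStats> {
  const lines = jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .slice(-maxLines);

  const sums: Record<string, { total: number; samples: number }> = {};
  for (const line of lines) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // skip malformed lines instead of failing the forecast
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const ev = parsed as { type?: unknown; stage?: unknown; duration_s?: unknown };
    if (ev.type !== "stage.completed") continue;
    if (typeof ev.stage !== "string" || typeof ev.duration_s !== "number") continue;
    const entry = sums[ev.stage] || { total: 0, samples: 0 };
    entry.total += ev.duration_s;
    entry.samples += 1;
    sums[ev.stage] = entry;
  }

  const result: Record<string, StageStats> = {};
  for (const stage of Object.keys(sums)) {
    const entry = sums[stage];
    if (!entry) continue;
    result[stage] = { avgSeconds: entry.total / entry.samples, samples: entry.samples };
  }
  return result;
}
```

Capping the input at a fixed number of lines keeps the cost of the lookup independent of how large events.jsonl grows, at the price of ignoring older runs.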

Architecture

Component Diagram

                      CLI / Daemon
                            |
                     sw-pipeline.sh
                      (budget gate)
                            |
                 +----------+----------+
                 |                     |
            sw-cost.sh            dashboard/
         (forecast engine)    (forecast display)
                 |                     |
        +--------+--------+     server.ts (API)
        |        |        |            |
    forecast  display variance    metrics.ts
    function function function      (view)
        |
  events.jsonl
(historical data)

Components

  1. Forecast Engine (sw-cost.sh) - Calculates estimated cost from template stages, historical durations, model pricing
  2. Budget Gate (sw-pipeline.sh) - Checks forecast against remaining budget at pipeline start
  3. Variance Tracker (sw-cost.sh) - Records forecast vs actual after pipeline completion
  4. CLI Interface (sw-cost.sh case statement) - shipwright cost forecast subcommand
  5. Dashboard Display (dashboard/) - API endpoint + metrics view for forecast data

Interface Contracts

// cost_forecast(template_config_path, complexity) → JSON
interface ForecastResult {
  total_usd: number;
  stages: Array<{
    stage: string;
    model: string;
    duration_s: number;
    estimated_cost_usd: number;
  }>;
  confidence: "low" | "medium" | "high";
  data_points: number;
  complexity_multiplier: number;
}

// cost_forecast_display(forecast_json) → formatted CLI output
// cost_record_variance(forecast_usd, actual_usd, template, issue) → event emission
// cost_check_budget(estimated_cost) → exit code: 0=ok, 1=warning, 2=blocked
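
To make the ForecastResult contract concrete, here is a minimal sketch of a forecast calculation that fills it in. Several parts are assumptions rather than the plan's actual method: the model names and per-second rates are placeholders standing in for real model pricing and token-rate heuristics, the complexity multiplier is applied to cost (not to the reported duration), and data_points is taken as the total number of historical samples across the template's stages. The 120 s default and the confidence thresholds come from this plan.

```typescript
interface ForecastResult {
  total_usd: number;
  stages: Array<{ stage: string; model: string; duration_s: number; estimated_cost_usd: number }>;
  confidence: "low" | "medium" | "high";
  data_points: number;
  complexity_multiplier: number;
}

interface StageSpec { stage: string; model: string }
interface StageHistory { avgSeconds: number; samples: number }

const DEFAULT_STAGE_SECONDS = 120; // cold-start baseline named in the plan

// Hypothetical flat rates in USD per second of stage runtime. Placeholders only.
const USD_PER_SECOND: Record<string, number> = { "tier-small": 0.0005, "tier-large": 0.002 };
const FALLBACK_USD_PER_SECOND = 0.002; // assumption: unknown models use the higher rate

// Thresholds from the plan: high (>20 data points), medium (4-20), low (<4).
function confidenceFor(dataPoints: number): "low" | "medium" | "high" {
  if (dataPoints > 20) return "high";
  if (dataPoints >= 4) return "medium";
  return "low";
}

function costForecast(
  stages: StageSpec[],
  history: Record<string, StageHistory>,
  complexityMultiplier: number = 1,
): ForecastResult {
  let totalUsd = 0;
  let dataPoints = 0;
  const rows: ForecastResult["stages"] = [];
  for (const spec of stages) {
    const past = history[spec.stage];
    // Fall back to the default duration when a stage has no history.
    const durationS = past && past.samples > 0 ? past.avgSeconds : DEFAULT_STAGE_SECONDS;
    if (past) dataPoints += past.samples;
    const known = USD_PER_SECOND[spec.model];
    const rate = known !== undefined ? known : FALLBACK_USD_PER_SECOND;
    const cost = durationS * rate * complexityMultiplier;
    rows.push({ stage: spec.stage, model: spec.model, duration_s: durationS, estimated_cost_usd: cost });
    totalUsd += cost;
  }
  return {
    total_usd: totalUsd,
    stages: rows,
    confidence: confidenceFor(dataPoints),
    data_points: dataPoints,
    complexity_multiplier: complexityMultiplier,
  };
}
```

With no history at all, every stage falls back to the default duration and the result carries confidence "low", which matches the cold-start behavior described in the requirements.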

Data Flow

  1. Pipeline start → load template JSON → call cost_forecast()
  2. cost_forecast() → query events.jsonl for historical stage.completed durations → compute per-stage cost → return JSON
  3. Display forecast via cost_forecast_display()
  4. Check cost_check_budget(total_usd) → block or proceed
  5. Emit cost.forecast event
  6. Pipeline runs...
  7. Pipeline completes → cost_record_variance(forecast, actual, template, issue) → emit cost.forecast_variance event
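
The block-or-proceed decision in step 4 can be sketched as a pure function. This is an illustration, not the sw-pipeline.sh code: the status names are hypothetical, and "utilization" is assumed to mean the forecast divided by the remaining budget, with the 80% warning threshold taken from this plan. The blocked case maps to pipeline exit code 1, and --force-start turns it into a warning.

```typescript
type GateStatus = "ok" | "warning" | "blocked" | "forced";

interface GateDecision {
  status: GateStatus;
  proceed: boolean;
  exitCode: number; // pipeline exit code if it stops here; 0 when it proceeds
}

const WARNING_UTILIZATION = 0.8; // warn at >= 80% of the remaining budget

function budgetGate(
  forecastUsd: number,
  remainingBudgetUsd: number,
  forceStart: boolean,
): GateDecision {
  if (forecastUsd > remainingBudgetUsd) {
    // Over budget: block unless the caller passed --force-start.
    if (forceStart) return { status: "forced", proceed: true, exitCode: 0 };
    return { status: "blocked", proceed: false, exitCode: 1 };
  }
  // Assumed definition of utilization: forecast as a share of remaining budget.
  const utilization = remainingBudgetUsd > 0 ? forecastUsd / remainingBudgetUsd : 0;
  if (utilization >= WARNING_UTILIZATION) {
    return { status: "warning", proceed: true, exitCode: 0 };
  }
  return { status: "ok", proceed: true, exitCode: 0 };
}
```

For example, a 5.42 USD forecast against 10 USD remaining proceeds silently, 8.50 USD proceeds with a warning, and 12 USD is blocked unless forced.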

Error Boundaries

  • Forecast engine: returns defaults on missing data (no hard failure)
  • Budget gate: exits with code 1 if over budget (overridable)
  • Variance tracker: best-effort, failure doesn't block pipeline completion
  • Dashboard: returns empty arrays if no forecast data exists

Files to Modify

File | Action | Purpose
scripts/sw-cost.sh | Modify (+271 lines) | Add cost_forecast(), cost_forecast_display(), cost_record_variance(), forecast CLI subcommand
scripts/sw-pipeline.sh | Modify (+42 lines) | Add --force-start flag, forecast display, budget gate in pipeline_start(), variance recording at completion
scripts/sw-cost-test.sh | Modify (+165 lines) | Tests for forecast, display, variance, and budget gate functions
dashboard/server.ts | Modify (+50 lines) | /api/costs/forecast endpoint serving recent forecasts and variance history
dashboard/src/core/api.ts | Modify (+19 lines) | fetchCostForecast() client function
dashboard/src/views/metrics.ts | Modify (+66 lines) | renderCostForecast() component showing forecast table and variance chart

Implementation Steps

  1. Add default stage duration constants to sw-cost.sh — JSON map of stage → default seconds (120s baseline)
  2. Add token rate heuristics by stage category — intake/review are read-heavy (high input), build is write-heavy (high output)
  3. Implement cost_forecast() — reads template stages, queries historical durations from events.jsonl, applies complexity multiplier, computes per-stage cost using model pricing
  4. Implement confidence calculation — high (>20 data points), medium (4-20), low (<4)
  5. Implement cost_forecast_display() — formatted table with stage, model, duration, estimated cost, total, confidence, budget status
  6. Implement cost_record_variance() — computes variance USD and percentage, emits cost.forecast_variance event
  7. Add forecast CLI subcommand: shipwright cost forecast [--pipeline <template>] [--complexity <N>] [--json]
  8. Parse --force-start flag in sw-pipeline.sh argument handling
  9. Integrate forecast gate into pipeline_start() — forecast, display, check budget, emit event or exit
  10. Record variance at pipeline end — capture actual cost, call cost_record_variance()
  11. Add dashboard API endpoint: /api/costs/forecast returns recent forecasts and variance history from events.jsonl
  12. Add dashboard forecast view — table of recent forecasts, variance trend visualization
  13. Write tests — unit tests for forecast calculation, display output, variance computation, budget gate interaction
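
The variance computation in step 6 is simple arithmetic, sketched below in TypeScript for illustration. The sign convention (positive when actual cost exceeds the forecast), the rounding, and the exact set of event fields are assumptions. Only the event name cost.forecast_variance and the forecast/actual/template/issue inputs come from this plan.

```typescript
interface ForecastVarianceEvent {
  type: "cost.forecast_variance";
  forecast_usd: number;
  actual_usd: number;
  variance_usd: number;
  variance_pct: number;
  template: string;
  issue: string;
  ts: string;
}

function roundTo(value: number, decimals: number): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

// Assumed convention: variance is actual minus forecast, so overruns are positive.
function buildVarianceEvent(
  forecastUsd: number,
  actualUsd: number,
  template: string,
  issue: string,
  ts: string,
): ForecastVarianceEvent {
  const varianceUsd = actualUsd - forecastUsd;
  // Guard against division by zero when the forecast was 0.
  const variancePct = forecastUsd > 0 ? (varianceUsd / forecastUsd) * 100 : 0;
  return {
    type: "cost.forecast_variance",
    forecast_usd: forecastUsd,
    actual_usd: actualUsd,
    variance_usd: roundTo(varianceUsd, 2),
    variance_pct: roundTo(variancePct, 1),
    template: template,
    issue: issue,
    ts: ts,
  };
}

// One JSONL line, ready to append to events.jsonl.
function toJsonlLine(event: ForecastVarianceEvent): string {
  return JSON.stringify(event) + "\n";
}
```

A forecast of 5.42 USD against an actual cost of 6.12 USD yields variance_usd 0.7 and variance_pct 12.9 under these assumptions.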

Task Checklist

  • Task 1: Add default stage duration constants and token-rate heuristics to sw-cost.sh
  • Task 2: Implement cost_forecast() function with historical lookup and cold-start defaults
  • Task 3: Implement cost_forecast_display() formatted CLI output
  • Task 4: Implement cost_record_variance() with event emission
  • Task 5: Add forecast CLI subcommand to sw-cost.sh case statement
  • Task 6: Add --force-start flag parsing to sw-pipeline.sh
  • Task 7: Integrate forecast display and budget gate into pipeline_start()
  • Task 8: Record forecast variance at pipeline completion in sw-pipeline.sh
  • Task 9: Add dashboard API endpoint for forecast data
  • Task 10: Add dashboard forecast display component
  • Task 11: Write tests for all forecast functions in sw-cost-test.sh
  • Task 12: Run full test suite and fix any failures

Testing Approach

  • Unit tests (sw-cost-test.sh): Test forecast calculation with mock events.jsonl data, display output format, variance computation accuracy, budget gate return codes
  • Integration: Verify shipwright cost forecast --pipeline standard produces valid JSON output
  • Regression: Run full npm test suite to ensure no existing functionality is broken
  • Manual validation: shipwright cost forecast --json with and without historical data

Performance

Baseline Metrics

  • Pipeline start time: <2s (should not add noticeable latency)
  • Historical query: reads last 1000 lines of events.jsonl (typically <100KB)

Optimization Targets

  • Forecast calculation: <500ms added to pipeline start
  • No blocking I/O on the critical path beyond the single events.jsonl read

Profiling Strategy

  • Not applicable — the forecast is a one-time calculation at pipeline start, not a hot path

Benchmark Plan

  • Not applicable for this scope: the operation is bounded by a single tail -n 1000 | jq pipeline

Definition of Done

  • cost_forecast() returns JSON with total_usd, stages[], confidence, data_points for any template
  • Forecast displayed before pipeline start with formatted table
  • Pipeline blocked when forecast exceeds remaining budget (exit 1)
  • --force-start overrides budget gate with warning
  • cost.forecast event emitted to events.jsonl
  • cost.forecast_variance event emitted after pipeline completes
  • Dashboard shows cost forecast data
  • Confidence level shown as low/medium/high
  • All tests pass (npm test)

Endpoint Specification

GET /api/costs/forecast?period=30

  • Response 200:
    {
     "recent_forecasts": [{"issue": "178", "template": "standard", "forecast_usd": 5.42, "confidence": "medium", "ts": "..."}],
     "variance_history": [{"forecast_usd": 5.42, "actual_usd": 6.12, "variance_pct": 12.9, "template": "standard", "ts": "..."}]
    }
  • Error 500: {"error": "Failed to read forecast data"}
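
One way the server could assemble this response from already-parsed events is sketched below. The input event shape (a `type` discriminator plus an ISO 8601 `ts` and optional cost fields) is an assumption, as is interpreting `period` as a number of days. The function is pure so it can be tested without reading events.jsonl.

```typescript
// Assumed shape of parsed events.jsonl entries relevant to forecasting.
interface CostEvent {
  type: string;
  ts: string;
  issue?: string;
  template?: string;
  forecast_usd?: number;
  actual_usd?: number;
  variance_pct?: number;
  confidence?: string;
}

interface ForecastApiResponse {
  recent_forecasts: Array<{ issue: string; template: string; forecast_usd: number; confidence: string; ts: string }>;
  variance_history: Array<{ forecast_usd: number; actual_usd: number; variance_pct: number; template: string; ts: string }>;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function buildForecastResponse(
  events: CostEvent[],
  periodDays: number,
  nowMs: number,
): ForecastApiResponse {
  const cutoff = nowMs - periodDays * MS_PER_DAY;
  const response: ForecastApiResponse = { recent_forecasts: [], variance_history: [] };
  for (const ev of events) {
    const t = Date.parse(ev.ts);
    if (isNaN(t) || t < cutoff) continue; // drop unparseable or out-of-period events
    if (ev.type === "cost.forecast") {
      response.recent_forecasts.push({
        issue: ev.issue || "",
        template: ev.template || "",
        forecast_usd: ev.forecast_usd || 0,
        confidence: ev.confidence || "low",
        ts: ev.ts,
      });
    } else if (ev.type === "cost.forecast_variance") {
      response.variance_history.push({
        forecast_usd: ev.forecast_usd || 0,
        actual_usd: ev.actual_usd || 0,
        variance_pct: ev.variance_pct || 0,
        template: ev.template || "",
        ts: ev.ts,
      });
    }
  }
  return response;
}
```

When no matching events exist, both arrays come back empty, which lines up with the dashboard error boundary described earlier in the plan.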

Rate Limiting

Not applicable — internal dashboard API, single-user access.

Versioning

No API versioning needed — internal tooling, not public API.

Error Codes

  • Exit 0: Forecast within budget, pipeline proceeds
  • Exit 1: Forecast exceeds budget, pipeline blocked (override with --force-start)
  • Exit 2: Budget check warning (>=80% utilization), pipeline proceeds with warning

Responsive Breakpoints

Not applicable — dashboard uses terminal-width-aware rendering, not responsive CSS breakpoints.

Accessibility Checklist

Not applicable — CLI output and terminal-based dashboard, no web accessibility requirements.

Component Hierarchy

Dashboard cost forecast is a leaf component within the metrics view — no complex state management needed.

State Management Approach

  • Forecast data flows: events.jsonl → server API → client fetch → render
  • No client-side state persistence needed; data is read-only from server
