Pipeline Plan 20

Seth Ford edited this page Feb 12, 2026 · 4 revisions

Plan: End-to-End Integration Test Suite in CI

Overview

Add an integration test suite that goes beyond the existing 17 mock-based test suites: it runs the real pipeline against enhanced mock binaries and validates stage ordering, state transitions, and artifact generation. A new CI workflow adds budget enforcement and structured reporting.

Two tiers:

  1. Smoke tests (no Claude, no GitHub): validate pipeline machinery with deterministic mocks
  2. Live integration tests (require CLAUDE_API_KEY + GITHUB_TOKEN): run a minimal pipeline against real Claude, budget-capped at $1.00

Files to Modify

| File | Action | Purpose |
| --- | --- | --- |
| scripts/sw-integration-test.sh | Create | Integration test suite (~500 lines) |
| templates/pipelines/integration-test.json | Create | Minimal 3-stage pipeline template |
| .github/workflows/integration-test.yml | Create | CI workflow (smoke + live jobs) |
| .github/workflows/test.yml | Modify | Add smoke test step after unit tests |
| package.json | Modify | Add test:smoke and test:integration scripts |

Implementation Steps

Step 1: Create templates/pipelines/integration-test.json

Minimal template: intake → build → test only, all auto-gated, model sonnet, max 3 iterations.

Step 2: Create scripts/sw-integration-test.sh

Same harness pattern as sw-pipeline-test.sh: set -euo pipefail, ERR trap, PASS/FAIL counters, temp dir isolation, mock binaries on PATH, assertion functions.
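The harness pattern described above can be sketched roughly as follows. This is a minimal illustration, not the contents of sw-pipeline-test.sh: the names `TEST_TMP`, `assert_eq`, `assert_file_exists`, `run_test`, and the two `test_example_*` functions are hypothetical, the mock `claude` binary is a stand-in, and the ERR trap is left out to keep the sketch short.

```shell
#!/usr/bin/env bash
# Hypothetical harness sketch; all names are illustrative assumptions.
set -euo pipefail

PASS=0
FAIL=0

# Isolated scratch directory, removed when the script exits.
TEST_TMP="$(mktemp -d)"
trap 'rm -rf "$TEST_TMP"' EXIT

# Mock binaries go first on PATH so they shadow the real tools.
mkdir -p "$TEST_TMP/bin"
cat > "$TEST_TMP/bin/claude" <<'EOF'
#!/usr/bin/env bash
echo "mock claude: $*"
EOF
chmod +x "$TEST_TMP/bin/claude"
PATH="$TEST_TMP/bin:$PATH"
export PATH

assert_eq() {
  # usage: assert_eq <expected> <actual> <label>
  if [ "$1" = "$2" ]; then return 0; fi
  echo "  assert_eq failed ($3): expected '$1', got '$2'" >&2
  return 1
}

assert_file_exists() {
  if [ -f "$1" ]; then return 0; fi
  echo "  missing file: $1" >&2
  return 1
}

run_test() {
  local name="$1" rc
  # Run the subshell outside any conditional so `set -e` stays
  # effective inside it; a failed assertion aborts only that test.
  set +e
  ( set -e; "$name" )
  rc=$?
  set -e
  if [ "$rc" -eq 0 ]; then
    PASS=$((PASS + 1)); echo "PASS $name"
  else
    FAIL=$((FAIL + 1)); echo "FAIL $name"
  fi
}

# Two example tests: one passes against the mock, one fails on purpose.
test_example_mock_on_path() {
  local out
  out="$(claude --version)"
  assert_eq "mock claude: --version" "$out" "mock output"
}

test_example_failure() {
  assert_file_exists "$TEST_TMP/does-not-exist"
}

run_test test_example_mock_on_path
run_test test_example_failure
echo "passed=$PASS failed=$FAIL"
```

A real suite would end with a non-zero exit when `FAIL` is greater than zero; the sketch only prints the counters so it can be run as-is.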

Tier 1 — Smoke Tests (10 tests, always run):

  1. test_smoke_full_stage_ordering: run the fast template E2E, parse pipeline-state.md, verify stage_progress order and status: success
  2. test_smoke_state_transitions: verify state transitions pending → running → complete per stage, and that current_stage updates correctly
  3. test_smoke_artifact_integrity: verify all expected artifacts exist: intake.json (valid JSON), plan.md, test-results.log, events.jsonl entries
  4. test_smoke_no_crashes_fast_template: fast template exits 0, no ERROR lines
  5. test_smoke_no_crashes_standard_template: standard template exits 0
  6. test_smoke_no_crashes_autonomous_template: autonomous template exits 0
  7. test_smoke_budget_enforcement: mock cost at $1.01, verify the pipeline fails; mock cost at $0.99, verify it passes
  8. test_smoke_resume_preserves_artifacts: fail at the test stage, resume, verify prior artifacts are preserved
  9. test_smoke_dry_run_no_side_effects: --dry-run creates no artifacts, no branches, no events
  10. test_smoke_stage_timing_recorded: each completed stage has a duration in the state file log
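For the budget enforcement test, bash 3.2 has no floating-point arithmetic, so one option is to compare costs as integer cents. The sketch below assumes the pipeline reports cost as a plain decimal dollar string with at most two fractional digits, and that the cap is inclusive (a cost of exactly $1.00 passes, matching "abort if cost > $1.00" in the Definition of Done). The function names `to_cents`, `within_budget`, and `check_budget` are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Convert a dollar string such as "1.01", "0.9", or "2" to integer cents.
# Assumption: at most two fractional digits; extra digits are truncated.
to_cents() {
  local amount="$1" dollars frac
  dollars="${amount%%.*}"
  if [ "$amount" = "$dollars" ]; then
    frac="00"                 # no decimal point present
  else
    frac="${amount#*.}00"     # pad so "0.9" becomes 90 cents
    frac="${frac:0:2}"
  fi
  # 10# forces base 10 so values like "08" are not read as octal.
  echo $(( 10#${dollars:-0} * 100 + 10#$frac ))
}

# Return 0 when cost <= cap, 1 otherwise.
within_budget() {
  local cost_cents cap_cents
  cost_cents="$(to_cents "$1")"
  cap_cents="$(to_cents "$2")"
  [ "$cost_cents" -le "$cap_cents" ]
}

# Hypothetical wrapper mirroring the two cases in the smoke test.
check_budget() {
  if within_budget "$1" "1.00"; then
    echo "pass"
  else
    echo "fail"
  fi
}

check_budget "1.01"
check_budget "0.99"
```

If the real cost field carries more precision than cents, the conversion would need a finer unit; that detail depends on how the pipeline records cost, which this plan does not specify.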

Tier 2 — Live Tests (2 tests, require secrets):

  1. test_live_trivial_readme_change: real Claude makes a trivial change; verify a diff exists, git status is clean, and cost < $1.00
  2. test_live_pr_creation: verify a PR is created with a valid URL, then clean up

Mode selection: the --live flag enables Tier 2; --filter <name> runs a single test.
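One way the mode selection could be wired up is a small argument parser plus a predicate that decides whether a given test runs. This is a sketch under assumptions: the variable names `RUN_LIVE` and `FILTER` and the functions `parse_args` and `should_run` are hypothetical, and the filter is treated as an exact test-name match.

```shell
#!/usr/bin/env bash
set -euo pipefail

RUN_LIVE=0
FILTER=""

parse_args() {
  while [ $# -gt 0 ]; do
    case "$1" in
      --live)
        RUN_LIVE=1
        shift
        ;;
      --filter)
        if [ $# -lt 2 ]; then
          echo "--filter requires a test name" >&2
          return 2
        fi
        FILTER="$2"
        shift 2
        ;;
      *)
        echo "unknown option: $1" >&2
        return 2
        ;;
    esac
  done
}

should_run() {
  local name="$1"
  # Tier 2 tests run only when --live was given.
  case "$name" in
    test_live_*) [ "$RUN_LIVE" -eq 1 ] || return 1 ;;
  esac
  # An empty filter selects everything; otherwise require an exact match.
  [ -z "$FILTER" ] || [ "$FILTER" = "$name" ]
}

# Demo: with a filter set, only the named test is selected.
parse_args --filter test_smoke_budget_enforcement
for t in test_smoke_budget_enforcement test_smoke_dry_run_no_side_effects test_live_pr_creation; do
  if should_run "$t"; then echo "run  $t"; else echo "skip $t"; fi
done
```

In the actual script the parser would be fed the script's own arguments; the demo passes them explicitly so the selection logic is visible.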

Step 3: Create .github/workflows/integration-test.yml

  • smoke job: Matrix (macOS + Ubuntu), runs on every PR and push to main
  • live job: Runs after smoke passes, only on push to main or manual trigger, uses integration-test environment for secret access, 15-minute timeout, uploads artifacts

Step 4: Update package.json

Add "test:smoke" and "test:integration" scripts.

Step 5: Update .github/workflows/test.yml

Add a "- name: Run integration smoke tests" step after the existing unit test steps.


Task Checklist

  • Task 1: Create templates/pipelines/integration-test.json
  • Task 2: Create scripts/sw-integration-test.sh scaffold (harness, mocks, assertions)
  • Task 3: Implement smoke tests 1-6 (ordering, transitions, artifacts, crash tests)
  • Task 4: Implement smoke tests 7-10 (budget, resume, dry-run, timing)
  • Task 5: Implement live tests (README change, PR creation)
  • Task 6: Create .github/workflows/integration-test.yml
  • Task 7: Update package.json with new scripts
  • Task 8: Update .github/workflows/test.yml with smoke step
  • Task 9: Add CI summary reporting ($GITHUB_STEP_SUMMARY markdown table)
  • Task 10: Run smoke tests locally and fix failures
  • Task 11: Verify CLAUDE.md AUTO sections update
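For Task 9, GitHub Actions renders any markdown appended to the file named by $GITHUB_STEP_SUMMARY as the job summary. A possible sketch of the reporting helper follows; the names `RESULTS`, `record_result`, and `write_summary` are hypothetical, and results are kept as newline-separated "name|status" records because bash 3.2 has no associative arrays.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Accumulated results, one "name|status" record per line.
RESULTS=""

record_result() {
  # usage: record_result <test_name> <PASS|FAIL>
  RESULTS="${RESULTS}$1|$2
"
}

write_summary() {
  # GITHUB_STEP_SUMMARY is set by GitHub Actions; print to stdout locally.
  local out="${GITHUB_STEP_SUMMARY:-/dev/stdout}"
  {
    echo "| Test | Result |"
    echo "| --- | --- |"
    printf '%s' "$RESULTS" | while IFS='|' read -r name status; do
      echo "| $name | $status |"
    done
  } >> "$out"
}

# Demo with two of the planned smoke tests.
record_result test_smoke_full_stage_ordering PASS
record_result test_smoke_budget_enforcement FAIL
write_summary
```

Outside CI the table goes to stdout, so the same call works for local runs without special-casing.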

Testing Approach

  1. Local: bash scripts/sw-integration-test.sh — all 10 smoke tests pass
  2. Filter: --filter test_smoke_budget_enforcement for debugging individual tests
  3. CI: Push branch → verify workflow triggers and summary reports
  4. Live: Manual dispatch with run_live: true or merge to main
  5. Cross-platform: macOS + Ubuntu matrix

Definition of Done

  • npm run test:smoke passes (exit 0)
  • npm run test:integration runs full suite when secrets available
  • CI runs smoke tests on every PR to main
  • CI runs live tests on push to main (regression)
  • Budget cap: live tests abort if cost > $1.00
  • CI summary: pass/fail per test in $GITHUB_STEP_SUMMARY
  • Existing 17 test suites still pass
  • Follows conventions: set -euo pipefail, bash 3.2, shipwright color theme
