Pipeline Plan 20

Seth Ford edited this page Feb 11, 2026 · 4 revisions



Plan: End-to-End Integration Test Suite in CI

Overview

Create two new test artifacts: (1) a bash E2E test script that runs the real pipeline with mocked Claude/GitHub (smoke tests) and supports an optional "live" mode against real Claude for integration tests, and (2) a GitHub Actions workflow for integration tests on PRs to main.

The design follows the existing test harness pattern exactly — PASS/FAIL counters, mock binaries in temp dirs, assertion helpers, colored output. No new frameworks or dependencies.

Files to Modify

File | Action | Purpose
scripts/sw-e2e-test.sh | Create | End-to-end integration test suite
.github/workflows/integration.yml | Create | CI workflow for integration tests on PRs
package.json | Modify | Add test:e2e and test:integration npm scripts
.claude/CLAUDE.md | Modify | Add E2E test suite to test suites table

Implementation Steps

Step 1: Create scripts/sw-e2e-test.sh

Follows the exact pattern from sw-pipeline-test.sh: the same boilerplate (header, colors, counters); the same assertion helpers (assert_exit_code, assert_output_contains, assert_file_exists, assert_file_contains, assert_branch_exists, assert_state_contains); the same invoke_pipeline(), which captures output and exit code; the same run_test() runner; and the same setup_env() / reset_test() / cleanup_env() lifecycle.
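As a rough illustration of that harness shape, here is a minimal sketch of the counters, two of the assertion helpers, and the runner. The function names come from the plan, but the bodies, the OUTPUT and EXIT_CODE globals, and the two example tests are assumptions made for illustration; the real sw-pipeline-test.sh may differ.

```bash
#!/usr/bin/env bash
# Minimal harness sketch (assumed implementation, not the real helpers).

PASS=0
FAIL=0

# Run a command, capturing combined output and exit code into globals.
invoke_pipeline() {
    OUTPUT="$("$@" 2>&1)"
    EXIT_CODE=$?
}

# $1 = expected exit code of the last invoke_pipeline call.
assert_exit_code() {
    if [ "$EXIT_CODE" -ne "$1" ]; then
        echo "    expected exit code $1, got $EXIT_CODE"
        return 1
    fi
}

# $1 = fixed string that must appear in the captured output.
assert_output_contains() {
    if ! printf '%s\n' "$OUTPUT" | grep -qF -- "$1"; then
        echo "    output missing: $1"
        return 1
    fi
}

# $1 = name of a test function; updates the counters and prints a colored line.
run_test() {
    if "$1"; then
        PASS=$((PASS + 1))
        printf '\033[32mPASS\033[0m %s\n' "$1"
    else
        FAIL=$((FAIL + 1))
        printf '\033[31mFAIL\033[0m %s\n' "$1"
    fi
}

# Example tests; `sh -c` stands in for the real pipeline command.
test_example_passes() {
    invoke_pipeline sh -c 'echo "stage: intake"; exit 0'
    assert_exit_code 0 && assert_output_contains "stage: intake"
}

test_example_fails() {
    invoke_pipeline sh -c 'exit 3'
    assert_exit_code 0
}

run_test test_example_passes
run_test test_example_fails
echo "Results: $PASS passed, $FAIL failed"
```

A real runner would typically finish by exiting non-zero when FAIL is greater than zero, so that CI marks the job as failed.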

Two modes: --smoke (default, mocked, fast) and --live (real Claude, budget-capped).
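One way the mode switch might be parsed is sketched below. Only the --smoke and --live flags come from the plan; the MODE variable and the parse_args function are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch of mode selection; MODE and parse_args are illustrative names.

MODE="smoke"   # default: mocked, fast

parse_args() {
    while [ $# -gt 0 ]; do
        case "$1" in
            --smoke) MODE="smoke" ;;
            --live)  MODE="live" ;;
            *)
                echo "Unknown option: $1" >&2
                return 2
                ;;
        esac
        shift
    done
}

parse_args "$@"
echo "mode: $MODE"
```

Keeping smoke as the default means a bare invocation never spends API budget.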

10 Smoke Tests (mocked — always run in CI):

  1. test_full_pipeline_stage_order — Run fast template (intake→build→test→PR) with mocks. Verify exit 0, state file status: complete, all stage artifacts exist.
  2. test_stage_order_preserved — Parse output for stage markers, verify correct ordering.
  3. test_state_file_updated_per_stage — Verify state file has all required YAML fields: pipeline, goal, status, branch, stage_progress, timestamps.
  4. test_no_unhandled_errors — Run full pipeline, verify no unbound variable or command not found in output.
  5. test_resume_from_interrupted — Run intake-only, manually edit state, resume. Verify continuation.
  6. test_artifacts_have_valid_json — Verify all .json in pipeline-artifacts are valid via jq.
  7. test_branch_created_cleanly — Feature branch exists, has commits, clean working tree.
  8. test_dry_run_produces_no_artifacts — Run with --dry-run, verify no artifacts directory is created.
  9. test_pipeline_with_custom_template — Custom template with only intake+build, verify only those stages run.
  10. test_error_recovery_no_crash — Mock claude returns exit 1 during plan, verify graceful failure.
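The sketch below shows how a mock claude binary and two of the output checks above (stage ordering and unhandled shell errors) could be put together. The "stage: <name>" marker format, the MOCK_CLAUDE_EXIT variable, and the helper names stages_in_order and no_unhandled_errors are assumptions; the real pipeline's output format is not specified in this plan.

```bash
#!/usr/bin/env bash
# Sketch: a mock `claude` on PATH plus two output checks (assumed formats).

TMP_DIR="$(mktemp -d)"
trap 'rm -rf "$TMP_DIR"' EXIT
mkdir -p "$TMP_DIR/bin"

# Mock binary: prints a canned reply and exits with MOCK_CLAUDE_EXIT (default 0).
cat > "$TMP_DIR/bin/claude" <<'EOF'
#!/usr/bin/env bash
echo "mock claude reply"
exit "${MOCK_CLAUDE_EXIT:-0}"
EOF
chmod +x "$TMP_DIR/bin/claude"
PATH="$TMP_DIR/bin:$PATH"
export PATH

# $1 = output text; remaining args = markers expected in this order.
# Succeeds only if each marker first appears on a later line than the previous one.
stages_in_order() {
    local text="$1" last=0 line marker
    shift
    for marker in "$@"; do
        line="$(printf '%s\n' "$text" | grep -nF -- "$marker" | head -n 1 | cut -d: -f1)"
        [ -n "$line" ] || return 1
        [ "$line" -gt "$last" ] || return 1
        last="$line"
    done
}

# $1 = output text. Succeeds only if no typical shell-failure message appears.
no_unhandled_errors() {
    ! printf '%s\n' "$1" | grep -qE 'unbound variable|command not found'
}

# Sample output standing in for a real pipeline run.
SAMPLE="$(printf 'stage: intake\n%s\nstage: build\nstage: test\nstage: pr\n' "$(claude)")"

stages_in_order "$SAMPLE" "stage: intake" "stage: build" "stage: test" "stage: pr" \
    && echo "order ok"
no_unhandled_errors "$SAMPLE" && echo "no unhandled errors"

# Simulate the failure path used by the error-recovery test.
MOCK_CLAUDE_EXIT=1 claude >/dev/null || echo "mock failure exit: $?"
```

Putting the mock directory first on PATH lets the real pipeline script run unmodified while every claude call resolves to the stub.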

Step 2: Live Integration Mode

Same script with --live flag. One test: trivial goal ("Add a comment to README.md"), intake→build with real Claude. Budget protection: --max-turns 5, fast template, 120s timeout, $1.00 cap. Skips gracefully if claude CLI unavailable.
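The skip logic and the timeout could look roughly like the sketch below. Because macOS does not ship GNU timeout, the sketch uses a background watcher instead. The names run_with_timeout and run_live_test are hypothetical, and how the $1.00 cap is enforced is not covered here, since that depends on pipeline options this plan does not spell out.

```bash
#!/usr/bin/env bash
# Sketch: graceful skip plus a portable timeout (helper names are assumptions).

# $1 = seconds, rest = command. Returns 124 if the command was terminated
# by the watcher (mirroring GNU timeout), else the command's own exit code.
run_with_timeout() {
    local secs="$1" cmd_pid watcher_pid status
    shift
    "$@" &
    cmd_pid=$!
    # Watcher: after $secs seconds, send TERM to the command.
    ( sleep "$secs"; kill -TERM "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watcher_pid=$!
    wait "$cmd_pid" 2>/dev/null
    status=$?
    # Stop the watcher if it is still waiting (its sleep may linger briefly).
    kill "$watcher_pid" 2>/dev/null
    wait "$watcher_pid" 2>/dev/null
    # 143 = 128 + SIGTERM; treated here as a timeout.
    if [ "$status" -eq 143 ]; then
        return 124
    fi
    return "$status"
}

# Runs the given command under the 120s limit, or skips when claude is absent.
run_live_test() {
    if ! command -v claude >/dev/null 2>&1; then
        echo "SKIP live test: claude CLI not found"
        return 0
    fi
    run_with_timeout 120 "$@"
}

# Placeholder command; the real call would invoke the pipeline.
run_live_test echo "pipeline would run here"
```

Returning 0 on skip keeps npm run test:integration green on machines without the CLI, which matches the "skips gracefully" requirement.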

Step 3: CI Workflow

.github/workflows/integration.yml runs the smoke tests on a macOS + Ubuntu matrix for every PR to main. A conditional live integration job runs only when the CLAUDE_API_KEY secret exists. Test result artifacts are uploaded.
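A workflow along these lines might look like the following configuration sketch. The job names, the e2e-results/ artifact path, and the choice to gate the live step on an environment variable are assumptions. The secret is exposed through a job-level env entry because the secrets context cannot be read directly in an if: condition.

```yaml
# Sketch of .github/workflows/integration.yml (job names and paths are assumptions).
name: Integration Tests

on:
  pull_request:
    branches: [main]

jobs:
  smoke:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run smoke tests
        run: bash scripts/sw-e2e-test.sh --smoke
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-results-${{ matrix.os }}
          path: e2e-results/          # hypothetical output directory
          if-no-files-found: ignore

  live:
    needs: smoke
    runs-on: ubuntu-latest
    env:
      CLAUDE_API_KEY: ${{ secrets.CLAUDE_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - name: Run live integration test
        # Runs only when the secret is configured for this repository.
        if: ${{ env.CLAUDE_API_KEY != '' }}
        run: bash scripts/sw-e2e-test.sh --live
```

Secrets are not passed to workflows triggered from forks, so on fork PRs the live step would be skipped and only the smoke matrix would run.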

Step 4: npm Scripts

Add "test:e2e" and "test:integration" to package.json scripts.

Step 5: Update CLAUDE.md

Add new test suite to the test suites table.

Task Checklist

  • Task 1: Create scripts/sw-e2e-test.sh boilerplate (header, colors, counters, assertions, setup/cleanup, main)
  • Task 2: Implement mock environment setup (real pipeline + templates, mock binaries, mock git project)
  • Task 3: Implement 10 smoke tests
  • Task 4: Implement live integration mode with budget protection and skip logic
  • Task 5: Create .github/workflows/integration.yml
  • Task 6: Update package.json with new scripts
  • Task 7: Update .claude/CLAUDE.md test suites table
  • Task 8: Run smoke tests locally to verify they pass
  • Task 9: Run ShellCheck and fix warnings

Definition of Done

  • npm run test:e2e runs the smoke test suite and passes
  • npm run test:integration runs live tests or skips gracefully
  • CI runs smoke tests on every PR to main
  • Budget-capped to $1.00 per live test run
  • Pass/fail output matches existing test suite style
  • Bash 3.2 compatible, no ShellCheck errors
