Pipeline Plan 20
I have the complete plan ready. Here it is:
Create two new test artifacts: (1) a bash E2E test script that runs the real pipeline with mocked Claude/GitHub (smoke tests) and supports an optional "live" mode against real Claude for integration tests, and (2) a GitHub Actions workflow for integration tests on PRs to main.
The design follows the existing test harness pattern exactly — PASS/FAIL counters, mock binaries in temp dirs, assertion helpers, colored output. No new frameworks or dependencies.
| File | Action | Purpose |
|---|---|---|
| `scripts/sw-e2e-test.sh` | Create | End-to-end integration test suite |
| `.github/workflows/integration.yml` | Create | CI workflow for integration tests on PRs |
| `package.json` | Modify | Add `test:e2e` and `test:integration` npm scripts |
| `.claude/CLAUDE.md` | Modify | Add E2E test suite to test suites table |
Follows the exact pattern from sw-pipeline-test.sh — same boilerplate (header, colors, counters), same assertion helpers (assert_exit_code, assert_output_contains, assert_file_exists, assert_file_contains, assert_branch_exists, assert_state_contains), same invoke_pipeline() → captures output + exit code, same run_test() runner, same setup_env() / reset_test() / cleanup_env() lifecycle.
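As a rough illustration of that pattern, the sketch below shows PASS/FAIL counters, two of the assertion helpers, and a `run_test()` runner. The helper names come from the plan, but their argument order and messages are assumptions here, not copied from `sw-pipeline-test.sh`; the two `test_*` functions are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper signatures are assumptions, not taken from sw-pipeline-test.sh.
PASS=0
FAIL=0

# assert_exit_code <expected> <actual> <label>
assert_exit_code() {
    if [ "$1" -eq "$2" ]; then
        return 0
    fi
    echo "  expected exit $1, got $2 ($3)"
    return 1
}

# assert_output_contains <output> <needle> <label>
assert_output_contains() {
    case "$1" in
        *"$2"*) return 0 ;;
    esac
    echo "  output missing '$2' ($3)"
    return 1
}

# run_test <description> <function>: run one test function and update the counters.
run_test() {
    local desc="$1" fn="$2"
    if "$fn"; then
        PASS=$((PASS + 1))
        echo "PASS: $desc"
    else
        FAIL=$((FAIL + 1))
        echo "FAIL: $desc"
    fi
}

# Two hypothetical tests that exercise the runner: one passes, one fails.
test_passes() { assert_exit_code 0 0 "exit code"; }
test_fails()  { assert_output_contains "stage: intake" "stage: build" "stage marker"; }

run_test "passing example" test_passes
run_test "failing example" test_fails
echo "Results: $PASS passed, $FAIL failed"
```

The `case` pattern match avoids spawning `grep` and works on Bash 3.2.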
Two modes: --smoke (default, mocked, fast) and --live (real Claude, budget-capped).
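One possible way to select the mode is a small argument parser that defaults to smoke. This is a sketch under assumptions: the `MODE` variable and the `parse_args` function name are hypothetical, and only the two flags named in the plan are handled.

```shell
#!/usr/bin/env bash
# Sketch only: MODE and parse_args are hypothetical names.
MODE="smoke"   # smoke is the default per the plan

# parse_args <args...>: set MODE from --smoke / --live; reject anything else.
parse_args() {
    while [ $# -gt 0 ]; do
        case "$1" in
            --smoke) MODE="smoke" ;;
            --live)  MODE="live" ;;
            *)
                echo "Unknown option: $1" >&2
                return 2
                ;;
        esac
        shift
    done
}

# Demo calls; a real script would pass its own "$@" instead.
parse_args
echo "no flags -> $MODE"
parse_args --live
echo "--live   -> $MODE"
```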
10 Smoke Tests (mocked, always run in CI):
- `test_full_pipeline_stage_order`: Run `fast` template (intake→build→test→PR) with mocks. Verify exit 0, state file `status: complete`, all stage artifacts exist.
- `test_stage_order_preserved`: Parse output for stage markers, verify correct ordering.
- `test_state_file_updated_per_stage`: Verify state file has all required YAML fields: `pipeline`, `goal`, `status`, `branch`, `stage_progress`, timestamps.
- `test_no_unhandled_errors`: Run full pipeline, verify no `unbound variable` or `command not found` in output.
- `test_resume_from_interrupted`: Run intake-only, manually edit state, resume. Verify continuation.
- `test_artifacts_have_valid_json`: Verify all `.json` files in pipeline-artifacts are valid via `jq`.
- `test_branch_created_cleanly`: Feature branch exists, has commits, clean working tree.
- `test_dry_run_produces_no_artifacts`: `--dry-run` creates no artifacts directory.
- `test_pipeline_with_custom_template`: Custom template with only intake+build, verify only those stages run.
- `test_error_recovery_no_crash`: Mock claude returns exit 1 during plan, verify graceful failure.
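The mocked tests above depend on shadowing the real `claude` binary with a stand-in placed earlier on `PATH`. A minimal sketch of that technique follows, assuming a hypothetical `MOCK_CLAUDE_EXIT` variable to control the mock's exit status (useful for a case like `test_error_recovery_no_crash`); the real mock's output and controls may differ.

```shell
#!/usr/bin/env bash
# Sketch only: MOCK_CLAUDE_EXIT and the mock's canned output are hypothetical.
MOCK_DIR="$(mktemp -d)"

# Write a mock claude that echoes its arguments and exits with a configurable status.
cat > "$MOCK_DIR/claude" <<'EOF'
#!/usr/bin/env bash
echo "mock claude called with: $*"
exit "${MOCK_CLAUDE_EXIT:-0}"
EOF
chmod +x "$MOCK_DIR/claude"

# Prepend the temp dir so the mock shadows any real claude binary.
OLD_PATH="$PATH"
PATH="$MOCK_DIR:$PATH"

# Success case: the mock exits 0 by default.
ok_status=0
ok_output="$(claude "hello")" || ok_status=$?

# Failure case: force exit 1 and capture the status without aborting the script.
fail_status=0
fail_output="$(MOCK_CLAUDE_EXIT=1 claude "hello")" || fail_status=$?

echo "ok_status=$ok_status fail_status=$fail_status"

# Cleanup: restore PATH and remove the temp dir.
PATH="$OLD_PATH"
rm -rf "$MOCK_DIR"
```

Capturing the status with `|| fail_status=$?` keeps the sketch safe even if the calling script enables `set -e`.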
Same script with the `--live` flag. One test: trivial goal ("Add a comment to README.md"), intake→build with real Claude. Budget protection: `--max-turns 5`, `fast` template, 120s timeout, $1.00 cap. Skips gracefully if the `claude` CLI is unavailable.
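The skip behavior could look roughly like the sketch below. It covers only the availability check; the actual pipeline invocation and the budget enforcement are left out. `CLAUDE_BIN`, `LIVE_MAX_TURNS`, `LIVE_TIMEOUT`, and the function names are hypothetical; the limits themselves are the ones stated above.

```shell
#!/usr/bin/env bash
# Sketch only: variable and function names are hypothetical.
LIVE_MAX_TURNS=5    # from the plan: --max-turns 5
LIVE_TIMEOUT=120    # from the plan: 120s timeout

# Succeeds when the claude CLI (or an override in CLAUDE_BIN) is on PATH.
live_prereqs_met() {
    command -v "${CLAUDE_BIN:-claude}" >/dev/null 2>&1
}

# Skip with exit status 0 when the CLI is missing, so CI is not failed by a skip.
run_live_tests() {
    if ! live_prereqs_met; then
        echo "SKIP: claude CLI not found; live tests not run"
        return 0
    fi
    echo "claude CLI found; live test would run here (max turns $LIVE_MAX_TURNS, ${LIVE_TIMEOUT}s timeout)"
}

run_live_tests
```

Returning 0 on a skip matches the acceptance criterion that `npm run test:integration` either runs the live tests or skips gracefully.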
.github/workflows/integration.yml — smoke tests on macOS + Ubuntu matrix for every PR. Conditional live integration job when CLAUDE_API_KEY secret exists. Upload test result artifacts.
Add "test:e2e" and "test:integration" to package.json scripts.
Add new test suite to the test suites table.
- Task 1: Create `scripts/sw-e2e-test.sh` boilerplate (header, colors, counters, assertions, setup/cleanup, main)
- Task 2: Implement mock environment setup (real pipeline + templates, mock binaries, mock git project)
- Task 3: Implement 10 smoke tests
- Task 4: Implement live integration mode with budget protection and skip logic
- Task 5: Create `.github/workflows/integration.yml`
- Task 6: Update `package.json` with new scripts
- Task 7: Update `.claude/CLAUDE.md` test suites table
- Task 8: Run smoke tests locally to verify they pass
- Task 9: Run ShellCheck and fix warnings
- `npm run test:e2e` runs the smoke test suite and passes
- `npm run test:integration` runs live tests or skips gracefully
- CI runs smoke tests on every PR to main
- Budget-capped to $1.00 per live test run
- Pass/fail output matches existing test suite style
- Bash 3.2 compatible, no ShellCheck errors