Pipeline Design 232

Seth Ford edited this page Mar 9, 2026 · 2 revisions

Now I have enough context. Let me produce the ADR.

Design: Test Infrastructure Pre-Flight Validation Gate

Context

Failed pipelines waste ~5ドル/run and ~4133s build time (per recent metrics) because broken test harnesses are only discovered late in the build-test loop. Root causes from recent failures (failures.json, 2026年03月09日) include missing artifact writes and jq parse errors — exactly the kind of issues a pre-flight check would catch early.

The Shipwright bash test harness has two valid patterns:

Inline: set -euo pipefail + PASS=0/FAIL=0 + trap ... ERR directly in the test file
Sourced: source "$SCRIPT_DIR/lib/test-helpers.sh" which provides counters, traps, and assertions automatically

Both patterns must be recognized as valid. Non-bash projects (Node/pytest/Cargo) need only lightweight validation (test command resolves, test directory exists).

Constraints:

Bash 3.2 compatible (no declare -A, no readarray)
Must not break existing working pipelines
Must handle new projects with no test files yet
shellcheck is optional (not all environments have it)

Decision

Create scripts/lib/pipeline-test-validation.sh as a guard-loaded library module following the exact pattern of pipeline-quality-checks.sh and pipeline-detection.sh. The module provides validate_test_infrastructure() called from stage_intake() after step 3 (test command auto-detection, line 65) and before step 4 (branch creation, line 68).

Data Flow

stage_intake()
 → detect_test_cmd() # step 3 — sets TEST_CMD
 → validate_test_infrastructure() # NEW — reads TEST_CMD, PROJECT_ROOT
 → find_test_files() # discovers *-test.sh, *.test.ts, etc.
 → validate_bash_test_harness() # per-file .sh validation
 → optional shellcheck pass
 → _write_validation_report() # JSON to pipeline-artifacts/
 → branch creation # step 4 — unchanged

Validation Logic

Severity model: ERROR blocks intake (return 1). WARNING is logged but non-blocking. INFO is informational.

Check	Scope	Severity	Detail
TEST_CMD binary resolves	All projects	ERROR	`command -v $(echo "$TEST_CMD" \| awk '{print 1ドル}')`
Test files discovered	All projects	INFO	Missing = warning for new projects, not an error
File is executable	.sh test files	ERROR	`-x "$file"`
`set -euo pipefail` or `set -e`	.sh test files	ERROR	`grep -qE 'set -e(uo pipefail)?'`
PASS/FAIL counters present	.sh test files	ERROR	Either inline `PASS=0` or `source.*test-helpers`
ERR trap present	.sh test files	WARNING	`grep -qE 'trap.*ERR'` (test-helpers.sh provides this)
`bash -n` syntax check	.sh test files	ERROR	Catches parse errors before runtime
shellcheck	.sh test files	WARNING	Only when `command -v shellcheck` succeeds

Key design decision: When a .sh test file sources test-helpers.sh, the PASS/FAIL counter and ERR trap checks are automatically satisfied — the validator recognizes this pattern and skips those individual checks.

Error Handling

validate_test_infrastructure() returns 0 (pass) or 1 (errors found)
On return 1, stage_intake() emits intake.test_validation_failed event and returns 1
SKIP_TEST_VALIDATION=true bypasses entirely with an info message
No test files found = INFO-level message, return 0 (new project safe)
shellcheck missing = INFO-level skip, not a failure
JSON report always written (even on skip — "skipped": true)

Report Format

Written to $ARTIFACTS_DIR/test-validation.json via _write_validation_report() using jq -n --arg for safe JSON construction (no string interpolation):

{
 "timestamp": "2026年03月09日T12:00:00Z",
 "valid": true,
 "test_cmd": "npm test",
 "test_files_found": 5,
 "checks": [
 {"file": "scripts/sw-hello-test.sh", "check": "executable", "severity": "error", "passed": true, "message": "File is executable"}
 ],
 "errors": 0,
 "warnings": 1,
 "skipped": false,
 "shellcheck_available": true
}

Alternatives Considered

Inline in stage_intake() — Pros: no new files, zero module overhead. Cons: stage_intake() already 119 lines, mixes concerns, harder to unit test independently. Would violate the separation pattern established by pipeline-detection.sh / pipeline-quality-checks.sh.
Extend pipeline-detection.sh — Pros: colocates with detect_test_cmd(). Cons: detection (what) vs validation (is it correct) are different concerns. pipeline-detection.sh is sourced by many modules; growing it increases blast radius. Adding ~150 lines of validation logic would make it the largest lib module.
Standalone script (sw-test-validation.sh) — Pros: runnable independently. Cons: can't share globals (TEST_CMD, ARTIFACTS_DIR), would need parameter passing, breaks the library module pattern used by all other pipeline components.

Implementation Plan

Files to Create

scripts/lib/pipeline-test-validation.sh (~150 lines) — module guard, find_test_files(), validate_bash_test_harness(), validate_test_infrastructure(), _write_validation_report()
scripts/sw-test-validation-test.sh (~300 lines) — full test suite sourcing test-helpers.sh

Files to Modify

scripts/lib/pipeline-cli.sh (line ~141) — add --skip-test-validation) SKIP_TEST_VALIDATION=true; shift ;; to parse_args()
scripts/lib/pipeline-stages-intake.sh (between lines 65-67) — call validate_test_infrastructure() after test command detection, before branch creation
scripts/lib/pipeline-stages.sh (after line 190) — source pipeline-test-validation.sh
package.json — add "test:test-validation": "bash scripts/sw-test-validation-test.sh" to scripts

Dependencies

None new. Uses existing jq, bash -n, optional shellcheck.

Risk Areas

False positives on non-Shipwright projects: Mitigated by only enforcing bash harness checks on .sh files. Node/Python/Rust/Go projects get lightweight validation only (test command resolves, test directory exists).
shellcheck variability: Different shellcheck versions produce different warnings. Using WARNING severity means this never blocks.
bash -n false negatives: bash -n catches syntax errors but not runtime errors. This is intentional — it's a fast pre-flight check, not a full test runner.
Insertion point in stage_intake(): Between step 3 and step 4 is safe because test command detection (step 3) has no side effects, and branch creation (step 4) is the first side effect. Failing before branch creation means no cleanup needed.

Validation Criteria

Pipeline with valid Shipwright test files (both inline and test-helpers.sh patterns) passes intake validation
Pipeline with a non-executable .sh test file fails at intake with "not executable" error message
Pipeline with a .sh test file missing set -euo pipefail fails at intake
Pipeline with a .sh test file that sources test-helpers.sh passes (PASS/FAIL provided externally)
Pipeline with --skip-test-validation bypasses all checks and logs info message
test-validation.json written to $ARTIFACTS_DIR with correct schema
shellcheck runs when installed, skips with INFO when not installed
No test files found (new project) produces INFO warning, returns 0
bash -n catches syntax errors in test files
Non-bash projects (Node, Python, Go) get lightweight validation (TEST_CMD resolves) without bash harness checks
No regressions in sw-pipeline-test.sh (existing 58 tests pass)
Test suite sw-test-validation-test.sh registered in package.json and passes with 0 failures

Test Pyramid Breakdown

Unit tests (15 tests, ~70%): Each validation function tested in isolation:

find_test_files(): discovers .sh tests, discovers .test.ts files, handles empty project, filters non-test files
validate_bash_test_harness(): passes valid inline harness, passes test-helpers sourced harness, fails non-executable, fails missing set -e, fails missing PASS/FAIL, fails bash -n syntax error, handles shellcheck available/unavailable
_write_validation_report(): correct JSON schema, handles zero checks, handles errors vs warnings

Integration tests (4 tests, ~25%): Full validate_test_infrastructure() flow:

Valid Shipwright project with mixed test patterns → passes
Project with one broken test file → fails with error count
SKIP_TEST_VALIDATION=true → bypasses, writes skipped report
Non-bash project (Node.js mock) → lightweight validation only

E2E test (1 test, ~5%):

Broken test file in mock project → stage_intake() returns 1 (requires mock of gh and git)

Coverage Targets

100% of the 8 check types (executable, set -e, PASS/FAIL, ERR trap, bash -n, shellcheck, test-helpers pattern, test command resolution)
Both pass and fail paths for every ERROR-severity check
Graceful degradation paths: shellcheck missing, no test files, new project

Critical Paths to Test

Happy path: Valid .sh test file with set -euo pipefail, PASS=0, FAIL=0, trap ... ERR → all checks pass, report shows "valid": true, "errors": 0

Error cases:

Non-executable test file (chmod 644) → ERROR, "valid": false, intake returns 1
Test file with #!/usr/bin/env bash but missing set -euo pipefail → ERROR on harness check
Test file with syntax error (unmatched quote) → bash -n catches it, ERROR severity

Edge cases:

Test file sources test-helpers.sh but has no inline PASS/FAIL → valid (sourced pattern recognized)
TEST_CMD="npm test" but npm not on PATH → ERROR on command resolution (but this is unlikely in practice; may want WARNING instead)
Mixed project: some .sh tests valid, one invalid → reports per-file, overall "valid": false
SKIP_TEST_VALIDATION=true with broken test files → bypassed, report has "skipped": true

Pipeline Design 232

Design: Test Infrastructure Pre-Flight Validation Gate

Context

Decision

Data Flow

Validation Logic

Error Handling

Report Format

Alternatives Considered

Implementation Plan

Files to Create

Files to Modify

Dependencies

Risk Areas

Validation Criteria

Test Pyramid Breakdown

Coverage Targets

Critical Paths to Test

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!