Pipeline Plan 326
I've analyzed the codebase and prepared an implementation plan for enriching the test_first stage with Ruflo semantic recall. Here's the detailed breakdown:
- scripts/lib/pipeline-stages-build.sh (Primary implementation)
  - Function: stage_test_first() (lines 6–114)
  - Two insertion points: recall at line 20, store at line 109
- scripts/sw-ruflo-adapter-test.sh (Test coverage)
  - Add 2 new test sections at end of file
  - Test recall and store function signatures
Insert after requirements gathering, before prompt building:
    # Inject ruflo vector-similar past outcomes for TDD test generation
    local _ruflo_tdd_ctx=""
    if declare -f ruflo_recall_similar_outcomes >/dev/null 2>&1 && \
       declare -f ruflo_available >/dev/null 2>&1 && \
       ruflo_available; then
        _ruflo_tdd_ctx=$(ruflo_recall_similar_outcomes \
            "${INTELLIGENCE_ISSUE_TYPE:-feature}" "${ISSUE_LABELS:-}" 2>/dev/null || true)
    fi
Update the tdd_prompt variable to include recalled outcomes:
    Requirements:
    ${requirements}
    ${_ruflo_tdd_ctx:+
    ## Similar Past Test Outcomes (ruflo semantic search)
    ${_ruflo_tdd_ctx}}
(The ${var:+text} syntax injects only if _ruflo_tdd_ctx is non-empty.)
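To make the conditional expansion concrete, here is a minimal sketch. The `build_prompt` helper and the sample strings are hypothetical and exist only for illustration; the real prompt is assembled inside stage_test_first().

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the plan): builds a prompt fragment with the
# same ${var:+text} shape as the tdd_prompt update, so both cases can be compared.
build_prompt() {
    local requirements="$1"
    local ctx="$2"
    printf '%s\n' "Requirements: ${requirements}${ctx:+
## Similar Past Test Outcomes
${ctx}}"
}

# Non-empty context: the extra section is appended.
with_ctx=$(build_prompt "add login tests" "prior run: mocked auth client")

# Empty context: the whole section, heading included, is omitted.
without_ctx=$(build_prompt "add login tests" "")

printf '%s\n---\n%s\n' "$with_ctx" "$without_ctx"
```

Because the heading sits inside the expansion, an empty recall leaves no dangling "Similar Past Test Outcomes" header in the prompt.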
Insert after test generation succeeds:
    # Store test generation result in ruflo for cross-stage context
    if declare -f ruflo_store >/dev/null 2>&1 && [[ "$wrote_any" == "true" ]]; then
        local test_first_output
        # A "|| echo" after the pipeline would only see head's exit status, so
        # the default is applied to an empty result instead.
        test_first_output=$(git diff --stat HEAD~1 2>/dev/null | head -c 2000 || true)
        ruflo_store "stage-test-first-result" \
            "${test_first_output:-TDD tests generated}" \
            "pipeline-${SHIPWRIGHT_PIPELINE_ID:-unknown}" || true
    fi
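One detail worth checking when capturing output this way: in a pipeline such as `cmd | head -c N || echo default`, the exit status is normally that of `head` (unless `pipefail` is set), so the fallback may not fire when `cmd` fails. The sketch below shows one way to cap output and still get a default. The `capture_capped` helper is hypothetical and not part of the codebase.

```bash
#!/usr/bin/env bash
# Hypothetical helper: cap a command's output at N bytes and fall back to a
# default string when the command prints nothing (for example, because it failed).
capture_capped() {
    local limit="$1" fallback="$2"
    shift 2
    local out
    # "|| true" keeps the assignment from failing under set -e / pipefail.
    out=$("$@" 2>/dev/null | head -c "$limit" || true)
    # ${out:-$fallback} substitutes the default when out is empty.
    printf '%s' "${out:-$fallback}"
}

ok=$(capture_capped 5 "fallback" printf 'abcdefghij')
failed=$(capture_capped 5 "fallback" false)

printf '%s %s\n' "$ok" "$failed"   # prints "abcde fallback"
```

Testing for empty output rather than exit status also covers the case where the command succeeds but has nothing to report.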
Tests that ruflo_recall_similar_outcomes() is called and results are available:
    # Test NN: ruflo_recall_similar_outcomes called in stage_test_first context
    print_test_section "stage_test_first - ruflo recall integration"
    INTELLIGENCE_ISSUE_TYPE="feature"
    ISSUE_LABELS="tdd"
    export SHIPWRIGHT_PIPELINE_ID="test-pipeline-123"
    # The mock runs inside $(...), a subshell, so a counter variable would not
    # survive; log each call to a file and count the lines instead.
    mock_ruflo_recall_log=$(mktemp)
    ruflo_recall_similar_outcomes() {
        echo "called" >> "$mock_ruflo_recall_log"
        echo "Similar test patterns found: mock coverage approach"
        return 0
    }
    declare -f ruflo_recall_similar_outcomes >/dev/null 2>&1 && \
        declare -f ruflo_available >/dev/null 2>&1 && \
        RUFLO_AVAILABLE=true && \
        _ruflo_tdd_ctx=$(ruflo_recall_similar_outcomes "$INTELLIGENCE_ISSUE_TYPE" "$ISSUE_LABELS" 2>/dev/null || true)
    mock_ruflo_recall_count=$(grep -c "called" "$mock_ruflo_recall_log" || true)
    rm -f "$mock_ruflo_recall_log"
    if [[ -n "$_ruflo_tdd_ctx" && "$mock_ruflo_recall_count" -eq 1 ]]; then
        assert_pass "stage_test_first recalls similar outcomes"
    else
        assert_fail "stage_test_first recalls similar outcomes"
    fi
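A caveat when counting mock calls: a function invoked through command substitution runs in a subshell, so any variable it increments is not visible to the caller afterwards. The sketch below contrasts an in-memory counter with a file-based call log. `mock_recall` and both counters are hypothetical names used only for this illustration.

```bash
#!/usr/bin/env bash
calls_in_memory=0
call_log=$(mktemp)

# Hypothetical mock: bumps a shell variable and also appends to a log file.
mock_recall() {
    calls_in_memory=$((calls_in_memory + 1))
    echo "called" >> "$call_log"
    echo "mock context"
}

# Command substitution runs mock_recall in a subshell: the variable update is
# lost when the subshell exits, while the file write persists.
ctx=$(mock_recall)
calls_in_file=$(grep -c "called" "$call_log" || true)
rm -f "$call_log"

echo "ctx=$ctx memory=$calls_in_memory file=$calls_in_file"
# prints "ctx=mock context memory=0 file=1"
```

An assertion of the form `count -eq 1` on the in-memory counter would therefore fail even though the mock was called exactly once.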
Tests that ruflo_store() is called with correct arguments:
    # Test NN+1: ruflo_store called in stage_test_first context with test results
    print_test_section "stage_test_first - ruflo store integration"
    mock_store_called=0
    mock_store_key=""
    mock_store_namespace=""
    ruflo_store() {
        # Plain arithmetic assignment; ((var++)) returns non-zero when var is 0.
        mock_store_called=$((mock_store_called + 1))
        mock_store_key="$1"
        mock_store_namespace="$3"
        return 0
    }
    wrote_any=true
    SHIPWRIGHT_PIPELINE_ID="test-pipeline-456"
    declare -f ruflo_store >/dev/null 2>&1 && \
        [[ "$wrote_any" == "true" ]] && \
        ruflo_store "stage-test-first-result" \
            "mock test output" \
            "pipeline-${SHIPWRIGHT_PIPELINE_ID:-unknown}" || true
    if [[ "$mock_store_called" -eq 1 && "$mock_store_key" == "stage-test-first-result" && \
          "$mock_store_namespace" == "pipeline-test-pipeline-456" ]]; then
        assert_pass "stage_test_first stores results with correct namespace"
    else
        assert_fail "stage_test_first stores results with correct namespace"
    fi
npm test
- Read stage_test_first() in scripts/lib/pipeline-stages-build.sh
- Insert recall block at line 20 (after requirements gathering)
- Update tdd_prompt variable to inject _ruflo_tdd_ctx with conditional syntax
- Insert store block at line 109 (after test commit)
- Add recall test to sw-ruflo-adapter-test.sh
- Add store test to sw-ruflo-adapter-test.sh
- Run npm test and verify all tests pass
- Manual verification: Confirm pipeline recalls and stores without errors
- Verify all error paths fail-open (no pipeline breaks)
- Document implementation decision in memory for future reference
| Risk | Impact | Mitigation |
|---|---|---|
| Prompt injection (special chars in recall) | Malformed TDD prompt | Recalled text is inserted by parameter expansion, which the shell does not re-evaluate; ${var:+text} omits the section when the context is empty or unset |
| Latency impact | Recall lookup adds 50–100ms | Fail-open, memoized in Ruflo; negligible on 2–5 min pipeline |
| Function unavailability | Recall/store fail | All calls guarded by declare -f checks; fail-open prevents propagation |
| Namespace collision | Results shadow other pipelines | SHIPWRIGHT_PIPELINE_ID always set in pipeline.sh before stages |
| Circular dependency | Infinite recursion | Ruflo functions are pure reads/writes; no stage invocation risk |
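For the special-characters row, a small sketch can show that text substituted through parameter expansion is not re-evaluated by the shell. The recalled string here is a made-up example. Note that this only covers shell-level evaluation; it says nothing about how the model itself reacts to the recalled content.

```bash
#!/usr/bin/env bash
# Hypothetical recalled text containing shell metacharacters.
recalled='$(echo pwned) `echo pwned` ${HOME}'

# Same conditional-expansion shape as the tdd_prompt update.
prompt="Requirements: demo${recalled:+
## Similar Past Test Outcomes
${recalled}}"

# The metacharacters appear verbatim: no command substitution or variable
# expansion is applied to the result of expanding ${recalled}.
printf '%s\n' "$prompt"
```

The guarantee would no longer hold if the prompt were later passed through `eval` or an unquoted expansion, so those would need checking in the real stage.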
- Both function calls (ruflo_recall_similar_outcomes, ruflo_store) implemented
- Recall context injected into Claude prompt using conditional expansion
- Store called after test files are written with correct namespace
- All calls fail-open (no error propagation to pipeline)
- Namespace format: pipeline-${SHIPWRIGHT_PIPELINE_ID:-unknown}
- Two new tests added (recall + store paths)
- npm test runs successfully with 100% new code coverage
- Manual smoke test confirms recall/store occur without breaking pipeline
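The namespace format above can be sketched as follows. The `namespace_for` helper is hypothetical; the plan builds the string inline.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derives the store namespace from the environment using
# the same ${VAR:-default} expansion as the plan.
namespace_for() {
    printf 'pipeline-%s' "${SHIPWRIGHT_PIPELINE_ID:-unknown}"
}

SHIPWRIGHT_PIPELINE_ID="test-pipeline-456"
ns_set=$(namespace_for)      # pipeline-test-pipeline-456

SHIPWRIGHT_PIPELINE_ID=""
ns_empty=$(namespace_for)    # pipeline-unknown (":-" also covers empty values)

unset SHIPWRIGHT_PIPELINE_ID
ns_unset=$(namespace_for)    # pipeline-unknown

printf '%s\n%s\n%s\n' "$ns_set" "$ns_empty" "$ns_unset"
```

Every run without an ID lands in the same pipeline-unknown namespace, which is why the collision risk depends on SHIPWRIGHT_PIPELINE_ID being set before the stage runs.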
Test Pyramid: 2 unit tests (mock-based), 0 integration (covered by existing e2e)
Critical Paths:
- ✓ Happy path: Recall returns context → injected → store called
- ✓ Unavailable: RUFLO_AVAILABLE=false → empty context → store skipped
- ✓ Function missing: declare -f fails → no calls made → pipeline continues
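The three paths can be exercised with stubs, as in the sketch below. `recall_ctx` is a hypothetical wrapper around the recall guard, and the stub `ruflo_available` is assumed to read RUFLO_AVAILABLE; the real adapter may decide availability differently. The sketch covers the recall side only: the store guard shown in this plan checks `declare -f ruflo_store` and `wrote_any`, so skipping the store when Ruflo is unavailable presumably relies on `ruflo_store` checking availability itself.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper mirroring the recall guard from the plan.
recall_ctx() {
    if declare -f ruflo_recall_similar_outcomes >/dev/null 2>&1 && \
       declare -f ruflo_available >/dev/null 2>&1 && \
       ruflo_available; then
        ruflo_recall_similar_outcomes "feature" "" 2>/dev/null || true
    fi
}

# Function missing: nothing is defined yet, so no call is made.
ctx_missing=$(recall_ctx)

# Stubs standing in for the real adapter (assumed behavior, see lead-in).
ruflo_available() { [ "${RUFLO_AVAILABLE:-false}" = "true" ]; }
ruflo_recall_similar_outcomes() { echo "past outcome"; }

# Unavailable: the guard short-circuits before recall is called.
RUFLO_AVAILABLE=false
ctx_unavailable=$(recall_ctx)

# Happy path: recall output becomes the context.
RUFLO_AVAILABLE=true
ctx_happy=$(recall_ctx)

echo "missing='$ctx_missing' unavailable='$ctx_unavailable' happy='$ctx_happy'"
# prints "missing='' unavailable='' happy='past outcome'"
```

In all three cases `recall_ctx` returns success, which is the fail-open property the plan relies on.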
Coverage Target: 100% of new code paths in recall/store integration
Approach: Minimal 15–20 line integration (proven pattern from plan/design stages)
Why this approach:
- Issue explicitly specifies lightweight context enrichment only
- Reuses existing, battle-tested Ruflo functions
- Follows established pattern in stage_plan() and stage_design() exactly
- Fail-open design keeps Ruflo failures from breaking the pipeline
- Can be extended later without refactoring
Alternatives rejected:
- Sophisticated context injection (out of scope)
- Async storage (adds complexity without benefit)
This implementation is ready to code: all patterns are already proven and tested in the codebase.