Pipeline Plan 324
This issue closes the feedback loop by invoking ruflo_learn_from_shipwright() at two critical points in the pipeline lifecycle: when validation succeeds (end of stage_validate) and when failure occurs (failure path in stage_monitor). This enables the semantic recall system to learn from actual pipeline outcomes.
Add two function calls (~10 lines of code per site) at strategic points with fail-open guards.
- Outcome data (goal, issue, task_type, success/failure) must be passed consistently
- Event emissions must use the existing `emit_event` infrastructure
- No blocking behavior: learning failures must not break the pipeline
- Tests must validate both happy path (success) and error path (failure)
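The constraints above can be sketched as a single guarded call. This is a minimal illustration rather than the repository's implementation: the three stub functions stand in for the real `ruflo_available`, `ruflo_learn_from_shipwright`, and `emit_event`, the helper name `learn_fail_open` is hypothetical, and the argument values are examples. The stub learner fails on purpose to show that the caller keeps running under `set -e`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in stubs (assumptions): the real functions live in the pipeline scripts.
ruflo_available() { return 0; }
ruflo_learn_from_shipwright() { echo "learn: $*"; return 1; }  # simulated failure
emit_event() { echo "event: $*"; }

# Hypothetical helper: record an outcome without ever failing the caller.
learn_fail_open() {
  local goal="$1" issue="$2" task_type="$3" outcome="$4"
  ruflo_available || return 0   # learning unavailable: skip silently
  ruflo_learn_from_shipwright "$goal" "$issue" "$task_type" "$outcome" || true
  emit_event "ruflo.learn_from_shipwright" "outcome=$outcome" || true
  return 0
}

learn_fail_open "example goal" "324" "feature" "success"
echo "pipeline continues"
```

Because every step inside the helper is either a guard or suffixed with `|| true`, the helper always returns 0, so the calling stage never sees a learning failure.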
✅ Call ruflo_learn_from_shipwright() at end of stage_validate() (success path)
✅ Call ruflo_learn_from_shipwright() on failure path in stage_monitor()
✅ Both wrapped with fail-open guards (ruflo_available() + || true)
✅ Event emissions: ruflo.learn_from_shipwright at each call site
✅ Existing tests remain green (sw-ruflo-adapter-test.sh)
✅ New tests: validate-success path + monitor-failure path
✅ Full test suite passes: npm test
- Call `ruflo_learn_from_shipwright()` at end of `stage_validate()` (success)
- Call `ruflo_learn_from_shipwright()` on failure path in `stage_monitor()` (failure)
Pros:
- Explicit success/failure semantics
- Learning happens immediately when outcome is known
- Matches issue's explicit requirements
- Clear separation of concerns
Cons:
- Two call sites to maintain
- Must ensure consistent parameter passing
Blast Radius: Minimal (two isolated function calls, fail-open guards)
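One way to limit the "consistent parameter passing" risk while keeping two call sites is to route both through a small shared helper. The sketch below is an assumption, not part of the plan: `record_pipeline_outcome` is a hypothetical name, the stubs stand in for the real functions, and the field names follow the `intake.json` usage shown later in this plan.

```shell
#!/usr/bin/env bash
# Stand-in stubs (assumptions); the real functions come from the pipeline scripts.
ruflo_available() { return 0; }
ruflo_learn_from_shipwright() { echo "learn:" "$@"; }
emit_event() { echo "event:" "$@"; }

# Hypothetical helper shared by both call sites so the intake fields are
# always read the same way.
# Usage: record_pipeline_outcome <outcome> [error_context]
record_pipeline_outcome() {
  local outcome="$1" error_context="${2:-}"
  local artifact="${PIPELINE_ARTIFACTS:-.}/intake.json"
  local goal issue task_type

  ruflo_available || return 0
  [[ -f "$artifact" ]] || return 0

  # Any jq failure (missing binary, malformed JSON) skips learning silently.
  goal="$(jq -r '.goal // empty' "$artifact" 2>/dev/null)" || return 0
  issue="$(jq -r '.issue_number // empty' "$artifact" 2>/dev/null)" || return 0
  task_type="$(jq -r '.task_type // empty' "$artifact" 2>/dev/null)" || return 0

  # Pass error_context as a fifth argument only when it is non-empty.
  ruflo_learn_from_shipwright "$goal" "$issue" "$task_type" "$outcome" \
    ${error_context:+"$error_context"} || true
  emit_event "ruflo.learn_from_shipwright" "outcome=$outcome" || true
  return 0
}

# stage_validate would call:  record_pipeline_outcome success
# stage_monitor would call:   record_pipeline_outcome failure "$error_context"
```

The two call sites stay explicit, as the issue requires, but the extraction and guard logic exists only once.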
Consolidate to one call site with outcome flag.
Pros:
- Single entry point for learning
- Easier to maintain
Cons:
- Delays success-path learning until end of pipeline
- More complex parameter construction
- Violates issue's explicit requirement for validate-success call
- ❌ Rejected
Queue learning to a background worker to prevent blocking.
Pros:
- Non-blocking
- Scales better for expensive learning operations
Cons:
- Adds complexity for minimal benefit
- Existing function is lightweight
- ❌ Rejected (over-engineering)
| Risk | Impact | Mitigation |
|---|---|---|
| `ruflo_learn_from_shipwright` unavailable | Learning skipped silently, semantic recall stays stale | Guard with `ruflo_available()` + `\|\| true` (fail-open) |
| Outcome artifact missing/malformed | Function fails or receives wrong data | Check artifact exists, validate JSON structure before passing |
| Learning function throws (unhandled) | Pipeline exits abnormally | Wrap in `( ... ) \|\| true` subshell; log error |
| Event emission fails | Pipeline continues but telemetry lost | Guard event emission with `\|\| true` |
| Test execution doesn't cover both paths | Acceptance criteria not met | Explicitly test success + failure paths separately |
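The "validate JSON structure before passing" mitigation could look roughly like the check below. It is a sketch under assumptions: `intake_artifact_is_valid` is a hypothetical helper, and the required fields (`goal`, `issue_number`, `task_type`) are taken from this plan rather than from a confirmed schema. It relies on `jq -e`, which exits non-zero when the filter's last output is `false` or `null`, and on `jq` failing for unparseable input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the artifact exists, parses as JSON,
# and carries the fields the learning call needs (field list is an assumption).
intake_artifact_is_valid() {
  local artifact="$1"
  [[ -f "$artifact" ]] || return 1
  command -v jq >/dev/null 2>&1 || return 1   # no jq: treat as invalid, skip learning
  jq -e '(.goal | type) == "string"
         and (.task_type | type) == "string"
         and (.issue_number != null)' "$artifact" >/dev/null 2>&1
}

# Self-contained demo with throwaway files.
tmp="$(mktemp -d)"
printf '{"goal":"example","issue_number":324,"task_type":"feature"}' > "$tmp/intake.json"
printf '{"goal":' > "$tmp/broken.json"

if intake_artifact_is_valid "$tmp/intake.json"; then echo "intake.json: ok"; fi
if ! intake_artifact_is_valid "$tmp/broken.json"; then echo "broken.json: skip learning"; fi
```

Returning a plain status keeps the call site simple: the learning call can sit behind `if intake_artifact_is_valid "$intake_artifact"; then ... fi` without any extra error handling.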
- Task 1.1: Read `scripts/lib/ruflo-adapter.sh` ~line 900 to understand the function signature, parameters, and return value
- Task 1.2: Read `scripts/lib/pipeline-stages-monitor.sh` to locate the `stage_validate()` and `stage_monitor()` functions
- Task 1.3: Identify outcome artifact location and structure (stored by which stage, what fields)
- Task 1.4: Verify the `ruflo.learn_from_shipwright` event is registered in `config/event-schema.json`
- Task 1.5: Locate existing fail-open pattern examples (e.g., `stage_build.sh` lines 414–434)
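Tasks 1.1 and 1.2 amount to finding where a few shell functions are defined. A small grep wrapper is one way to do that; `find_function_def` is a hypothetical helper, demonstrated here on a throwaway file so the sketch runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "file:line:text" for each definition of the shell
# function named $1 in the files that follow. Matches `name() {` and
# `function name() {`; the trailing /dev/null forces grep to print file names.
find_function_def() {
  local name="$1"; shift
  grep -nE "^[[:space:]]*(function[[:space:]]+)?${name}[[:space:]]*\(\)" "$@" /dev/null
}

# Self-contained demo on a temporary file.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
stage_validate() {
  echo "validating"
}

stage_monitor() {
  echo "monitoring"
}
EOF

find_function_def stage_monitor "$demo"
```

Against the repository, the equivalent calls would be `find_function_def ruflo_learn_from_shipwright scripts/lib/ruflo-adapter.sh` and `find_function_def stage_validate scripts/lib/pipeline-stages-monitor.sh`.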
- Task 2.1: Add `ruflo_learn_from_shipwright()` call at end of `stage_validate()` with success outcome
- Task 2.2: Wrap call with `ruflo_available()` guard + `|| true` fallback
- Task 2.3: Extract outcome data: goal, issue_number, task_type from intake artifact
- Task 2.4: Emit `ruflo.learn_from_shipwright` event (outcome=success)
- Task 2.5: Test locally: `npm test -- sw-ruflo-adapter-test.sh`
- Task 3.1: Identify failure path in `stage_monitor()` (rollback/error handling section)
- Task 3.2: Add `ruflo_learn_from_shipwright()` call with failure outcome
- Task 3.3: Wrap with fail-open guards
- Task 3.4: Emit `ruflo.learn_from_shipwright` event (outcome=failure, error context)
- Task 3.5: Test locally: `npm test -- sw-ruflo-adapter-test.sh`
- Task 4.1: Run existing tests with `npm test` to verify no regressions
- Task 4.2: Add test in `sw-ruflo-adapter-test.sh`: verify `stage_validate` success path calls learning
- Task 4.3: Add test in `sw-ruflo-adapter-test.sh`: verify `stage_monitor` failure path calls learning
- Task 4.4: Add test: verify fail-open behavior (learning unavailable → no error)
- Task 4.5: Add test: verify event emissions at both call sites
- Task 4.6: Run full test suite with `npm test` (all green)
- Task 5.1: Create brief comment above each call site explaining why it's there
- Task 5.2: Verify both calls extract outcome data correctly
- Task 5.3: Manual walkthrough: trace data flow from pipeline artifact → function parameters
- Task 5.4: Update any related documentation (if needed)
- Unit Tests (70%):
  - Test `ruflo_learn_from_shipwright()` function in isolation (existing)
  - Mock `ruflo_available()` to return true/false
  - Verify parameter passing
  - Count: 4 existing + 2 new = 6 tests
- Integration Tests (20%):
  - Test `stage_validate()` calls learning on success
  - Test `stage_monitor()` calls learning on failure
  - Verify event emissions
  - Mock pipeline artifacts
  - Count: 2 new tests
- E2E Tests (10%):
  - Run full pipeline with learning enabled
  - Verify semantic recall uses learned outcomes
  - Count: 1 existing (implicit)
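The mocking and parameter checks listed above can be done with a call-recording mock. The sketch below is illustrative only: `CALL_LOG`, `RUFLO_MOCK_STATUS`, `guarded_call_site`, and `recorded_call_count` are hypothetical names, and the argument values are examples. Calls are logged to a file rather than a shell variable so that they remain visible even if the call site runs inside a `( ... )` subshell.

```shell
#!/usr/bin/env bash
CALL_LOG="$(mktemp)"
RUFLO_MOCK_STATUS=0   # 0 = available, non-zero = unavailable

# Mocks replacing the real adapter functions for the duration of the test.
ruflo_available() { return "$RUFLO_MOCK_STATUS"; }
ruflo_learn_from_shipwright() {
  local IFS='|'                       # join arguments with | so spaces stay unambiguous
  printf '%s\n' "$*" >> "$CALL_LOG"   # one log line per call
}

# Stand-in for the guarded call site under test.
guarded_call_site() {
  if ruflo_available; then
    ruflo_learn_from_shipwright "example goal" "324" "feature" "success" || true
  fi
}

recorded_call_count() { wc -l < "$CALL_LOG" | tr -d '[:space:]'; }

guarded_call_site
echo "calls when available: $(recorded_call_count)"
echo "last call: $(tail -n 1 "$CALL_LOG")"

RUFLO_MOCK_STATUS=1
guarded_call_site
echo "calls after unavailable run: $(recorded_call_count)"
```

The count stays at 1 after the second run, which is the fail-open expectation: when `ruflo_available` reports unavailable, nothing is recorded and nothing fails.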
- Critical Path 1 (Success): Happy path in `stage_validate` → learning called with correct outcome
- Critical Path 2 (Failure): Failure path in `stage_monitor` → learning called with error context
- Edge Case 1: `ruflo` unavailable → pipeline continues, learning skipped
- Edge Case 2: Outcome artifact malformed → learning fails gracefully, pipeline continues
- Edge Case 3: Event emission fails → learning still completes

    # Phase 1: Unit tests only
    npm test -- sw-ruflo-adapter-test.sh

    # Phase 2: All tests
    npm test
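Edge Case 3 depends on a Bash rule that is easy to overlook: when a compound command is the left-hand side of `||`, `set -e` is ignored for every command inside it. The sketch below demonstrates this with a stub `emit_event` that always fails (the stub and variable names are assumptions for illustration).

```shell
#!/usr/bin/env bash
set -e

emit_event() { return 1; }   # stub emitter that always fails (assumption)

learned="no"
after_emit="not reached"

if true; then
  learned="yes"                                               # stands in for the learning call
  emit_event "ruflo.learn_from_shipwright" "outcome=success"  # fails; errexit is ignored here
  after_emit="reached"
fi || true

echo "learned=$learned after_emit=$after_emit"
# prints: learned=yes after_emit=reached
```

An explicit `|| true` on the `emit_event` line is still worthwhile, since it keeps the call safe if the surrounding block is later restructured and loses its trailing `|| true`.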
| File | Change | Lines |
|---|---|---|
| `scripts/lib/pipeline-stages-monitor.sh` | Add `ruflo_learn_from_shipwright()` call at end of `stage_validate()` | TBD (depends on function location) |
| `scripts/lib/pipeline-stages-monitor.sh` | Add `ruflo_learn_from_shipwright()` call on failure path in `stage_monitor()` | TBD |
| `scripts/sw-ruflo-adapter-test.sh` | Add 2 new tests (validate-success, monitor-failure) | ~40 lines |
| `config/event-schema.json` | Verify event is registered (no changes expected) | ✓ |

    # Extract function definition from ruflo-adapter.sh
    grep -A 30 "^ruflo_learn_from_shipwright()" scripts/lib/ruflo-adapter.sh
Expected signature (based on issue context):

    ruflo_learn_from_shipwright() {
      # Parameters: goal, issue_number, task_type, outcome (success/failure), error_context?
      # Namespace: learning-<repo_hash>
      :  # placeholder so the stub parses; the real body lives in ruflo-adapter.sh
    }
In `scripts/lib/pipeline-stages-monitor.sh`, at the end of `stage_validate()`:

    # Call learning function with success outcome (fail-open)
    if ruflo_available; then
      local intake_artifact="$PIPELINE_ARTIFACTS/intake.json"
      if [[ -f "$intake_artifact" ]]; then
        ruflo_learn_from_shipwright \
          "$(jq -r '.goal' "$intake_artifact")" \
          "$(jq -r '.issue_number' "$intake_artifact")" \
          "$(jq -r '.task_type' "$intake_artifact")" \
          "success" \
          || true
        # Telemetry is also fail-open: a failed emit must not affect the stage
        emit_event "ruflo.learn_from_shipwright" "outcome=success" "stage=validate" || true
      fi
    fi || true
In `scripts/lib/pipeline-stages-monitor.sh`, on the failure/rollback path:

    # Call learning function with failure outcome (fail-open)
    if ruflo_available; then
      local intake_artifact="$PIPELINE_ARTIFACTS/intake.json"
      local error_summary="$PIPELINE_ARTIFACTS/error-summary.json"
      if [[ -f "$intake_artifact" ]]; then
        # error_context stays empty when no error summary was written
        local error_context=""
        [[ -f "$error_summary" ]] && error_context="$(jq -c . "$error_summary")"
        ruflo_learn_from_shipwright \
          "$(jq -r '.goal' "$intake_artifact")" \
          "$(jq -r '.issue_number' "$intake_artifact")" \
          "$(jq -r '.task_type' "$intake_artifact")" \
          "failure" \
          "$error_context" \
          || true
        # Telemetry is also fail-open: a failed emit must not affect the stage
        emit_event "ruflo.learn_from_shipwright" "outcome=failure" "stage=monitor" "error_context=$error_context" || true
      fi
    fi || true
In `scripts/sw-ruflo-adapter-test.sh`:

    test_ruflo_learn_called_on_validate_success() {
      # Mock outcome: stage_validate exits 0
      # Verify: ruflo_learn_from_shipwright was called with outcome=success
      # Verify: event was emitted
      :  # placeholder until the assertions are written
    }

    test_ruflo_learn_called_on_monitor_failure() {
      # Mock outcome: stage_monitor failure path triggered
      # Verify: ruflo_learn_from_shipwright was called with outcome=failure
      # Verify: event was emitted
      :  # placeholder until the assertions are written
    }

    test_ruflo_learn_fail_open_when_unavailable() {
      # Mock: ruflo_available returns 1 (unavailable)
      # Verify: pipeline continues (no error)
      # Verify: no event emitted
      :  # placeholder until the assertions are written
    }
- Both `ruflo_learn_from_shipwright()` calls added to `pipeline-stages-monitor.sh`
  - Call 1: At end of `stage_validate()`, outcome=success
  - Call 2: On failure path in `stage_monitor()`, outcome=failure
- Both calls are fail-open (guarded, `|| true` fallback)
- Both calls extract goal, issue_number, task_type from intake artifact
- Events emitted: `ruflo.learn_from_shipwright` at each call site
- New tests added to `sw-ruflo-adapter-test.sh`:
  - Test: validate-success path calls learning
  - Test: monitor-failure path calls learning
  - Test: fail-open behavior (unavailable → no error)
- Existing tests pass: `npm test -- sw-ruflo-adapter-test.sh` ✅
- Full test suite passes: `npm test` ✅
- No regressions in other stages
- Acceptance criteria met: All 6 criteria satisfied
- Execute Phase 1 (Code Analysis) — read functions, understand data flow
- Execute Phase 2 (Success Call Site) — add, test, verify
- Execute Phase 3 (Failure Call Site) — add, test, verify
- Execute Phase 4 (Testing) — run full suite, add tests
- Execute Phase 5 (Verification) — walkthrough, documentation
- Create PR with both call sites, tests, and verification notes
- Will read `ruflo-adapter.sh` at line 900+ (targeted section, ~30 lines)
- Will read `pipeline-stages-monitor.sh` for both functions (likely < 100 lines each)
- Will grep for existing fail-open examples in `pipeline-stages-build.sh` (reference)
- Will use `jq` for JSON parsing (consistent with existing patterns)
This plan is ready to execute once approved.