Pipeline Plan 189

ezigus edited this page Mar 18, 2026 · 2 revisions

Plan written to .claude/pipeline-artifacts/plan.md.

The feature is already fully implemented in commit fded001. The plan documents the architecture of the existing implementation:

Architecture: Standalone library (scripts/lib/context-error.sh) with 8 public functions, integrated into 4 existing modules via explicit function calls.

Key components:

Error categorization (10 categories)
Memory-based similar failure lookup (5s timeout, graceful degradation)
Action generation (category + stage + memory-based)
4-section formatted output (What Failed / Why / Similar Past Issues / Suggested Actions)
Suggestion feedback loop (record, resolve, emit events)
GitHub markdown formatting for issue comments

All 13 tasks complete, 52 test cases, all acceptance criteria met.

GitHub comment formatting (markdown) is needed alongside terminal output

Feedback loop: track which suggestions lead to resolution

Acceptance criteria (from issue):

Error messages include: stage name, iteration count, and original issue/goal
Query memory system for similar past failures and their resolutions
Suggest 2-3 concrete next actions based on error type and history
Include relevant log snippets with context (not just raw stack traces)
Format error output with clear sections: What Failed / Why / Similar Past Issues / Suggested Actions
Track whether suggested actions lead to successful resolution (feedback loop)
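As a rough illustration of the sectioned output named in the criteria above, the sketch below prints the four sections from plain arguments. The argument order, the `===` heading style, and the `None found` placeholder are assumptions made for this example; the actual `format_context_error_report()` in `scripts/lib/context-error.sh` may look different.

```shell
# Sketch only: hypothetical signature
#   format_context_error_report STAGE ITERATION GOAL ERROR_TEXT SIMILAR ACTIONS
format_context_error_report() {
  local stage="$1" iteration="$2" goal="$3"
  local error_text="$4" similar="$5" actions="$6"

  # Section 1: where the failure happened and what the run was trying to do.
  printf '=== What Failed ===\n'
  printf 'Stage: %s (iteration %s)\n' "$stage" "$iteration"
  printf 'Goal: %s\n\n' "$goal"

  # Section 2: the error text (or a trimmed log snippet).
  printf '=== Why ===\n%s\n\n' "$error_text"

  # Section 3: memory lookup results, with a placeholder when there are none.
  printf '=== Similar Past Issues ===\n%s\n\n' "${similar:-None found}"

  # Section 4: the 2-3 concrete next steps.
  printf '=== Suggested Actions ===\n%s\n' "$actions"
}
```

A call such as `format_context_error_report build 3 "Fix login bug" "npm ERR! missing script" "" "1. Check package.json"` would print all four sections, with the placeholder in the third.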

Design Alternatives

Approach A: Standalone library module (CHOSEN)

New scripts/lib/context-error.sh with pure functions
Integration via function calls from existing error paths (pipeline-state, daemon-failure, loop-iteration)
Tradeoffs: (+) minimal blast radius, (+) testable in isolation, (+) graceful degradation when memory unavailable, (-) requires integration touchpoints in 4 files

Approach B: Monkey-patch existing error() function

Override the global error() helper to always enrich output
Tradeoffs: (+) zero integration work, (-) massive blast radius (every error call affected), (-) hard to test, (-) would slow down all error output with memory queries

Approach C: Post-processing pipeline

Capture all stderr, process after stage completion
Tradeoffs: (+) no code changes to existing paths, (-) delayed feedback, (-) complex stderr capture, (-) loses real-time context

Decision: Approach A — standalone library with explicit integration points. Minimizes blast radius while hitting all acceptance criteria.

Risk Assessment

| Risk | Mitigation |
|---|---|
| Memory query timeout slows error reporting | 5-second timeout guard on `memory_ranked_search` |
| Memory system not loaded/available | Graceful fallback: `query_similar_failures` returns `[]` if `memory_ranked_search` not defined |
| Very long error messages blow up report | `tail -10` + ANSI strip on log snippets; `cut -c1-120` on memory fix text |
| jq not available | All jq calls use `2>/dev/null` plus a fallback |
| Suggestion JSONL file corruption | Atomic writes via tmp file + `mv` |
| Event schema changes | New events registered in `config/event-schema.json` |
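The first two mitigations (timeout guard and graceful fallback) can be sketched as follows. This is one possible shape, not the code from commit fded001: the background-job watchdog, the temp file, and the `CONTEXT_ERROR_TIMEOUT` variable are all assumptions introduced for the example. Only the 5-second default, the `type` check, and the `[]` fallback come from the plan.

```shell
# Sketch only: CONTEXT_ERROR_TIMEOUT is a hypothetical knob (default 5 seconds).
query_similar_failures() {
  local query="$1" limit="${CONTEXT_ERROR_TIMEOUT:-5}" tmp pid watchdog

  # Memory system not loaded: degrade to an empty result.
  if ! type memory_ranked_search >/dev/null 2>&1; then
    echo '[]'
    return 0
  fi

  tmp=$(mktemp) || { echo '[]'; return 0; }

  # Run the lookup in the background, writing to a temp file so that a
  # hung lookup cannot keep the caller's output pipe open.
  ( memory_ranked_search "$query" >"$tmp" 2>/dev/null ) &
  pid=$!

  # Watchdog: terminate the lookup if it runs past the limit.
  ( sleep "$limit"; kill "$pid" 2>/dev/null ) >/dev/null 2>&1 &
  watchdog=$!

  # Use the result only if the lookup succeeded and produced output.
  if wait "$pid" 2>/dev/null && [ -s "$tmp" ]; then
    cat "$tmp"
  else
    echo '[]'
  fi

  kill "$watchdog" 2>/dev/null || true
  wait "$watchdog" 2>/dev/null || true
  rm -f "$tmp"
  return 0
}
```

A shell function cannot be passed directly to the external `timeout` command, which is why this sketch uses a background job and a watchdog instead.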

Dependency Analysis

Depends on:

scripts/lib/helpers.sh — emit_event(), output helpers
sw-memory.sh — memory_ranked_search(), repo_memory_dir() (optional, graceful degradation)
jq — JSON processing (with fallbacks)

Depended on by:

scripts/lib/pipeline-state.sh — mark_stage_failed() calls format_context_error_report + record_suggestion
scripts/lib/daemon-failure.sh — daemon_on_failure() appends context-aware section to GitHub comments
scripts/lib/loop-iteration.sh — compose_prompt() injects historical error context
scripts/lib/pipeline-stages-delivery.sh — pipeline completion resolves outstanding suggestions

No circular dependency risks — context-error.sh only depends on helpers.sh and memory (both upstream).

Files to Modify

New Files

scripts/lib/context-error.sh — Core library (430 lines): categorization, memory query, action generation, report formatting, suggestion tracking
scripts/sw-lib-context-error-test.sh — Test suite (52 test cases)

Modified Files

scripts/lib/pipeline-state.sh — Integration: mark_stage_failed() generates context error report + records suggestion
scripts/lib/daemon-failure.sh — Integration: daemon_on_failure() appends context-aware error section to GitHub comment
scripts/lib/loop-iteration.sh — Integration: compose_prompt() injects historical error context from context-error functions
scripts/lib/pipeline-stages-delivery.sh — Integration: pipeline completion resolves outstanding suggestions
config/event-schema.json — Register new events: error.context_generated, suggestion.recorded, suggestion.resolved
scripts/lib/helpers.sh — Minor: ensure emit_event is available for context-error module
scripts/lib/test-helpers.sh — Test infrastructure updates for context-error tests

Implementation Steps

Create scripts/lib/context-error.sh with:
- categorize_error() — Maps error text to 10 categories (FILE_ACCESS, FUNCTION_ERROR, SYNTAX_ERROR, ASSERTION_FAILURE, TYPE_ERROR, TIMEOUT, MEMORY_ERROR, NETWORK_ERROR, RESOURCE_ERROR, UNKNOWN)
- query_similar_failures() — Wraps memory_ranked_search with 5s timeout, graceful fallback to []
- generate_suggested_actions() — Combines memory-based fixes + category defaults + stage-specific actions into 2-3 concrete steps
- format_context_error_report() — 4-section terminal report (What Failed / Why / Similar Past Issues / Suggested Actions)
- format_github_error_comment() — Markdown-formatted version for GitHub issue comments
- record_suggestion() — Appends to suggestions.jsonl with ID, category, stage, actions; emits suggestion.recorded event
- mark_suggestion_resolved() — Atomic update of suggestion resolved status; emits suggestion.resolved event
- resolve_outstanding_suggestions() — Batch-resolve all unresolved on pipeline success
Integrate into pipeline-state.sh — In mark_stage_failed(), source context-error.sh and call format_context_error_report() to save artifact + record_suggestion() to track
Integrate into daemon-failure.sh — In daemon_on_failure(), if context-error functions are available, append formatted context section to GitHub issue comment
Integrate into loop-iteration.sh — In compose_prompt(), inject historical error context from context-error functions into loop iteration prompts
Integrate into pipeline-stages-delivery.sh — On pipeline completion, call resolve_outstanding_suggestions() to close the feedback loop
Update config/event-schema.json — Register error.context_generated, suggestion.recorded, suggestion.resolved with required/optional fields
Create test suite — 52 test cases covering categorization, formatting, memory query mocking, suggestion recording/resolution, GitHub comment formatting, edge cases
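To make the first step more concrete, here is a minimal sketch of how `categorize_error()` might map error text to the 10 category names listed above. The category names come from the plan; the match patterns and their order are illustrative assumptions, and the real patterns in `context-error.sh` are not shown in this document.

```shell
# Sketch only: the patterns below are illustrative guesses, not the real rules.
categorize_error() {
  local text
  # Lowercase the input so matching is case-insensitive.
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  # First match wins, so more specific patterns should come first.
  case "$text" in
    *"no such file"*|*"permission denied"*)           echo FILE_ACCESS ;;
    *"command not found"*|*"is not a function"*)      echo FUNCTION_ERROR ;;
    *"syntax error"*|*"unexpected token"*)            echo SYNTAX_ERROR ;;
    *"assertion"*|*"assert failed"*)                  echo ASSERTION_FAILURE ;;
    *"typeerror"*|*"type mismatch"*)                  echo TYPE_ERROR ;;
    *"timed out"*|*"timeout"*)                        echo TIMEOUT ;;
    *"out of memory"*|*"cannot allocate memory"*)     echo MEMORY_ERROR ;;
    *"connection refused"*|*"could not resolve host"*) echo NETWORK_ERROR ;;
    *"no space left"*|*"too many open files"*)        echo RESOURCE_ERROR ;;
    *)                                                echo UNKNOWN ;;
  esac
}
```

Because the first matching pattern wins, an error that fits two categories (for example a network timeout) is reported under whichever pattern appears earlier in the `case`.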

Task Checklist

Task 1: Create scripts/lib/context-error.sh with error categorization (10 categories)
Task 2: Implement query_similar_failures() with memory system integration and timeout guard
Task 3: Implement generate_suggested_actions() with category + stage + memory-based actions
Task 4: Implement format_context_error_report() — 4-section terminal output
Task 5: Implement format_github_error_comment() — Markdown-formatted for GitHub
Task 6: Implement suggestion tracking (record_suggestion, mark_suggestion_resolved, resolve_outstanding_suggestions)
Task 7: Register new events in config/event-schema.json
Task 8: Integrate into pipeline-state.sh mark_stage_failed()
Task 9: Integrate into daemon-failure.sh daemon_on_failure()
Task 10: Integrate into loop-iteration.sh compose_prompt()
Task 11: Integrate into pipeline completion (resolve suggestions on success)
Task 12: Create comprehensive test suite (sw-lib-context-error-test.sh, 52 cases)
Task 13: Update all modified test suites to pass with new integration points
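Task 3 combines three sources of advice into 2-3 steps. The sketch below shows one way that combination could work: an optional memory-based fix first, then a category default, then a stage-specific step. The argument order and every action string are made up for illustration; only the three sources, the 2-3 count, and the 120-character cap on memory text come from the plan.

```shell
# Sketch only: hypothetical signature
#   generate_suggested_actions CATEGORY STAGE [MEMORY_FIX]
generate_suggested_actions() {
  local category="$1" stage="$2" memory_fix="$3" n=0

  # Source 1 (optional): a fix recalled from memory, capped at 120 characters.
  if [ -n "$memory_fix" ]; then
    n=$((n + 1))
    printf '%s. Reuse the earlier fix: %s\n' "$n" \
      "$(printf '%s' "$memory_fix" | cut -c1-120)"
  fi

  # Source 2: a default action for the error category.
  n=$((n + 1))
  case "$category" in
    FILE_ACCESS) printf '%s. Check that the path exists and is readable\n' "$n" ;;
    TIMEOUT)     printf '%s. Retry with a longer timeout or a smaller scope\n' "$n" ;;
    *)           printf '%s. Re-run the stage with verbose logging\n' "$n" ;;
  esac

  # Source 3: an action specific to the pipeline stage.
  n=$((n + 1))
  case "$stage" in
    test)  printf '%s. Run the failing test in isolation\n' "$n" ;;
    build) printf '%s. Rebuild from a clean working tree\n' "$n" ;;
    *)     printf '%s. Review the stage log for the first error\n' "$n" ;;
  esac
}
```

With this layout the function always prints two actions, and a third is added only when a memory-based fix is available.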

Task Decomposition (with dependencies)

Task 1 (context-error.sh core) — no dependencies
Task 2 (memory query) — depends on Task 1
Task 3 (action generation) — depends on Tasks 1, 2
Task 4 (terminal formatting) — depends on Tasks 1-3
Task 5 (GitHub formatting) — depends on Tasks 1-3
Task 6 (suggestion tracking) — depends on Task 1
Task 7 (event schema) — independent
Task 8 (pipeline-state integration) — depends on Tasks 4, 6
Task 9 (daemon-failure integration) — depends on Tasks 5, 6
Task 10 (loop-iteration integration) — depends on Tasks 2, 3
Task 11 (completion integration) — depends on Task 6
Task 12 (test suite) — depends on Tasks 1-6
Task 13 (existing test updates) — depends on Tasks 8-11
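The dependency list above can be checked mechanically. As a small sketch, the standard `tsort` utility accepts "prerequisite dependent" pairs and prints one valid execution order; the `task_order` function name is hypothetical, and the pairs are transcribed from the list above using task numbers only.

```shell
# Each pair reads "A B": task A must finish before task B starts.
# A self-pair ("7 7") registers a task that has no ordering constraints.
task_order() {
  tsort <<'EOF'
1 2
1 3   2 3
1 4   2 4   3 4
1 5   2 5   3 5
1 6
7 7
4 8   6 8
5 9   6 9
2 10  3 10
6 11
1 12  2 12  3 12  4 12  5 12  6 12
8 13  9 13  10 13  11 13
EOF
}

task_order
```

Several valid orders exist, so the exact output can vary between `tsort` implementations. If the list contained a cycle, `tsort` would report it instead of silently producing an order.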

Testing Approach

Test file: scripts/sw-lib-context-error-test.sh (52 test cases)

Categories tested:

categorize_error() — All 10 error categories with representative error strings
query_similar_failures() — Mocked memory_ranked_search returning JSON; missing function fallback; empty memory fallback
generate_suggested_actions() — Category-specific actions; stage-specific actions; memory-based fix suggestions; combination testing
format_context_error_report() — Section presence; field population (stage, iteration, issue, goal); NO_COLOR support
format_github_error_comment() — Markdown structure; details/summary tags; field population
record_suggestion() — JSONL append; field validation; event emission
mark_suggestion_resolved() — Atomic update; correct ID matching; event emission
resolve_outstanding_suggestions() — Batch resolution; already-resolved preservation
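The three `query_similar_failures()` cases listed above (mocked JSON, missing function, empty memory) could be exercised with a harness along these lines. This is a self-contained sketch: `assert_eq`, the counters, and the stand-in `query_similar_failures` are written for the example and are not the helpers from `scripts/lib/test-helpers.sh`.

```shell
PASS=0
FAIL=0

# Hypothetical assertion helper: compare expected and actual strings.
assert_eq() {
  local desc="$1" expected="$2" actual="$3"
  if [ "$expected" = "$actual" ]; then
    PASS=$((PASS + 1))
  else
    FAIL=$((FAIL + 1))
    printf 'FAIL: %s (expected %s, got %s)\n' "$desc" "$expected" "$actual"
  fi
}

# Simplified stand-in for the function under test (no timeout guard here).
query_similar_failures() {
  local out=''
  if type memory_ranked_search >/dev/null 2>&1; then
    out=$(memory_ranked_search "$1" 2>/dev/null) || out=''
  fi
  printf '%s\n' "${out:-[]}"
}

# Case 1: memory system not loaded.
unset -f memory_ranked_search 2>/dev/null || true
assert_eq "missing function falls back to []" '[]' "$(query_similar_failures boom)"

# Case 2: mocked memory_ranked_search returning JSON.
memory_ranked_search() { echo '[{"fix":"clear cache"}]'; }
assert_eq "mocked JSON is passed through" '[{"fix":"clear cache"}]' "$(query_similar_failures boom)"

# Case 3: memory loaded but empty.
memory_ranked_search() { :; }
assert_eq "empty memory falls back to []" '[]' "$(query_similar_failures boom)"

printf '%s passed, %s failed\n' "$PASS" "$FAIL"
```

Redefining `memory_ranked_search` as a shell function between cases is what makes the mock possible without touching the memory system itself.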

Integration testing: Existing test suites for pipeline-state, daemon-failure, and loop-iteration updated to account for new integration points.

Run: npm test (executes all 150+ test suites including the new one)

Definition of Done

scripts/lib/context-error.sh exists with all 8 public functions
Error messages include stage name, iteration count, and original issue/goal
Memory system queried for similar past failures (with graceful degradation)
2-3 concrete next actions generated based on error type + history + stage
Log snippets included with context (filtered, ANSI-stripped, last 10 lines)
Output formatted with 4 sections: What Failed / Why / Similar Past Issues / Suggested Actions
GitHub comment format with markdown, collapsible details, and structured sections
Suggestion feedback loop: record suggestions, mark resolved on success, emit events
Integration into pipeline-state.sh, daemon-failure.sh, loop-iteration.sh, pipeline-stages-delivery.sh
New events registered in config/event-schema.json
52 test cases pass in sw-lib-context-error-test.sh
All existing test suites pass (npm test)
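The log-snippet item in the list above (ANSI-stripped, last 10 lines) and the 120-character cap on memory text mentioned in the risk tables can be sketched with standard tools. The function names `extract_log_snippet` and `truncate_fix_text` are hypothetical; the plan only names the `tail -10` and `cut -c1-120` techniques.

```shell
# Keep the last 10 lines of a log file and strip ANSI color/style sequences.
extract_log_snippet() {
  local log_file="$1" esc
  # Build a literal ESC byte portably instead of relying on sed's \x1b.
  esc=$(printf '\033')
  tail -n 10 "$log_file" | sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

# Cap a one-line fix description at 120 characters.
truncate_fix_text() {
  printf '%s\n' "$1" | cut -c1-120
}
```

Stripping escape sequences before the snippet is embedded matters mostly for the GitHub comment path, where raw color codes would show up as stray characters.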

Alternatives Considered

| Approach | Complexity | Performance | Maintainability | Blast Radius |
|---|---|---|---|---|
| A: Standalone library (chosen) | Medium | Good (5s timeout) | High (isolated, testable) | Low (4 integration points) |
| B: Override error() globally | Low | Poor (every error triggers query) | Low (hard to test, side effects) | Very high (all error paths) |
| C: Post-process stderr | High | Poor (delayed) | Medium (complex capture) | Medium (stderr interception) |

Risk Analysis

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Memory query timeout | Slow error reporting | Low | 5-second timeout guard; fallback to `[]` |
| Memory system unavailable | No similar failures shown | Medium | Graceful `type` check before calling; `[]` default |
| jq unavailable | JSON processing fails | Low | All jq calls wrapped with `2>/dev/null` plus a fallback |
| Suggestion file corruption | Lost feedback data | Low | Atomic writes via tmp + mv |
| Large error messages | Bloated reports | Medium | `tail -10` limit; `cut -c1-120` for memory text |
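The "atomic writes via tmp + mv" mitigation can be illustrated with a small sketch of a resolve step. The JSONL field names (`id`, `resolved`), the compact no-space JSON layout, and the use of `sed` rather than jq are assumptions chosen to keep the example dependency-free; the real `mark_suggestion_resolved()` may be implemented differently.

```shell
# Sketch only: hypothetical signature
#   mark_suggestion_resolved FILE ID
# Assumes one compact JSON object per line and IDs without regex metacharacters.
mark_suggestion_resolved() {
  local file="$1" id="$2" tmp
  [ -f "$file" ] || return 0

  # Create the temp file next to the target so that mv is a same-filesystem
  # rename and readers never observe a half-written file.
  tmp=$(mktemp "${file}.XXXXXX") || return 1

  # Flip the resolved flag only on the line carrying the matching ID.
  if sed "/\"id\":\"$id\"/s/\"resolved\":false/\"resolved\":true/" "$file" >"$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

If the rewrite fails midway, the original file is left untouched and only the temp file is discarded.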

Root Cause Hypothesis

Not applicable — No previous plan stage failure to diagnose.

Evidence Gathered

The implementation already exists in commit fded001 with all acceptance criteria met.

Fix Strategy

Not applicable — No failure to fix.

Verification Plan

Run npm test to verify all 150+ test suites pass
Run scripts/sw-lib-context-error-test.sh specifically to verify the 52 context-error test cases
Verify integration points by checking that pipeline-state, daemon-failure, and loop-iteration properly call context-error functions

Endpoint Specification

Not applicable — Shell library module, not an API endpoint.

Error Codes

Not applicable — Shell functions use return codes (0 = success) with graceful degradation.

Rate Limiting

Not applicable — Internal library; memory queries have a 5-second timeout guard.

Versioning

Version: 3.2.4 (synced with all scripts via VERSION variable). No breaking changes — all new functions are additive.

