Pipeline Plan 189

ezigus edited this page Mar 18, 2026 · 2 revisions

Plan written to .claude/pipeline-artifacts/plan.md.

The feature is already fully implemented in commit fded001. The plan documents the architecture of the existing implementation:

Architecture: Standalone library (scripts/lib/context-error.sh) with 8 public functions, integrated into 4 existing modules via explicit function calls.

Key components:

  • Error categorization (10 categories)
  • Memory-based similar failure lookup (5s timeout, graceful degradation)
  • Action generation (category + stage + memory-based)
  • 4-section formatted output (What Failed / Why / Similar Past Issues / Suggested Actions)
  • Suggestion feedback loop (record, resolve, emit events)
  • GitHub markdown formatting for issue comments
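The action-generation component combines three sources (memory, category, stage). The following is a minimal sketch of that idea, not the code from context-error.sh: the argument order, the stage names `build` and `test`, and every action string are assumptions made for illustration.

```bash
# Sketch only: argument order and action wording are hypothetical.
# Usage: generate_suggested_actions CATEGORY STAGE [MEMORY_FIX_TEXT]
generate_suggested_actions() {
  local category stage memory_fix mem_action cat_action stage_action
  category="$1"
  stage="$2"
  memory_fix="${3:-}"
  mem_action=""

  # 1. A fix that resolved a similar past failure goes first, when one is known.
  if [ -n "$memory_fix" ]; then
    mem_action="Try the fix that resolved a similar failure: $memory_fix"
  fi

  # 2. Default action for the error category.
  case "$category" in
    FILE_ACCESS)   cat_action="Check that the referenced path exists and is readable" ;;
    SYNTAX_ERROR)  cat_action="Lint the file named in the error before re-running" ;;
    TIMEOUT)       cat_action="Re-run with a longer timeout or a smaller workload" ;;
    NETWORK_ERROR) cat_action="Verify connectivity and credentials, then retry" ;;
    *)             cat_action="Read the log snippet for the first failing command" ;;
  esac

  # 3. Stage-specific action.
  case "$stage" in
    build) stage_action="Run the build stage locally to reproduce" ;;
    test)  stage_action="Run the failing test in isolation" ;;
    *)     stage_action="Re-run the $stage stage after applying the fix" ;;
  esac

  # Drop empty entries, cap at three, and number the result.
  printf '%s\n' "$mem_action" "$cat_action" "$stage_action" \
    | sed '/^$/d' | head -n 3 | awk '{ printf "%d. %s\n", NR, $0 }'
}
```

With a memory fix the sketch yields three numbered actions; without one it yields two, which matches the "2-3 concrete next actions" target.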

All 13 tasks are complete, with 52 test cases and all acceptance criteria met.

  • GitHub formatting (markdown) is needed alongside terminal output

  • Feedback loop: track which suggestions lead to resolution

Acceptance criteria (from issue):

  1. Error messages include: stage name, iteration count, and original issue/goal
  2. Query memory system for similar past failures and their resolutions
  3. Suggest 2-3 concrete next actions based on error type and history
  4. Include relevant log snippets with context (not just raw stack traces)
  5. Format error output with clear sections: What Failed / Why / Similar Past Issues / Suggested Actions
  6. Track whether suggested actions lead to successful resolution (feedback loop)
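Criteria 1 and 5 can be pictured with a small formatter. This is a sketch under assumptions: the positional signature, the `==` header style, and the placeholder strings are hypothetical, and only the four section names and the stage/iteration/goal fields come from the plan.

```bash
# Sketch only: signature and header styling are hypothetical.
# Usage: format_context_error_report STAGE ITERATION GOAL ERROR_TEXT [SIMILAR] [ACTIONS]
format_context_error_report() {
  local stage iteration goal error_text similar actions bold reset
  stage="$1"
  iteration="$2"
  goal="$3"
  error_text="$4"
  similar="${5:-}"
  actions="${6:-}"

  # Honor NO_COLOR: emit plain text when it is set to a non-empty value.
  if [ -n "${NO_COLOR:-}" ]; then
    bold=""
    reset=""
  else
    bold="$(printf '\033[1m')"
    reset="$(printf '\033[0m')"
  fi

  printf '%s== What Failed ==%s\n' "$bold" "$reset"
  printf 'Stage: %s (iteration %s)\n' "$stage" "$iteration"
  printf 'Goal: %s\n\n' "$goal"
  printf '%s== Why ==%s\n%s\n\n' "$bold" "$reset" "$error_text"
  printf '%s== Similar Past Issues ==%s\n%s\n\n' "$bold" "$reset" "${similar:-None found}"
  printf '%s== Suggested Actions ==%s\n%s\n' "$bold" "$reset" "${actions:-No suggestions available}"
}
```

Keeping the formatter free of side effects means the same text can be written to an artifact file or printed to the terminal by the caller.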

Design Alternatives

Approach A: Standalone library module (CHOSEN)

  • New scripts/lib/context-error.sh with pure functions
  • Integration via function calls from existing error paths (pipeline-state, daemon-failure, loop-iteration) and from pipeline completion (pipeline-stages-delivery)
  • Tradeoffs: (+) minimal blast radius, (+) testable in isolation, (+) graceful degradation when memory unavailable, (-) requires integration touchpoints in 4 files

Approach B: Monkey-patch existing error() function

  • Override the global error() helper to always enrich output
  • Tradeoffs: (+) zero integration work, (-) massive blast radius (every error call affected), (-) hard to test, (-) would slow down all error output with memory queries

Approach C: Post-processing pipeline

  • Capture all stderr, process after stage completion
  • Tradeoffs: (+) no code changes to existing paths, (-) delayed feedback, (-) complex stderr capture, (-) loses real-time context

Decision: Approach A — standalone library with explicit integration points. Minimizes blast radius while hitting all acceptance criteria.

Risk Assessment

| Risk | Mitigation |
| --- | --- |
| Memory query timeout slows error reporting | 5-second timeout guard on memory_ranked_search |
| Memory system not loaded/available | Graceful fallback: query_similar_failures returns [] if memory_ranked_search is not defined |
| Very long error messages blow up report | tail -10 + ANSI strip on log snippets; cut -c1-120 on memory fix text |
| jq not available | All jq calls use `2>/dev/null` with fallbacks |
| Suggestion JSONL file corruption | Atomic writes via tmp file + mv |
| Event schema changes | New events registered in config/event-schema.json |
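The first two mitigations (timeout guard and fallback to []) might look roughly like the sketch below. It is an illustration, not the library's code: the `CONTEXT_ERROR_MEMORY_TIMEOUT` variable and the polling approach are assumptions, and `memory_ranked_search` is treated as a function that takes one query string, which the plan does not specify.

```bash
# Hypothetical knob; the plan only states a fixed 5-second guard.
: "${CONTEXT_ERROR_MEMORY_TIMEOUT:=5}"

# Sketch only. Prints the memory search result, or [] on any problem.
query_similar_failures() {
  local query tmp pid waited limit
  query="$1"

  # Graceful degradation: the memory system is not loaded.
  if ! command -v memory_ranked_search >/dev/null 2>&1; then
    echo "[]"
    return 0
  fi

  tmp="$(mktemp)" || { echo "[]"; return 0; }

  # Run the query in a background subshell so a hung search can be abandoned.
  # The exit status is written to a marker file once the query finishes.
  ( memory_ranked_search "$query" >"$tmp" 2>/dev/null; echo "$?" >"$tmp.rc" ) >/dev/null 2>&1 &
  pid=$!

  waited=0
  limit=$((CONTEXT_ERROR_MEMORY_TIMEOUT * 10))   # polled in 0.1s ticks
  while [ ! -s "$tmp.rc" ]; do
    if [ "$waited" -ge "$limit" ]; then
      kill "$pid" 2>/dev/null
      break
    fi
    sleep 0.1
    waited=$((waited + 1))
  done
  wait "$pid" 2>/dev/null

  # Only a completed, successful, non-empty result is passed through.
  if [ -s "$tmp.rc" ] && [ "$(cat "$tmp.rc")" = "0" ] && [ -s "$tmp" ]; then
    cat "$tmp"
  else
    echo "[]"
  fi
  rm -f "$tmp" "$tmp.rc"
}
```

A shell function cannot be handed to the external `timeout` command directly, which is why this sketch polls a marker file instead; the real implementation may guard the call differently.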

Dependency Analysis

Depends on:

  • scripts/lib/helpers.sh: emit_event(), output helpers
  • sw-memory.sh: memory_ranked_search(), repo_memory_dir() (optional, graceful degradation)
  • jq: JSON processing (with fallbacks)

Depended on by:

  • scripts/lib/pipeline-state.sh: mark_stage_failed() calls format_context_error_report + record_suggestion
  • scripts/lib/daemon-failure.sh: daemon_on_failure() appends context-aware section to GitHub comments
  • scripts/lib/loop-iteration.sh: compose_prompt() injects historical error context
  • scripts/lib/pipeline-stages-delivery.sh: pipeline completion resolves outstanding suggestions

No circular dependency risks — context-error.sh only depends on helpers.sh and memory (both upstream).
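Because the memory dependency is optional, callers need a way to use a function only when it is defined. One possible shape for such a guard is sketched below; `call_if_defined` is a hypothetical helper name that does not appear in the plan.

```bash
# Sketch only: call_if_defined is a hypothetical helper.
# Usage: call_if_defined FALLBACK FUNCTION [ARGS...]
# Prints FUNCTION's output, or FALLBACK when FUNCTION is missing or fails.
call_if_defined() {
  local fallback fn out
  fallback="$1"
  fn="$2"
  shift 2

  # command -v succeeds for shell functions as well as external commands.
  if command -v "$fn" >/dev/null 2>&1 && out="$("$fn" "$@" 2>/dev/null)"; then
    printf '%s\n' "$out"
  else
    printf '%s\n' "$fallback"
  fi
}
```

An integration point could then write `call_if_defined "[]" memory_ranked_search "$query"` and continue whether or not sw-memory.sh was sourced.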


Files to Modify

New Files

  1. scripts/lib/context-error.sh — Core library (430 lines): categorization, memory query, action generation, report formatting, suggestion tracking
  2. scripts/sw-lib-context-error-test.sh — Test suite (52 test cases)

Modified Files

  1. scripts/lib/pipeline-state.sh — Integration: mark_stage_failed() generates context error report + records suggestion
  2. scripts/lib/daemon-failure.sh — Integration: daemon_on_failure() appends context-aware error section to GitHub comment
  3. scripts/lib/loop-iteration.sh — Integration: compose_prompt() injects historical error context from context-error functions
  4. scripts/lib/pipeline-stages-delivery.sh — Integration: pipeline completion resolves outstanding suggestions
  5. config/event-schema.json — Register new events: error.context_generated, suggestion.recorded, suggestion.resolved
  6. scripts/lib/helpers.sh — Minor: ensure emit_event is available for context-error module
  7. scripts/lib/test-helpers.sh — Test infrastructure updates for context-error tests

Implementation Steps

  1. Create scripts/lib/context-error.sh with:

    • categorize_error() — Maps error text to 10 categories (FILE_ACCESS, FUNCTION_ERROR, SYNTAX_ERROR, ASSERTION_FAILURE, TYPE_ERROR, TIMEOUT, MEMORY_ERROR, NETWORK_ERROR, RESOURCE_ERROR, UNKNOWN)
    • query_similar_failures() — Wraps memory_ranked_search with 5s timeout, graceful fallback to []
    • generate_suggested_actions() — Combines memory-based fixes + category defaults + stage-specific actions into 2-3 concrete steps
    • format_context_error_report() — 4-section terminal report (What Failed / Why / Similar Past Issues / Suggested Actions)
    • format_github_error_comment() — Markdown-formatted version for GitHub issue comments
    • record_suggestion() — Appends to suggestions.jsonl with ID, category, stage, actions; emits suggestion.recorded event
    • mark_suggestion_resolved() — Atomic update of suggestion resolved status; emits suggestion.resolved event
    • resolve_outstanding_suggestions() — Batch-resolve all unresolved on pipeline success
  2. Integrate into pipeline-state.sh — In mark_stage_failed(), source context-error.sh and call format_context_error_report() to save artifact + record_suggestion() to track

  3. Integrate into daemon-failure.sh — In daemon_on_failure(), if context-error functions are available, append formatted context section to GitHub issue comment

  4. Integrate into loop-iteration.sh — In compose_prompt(), inject historical error context from context-error functions into loop iteration prompts

  5. Integrate into pipeline-stages-delivery.sh — On pipeline completion, call resolve_outstanding_suggestions() to close the feedback loop

  6. Update config/event-schema.json — Register error.context_generated, suggestion.recorded, suggestion.resolved with required/optional fields

  7. Create test suite — 52 test cases covering categorization, formatting, memory query mocking, suggestion recording/resolution, GitHub comment formatting, edge cases
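Step 1's categorize_error() can be approximated with a single case statement. In this sketch only the ten category names are taken from the plan; the matching patterns and their order are assumptions, and the real rules in context-error.sh may differ.

```bash
# Sketch only: the patterns below are illustrative guesses.
# Usage: categorize_error "ERROR TEXT"
categorize_error() {
  local text
  # Lowercase once so the patterns can ignore case.
  text="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"

  # First match wins, so more specific patterns are listed earlier.
  case "$text" in
    *"no such file"*|*"permission denied"*|*"is a directory"*)
      echo "FILE_ACCESS" ;;
    *"command not found"*|*"is not a function"*|*"undefined function"*)
      echo "FUNCTION_ERROR" ;;
    *"syntax error"*|*"unexpected token"*)
      echo "SYNTAX_ERROR" ;;
    *"assertion"*|*"expected"*)
      echo "ASSERTION_FAILURE" ;;
    *"typeerror"*|*"type mismatch"*)
      echo "TYPE_ERROR" ;;
    *"timed out"*|*"timeout"*)
      echo "TIMEOUT" ;;
    *"out of memory"*|*"cannot allocate memory"*)
      echo "MEMORY_ERROR" ;;
    *"connection refused"*|*"could not resolve host"*|*"network is unreachable"*)
      echo "NETWORK_ERROR" ;;
    *"no space left on device"*|*"too many open files"*)
      echo "RESOURCE_ERROR" ;;
    *)
      echo "UNKNOWN" ;;
  esac
}
```

Ordering matters in a first-match design: for example, "unexpected token" contains "expected", so the syntax patterns have to come before the assertion patterns.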


Task Checklist

  • Task 1: Create scripts/lib/context-error.sh with error categorization (10 categories)
  • Task 2: Implement query_similar_failures() with memory system integration and timeout guard
  • Task 3: Implement generate_suggested_actions() with category + stage + memory-based actions
  • Task 4: Implement format_context_error_report() — 4-section terminal output
  • Task 5: Implement format_github_error_comment() — Markdown-formatted for GitHub
  • Task 6: Implement suggestion tracking (record_suggestion, mark_suggestion_resolved, resolve_outstanding_suggestions)
  • Task 7: Register new events in config/event-schema.json
  • Task 8: Integrate into pipeline-state.sh mark_stage_failed()
  • Task 9: Integrate into daemon-failure.sh daemon_on_failure()
  • Task 10: Integrate into loop-iteration.sh compose_prompt()
  • Task 11: Integrate into pipeline completion (resolve suggestions on success)
  • Task 12: Create comprehensive test suite (sw-lib-context-error-test.sh, 52 cases)
  • Task 13: Update all modified test suites to pass with new integration points

Task Decomposition (with dependencies)

  1. Task 1 (context-error.sh core) — no dependencies
  2. Task 2 (memory query) — depends on Task 1
  3. Task 3 (action generation) — depends on Tasks 1, 2
  4. Task 4 (terminal formatting) — depends on Tasks 1-3
  5. Task 5 (GitHub formatting) — depends on Tasks 1-3
  6. Task 6 (suggestion tracking) — depends on Task 1
  7. Task 7 (event schema) — independent
  8. Task 8 (pipeline-state integration) — depends on Tasks 4, 6
  9. Task 9 (daemon-failure integration) — depends on Tasks 5, 6
  10. Task 10 (loop-iteration integration) — depends on Tasks 2, 3
  11. Task 11 (completion integration) — depends on Task 6
  12. Task 12 (test suite) — depends on Tasks 1-6
  13. Task 13 (existing test updates) — depends on Tasks 8-11

Testing Approach

Test file: scripts/sw-lib-context-error-test.sh (52 test cases)

Categories tested:

  • categorize_error() — All 10 error categories with representative error strings
  • query_similar_failures() — Mocked memory_ranked_search returning JSON; missing function fallback; empty memory fallback
  • generate_suggested_actions() — Category-specific actions; stage-specific actions; memory-based fix suggestions; combination testing
  • format_context_error_report() — Section presence; field population (stage, iteration, issue, goal); NO_COLOR support
  • format_github_error_comment() — Markdown structure; details/summary tags; field population
  • record_suggestion() — JSONL append; field validation; event emission
  • mark_suggestion_resolved() — Atomic update; correct ID matching; event emission
  • resolve_outstanding_suggestions() — Batch resolution; already-resolved preservation

Integration testing: Existing test suites for pipeline-state, daemon-failure, and loop-iteration updated to account for new integration points.
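The mocking approach described above (a stubbed memory_ranked_search and checks on event emission) could be set up along these lines. Everything here is a stand-in: `assert_eq`, `record_suggestion_stub`, and the event log file are hypothetical, and the project's own scripts/lib/test-helpers.sh may provide different helpers.

```bash
# Sketch only: a tiny harness with hypothetical helper names.
PASS=0
FAIL=0

# Usage: assert_eq DESCRIPTION EXPECTED ACTUAL
assert_eq() {
  if [ "$2" = "$3" ]; then
    PASS=$((PASS + 1))
  else
    FAIL=$((FAIL + 1))
    printf 'FAIL: %s (expected "%s", got "%s")\n' "$1" "$2" "$3" >&2
  fi
}

# Mock of the memory dependency: returns canned JSON for any query.
memory_ranked_search() { echo '[{"fix":"clear the build cache"}]'; }

# Mock of emit_event: records event names in a file instead of emitting them.
EVENT_LOG="$(mktemp)"
emit_event() { printf '%s\n' "$1" >>"$EVENT_LOG"; }

# Stand-in for the function under test; the real one lives in context-error.sh.
record_suggestion_stub() { emit_event "suggestion.recorded"; }

record_suggestion_stub
assert_eq "mocked memory returns canned JSON" \
  '[{"fix":"clear the build cache"}]' "$(memory_ranked_search "build failed")"
assert_eq "event emitted exactly once" \
  "1" "$(grep -cFx 'suggestion.recorded' "$EVENT_LOG")"

# Removing the mock exercises the missing-function fallback path.
unset -f memory_ranked_search
assert_eq "memory function is gone after unset" \
  "missing" "$(command -v memory_ranked_search >/dev/null 2>&1 && echo present || echo missing)"

rm -f "$EVENT_LOG"
printf '%d passed, %d failed\n' "$PASS" "$FAIL"
```

Defining the mocks as ordinary shell functions keeps each test case independent of sw-memory.sh and of any real event sink.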

Run: npm test (executes all 150+ test suites including the new one)


Definition of Done

  • scripts/lib/context-error.sh exists with all 8 public functions
  • Error messages include stage name, iteration count, and original issue/goal
  • Memory system queried for similar past failures (with graceful degradation)
  • 2-3 concrete next actions generated based on error type + history + stage
  • Log snippets included with context (filtered, ANSI-stripped, last 10 lines)
  • Output formatted with 4 sections: What Failed / Why / Similar Past Issues / Suggested Actions
  • GitHub comment format with markdown, collapsible details, and structured sections
  • Suggestion feedback loop: record suggestions, mark resolved on success, emit events
  • Integration into pipeline-state.sh, daemon-failure.sh, loop-iteration.sh, pipeline-stages-delivery.sh
  • New events registered in config/event-schema.json
  • 52 test cases pass in sw-lib-context-error-test.sh
  • All existing test suites pass (npm test)
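The log-snippet item (filtered, ANSI-stripped, last 10 lines) and the 120-character cap on memory fix text can be illustrated as follows. The function names are hypothetical; only the `tail`, ANSI-strip, and `cut -c1-120` steps come from the plan.

```bash
# Sketch only: function names are hypothetical.
# Usage: extract_log_snippet LOG_FILE
# Prints the last 10 lines of the log with ANSI color codes removed.
extract_log_snippet() {
  local log_file esc
  log_file="$1"
  # A missing or unreadable log yields an empty snippet rather than an error.
  [ -r "$log_file" ] || return 0
  esc="$(printf '\033')"
  tail -n 10 "$log_file" | sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

# Usage: truncate_fix_text "TEXT"
# Caps each line of memory fix text at 120 characters.
truncate_fix_text() {
  printf '%s\n' "$1" | cut -c1-120
}
```

The sketch uses `tail -n 10`, the POSIX spelling of the plan's `tail -10`.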

Alternatives Considered

| Approach | Complexity | Performance | Maintainability | Blast Radius |
| --- | --- | --- | --- | --- |
| A: Standalone library (chosen) | Medium | Good (5s timeout) | High (isolated, testable) | Low (4 integration points) |
| B: Override error() globally | Low | Poor (every error triggers query) | Low (hard to test, side effects) | Very high (all error paths) |
| C: Post-process stderr | High | Poor (delayed) | Medium (complex capture) | Medium (stderr interception) |

Risk Analysis

| Risk | Impact | Likelihood | Mitigation |
| --- | --- | --- | --- |
| Memory query timeout | Slow error reporting | Low | 5-second timeout guard; fallback to [] |
| Memory system unavailable | No similar failures shown | Medium | Graceful type check before calling; [] default |
| jq unavailable | JSON processing fails | Low | All jq calls wrapped with `2>/dev/null` and fallbacks |
| Suggestion file corruption | Lost feedback data | Low | Atomic writes via tmp + mv |
| Large error messages | Bloated reports | Medium | tail -10 limit; cut -c1-120 for memory text |
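The "atomic writes via tmp + mv" mitigation for the suggestion file can be sketched as below. This is a simplified illustration: the record fields, the positional arguments, and the use of `sed` instead of jq are assumptions, IDs are assumed to contain no regex metacharacters, and event emission is omitted.

```bash
# Sketch only: field layout and arguments are hypothetical.
# Usage: record_suggestion FILE ID CATEGORY STAGE
record_suggestion() {
  local file id category stage
  file="$1"
  id="$2"
  category="$3"
  stage="$4"
  # One JSON object per line (JSONL), appended in a single write.
  printf '{"id":"%s","category":"%s","stage":"%s","resolved":false}\n' \
    "$id" "$category" "$stage" >>"$file"
}

# Rewrites FILE through a temp file in the same directory, then renames it
# into place, so readers never observe a half-written file.
_rewrite_atomically() {
  local file expr tmp
  file="$1"
  expr="$2"
  [ -f "$file" ] || return 0
  tmp="$(mktemp "${file}.XXXXXX")" || return 0
  if sed "$expr" "$file" >"$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
  fi
}

# Usage: mark_suggestion_resolved FILE ID
mark_suggestion_resolved() {
  _rewrite_atomically "$1" "/\"id\":\"$2\"/s/\"resolved\":false/\"resolved\":true/"
}

# Usage: resolve_outstanding_suggestions FILE
resolve_outstanding_suggestions() {
  _rewrite_atomically "$1" 's/"resolved":false/"resolved":true/'
}
```

Creating the temp file next to the target keeps the final `mv` on one filesystem, where a rename replaces the file in a single step.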

Root Cause Hypothesis

Not applicable — No previous plan stage failure to diagnose.

Evidence Gathered

The implementation already exists in commit fded001 with all acceptance criteria met.

Fix Strategy

Not applicable — No failure to fix.

Verification Plan

  1. Run npm test to verify all 150+ test suites pass
  2. Run scripts/sw-lib-context-error-test.sh specifically to verify the 52 context-error test cases
  3. Verify integration points by checking that pipeline-state, daemon-failure, and loop-iteration properly call context-error functions

Endpoint Specification

Not applicable — Shell library module, not an API endpoint.

Error Codes

Not applicable — Shell functions use return codes (0 = success) with graceful degradation.

Rate Limiting

Not applicable — Internal library; memory queries have a 5-second timeout guard.

Versioning

Version: 3.2.4 (synced with all scripts via VERSION variable). No breaking changes — all new functions are additive.
