Pipeline Design 189
ADR written to .claude/pipeline-artifacts/design.md.
Key architectural decisions documented:
- Standalone library (`context-error.sh`) with 8 public functions, chosen over a global `error()` override (blast radius too high) and stderr post-processing (can't inject into loop prompts mid-iteration)
- 4 integration points via `type` guards (pipeline-state, daemon-failure, loop-iteration, sw-pipeline completion); all degrade gracefully when the library isn't loaded
- 5-second timeout on memory queries prevents IO-bound lookups from stalling error output
- Unidirectional dependency graph: callers → context-error → memory + helpers, no cycles
- Atomic JSONL writes for suggestion tracking with tmp+mv pattern
- All 15 validation criteria checked off against the existing implementation in commit `fded001`

prompts (loop-iteration), and pipeline completion (feedback loop)
- All jq usage must tolerate `jq` being absent
- Atomic file writes required (pipefail + concurrent workers)
Standalone library module (`scripts/lib/context-error.sh`) with 8 pure-ish functions, integrated into 4 existing modules via explicit `type func_name >/dev/null 2>&1` guard checks. No global side effects: callers opt in by sourcing the library and calling functions directly.
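The guard pattern described above might look like the following at a call site. This is a minimal sketch: `report_stage_failure` and its plain-text fallback message are hypothetical names introduced for illustration. Only `format_context_error_report` and the `type ... >/dev/null 2>&1` check come from the ADR.

```bash
#!/usr/bin/env bash
# Hypothetical call-site wrapper (not part of the library); it only
# illustrates the opt-in guard and the fallback when the library is absent.
report_stage_failure() {
  local log_tail="$1" stage="$2" iteration="${3:-0}" goal="${4:-}" issue="${5:-}"
  local fallback="Stage ${stage} failed: ${log_tail}"

  # Guard: call into context-error.sh only if its function is defined.
  if type format_context_error_report >/dev/null 2>&1; then
    # If the library call itself fails, fall back to the plain message.
    format_context_error_report "$log_tail" "$stage" "$iteration" "$goal" "$issue" 2>/dev/null \
      || printf '%s\n' "$fallback"
  else
    printf '%s\n' "$fallback"
  fi
}
```

Because the check happens at call time, sourcing order does not matter: a caller loaded before the library picks up the richer report as soon as the function exists.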
┌──────────────────────────────────────────────────────────────┐
│ Callers (Integration Points) │
│ │
│ pipeline-state.sh daemon-failure.sh loop-iteration.sh │
│ mark_stage_failed() daemon_on_failure() compose_prompt() │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ format_context_ format_github_ query_similar_ │
│ error_report() error_comment() failures() + │
│ │ │ categorize_error() + │
│ │ │ generate_actions() │
│ └────────┬───────────┘ │ │
│ ▼ │ │
│ ┌─────────────────────────────────┐ │ │
│ │ context-error.sh │◄─────────────┘ │
│ │ │ │
│ │ categorize_error() │ │
│ │ query_similar_failures() ─────┼──► sw-memory.sh │
│ │ generate_suggested_actions() │ (memory_ranked_search)│
│ │ format_context_error_report() │ │
│ │ format_github_error_comment() │ │
│ │ record_suggestion() ─────┼──► suggestions.jsonl │
│ │ mark_suggestion_resolved() │ │
│ │ resolve_outstanding_suggestions│ │
│ └─────────────────────────────────┘ │
│ │ │
│ ▼ │
│ sw-pipeline.sh (completion) ── resolve_outstanding_ │
│ suggestions() │
│ │
│ helpers.sh ── emit_event() ──► events.jsonl │
└──────────────────────────────────────────────────────────────┘
Dependencies flow one direction: callers → context-error → memory + helpers. No circular dependencies.
// Error categorization: maps error text to one of 10 known categories
categorize_error(error_msg: string): ErrorCategory
// Returns: "FILE_ACCESS" | "FUNCTION_ERROR" | "SYNTAX_ERROR" | "ASSERTION_FAILURE"
//        | "TYPE_ERROR" | "TIMEOUT" | "MEMORY_ERROR" | "NETWORK_ERROR"
//        | "RESOURCE_ERROR" | "UNKNOWN"
// Errors: never fails (falls through to "UNKNOWN")

// Memory query with timeout guard
query_similar_failures(error_msg: string, max_results?: number = 3): JSONArray
// Returns: JSON array of similar past failures, or "[]"
// Errors: returns "[]" if memory_ranked_search unavailable, dir missing, or timeout
// Timeout: 5 seconds hard limit via `timeout` command

// Action generation: combines memory fixes + category defaults + stage context
generate_suggested_actions(
  category: ErrorCategory,
  failure_class: string,   // unused, reserved for future error-actionability bridge
  stage_id: string,
  similar_json: JSONArray
): string  // newline-separated action list (2-4 items)
// Errors: never fails (always produces at least 2 default actions)

// Terminal-formatted 4-section report
format_context_error_report(
  error_msg: string,
  stage_id: string,
  iteration: number,
  goal: string,
  issue_number: string
): string  // multi-line formatted report
// Errors: returns partial report on internal failures (each section independent)
// Respects NO_COLOR env var for plain-text output

// GitHub markdown-formatted report
format_github_error_comment(
  error_msg: string,
  stage_id: string,
  iteration: number,
  goal: string,
  issue_number: string
): string  // markdown with collapsible <details> for error logs
// Errors: same graceful degradation as terminal format

// Suggestion tracking: append to JSONL
record_suggestion(
  suggestion_id: string,
  category: ErrorCategory,
  stage_id: string,
  actions_json: JSONArray
): void
// Side effects: appends to $ARTIFACTS_DIR/suggestions.jsonl (atomic via tmp+mv)
//               emits suggestion.recorded event

// Individual resolution
mark_suggestion_resolved(suggestion_id: string, resolved?: string = "true"): void
// Side effects: atomically rewrites suggestions.jsonl, emits suggestion.resolved event
// Errors: no-op if file missing or ID not found

// Batch resolution on pipeline success
resolve_outstanding_suggestions(): void
// Side effects: marks all unresolved suggestions as resolved (atomic rewrite)
// Errors: no-op if file missing
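The `query_similar_failures` contract above (return `[]` when the search function is missing, the directory is missing, or the query times out) could be sketched as follows. Several details are assumptions made for illustration: the `MEMORY_DIR` and `MEMORY_QUERY_TIMEOUT` variables, the `(query, limit)` argument order of `memory_ranked_search`, and the extra check for a missing `timeout` binary. The sketch also assumes GNU `timeout` and bash.

```bash
#!/usr/bin/env bash
# Sketch only. MEMORY_DIR, MEMORY_QUERY_TIMEOUT, and the argument order of
# memory_ranked_search are hypothetical; the ADR does not specify them.
MEMORY_DIR="${MEMORY_DIR:-./memory}"
MEMORY_QUERY_TIMEOUT="${MEMORY_QUERY_TIMEOUT:-5}"   # seconds

query_similar_failures() {
  local error_msg="$1" max_results="${2:-3}" out

  # Guard 1: the memory search function is not loaded.
  type memory_ranked_search >/dev/null 2>&1 || { echo "[]"; return 0; }
  # Guard 2: the memory directory does not exist.
  [ -d "$MEMORY_DIR" ] || { echo "[]"; return 0; }
  # Guard 3 (assumption): without a `timeout` binary the query cannot be bounded.
  command -v timeout >/dev/null 2>&1 || { echo "[]"; return 0; }

  # `timeout` runs external commands only, so export the shell function
  # and invoke it from a child bash process.
  export -f memory_ranked_search
  if out="$(timeout "$MEMORY_QUERY_TIMEOUT" \
              bash -c 'memory_ranked_search "$1" "$2"' _ "$error_msg" "$max_results" \
              2>/dev/null)" && [ -n "$out" ]; then
    printf '%s\n' "$out"
  else
    echo "[]"   # timeout, non-zero exit, or empty result
  fi
}
```

One consequence of running the search in a child process is that any helper functions or variables it relies on must also be exported, which is worth checking before adopting this shape.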
Error path (pipeline stage fails):
Stage fails
→ pipeline-state.sh:mark_stage_failed()
→ reads last 5 lines from stage log
→ format_context_error_report(log_tail, stage, iteration, goal, issue)
→ categorize_error(log_tail) → category
→ query_similar_failures(log_tail, 3) → [similar matches] (5s timeout)
→ generate_suggested_actions(category, "", stage, similar) → actions
→ assemble 4-section report
→ save_artifact("context-error-{stage}.md", report)
→ record_suggestion(id, category, stage, actions_json)
→ emit_event("error.context_generated")
Daemon failure path (GitHub comment):
Daemon worker fails
→ daemon-failure.sh:daemon_on_failure()
→ format_github_error_comment(log_tail, "pipeline", 0, goal, issue)
→ same internal flow as above, markdown output
→ appended to GitHub issue comment body
Loop iteration path (prompt injection):
Build loop iteration with prior errors
→ loop-iteration.sh:compose_prompt()
→ query_similar_failures(error_lines, 3)
→ categorize_error(error_lines)
→ generate_suggested_actions(category, "", "build", similar)
→ inject "Historical Context" section into Claude prompt
Feedback loop (pipeline success):
Pipeline completes successfully
→ sw-pipeline.sh (completion block)
→ resolve_outstanding_suggestions()
→ rewrites suggestions.jsonl, setting resolved="true" on all null entries
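The batch resolution step can be sketched as below, under stated assumptions. The `resolved` field and the tmp+mv rewrite come from the ADR; the default `ARTIFACTS_DIR` value and the `id` field used in the usage example are hypothetical, and event emission is omitted.

```bash
#!/usr/bin/env bash
# Sketch only: the default ARTIFACTS_DIR is an assumption.
ARTIFACTS_DIR="${ARTIFACTS_DIR:-.claude/pipeline-artifacts}"

resolve_outstanding_suggestions() {
  local file="$ARTIFACTS_DIR/suggestions.jsonl" tmp

  [ -f "$file" ] || return 0                   # no-op if the file is missing
  command -v jq >/dev/null 2>&1 || return 0    # tolerate jq being absent

  # Write to a temp file in the same directory so the final mv is a rename.
  tmp="$(mktemp "${file}.XXXXXX")" || return 0

  # Set resolved="true" only on entries that are still null.
  if jq -c 'if .resolved == null then .resolved = "true" else . end' \
       "$file" >"$tmp" 2>/dev/null; then
    mv -f "$tmp" "$file"
  else
    rm -f "$tmp"                               # leave the original untouched
  fi
}
```

Readers of `suggestions.jsonl` see either the old file or the new one, never a partial write. Two concurrent rewrites can still overwrite each other, which matches the last-writer-wins behaviour the ADR accepts.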
| Component | Error Source | Handling |
|---|---|---|
| `query_similar_failures` | `memory_ranked_search` not loaded | `type` check → return `[]` |
| `query_similar_failures` | Memory dir doesn't exist | Dir check → return `[]` |
| `query_similar_failures` | Query takes too long | `timeout 5` → return `[]` |
| `generate_suggested_actions` | `jq` parse failure on `similar_json` | `2>/dev/null` fallback → skip memory-based fix, use category defaults |
| `record_suggestion` | `jq` unavailable | `2>/dev/null` on `jq` call; file may not be written |
| `record_suggestion` | `ARTIFACTS_DIR` not writable | `mkdir -p` with fallback; silent failure |
| `mark_suggestion_resolved` | Concurrent writers | Atomic tmp + mv; last writer wins (acceptable for tracking data) |
| All callers | `context-error.sh` not sourced | `type func_name >/dev/null 2>&1` guard before every call |
| All callers | Any function throws | `2>/dev/null` with fallback wrapping at call sites |
Every integration point is guarded with type ... >/dev/null 2>&1 so the system operates identically when context-error.sh is not loaded. No caller can fail due to this module being absent or broken.
- Override global `error()` function. Pros: zero integration work, every error path automatically enriched. Cons: massive blast radius (all error calls would trigger memory queries), 5-second timeout penalty on every error, untestable (global state mutation), would break simple error output in non-pipeline contexts. Rejected because the cost model is inverted: most errors don't need memory context, but all would pay the latency price.
- Post-process stderr after stage completion. Pros: no source code changes to existing modules. Cons: requires complex stderr capture (tee + temp files), delays error feedback until after the stage completes (defeats the purpose of real-time context), loses the ability to inject context into loop prompts mid-iteration. Rejected because the loop-iteration use case requires error context during execution, not after.
- `scripts/lib/context-error.sh`: 431-line core library (8 public functions, include guard, version-stamped)
- `scripts/sw-lib-context-error-test.sh`: 52-case test suite covering all functions, edge cases, and event emission
- `scripts/lib/pipeline-state.sh`: `mark_stage_failed()` calls `format_context_error_report` + `record_suggestion` (lines ~440-464)
- `scripts/lib/daemon-failure.sh`: `daemon_on_failure()` calls `format_github_error_comment` (lines ~371-376)
- `scripts/lib/loop-iteration.sh`: `compose_prompt()` injects historical error context via `query_similar_failures` + `categorize_error` + `generate_suggested_actions` (lines ~121-132)
- `scripts/sw-pipeline.sh`: pipeline completion block calls `resolve_outstanding_suggestions` (line ~2746)
- `config/event-schema.json`: registers 3 new events: `error.context_generated`, `suggestion.recorded`, `suggestion.resolved`
- No new external dependencies
- Runtime dependency on `jq` (with fallbacks for absence)
- Optional dependency on `memory_ranked_search` from `sw-memory.sh` (graceful degradation)
- Memory query latency: Mitigated by 5-second `timeout` command wrapper. If the memory corpus is very large, queries could approach this limit. Monitor via `suggestion.recorded` event timestamps.
- suggestions.jsonl concurrent access: `mark_suggestion_resolved` and `resolve_outstanding_suggestions` both do full-file rewrite via tmp+mv. In theory, two concurrent pipelines could race. Acceptable for tracking/analytics data; not a correctness concern.
- Error category accuracy: Regex-based categorization (`categorize_error`) is necessarily heuristic. The ordered regex cascade means a message matching both "timeout" and "assertion" patterns would be classified by whichever regex appears first. This is acceptable because the category drives suggestions, not control flow.
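To make the ordering effect concrete, here is a minimal sketch of an ordered regex cascade. The category names come from the interface above, but every pattern and the rule order are assumptions chosen for illustration; the library's real regexes are not shown in this ADR.

```bash
#!/usr/bin/env bash
# Sketch only: patterns and their order are hypothetical.
categorize_error() {
  local msg="$1" category pattern

  # Rules are tried top to bottom; the first matching pattern wins.
  while IFS='=' read -r category pattern; do
    if printf '%s' "$msg" | grep -Eqi -- "$pattern"; then
      echo "$category"
      return 0
    fi
  done <<'RULES'
FILE_ACCESS=no such file|permission denied|ENOENT|EACCES
FUNCTION_ERROR=is not a function|command not found|undefined function
SYNTAX_ERROR=syntax error|unexpected token
ASSERTION_FAILURE=assert
TYPE_ERROR=TypeError|type mismatch
TIMEOUT=timed out|timeout
MEMORY_ERROR=out of memory|heap limit
NETWORK_ERROR=ECONNREFUSED|connection refused|network
RESOURCE_ERROR=no space left|too many open files|EMFILE
RULES

  echo "UNKNOWN"   # nothing matched: never fails, falls through
}
```

With this particular order, `categorize_error "AssertionError: request timed out"` yields `ASSERTION_FAILURE` because the assertion rule precedes the timeout rule. Swapping the two lines would flip the result, which is the heuristic behaviour the risk above describes.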
- `scripts/lib/context-error.sh` loads without error and exports 8 public functions
- `categorize_error` correctly classifies all 10 error categories via regex patterns
- `query_similar_failures` returns `[]` when memory system is unavailable (no crash, no hang)
- `query_similar_failures` respects 5-second timeout (tested via mock)
- `format_context_error_report` produces all 4 sections: What Failed, Why, Similar Past Issues, Suggested Actions
- `format_context_error_report` includes stage name, iteration count, issue number, and goal in output
- `format_github_error_comment` produces valid markdown with `<details>` blocks
- `record_suggestion` appends valid JSONL and emits `suggestion.recorded` event
- `mark_suggestion_resolved` atomically updates the correct entry and emits `suggestion.resolved` event
- `resolve_outstanding_suggestions` batch-resolves unresolved entries without touching already-resolved ones
- All 4 integration points use `type ... >/dev/null 2>&1` guards (zero impact when library not loaded)
- All 52 test cases pass in `scripts/sw-lib-context-error-test.sh`
- Full test suite (`npm test`) passes with no regressions
- `NO_COLOR` environment variable produces plain-text output (no Unicode box-drawing)
- New events registered in `config/event-schema.json` with correct required/optional fields
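As an illustration of the `NO_COLOR` criterion in the list above, a section header could switch between styled and plain output as sketched here. `section_header` and both output layouts are hypothetical; the actual report formatting is not shown in this ADR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: illustrates only the NO_COLOR switch.
section_header() {
  local title="$1"
  if [ -n "${NO_COLOR:-}" ]; then
    # Plain ASCII: no ANSI escapes, no Unicode box-drawing.
    printf '== %s ==\n' "$title"
  else
    # Bold text framed with box-drawing characters.
    printf '\033[1m━━ %s ━━\033[0m\n' "$title"
  fi
}
```

Checking for a non-empty value follows the common `NO_COLOR` convention, so `NO_COLOR=` (set but empty) still produces styled output.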