Pipeline Design 242

Seth Ford edited this page Mar 10, 2026 · 5 revisions

Now I have the full picture. Here's the ADR:

Design: Misleading "jq not available" warning when Claude outputs JSON object instead of array

Context

Claude CLI's --output-format json can produce either a JSON array ([{...}]) or a JSON object ({...}). Two functions in sw-loop.sh process this output:

_extract_text_from_json() (lines 561-618) — extracts .result text from Claude's JSON response. This function was already fixed in prior iterations: it now handles both array and object formats (Case 2, lines 579-606), so the misleading "jq not available" warning (Case 3, line 610) is only reachable when jq is genuinely absent. Tests 17-19 and 23 validate this.
accumulate_loop_tokens() (lines 511-556) — parses token counts. This function is still broken: line 516 checks head -c1 | grep -q '\[', so JSON objects (starting with {) fall through to the regex fallback (lines 548-554), silently losing structured token/cost data.

Constraints

Bash 3.2 compatibility (no associative arrays, no readarray)
jq may not be available — regex fallback must always exist
Atomic changes only — this is a bugfix, not a refactor

Decision

Fix accumulate_loop_tokens() to handle JSON objects the same way _extract_text_from_json() already does — check for both [ and { as the first character, then use the appropriate jq expression (.[-1].usage.* for arrays, .usage.* for objects).

This mirrors the pattern already validated in _extract_text_from_json() and keeps the two JSON-parsing functions symmetric.

Component Diagram

Claude CLI (--output-format json)
 │
 │ JSON array: [{result: "...", usage: {...}}]
 │ — OR —
 │ JSON object: {result: "...", usage: {...}}
 │
 ├──→ _extract_text_from_json() [ALREADY FIXED]
 │ ├─ Case 2a: array + jq → jq '.[-1].result'
 │ ├─ Case 2b: object + jq → jq '.result'
 │ ├─ Case 3: JSON + no jq → raw copy + warning
 │ └─ Case 4: not JSON → raw copy
 │
 └──→ accumulate_loop_tokens() [NEEDS FIX]
 ├─ Branch A: array + jq → jq '.[-1].usage.*' ← exists
 ├─ Branch B: object + jq → jq '.usage.*' ← MISSING
 └─ Fallback: regex parsing ← exists

Interface Contracts

// accumulate_loop_tokens(log_file: string): void
// Precondition: log_file is a path to a file (may not exist)
// Postcondition: LOOP_INPUT_TOKENS, LOOP_OUTPUT_TOKENS, LOOP_COST_MILLICENTS
// are incremented by values parsed from the file
// Error contract: never fails — all parse errors default to 0
// _extract_text_from_json(json_file: string, log_file: string, err_file: string): 0
// Precondition: json_file path provided (file may be empty/missing)
// Postcondition: log_file contains extracted text or placeholder
// Error contract: always returns 0 — never causes pipeline failure

Data Flow

accumulate_loop_tokens(log_file)
 → read first char of file
 → if '{' or '[' AND jq available:
 → if '[': jq '.[-1].usage.input_tokens // 0' (array path)
 → if '{': jq '.usage.input_tokens // 0' (object path) ← NEW
 → accumulate into LOOP_INPUT_TOKENS, LOOP_OUTPUT_TOKENS
 → parse cost_usd or estimate from model rates
 → else:
 → regex fallback (unchanged)

Error Boundaries

Component	Error Source	Handling
`accumulate_loop_tokens`	jq parse failure	`
`accumulate_loop_tokens`	missing file	early `return 0`
`accumulate_loop_tokens`	non-numeric jq output	`${var:-0}` default
`_extract_text_from_json`	jq failure	`

Alternatives Considered

Unified normalization layer — Pre-process all Claude output into a canonical array format before any parsing.
- Pros: Single parsing path, DRY
- Cons: Adds a new abstraction for a two-function problem; over-engineering for a bugfix; risk of breaking existing array handling
Remove the jq path entirely, use only regex — Since regex fallback exists, just always use it.
- Pros: Simpler, no jq dependency
- Cons: Loses structured parsing (cost_usd, cache tokens); regex is fragile and can't extract nested fields; significant accuracy regression
Wrap Claude CLI to force array output — Add | jq '[.]' or similar wrapper.
- Pros: Guarantees array format downstream
- Cons: Brittle coupling to Claude CLI internals; breaks if CLI changes; adds a jq dependency to the wrapper itself

Implementation Plan

Files to modify:
- scripts/sw-loop.sh — Fix accumulate_loop_tokens() (lines 516-546) to handle { first-char alongside [, using .usage.* instead of .[-1].usage.* for objects
- scripts/sw-loop-test.sh — Add test for accumulate_loop_tokens() with JSON object input (functional test, not just grep-based)
Files to create: None
Dependencies: None (jq is already optional with regex fallback)
Risk areas:
- Arithmetic evaluation in bash: ensure jq output is always numeric before $(( )) — already handled by ${var:-0} pattern
- The head -c1 approach assumes no leading whitespace — same assumption as _extract_text_from_json, acceptable since Claude CLI output is deterministic

Validation Criteria

accumulate_loop_tokens() correctly parses input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens, and total_cost_usd from a JSON object (not just array)
New functional test: given {"type":"result","result":"...","usage":{"input_tokens":100,"output_tokens":50},"total_cost_usd":0.003}, token accumulators reflect correct values
Existing Test 23 continues to pass — no "jq not available" warning when jq IS present and output is a JSON object
All 69 existing sw-loop-test.sh tests continue to pass
Regex fallback still works when jq is absent (no regression)
No unrelated changes included in the diff

Pipeline Design 242

Design: Misleading "jq not available" warning when Claude outputs JSON object instead of array

Context

Constraints

Decision

Component Diagram

Interface Contracts

Data Flow

Error Boundaries

Alternatives Considered

Implementation Plan

Validation Criteria

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!