Pipeline Design 242

Seth Ford edited this page Mar 10, 2026 · 5 revisions


Design: Misleading "jq not available" warning when Claude outputs JSON object instead of array

Context

Claude CLI's --output-format json can produce either a JSON array ([{...}]) or a JSON object ({...}). Two functions in sw-loop.sh process this output:

  1. _extract_text_from_json() (lines 561-618) — extracts .result text from Claude's JSON response. This function was already fixed in prior iterations: it now handles both array and object formats (Case 2, lines 579-606), so the misleading "jq not available" warning (Case 3, line 610) is only reachable when jq is genuinely absent. Tests 17-19 and 23 validate this.

  2. accumulate_loop_tokens() (lines 511-556) — parses token counts. This function is still broken: line 516 checks head -c1 | grep -q '\[', so JSON objects (starting with {) fall through to the regex fallback (lines 548-554), silently losing structured token/cost data.
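The failure mode in item 2 can be reproduced in isolation. The snippet below is a minimal sketch: is_json_array is a hypothetical helper that wraps the first-character test quoted above, and the two sample payloads are invented for illustration. How sw-loop.sh actually feeds the file into that check is an assumption here.

```shell
# Hypothetical helper wrapping the check quoted from line 516.
is_json_array() {
    # Succeeds only when the first byte on stdin is '['.
    head -c1 | grep -q '\['
}

# Invented sample payloads in the two shapes the CLI can emit.
array_out='[{"result":"ok","usage":{"input_tokens":100}}]'
object_out='{"result":"ok","usage":{"input_tokens":100}}'

if printf '%s' "$array_out" | is_json_array; then
    array_path="jq"
else
    array_path="regex"
fi

if printf '%s' "$object_out" | is_json_array; then
    object_path="jq"
else
    object_path="regex"
fi

echo "array  -> $array_path"
echo "object -> $object_path"
```

The object payload is valid JSON, yet it is routed to the regex path purely because its first byte is { rather than [.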

Constraints

  • Bash 3.2 compatibility (no associative arrays, no readarray)
  • jq may not be available — regex fallback must always exist
  • Atomic changes only — this is a bugfix, not a refactor

Decision

Fix accumulate_loop_tokens() to handle JSON objects the same way _extract_text_from_json() already does — check for both [ and { as the first character, then use the appropriate jq expression (.[-1].usage.* for arrays, .usage.* for objects).

This mirrors the pattern already validated in _extract_text_from_json() and keeps the two JSON-parsing functions symmetric.
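As a rough illustration of the decision, the choice of jq expression can be reduced to picking a path prefix from the first character. usage_prefix_for is a hypothetical name introduced for this sketch and is not taken from sw-loop.sh.

```shell
# Hypothetical helper: map the first character of the output to the
# jq path prefix under which the usage fields live.
usage_prefix_for() {
    case "${1:-}" in
        '[') echo '.[-1].usage' ;;  # array: usage sits on the last element
        '{') echo '.usage' ;;       # object: usage sits at the top level
        *)   echo '' ;;             # not JSON: caller uses the regex fallback
    esac
}

first_char=$(printf '%s' '{"usage":{"input_tokens":7}}' | head -c1)
prefix=$(usage_prefix_for "$first_char")
echo "${prefix}.input_tokens // 0"
```

Keeping the two shapes behind one prefix means the field names (.input_tokens, .output_tokens and so on) are written once, so the array and object paths are less likely to drift apart.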

Component Diagram

Claude CLI (--output-format json)
 │
 │ JSON array: [{result: "...", usage: {...}}]
 │ OR
 │ JSON object: {result: "...", usage: {...}}
 │
 ├──→ _extract_text_from_json() [ALREADY FIXED]
 │     ├─ Case 2a: array + jq → jq '.[-1].result'
 │     ├─ Case 2b: object + jq → jq '.result'
 │     ├─ Case 3: JSON + no jq → raw copy + warning
 │     └─ Case 4: not JSON → raw copy
 │
 └──→ accumulate_loop_tokens() [NEEDS FIX]
       ├─ Branch A: array + jq → jq '.[-1].usage.*' ← exists
       ├─ Branch B: object + jq → jq '.usage.*' ← MISSING
       └─ Fallback: regex parsing ← exists

Interface Contracts

// accumulate_loop_tokens(log_file: string): void
//   Precondition: log_file is a path to a file (may not exist)
//   Postcondition: LOOP_INPUT_TOKENS, LOOP_OUTPUT_TOKENS, LOOP_COST_MILLICENTS
//     are incremented by values parsed from the file
//   Error contract: never fails; all parse errors default to 0

// _extract_text_from_json(json_file: string, log_file: string, err_file: string): 0
//   Precondition: json_file path provided (file may be empty/missing)
//   Postcondition: log_file contains extracted text or placeholder
//   Error contract: always returns 0; never causes pipeline failure

Data Flow

accumulate_loop_tokens(log_file)
 → read first char of file
 → if '{' or '[' AND jq available:
    → if '[': jq '.[-1].usage.input_tokens // 0' (array path)
    → if '{': jq '.usage.input_tokens // 0' (object path) ← NEW
    → accumulate into LOOP_INPUT_TOKENS, LOOP_OUTPUT_TOKENS
    → parse cost_usd or estimate from model rates
 → else:
    → regex fallback (unchanged)
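The flow above could be sketched roughly as follows. This is not the code in sw-loop.sh: the function is named accumulate_loop_tokens_sketch to make that explicit, the regex patterns in the fallback are assumptions, and cost handling (LOOP_COST_MILLICENTS) is left out because the conversion rules are not given here.

```shell
LOOP_INPUT_TOKENS=0
LOOP_OUTPUT_TOKENS=0

# Sketch only; the real function lives in scripts/sw-loop.sh.
accumulate_loop_tokens_sketch() {
    local log_file="${1:-}"
    local first_char="" prefix="" in_tok="" out_tok=""

    # Missing file: nothing to add, and never fail the caller.
    [ -f "$log_file" ] || return 0

    first_char=$(head -c1 "$log_file" 2>/dev/null || true)
    case "$first_char" in
        '[') prefix='.[-1].usage' ;;  # array path (already exists)
        '{') prefix='.usage' ;;       # object path (the new branch)
    esac

    if [ -n "$prefix" ] && command -v jq >/dev/null 2>&1; then
        in_tok=$(jq -r "${prefix}.input_tokens // 0" "$log_file" 2>/dev/null || echo 0)
        out_tok=$(jq -r "${prefix}.output_tokens // 0" "$log_file" 2>/dev/null || echo 0)
    else
        # Regex fallback (patterns are assumptions): last occurrence of each key.
        in_tok=$(grep -oE '"input_tokens"[[:space:]]*:[[:space:]]*[0-9]+' "$log_file" 2>/dev/null \
            | tail -1 | grep -oE '[0-9]+$' || true)
        out_tok=$(grep -oE '"output_tokens"[[:space:]]*:[[:space:]]*[0-9]+' "$log_file" 2>/dev/null \
            | tail -1 | grep -oE '[0-9]+$' || true)
    fi

    # Empty results default to 0 before arithmetic.
    LOOP_INPUT_TOKENS=$(( LOOP_INPUT_TOKENS + ${in_tok:-0} ))
    LOOP_OUTPUT_TOKENS=$(( LOOP_OUTPUT_TOKENS + ${out_tok:-0} ))
    return 0
}

# Usage: one object-shaped file, one array-shaped file, one missing file.
obj_file=$(mktemp)
arr_file=$(mktemp)
printf '%s\n' '{"type":"result","usage":{"input_tokens":100,"output_tokens":50}}' > "$obj_file"
printf '%s\n' '[{"type":"system"},{"type":"result","usage":{"input_tokens":10,"output_tokens":5}}]' > "$arr_file"

accumulate_loop_tokens_sketch "$obj_file"
accumulate_loop_tokens_sketch "$arr_file"
accumulate_loop_tokens_sketch "/nonexistent/file"
echo "input=$LOOP_INPUT_TOKENS output=$LOOP_OUTPUT_TOKENS"
rm -f "$obj_file" "$arr_file"
```

The fixtures are chosen so that the jq path and the regex fallback give the same totals, which lets the sketch run whether or not jq is installed.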

Error Boundaries

Component                 Error Source             Handling
accumulate_loop_tokens    jq parse failure         `
accumulate_loop_tokens    missing file             early return 0
accumulate_loop_tokens    non-numeric jq output    ${var:-0} default
_extract_text_from_json   jq failure               `
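For the non-numeric case, note that ${var:-0} on its own only covers empty or unset values. A stricter guard, should one be wanted, might look like the sketch below. to_int is a hypothetical helper and is not part of sw-loop.sh.

```shell
# Hypothetical helper: pass through plain non-negative integers,
# map everything else (empty, "null", decimals, text) to 0.
to_int() {
    case "${1:-}" in
        ''|*[!0-9]*) echo 0 ;;
        *)           echo "$1" ;;
    esac
}

total=0
for raw in "42" "" "null" "1.5" "7"; do
    total=$(( total + $(to_int "$raw") ))
done
echo "total=$total"
```

Only 42 and 7 survive the guard, so the arithmetic never sees a value it cannot evaluate.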

Alternatives Considered

  1. Unified normalization layer — Pre-process all Claude output into a canonical array format before any parsing.

    • Pros: Single parsing path, DRY
    • Cons: Adds a new abstraction for a two-function problem; over-engineering for a bugfix; risk of breaking existing array handling
  2. Remove the jq path entirely, use only regex — Since regex fallback exists, just always use it.

    • Pros: Simpler, no jq dependency
    • Cons: Loses structured parsing (cost_usd, cache tokens); regex is fragile and can't extract nested fields; significant accuracy regression
  3. Wrap Claude CLI to force array output — Add | jq '[.]' or similar wrapper.

    • Pros: Guarantees array format downstream
    • Cons: Brittle coupling to Claude CLI internals; breaks if CLI changes; adds a jq dependency to the wrapper itself

Implementation Plan

  • Files to modify:

    • scripts/sw-loop.sh — Fix accumulate_loop_tokens() (lines 516-546) to handle { first-char alongside [, using .usage.* instead of .[-1].usage.* for objects
    • scripts/sw-loop-test.sh — Add test for accumulate_loop_tokens() with JSON object input (functional test, not just grep-based)
  • Files to create: None

  • Dependencies: None (jq is already optional with regex fallback)

  • Risk areas:

    • Arithmetic evaluation in bash: ensure jq output is always numeric before $(( )) — already handled by ${var:-0} pattern
    • The head -c1 approach assumes no leading whitespace — same assumption as _extract_text_from_json, acceptable since Claude CLI output is deterministic
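If the leading-whitespace assumption ever stops holding, one possible mitigation is to look at the first non-whitespace byte instead. This is a sketch under that assumption only: first_json_char is a hypothetical helper, and this design does not propose adopting it.

```shell
# Hypothetical helper: print the first non-whitespace byte of a file.
first_json_char() {
    local file="${1:-}"
    [ -f "$file" ] || return 0
    # Delete all whitespace, then keep the first remaining byte.
    tr -d '[:space:]' < "$file" | head -c1
}

padded=$(mktemp)
printf '\n  {"usage":{"input_tokens":1}}\n' > "$padded"

naive=$(head -c1 "$padded")           # sees the leading newline
robust=$(first_json_char "$padded")   # sees the opening brace
rm -f "$padded"

echo "naive='$naive' robust='$robust'"
```

With padded input the plain head -c1 check would miss both [ and {, sending valid JSON to the regex fallback.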

Validation Criteria

  • accumulate_loop_tokens() correctly parses input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens, and total_cost_usd from a JSON object (not just array)
  • New functional test: given {"type":"result","result":"...","usage":{"input_tokens":100,"output_tokens":50},"total_cost_usd":0.003}, token accumulators reflect correct values
  • Existing Test 23 continues to pass — no "jq not available" warning when jq IS present and output is a JSON object
  • All 69 existing sw-loop-test.sh tests continue to pass
  • Regex fallback still works when jq is absent (no regression)
  • No unrelated changes included in the diff
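The functional test in the second criterion might be shaped like the sketch below. assert_eq and test_object_tokens are hypothetical names, and stub_accumulate is a regex-only stand-in so the snippet runs on its own; in sw-loop-test.sh the real accumulate_loop_tokens would be passed instead, loaded however the suite normally loads sw-loop.sh.

```shell
# Fixture taken from the validation criteria.
fixture='{"type":"result","result":"...","usage":{"input_tokens":100,"output_tokens":50},"total_cost_usd":0.003}'

# Hypothetical assertion helper.
assert_eq() {
    if [ "$1" = "$2" ]; then
        echo "PASS: $3"
    else
        echo "FAIL: $3 (expected $1, got $2)"
        return 1
    fi
}

# Runs the function named in $1 against the fixture and checks the accumulators.
test_object_tokens() {
    local fn="$1" tmp="" rc=0
    tmp=$(mktemp) || return 1
    printf '%s\n' "$fixture" > "$tmp"
    LOOP_INPUT_TOKENS=0
    LOOP_OUTPUT_TOKENS=0
    "$fn" "$tmp"
    assert_eq 100 "$LOOP_INPUT_TOKENS" "input tokens from JSON object" || rc=1
    assert_eq 50 "$LOOP_OUTPUT_TOKENS" "output tokens from JSON object" || rc=1
    rm -f "$tmp"
    return $rc
}

# Stand-in for the function under test (assumption, regex only).
stub_accumulate() {
    local n=""
    n=$(grep -oE '"input_tokens":[0-9]+' "$1" | grep -oE '[0-9]+$' || true)
    LOOP_INPUT_TOKENS=$(( LOOP_INPUT_TOKENS + ${n:-0} ))
    n=$(grep -oE '"output_tokens":[0-9]+' "$1" | grep -oE '[0-9]+$' || true)
    LOOP_OUTPUT_TOKENS=$(( LOOP_OUTPUT_TOKENS + ${n:-0} ))
}

test_object_tokens stub_accumulate
```

Because the test asserts on the accumulator values rather than grepping the source, it would catch a regression where the object branch exists but reads the wrong jq path.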
