-
Notifications
You must be signed in to change notification settings - Fork 1.3k
codex-rescue subagent returns empty output on any companion error — "return nothing" instruction + stderr-only error path #350
Description
Summary
When node codex-companion.mjs task ... fails for any reason — auth not refreshed, codex CLI missing, EPERM on a worktree index.lock, JSON parse error, app-server crash, anything caught by the top-level main().catch() — the codex:codex-rescue subagent emits an empty string to its parent. The parent thread (Claude Code) sees no output and treats the rescue as a no-op. The user is never told the call failed.
This is distinct from #324 (auto-backgrounding + strip---wait produces a stub instead of the real output). Here the failure mode is no output at all when anything in the companion errors before the worker produces stdout.
Root cause — two compounding decisions
1. Agent prompt explicitly instructs swallowing errors
plugins/codex/agents/codex-rescue.md:42:
"If the Bash call fails or Codex cannot be invoked, return nothing."
Combined with plugins/codex/commands/rescue.md:43 ("Return the Codex companion stdout verbatim ... Do not paraphrase, summarize, rewrite, or add commentary before or after it."), the agent is contractually required to emit empty output on every failure path.
2. Companion writes errors to stderr; agent only forwards stdout
plugins/codex/scripts/codex-companion.mjs:1023-1027:
main().catch((error) => { const message = error instanceof Error ? error.message : String(error); process.stderr.write(`${message}\n`); process.exitCode = 1; });
ensureCodexAvailable(), ensureGitRepository(), JSON parse failures, the runAppServerTurn throw path, and any uncaught exception inside executeTaskRun all funnel here. The exit code is non-zero, stdout is empty, stderr has the actual diagnostic — but the rescue agent is told to forward stdout verbatim, and Claude Code's Bash tool does not surface stderr to the agent's output buffer in the rescue contract.
Real-world repro
Plugin v1.0.4 (commit 807e03a), Claude Code on macOS 15.5, codex-cli 0.130.0.
A /codex:rescue task targeting a worktree at ~/automation-worktrees/4d-paste-url — Codex completed Task 1 (TDD red→green), then hit:
fatal: Unable to create '/Users/minicp/automation/.git/worktrees/4d-paste-url/index.lock': Operation not permitted
In foreground mode the rendered failure text came back via stdout — visible. The job record at ~/.claude/plugins/data/codex-openai-codex/state/<slug>/jobs/task-mpmsifq9-69l3tf.json confirms status: "completed" with result.status: 0 and the full diagnostic in rendered.
If the same task had been dispatched --background (which codex-rescue.md:24 instructs the agent to prefer for "complicated, open-ended, multi-step" requests), the only stdout the parent would have seen is the "queued" launch message; the worker's subsequent EPERM error would only land in the job file, which the agent is forbidden from polling per codex-rescue.md:27-28.
I also observed earlier session-end cases where codex-companion was invoked, exited non-zero before producing any stdout (likely an auth-refresh race after laptop sleep), and the rescue agent returned empty. From the user's perspective: "I asked codex to do a thing, nothing happened, no message."
Why this is worse than #324
#324 produces a wrong-but-visible answer (the "queued" stub). This bug produces no answer. The user has no signal to investigate, no error message to search, and Claude Code's main thread proceeds as if rescue succeeded with empty content.
Suggested fix — defense in depth (both layers)
Neither component should rely on the other's stdio discipline. Fix both:
Companion side — route fatal errors through stdout as a structured envelope
plugins/codex/scripts/codex-companion.mjs:1023-1027:
main().catch((error) => { const message = error instanceof Error ? error.message : String(error); const envelope = { status: "error", exitCode: 1, error: message }; process.stdout.write(`${JSON.stringify(envelope, null, 2)}\n`); process.stderr.write(`${message}\n`); process.exitCode = 1; });
This matches what every other JSON-returning subcommand already does — failures become first-class output instead of side-channel diagnostics.
Agent side — forward stderr + exit code when the Bash call fails
plugins/codex/agents/codex-rescue.md:42, replace:
"If the Bash call fails or Codex cannot be invoked, return nothing."
with:
"Invoke the Bash call as
node \"\${CLAUDE_PLUGIN_ROOT}/scripts/codex-companion.mjs\" task ... 2>&1. If the command exits non-zero, return the captured combined output verbatim along with the exit code. Do not interpret, retry, or add commentary."
Two independent guarantees: companion always emits something parseable on stdout, and the agent surfaces whatever it gets even if the companion misbehaves. Either layer alone closes most of the failure surface; both together close all of it.
Optional: add a smoke test
A CI check that runs codex-companion.mjs task with a known-bad workspace (e.g., chmod-stripped .git) and asserts the stdout is non-empty would prevent this class of regression.
Cross-reference
- codex:codex-rescue subagent returns stub instead of actual task output #324 (silent path Add Gemini CLI extension commands #1 : stub instead of result) — different failure surface, same theme
- Large prompt to codex:codex-rescue is silently rejected as 'user denied' with no user prompt; Claude Code cannot recover #308 (silent path Review fails due to unknown sandboxing variant #3 : large prompt rejected as "user denied") — different cause, same observability gap
- Windows: /codex:review and /codex:rescue silently return empty results because plugin forces broken sandbox modes #349 (Windows: forced sandbox modes → empty results) — platform-specific silent failure
Common thread across all four: codex-rescue has no error contract with its caller. Worth thinking about as a class of bugs, not as four independent fixes.