[BUG] /codex:setup reports loggedIn:false when shared broker is busy; getCodexAuthStatus missing direct-fallback #342
Description
/codex:setup (and any other code path that calls getCodexAuthStatus) intermittently reports loggedIn: false with detail: "Shared Codex broker is busy.", even when codex login status shows the user is logged in and the underlying CLI works fine.
The misreport persists across new Claude Code sessions until the broker process is killed or the broker session state is wiped. This is confusing because the user sees "you're not logged in" right after a successful !codex login.
Root cause
In plugins/codex/scripts/lib/codex.mjs, getCodexAuthStatus connects with reuseExistingBroker: true:
https://github.com/openai/codex-plugin-cc/blob/main/plugins/codex/scripts/lib/codex.mjs#L831-L863
If the shared broker is currently servicing another request (activeRequestSocket set), the broker replies with the JSON-RPC BUSY error (code: -32001). getCodexAuthStatusFromClient catches that error and converts it into buildAuthStatus({ loggedIn: false, detail: error.message }), and the outer getCodexAuthStatus returns that as the final result.
The sibling helper withAppServer in the same file already handles this scenario by retrying with disableBroker: true:
https://github.com/openai/codex-plugin-cc/blob/main/plugins/codex/scripts/lib/codex.mjs#L607-L636
getCodexAuthStatus simply doesn't implement that fallback.
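The failure mode can be shown in isolation. The sketch below uses stand-in functions, not the plugin's real code: statusSwallowing and makeBusyError are hypothetical names, and only the error code and message are taken from this report. It shows that once the BUSY error is converted into a status object, the caller can no longer tell "broker busy" apart from "logged out".

```javascript
// Stand-ins only: these are NOT the plugin's real functions.
const BROKER_BUSY_RPC_CODE = -32001; // code quoted in this report

// Build an error shaped like the broker's BUSY reply (assumed shape).
function makeBusyError() {
  const error = new Error("Shared Codex broker is busy.");
  error.rpcCode = BROKER_BUSY_RPC_CODE;
  return error;
}

// Mimics the swallowing behaviour: any failure becomes "logged out".
function statusSwallowing(readAccount) {
  try {
    return { loggedIn: Boolean(readAccount()), detail: null };
  } catch (error) {
    // The rpcCode is dropped here; only the message survives.
    return { loggedIn: false, detail: error.message };
  }
}

const busyResult = statusSwallowing(() => {
  throw makeBusyError();
});
console.log(busyResult);
```

The returned object carries no rpcCode, so an outer caller has nothing to base a retry decision on.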
Reproduction
- Run any command that spawns the shared broker and holds it busy with a long-running RPC call (e.g. start /codex:rescue or any task/start).
- While that's still running, run /codex:setup in another tool/window or programmatically.
- Observe auth.loggedIn: false with auth.detail: "Shared Codex broker is busy." even though credentials are valid.
A reliable synthetic repro (no real codex needed): start a tiny Node net server listening on a named pipe / unix socket that responds to every JSON-RPC id-bearing request with {error:{code:-32001,message:"Shared Codex broker is busy."}}, point CODEX_COMPANION_APP_SERVER_ENDPOINT at it, then run node scripts/codex-companion.mjs setup --json. Without a fallback you see loggedIn:false; with a disableBroker:true retry path you see the real loggedIn:true.
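A minimal sketch of such a stub broker follows. It assumes newline-delimited JSON framing on the socket, which this report does not confirm; busyReplyFor and startBusyBroker are hypothetical names. The exact string format expected by CODEX_COMPANION_APP_SERVER_ENDPOINT is also not stated here, so it is left to the reader.

```javascript
const BROKER_BUSY_RPC_CODE = -32001;
const BROKER_BUSY_MESSAGE = "Shared Codex broker is busy.";

// Build the BUSY reply for one incoming line, or null when no reply is due
// (blank or unparsable input, or a notification without an id).
function busyReplyFor(line) {
  let request;
  try {
    request = JSON.parse(line);
  } catch {
    return null;
  }
  if (request === null || typeof request !== "object") return null;
  if (request.id === undefined || request.id === null) return null;
  const reply = {
    id: request.id,
    error: { code: BROKER_BUSY_RPC_CODE, message: BROKER_BUSY_MESSAGE }
  };
  return JSON.stringify(reply) + "\n";
}

// Listen on a unix socket path or Windows named pipe and answer every
// id-bearing request with BUSY. Framing is ASSUMED to be one JSON per line.
async function startBusyBroker(socketPath) {
  const { createServer } = await import("node:net");
  const server = createServer((socket) => {
    let buffered = "";
    socket.setEncoding("utf8");
    socket.on("data", (chunk) => {
      buffered += chunk;
      let newline;
      while ((newline = buffered.indexOf("\n")) !== -1) {
        const reply = busyReplyFor(buffered.slice(0, newline));
        buffered = buffered.slice(newline + 1);
        if (reply) socket.write(reply);
      }
    });
    socket.on("error", () => {}); // ignore client disconnects
  });
  await new Promise((resolve, reject) => {
    server.once("error", reject);
    server.listen(socketPath, resolve);
  });
  return server;
}
```

Calling startBusyBroker with a socket path (or a \\.\pipe\... name on Windows) and then running the setup command against that endpoint should exercise the BUSY path, provided the framing assumption holds.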
Suggested fix
Mirror what withAppServer already does. Roughly:
    export async function getCodexAuthStatus(cwd, options = {}) {
      const availability = getCodexAvailability(cwd);
      if (!availability.available) { /* unchanged */ }

      const brokerRequested =
        Boolean((options.env ?? process.env)[BROKER_ENDPOINT_ENV]) ||
        Boolean(loadBrokerSession(cwd)?.endpoint);

      let client = null;
      try {
        client = await CodexAppServerClient.connect(cwd, { env: options.env, reuseExistingBroker: true });
        return await getCodexAuthStatusFromClient(client, cwd);
      } catch (error) {
        const brokerAttempted = client?.transport === "broker" || brokerRequested;
        const shouldRetryDirect =
          brokerAttempted &&
          (error?.rpcCode === BROKER_BUSY_RPC_CODE ||
            error?.code === "ENOENT" ||
            error?.code === "ECONNREFUSED");

        if (client) {
          await client.close().catch(() => {});
          client = null;
        }

        if (shouldRetryDirect) {
          try {
            client = await CodexAppServerClient.connect(cwd, { env: options.env, disableBroker: true });
            return await getCodexAuthStatusFromClient(client, cwd);
          } catch (directError) {
            return buildAuthStatus({
              loggedIn: false,
              detail: directError instanceof Error ? directError.message : String(directError),
              source: "app-server"
            });
          }
        }

        return buildAuthStatus({
          loggedIn: false,
          detail: error instanceof Error ? error.message : String(error),
          source: "app-server"
        });
      } finally {
        if (client) {
          await client.close().catch(() => {});
        }
      }
    }
This also requires removing the inner try/catch from getCodexAuthStatusFromClient so the underlying RPC error (with rpcCode) propagates up; otherwise it always gets wrapped into a "logged out" status before the outer can inspect it.
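The retry decision itself can be pulled out into a small pure function, which makes it easy to unit-test once the RPC error propagates. This is only a sketch: shouldRetryDirect as a standalone helper and RETRYABLE_CONNECT_CODES are hypothetical names, and the conditions are the ones from the suggested fix.

```javascript
const BROKER_BUSY_RPC_CODE = -32001; // code quoted in this report
const RETRYABLE_CONNECT_CODES = new Set(["ENOENT", "ECONNREFUSED"]);

// Decide whether a failed broker attempt should be retried with a direct
// (disableBroker: true) connection. Only broker failures qualify.
function shouldRetryDirect(error, brokerAttempted) {
  if (!brokerAttempted) return false;
  return (
    error?.rpcCode === BROKER_BUSY_RPC_CODE ||
    RETRYABLE_CONNECT_CODES.has(error?.code)
  );
}

// Example: a BUSY reply from the broker qualifies for a direct retry.
const busyError = Object.assign(new Error("Shared Codex broker is busy."), {
  rpcCode: BROKER_BUSY_RPC_CODE
});
console.log(shouldRetryDirect(busyError, true));
```

Keeping the predicate separate means a genuine "not logged in" result, or an unrelated failure, never triggers a second connection attempt.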
Happy to send a PR if useful.
Environment
- codex-plugin-cc: 1.0.4 (latest as of this report)
- Codex CLI: 0.132.0
- Node: 24.15.0
- OS: Windows 11 (also affects the unix-socket transport on macOS/Linux — the underlying logic is platform-independent)