[RFC] Multi-AI Era Refactoring — Implementation Ready, Need Reviewers and Testers #1577

professional-slacker started this conversation in Ideas


TL;DR

OpenClaude's architecture needs three things: multi-provider credential handling, a cd that carries through to spawned processes, and session resume that does not depend on the directory path.
The three proposals are listed below.
Implementation: a few days with AI-assisted coding.
Human review and testing: expected to take a few weeks, depending heavily on community availability.
This is where I need your help.

Looking for: code reviewers, cross-platform testers (Linux/macOS/Windows), and design input. If you've been bitten by any of these issues, your experience helps.


Community Impact

Past incidents and root causes:

| Symptom | Root Cause | Related Issues |
| --- | --- | --- |
| "API key invalid / 401" | Wrong API key used for the wrong provider | #1156, #1245, #1360, #1426, #1455 and 20+ more |
| "cd does not work" / "files written to wrong path" | Virtual cd does not change spawn cwd | #190, #372, #844 |
| "Cannot resume session after rename" | Session key depends on cwd path, breaks on path change | #568, #1031, #1203 |

P1 (API_KEY issue) is a limitation of the current design scope. /provider works correctly within the current scope. The goal of this proposal is to build the foundation for handling multiple credentials simultaneously in the multi-agent era.


Reproduce

Proposal 1: Provider Credential Isolation

If implemented, this would give OpenClaude a foundation for true multi-agent orchestration, with each agent holding its own provider credentials.

In the current design, a Provider is treated as an active state that can be switched within a session. The architecture does not support holding and using multiple Provider credentials independently and in parallel within a single session.

This creates constraints in use cases such as:

Using Opengateway Mimo for chat, Agent A (DeepSeek) for source code search, and Agent B (Gemini) for image recognition concurrently in the same session.
Switching via /provider overwrites the active Provider reference, causing state conflicts in workflows that assume multiple Providers can be handled simultaneously.

Additionally, differences in priority between environment variables and CLI settings can cause mismatches between startup configuration and runtime state, leading to "API_KEY invalid" errors due to misconfiguration — a recurring issue for a significant number of users.

The goal is to treat provider / model / API_KEY as a per-agent separable identifier, structurally preventing authentication and configuration conflicts during multi-agent execution.

Currently, WebSearch already maintains its own independent credentials (TAVILY_API_KEY, etc.) separate from the LLM API key. This proposal applies the same pattern to LLM providers — transparent credential resolution per agent.
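
As a rough illustration of per-agent credential resolution, the sketch below resolves a key for each agent from its own provider, with an explicit priority order and no fallback to another provider's key. All names here (`AgentSpec`, `resolveCredential`, `envVarFor`, and the `<PROVIDER>_API_KEY` naming convention) are assumptions made for the example and are not taken from the OpenClaude codebase.

```typescript
// Hypothetical types for illustration; not OpenClaude's actual API.
interface AgentSpec {
  name: string;
  provider: string;
  model: string;
  apiKey?: string; // explicit per-agent key, highest priority
}

interface ResolvedCredential {
  provider: string;
  model: string;
  apiKey: string;
  source: "agent" | "env";
}

// Assumed convention: one variable per provider, e.g. DEEPSEEK_API_KEY.
function envVarFor(provider: string): string {
  return `${provider.toUpperCase().replace(/[^A-Z0-9]/g, "_")}_API_KEY`;
}

function resolveCredential(
  agent: AgentSpec,
  env: Record<string, string | undefined>,
): ResolvedCredential {
  // 1. A key set on the agent itself wins.
  if (agent.apiKey) {
    return { provider: agent.provider, model: agent.model, apiKey: agent.apiKey, source: "agent" };
  }
  // 2. Otherwise use the variable that belongs to this agent's provider.
  const fromEnv = env[envVarFor(agent.provider)];
  if (fromEnv) {
    return { provider: agent.provider, model: agent.model, apiKey: fromEnv, source: "env" };
  }
  // 3. Never borrow another provider's key; fail with a clear message instead.
  throw new Error(`No credential for provider "${agent.provider}" (agent "${agent.name}")`);
}
```

With this shape, Agent A (DeepSeek) and Agent B (Gemini) each resolve their own key in the same session, and a missing key surfaces as a configuration error rather than as a 401 from the wrong provider.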

Proposal 2: Virtual cd and Spawn CWD Mismatch

When cwd is not specified in spawn calls, the startup directory is used by default. This causes issues not only when using cd, but also in scenarios such as installers using spawn commands and remote connections via gRPC — even without any directory change. These inconsistencies should be resolved.

# cd to /tmp/hotfix for emergency fixes
# AI runs "lint check" after editing
# → toolHooks auto-fix runs lint in startup cwd (/home/user/main-project)
# → lint errors in the hotfix project are missed

# Start in /home/user/project
# Ask AI: "search for xyz in /tmp/other-project"
# AI: !cd /tmp/other-project → git grep xyz
# → ripgrep searches in startup cwd (/home/user/project)
# → incorrectly reports "Not found"
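
The default behaviour described above can be checked with a small Node script. In Node's `child_process`, a child started without a `cwd` option inherits the parent's working directory, so the fix amounts to always passing `cwd` explicitly. The helper name `childCwd` and the temporary directory are made up for this demonstration; this is a sketch and not OpenClaude's spawn code.

```typescript
import { spawnSync } from "node:child_process";
import { mkdtempSync, realpathSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Ask a child Node process to report its own working directory.
function childCwd(cwd?: string): string {
  const result = spawnSync(
    process.execPath,
    ["-e", "process.stdout.write(process.cwd())"],
    { cwd, encoding: "utf8" },
  );
  if (result.status !== 0) throw new Error(`child failed: ${result.stderr}`);
  // Normalize symlinked temp directories (e.g. /tmp on macOS).
  return realpathSync(result.stdout);
}

const startupDir = realpathSync(process.cwd());
const otherDir = realpathSync(mkdtempSync(join(tmpdir(), "spawn-cwd-")));

const withoutCwd = childCwd();       // no cwd option: child runs in the startup directory
const withCwd = childCwd(otherDir);  // explicit cwd: child runs where it was told to

console.log({ startupDir, withoutCwd, withCwd });
```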

Proposal 3: Path-Dependent Session Resume

If a directory is renamed or accessed via a symlink, the session key (cwd hash) changes, making it impossible to resume previous sessions. This stems from how cwd is fundamentally handled — P2 and P3 are complementary fixes.

# Start a session in /tmp/my-project, have a conversation, exit (session saved)
mv /tmp/my-project /tmp/my-project-renamed
# Resume:
openclaude --resume
# "No sessions found for this project"
# Session key is a hash of the absolute path, so path change causes mismatch
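
The mismatch is easy to reproduce in isolation. The function below is a hypothetical stand-in for the current key derivation (the real one may sanitize or hash differently); it only assumes that the key is a pure function of the absolute path string.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in: session key derived only from the absolute path.
function pathSessionKey(absolutePath: string): string {
  return createHash("sha256").update(absolutePath).digest("hex").slice(0, 16);
}

const keyBefore = pathSessionKey("/tmp/my-project");
const keyAfter = pathSessionKey("/tmp/my-project-renamed");

// The renamed directory yields a different key, so the saved session is not found.
console.log(keyBefore === keyAfter); // false
```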

1. Provider/Credential Layer (Class Diagram)

classDiagram
 note "【Current Architecture】\nTight coupling with Single Provider & Single API Key"
 class ProviderConfig {
 +string activeProfile
 +string apiKey
 +getActiveConfig()
 }
 note for ProviderConfig "- Strictly expects OPENAI_API_KEY as final fallback\n- Incapable of maintaining multiple concurrent provider sessions"
 %% ==========================================
 note "【Proposed Target Architecture】\nSeparation of Registry & Credentials (Loose Coupling)"
 class CredentialStore {
 <<interface>>
 +get(providerId: string) Credential
 +set(providerId: string, cred: Credential) void
 }
 
 class ProviderRegistry {
 <<interface>>
 +Map~string, ProviderConfig~ providers
 +resolve(model: string) ProviderConfig
 }
 class Credential {
 +string apiKey
 +string baseUrl
 +string authType
 }
 class TargetProviderConfig {
 +string providerId
 +string defaultModel
 +validate() bool
 }
 ProviderRegistry "1" *-- "many" TargetProviderConfig : Manages
 ProviderRegistry ..> CredentialStore : Resolves credentials
 TargetProviderConfig ..> Credential : Applies

Figure 1 (Class Diagram) Description:
Splits the rigid legacy configuration into an extensible ProviderRegistry (which resolves target model profiles at runtime) and a pluggable CredentialStore (which encapsulates API keys and custom base URLs), so that multiple providers can be used at the same time.
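
A minimal in-memory sketch of the two interfaces in Figure 1 might look like the following. The class diagram does not say how `resolve(model)` maps a model to a provider, so the `"<providerId>/<model>"` naming used here is an assumption, as are `InMemoryCredentialStore`, `register`, and the combined return value.

```typescript
interface Credential {
  apiKey: string;
  baseUrl: string;
  authType: string;
}

interface TargetProviderConfig {
  providerId: string;
  defaultModel: string;
}

// In-memory stand-in for the CredentialStore interface in the diagram.
class InMemoryCredentialStore {
  private readonly creds = new Map<string, Credential>();

  get(providerId: string): Credential {
    const cred = this.creds.get(providerId);
    if (!cred) throw new Error(`No credential stored for "${providerId}"`);
    return cred;
  }

  set(providerId: string, cred: Credential): void {
    this.creds.set(providerId, cred);
  }
}

class ProviderRegistry {
  private readonly providers = new Map<string, TargetProviderConfig>();
  private readonly store: InMemoryCredentialStore;

  constructor(store: InMemoryCredentialStore) {
    this.store = store;
  }

  register(config: TargetProviderConfig): void {
    this.providers.set(config.providerId, config);
  }

  // Assumed model naming: "<providerId>/<model>".
  resolve(model: string): { config: TargetProviderConfig; credential: Credential } {
    const slash = model.indexOf("/");
    const providerId = slash === -1 ? model : model.slice(0, slash);
    const config = this.providers.get(providerId);
    if (!config) throw new Error(`Unknown provider "${providerId}"`);
    return { config, credential: this.store.get(providerId) };
  }
}
```

Because nothing in the registry is marked "active", two agents can resolve different providers in the same process without overwriting each other's state.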


2. Shell Execution & Context Management (Sequence Diagram)

sequenceDiagram
 autonumber
 
 box RGBA(255, 0, 0, 0.1) Current Broken Flow (Defective process.cwd Dependency)
 actor User as User Agent
 participant Shell as Shell.ts (AsyncLocal)
 participant Hook as toolHooks.ts
 participant Spawn as bashProvider.ts (spawn)
 end
 User->>Shell: cd subdir
 Note over Shell: Only mutates AsyncLocalStorage state override.<br/>The actual OS-level process.cwd() remains unmutated.
 Shell-->>User: Command execution ack
 
 User->>Spawn: spawn("ls")
 Spawn->>Hook: Fetch execution context (CWD fallback)
 Hook-->>Spawn: Returns unmutated process.cwd() (Root directory)
 Note over Spawn: Executes command inside original root directory,<br/>violating user's mental model and context.
 Spawn-->>User: Returns file list of root instead of subdir (Critical Bug)
 %% ==========================================
 box RGBA(0, 255, 0, 0.1) Proposed Flow (Stateful VirtualCWD Synchronization)
 actor User2 as User Agent
 participant Session as SessionManager
 participant Spawn2 as bashProvider.ts (spawn)
 participant Node as OS Process Base
 end
 User2->>Session: cd subdir
 Note over Session: Directly mutates stateful session.virtualCwd
 Session->>Node: Synchronizes environment via process.chdir(subdir)
 Node-->>Session: Execution ack
 Session-->>User2: Command execution ack
 User2->>Spawn2: spawn("ls")
 Note over Spawn2: Explicitly injects session.virtualCwd<br/>into target options.cwd allocation
 Spawn2->>Node: Spawns sub-process inside resolved target subdir
 Node-->>Spawn2: Expected execution stdout/stderr
 Spawn2-->>User2: Returns correct subdir file list (Expected Behavior)

Figure 2 (Sequence Diagram) Description:
Replaces the pseudo-directory tracking built on AsyncLocalStorage. The updated architecture tracks the working directory explicitly as virtualCwd at the core session layer, passes it into the shell/spawn options as cwd, and keeps the OS-level directory in sync via process.chdir().
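
The session-level part of this flow can be sketched as a small class that owns `virtualCwd` and hands it to every spawn call. `ShellSession` and `spawnOptions` are hypothetical names. The sketch leaves out the `process.chdir()` synchronization shown in the diagram, because `process.chdir()` changes the directory for the whole Node process and the example is meant to show per-session state only.

```typescript
import { resolve } from "node:path";

class ShellSession {
  private virtualCwd: string;

  constructor(startDir: string) {
    this.virtualCwd = resolve(startDir);
  }

  // Update the session's directory; relative targets resolve against the
  // current virtualCwd, absolute targets replace it.
  cd(target: string): string {
    this.virtualCwd = resolve(this.virtualCwd, target);
    return this.virtualCwd;
  }

  // Options to merge into every spawn call so the child runs in virtualCwd.
  spawnOptions(): { cwd: string } {
    return { cwd: this.virtualCwd };
  }
}
```

A caller would then spawn with something like `spawn(cmd, args, { ...session.spawnOptions() })`, so no code path falls back to the startup directory by omission.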


3. Session Lifecycle & Persistence (State Transition Diagram)

stateDiagram-v2
 state "【Current Design】\nAbsolute Path String Hash Dependency (Brittle)" as CurrentSession {
 [*] --> Init : process.cwd() = /home/user/project
 Init --> Transform : Execution of sanitizePath()
 Transform --> KeyGeneration : Yields rigid key "home-user-project-v1-hash"
 KeyGeneration --> HardcodedBinding : Persisted directly under ~/.openclaude/projects/
 HardcodedBinding --> EnvironmentShift : External mv / symlink / bind mount action
 EnvironmentShift --> HistoricalDataLoss : Path string deviation causes complete failure to resolve legacy session
 }
 %% ==========================================
 state "【Proposed Target Design】\n3-Tier Identification Topology (Highly Portable)" as ProposedSession {
 [*] --> Resolution : Triggered via specific sessionId or Recursive Git Root Search
 Resolution --> StructuralIdentity : Detects immutable "projectId" bound to Git Repository Root
 StructuralIdentity --> ContextSync : Fully restores agent execution context using sessionId metadata index
 ContextSync --> WorkspaceNavigation : Local path modifications strictly update "virtualCwd" state metadata only
 WorkspaceNavigation --> ContinuousPersistence : Seamlessly persists session states across path/machine migrations via stable IDs
 }

Figure 3 (State Transition Diagram) Description:
Shifts the state lookup engine away from absolute file-path strings. By adopting a resilient 3-tier indexing model—sessionId (immutable primary key), projectId (stable Git root hash), and virtualCwd (mutable relative offset)—the session layer becomes completely decoupled from underlying directory mutations.
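
One way to picture the three-tier model is a session record that stores the working directory as an offset from the project root instead of as an absolute path. In the sketch below, `SessionIndex`, `restoreCwd`, and the record layout are hypothetical, and `projectId` is treated as an opaque string supplied by the caller, since the RFC does not specify how it is derived.

```typescript
import { randomUUID } from "node:crypto";
import { join, relative } from "node:path";

interface SessionRecord {
  sessionId: string;  // immutable primary key
  projectId: string;  // stable project identity (derivation left open here)
  virtualCwd: string; // offset relative to the project root
}

class SessionIndex {
  private readonly bySession = new Map<string, SessionRecord>();

  create(projectId: string, projectRoot: string, cwd: string): SessionRecord {
    const record: SessionRecord = {
      sessionId: randomUUID(),
      projectId,
      virtualCwd: relative(projectRoot, cwd),
    };
    this.bySession.set(record.sessionId, record);
    return record;
  }

  // Lookup by project identity, not by path string.
  findByProject(projectId: string): SessionRecord[] {
    return Array.from(this.bySession.values()).filter((r) => r.projectId === projectId);
  }
}

// Rebuild the absolute working directory from wherever the project lives now.
function restoreCwd(record: SessionRecord, currentProjectRoot: string): string {
  return join(currentProjectRoot, record.virtualCwd);
}
```

After `mv /tmp/my-project /tmp/my-project-renamed`, the record is still found through `projectId`, and `restoreCwd` yields a path under the new root.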


4. Blast Radius Matrix (Flowchart)

flowchart LR
 subgraph BlastRadius [Blast Radius Matrix / Systemic Dependencies]
 direction TB
 
 P1["Proposal 1: Legacy Provider Management<br/>(Single API Key Constraint)"]
 P2["Proposal 2: Flawed CWD Virtualization<br/>(Pseudo 'cd' Layer)"]
 P3["Proposal 3: Brittle Absolute Path Dependency<br/>(Fragile Resume Index)"]
 
 config["config/profile<br/>(Profile Management)"]
 API["API Client Layer"]
 shell["shell/spawn<br/>(Process Orchestration)"]
 fileOps["File System Operations"]
 session["Session Persistence<br/>(Storage Engine)"]
 resume["Session Resume Trigger"]
 vscode["VSCode Extension Context"]
 
 P1 --> config
 P1 --> API
 
 P2 --> shell
 P2 --> fileOps
 
 P3 --> session
 P3 --> resume
 P3 --> vscode
 end
 
 style P1 fill:#ff9999,stroke:#333,stroke-width:2px,color:#000
 style P2 fill:#ff9999,stroke:#333,stroke-width:2px,color:#000
 style P3 fill:#ff9999,stroke:#333,stroke-width:2px,color:#000

Figure 4 (Blast Radius Flowchart) Description:
Maps each of the three architectural issues to the sub-modules it affects. Because the legacy design is tightly coupled, a change in any one area touches several core execution layers at once.


Summary

| Proposal | Root Cause | Key Files | Impact |
| --- | --- | --- | --- |
| P1 | Cannot hold multiple credentials simultaneously in one process | providerProfiles.ts, providerConfig.ts | Multi-agent scenarios require multiple credential sets to coexist (extension of the existing WebSearch API pattern) |
| P2 | process.cwd() is not reflected in spawn targets | cwd.ts, Shell.ts | Spawned commands execute in the wrong directory |
| P3 | Session key = hash of cwd path | sessionStoragePortable.ts | Resume fails after rename / symlink changes |

Replies: 1 comment 2 replies


This is neat, but your assumption about single-key providers is VERY flawed. It's only that the OpenAI env variable is reused across providers. It's possible to assign a key directly, without using that variable, based on the provider in question.
Also, if you use /provider inside the CLI, you can configure as many providers as you would like.

I don't understand the request to use cd. You're in a prompt, not your shell anymore.

The session rename issue is a concern.


@jatmn
Thank you for your comment.
Actually, this RFC was drafted with the help of an AI translator.
I didn't mean to use the word "Defect"; in my mind, I was thinking of it as "Phase 1 / Step 1" of the proposal. My bad for not catching that translation quirk before posting.

Regarding the /provider spec, you're absolutely right. The way I initially wrote it completely misrepresents the actual behavior, so let me follow up with a separate comment tomorrow or so to clarify what I really meant.


@jatmn
My vision was always a multi-agent orchestration platform — that's why the title hasn't changed. Proposal 2 covers more than just the cd issue. Please take a look when you have time. Thanks!
