A simple and blazing-fast LiteLLM-compatible AI gateway for coding agents (Claude Code, Codex, Hermes, etc.)
lite claude
The first run prompts for your LiteLLM URL and API key, saves them to
~/.config/lite/claude.env, and starts Claude Code with:
ANTHROPIC_BASE_URL="https://your-litellm-rust-server.com" ANTHROPIC_AUTH_TOKEN="$LITELLM_API_KEY"
Arguments after lite claude are forwarded to Claude Code:
lite claude --help
lite claude --model claude-sonnet-4-5
Run lite claude --reset to ignore saved settings and enter them again.
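The launch flow above can be sketched roughly as follows. This is an illustration only, not the project's actual code: it assumes the saved file holds plain KEY=VALUE lines, and the key name LITELLM_BASE_URL and the helper names are hypothetical.

```rust
use std::collections::HashMap;
use std::process::Command;

/// Parse simple KEY=VALUE lines, skipping blanks and `#` comments.
/// The file format is an assumption made for this sketch.
fn parse_env_file(contents: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in contents.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            // Drop optional surrounding double quotes from the value.
            let value = value.trim().trim_matches('"');
            vars.insert(key.trim().to_string(), value.to_string());
        }
    }
    vars
}

/// Build (but do not spawn) the `claude` command with the gateway settings
/// in its environment and the user's extra arguments forwarded unchanged.
/// Returns None if either saved setting is missing.
fn build_claude_command(
    vars: &HashMap<String, String>,
    forwarded: &[String],
) -> Option<Command> {
    let base_url = vars.get("LITELLM_BASE_URL")?; // hypothetical key name
    let api_key = vars.get("LITELLM_API_KEY")?;
    let mut cmd = Command::new("claude");
    cmd.env("ANTHROPIC_BASE_URL", base_url)
        .env("ANTHROPIC_AUTH_TOKEN", api_key)
        .args(forwarded);
    Some(cmd)
}
```

A caller would read ~/.config/lite/claude.env, pass its contents to parse_env_file, and call status() on the returned command. A None result would be the point to re-run the prompts.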
lite codex
The first run prompts for your LiteLLM URL and API key, saves them to
~/.config/lite/codex.env, and starts Codex pointed at the gateway. Codex uses
the OpenAI Responses API (SSE over HTTP — no WebSocket), so requests land on
POST /v1/responses. The wizard injects the gateway via -c config overrides
and never edits your ~/.codex/config.toml:
codex \
  -c model_provider="litellm" \
  -c model_providers.litellm.base_url="https://your-litellm-rust-server.com/v1" \
  -c model_providers.litellm.wire_api="responses" \
  -c model_providers.litellm.env_key="LITELLM_API_KEY"
# LITELLM_API_KEY is exported from your saved key
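As a rough sketch of how those overrides might be assembled, the function below turns a gateway URL into the -c argument list shown above. The function name is hypothetical, and trimming a trailing slash before appending /v1 is an assumption rather than documented behavior.

```rust
/// Build the `-c key=value` override arguments for Codex.
/// `gateway_url` is the saved LiteLLM URL without the `/v1` suffix.
fn codex_override_args(gateway_url: &str) -> Vec<String> {
    // Assumption: tolerate a trailing slash so the result never contains `//v1`.
    let base_url = format!("{}/v1", gateway_url.trim_end_matches('/'));
    let overrides = vec![
        "model_provider=\"litellm\"".to_string(),
        format!("model_providers.litellm.base_url=\"{}\"", base_url),
        "model_providers.litellm.wire_api=\"responses\"".to_string(),
        "model_providers.litellm.env_key=\"LITELLM_API_KEY\"".to_string(),
    ];
    // Each override is passed as its own `-c <value>` pair.
    let mut args = Vec::with_capacity(overrides.len() * 2);
    for kv in overrides {
        args.push("-c".to_string());
        args.push(kv);
    }
    args
}
```

Passing the overrides as separate arguments through std::process::Command, rather than through a shell string, means the embedded double quotes reach Codex as written and need no extra escaping.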
Arguments after lite codex are forwarded to Codex:
lite codex exec "fix the failing test"
lite codex -m gpt-5.5
Run lite codex --reset to ignore saved settings and enter them again.
The gateway needs an OpenAI model route in its config:
model_list:
  - model_name: openai/*
    litellm_params:
      model: openai/*
      api_key: os.environ/OPENAI_API_KEY
      api_base: https://api.openai.com
Codex Mac app: the desktop app reads ~/.codex/config.toml, so route it by
adding a provider block there (same fields the wizard passes), then select it in
the app:
model_provider = "litellm"

[model_providers.litellm]
name = "LiteLLM"
base_url = "https://your-litellm-rust-server.com/v1"
wire_api = "responses"
env_key = "LITELLM_API_KEY"
Installing/updating the CLI: run cargo install --path . --force so the lite on your PATH includes the codex subcommand (a stale install errors with unrecognized subcommand 'codex').
litellm-rust is compatible with your existing litellm config.yaml and DB.
model_list:
  - model_name: anthropic/*
    litellm_params:
      model: anthropic/*
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: os.environ/MASTER_KEY
  sandbox_choice: "e2b" # can be either "e2b" or "daytona"
  e2b_sandbox_params:
    e2b_api_key: os.environ/E2B_API_KEY
    e2b_template: "litellm-4gb"
$ litellm-rust --config /app/config.yaml
POST /messages
POST /responses
POST /realtime
POST /audio
- OpenAI
- Azure OpenAI
- VertexAI
- Bedrock
Entry points and what runs at startup:
- src/main.rs: binary entry point. Parses CLI args, loads config.yaml, builds the HTTP client, calls model_prices::load(), then wires everything into AppState and starts the server.
- src/model_prices.rs: fetches the LiteLLM model cost/capability map from upstream at startup; falls back to the embedded model_prices_backup.json snapshot if the network is unavailable. Returns a ModelCostMap; main.rs stores it on AppState. Override the URL with LITELLM_MODEL_COST_MAP_URL.
- src/errors.rs: typed error enum. All error variants map to HTTP status + JSON body in one place.
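The fetch-then-fall-back behavior described for src/model_prices.rs could look roughly like the sketch below. It is not the project's code: the default URL is a placeholder, the embedded snapshot is reduced to an empty JSON object, the fetcher is passed in as a closure so the example needs no HTTP client, and every name except LITELLM_MODEL_COST_MAP_URL is hypothetical.

```rust
use std::env;

// Placeholder values for this sketch only; the real default URL and the real
// embedded snapshot (model_prices_backup.json) are not reproduced here.
const DEFAULT_COST_MAP_URL: &str = "https://example.invalid/model_prices.json";
const EMBEDDED_BACKUP: &str = "{}";

/// Where the loaded cost map came from.
#[derive(Debug, PartialEq)]
enum CostMapSource {
    Upstream,
    EmbeddedBackup,
}

/// Read the optional URL override from the environment.
fn url_override_from_env() -> Option<String> {
    env::var("LITELLM_MODEL_COST_MAP_URL").ok()
}

/// Try the upstream URL (or its override); on any fetch error,
/// fall back to the embedded snapshot instead of failing startup.
fn load_cost_map<F>(url_override: Option<String>, fetch: F) -> (String, CostMapSource)
where
    F: Fn(&str) -> Result<String, String>,
{
    let url = url_override.unwrap_or_else(|| DEFAULT_COST_MAP_URL.to_string());
    match fetch(&url) {
        Ok(body) => (body, CostMapSource::Upstream),
        Err(_) => (EMBEDDED_BACKUP.to_string(), CostMapSource::EmbeddedBackup),
    }
}
```

At startup this would be called as load_cost_map(url_override_from_env(), real_fetcher), where real_fetcher stands for whatever HTTP call the binary actually makes. Taking the fetcher as a parameter also lets the fallback path be tested without a network.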
Subsystems:
- src/http/: HTTP layer only. Route registration, auth, body extraction, response shaping. No business logic.
- src/providers/: provider registry, per-provider request/response transformation, model router (maps model name → deployment + handler).
- src/proxy/: config loading, master-key auth, AppState.
- src/cli/: lite claude wizard: credential storage, model selector, Claude Code launcher.
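To make the model router's job concrete, here is a minimal sketch of resolving a requested model name against model_list entries such as anthropic/*. It assumes that a trailing * is a prefix wildcard and that the first matching entry wins. Neither rule is confirmed by this README, and the Route type and resolve function are hypothetical.

```rust
/// One model_list entry: the public pattern and the provider-side target.
#[derive(Debug, Clone, PartialEq)]
struct Route {
    pattern: String, // e.g. "anthropic/*" (model_name)
    target: String,  // e.g. "anthropic/*" (litellm_params.model)
}

/// Resolve a requested model name to a provider-side model name.
/// Entries are tried in order; the first match wins (an assumption).
fn resolve(routes: &[Route], requested: &str) -> Option<String> {
    for route in routes {
        // An exact pattern match returns the target unchanged.
        if route.pattern == requested {
            return Some(route.target.clone());
        }
        // A trailing `*` is treated as a prefix wildcard.
        if let Some(prefix) = route.pattern.strip_suffix('*') {
            if let Some(rest) = requested.strip_prefix(prefix) {
                // If the target also ends in `*`, substitute the matched tail.
                return Some(match route.target.strip_suffix('*') {
                    Some(target_prefix) => format!("{}{}", target_prefix, rest),
                    None => route.target.clone(),
                });
            }
        }
    }
    None
}
```

With a single anthropic/* entry, a request for anthropic/claude-sonnet-4-5 resolves to the same name, while a request for a model under an unconfigured prefix yields None, which the HTTP layer would presumably turn into an error response.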
See CODING_STANDARDS.md.