Configuration
blumi (a single-binary coding agent for macOS and Linux) is configured by one JSON file:
~/.blumi/settings.json. Everything blumi does — provider, model, permissions, memory, the
gateway, the grid — is driven from that file, and every section is optional because blumi ships
with sensible defaults. An empty file is valid, and for most people a single provider key (set via
blumi login) is all that's needed to start. This page is the complete reference: every section,
its fields, its defaults, and copy-pasteable examples.
- Where config lives: `~/.blumi/settings.json` (global). Per-project overrides go in `./.blumi/settings.json`.
- File format: plain JSON, written with file mode `0600`; it holds secrets, so never commit it.
- Fastest setup: run `blumi login`, a wizard that picks a provider, takes a key, chooses a model, and writes the right fields.
- Everything is optional: blumi has built-in defaults, so an empty `settings.json` is valid.
- Load order (later wins): built-in defaults → `~/.blumi/settings.json` → `./.blumi/settings.json` → `BLUMI_`-prefixed environment variables → per-invocation flags.
- Override per run: flags like `--provider`, `--model`, `--persona`, and `--yolo` beat the file.
- Apply changes live: ask the agent to `reload_self` (or use the web/phone "reload"); process-level settings need a restart.
The annotated JSON blocks below use `//` comments for clarity. Strip them if your editor is strict: `settings.json` is plain JSON.
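If you paste an annotated block into a strict parser, the comments have to go first. The following is a minimal Python sketch of one way to do that; it is not part of blumi. `strip_line_comments` is a hypothetical helper, and it assumes only `//` line comments are used (no `/* */` blocks). It tracks string literals so that `//` inside a value such as an `https://` URL is left alone.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that sit outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False          # this char was escaped; consume it
            elif ch == "\\":
                escaped = True           # next char is escaped
            elif ch == '"':
                in_string = False        # closing quote
            i += 1
        elif ch == '"':
            in_string = True             # opening quote
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


annotated = """{
  "llm": { "provider": "anthropic" },  // which entry in "providers"
  "providers": {
    "openai": { "base_url": "https://api.openai.com/v1" }
  }
}"""

settings = json.loads(strip_line_comments(annotated))
print(settings["providers"]["openai"]["base_url"])
```

The URL survives because the scanner only looks for `//` while it is outside a string.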
blumi merges configuration from four layers in order, where later layers win, using the figment library:
- Built-in defaults
- `~/.blumi/settings.json`: global, your main file
- `./.blumi/settings.json`: per-project overrides (commit-safe project tweaks; secrets stay global)
- Environment variables: prefix `BLUMI_`, nest with `__` (e.g. `BLUMI_LLM__MODEL=claude-opus-4-5`, `BLUMI_GRID__SECRET=...`, `BLUMI_PROVIDERS__OPENAI__API_KEY=...`)
settings.json is written mode 0600 and holds secrets — never commit it (the repo's .blumi/
is gitignored). Per-invocation flags (--provider, --model, --persona, --yolo) beat all of the above.
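To make the "later wins" rule concrete, here is a small Python sketch of layered merging. It illustrates the behaviour described above and is not blumi's actual implementation (which uses figment). It assumes nested sections merge key by key, that environment names are lower-cased after the `BLUMI_` prefix is removed, and that values stay strings. `deep_merge`, `env_layer`, and the sample values are hypothetical.

```python
from typing import Any, Dict, Mapping


def deep_merge(base: Dict[str, Any], over: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `over` applied; nested dicts merge key by key."""
    merged = dict(base)
    for key, value in over.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def env_layer(environ: Mapping[str, str], prefix: str = "BLUMI_") -> Dict[str, Any]:
    """Turn BLUMI_LLM__MODEL=x into {"llm": {"model": "x"}} (assumed mapping)."""
    layer: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = layer
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return layer


# Hypothetical layers, lowest priority first.
defaults = {"llm": {"provider": "anthropic", "model": "", "max_iterations": 25}}
global_file = {"llm": {"model": "claude-sonnet-4-5"}}
project_file = {"llm": {"max_iterations": 40}}
environ = {"BLUMI_LLM__MODEL": "claude-opus-4-5", "HOME": "/home/me"}

config = defaults
for layer in (global_file, project_file, env_layer(environ)):
    config = deep_merge(config, layer)

print(config["llm"])
```

Here the environment variable overrides the model from the global file, while the per-project `max_iterations` and the default provider are kept. Per-invocation flags would be applied as one more layer on top.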
The fastest way to configure blumi is the interactive login wizard, which writes settings.json
for you:
```sh
blumi login
```
Pick a provider, paste a key (or endpoint), choose a model — it writes the right bits into
settings.json. Re-run any time to switch. That's all most people need; the rest of this page is for
tuning.
How you apply a change depends on the setting: most are hot-reloaded, but a few process-level settings need a restart.
- TUI / gateway session: ask the agent to `reload_self`, or use the web/phone "reload"; it re-reads `settings.json` without losing the conversation.
- Process-level settings (bind host/port, the web password, the grid identity) are read once at startup, so they need a restart: `launchctl kickstart -k gui/$(id -u)/com.blumi.serve` (macOS) / `systemctl --user restart blumi-serve` (Linux), or just relaunch `blumi tui`.
blumi is provider-agnostic, so you configure a provider by adding an entry under providers in
settings.json (or, more easily, by running blumi login). Built-in presets exist for the common
providers so you usually set only a key. The presets, keyed by name in providers, are:

| Preset name | `kind` | Notes |
|---|---|---|
| `anthropic` | `anthropic` | Claude; API-key auth only |
| `openai` | `openai_compat` | OpenAI and any OpenAI-compatible endpoint (set `base_url`) |
| `gemini` | `gemini` | Google Gemini (native client) |
| `azure` | `anthropic_foundry` | Azure AI Foundry (Anthropic models) |
| `local` | `openai_compat` | a local server (llama.cpp / Ollama-compatible); no key |
| `mock` | - | deterministic, for tests/demos |

Each provider entry:
```json
"providers": {
  "anthropic": { "api_key_env": "ANTHROPIC_API_KEY" },   // read key from env (preferred)
  "openai":    { "api_key": "sk-...", "base_url": "https://api.openai.com/v1" },
  "local":     { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
}
```
- `api_key_env` (read the key from an env var) is preferred over a literal `api_key` so the key never sits in the file.
- `base_url` points OpenAI-compatible clients at any endpoint (Ollama, llama.cpp, OpenRouter, ...).
- `kind` is only needed for a fully custom provider name; the presets above set it for you.
The `llm` section selects which provider and model are active and sets the per-turn limits. To
change the model, set `llm.model` (or run `blumi login`, or pass `--model` for a single run):
```json
"llm": {
  "provider": "anthropic",             // which entry in "providers"
  "model": "claude-sonnet-4-5",        // "" = let the provider pick/probe
  "context_size": 131072,
  "max_output_tokens": 16384,
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 20,
  "max_iterations": 25,                // tool steps allowed per turn
  "max_auto_continue": 12,             // self-continue rounds when a turn hits the step cap (0 = off)
  "max_auto_continue_tokens": 400000,  // token ceiling for one auto-continue run (0 = no cap)
  "wake_on_rollover": true,            // a context rollover (compaction) refreshes the token budget so a long task keeps going
  "max_local_agents": 4                // max concurrent local sub-agents (overflow → grid or queue)
}
```
Override per run without editing: blumi --provider openai --model gpt-4o run "...".
The `permissions` section defines what the agent may do unattended, via per-tool `allow` / `deny` /
`ask` pattern lists:
```json
"permissions": {
  "yolo": false,   // true = auto-approve EVERYTHING (use only sandboxed)
  "tools": {
    "Bash":      { "deny": ["rm -rf*", "sudo*"], "ask": ["git push*"] },
    "FileWrite": { "allow": ["src/**"] }
  }
}
```
Approval is interactive by default (mutating tools surface an approval card); toggle YOLO live with
`ctrl+y`, `/yolo`, or `--yolo`. For real guardrails, pair with `hooks.pre_tool_use`.
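The page does not spell out how the three lists interact, so the Python sketch below is only one plausible reading, not blumi's documented behaviour. It assumes shell-style glob matching (via `fnmatch`), that `deny` beats `ask` and `ask` beats `allow`, and that anything unmatched falls back to asking. `decide` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

Rules = Dict[str, List[str]]


def decide(rules: Rules, target: str) -> str:
    """Return "deny", "ask", or "allow" for one tool invocation.

    Assumed precedence: deny > ask > allow; unmatched targets default to "ask".
    """
    for verdict in ("deny", "ask", "allow"):
        if any(fnmatchcase(target, pattern) for pattern in rules.get(verdict, [])):
            return verdict
    return "ask"


# Same rules as the permissions example above.
tools = {
    "Bash": {"deny": ["rm -rf*", "sudo*"], "ask": ["git push*"]},
    "FileWrite": {"allow": ["src/**"]},
}

for tool, target in [
    ("Bash", "rm -rf /tmp/build"),
    ("Bash", "git push origin main"),
    ("FileWrite", "src/lib/main.rs"),
    ("FileWrite", "README.md"),
]:
    print(tool, repr(target), "->", decide(tools[tool], target))
```

Walking through your own patterns this way is a quick check that a `deny` entry really covers the commands you have in mind before relying on it.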
The `persona` field selects the active agent style, and `personas` defines custom ones (each can
override the model and temperature):
```json
"persona": "default",   // active persona (built-in or custom)
"personas": {
  "reviewer": {
    "description": "Careful code reviewer",
    "instructions": "Be terse. Prioritise correctness + security. Propose diffs.",
    "model": "claude-opus-4-5",   // optional: switch model when active
    "temperature": 0.2            // optional override
  }
}
```
Built-ins include architect, pair, reviewer, team. Switch with --persona <name> or /persona.
The `executor` section chooses where tool calls execute: the local host, a Docker sandbox, or a
remote machine over SSH:
```json
"executor": {
  "backend": "local",                    // "local" (host) | "docker" (sandbox) | "ssh" (remote)
  "docker_image": "debian:stable-slim",
  "ssh_host": "user@box",                // for backend = "ssh"
  "ssh_workdir": "/home/user/proj"
}
```
Use docker (or ssh to a throwaway box) when you want YOLO/gateway autonomy without risking the host.
The `brain` section configures a cheap local model that vets approval prompts so the flagship model
is not interrupted:
```json
"brain": {
  "mode": "off",          // "off" | "advisory" (annotate) | "auto" (decide)
  "provider": "local",    // "" = reuse the main agent's provider
  "model": "qwen2.5:3b"   // a small/cheap/local model is ideal here
}
```
The brain escalates to you on uncertainty or danger.
The `router` section enables cost-aware routing that, per turn, picks a light model versus the
flagship to cut cost. It is off by default:
```json
"router": {
  "mode": "off",                                              // "off" | "heuristic" | "hybrid" | "judge"
  "light": { "provider": "", "model": "claude-haiku-4-5" },
  "heavy": { "provider": "", "model": "claude-opus-4-5" },
  "judge": { "provider": "", "model": "" },                   // "" = reuse brain.* then llm.*
  "subagent_tier": "light",                                   // "light" | "heavy" | "inherit"
  "prefer_grid_light": false,                                 // run the light tier on a grid peer's local model (free)
  "heuristics": {
    "heavy_chars": 1500,
    "light_chars": 280,
    "heavy_tool_count": 12,
    "escalate_iteration": 6,
    "heavy_keywords": [],
    "light_keywords": []
  }
}
```
An empty tier `provider` or `model` reuses the active `llm.*` value. See Self-Management → Cost-aware routing.
The `heal` section controls self-healing: blumi auto-recovers failed tool calls, learns fixes, and
mines recurring ones into recovery skills. It is on by default:
```json
"heal": {
  "enabled": true,
  "recovery_budget": 2,   // recovery attempts per turn
  "verify": false,        // only mark a fix "verified" if a later step succeeds
  "learn": true,          // store failure→fix episodes in memory
  "evolve": "auto",       // "auto" | "propose" | "off" (mine recurring fixes → skills)
  "redact_paths": true
}
```
See Self-Management → Self-healing.
The `embeddings`, `memory`, and `knowledge` sections power recall, semantic memory, and code search,
and all three are on by default. The bundled local embedder downloads a ~90 MB model on first
use, then works fully offline (RAG, or retrieval-augmented generation, runs entirely on your machine).
See Memory & Knowledge.
```json
"embeddings": {
  "enabled": true,
  "backend": "local",             // "local" (bundled ONNX) | "openai" (endpoint) | "grid" (peer)
  "provider": "",                 // for backend = "openai": a name from "providers"
  "model": "bge-small-en-v1.5",
  "dim": 384
},
"memory": {
  "enabled": true,
  "recall_k": 5,                  // memories injected per turn (RAG)
  "dedup_threshold": 0.92,        // admission gate: near-duplicates are merged
  "max_per_namespace": 2000,
  "diffuse": true,                // share non-`user` learnings across the grid
  "sweep_secs": 60,               // governance cadence (eviction/consolidation)
  "resolve_conflicts": false,     // opt-in: LLM supersedes contradictory memories in the sweep
  "retrospect": true,             // daily: replay each session's new transcript → consolidate learnings
  "retrospect_hours": 24          // min hours between retrospection passes
},
"knowledge": {
  "enabled": true,
  "max_file_kb": 256,             // skip files larger than this when indexing
  "exclude": ["target", "node_modules", ".git", "dist"],
  "graph": {
    "mode": "lite",               // off | lite | structural (structural needs the `code-graph` build feature)
    "resolve_imports": true,      // resolve use/import when building structural edges
    "max_edges_per_symbol": 64,   // fan-out cap (0 = uncapped)
    "rpl_impact": true            // feed a file's code-graph fan-in into the RPL blast radius
  }
}
```
- Semantic memory accrues automatically (failure→fix episodes, the agent's memory tool); browse it with `/memories` (TUI) or the web/phone Control Center (where you can pin/edit/delete). The `user` namespace never leaves your node. Tell the agent "remember that ..." to force-store; pin an entry to exempt it from eviction.
- Knowledge is empty until you index a repo: `blumi knowledge ingest .` then `blumi knowledge status` (or `/knowledge` in the TUI). It powers `code_search` / `code_retrieve`.
The `acceleration` section picks the hardware execution provider for the bundled embedder (CPU,
Apple CoreML, or CUDA):
```json
"acceleration": {
  "mode": "auto",              // "auto" | "cpu" | "apple" (CoreML) | "cuda"
  "embeddings_accel": "auto"   // override just the bundled embedder
}
```
blumi accel doctor shows what was detected. See
Memory & Knowledge → GPU acceleration.
The `web` section holds the authentication hash for the gateway and browser UI. Set it with
`blumi serve pair --password <pw>` rather than editing the hash by hand:
```json
"web": { "password_hash": "" }   // set via `blumi serve pair --password <pw>` (argon2; never plaintext)
```
A password is required when binding to a non-loopback address. The same server serves the browser UI and the blugo phone app. See Gateway.
The `voice` section configures speech-to-text (STT) and text-to-speech (TTS); it is disabled until
you set `enabled: true` and supply a key:
```json
"voice": {
  "enabled": false,
  "api_key": "",
  "stt_base_url": "https://api.openai.com/v1",
  "stt_model": "whisper-1",
  "tts_provider": "openai",   // "openai" | "elevenlabs"
  "tts_base_url": "",         // blank = provider default
  "tts_model": "tts-1",
  "tts_voice": "alloy",       // OpenAI voice name, or an ElevenLabs voice id
  "tts_api_key": ""           // separate TTS key (falls back to api_key)
}
```
See Voice.
The `gateway` section configures blumi as a chat bot across Telegram, Discord, Slack, and WhatsApp,
each keyed by its own tokens:
```json
"gateway": {
  "yolo": false,                 // auto-approve in bot sessions (only with a sandboxed executor!)
  "telegram": {
    "token": "123456:AA...",     // @BotFather
    "allowed_chats": [],         // [] = anyone who messages it; or [<your-chat-id>]
    "voice": false               // transcribe voice notes + speak replies (needs voice.* too)
  },
  "discord":  { "token": "", "allowed_channels": [] },
  "slack":    { "bot_token": "xoxb-...", "app_token": "xapp-..." },
  "whatsapp": { "token": "", "phone_number_id": "", "verify_token": "", "webhook_port": 8080 }
}
```
Run one transport in the foreground (blumi gateway telegram) or all configured ones as a service
(blumi gateway install). One bot per token — see Gateway → Messaging-bot gateway.
The `grid` section forms a distributed, multi-node grid: several `blumi serve` nodes that share a
secret discover each other and hand tasks to peers:
```json
"grid": {
  "enabled": false,
  "secret": "",                  // same value on every node = same grid (or BLUMI_GRID__SECRET)
  "grid_id": "",                 // blank = derived from the secret digest
  "node_name": "",               // blank = hostname
  "peers": ["10.0.0.150:7777"]   // static peers (in addition to mDNS auto-discovery)
}
```
The secret is never advertised (only a non-sensitive digest). See Grid.
The `remote` section lists remote blumi instances the TUI (terminal UI) can attach to as tabs, and
`workspaces` lists directories scanned for sibling git repos in the TUI sidebar:
```json
"remote": { "instances": [] },             // remote blumi instances the TUI can attach to as tabs
"workspaces": { "roots": ["~/code"] }      // dirs scanned for sibling git repos in the TUI sidebar
```
The `git` section sets the author identity stamped on commits the agent makes:
```json
"git": { "author_name": "Blumi", "author_email": "you@example.com" }
```
The identity is applied through `GIT_AUTHOR_*` / `GIT_COMMITTER_*`. An empty `author_name` disables
the override.
The `always_on` section lets the gateway run a read-only discovery pass when idle and propose
tasks plus a report. It is off by default:
```json
"always_on": {
  "enabled": false,
  "autonomy": "propose",        // "off" | "propose" (add tasks) | "auto" (reserved)
  "cadence_secs": 900,
  "min_interval_secs": 300,
  "skip_if_todos": 1,           // skip while the board already has todos
  "max_open_discoveries": 5,
  "max_per_pass": 3
}
```
See Self-Management → Always-on discovery.
The `notify` section pings you when `blumi loop` or a discovery pass finishes, via desktop, a chat
bot, or browser Web Push. It is off by default:
```json
"notify": {
  "enabled": false,
  "on": ["loop", "discovery"],                                // which completions fire (also "turn"); [] = loop+discovery
  "desktop": true,                                            // OS notification on the host
  "bot": { "transport": "telegram", "target": "<chat-id>" },  // proactive bot message
  "web_push": false                                           // browser Web Push (VAPID; secure-context only)
}
```
See Self-Management → Completion notifications.
Phone push uses Firebase Cloud Messaging (FCM) and is the one notification path that is not a
settings.json setting.
The blugo phone app's Dispatch feature
gets a push when a node finishes a turn. This is not a config-file setting — it's enabled by
file presence alone, so it's zero-config and independent of the notify block above (it never
adds desktop/bot noise).
To turn push on, drop a Firebase service account on the gateway machine:
```sh
cp your-firebase-adminsdk.json ~/.blumi/fcm-service-account.json
chmod 600 ~/.blumi/fcm-service-account.json
```
- The gateway auto-detects the file and sends turn-complete pushes via FCM HTTP v1 (a short-lived OAuth token is minted from the service account and cached in-process). The `project_id` is read from the file; there is nothing else to configure.
- Device tokens are stored at `~/.blumi/fcm.json` (the app registers them on connect) and are pruned automatically when FCM reports them stale (HTTP 404 / `UNREGISTERED`).
- No file = silent no-op. Dispatch still works in-app; you just don't get backgrounded pushes.
- Never commit the service account (or the app's `google-services.json`); both are gitignored. The private key stays at `~/.blumi/fcm-service-account.json` (`chmod 600`) and never enters git.

Optional override: set `notify.fcm.service_account_path` (and `notify.fcm.project_id`) if you keep the file somewhere other than `~/.blumi/fcm-service-account.json`. Push is Android-only today.
The `hooks` section runs your own shell commands at lifecycle points to inject context or gate tool
calls. It is off by default:
```json
"hooks": {
  "user_prompt_submit": [
    { "command": "git branch --show-current", "timeout_secs": 5 }
  ],
  "pre_tool_use": [
    { "command": "jq -e '.input.command|test(\"rm -rf\")|not' >/dev/null", "matcher": "Bash" }
  ]
}
```
`user_prompt_submit` injects each command's stdout as turn context; `pre_tool_use` can block a tool
call by exiting non-zero. See Self-Management → Lifecycle hooks.
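When a one-line `jq` filter gets too cramped, the same gate can live in a script. The Python sketch below mirrors the `jq` example: it blocks when the command contains `rm -rf`. It assumes, as the `jq` example implies, that the hook receives a JSON payload on stdin with the command at `.input.command`; the exact payload shape is not specified on this page. `exit_code_for` is a hypothetical helper, and the sketch fails closed on input it cannot parse.

```python
import json


def exit_code_for(raw_payload: str) -> int:
    """Return 0 to let the tool call proceed, non-zero to block it."""
    try:
        payload = json.loads(raw_payload)
    except json.JSONDecodeError:
        return 1  # fail closed on input we cannot parse

    # Assumed shape: {"input": {"command": "<shell command>"}}
    tool_input = payload.get("input") if isinstance(payload, dict) else None
    command = tool_input.get("command") if isinstance(tool_input, dict) else None
    if not isinstance(command, str):
        return 1  # no command to inspect: block, as the jq filter would

    return 1 if "rm -rf" in command else 0


# Sample payloads (hypothetical) to show the two outcomes.
print(exit_code_for('{"input": {"command": "ls -la"}}'))
print(exit_code_for('{"input": {"command": "rm -rf build"}}'))
```

To use it as a real hook, drop the two sample `print` lines, end the script with `import sys; sys.exit(exit_code_for(sys.stdin.read()))`, and point a `pre_tool_use` entry's `command` at it with `"matcher": "Bash"`.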
The `mcp_servers` section registers external Model Context Protocol (MCP) servers, launched over
stdio, whose tools then appear to the agent:
```json
"mcp_servers": {
  "github": {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-github"],
    "env": { "GITHUB_TOKEN": "ghp_..." },
    "enabled": true
  }
}
```
`{workspace}` in `args` is substituted with the project path. See `blumi mcp` and CLI Usage.
The `lsp_servers` section registers language servers (LSP, the Language Server Protocol) that power
blumi's `Lsp` code-intelligence tool (definitions, references, and diagnostics):
```json
"lsp_servers": {
  "rust": {
    "command": "rust-analyzer",
    "args": [],
    "extensions": ["rs"],
    "language_id": "rust"
  }
}
```
Here is a realistic everyday settings.json — Claude for the flagship model, a cheap local brain, memory on,
and notifications to Telegram:
```json
{
  "llm": { "provider": "anthropic", "model": "claude-sonnet-4-5" },
  "providers": {
    "anthropic": { "api_key_env": "ANTHROPIC_API_KEY" },
    "local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
  },
  "brain": { "mode": "advisory", "provider": "local", "model": "qwen2.5:3b" },
  "router": {
    "mode": "hybrid",
    "light": { "model": "claude-haiku-4-5" },
    "heavy": { "model": "claude-opus-4-5" }
  },
  "memory": { "enabled": true },
  "knowledge": { "enabled": true },
  "git": { "author_name": "Blumi", "author_email": "you@example.com" },
  "notify": { "enabled": true, "bot": { "transport": "telegram", "target": "123456789" } },
  "gateway": { "telegram": { "token": "123456:AA...", "allowed_chats": [123456789] } }
}
```

Auto-continue lets a turn keep going past the per-turn step cap, bounded by a token budget. When a
turn stops only because it hit the per-turn step cap, blumi auto-continues the same session and
narrates each step — bounded by both llm.max_auto_continue (default 12) and
llm.max_auto_continue_tokens (default 400k), whichever hits first. Tune live with /autocontinue <n>
(0 disables).
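The "whichever hits first" rule can be read as a simple two-limit check. The Python sketch below is an illustration of that reading, not blumi's code. `may_auto_continue` is a hypothetical function; it assumes a limit is reached once usage is equal to or above it, and applies the documented special cases (`max_auto_continue = 0` turns the feature off, `max_auto_continue_tokens = 0` means no token cap).

```python
def may_auto_continue(
    rounds_used: int,
    tokens_used: int,
    max_auto_continue: int = 12,
    max_auto_continue_tokens: int = 400_000,
) -> bool:
    """Decide whether one more auto-continue round is allowed."""
    if max_auto_continue == 0:
        return False  # 0 = auto-continue is off
    if rounds_used >= max_auto_continue:
        return False  # round budget exhausted
    if max_auto_continue_tokens != 0 and tokens_used >= max_auto_continue_tokens:
        return False  # token ceiling reached (0 would mean no cap)
    return True


print(may_auto_continue(rounds_used=3, tokens_used=120_000))
print(may_auto_continue(rounds_used=3, tokens_used=400_000))
print(may_auto_continue(rounds_used=12, tokens_used=10_000))
```

With the defaults, a run that is cheap in tokens still stops after 12 rounds, and a token-hungry run can stop well before that.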
Beyond semantic memory, blumi reads two markdown files in ~/.blumi/ as a frozen snapshot each
session: project MEMORY.md (agent notes) and USER.md (about you). View or edit them in
the TUI (/memory), the web/phone Control Center, or on disk.
blumi stores its configuration in ~/.blumi/settings.json (a single plain-JSON file written
with file mode 0600). Per-project overrides live in ./.blumi/settings.json, and BLUMI_-prefixed
environment variables override both. Other state under ~/.blumi/ includes the memory files
MEMORY.md and USER.md, and optional credentials such as fcm-service-account.json.
You need an API key for hosted providers (Anthropic, OpenAI, Gemini, Azure), set most safely via
api_key_env so the key stays out of the file. You do not need a key for a local,
OpenAI-compatible server (such as Ollama or llama.cpp): use the local preset with a base_url and
no key.
Set llm.model (and llm.provider) in settings.json, or run blumi login to pick interactively.
To change the model for a single run without editing the file, pass the flag:
blumi --provider openai --model gpt-4o run "...".
blumi can run fully offline. Point `llm` at a local, keyless provider (the `local` preset with an
Ollama/llama.cpp `base_url`), and keep `embeddings.backend` set to `local`. The bundled ONNX embedder
downloads a ~90 MB model on first use and then works fully offline, so memory, RAG recall, and code
search all run without a network.
Ask the agent to reload_self (or click "reload" in the web/phone UI) to re-read settings.json
without losing the conversation. Only process-level settings — the bind host/port, the web password,
and the grid identity — are read once at startup and require a restart.
Do not commit your global `settings.json`: it holds secrets and is written mode 0600 (the
repo's `.blumi/` is gitignored). Keep secrets in your global `~/.blumi/settings.json` (or in
environment variables via `api_key_env`) and reserve the commit-safe `./.blumi/settings.json` for
non-secret per-project tweaks.
Use the permissions section with per-tool deny / ask pattern lists, leave yolo set to false
(its default), and for real guardrails add a hooks.pre_tool_use hook that blocks a tool by exiting
non-zero. When you do want full autonomy, pair it with a sandboxed executor (docker or ssh).