Configuration
Everything blumi does is driven by one JSON file — ~/.blumi/settings.json. This page is the
complete reference: every section, its fields, its defaults, and copy-pasteable examples. All
sections are optional — blumi ships with sensible defaults, so an empty file is valid and a one-line
provider key is usually all you need to start.
The annotated JSON blocks below use // comments for clarity. Strip them if your editor is strict: settings.json is plain JSON.
blumi merges these in order (later wins), via figment:
- Built-in defaults
- ~/.blumi/settings.json: global, your main file
- ./.blumi/settings.json: per-project overrides (commit-safe project tweaks; secrets stay global)
- Environment variables: prefix BLUMI_, nest with __ (e.g. BLUMI_LLM__MODEL=claude-opus-4-5, BLUMI_GRID__SECRET=..., BLUMI_PROVIDERS__OPENAI__API_KEY=...)
settings.json is written mode 0600 and holds secrets — never commit it (the repo's .blumi/
is gitignored). Per-invocation flags (--provider, --model, --persona, --yolo) beat all of the above.
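The layering described above can be pictured as a deep merge in which each later layer overrides only the keys it sets. The Python sketch below is illustrative only: blumi does this through figment, and the helper names (deep_merge, env_layer), the sample values, and the lowercasing of variable names are assumptions made for the example.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins, merging nested dicts key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def env_layer(environ: dict, prefix: str = "BLUMI_") -> dict:
    """Turn BLUMI_LLM__MODEL=x into {"llm": {"model": "x"}} (assumed mapping)."""
    layer: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = layer
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value  # values stay strings in this sketch
    return layer


# Illustrative layers, lowest priority first (values are made up for the demo).
defaults = {"llm": {"provider": "anthropic", "model": "", "temperature": 0.7}}
global_file = {"llm": {"model": "claude-sonnet-4-5"}}
project_file = {"llm": {"temperature": 0.2}}
environ = {"BLUMI_LLM__MODEL": "claude-opus-4-5", "HOME": "/home/me"}

effective = defaults
for layer in (global_file, project_file, env_layer(environ)):
    effective = deep_merge(effective, layer)

print(effective)
# {'llm': {'provider': 'anthropic', 'model': 'claude-opus-4-5', 'temperature': 0.2}}
```

The project file changes only temperature and the environment variable changes only model, so the provider from the lowest layer survives untouched. Per-invocation flags would sit as one more layer on top.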
blumi login
Pick a provider, paste a key (or endpoint), choose a model — it writes the right bits into
settings.json. Re-run any time to switch. That's all most people need; the rest of this page is for
tuning.
- TUI / gateway session: ask the agent to reload_self, or use the web/phone "reload". It re-reads settings.json without losing the conversation.
- Process-level settings (bind host/port, the web password, the grid identity) are read once at startup, so they need a restart: launchctl kickstart -k gui/$(id -u)/com.blumi.serve (macOS) / systemctl --user restart blumi-serve (Linux), or just relaunch blumi tui.
blumi is provider-agnostic. Built-in presets exist for the common ones (so you usually set only a key),
keyed by name in providers:
| Preset name | kind | Notes |
|---|---|---|
| anthropic | anthropic | Claude (API-key auth only) |
| openai | openai_compat | OpenAI and any OpenAI-compatible endpoint (set base_url) |
| gemini | gemini | Google Gemini (native client) |
| azure | anthropic_foundry | Azure AI Foundry (Anthropic models) |
| local | openai_compat | a local server (llama.cpp / Ollama-compatible), no key |
| mock | (none) | deterministic, for tests/demos |
Each provider entry:

"providers": {
  "anthropic": { "api_key_env": "ANTHROPIC_API_KEY" },   // read key from env (preferred)
  "openai": { "api_key": "sk-...", "base_url": "https://api.openai.com/v1" },
  "local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
}
- api_key_env (read the key from an env var) is preferred over a literal api_key so the key never sits in the file.
- base_url points OpenAI-compatible clients at any endpoint (Ollama, llama.cpp, OpenRouter, ...).
- kind is only needed for a fully custom provider name; the presets above set it for you.

"llm": {
  "provider": "anthropic",            // which entry in "providers"
  "model": "claude-sonnet-4-5",       // "" = let the provider pick/probe
  "context_size": 131072,
  "max_output_tokens": 16384,
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 20,
  "max_iterations": 25,               // tool steps allowed per turn
  "max_auto_continue": 12,            // self-continue rounds when a turn hits the step cap (0 = off)
  "max_auto_continue_tokens": 400000, // token ceiling for one auto-continue run (0 = no cap)
  "max_local_agents": 4               // max concurrent local sub-agents (overflow → grid or queue)
}
Override per run without editing: blumi --provider openai --model gpt-4o run "...".

"permissions": {
  "yolo": false,   // true = auto-approve EVERYTHING (use only sandboxed)
  "tools": {
    "Bash": { "deny": ["rm -rf*", "sudo*"], "ask": ["git push*"] },
    "FileWrite": { "allow": ["src/**"] }
  }
}
Per-tool allow / deny / ask pattern lists. Interactive by default (mutating tools surface an
approval card); toggle YOLO live with ctrl+y / /yolo / --yolo. For real guardrails, pair with
hooks.pre_tool_use.
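To make the allow / deny / ask lists concrete, here is a minimal Python sketch of how such pattern lists could be evaluated. It rests on assumptions the text above does not confirm: that patterns are shell-style globs (approximated here with fnmatch), that deny beats ask beats allow, and that anything unmatched falls back to asking. The function name decide is hypothetical.

```python
from fnmatch import fnmatchcase

# Rules copied from the example block above.
RULES = {
    "Bash": {"deny": ["rm -rf*", "sudo*"], "ask": ["git push*"]},
    "FileWrite": {"allow": ["src/**"]},
}


def decide(rules: dict, tool: str, subject: str) -> str:
    """Return "deny", "ask" or "allow" for a tool call (assumed precedence)."""
    tool_rules = rules.get(tool, {})
    # Assumption: the most restrictive matching list wins.
    for verdict in ("deny", "ask", "allow"):
        patterns = tool_rules.get(verdict, [])
        if any(fnmatchcase(subject, pattern) for pattern in patterns):
            return verdict
    # Assumption: unmatched calls stay interactive.
    return "ask"


print(decide(RULES, "Bash", "rm -rf /tmp/build"))      # deny
print(decide(RULES, "Bash", "git push origin main"))   # ask
print(decide(RULES, "FileWrite", "src/lib/mod.rs"))    # allow
print(decide(RULES, "FileWrite", "Cargo.toml"))        # ask
```

Whatever the real precedence is, pattern lists like these match on text and are easy to sidestep, which is why the pairing with hooks.pre_tool_use is recommended for real guardrails.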
"persona": "default", // active persona (built-in or custom) "personas": { "reviewer": { "description": "Careful code reviewer", "instructions": "Be terse. Prioritise correctness + security. Propose diffs.", "model": "claude-opus-4-5", // optional: switch model when active "temperature": 0.2 // optional override } }
Built-ins include architect, pair, reviewer, team. Switch with --persona <name> or /persona.

"executor": {
  "backend": "local",   // "local" (host) | "docker" (sandbox) | "ssh" (remote)
  "docker_image": "debian:stable-slim",
  "ssh_host": "user@box",   // for backend = "ssh"
  "ssh_workdir": "/home/user/proj"
}
Use docker (or ssh to a throwaway box) when you want YOLO/gateway autonomy without risking the host.

"brain": {
  "mode": "off",         // "off" | "advisory" (annotate) | "auto" (decide)
  "provider": "local",   // "" = reuse the main agent's provider
  "model": "qwen2.5:3b"  // a small/cheap/local model is ideal here
}
A cheap model that vets approval prompts so the flagship isn't interrupted; it escalates to you on uncertainty or danger.

"router": {
  "mode": "off",   // "off" | "heuristic" | "hybrid" | "judge"
  "light": { "provider": "", "model": "claude-haiku-4-5" },
  "heavy": { "provider": "", "model": "claude-opus-4-5" },
  "judge": { "provider": "", "model": "" },   // "" = reuse brain.* then llm.*
  "subagent_tier": "light",     // "light" | "heavy" | "inherit"
  "prefer_grid_light": false,   // run the light tier on a grid peer's local model (free)
  "heuristics": {
    "heavy_chars": 1500,
    "light_chars": 280,
    "heavy_tool_count": 12,
    "escalate_iteration": 6,
    "heavy_keywords": [],
    "light_keywords": []
  }
}
Per turn, picks light vs flagship to cut cost. Off by default. Empty tier provider/model reuse
the active llm.*. See Self-Management → Cost-aware routing.

"heal": {
  "enabled": true,
  "recovery_budget": 2,   // recovery attempts per turn
  "verify": false,        // only mark a fix "verified" if a later step succeeds
  "learn": true,          // store failure→fix episodes in memory
  "evolve": "auto",       // "auto" | "propose" | "off" (mine recurring fixes → skills)
  "redact_paths": true
}
Auto-recovers failed tool calls, learns fixes, and mines recurring ones into recovery skills. On by default. See Self-Management → Self-healing.
These three power recall, semantic memory, and code search. All are on by default. The bundled local embedder downloads a ~90 MB model on first use, then works fully offline. See Memory & Knowledge.

"embeddings": {
  "enabled": true,
  "backend": "local",   // "local" (bundled ONNX) | "openai" (endpoint) | "grid" (peer)
  "provider": "",       // for backend = "openai": a name from "providers"
  "model": "bge-small-en-v1.5",
  "dim": 384
},
"memory": {
  "enabled": true,
  "recall_k": 5,             // memories injected per turn (RAG)
  "dedup_threshold": 0.92,   // admission gate: near-duplicates are merged
  "max_per_namespace": 2000,
  "diffuse": true,           // share non-`user` learnings across the grid
  "sweep_secs": 60           // governance cadence (eviction/consolidation)
},
"knowledge": {
  "enabled": true,
  "max_file_kb": 256,   // skip files larger than this when indexing
  "exclude": ["target", "node_modules", ".git", "dist"]
}
- Semantic memory accrues automatically (failure→fix episodes, the agent's memory tool); browse it with /memories (TUI) or the web/phone Control Center (where you can pin/edit/delete). The user namespace never leaves your node. Tell the agent "remember that ..." to force-store; pin an entry to exempt it from eviction.
- Knowledge is empty until you index a repo: blumi knowledge ingest . then blumi knowledge status (or /knowledge in the TUI). Powers code_search / code_retrieve.
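As a rough picture of what knowledge.max_file_kb and knowledge.exclude do during indexing, the Python sketch below walks a directory and keeps only files that pass both filters. It is not blumi's indexer: the function name indexable_files is hypothetical, and matching exclude entries against directory names is an assumption.

```python
import os


def indexable_files(root, max_file_kb=256,
                    exclude=("target", "node_modules", ".git", "dist")):
    """Yield paths under `root` that pass the size and exclude filters."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in exclude]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip files larger than the configured limit.
            if os.path.getsize(path) <= max_file_kb * 1024:
                yield path
```

Calling list(indexable_files(".")) on a repository gives a quick estimate of which files an ingest run would consider under these assumptions.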
"acceleration": { "mode": "auto", // "auto" | "cpu" | "apple" (CoreML) | "cuda" "embeddings_accel": "auto" // override just the bundled embedder }
blumi accel doctor shows what was detected. See
Memory & Knowledge → GPU acceleration.
"web": { "password_hash": "" } // set via `blumi serve pair --password <pw>` (argon2; never plaintext)
A password is required when binding to a non-loopback address. The same server serves the browser UI and the blugo phone app. See Gateway.

"voice": {
  "enabled": false,
  "api_key": "",
  "stt_base_url": "https://api.openai.com/v1",
  "stt_model": "whisper-1",
  "tts_provider": "openai",   // "openai" | "elevenlabs"
  "tts_base_url": "",         // blank = provider default
  "tts_model": "tts-1",
  "tts_voice": "alloy",       // OpenAI voice name, or an ElevenLabs voice id
  "tts_api_key": ""           // separate TTS key (falls back to api_key)
}
See Voice.

"gateway": {
  "yolo": false,   // auto-approve in bot sessions (only with a sandboxed executor!)
  "telegram": {
    "token": "123456:AA...",   // @BotFather
    "allowed_chats": [],       // [] = anyone who messages it; or [<your-chat-id>]
    "voice": false             // transcribe voice notes + speak replies (needs voice.* too)
  },
  "discord": { "token": "", "allowed_channels": [] },
  "slack": { "bot_token": "xoxb-...", "app_token": "xapp-..." },
  "whatsapp": { "token": "", "phone_number_id": "", "verify_token": "", "webhook_port": 8080 }
}
Run one transport in the foreground (blumi gateway telegram) or all configured ones as a service
(blumi gateway install). One bot per token; see Gateway → Messaging-bot gateway.

"grid": {
  "enabled": false,
  "secret": "",      // same value on every node = same grid (or BLUMI_GRID__SECRET)
  "grid_id": "",     // blank = derived from the secret digest
  "node_name": "",   // blank = hostname
  "peers": ["10.0.0.150:7777"]   // static peers (in addition to mDNS auto-discovery)
}
Several blumi serve nodes sharing a secret form a grid that hands tasks to peers. The secret is
never advertised (only a non-sensitive digest). See Grid.

"remote": { "instances": [] },             // remote blumi instances the TUI can attach to as tabs
"workspaces": { "roots": ["~/code"] }      // dirs scanned for sibling git repos in the TUI sidebar
"git": { "author_name": "Blumi", "author_email": "you@example.com" }
Stamped on commits the agent makes (GIT_AUTHOR_* / GIT_COMMITTER_*). Empty author_name disables
the override.
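The effect of the git block can be sketched as building the environment that a git commit subprocess would run under. This is an illustration rather than blumi's code: the helper name commit_env is hypothetical, while the four GIT_* variables are the standard ones git reads for author and committer identity.

```python
import os


def commit_env(author_name, author_email, base_env=None):
    """Return the environment for a `git commit` child process."""
    env = dict(os.environ if base_env is None else base_env)
    if not author_name:
        # Empty author_name: no override, git falls back to its own config.
        return env
    for role in ("AUTHOR", "COMMITTER"):
        env[f"GIT_{role}_NAME"] = author_name
        env[f"GIT_{role}_EMAIL"] = author_email
    return env


env = commit_env("Blumi", "you@example.com", base_env={})
print(sorted(env))
# ['GIT_AUTHOR_EMAIL', 'GIT_AUTHOR_NAME', 'GIT_COMMITTER_EMAIL', 'GIT_COMMITTER_NAME']
```

Passing such a dict as env= to subprocess.run(["git", "commit", ...]) stamps the identity on that one commit without touching the user's git config.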
"always_on": { "enabled": false, "autonomy": "propose", // "off" | "propose" (add tasks) | "auto" (reserved) "cadence_secs": 900, "min_interval_secs": 300, "skip_if_todos": 1, // skip while the board already has todos "max_open_discoveries": 5, "max_per_pass": 3 }
When idle, the gateway runs a read-only discovery pass and proposes tasks + a report. Off by default. See Self-Management → Always-on discovery.

"notify": {
  "enabled": false,
  "on": ["loop", "discovery"],   // which completions fire (also "turn"); [] = loop+discovery
  "desktop": true,               // OS notification on the host
  "bot": { "transport": "telegram", "target": "<chat-id>" },   // proactive bot message
  "web_push": false              // browser Web Push (VAPID; secure-context only)
}
Pings you when blumi loop / discovery finishes. Off by default. See
Self-Management → Completion notifications.
The blugo phone app's Dispatch feature
gets a push when a node finishes a turn. This is not a config-file setting — it's enabled by
file presence alone, so it's zero-config and independent of the notify block above (it never
adds desktop/bot noise).
To turn push on, drop a Firebase service account on the gateway machine:
cp your-firebase-adminsdk.json ~/.blumi/fcm-service-account.json
chmod 600 ~/.blumi/fcm-service-account.json
- The gateway auto-detects the file and sends turn-complete pushes via FCM HTTP v1 (a short-lived OAuth token is minted from the service account and cached in-process). The project_id is read from the file, so there is nothing else to configure.
- Device tokens are stored at ~/.blumi/fcm.json (the app registers them on connect) and are pruned automatically when FCM reports them stale (HTTP 404 / UNREGISTERED).
- No file = silent no-op. Dispatch still works in-app; you just don't get backgrounded pushes.
- Never commit the service account (or the app's google-services.json); both are gitignored. The private key stays at ~/.blumi/fcm-service-account.json (chmod 600) and never enters git.
Optional override: set notify.fcm.service_account_path (and notify.fcm.project_id) if you keep the file somewhere other than ~/.blumi/fcm-service-account.json. Push is Android-only today.

"hooks": {
  "user_prompt_submit": [
    { "command": "git branch --show-current", "timeout_secs": 5 }
  ],
  "pre_tool_use": [
    { "command": "jq -e '.input.command|test(\"rm -rf\")|not' >/dev/null", "matcher": "Bash" }
  ]
}
user_prompt_submit injects each command's stdout as turn context; pre_tool_use can block a tool
(non-zero exit). Off by default. See Self-Management → Lifecycle hooks.
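For checks that outgrow a jq one-liner, a pre_tool_use hook can be a small script. The Python sketch below mirrors the jq example above and assumes, as that example suggests, that the hook receives a JSON payload on stdin with the shell command under input.command; the exact payload shape is not specified here, and the function names are hypothetical.

```python
import json
import sys


def hook_exit_code(raw_payload: str) -> int:
    """Return 0 to let the tool call proceed, 1 to block it."""
    payload = json.loads(raw_payload)
    # Assumed payload shape: {"input": {"command": "..."}}.
    command = payload.get("input", {}).get("command", "")
    return 1 if "rm -rf" in command else 0


def main() -> int:
    # The hook reads its payload from stdin and reports via its exit status.
    return hook_exit_code(sys.stdin.read())

# When saved as a standalone script, finish the file with:
#     raise SystemExit(main())
```

Unlike the jq version, this sketch treats a payload without input.command as harmless; tighten that if you prefer to block anything unexpected. The script would then be referenced from a pre_tool_use entry with "matcher": "Bash", at whatever path you choose to keep it.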
"mcp_servers": { "github": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"], "env": { "GITHUB_TOKEN": "ghp_..." }, "enabled": true } }
External MCP servers launched over stdio; their tools appear to the agent. {workspace} in args is
substituted with the project path. See blumi mcp and CLI Usage.
"lsp_servers": { "rust": { "command": "rust-analyzer", "args": [], "extensions": ["rs"], "language_id": "rust" } }
Power the Lsp code-intelligence tool (definitions, references, diagnostics).
A realistic everyday settings.json — Claude for the flagship, a cheap local brain, memory on,
notifications to Telegram:
{
"llm": { "provider": "anthropic", "model": "claude-sonnet-4-5" },
"providers": {
"anthropic": { "api_key_env": "ANTHROPIC_API_KEY" },
"local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
},
"brain": { "mode": "advisory", "provider": "local", "model": "qwen2.5:3b" },
"router": { "mode": "hybrid", "light": { "model": "claude-haiku-4-5" },
"heavy": { "model": "claude-opus-4-5" } },
"memory": { "enabled": true },
"knowledge": { "enabled": true },
"git": { "author_name": "Blumi", "author_email": "you@example.com" },
"notify": { "enabled": true, "bot": { "transport": "telegram", "target": "123456789" } },
"gateway": { "telegram": { "token": "123456:AA...", "allowed_chats": [123456789] } }
}

When a turn stops only because it hit the per-turn step cap, blumi auto-continues the same session and narrates each step. The run is bounded by both llm.max_auto_continue (default 12) and llm.max_auto_continue_tokens (default 400k), whichever hits first. Tune live with /autocontinue <n> (0 disables).
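The interaction of the two bounds can be sketched in a few lines of Python. This is a simplified model, not blumi's scheduler: it assumes every round hits the step cap again, that the token ceiling is checked before each new round starts, and the function name auto_continue_rounds is hypothetical.

```python
def auto_continue_rounds(round_tokens, max_rounds=12, max_tokens=400_000):
    """Return how many auto-continue rounds run before a bound stops the run.

    round_tokens: tokens each successive round would consume.
    max_rounds:   llm.max_auto_continue (0 = auto-continue off).
    max_tokens:   llm.max_auto_continue_tokens (0 = no token cap).
    """
    if max_rounds == 0:
        return 0
    used = 0
    rounds = 0
    for tokens in round_tokens:
        if rounds >= max_rounds:
            break  # round budget exhausted
        if max_tokens and used >= max_tokens:
            break  # token ceiling reached
        used += tokens
        rounds += 1
    return rounds


print(auto_continue_rounds([10_000] * 20))                 # 12
print(auto_continue_rounds([100_000] * 20))                # 4
print(auto_continue_rounds([100_000] * 20, max_tokens=0))  # 12
print(auto_continue_rounds([10_000] * 20, max_rounds=0))   # 0
```

Cheap rounds run out of round budget first, while expensive rounds run into the token ceiling after four rounds, which is the "whichever hits first" behaviour described above.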
Beyond semantic memory, two markdown files in ~/.blumi/ are read as a frozen snapshot each session:
project MEMORY.md (agent notes) and USER.md (about you). View/edit them in the TUI
(/memory), the web/phone Control Center, or on disk.