Configuration

ankurCES edited this page Jun 7, 2026 · 14 revisions

Everything blumi does is driven by one JSON file — ~/.blumi/settings.json. This page is the complete reference: every section, its fields, its defaults, and copy-pasteable examples. All sections are optional — blumi ships with sensible defaults, so an empty file is valid and a one-line provider key is usually all you need to start.

The annotated JSON blocks below use // comments for clarity. Strip them if your editor is strict — settings.json is plain JSON.

How config is loaded (layering)

blumi merges these in order (later wins), via figment:

  1. Built-in defaults
  2. ~/.blumi/settings.json — global, your main file
  3. ./.blumi/settings.json — per-project overrides (commit-safe project tweaks; secrets stay global)
  4. Environment variables — prefix BLUMI_, nest with __ (e.g. BLUMI_LLM__MODEL=claude-opus-4-5, BLUMI_GRID__SECRET=..., BLUMI_PROVIDERS__OPENAI__API_KEY=...)

settings.json is written mode 0600 and holds secrets — never commit it (the repo's .blumi/ is gitignored). Per-invocation flags (--provider, --model, --persona, --yolo) beat all of the above.
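The layering rules above can be pictured with a small sketch. This is not figment's implementation, only an illustration of "later wins" deep merging and of how a BLUMI_-prefixed variable with __ separators maps to a nested key. The helper names (deep_merge, env_layer) and the sample values are hypothetical.

```python
import copy

def deep_merge(base, override):
    """Return a new dict where `override` wins; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

def env_layer(environ, prefix="BLUMI_"):
    """Turn BLUMI_LLM__MODEL=x into {"llm": {"model": "x"}}; other variables are ignored."""
    layer = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = layer
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return layer

# Illustrative layers, lowest priority first (values are made up for the demo).
defaults = {"llm": {"provider": "anthropic", "model": "", "temperature": 0.7}}
global_file = {"llm": {"model": "claude-sonnet-4-5"}}
project_file = {"llm": {"temperature": 0.2}}
environ = {"BLUMI_LLM__MODEL": "claude-opus-4-5", "HOME": "/home/me"}

effective = defaults
for layer in (global_file, project_file, env_layer(environ)):
    effective = deep_merge(effective, layer)

print(effective)
```

In this sketch the environment variable overrides the model from the global file, while the project file's temperature survives because no later layer sets it.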

Fastest path: the login wizard

blumi login

Pick a provider, paste a key (or endpoint), choose a model — it writes the right bits into settings.json. Re-run any time to switch. That's all most people need; the rest of this page is for tuning.

Applying changes

  • TUI / gateway session: ask the agent to reload_self, or use the web/phone "reload" — it re-reads settings.json without losing the conversation.
  • Process-level settings (bind host/port, the web password, the grid identity) are read once at startup, so they need a restart: launchctl kickstart -k gui/$(id -u)/com.blumi.serve on macOS, systemctl --user restart blumi-serve on Linux, or simply relaunch blumi tui.

Providers & models

Providers & keys

blumi is provider-agnostic. Built-in presets exist for the common ones (so you usually set only a key), keyed by name in providers:

| Preset name | kind | Notes |
| --- | --- | --- |
| anthropic | anthropic | Claude (API-key auth only) |
| openai | openai_compat | OpenAI and any OpenAI-compatible endpoint (set base_url) |
| gemini | gemini | Google Gemini (native client) |
| azure | anthropic_foundry | Azure AI Foundry (Anthropic models) |
| local | openai_compat | a local server (llama.cpp / Ollama-compatible); no key |
| mock | | deterministic, for tests/demos |

Each provider entry:

"providers": {
 "anthropic": { "api_key_env": "ANTHROPIC_API_KEY" }, // read key from env (preferred)
 "openai": { "api_key": "sk-...", "base_url": "https://api.openai.com/v1" },
 "local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
}
  • api_key_env (read the key from an env var) is preferred over a literal api_key so the key never sits in the file.
  • base_url points OpenAI-compatible clients at any endpoint (Ollama, llama.cpp, OpenRouter, ...).
  • kind is only needed for a fully custom provider name; the presets above set it for you.
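As a rough illustration of the two key fields, the sketch below resolves a key for one provider entry. The function name resolve_api_key is hypothetical, and the precedence it uses (the env var wins when both fields are set) is an assumption; this page only says that api_key_env is the preferred way to supply a key.

```python
import os

def resolve_api_key(entry, environ=None):
    """Return the key for one provider entry, or None if it has none (e.g. "local")."""
    environ = os.environ if environ is None else environ
    env_name = entry.get("api_key_env")
    # Assumption: a set, non-empty env var takes precedence over a literal api_key.
    if env_name and environ.get(env_name):
        return environ[env_name]
    return entry.get("api_key") or None

providers = {
    "anthropic": {"api_key_env": "ANTHROPIC_API_KEY"},
    "openai": {"api_key": "sk-...", "base_url": "https://api.openai.com/v1"},
    "local": {"kind": "openai_compat", "base_url": "http://localhost:11434/v1"},
}
fake_env = {"ANTHROPIC_API_KEY": "key-from-env"}

for name, entry in providers.items():
    print(name, "has key:", resolve_api_key(entry, fake_env) is not None)
```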

llm — the active model + turn limits

"llm": {
 "provider": "anthropic", // which entry in "providers"
 "model": "claude-sonnet-4-5", // "" = let the provider pick/probe
 "context_size": 131072,
 "max_output_tokens": 16384,
 "temperature": 0.7,
 "top_p": 0.8,
 "top_k": 20,
 "max_iterations": 25, // tool steps allowed per turn
 "max_auto_continue": 12, // self-continue rounds when a turn hits the step cap (0 = off)
 "max_auto_continue_tokens": 400000, // token ceiling for one auto-continue run (0 = no cap)
 "max_local_agents": 4 // max concurrent local sub-agents (overflow → grid or queue)
}

Override per run without editing: blumi --provider openai --model gpt-4o run "...".


Agent behavior

permissions — what the agent may do unattended

"permissions": {
  "yolo": false, // true = auto-approve EVERYTHING (use only sandboxed)
  "tools": {
    "Bash": { "deny": ["rm -rf*", "sudo*"], "ask": ["git push*"] },
    "FileWrite": { "allow": ["src/**"] }
  }
}

Per-tool allow / deny / ask pattern lists. Interactive by default (mutating tools surface an approval card); toggle YOLO live with ctrl+y / /yolo / --yolo. For real guardrails, pair with hooks.pre_tool_use.
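One way to read the pattern lists is sketched below with Python's fnmatch. The evaluation order (deny, then ask, then allow) and the fallback to "ask" for unmatched input are assumptions made for illustration; blumi's actual matcher and precedence are not described on this page, and decide is a hypothetical helper.

```python
from fnmatch import fnmatchcase

def decide(tool_rules, subject, default="ask"):
    """Return "deny", "ask" or "allow" for a command or path under one tool's rules."""
    # Assumption: the most restrictive matching list wins.
    for verdict in ("deny", "ask", "allow"):
        if any(fnmatchcase(subject, pattern) for pattern in tool_rules.get(verdict, [])):
            return verdict
    return default  # assumption: anything unmatched stays interactive

tools = {
    "Bash": {"deny": ["rm -rf*", "sudo*"], "ask": ["git push*"]},
    "FileWrite": {"allow": ["src/**"]},
}

print(decide(tools["Bash"], "rm -rf /tmp/build"))
print(decide(tools["Bash"], "git push origin main"))
print(decide(tools["FileWrite"], "src/main.rs"))
```

Note that a prefix glob such as "rm -rf*" does not catch "cd /tmp && rm -rf build", which is one reason to pair patterns with a pre_tool_use hook.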

persona + personas — agent style

"persona": "default", // active persona (built-in or custom)
"personas": {
  "reviewer": {
    "description": "Careful code reviewer",
    "instructions": "Be terse. Prioritise correctness + security. Propose diffs.",
    "model": "claude-opus-4-5", // optional: switch model when active
    "temperature": 0.2 // optional override
  }
}

Built-ins include architect, pair, reviewer, team. Switch with --persona <name> or /persona.

executor — where tools run

"executor": {
 "backend": "local", // "local" (host) | "docker" (sandbox) | "ssh" (remote)
 "docker_image": "debian:stable-slim",
 "ssh_host": "user@box", // for backend = "ssh"
 "ssh_workdir": "/home/user/proj"
}

Use docker (or ssh to a throwaway box) when you want YOLO/gateway autonomy without risking the host.

brain — local-LLM approval reviewer

"brain": {
 "mode": "off", // "off" | "advisory" (annotate) | "auto" (decide)
 "provider": "local", // "" = reuse the main agent's provider
 "model": "qwen2.5:3b" // a small/cheap/local model is ideal here
}

A cheap model that vets approval prompts so the flagship isn't interrupted — escalates to you on uncertainty or danger.

router — cost-aware model routing

"router": {
  "mode": "off", // "off" | "heuristic" | "hybrid" | "judge"
  "light": { "provider": "", "model": "claude-haiku-4-5" },
  "heavy": { "provider": "", "model": "claude-opus-4-5" },
  "judge": { "provider": "", "model": "" }, // "" = reuse brain.* then llm.*
  "subagent_tier": "light", // "light" | "heavy" | "inherit"
  "prefer_grid_light": false, // run the light tier on a grid peer's local model (free)
  "heuristics": {
    "heavy_chars": 1500, "light_chars": 280, "heavy_tool_count": 12,
    "escalate_iteration": 6, "heavy_keywords": [], "light_keywords": []
  }
}

Per turn, the router picks a light model or the flagship to cut cost. Off by default. Empty tier provider/model fields reuse the active llm.* values. See Self-Management → Cost-aware routing.
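The fallback for empty tier fields can be sketched as follows. It assumes the fallback is applied field by field (an empty provider and an empty model are filled independently), which this page does not spell out; resolve_tier is a hypothetical helper name.

```python
def resolve_tier(tier, *fallbacks):
    """Fill empty provider/model from the first fallback with a non-empty value."""
    resolved = {}
    for field in ("provider", "model"):
        candidates = [tier.get(field, "")] + [fb.get(field, "") for fb in fallbacks]
        resolved[field] = next((value for value in candidates if value), "")
    return resolved

llm = {"provider": "anthropic", "model": "claude-sonnet-4-5"}
brain = {"provider": "local", "model": "qwen2.5:3b"}

# light/heavy fall back to llm.*; the judge falls back to brain.* and then llm.*
light = resolve_tier({"provider": "", "model": "claude-haiku-4-5"}, llm)
judge = resolve_tier({"provider": "", "model": ""}, brain, llm)

print(light)
print(judge)
```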

heal — self-healing & evolution

"heal": {
 "enabled": true,
 "recovery_budget": 2, // recovery attempts per turn
 "verify": false, // only mark a fix "verified" if a later step succeeds
 "learn": true, // store failure→fix episodes in memory
 "evolve": "auto", // "auto" | "propose" | "off" (mine recurring fixes → skills)
 "redact_paths": true
}

Auto-recovers failed tool calls, learns fixes, and mines recurring ones into recovery skills. On by default. See Self-Management → Self-healing.


Memory & knowledge

These three power recall, semantic memory, and code search. All on by default — the bundled local embedder downloads a ~90 MB model on first use, then works fully offline. See Memory & Knowledge.

"embeddings": {
 "enabled": true,
 "backend": "local", // "local" (bundled ONNX) | "openai" (endpoint) | "grid" (peer)
 "provider": "", // for backend = "openai": a name from "providers"
 "model": "bge-small-en-v1.5",
 "dim": 384
},
"memory": {
 "enabled": true,
 "recall_k": 5, // memories injected per turn (RAG)
 "dedup_threshold": 0.92, // admission gate: near-duplicates are merged
 "max_per_namespace": 2000,
 "diffuse": true, // share non-`user` learnings across the grid
 "sweep_secs": 60 // governance cadence (eviction/consolidation)
},
"knowledge": {
 "enabled": true,
 "max_file_kb": 256, // skip files larger than this when indexing
 "exclude": ["target", "node_modules", ".git", "dist"]
}
  • Semantic memory accrues automatically (failure→fix episodes, the agent's memory tool); browse it with /memories (TUI) or the web/phone Control Center (where you can pin/edit/delete). The user namespace never leaves your node. Tell the agent "remember that ..." to force-store; pin an entry to exempt it from eviction.
  • Knowledge is empty until you index a repo: blumi knowledge ingest . then blumi knowledge status (or /knowledge in the TUI). Powers code_search / code_retrieve.
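To make memory.dedup_threshold concrete, here is a minimal sketch of an admission gate. It assumes the gate compares embedding vectors by cosine similarity, which is a common choice but is not confirmed by this page; the function names and the tiny 3-dimensional vectors are illustrative only (the bundled embedder uses 384 dimensions).

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def admit(candidate, stored, threshold=0.92):
    """Return the index of the closest near-duplicate to merge into, or None to store as new."""
    best_index, best_score = None, threshold
    for index, vector in enumerate(stored):
        score = cosine(candidate, vector)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index

stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(admit([0.99, 0.05, 0.0], stored))  # very close to the first entry
print(admit([1.0, 1.0, 0.0], stored))    # between both entries, below the threshold
```

Raising the threshold toward 1.0 admits more near-identical memories; lowering it merges more aggressively.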

acceleration — embedder execution provider

"acceleration": {
 "mode": "auto", // "auto" | "cpu" | "apple" (CoreML) | "cuda"
 "embeddings_accel": "auto" // override just the bundled embedder
}

blumi accel doctor shows what was detected. See Memory & Knowledge → GPU acceleration.


Surfaces (web, phone, bots, grid)

web — the gateway / web UI auth

"web": { "password_hash": "" } // set via `blumi serve pair --password <pw>` (argon2; never plaintext)

A password is required when binding to a non-loopback address. The same server serves the browser UI and the blugo phone app. See Gateway.
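What counts as a non-loopback address can be checked with the standard ipaddress module. The sketch below is only an illustration of the rule; needs_password is a hypothetical helper, and treating unresolved hostnames as non-loopback is a conservative assumption.

```python
import ipaddress

def needs_password(bind_host):
    """True when the bind address is reachable from outside the local machine."""
    if bind_host == "localhost":
        return False
    try:
        return not ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Not an IP literal (some other hostname): assume it may be reachable.
        return True

for host in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.10"):
    print(host, "needs password:", needs_password(host))
```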

voice — speech-to-text + text-to-speech

"voice": {
 "enabled": false,
 "api_key": "",
 "stt_base_url": "https://api.openai.com/v1",
 "stt_model": "whisper-1",
 "tts_provider": "openai", // "openai" | "elevenlabs"
 "tts_base_url": "", // blank = provider default
 "tts_model": "tts-1",
 "tts_voice": "alloy", // OpenAI voice name, or an ElevenLabs voice id
 "tts_api_key": "" // separate TTS key (falls back to api_key)
}

See Voice.

gateway — run blumi as a chat bot

"gateway": {
  "yolo": false, // auto-approve in bot sessions (only with a sandboxed executor!)
  "telegram": {
    "token": "123456:AA...", // @BotFather
    "allowed_chats": [], // [] = anyone who messages it; or [<your-chat-id>]
    "voice": false // transcribe voice notes + speak replies (needs voice.* too)
  },
  "discord": { "token": "", "allowed_channels": [] },
  "slack": { "bot_token": "xoxb-...", "app_token": "xapp-..." },
  "whatsapp": { "token": "", "phone_number_id": "", "verify_token": "", "webhook_port": 8080 }
}

Run one transport in the foreground (blumi gateway telegram) or all configured ones as a service (blumi gateway install). One bot per token — see Gateway → Messaging-bot gateway.

grid — distributed multi-node

"grid": {
 "enabled": false,
 "secret": "", // same value on every node = same grid (or BLUMI_GRID__SECRET)
 "grid_id": "", // blank = derived from the secret digest
 "node_name": "", // blank = hostname
 "peers": ["10.0.0.150:7777"] // static peers (in addition to mDNS auto-discovery)
}

Several blumi serve nodes sharing a secret form a grid that hands tasks to peers. The secret is never advertised (only a non-sensitive digest). See Grid.

remote + workspaces

"remote": { "instances": [] }, // remote blumi instances the TUI can attach to as tabs
"workspaces": { "roots": ["~/code"] } // dirs scanned for sibling git repos in the TUI sidebar

git — commit identity

"git": { "author_name": "Blumi", "author_email": "you@example.com" }

Stamped on commits the agent makes (GIT_AUTHOR_* / GIT_COMMITTER_*). Empty author_name disables the override.


Autonomy & notifications

always_on — proactive discovery

"always_on": {
 "enabled": false,
 "autonomy": "propose", // "off" | "propose" (add tasks) | "auto" (reserved)
 "cadence_secs": 900,
 "min_interval_secs": 300,
 "skip_if_todos": 1, // skip while the board already has todos
 "max_open_discoveries": 5,
 "max_per_pass": 3
}

When idle, the gateway runs a read-only discovery pass and proposes tasks + a report. Off by default. See Self-Management → Always-on discovery.

notify — completion notifications

"notify": {
 "enabled": false,
 "on": ["loop", "discovery"], // which completions fire (also "turn"); [] = loop+discovery
 "desktop": true, // OS notification on the host
 "bot": { "transport": "telegram", "target": "<chat-id>" }, // proactive bot message
 "web_push": false // browser Web Push (VAPID; secure-context only)
}

Pings you when blumi loop / discovery finishes. Off by default. See Self-Management → Completion notifications.

Push notifications (FCM)

The blugo phone app's Dispatch feature gets a push when a node finishes a turn. Push needs no entry in settings.json: it is enabled by the presence of a service-account file alone, so it is zero-config and independent of the notify block above (it never adds desktop/bot noise).

To turn push on, drop a Firebase service account on the gateway machine:

cp your-firebase-adminsdk.json ~/.blumi/fcm-service-account.json
chmod 600 ~/.blumi/fcm-service-account.json
  • The gateway auto-detects the file and sends turn-complete pushes via FCM HTTP v1 (a short-lived OAuth token is minted from the service account and cached in-process). The project_id is read from the file — nothing else to configure.
  • Device tokens are stored at ~/.blumi/fcm.json (the app registers them on connect) and are pruned automatically when FCM reports them stale (HTTP 404 / UNREGISTERED).
  • No file = silent no-op. Dispatch still works in-app; you just don't get backgrounded pushes.
  • Never commit the service account (or the app's google-services.json); both are gitignored. Keep the private key at ~/.blumi/fcm-service-account.json with mode 600.

Optional override: set notify.fcm.service_account_path (and notify.fcm.project_id) if you keep the file somewhere other than ~/.blumi/fcm-service-account.json. Push is Android-only today.

hooks — lifecycle hooks

"hooks": {
  "user_prompt_submit": [
    { "command": "git branch --show-current", "timeout_secs": 5 }
  ],
  "pre_tool_use": [
    { "command": "jq -e '.input.command|test(\"rm -rf\")|not' >/dev/null", "matcher": "Bash" }
  ]
}

user_prompt_submit injects each command's stdout as turn context; pre_tool_use can block a tool (non-zero exit). Off by default. See Self-Management → Lifecycle hooks.
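A pre_tool_use check does not have to be a jq one-liner. The sketch below is a rough Python counterpart of the jq example, assuming (as that example implies) that the hook receives a JSON payload on stdin with the command under input.command. The function name hook_exit_code is hypothetical, and unlike the jq version it lets unparseable or command-less payloads through.

```python
import json

def hook_exit_code(raw):
    """Return 1 (block the tool call) if the command contains "rm -rf", else 0 (allow)."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return 0  # assumption: this hook does not block payloads it cannot parse
    if not isinstance(payload, dict):
        return 0
    tool_input = payload.get("input")
    command = tool_input.get("command", "") if isinstance(tool_input, dict) else ""
    return 1 if isinstance(command, str) and "rm -rf" in command else 0

print(hook_exit_code('{"input": {"command": "rm -rf build"}}'))
print(hook_exit_code('{"input": {"command": "cargo test"}}'))
```

To use it as a hook, a script would finish with sys.exit(hook_exit_code(sys.stdin.read())) and be referenced from the "command" field with "matcher": "Bash".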


Tools & integrations

mcp_servers — Model Context Protocol tools

"mcp_servers": {
  "github": {
    "command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"],
    "env": { "GITHUB_TOKEN": "ghp_..." }, "enabled": true
  }
}

External MCP servers launched over stdio; their tools appear to the agent. {workspace} in args is substituted with the project path. See blumi mcp and CLI Usage.

lsp_servers — language servers (code intel)

"lsp_servers": {
 "rust": { "command": "rust-analyzer", "args": [], "extensions": ["rs"], "language_id": "rust" }
}

Power the Lsp code-intelligence tool (definitions, references, diagnostics).


Putting it together

A realistic everyday settings.json — Claude for the flagship, a cheap local brain, memory on, notifications to Telegram:

{
  "llm": { "provider": "anthropic", "model": "claude-sonnet-4-5" },
  "providers": {
    "anthropic": { "api_key_env": "ANTHROPIC_API_KEY" },
    "local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
  },
  "brain": { "mode": "advisory", "provider": "local", "model": "qwen2.5:3b" },
  "router": {
    "mode": "hybrid",
    "light": { "model": "claude-haiku-4-5" },
    "heavy": { "model": "claude-opus-4-5" }
  },
  "memory": { "enabled": true },
  "knowledge": { "enabled": true },
  "git": { "author_name": "Blumi", "author_email": "you@example.com" },
  "notify": { "enabled": true, "bot": { "transport": "telegram", "target": "123456789" } },
  "gateway": { "telegram": { "token": "123456:AA...", "allowed_chats": [123456789] } }
}

Auto-continue (token budget)

When a turn stops only because it hit the per-turn step cap, blumi auto-continues the same session and narrates each step — bounded by both llm.max_auto_continue (default 12) and llm.max_auto_continue_tokens (default 400k), whichever hits first. Tune live with /autocontinue <n> (0 disables).
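The "whichever hits first" rule can be sketched as a loop with two budgets. This is a simplified model, not blumi's code: it assumes the token ceiling is checked between rounds, and auto_continue and continue_once are hypothetical names. As in the llm block, 0 rounds means off and a token ceiling of 0 means no cap.

```python
def auto_continue(continue_once, max_rounds=12, max_tokens=400_000):
    """Run self-continue rounds until the work finishes or a budget is exhausted.

    continue_once() returns (tokens_used, hit_step_cap_again).
    Returns (rounds_run, tokens_spent, stop_reason).
    """
    if max_rounds == 0:
        return 0, 0, "disabled"
    rounds, tokens = 0, 0
    while rounds < max_rounds:
        if max_tokens and tokens >= max_tokens:  # 0 = no token cap
            return rounds, tokens, "token ceiling"
        used, hit_cap_again = continue_once()
        rounds += 1
        tokens += used
        if not hit_cap_again:
            return rounds, tokens, "finished"
    return rounds, tokens, "round limit"

def always_capped():
    """Simulated round that burns 50k tokens and hits the step cap again."""
    return 50_000, True

print(auto_continue(always_capped))                # token ceiling reached first
print(auto_continue(always_capped, max_tokens=0))  # only the round limit applies
```

With the defaults and 50k tokens per round, the token ceiling stops the run after 8 rounds, before the 12-round limit is reached.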

The memory files (MEMORY.md / USER.md)

Beyond semantic memory, two markdown files in ~/.blumi/ are read as a frozen snapshot each session: project MEMORY.md (agent notes) and USER.md (about you). View/edit them in the TUI (/memory), the web/phone Control Center, or on disk.
