Configuration

ankurCES edited this page Jun 9, 2026 · 14 revisions

blumi (a single-binary coding agent for macOS and Linux) is configured by one JSON file: ~/.blumi/settings.json. Everything blumi does — provider, model, permissions, memory, the gateway, the grid — is driven from that file, and every section is optional because blumi ships with sensible defaults. An empty file is valid, and for most people a single provider key (set via blumi login) is all that's needed to start. This page is the complete reference: every section, its fields, its defaults, and copy-pasteable examples.

TL;DR / Key facts

  • Where config lives: ~/.blumi/settings.json (global). Per-project overrides go in ./.blumi/settings.json.
  • File format: plain JSON, written with file mode 0600; it holds secrets, so never commit it.
  • Fastest setup: run blumi login — a wizard that picks a provider, takes a key, chooses a model, and writes the right fields.
  • Everything is optional: blumi has built-in defaults, so an empty settings.json is valid.
  • Load order (later wins): built-in defaults → ~/.blumi/settings.json → ./.blumi/settings.json → BLUMI_-prefixed environment variables → per-invocation flags.
  • Override per run: flags like --provider, --model, --persona, and --yolo beat the file.
  • Apply changes live: ask the agent to reload_self (or use the web/phone "reload"); process-level settings need a restart.

The annotated JSON blocks below use // comments for clarity. Strip them if your editor is strict — settings.json is plain JSON.
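If you want to paste one of the annotated blocks into a strict JSON parser, the comments have to go first. The following is a minimal Python sketch of one way to do that, assuming comments only ever start with // outside string literals. The helper name strip_line_comments is hypothetical and not part of blumi; the point of the example is that the slashes inside a base_url must survive.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that sit outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


annotated = '''{
  "providers": {
    "openai": { "base_url": "https://api.openai.com/v1" } // URL slashes survive
  },
  "llm": { "max_iterations": 25 } // tool steps allowed per turn
}'''

settings = json.loads(strip_line_comments(annotated))
print(settings["providers"]["openai"]["base_url"])
```

A naive "delete everything after //" pass would truncate the URL, which is why the sketch tracks whether it is inside a string.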

How is blumi config loaded? (layering)

blumi merges configuration from four layers in order, where later layers win, using the figment library:

  1. Built-in defaults
  2. ~/.blumi/settings.json — global, your main file
  3. ./.blumi/settings.json — per-project overrides (commit-safe project tweaks; secrets stay global)
  4. Environment variables — prefix BLUMI_, nest with __ (e.g. BLUMI_LLM__MODEL=claude-opus-4-5, BLUMI_GRID__SECRET=..., BLUMI_PROVIDERS__OPENAI__API_KEY=...)

settings.json is written mode 0600 and holds secrets — never commit it (the repo's .blumi/ is gitignored). Per-invocation flags (--provider, --model, --persona, --yolo) beat all of the above.
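The layering rule above can be illustrated with a small Python sketch. This is not blumi's implementation (blumi uses figment); it only models "later layers win" with a recursive merge and shows how a BLUMI_-prefixed variable with __ separators could map onto nested keys. The default values, the lower-casing of key names, and the helper names deep_merge and env_layer are assumptions made for illustration, and the sketch leaves environment values as strings.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def env_layer(environ: dict, prefix: str = "BLUMI_") -> dict:
    """Turn BLUMI_A__B=value into {"a": {"b": "value"}} (assumed mapping)."""
    layer: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        path = name[len(prefix):].lower().split("__")
        node = layer
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return layer


# Placeholder layers; the real built-in defaults are not shown on this page.
defaults = {"llm": {"provider": "anthropic", "model": "", "max_iterations": 25}}
global_file = {"llm": {"model": "claude-sonnet-4-5"}}
project_file = {"llm": {"max_iterations": 40}}
environ = {"BLUMI_LLM__MODEL": "claude-opus-4-5", "HOME": "/home/user"}

config = defaults
for layer in (global_file, project_file, env_layer(environ)):
    config = deep_merge(config, layer)

print(config["llm"])
```

In this toy run the environment variable overrides the model from the global file, while the project file's max_iterations survives because no later layer touches it.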

What is the fastest way to configure blumi? (the login wizard)

The fastest way to configure blumi is the interactive login wizard, which writes settings.json for you:

blumi login

Pick a provider, paste a key (or endpoint), choose a model — it writes the right bits into settings.json. Re-run any time to switch. That's all most people need; the rest of this page is for tuning.

How do I apply config changes?

How you apply a change depends on the setting: most are hot-reloaded, but a few process-level settings need a restart.

  • TUI / gateway session: ask the agent to reload_self, or use the web/phone "reload" — it re-reads settings.json without losing the conversation.
  • Process-level settings (bind host/port, the web password, the grid identity) are read once at startup, so they need a restart: launchctl kickstart -k gui/$(id -u)/com.blumi.serve (macOS) / systemctl --user restart blumi-serve (Linux), or just relaunch blumi tui.

Providers & models

How do I configure a provider and API key?

blumi is provider-agnostic, so you configure a provider by adding an entry under providers in settings.json (or, more easily, by running blumi login). Built-in presets exist for the common providers so you usually set only a key. The presets, keyed by name in providers, are:

Preset name | kind              | Notes
anthropic   | anthropic         | Claude; API-key auth only
openai      | openai_compat     | OpenAI and any OpenAI-compatible endpoint (set base_url)
gemini      | gemini            | Google Gemini (native client)
azure       | anthropic_foundry | Azure AI Foundry (Anthropic models)
local       | openai_compat     | A local server (llama.cpp / Ollama-compatible); no key
mock        |                   | Deterministic, for tests/demos

Each provider entry:

"providers": {
 "anthropic": { "api_key_env": "ANTHROPIC_API_KEY" }, // read key from env (preferred)
 "openai": { "api_key": "sk-...", "base_url": "https://api.openai.com/v1" },
 "local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
}

  • api_key_env (read the key from an env var) is preferred over a literal api_key so the key never sits in the file.
  • base_url points OpenAI-compatible clients at any endpoint (Ollama, llama.cpp, OpenRouter, ...).
  • kind is only needed for a fully custom provider name; the presets above set it for you.
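Because api_key_env is the preferred form, it can be handy to spot provider entries that still carry a literal key. The Python sketch below is a hypothetical lint, not a blumi command; it only inspects the providers section shown above and reports entries with a non-empty api_key.

```python
def providers_with_literal_keys(settings: dict) -> list[str]:
    """Names of provider entries that carry a non-empty literal api_key."""
    providers = settings.get("providers", {})
    return sorted(
        name
        for name, entry in providers.items()
        if isinstance(entry, dict) and entry.get("api_key")
    )


settings = {
    "providers": {
        "anthropic": {"api_key_env": "ANTHROPIC_API_KEY"},
        "openai": {"api_key": "sk-...", "base_url": "https://api.openai.com/v1"},
        "local": {"kind": "openai_compat", "base_url": "http://localhost:11434/v1"},
    }
}

for name in providers_with_literal_keys(settings):
    print(f"providers.{name}: consider api_key_env instead of a literal api_key")
```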

How do I set the model? (llm — the active model + turn limits)

The llm section selects which provider and model are active and sets the per-turn limits. To change the model, set llm.model (or run blumi login, or pass --model for a single run):

"llm": {
 "provider": "anthropic", // which entry in "providers"
 "model": "claude-sonnet-4-5", // "" = let the provider pick/probe
 "context_size": 131072,
 "max_output_tokens": 16384,
 "temperature": 0.7,
 "top_p": 0.8,
 "top_k": 20,
 "max_iterations": 25, // tool steps allowed per turn
 "max_auto_continue": 12, // self-continue rounds when a turn hits the step cap (0 = off)
 "max_auto_continue_tokens": 400000, // token ceiling for one auto-continue run (0 = no cap)
 "wake_on_rollover": true, // a context rollover (compaction) refreshes the token budget so a long task keeps going
 "max_local_agents": 4 // max concurrent local sub-agents (overflow → grid or queue)
}

Override per run without editing: blumi --provider openai --model gpt-4o run "...".


Agent behavior

How do I control what the agent may do? (permissions)

The permissions section defines what the agent may do unattended, via per-tool allow / deny / ask pattern lists:

"permissions": {
 "yolo": false, // true = auto-approve EVERYTHING (use only sandboxed)
 "tools": {
 "Bash": { "deny": ["rm -rf*", "sudo*"], "ask": ["git push*"] },
 "FileWrite": { "allow": ["src/**"] }
 }
}

Sessions are interactive by default: mutating tools surface an approval card. Toggle YOLO with ctrl+y, /yolo, or the --yolo flag. For real guardrails, pair permissions with hooks.pre_tool_use.
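To make the pattern lists concrete, here is a Python sketch of how such rules could be evaluated. It is an illustration under stated assumptions, not blumi's matcher: it assumes shell-style glob semantics (via fnmatch), assumes deny beats ask and ask beats allow, and assumes an unmatched call falls back to asking. The function name decide is hypothetical.

```python
from fnmatch import fnmatchcase


def decide(tool_rules: dict, subject: str, default: str = "ask") -> str:
    """Return "deny", "ask", or "allow" for one tool invocation."""
    # Assumed precedence: deny beats ask, ask beats allow.
    for verdict in ("deny", "ask", "allow"):
        if any(fnmatchcase(subject, p) for p in tool_rules.get(verdict, [])):
            return verdict
    return default  # assumed fallback when nothing matches


tools = {
    "Bash": {"deny": ["rm -rf*", "sudo*"], "ask": ["git push*"]},
    "FileWrite": {"allow": ["src/**"]},
}

print(decide(tools["Bash"], "rm -rf /tmp/build"))
print(decide(tools["Bash"], "git push origin main"))
print(decide(tools["FileWrite"], "src/main.rs"))
print(decide(tools["Bash"], "ls -la"))
```

Under these assumptions a prefix pattern such as rm -rf* catches any command that starts with it, which is also why a pattern list alone is easy to sidestep and a pre_tool_use hook is the stronger guard.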

How do I change the agent's style? (persona + personas)

The persona field selects the active agent style, and personas defines custom ones (each can override the model and temperature):

"persona": "default", // active persona (built-in or custom)
"personas": {
 "reviewer": {
 "description": "Careful code reviewer",
 "instructions": "Be terse. Prioritise correctness + security. Propose diffs.",
 "model": "claude-opus-4-5", // optional: switch model when active
 "temperature": 0.2 // optional override
 }
}

Built-ins include architect, pair, reviewer, team. Switch with --persona <name> or /persona.

Where do the agent's tools run? (executor)

The executor section chooses where tool calls execute: the local host, a Docker sandbox, or a remote machine over SSH:

"executor": {
 "backend": "local", // "local" (host) | "docker" (sandbox) | "ssh" (remote)
 "docker_image": "debian:stable-slim",
 "ssh_host": "user@box", // for backend = "ssh"
 "ssh_workdir": "/home/user/proj"
}

Use docker (or ssh to a throwaway box) when you want YOLO/gateway autonomy without risking the host.

How do I set up a local-LLM approval reviewer? (brain)

The brain section configures a cheap local model that vets approval prompts so the flagship model is not interrupted:

"brain": {
 "mode": "off", // "off" | "advisory" (annotate) | "auto" (decide)
 "provider": "local", // "" = reuse the main agent's provider
 "model": "qwen2.5:3b" // a small/cheap/local model is ideal here
}

The brain escalates to you when it is uncertain or the action looks dangerous.

How do I set up cost-aware model routing? (router)

The router section enables cost-aware routing that, per turn, picks a light model versus the flagship to cut cost. It is off by default:

"router": {
 "mode": "off", // "off" | "heuristic" | "hybrid" | "judge"
 "light": { "provider": "", "model": "claude-haiku-4-5" },
 "heavy": { "provider": "", "model": "claude-opus-4-5" },
 "judge": { "provider": "", "model": "" }, // "" = reuse brain.* then llm.*
 "subagent_tier": "light", // "light" | "heavy" | "inherit"
 "prefer_grid_light": false, // run the light tier on a grid peer's local model (free)
 "heuristics": {
 "heavy_chars": 1500, "light_chars": 280, "heavy_tool_count": 12,
 "escalate_iteration": 6, "heavy_keywords": [], "light_keywords": []
 }
}

An empty tier provider or model reuses the active llm.* value. See Self-Management → Cost-aware routing.

How does blumi self-heal? (heal — self-healing & evolution)

The heal section controls self-healing: blumi auto-recovers failed tool calls, learns fixes, and mines recurring ones into recovery skills. It is on by default:

"heal": {
 "enabled": true,
 "recovery_budget": 2, // recovery attempts per turn
 "verify": false, // only mark a fix "verified" if a later step succeeds
 "learn": true, // store failure→fix episodes in memory
 "evolve": "auto", // "auto" | "propose" | "off" (mine recurring fixes → skills)
 "redact_paths": true
}

See Self-Management → Self-healing.


Memory & knowledge

How do I configure memory, embeddings, and code search?

The embeddings, memory, and knowledge sections power recall, semantic memory, and code search, and all three are on by default. The bundled local embedder downloads a ~90 MB model on first use, then works fully offline (RAG, or retrieval-augmented generation, runs entirely on your machine). See Memory & Knowledge.

"embeddings": {
 "enabled": true,
 "backend": "local", // "local" (bundled ONNX) | "openai" (endpoint) | "grid" (peer)
 "provider": "", // for backend = "openai": a name from "providers"
 "model": "bge-small-en-v1.5",
 "dim": 384
},
"memory": {
 "enabled": true,
 "recall_k": 5, // memories injected per turn (RAG)
 "dedup_threshold": 0.92, // admission gate: near-duplicates are merged
 "max_per_namespace": 2000,
 "diffuse": true, // share non-`user` learnings across the grid
 "sweep_secs": 60, // governance cadence (eviction/consolidation)
 "resolve_conflicts": false, // opt-in: LLM supersedes contradictory memories in the sweep
 "retrospect": true, // daily: replay each session's new transcript → consolidate learnings
 "retrospect_hours": 24 // min hours between retrospection passes
},
"knowledge": {
 "enabled": true,
 "max_file_kb": 256, // skip files larger than this when indexing
 "exclude": ["target", "node_modules", ".git", "dist"],
 "graph": {
 "mode": "lite", // off | lite | structural (structural needs the `code-graph` build feature)
 "resolve_imports": true, // resolve use/import when building structural edges
 "max_edges_per_symbol": 64, // fan-out cap (0 = uncapped)
 "rpl_impact": true // feed a file's code-graph fan-in into the RPL blast radius
 }
}

  • Semantic memory accrues automatically (failure→fix episodes, the agent's memory tool); browse it with /memories (TUI) or the web/phone Control Center (where you can pin/edit/delete). The user namespace never leaves your node. Tell the agent "remember that ..." to force-store; pin an entry to exempt it from eviction.
  • Knowledge is empty until you index a repo: blumi knowledge ingest . then blumi knowledge status (or /knowledge in the TUI). Powers code_search / code_retrieve.
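The memory.dedup_threshold admission gate can be pictured with a short Python sketch. It assumes the comparison is cosine similarity between embedding vectors and that a score at or above the threshold counts as a near-duplicate; neither detail is stated on this page. The toy vectors have 3 dimensions rather than the 384 of the bundled embedder, and the names cosine and admit are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def admit(candidate: list[float], stored: list[list[float]], threshold: float = 0.92) -> str:
    """Return "merge" when the candidate is a near-duplicate, else "store"."""
    if any(cosine(candidate, vec) >= threshold for vec in stored):
        return "merge"
    return "store"


stored = [[1.0, 0.0, 0.0]]
print(admit([0.99, 0.05, 0.0], stored))  # points almost the same way as the stored vector
print(admit([0.0, 1.0, 0.0], stored))    # orthogonal to the stored vector
```

Raising the threshold toward 1.0 would admit more near-identical memories; lowering it would merge more aggressively.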

How do I set the embedder's execution provider? (acceleration)

The acceleration section picks the hardware execution provider for the bundled embedder (CPU, Apple CoreML, or CUDA):

"acceleration": {
 "mode": "auto", // "auto" | "cpu" | "apple" (CoreML) | "cuda"
 "embeddings_accel": "auto" // override just the bundled embedder
}

blumi accel doctor shows what was detected. See Memory & Knowledge → GPU acceleration.


Surfaces (web, phone, bots, grid)

How do I set the web / gateway password? (web)

The web section holds the authentication hash for the gateway and browser UI. Set it with blumi serve pair --password <pw> rather than editing the hash by hand:

"web": { "password_hash": "" } // set via `blumi serve pair --password <pw>` (argon2; never plaintext)

A password is required when binding to a non-loopback address. The same server serves the browser UI and the blugo phone app. See Gateway.
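The loopback rule can be checked ahead of time with a few lines of Python. This sketch only models the rule as stated (a password is needed for any non-loopback bind address); how blumi itself classifies addresses is not documented here, and password_required is a hypothetical helper.

```python
import ipaddress


def password_required(bind_host: str) -> bool:
    """True when the bind address may be reachable from other machines."""
    if bind_host == "localhost":
        return False
    try:
        return not ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): assume it may be reachable.
        return True


for host in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.20"):
    print(host, password_required(host))
```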

How do I enable voice? (voice — speech-to-text + text-to-speech)

The voice section configures speech-to-text (STT) and text-to-speech (TTS); it is disabled until you set enabled: true and supply a key:

"voice": {
 "enabled": false,
 "api_key": "",
 "stt_base_url": "https://api.openai.com/v1",
 "stt_model": "whisper-1",
 "tts_provider": "openai", // "openai" | "elevenlabs"
 "tts_base_url": "", // blank = provider default
 "tts_model": "tts-1",
 "tts_voice": "alloy", // OpenAI voice name, or an ElevenLabs voice id
 "tts_api_key": "" // separate TTS key (falls back to api_key)
}

See Voice.

How do I run blumi as a chat bot? (gateway)

The gateway section configures blumi as a chat bot across Telegram, Discord, Slack, and WhatsApp, each keyed by its own tokens:

"gateway": {
 "yolo": false, // auto-approve in bot sessions (only with a sandboxed executor!)
 "telegram": {
 "token": "123456:AA...", // @BotFather
 "allowed_chats": [], // [] = anyone who messages it; or [<your-chat-id>]
 "voice": false // transcribe voice notes + speak replies (needs voice.* too)
 },
 "discord": { "token": "", "allowed_channels": [] },
 "slack": { "bot_token": "xoxb-...", "app_token": "xapp-..." },
 "whatsapp": { "token": "", "phone_number_id": "", "verify_token": "", "webhook_port": 8080 }
}

Run one transport in the foreground (blumi gateway telegram) or all configured ones as a service (blumi gateway install). One bot per token — see Gateway → Messaging-bot gateway.

How do I set up a distributed grid? (grid)

The grid section forms a distributed, multi-node grid: several blumi serve nodes that share a secret discover each other and hand tasks to peers:

"grid": {
 "enabled": false,
 "secret": "", // same value on every node = same grid (or BLUMI_GRID__SECRET)
 "grid_id": "", // blank = derived from the secret digest
 "node_name": "", // blank = hostname
 "peers": ["10.0.0.150:7777"] // static peers (in addition to mDNS auto-discovery)
}

The secret is never advertised; only a non-sensitive digest of it is. See Grid.

How do I attach remote instances and workspaces? (remote + workspaces)

The remote section lists remote blumi instances the TUI (terminal UI) can attach to as tabs, and workspaces lists directories scanned for sibling git repos in the TUI sidebar:

"remote": { "instances": [] }, // remote blumi instances the TUI can attach to as tabs
"workspaces": { "roots": ["~/code"] } // dirs scanned for sibling git repos in the TUI sidebar

How do I set the commit identity? (git)

The git section sets the author identity stamped on commits the agent makes:

"git": { "author_name": "Blumi", "author_email": "you@example.com" }

The identity is applied through GIT_AUTHOR_* / GIT_COMMITTER_*. An empty author_name disables the override.


Autonomy & notifications

How do I enable proactive discovery? (always_on)

The always_on section lets the gateway run a read-only discovery pass when idle and propose tasks plus a report. It is off by default:

"always_on": {
 "enabled": false,
 "autonomy": "propose", // "off" | "propose" (add tasks) | "auto" (reserved)
 "cadence_secs": 900,
 "min_interval_secs": 300,
 "skip_if_todos": 1, // skip while the board already has todos
 "max_open_discoveries": 5,
 "max_per_pass": 3
}

See Self-Management → Always-on discovery.

How do I get completion notifications? (notify)

The notify section pings you when blumi loop or a discovery pass finishes, via desktop, a chat bot, or browser Web Push. It is off by default:

"notify": {
 "enabled": false,
 "on": ["loop", "discovery"], // which completions fire (also "turn"); [] = loop+discovery
 "desktop": true, // OS notification on the host
 "bot": { "transport": "telegram", "target": "<chat-id>" }, // proactive bot message
 "web_push": false // browser Web Push (VAPID; secure-context only)
}

See Self-Management → Completion notifications.

How do I enable phone push notifications? (FCM)

Phone push uses Firebase Cloud Messaging (FCM) and, by default, needs no settings.json entry: it is enabled by the presence of a service-account file alone. The blugo phone app's Dispatch feature gets a push when a node finishes a turn. Push is independent of the notify block above, so it never adds desktop or bot noise.

To turn push on, drop a Firebase service account on the gateway machine:

cp your-firebase-adminsdk.json ~/.blumi/fcm-service-account.json
chmod 600 ~/.blumi/fcm-service-account.json

  • The gateway auto-detects the file and sends turn-complete pushes via FCM HTTP v1 (a short-lived OAuth token is minted from the service account and cached in-process). The project_id is read from the file, so there is nothing else to configure.
  • Device tokens are stored at ~/.blumi/fcm.json (the app registers them on connect) and are pruned automatically when FCM reports them stale (HTTP 404 / UNREGISTERED).
  • No file = silent no-op. Dispatch still works in-app; you just don't get backgrounded pushes.
  • Never commit the service account (or the app's google-services.json) — both are gitignored. The private key stays at ~/.blumi/fcm-service-account.json (chmod 600) and never enters git.

Optional override: set notify.fcm.service_account_path (and notify.fcm.project_id) if you keep the file somewhere other than ~/.blumi/fcm-service-account.json. Push is Android-only today.
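The "file presence" behaviour, together with the optional override, can be sketched in Python as follows. This is a rough model rather than the gateway's code: it assumes an explicit notify.fcm.project_id takes precedence over the value in the file, and the function name fcm_project_id is hypothetical. The demo uses a temporary directory instead of ~/.blumi so it does not touch real credentials.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

DEFAULT_ACCOUNT = Path.home() / ".blumi" / "fcm-service-account.json"


def fcm_project_id(settings: dict, default_path: Path = DEFAULT_ACCOUNT) -> Optional[str]:
    """Return the Firebase project_id if push is enabled, else None."""
    fcm = settings.get("notify", {}).get("fcm", {})
    path = Path(fcm.get("service_account_path") or default_path).expanduser()
    if not path.is_file():
        return None  # no file = silent no-op
    account = json.loads(path.read_text())
    # Assumption: an explicit notify.fcm.project_id wins over the file's value.
    return fcm.get("project_id") or account.get("project_id")


with tempfile.TemporaryDirectory() as tmp:
    account_file = Path(tmp) / "fcm-service-account.json"
    missing = fcm_project_id({}, default_path=account_file)
    account_file.write_text(json.dumps({"project_id": "demo-project"}))
    found = fcm_project_id({}, default_path=account_file)

print(missing, found)
```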

How do I add lifecycle hooks? (hooks)

The hooks section runs your own shell commands at lifecycle points to inject context or gate tool calls. It is off by default:

"hooks": {
 "user_prompt_submit": [
 { "command": "git branch --show-current", "timeout_secs": 5 }
 ],
 "pre_tool_use": [
 { "command": "jq -e '.input.command|test(\"rm -rf\")|not' >/dev/null", "matcher": "Bash" }
 ]
}

user_prompt_submit injects each command's stdout as turn context; pre_tool_use blocks the tool call when its command exits non-zero. See Self-Management → Lifecycle hooks.
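For checks that outgrow a jq one-liner, the same pre_tool_use decision can be written in Python. The sketch below mirrors the jq example and assumes, as that example implies, that the hook receives a JSON payload whose input.command field holds the Bash command. The payload shape beyond that field and the function name hook_exit_code are assumptions. Unlike the jq filter, this version lets a payload without a command through.

```python
import json


def hook_exit_code(raw_payload: str) -> int:
    """Return 0 to let the tool call proceed, 1 to block it."""
    payload = json.loads(raw_payload)
    command = (payload.get("input") or {}).get("command") or ""
    return 1 if "rm -rf" in command else 0


allowed = hook_exit_code('{"input": {"command": "ls -la"}}')
blocked = hook_exit_code('{"input": {"command": "rm -rf build/"}}')
print(allowed, blocked)
```

To wire this up as a real hook, a wrapper script would pass the text read from standard input to hook_exit_code and exit with the returned code, then be referenced from the command field with "matcher": "Bash".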


Tools & integrations

How do I add MCP servers? (mcp_servers)

The mcp_servers section registers external Model Context Protocol (MCP) servers, launched over stdio, whose tools then appear to the agent:

"mcp_servers": {
 "github": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"],
 "env": { "GITHUB_TOKEN": "ghp_..." }, "enabled": true }
}

{workspace} in args is substituted with the project path. See blumi mcp and CLI Usage.
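The {workspace} substitution amounts to a plain string replacement on each argument. A tiny Python sketch of the idea, in which the server name some-mcp-server and the --root flag are made up for illustration:

```python
def expand_args(args: list[str], workspace: str) -> list[str]:
    """Replace every {workspace} placeholder with the project path."""
    return [arg.replace("{workspace}", workspace) for arg in args]


# Hypothetical args list; only the {workspace} token comes from this page.
args = ["-y", "some-mcp-server", "--root", "{workspace}"]
expanded = expand_args(args, "/home/user/proj")
print(expanded)
```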

How do I add language servers? (lsp_servers)

The lsp_servers section registers language servers (LSP, the Language Server Protocol) that power blumi's code-intelligence tool — definitions, references, and diagnostics:

"lsp_servers": {
 "rust": { "command": "rust-analyzer", "args": [], "extensions": ["rs"], "language_id": "rust" }
}

These servers back the agent's Lsp tool.


Putting it together

What does a complete settings.json look like?

Here is a realistic everyday settings.json — Claude for the flagship model, a cheap local brain, memory on, and notifications to Telegram:

{
 "llm": { "provider": "anthropic", "model": "claude-sonnet-4-5" },
 "providers": {
 "anthropic": { "api_key_env": "ANTHROPIC_API_KEY" },
 "local": { "kind": "openai_compat", "base_url": "http://localhost:11434/v1" }
 },
 "brain": { "mode": "advisory", "provider": "local", "model": "qwen2.5:3b" },
 "router": { "mode": "hybrid", "light": { "model": "claude-haiku-4-5" },
 "heavy": { "model": "claude-opus-4-5" } },
 "memory": { "enabled": true },
 "knowledge": { "enabled": true },
 "git": { "author_name": "Blumi", "author_email": "you@example.com" },
 "notify": { "enabled": true, "bot": { "transport": "telegram", "target": "123456789" } },
 "gateway": { "telegram": { "token": "123456:AA...", "allowed_chats": [123456789] } }
}

How does auto-continue work? (token budget)

Auto-continue lets a turn keep going past the per-turn step cap (llm.max_iterations). When a turn stops only because it hit that cap, blumi auto-continues the same session and narrates each step. The run is bounded by both llm.max_auto_continue (default 12 rounds) and llm.max_auto_continue_tokens (default 400k tokens), whichever is reached first. Tune live with /autocontinue <n> (0 disables).
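The interaction of the two limits is easiest to see in a small simulation. The Python sketch below is a model, not blumi's scheduler: it assumes a round is allowed to start whenever the token total so far is still under the ceiling (so the final round may overshoot it), and it treats 0 as "off" for the round limit and "no cap" for the token limit, as the llm comments describe. The function name and the per-round costs are invented.

```python
def auto_continue_rounds(round_costs, max_rounds=12, max_tokens=400_000):
    """Count auto-continue rounds that run before a limit stops them.

    round_costs lists the tokens each successive round would consume; the
    run also ends when the list is exhausted (the task finished).
    """
    rounds = 0
    used = 0
    for cost in round_costs:
        if rounds >= max_rounds:
            break  # round limit reached (0 means auto-continue is off)
        if max_tokens and used >= max_tokens:
            break  # token ceiling reached (0 means no cap)
        used += cost
        rounds += 1
    return rounds, used


print(auto_continue_rounds([150_000] * 5))                # token ceiling stops it
print(auto_continue_rounds([10_000] * 20))                # round limit stops it
print(auto_continue_rounds([150_000] * 5, max_rounds=0))  # disabled
```

With expensive rounds the token ceiling ends the run long before 12 rounds; with cheap rounds the round limit is what bites.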

What are the memory files? (MEMORY.md / USER.md)

Beyond semantic memory, blumi reads two markdown files in ~/.blumi/ as a frozen snapshot each session: project MEMORY.md (agent notes) and USER.md (about you). View or edit them in the TUI (/memory), the web/phone Control Center, or on disk.


FAQ

Where does blumi store its config?

blumi stores its configuration in ~/.blumi/settings.json (a single plain-JSON file written with file mode 0600). Per-project overrides live in ./.blumi/settings.json, and BLUMI_-prefixed environment variables override both. Other state under ~/.blumi/ includes the memory files MEMORY.md and USER.md, and optional credentials such as fcm-service-account.json.

Do I need an API key?

You need an API key for hosted providers (Anthropic, OpenAI, Gemini, Azure), set most safely via api_key_env so the key stays out of the file. You do not need a key for a local, OpenAI-compatible server (such as Ollama or llama.cpp): use the local preset with a base_url and no key.

How do I change the model?

Set llm.model (and llm.provider) in settings.json, or run blumi login to pick interactively. To change the model for a single run without editing the file, pass the flag: blumi --provider openai --model gpt-4o run "...".

Can I run blumi fully offline?

Yes. Point llm at a local, keyless provider (the local preset with an Ollama/llama.cpp base_url), and keep embeddings.backend set to local — the bundled ONNX embedder downloads a ~90 MB model on first use and then works fully offline, so memory, RAG recall, and code search all run without a network.

How do I apply config changes without restarting?

Ask the agent to reload_self (or click "reload" in the web/phone UI) to re-read settings.json without losing the conversation. Only process-level settings — the bind host/port, the web password, and the grid identity — are read once at startup and require a restart.

Is it safe to commit settings.json?

No — settings.json holds secrets and is written mode 0600, so you should never commit it (the repo's .blumi/ is gitignored). Keep secrets in your global ~/.blumi/settings.json (or in environment variables via api_key_env) and reserve the commit-safe ./.blumi/settings.json for non-secret per-project tweaks.
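If you copy or restore settings.json by hand, it is worth confirming that the file is still owner-only. A minimal Python sketch for macOS and Linux follows; the helper name is_owner_only is hypothetical, and the demo runs against a temporary file rather than your real settings.

```python
import os
import stat
import tempfile


def is_owner_only(path: str) -> bool:
    """True when group and others have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o600)
private = is_owner_only(demo_path)   # mode 0600: owner read/write only
os.chmod(demo_path, 0o644)
shared = is_owner_only(demo_path)    # mode 0644: readable by group and others
os.remove(demo_path)

print(private, shared)
```

Pointing the same check at os.path.expanduser("~/.blumi/settings.json") tells you whether a chmod 600 is needed.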

How do I keep the agent from doing dangerous things?

Use the permissions section with per-tool deny / ask pattern lists, leave yolo set to false (its default), and for real guardrails add a hooks.pre_tool_use hook that blocks a tool by exiting non-zero. When you do want full autonomy, pair it with a sandboxed executor (docker or ssh).
