-
Notifications
You must be signed in to change notification settings - Fork 0
Self Management
blumi can evolve itself: author its own skills, edit its own config (validated), and reload in place to apply both — no restart, conversation preserved. These are agent tools the model calls during a turn; just ask it in chat ("add a skill for ...", "set llm.temperature to 0.3 and reload").
Status: the self-management tools below are live. Triggering them directly from the phone app (buttons for reload / restart / edit config / build skills) and a restart-the-gateway capability are rolling out incrementally.
Reads/writes ~/.blumi/settings.json by dotted key, and can add personas. Every change is
validated (it must deserialize as a valid config) before an atomic write, so a bad edit is
rejected rather than corrupting your config.
Process-level settings (bind host/port, the gateway password, the grid identity) are read once at startup, so changing them needs a restart, not just a reload.
Create / update / delete a SKILL.md under ~/.blumi/skills/<name>/. Skills are
progressive-disclosure instructions: their name + description sit in the system prompt; the agent
loads the full body on demand. See blumi skills and CLI Usage.
Emits a reload: blumi re-reads settings.json, re-scans skills, and rebuilds the session seeded
from the current conversation (messages, todos, token counts preserved). Use it after
self_config / manage_skill.
The always-on gateway is supervised: launchd KeepAlive (macOS) and systemd Restart=always
(Linux) auto-restart it on crash. That's crash recovery for free — see Gateway.
For a wedged-but-alive session, a reload (above) rebuilds it without losing the conversation.
Beyond crash recovery, blumi treats reliability as a bounded control problem (after the self-healing-orchestrators paper) and turns repeated failures into durable improvements — wired into the failure taxonomy and the semantic memory.
-
Reflex recovery. A failed tool result is classified (bad args, state conflict, crash,
empty) and gets a budgeted, targeted recovery action — re-read-then-retry, an argument fix from
the tool's hint, narrow-the-query, or escalate. Only idempotent (read-only) tools auto-retry;
mutating tools (Bash, FileWrite, ...) escalate rather than blind-retry. It composes with the
doom-loop guard, and each attempt emits an observability trace (
⚕ self-heal ...inline in the TUI). -
Learns from failures. A successful recovery is stored as a failure→fix episode in the
agentmemory namespace, so it diffuses across the grid ; a similar future failure recalls the known fix and injects it as trailing guidance. Paths/secrets are redacted first. -
Evolves. Recurring failure clusters are mined (on the gateway sweep) into auto-written
recovery skills (low-risk, with a notice — via the same
manage_skillactuator above); anything risky — config / providers / secrets / deletes — raises an approval instead. The audit trail is kept asevolutionmemories. -
Confirmed. With
heal.verifyon, a recovery is marked verified only when the retried tool actually succeeds on a later step (ground truth, not just "a fix was suggested"), and the fix that worked has its utility reinforced.
Configure it (~/.blumi/settings.json; defaults shown):
"heal": { "enabled": true, "recovery_budget": 2, "verify": false, "learn": true, "evolve": "auto", "redact_paths": true }
-
evolve—"auto"(apply low-risk skills automatically, with a notice) ·"propose"(mine + always ask) ·"off"(kill switch: still recover & learn, but never self-modify). -
recovery_budgetmax recovery attempts per turn ·verifyrequire cross-step success ·learnwrite failure→fix episodes ·redact_pathsscrub paths/secrets before storage ·enabledmaster switch.
See it: the TUI /heal overlay (recovery / evolution / proposal counts + recent items), the
inline ⚕ self-heal traces, the blugo Heal tab, or GET /api/heal on the gateway.
Per turn, blumi can pick a difficulty tier and route to a light vs flagship model — simple, mechanical work doesn't have to burn the flagship's price. Off by default. The mechanism is hybrid: a fast, zero-cost heuristic (prompt length, tool count, iteration depth, keyword hints) decides most turns, and only ambiguous turns consult a small local judge model (which fails safe to the cheap tier — an unreachable judge never silently upgrades you).
- Delegated sub-agents default to the cheap tier (
router.subagent_tier), so investigation/fan-out is cheap while the main agent stays sharp. - Model swaps are prompt-cache-safe: blumi only changes the model when the tier actually changes, and on a long turn it escalates Light→Heavy on deep iterations but never demotes mid-turn.
- On a grid,
router.prefer_grid_lightcan run the light tier on a peer's local model (free).
"router": { "mode": "hybrid", // off | heuristic | hybrid | judge "light": { "provider": "", "model": "claude-haiku-4-5" }, "heavy": { "provider": "", "model": "claude-opus-4-5" }, "judge": { "provider": "", "model": "" }, // empty = reuse brain.*, then llm.* "subagent_tier": "light", // light | heavy | inherit "prefer_grid_light": false }
See it: the TUI /route overlay (per-tier turns/tokens + $ saved vs all-heavy) or
GET /api/route. /route off|heuristic|hybrid|judge switches the mode live. Empty tier
provider/model reuse the active llm.*. (Cost is estimated from each model's list price; local /
unpriced models show as free.)
When you step away, blumi can keep finding what's worth doing. With it enabled, the always-on
gateway periodically runs a read-only discovery pass that surfaces candidate tasks for the
workspace, adds them to the task board as Discovered: todos,
and lands a markdown report in ~/.blumi/reports/ plus an agent-namespace discovery memory.
Off by default.
"always_on": { "enabled": true, "autonomy": "propose", // off | propose | auto "cadence_secs": 900, "min_interval_secs": 300, // rate-limit floor "skip_if_todos": 1, // skip while the board already has todos "max_open_discoveries": 5, "max_per_pass": 3 }
-
Safe by design: the pass runs one bounded turn with
yolo = false, so approval-requiring (mutating) tools are denied — it can read + reason but not change anything. It's gated by cadence + rate-limit + a board-busy check + an open-discovery cap, and stored text is path/secret-redacted. -
proposeadds Todo tasks + a report (you run them).autois reserved for autonomously running low-risk discoveries in an isolated git worktree/snapshot — a planned follow-up; today it behaves likepropose.
See it: blumi serve status (an always-on: line), GET /api/always-on (recent + reports), or
the TUI /discoveries overlay.
Claude-Code-style extension points. v1 implements UserPromptSubmit: when you submit a prompt,
blumi runs your configured shell commands and injects each command's stdout as background context
for that turn. The prompt is piped to the hook's stdin, the command runs in the workspace, and its
output lands as a cache-safe trailing message (never the cached system prefix). Use it to stamp every
turn with live context — current git branch, active ticket, deploy status, a lint summary, etc.
"hooks": { "user_prompt_submit": [ { "command": "git branch --show-current", "timeout_secs": 5 }, { "command": "cat .blumi/context.md" } ] }
Hooks are trusted — they're your own commands from settings.json (the same trust model as cron),
so blumi runs them as written. Each is bounded by timeout_secs (default 10s) so a hung hook can't
stall a turn; a hook that times out or exits non-zero is simply skipped.
pre_tool_use(a hook that can block a tool call) is reserved in the config but not yet wired — the tool-blocking path lands in a later, security-reviewed pass.
- Config writes are validated + atomic (temp file → rename), mode
0600. - Skill names are slug-jailed (no path traversal).
- These tools are powerful; under non-YOLO sessions, mutating actions still surface an approval card. Keep destructive operations behind "ask" in your permissions.