Self Management

ankurCES edited this page Jun 8, 2026 · 12 revisions

Self-Management

Self-management is the set of capabilities that let blumi — a local-first, provider-agnostic AI coding agent that runs as one Rust binary on your own machines — modify and recover itself at runtime. blumi can author its own skills, edit its own config (validated), reload in place, recover from crashes, learn from failures, and run an adversarial safety review before risky actions. This page documents every self-management tool and subsystem, with the exact ~/.blumi/settings.json keys that control each one.

TL;DR / Key facts:

  • blumi self-edits via three agent tools the model calls in a turn: self_config (edit settings), manage_skill (author skills), and reload_self (apply both in place).
  • A reload_self re-reads config and re-scans skills without losing the conversation (messages, todos, and token counts are preserved); no process restart is needed.
  • The always-on gateway auto-restarts on crash via launchd KeepAlive (macOS) or systemd Restart=always (Linux) — crash recovery is free.
  • Self-healing classifies failed tool results and applies budgeted, targeted recovery; only verified fixes (the tool later succeeds) are learned, mined into skills, and diffused across the grid.
  • RPL (Raskolnikov's Psychological Loop) is an opt-in, regret-minimizing adversarial review that judges a risky plan before it executes — off by default (rpl.enabled).
  • Cost-aware routing, always-on discovery, lifecycle hooks, and completion notifications are all individually configurable and off by default unless noted.
  • Process-level settings (bind host/port, gateway password, grid identity) are read once at startup and require a restart, not just a reload.

You drive all of this from chat: just ask blumi in plain language ("add a skill for ...", "set llm.temperature to 0.3 and reload"). The model calls these agent tools during a turn, so no restart is needed and the conversation is preserved.

Status: the self-management tools below are live. Triggering them directly from the phone app (buttons for reload / restart / edit config / build skills) and a restart-the-gateway capability are rolling out incrementally.

The tools

self_config — edit settings

Reads/writes ~/.blumi/settings.json by dotted key, and can add personas. Every change is validated (it must deserialize as a valid config) before an atomic write, so a bad edit is rejected rather than corrupting your config.

Process-level settings (bind host/port, the gateway password, the grid identity) are read once at startup, so changing them needs a restart, not just a reload.
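The validate-then-atomic-write behaviour described above can be sketched in a few lines. This is a minimal illustration, not blumi's implementation (blumi is a Rust binary): the function names and `validate_example` are hypothetical, and the real check is a full config deserialization rather than a single field test.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Callable


def set_dotted(config: dict, dotted_key: str, value: Any) -> dict:
    """Return a copy of config with dotted_key (e.g. "llm.temperature") set."""
    updated = json.loads(json.dumps(config))  # deep copy for JSON-only data
    node = updated
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        child = node.setdefault(part, {})
        if not isinstance(child, dict):
            raise ValueError(f"{part!r} is not an object in {dotted_key!r}")
        node = child
    node[leaf] = value
    return updated


def write_validated(path: Path, config: dict, validate: Callable[[dict], None]) -> None:
    """Validate first, then write a temp file and rename it over the target."""
    validate(config)  # raises before anything touches the disk
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(config, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.chmod(tmp_name, 0o600)
        os.replace(tmp_name, path)  # atomic when both are on one filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def validate_example(config: dict) -> None:
    """Hypothetical stand-in for the real schema check."""
    temperature = config.get("llm", {}).get("temperature", 0.0)
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        raise ValueError("llm.temperature must be a number")
```

Because validation runs before the temp file is created, a rejected edit leaves the existing settings.json byte-for-byte unchanged.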

manage_skill — author skills

Create / update / delete a SKILL.md under ~/.blumi/skills/<name>/. Skills are progressive-disclosure instructions: their name + description sit in the system prompt; the agent loads the full body on demand. See blumi skills and CLI Usage.

reload_self — apply changes in place

Emits a reload: blumi re-reads settings.json, re-scans skills, and rebuilds the session seeded from the current conversation (messages, todos, token counts preserved). Use it after self_config / manage_skill.
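As a rough mental model of what "rebuilds the session seeded from the current conversation" means, the sketch below builds a new session object from fresh config and skills while copying the conversation state across. The `Session` fields and function names are assumptions made for illustration; the source only states which pieces of state survive.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    config: dict
    skills: list[str]
    messages: list[dict] = field(default_factory=list)
    todos: list[str] = field(default_factory=list)
    tokens_used: int = 0


def reload_session(old: Session, load_config, scan_skills) -> Session:
    """Build a fresh session from new config and skills, seeded with old state."""
    return Session(
        config=load_config(),        # re-read settings
        skills=scan_skills(),        # re-scan the skills directory
        messages=list(old.messages), # conversation is carried over
        todos=list(old.todos),
        tokens_used=old.tokens_used,
    )
```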

Self-recovery

The always-on gateway is supervised: launchd KeepAlive (macOS) and systemd Restart=always (Linux) auto-restart it on crash. That's crash recovery for free — see Gateway. For a wedged-but-alive session, a reload (above) rebuilds it without losing the conversation.

Self-healing & evolution

Beyond crash recovery, blumi treats reliability as a bounded control problem (after the self-healing-orchestrators paper) and turns repeated failures into durable improvements — wired into the failure taxonomy and the semantic memory.

  • Reflex recovery. A failed tool result is classified (bad args, state conflict, crash, empty) and gets a budgeted, targeted recovery action — re-read-then-retry, an argument fix from the tool's hint, narrow-the-query, or escalate. Only idempotent (read-only) tools auto-retry; mutating tools (Bash, FileWrite, ...) escalate rather than blind-retry. It composes with the doom-loop guard, and each attempt emits an observability trace (⚕ self-heal ... inline in the TUI).
  • Learns from failures (verified ones only). A guided recovery is stored as a pending failure→fix hypothesis and promoted to a real agent-namespace episode only once the same tool is observed to succeed on a later step (see Confirmed below). Promoted fixes diffuse across the grid, and a similar future failure recalls the known fix as trailing guidance. Paths and secrets are redacted before storage; un-promoted guesses are reaped by the sweep and never recalled.
  • Evolves. Recurring failure clusters are mined (on the gateway sweep) into auto-written recovery skills (low-risk, with a notice — via the same manage_skill actuator above); anything risky — config / providers / secrets / deletes — raises an approval instead. The audit trail is kept as evolution memories.
  • Confirmed (the credit-assignment keystone). With heal.verify (now on by default), a recovery is promoted from pending to a recallable fix only when the retried tool actually succeeds on a later step — ground truth, not just "a fix was suggested" — with provenance. So the agent can tell a fix that worked from one that didn't, and only proven fixes are ever recalled, mined into skills, or diffused.
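The two core rules in the list above (budgeted recovery that never blind-retries a mutating tool, and promotion only after a verified success) can be sketched as follows. The one-to-one mapping from failure class to action, the action names, and the `FixLedger` type are assumptions for illustration; the source lists the classes and actions but not how they pair up.

```python
from dataclasses import dataclass, field

# Failure classes come from the text; pairing each with one action is assumed.
ACTIONS = {
    "bad_args": "fix_args_from_hint",
    "state_conflict": "reread_then_retry",
    "empty": "narrow_query",
    "crash": "escalate",
}


def choose_recovery(failure: str, idempotent: bool, attempts: int, budget: int = 2) -> str:
    """Pick a recovery action, escalating when a retry would be unsafe."""
    if attempts >= budget or not idempotent:
        return "escalate"  # budget spent, or a mutating tool
    return ACTIONS.get(failure, "escalate")


@dataclass
class FixLedger:
    pending: dict[str, str] = field(default_factory=dict)  # tool -> suggested fix
    promoted: list[tuple[str, str]] = field(default_factory=list)

    def suggest(self, tool: str, fix: str) -> None:
        self.pending[tool] = fix

    def observe(self, tool: str, succeeded: bool) -> None:
        """Promote a pending fix only when the same tool later succeeds."""
        if succeeded and tool in self.pending:
            self.promoted.append((tool, self.pending.pop(tool)))

    def sweep(self) -> None:
        """Reap hypotheses that were never confirmed."""
        self.pending.clear()

    def recall(self, tool: str) -> list[str]:
        """Only promoted (verified) fixes are ever returned."""
        return [fix for name, fix in self.promoted if name == tool]
```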

Configure it (~/.blumi/settings.json; defaults shown):

"heal": {
 "enabled": true,
 "recovery_budget": 2,
 "verify": true,
 "learn": true,
 "evolve": "auto",
 "redact_paths": true
}

  • evolve: "auto" (apply low-risk skills automatically, with a notice) · "propose" (mine + always ask) · "off" (kill switch: still recover & learn, but never self-modify).
  • recovery_budget: max recovery attempts per turn · verify: require cross-step success · learn: write failure→fix episodes · redact_paths: scrub paths/secrets before storage · enabled: master switch.
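To make the three heal.evolve modes concrete, here is a small decision function. It is a sketch under assumptions: the `kind` labels and the return values are hypothetical names, chosen to mirror the rule that anything touching config, providers, secrets, or deletes raises an approval.

```python
# Change kinds the text calls risky; the exact labels are illustrative.
RISKY_KINDS = {"config", "providers", "secrets", "delete"}


def evolution_decision(evolve: str, kind: str) -> str:
    """Return "apply", "ask", or "skip" for a mined recovery skill."""
    if evolve not in {"auto", "propose", "off"}:
        raise ValueError(f"unknown heal.evolve value: {evolve!r}")
    if evolve == "off":
        return "skip"  # kill switch: never self-modify
    if evolve == "propose" or kind in RISKY_KINDS:
        return "ask"   # always ask, or risky change needs approval
    return "apply"     # "auto" with a low-risk change
```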

See it: the TUI /heal overlay (recovery / evolution / proposal counts + recent items), the inline ⚕ self-heal traces, the blugo Heal tab, or GET /api/heal on the gateway.

RPL-Judgement

Deep dive + the science of the judgement (with code references): Raskolnikov's Psychological Loop.

An opt-in, adversarial, regret-minimizing reasoning loop that wraps the agent's tool calls — "Raskolnikov's Psychological Loop". A standard agent maximizes success; an RPL agent minimizes regret: it debates a risky plan with an internal prosecutor before anything touches the live system. Off by default (rpl.enabled); it trades extra LLM calls for far fewer catastrophic actions, so turn it on for high-stakes work.

When enabled, before a materialized tool batch executes, blumi runs five phases:

  1. The Hypothesis (blast radius). Map what the batch would touch — files written, commands, network egress, VCS mutations, whether it's destructive / reversible — into a 0–100 severity. Read-only / low-blast batches skip the loop entirely (the cheap path), so cost stays bounded.
  2. The Fever Dream (simulation). For a batch over blast_threshold, predict the worst-case outcome + a paranoia score. (MVP: a dry prediction; real sandboxed branch execution in throwaway git worktrees is a planned follow-up.)
  3. The Porfiry Node (adversarial judge). The plan goes to an adversarial "Porfiry" LLM judge whose only job is to find the flaw: the ignored edge case, the un-re-read state, the irreversible step. If the judge rejects the plan, the plan is bounced: the flaw is injected as guidance and the model re-plans. This repeats for at most max_defend_rounds rounds, after which the plan proceeds under caution. A flaky or unreachable judge fails open, so it never deadlocks the agent.
  4. The Strike (actuation). The surviving plan flows through the normal typed tool pipeline, unchanged — same permission engine, same approvals.
  5. The Confession (regret → memory). After execution, blumi compares the predicted risk to the actual outcome and writes the Error Delta ("regret") back to memory as an rpl_delta episode — feeding value-based memory fitness and priming future blast-radius judgments. It evolves from the consequence.
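The control flow of the five phases can be sketched as a single review function plus a regret calculation. Everything named here (`Verdict`, `review_batch`, the status strings, and the sign convention of `regret`) is an assumption for illustration; the sketch also assumes a batch is reviewed when its severity is at or above blast_threshold.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    approved: bool
    flaw: str = ""


def review_batch(
    severity: int,
    plan: str,
    judge: Callable[[str], Verdict],
    replan: Callable[[str, str], str],
    blast_threshold: int = 40,
    max_defend_rounds: int = 2,
) -> tuple[str, str]:
    """Return (plan_to_execute, status) after the adversarial review."""
    if severity < blast_threshold:
        return plan, "skipped"  # cheap path: low blast radius
    for _ in range(max_defend_rounds):
        try:
            verdict = judge(plan)
        except Exception:
            return plan, "judge_unavailable"  # fail open, never deadlock
        if verdict.approved:
            return plan, "approved"
        plan = replan(plan, verdict.flaw)  # flaw injected as guidance
    return plan, "caution"  # rounds exhausted, proceed under caution


def regret(predicted_risk: int, actual_harm: int) -> int:
    """Error Delta between the prediction and the observed outcome."""
    return actual_harm - predicted_risk
```

Whatever status comes back, the returned plan still goes through the normal tool pipeline and permission engine; the review only decides which plan reaches it.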

Configure it (~/.blumi/settings.json; defaults shown — enabled is false):

"rpl": {
 "enabled": false,
 "blast_threshold": 40, // 0–100 severity a mutating batch must hit to be reviewed
 "branches": 3, // simulated branches per review (1–5)
 "max_defend_rounds": 2, // Porfiry reject → re-plan rounds before proceeding
 "judge_model": "", // empty = reuse the main model
 "sandbox": "dry" // dry (predict) | worktree (real sandboxed sim — planned)
}

RPL composes with — it doesn't replace — your permissions and pre_tool_use hooks: those still gate the Strike. RPL adds an upstream adversarial review so a risky plan is caught and re-thought before it ever reaches them.

Cost-aware routing

Per turn, blumi can pick a difficulty tier and route to a light vs flagship model — simple, mechanical work doesn't have to burn the flagship's price. Off by default. The mechanism is hybrid: a fast, zero-cost heuristic (prompt length, tool count, iteration depth, keyword hints) decides most turns, and only ambiguous turns consult a small local judge model (which fails safe to the cheap tier — an unreachable judge never silently upgrades you).

  • Delegated sub-agents default to the cheap tier (router.subagent_tier), so investigation/fan-out is cheap while the main agent stays sharp.
  • Model swaps are prompt-cache-safe: blumi only changes the model when the tier actually changes, and on a long turn it escalates Light→Heavy on deep iterations but never demotes mid-turn.
  • On a grid, router.prefer_grid_light can run the light tier on a peer's local model (free).

"router": {
 "mode": "hybrid", // off | heuristic | hybrid | judge
 "light": { "provider": "", "model": "claude-haiku-4-5" },
 "heavy": { "provider": "", "model": "claude-opus-4-5" },
 "judge": { "provider": "", "model": "" }, // empty = reuse brain.*, then llm.*
 "subagent_tier": "light", // light | heavy | inherit
 "prefer_grid_light": false
}
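The hybrid mechanism described above (heuristic first, judge only when ambiguous, fail safe to the cheap tier, never demote mid-turn) might look like this in outline. The keyword list and numeric thresholds are invented for the example and are not blumi's actual values.

```python
LIGHT, HEAVY = "light", "heavy"
HEAVY_HINTS = ("refactor", "architecture", "debug", "design")  # illustrative only


def heuristic_tier(prompt: str, tool_count: int, iteration: int):
    """Return "light", "heavy", or None when the signals are ambiguous."""
    text = prompt.lower()
    if any(hint in text for hint in HEAVY_HINTS) or iteration >= 8:
        return HEAVY
    if len(prompt) < 200 and tool_count <= 2:
        return LIGHT
    return None


def pick_tier(prompt, tool_count, iteration, judge=None, current=None):
    """Hybrid routing: heuristic first, judge only for ambiguous turns."""
    tier = heuristic_tier(prompt, tool_count, iteration)
    if tier is None:
        try:
            tier = judge(prompt) if judge else LIGHT
        except Exception:
            tier = LIGHT  # unreachable judge fails safe to the cheap tier
    if current == HEAVY:
        return HEAVY      # escalation is one-way within a turn
    return tier
```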

See it: the TUI /route overlay (per-tier turns/tokens + $ saved vs all-heavy) or GET /api/route. /route off|heuristic|hybrid|judge switches the mode live. An empty tier provider or model reuses the active llm.* setting. (Cost is estimated from each model's list price; local / unpriced models show as free.)

Always-on discovery

When you step away, blumi can keep finding what's worth doing. With it enabled, the always-on gateway periodically runs a read-only discovery pass that surfaces candidate tasks for the workspace, adds them to the task board as Discovered: todos, and lands a markdown report in ~/.blumi/reports/ plus an agent-namespace discovery memory. Off by default.

"always_on": {
 "enabled": true,
 "autonomy": "propose", // off | propose | auto
 "cadence_secs": 900,
 "min_interval_secs": 300, // rate-limit floor
 "skip_if_todos": 1, // skip while the board already has todos
 "max_open_discoveries": 5,
 "max_per_pass": 3
}

  • Safe by design: the pass runs one bounded turn with yolo = false, so approval-requiring (mutating) tools are denied; it can read and reason but not change anything. It's gated by cadence, a rate limit, a board-busy check, and an open-discovery cap, and stored text is path/secret-redacted.
  • propose adds Todo tasks + a report (you run them). auto is reserved for autonomously running low-risk discoveries in an isolated git worktree/snapshot — a planned follow-up; today it behaves like propose.
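The gating described above can be read as a chain of checks against the always_on keys. The sketch below is one plausible reading, not the actual scheduler: it assumes skip_if_todos is a threshold on open todos and that the effective wait is the larger of cadence_secs and min_interval_secs.

```python
def should_run_discovery(cfg: dict, now: float, last_run: float,
                         open_todos: int, open_discoveries: int) -> bool:
    """Decide whether a discovery pass may start right now."""
    if not cfg.get("enabled", False) or cfg.get("autonomy", "off") == "off":
        return False
    min_gap = max(cfg["cadence_secs"], cfg["min_interval_secs"])
    if now - last_run < min_gap:
        return False  # cadence / rate-limit floor not reached
    if open_todos >= cfg["skip_if_todos"]:
        return False  # board is busy
    return open_discoveries < cfg["max_open_discoveries"]


def slots_for_pass(cfg: dict, open_discoveries: int) -> int:
    """How many new discoveries this pass may add to the board."""
    room = cfg["max_open_discoveries"] - open_discoveries
    return max(0, min(cfg["max_per_pass"], room))
```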

See it: blumi serve status (an always-on: line), GET /api/always-on (recent + reports), or the TUI /discoveries overlay.

Lifecycle hooks

Claude-Code-style extension points. Two events are wired: user_prompt_submit (inject context) and pre_tool_use (block a tool call).

user_prompt_submit — inject turn context

When you submit a prompt, blumi runs your configured shell commands and injects each command's stdout as background context for that turn. The prompt is piped to the hook's stdin, the command runs in the workspace, and its output lands as a cache-safe trailing message (never the cached system prefix). Use it to stamp every turn with live context — current git branch, active ticket, deploy status, a lint summary, etc.
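A user_prompt_submit hook can be any command that reads the prompt on stdin and prints context on stdout. As a hedged example, the script below looks for ticket IDs in the prompt and echoes them back as context; the ticket pattern and the script itself are hypothetical, and only the stdin/stdout contract comes from the description above.

```python
import re
import sys

# Hypothetical ticket format such as "WEB-42"; adjust to your tracker.
TICKET = re.compile(r"\b[A-Z]{2,10}-\d+\b")


def build_context(prompt: str) -> str:
    """Return background context for this turn (empty string for none)."""
    tickets = sorted(set(TICKET.findall(prompt)))
    if not tickets:
        return ""
    return "Tickets mentioned in this prompt: " + ", ".join(tickets)


def main(stdin=sys.stdin, stdout=sys.stdout) -> int:
    context = build_context(stdin.read())  # the prompt arrives on stdin
    if context:
        print(context, file=stdout)        # stdout becomes turn context
    return 0

# As a standalone hook script, finish with: raise SystemExit(main())
```

Saved under a path of your choosing, it would be registered with a "command" entry that runs it, for example via python3.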

pre_tool_use — gate tool calls

Before a tool runs, matching hooks receive a {"tool": "...", "input": {...}} JSON payload on stdin. Exit non-zero to block the call (the hook's stderr/stdout becomes the denial reason the agent sees); exit zero to fall through to your normal permissions policy. A hook runs ahead of policy, so it can veto even an auto-allowed tool. Use it for guardrails — forbid rm -rf, block writes outside a path, require a clean working tree before a deploy, etc.

  • matcher filters by tool name (substring match; empty = every tool). E.g. "matcher": "Bash" only gates shell commands.
  • Fail-open on infra errors: if the hook can't be spawned or times out, the call is allowed, so a broken guardrail can't brick the agent. Only an explicit non-zero exit blocks.

"hooks": {
 "user_prompt_submit": [
 { "command": "git branch --show-current", "timeout_secs": 5 },
 { "command": "cat .blumi/context.md" }
 ],
 "pre_tool_use": [
 { "command": "jq -e '.input.command | test(\"rm -rf\")|not' >/dev/null", "matcher": "Bash" }
 ]
}
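When a guardrail outgrows a jq one-liner, the same pre_tool_use contract (JSON payload on stdin, non-zero exit to block, stderr as the reason) can be met by a small script. This is a sketch: the regular expression and messages are illustrative, and it relies only on the `tool` and `input.command` fields that the jq example above already uses.

```python
import json
import re
import sys

# rm followed by a flag word containing both "r" and "f" (-rf, -fr, -Rfv ...).
DANGEROUS = re.compile(r"\brm\s+-(?=\w*[rR])(?=\w*f)\w+")


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, reason); a non-zero code blocks the tool call."""
    if payload.get("tool") != "Bash":
        return 0, ""
    tool_input = payload.get("input")
    command = str(tool_input.get("command", "")) if isinstance(tool_input, dict) else ""
    if DANGEROUS.search(command):
        return 1, "blocked: recursive force-delete is not allowed"
    return 0, ""


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    try:
        payload = json.loads(stdin.read())
    except json.JSONDecodeError:
        return 0  # unreadable payload: defer to the permissions policy
    if not isinstance(payload, dict):
        return 0
    code, reason = decide(payload)
    if reason:
        print(reason, file=stderr)  # surfaced to the agent as the denial reason
    return code

# As a standalone hook script, finish with: raise SystemExit(main())
```

Exiting zero does not approve the call; it only falls through to the normal permissions policy.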

Hooks are trusted — they're your own commands from settings.json (the same trust model as cron), so blumi runs them as written. Each is bounded by timeout_secs (default 10s) so a hung hook can't stall a turn. A user_prompt_submit hook that times out or exits non-zero is simply skipped; a pre_tool_use hook fails open (allows) on a spawn error or timeout, and blocks only on a clean non-zero exit.
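From the agent's side, the fail-open rule amounts to "only a clean non-zero exit blocks". A minimal sketch of such a runner follows. It is not blumi's code, and two choices in it are assumptions: shell exit codes 126/127 are treated as spawn failures, and a process killed by a signal is not counted as a clean exit. It also does not kill grandchild processes on timeout, which a production runner would need to handle.

```python
import json
import subprocess

# Shell exit codes for "found but not executable" and "command not found".
# Treating them as spawn failures is an assumption of this sketch.
SPAWN_FAILURES = {126, 127}


def run_pre_tool_use(command: str, payload: dict, timeout_secs: float = 10.0) -> tuple[bool, str]:
    """Return (allowed, reason). Only a clean non-zero exit blocks the call."""
    try:
        result = subprocess.run(
            command,
            shell=True,                 # hooks are trusted user commands
            input=json.dumps(payload),  # payload goes to the hook's stdin
            capture_output=True,
            text=True,
            timeout=timeout_secs,
        )
    except (OSError, subprocess.TimeoutExpired):
        return True, ""                 # spawn error or timeout: fail open
    code = result.returncode
    if code <= 0 or code in SPAWN_FAILURES:
        return True, ""                 # success, signal death, or not runnable
    return False, (result.stderr or result.stdout).strip()
```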

Hooks are read when the session is built, so a freshly added hook takes effect on the next reload_self / restart. The agent can add its own hooks via self_config + reload_self.

Completion notifications

When an autonomous run finishes — blumi loop or an always-on discovery pass — blumi can fan out a short alert so you don't have to babysit it. Off by default. It's split into two kinds of channel:

Server-side push (reaches you with no app open), configured under notify:

  • Desktop: an OS notification on the box blumi runs on (macOS osascript, Linux notify-send).
  • Gateway bot: a proactive message to one of your configured gateway bots (Telegram / Discord / Slack / WhatsApp), reusing that transport's credentials. You just pick the transport + destination.
  • Browser Web Push (web_push: true): push to subscribed browsers via VAPID. Each browser opts in with the Enable button in the web Control Center (Discovery tab); blumi keeps the VAPID keypair + subscriptions in ~/.blumi/push.json. ⚠️ Browser Web Push only works from a secure context (HTTPS or http://localhost), so a browser loading the gateway over a plain-HTTP LAN IP can't subscribe, and the Enable button explains why. The button becomes usable once you put the gateway behind TLS (or open it on localhost).

"notify": {
 "enabled": true,
 "on": ["loop", "discovery"], // which completions fire (also: "turn"); empty = loop+discovery
 "desktop": true,
 "bot": { "transport": "telegram", "target": "123456789" },
 "web_push": true
}

  • target is a Telegram chat id, Discord channel id, Slack channel, or WhatsApp recipient phone.
  • A half-configured bot (missing token/target) is a silent no-op, never an error. Every channel is best-effort — a failure is logged, never fatal to the run.
  • blumi loop --notify still fires a one-off desktop notification for that run even when notify is off — handy for a quick foreground run.
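The best-effort fan-out described above can be sketched as a loop that skips unconfigured channels and logs, rather than raises, on failure. The function names and the `channels` mapping are hypothetical; the osascript and notify-send invocations use each tool's standard form, but how blumi actually calls them is not stated in the source.

```python
import logging
import platform

log = logging.getLogger("notify-sketch")


def desktop_command(title: str, body: str, system: str = "") -> list[str]:
    """Build the OS notification command for macOS or Linux."""
    system = system or platform.system()
    if system == "Darwin":
        esc = lambda s: s.replace("\\", "\\\\").replace('"', '\\"')
        script = f'display notification "{esc(body)}" with title "{esc(title)}"'
        return ["osascript", "-e", script]
    if system == "Linux":
        return ["notify-send", title, body]
    return []  # unsupported platform: nothing to run


def channel_ready(name: str, cfg: dict) -> bool:
    """A half-configured bot counts as off (a silent no-op)."""
    value = cfg.get(name)
    if name == "bot":
        return isinstance(value, dict) and bool(value.get("transport")) and bool(value.get("target"))
    return bool(value)


def fan_out(event: str, cfg: dict, channels: dict) -> list[str]:
    """Send to every ready channel; failures are logged, never raised."""
    if not cfg.get("enabled", False):
        return []
    if event not in (cfg.get("on") or ["loop", "discovery"]):
        return []
    delivered = []
    for name, send in channels.items():
        if not channel_ready(name, cfg):
            continue
        try:
            send(event)
            delivered.append(name)
        except Exception as exc:
            log.warning("notify channel %s failed: %s", name, exc)
    return delivered
```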

Live-stream surfaces (no config — they ride the event stream a client is already watching, so they cover the interactive turn you started in that client):

  • Browser in-tab alert — when a turn finishes while the web tab is backgrounded, blumi flashes the page title, badges the favicon, plays a short ping, and drops a click-to-focus toast. It stays silent while the tab is focused, so it never interrupts active use.
  • blugo phone notification — the blugo app fires a heads-up local notification when a turn finishes while the app is backgrounded. Android 13+ asks for the notification permission on first launch; deny it and this is simply a no-op.

The server-side channels (desktop / bot / Web Push) are what reach you for background runs (blumi loop, always-on discovery), which execute off-session; the in-tab/phone surfaces cover the turn you kicked off and then walked away from.

Safety

  • Config writes are validated + atomic (temp file → rename), mode 0600.
  • Skill names are slug-jailed (no path traversal).
  • These tools are powerful; under non-YOLO sessions, mutating actions still surface an approval card. Keep destructive operations behind "ask" in your permissions.
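"Slug-jailed" means a skill name can never resolve to a path outside the skills directory. One way to enforce that is sketched below; the exact slug grammar (lowercase letters, digits, single hyphens) is an assumption, and `skill_path` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed slug grammar: lowercase alphanumerics separated by single hyphens.
SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def skill_path(skills_root: Path, name: str) -> Path:
    """Return <skills_root>/<name>/SKILL.md, rejecting unsafe names."""
    if not SLUG.fullmatch(name):
        raise ValueError(f"invalid skill name: {name!r}")
    path = (skills_root / name / "SKILL.md").resolve()
    # Defence in depth: the resolved path must still sit under the root.
    if skills_root.resolve() not in path.parents:
        raise ValueError("skill path escapes the skills directory")
    return path
```

The allow-list pattern already excludes "/", "." and "..", so the containment check is a second line of defence rather than the primary one.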

FAQ

Can blumi edit its own configuration?

Yes. The self_config tool reads and writes ~/.blumi/settings.json by dotted key (and can add personas). Every change is validated — it must deserialize as a valid config — before an atomic write, so a bad edit is rejected rather than corrupting your config. Just ask blumi in chat, e.g. "set llm.temperature to 0.3".

Does reloading blumi lose my conversation?

No. reload_self re-reads settings.json, re-scans skills, and rebuilds the session seeded from the current conversation — messages, todos, and token counts are all preserved. It applies self_config and manage_skill changes in place with no process restart.

What changes require a restart instead of a reload?

Process-level settings read once at startup: the bind host/port, the gateway password, and the grid identity (grid_id). Changing those needs a full restart, not just a reload_self.

Is blumi's self-healing safe — can I turn it off?

Yes to both. Self-healing only auto-retries idempotent (read-only) tools; mutating tools (Bash, FileWrite, ...) escalate rather than blind-retry, and only verified fixes are ever learned or applied. Set heal.enabled = false to disable recovery entirely, or heal.evolve = "off" as a kill switch that keeps recovery and learning but never lets blumi self-modify.

How do I see what blumi has healed or discovered?

Use the TUI overlays — /heal (recovery / evolution / proposal counts) and /discoveries — or hit the gateway endpoints GET /api/heal and GET /api/always-on. The blugo phone app exposes a Heal tab as well.

What is RPL (Raskolnikov's Psychological Loop)?

RPL is an opt-in, adversarial, regret-minimizing reasoning loop that reviews a risky tool batch before it executes — mapping its blast radius and putting the plan on trial before an adversarial "Porfiry" judge. It's off by default (rpl.enabled); see Raskolnikov's Psychological Loop for the full deep dive.
