Self Management

ankurCES edited this page Jun 8, 2026 · 12 revisions

Self-Management

blumi can evolve itself: author its own skills, edit its own config (validated), and reload in place to apply both — no restart, conversation preserved. These are agent tools the model calls during a turn; just ask it in chat ("add a skill for ...", "set llm.temperature to 0.3 and reload").

Status: the self-management tools below are live. Triggering them directly from the phone app (buttons for reload / restart / edit config / build skills) and a restart-the-gateway capability are rolling out incrementally.

The tools

`self_config` — edit settings

Reads/writes ~/.blumi/settings.json by dotted key, and can add personas. Every change is validated (it must deserialize as a valid config) before an atomic write, so a bad edit is rejected rather than corrupting your config.

Process-level settings (bind host/port, the gateway password, the grid identity) are read once at startup, so changing them needs a restart, not just a reload.
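
As a rough illustration of the validate-then-atomic-write flow described above, here is a minimal Python sketch. It is not blumi's implementation: the function names (`set_dotted`, `validate`, `write_atomic`, `edit_setting`) are hypothetical, and `validate` is a stand-in that only checks `llm.temperature`, whereas the real tool is said to check that the whole file deserializes as a valid config.

```python
import json
import os
import tempfile
from pathlib import Path


def set_dotted(config: dict, dotted_key: str, value) -> dict:
    """Return a copy of config with dotted_key (e.g. "llm.temperature") set."""
    updated = json.loads(json.dumps(config))  # cheap deep copy for JSON data
    node = updated
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        child = node.setdefault(part, {})
        if not isinstance(child, dict):
            raise ValueError(f"{part!r} is not an object in {dotted_key!r}")
        node = child
    node[leaf] = value
    return updated


def validate(config: dict) -> None:
    """Hypothetical validator standing in for full config deserialization."""
    llm = config.get("llm", {})
    temperature = llm.get("temperature", 0.0) if isinstance(llm, dict) else None
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        raise ValueError("llm.temperature must be a number")


def write_atomic(path: Path, config: dict) -> None:
    """Write a temp file next to the target, then rename it over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(config, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.chmod(tmp_name, 0o600)
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def edit_setting(path: Path, dotted_key: str, value) -> dict:
    """Apply one dotted-key edit; a rejected edit never reaches the disk."""
    current = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = set_dotted(current, dotted_key, value)
    validate(updated)
    write_atomic(path, updated)
    return updated
```

Because validation runs on the in-memory copy before anything is written, a failing edit leaves the existing settings.json byte-for-byte unchanged.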

`manage_skill` — author skills

Create / update / delete a SKILL.md under ~/.blumi/skills/<name>/. Skills are progressive-disclosure instructions: their name + description sit in the system prompt; the agent loads the full body on demand. See blumi skills and CLI Usage.
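
To make the on-disk layout concrete, the following Python sketch resolves a skill name to its SKILL.md path and refuses names that could escape the skills directory. The slug pattern, the function names, and the `name` / `description` header written at the top of the file are all assumptions for illustration; the page only states that a skill has a name, a description, and a body.

```python
import re
from pathlib import Path

# Assumed slug rule: lowercase letters and digits, separated by single hyphens.
SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def skill_path(skills_root: Path, name: str) -> Path:
    """Map a skill name to <skills_root>/<name>/SKILL.md, rejecting non-slugs."""
    if not SLUG.fullmatch(name):
        raise ValueError(f"invalid skill name: {name!r}")
    return skills_root / name / "SKILL.md"


def write_skill(skills_root: Path, name: str, description: str, body: str) -> Path:
    """Create or update a skill file and return its path."""
    path = skill_path(skills_root, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumed file shape: a short header the prompt can list, then the full body.
    header = f"---\nname: {name}\ndescription: {description}\n---\n\n"
    path.write_text(header + body + "\n", encoding="utf-8")
    return path
```

A name such as `../etc` or `a/b` fails the slug check before any path is built, which is one simple way to rule out path traversal.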

`reload_self` — apply changes in place

Emits a reload: blumi re-reads settings.json, re-scans skills, and rebuilds the session seeded from the current conversation (messages, todos, token counts preserved). Use it after self_config / manage_skill.

Self-recovery

The always-on gateway is supervised: launchd KeepAlive (macOS) and systemd Restart=always (Linux) auto-restart it on crash. That's crash recovery for free — see Gateway. For a wedged-but-alive session, a reload (above) rebuilds it without losing the conversation.

Self-healing & evolution

Beyond crash recovery, blumi treats reliability as a bounded control problem (after the self-healing-orchestrators paper) and turns repeated failures into durable improvements — wired into the failure taxonomy and the semantic memory.

Reflex recovery. A failed tool result is classified (bad args, state conflict, crash, empty) and gets a budgeted, targeted recovery action — re-read-then-retry, an argument fix from the tool's hint, narrow-the-query, or escalate. Only idempotent (read-only) tools auto-retry; mutating tools (Bash, FileWrite, ...) escalate rather than blind-retry. It composes with the doom-loop guard, and each attempt emits an observability trace (⚕ self-heal ... inline in the TUI).
Learns from failures, but only verified ones. A guided recovery is stored as a pending failure→fix hypothesis and promoted to a real agent-namespace episode only once the same tool is observed to succeed on a later step (see Confirmed below). Promoted fixes diffuse across the grid, and a similar future failure recalls the known fix as trailing guidance. Paths/secrets are redacted first; un-promoted guesses are reaped by the sweep, never recalled.
Evolves. Recurring failure clusters are mined (on the gateway sweep) into auto-written recovery skills (low-risk, with a notice — via the same manage_skill actuator above); anything risky — config / providers / secrets / deletes — raises an approval instead. The audit trail is kept as evolution memories.
Confirmed (the credit-assignment keystone). With heal.verify (now on by default), a recovery is promoted from pending to a recallable fix only when the retried tool actually succeeds on a later step — ground truth, not just "a fix was suggested" — with provenance. So the agent can tell a fix that worked from one that didn't, and only proven fixes are ever recalled, mined into skills, or diffused.
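
The recovery policy and the pending-to-promoted rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not blumi's code: the pairing of failure classes to actions in `PLAYBOOK` is a guess (the page lists both sets but not the mapping), and `TurnBudget`, `choose_action`, and `FixLedger` are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum


class Failure(Enum):
    BAD_ARGS = "bad args"
    STATE_CONFLICT = "state conflict"
    CRASH = "crash"
    EMPTY = "empty"


class Action(Enum):
    FIX_ARGS = "argument fix from the tool's hint"
    REREAD_THEN_RETRY = "re-read-then-retry"
    NARROW_QUERY = "narrow-the-query"
    ESCALATE = "escalate"


# Assumed pairing of failure class to recovery action.
PLAYBOOK = {
    Failure.BAD_ARGS: Action.FIX_ARGS,
    Failure.STATE_CONFLICT: Action.REREAD_THEN_RETRY,
    Failure.EMPTY: Action.NARROW_QUERY,
    Failure.CRASH: Action.ESCALATE,
}


@dataclass
class TurnBudget:
    remaining: int = 2  # mirrors heal.recovery_budget


def choose_action(failure: Failure, idempotent: bool, budget: TurnBudget) -> Action:
    """Pick a recovery action; mutating tools and exhausted budgets escalate."""
    action = PLAYBOOK[failure]
    if not idempotent or action is Action.ESCALATE or budget.remaining <= 0:
        return Action.ESCALATE
    budget.remaining -= 1
    return action


@dataclass
class FixLedger:
    pending: dict = field(default_factory=dict)   # tool -> unverified fix
    promoted: dict = field(default_factory=dict)  # tool -> list of proven fixes

    def record_guess(self, tool: str, fix: str) -> None:
        self.pending[tool] = fix

    def observe_success(self, tool: str) -> None:
        """Promote the pending fix once the same tool later succeeds."""
        fix = self.pending.pop(tool, None)
        if fix is not None:
            self.promoted.setdefault(tool, []).append(fix)

    def recall(self, tool: str) -> list:
        """Only promoted fixes are ever returned."""
        return list(self.promoted.get(tool, []))
```

The key property is that `recall` never sees a pending guess: a fix becomes visible only after `observe_success` has been called for the same tool.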

Configure it (~/.blumi/settings.json; defaults shown):

"heal": {
 "enabled": true,
 "recovery_budget": 2,
 "verify": true,
 "learn": true,
 "evolve": "auto",
 "redact_paths": true
}

evolve — "auto" (apply low-risk skills automatically, with a notice) · "propose" (mine + always ask) · "off" (kill switch: still recover & learn, but never self-modify).
recovery_budget: max recovery attempts per turn · verify: require cross-step success · learn: write failure→fix episodes · redact_paths: scrub paths/secrets before storage · enabled: master switch.

See it: the TUI /heal overlay (recovery / evolution / proposal counts + recent items), the inline ⚕ self-heal traces, the blugo Heal tab, or GET /api/heal on the gateway.

RPL-Judgement

An opt-in, adversarial, regret-minimizing reasoning loop that wraps the agent's tool calls — "Raskolnikov's Psychological Loop". A standard agent maximizes success; an RPL agent minimizes regret: it debates a risky plan with an internal prosecutor before anything touches the live system. Off by default (rpl.enabled); it trades extra LLM calls for far fewer catastrophic actions, so turn it on for high-stakes work.

When enabled, before a materialized tool batch executes, blumi runs five phases:

The Hypothesis (blast radius). Map what the batch would touch — files written, commands, network egress, VCS mutations, whether it's destructive / reversible — into a 0–100 severity. Read-only / low-blast batches skip the loop entirely (the cheap path), so cost stays bounded.
The Fever Dream (simulation). For a batch over blast_threshold, predict the worst-case outcome + a paranoia score. (MVP: a dry prediction; real sandboxed branch execution in throwaway git worktrees is a planned follow-up.)
The Porfiry Node (adversarial judge). The plan goes to an adversarial "Porfiry" LLM judge whose only job is to find the flaw — the ignored edge case, the un-re-read state, the irreversible step. It must approve, or the plan is bounced (the flaw is injected as guidance and the model re-plans), bounded by max_defend_rounds before proceeding under caution. A flaky / unreachable judge fails open, so it never deadlocks the agent.
The Strike (actuation). The surviving plan flows through the normal typed tool pipeline, unchanged — same permission engine, same approvals.
The Confession (regret → memory). After execution, blumi compares the predicted risk to the actual outcome and writes the Error Delta ("regret") back to memory as an rpl_delta episode — feeding value-based memory fitness and priming future blast-radius judgments. It evolves from the consequence.
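
The control flow of the first three phases (skip cheap batches, bounce rejected plans, fail open on a broken judge) can be sketched as below. This is one possible reading, not the actual implementation: `review_batch`, `Verdict`, and the status strings are hypothetical, the judge and re-planner are passed in as plain callables, and a batch is treated as reviewable when its severity is at or above `blast_threshold`.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Verdict:
    approved: bool
    flaw: str = ""


def review_batch(
    plan: str,
    blast: int,
    judge: Callable[[str], Verdict],
    replan: Callable[[str, str], str],
    blast_threshold: int = 40,
    max_defend_rounds: int = 2,
) -> Tuple[str, str]:
    """Return (final_plan, status); status is skipped, approved, fail_open, or caution."""
    if blast < blast_threshold:
        return plan, "skipped"  # cheap path: low-blast batches are not reviewed
    for round_index in range(max_defend_rounds + 1):
        try:
            verdict = judge(plan)
        except Exception:
            return plan, "fail_open"  # a flaky judge must not deadlock the agent
        if verdict.approved:
            return plan, "approved"
        if round_index == max_defend_rounds:
            break  # re-plan budget spent
        plan = replan(plan, verdict.flaw)  # the flaw is injected as guidance
    return plan, "caution"
```

In this reading the judge sees every revision, including the last one, and the plan proceeds "under caution" only after `max_defend_rounds` re-plans have all been rejected.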

Configure it (~/.blumi/settings.json; defaults shown — enabled is false):

"rpl": {
 "enabled": false,
 "blast_threshold": 40, // 0–100 severity a mutating batch must hit to be reviewed
 "branches": 3, // simulated branches per review (1–5)
 "max_defend_rounds": 2, // Porfiry reject → re-plan rounds before proceeding
 "judge_model": "", // empty = reuse the main model
 "sandbox": "dry" // dry (predict) | worktree (real sandboxed sim — planned)
}

RPL composes with — it doesn't replace — your permissions and pre_tool_use hooks: those still gate the Strike. RPL adds an upstream adversarial review so a risky plan is caught and re-thought before it ever reaches them.

Cost-aware routing

Per turn, blumi can pick a difficulty tier and route to a light vs flagship model — simple, mechanical work doesn't have to burn the flagship's price. Off by default. The mechanism is hybrid: a fast, zero-cost heuristic (prompt length, tool count, iteration depth, keyword hints) decides most turns, and only ambiguous turns consult a small local judge model (which fails safe to the cheap tier — an unreachable judge never silently upgrades you).

Delegated sub-agents default to the cheap tier (router.subagent_tier), so investigation/fan-out is cheap while the main agent stays sharp.
Model swaps are prompt-cache-safe: blumi only changes the model when the tier actually changes, and on a long turn it escalates Light→Heavy on deep iterations but never demotes mid-turn.
On a grid, router.prefer_grid_light can run the light tier on a peer's local model (free).

"router": {
 "mode": "hybrid", // off | heuristic | hybrid | judge
 "light": { "provider": "", "model": "claude-haiku-4-5" },
 "heavy": { "provider": "", "model": "claude-opus-4-5" },
 "judge": { "provider": "", "model": "" }, // empty = reuse brain.*, then llm.*
 "subagent_tier": "light", // light | heavy | inherit
 "prefer_grid_light": false
}

See it: the TUI /route overlay (per-tier turns/tokens + $ saved vs all-heavy) or GET /api/route. /route off|heuristic|hybrid|judge switches the mode live. An empty tier provider or model falls back to the active llm.* values. (Cost is estimated from each model's list price; local / unpriced models show as free.)
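
A minimal sketch of the hybrid decision, assuming made-up thresholds and keyword hints: the heuristic settles clear cases, an optional judge is consulted only for ambiguous ones and falls back to the light tier on any error, and a turn that is already on the heavy tier is never demoted. `heuristic_tier`, `pick_tier`, `HEAVY_HINTS`, and every numeric cut-off here are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical keyword hints and thresholds, for illustration only.
HEAVY_HINTS = ("refactor", "architecture", "debug", "migrate")
DEEP_ITERATION = 8
SHORT_PROMPT_CHARS = 200
FEW_TOOLS = 2


def heuristic_tier(prompt: str, tool_count: int, iteration: int) -> Optional[str]:
    """Return "light" or "heavy", or None when the turn is ambiguous."""
    text = prompt.lower()
    if iteration >= DEEP_ITERATION or any(hint in text for hint in HEAVY_HINTS):
        return "heavy"
    if len(prompt) < SHORT_PROMPT_CHARS and tool_count <= FEW_TOOLS:
        return "light"
    return None


def pick_tier(
    prompt: str,
    tool_count: int,
    iteration: int,
    current_tier: Optional[str],
    judge: Optional[Callable[[str], str]] = None,
) -> str:
    """Hybrid routing: heuristic first, judge only when ambiguous."""
    tier = heuristic_tier(prompt, tool_count, iteration)
    if tier is None:
        try:
            tier = judge(prompt) if judge is not None else "light"
        except Exception:
            tier = "light"  # an unreachable judge never upgrades the turn
        if tier not in ("light", "heavy"):
            tier = "light"
    if current_tier == "heavy":
        return "heavy"  # escalate-only within a turn: never demote mid-turn
    return tier
```

Returning the current tier unchanged whenever possible is also what keeps model swaps rare, which is the property the prompt-cache note above relies on.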

Always-on discovery

When you step away, blumi can keep finding what's worth doing. With it enabled, the always-on gateway periodically runs a read-only discovery pass that surfaces candidate tasks for the workspace, adds them to the task board as Discovered: todos, and lands a markdown report in ~/.blumi/reports/ plus an agent-namespace discovery memory. Off by default.

"always_on": {
 "enabled": true,
 "autonomy": "propose", // off | propose | auto
 "cadence_secs": 900,
 "min_interval_secs": 300, // rate-limit floor
 "skip_if_todos": 1, // skip while the board already has todos
 "max_open_discoveries": 5,
 "max_per_pass": 3
}

Safe by design: the pass runs one bounded turn with yolo = false, so approval-requiring (mutating) tools are denied — it can read + reason but not change anything. It's gated by cadence + rate-limit + a board-busy check + an open-discovery cap, and stored text is path/secret-redacted.
propose adds Todo tasks + a report (you run them). auto is reserved for autonomously running low-risk discoveries in an isolated git worktree/snapshot — a planned follow-up; today it behaves like propose.
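
The gating described above (cadence, rate-limit floor, board-busy check, open-discovery cap) could look roughly like the following Python sketch. The field names match the settings keys shown earlier, but `should_run_pass`, `slots_for_pass`, and the exact comparison semantics (for example, treating `skip_if_todos` as a minimum count of open todos, and the longer of `cadence_secs` and `min_interval_secs` as the effective interval) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AlwaysOn:
    enabled: bool = True
    autonomy: str = "propose"
    cadence_secs: int = 900
    min_interval_secs: int = 300
    skip_if_todos: int = 1
    max_open_discoveries: int = 5
    max_per_pass: int = 3


def should_run_pass(
    cfg: AlwaysOn,
    now: float,
    last_pass_at: Optional[float],
    open_todos: int,
    open_discoveries: int,
) -> Tuple[bool, str]:
    """Decide whether a discovery pass may run, with the reason."""
    if not cfg.enabled or cfg.autonomy == "off":
        return False, "disabled"
    interval = max(cfg.cadence_secs, cfg.min_interval_secs)
    if last_pass_at is not None and now - last_pass_at < interval:
        return False, "too soon"
    if cfg.skip_if_todos > 0 and open_todos >= cfg.skip_if_todos:
        return False, "board busy"
    if open_discoveries >= cfg.max_open_discoveries:
        return False, "discovery cap"
    return True, "run"


def slots_for_pass(cfg: AlwaysOn, open_discoveries: int, found: int) -> int:
    """How many found candidates to keep without exceeding either cap."""
    room = max(cfg.max_open_discoveries - open_discoveries, 0)
    return min(cfg.max_per_pass, room, found)
```

With the values shown in the config above, a pass that finds ten candidates while four discoveries are already open would add only one.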

See it: blumi serve status (an always-on: line), GET /api/always-on (recent + reports), or the TUI /discoveries overlay.

Lifecycle hooks

Claude-Code-style extension points. Two events are wired: user_prompt_submit (inject context) and pre_tool_use (block a tool call).

`user_prompt_submit` — inject turn context

When you submit a prompt, blumi runs your configured shell commands and injects each command's stdout as background context for that turn. The prompt is piped to the hook's stdin, the command runs in the workspace, and its output lands as a cache-safe trailing message (never the cached system prefix). Use it to stamp every turn with live context — current git branch, active ticket, deploy status, a lint summary, etc.

`pre_tool_use` — gate tool calls

Before a tool runs, matching hooks receive a {"tool": "...", "input": {...}} JSON payload on stdin. Exit non-zero to block the call (the hook's stderr/stdout becomes the denial reason the agent sees); exit zero to fall through to your normal permissions policy. A hook runs ahead of policy, so it can veto even an auto-allowed tool. Use it for guardrails — forbid rm -rf, block writes outside a path, require a clean working tree before a deploy, etc.

matcher filters by tool name (substring match; empty = every tool). E.g. "matcher": "Bash" only gates shell commands.
Fail-open on infra errors: if the hook can't be spawned or times out, the call is allowed — a broken guardrail can't brick the agent. Only an explicit non-zero exit blocks.
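
The matcher and fail-open rules can be illustrated with a short Python sketch of a hook runner. It is a guess at the shape of the logic, not blumi's code: `run_pre_tool_use` is a hypothetical name, hooks are assumed to run through a POSIX shell, and the 10-second fallback mirrors the documented `timeout_secs` default.

```python
import json
import subprocess
from typing import Optional, Tuple


def run_pre_tool_use(
    hooks: list, tool: str, tool_input: dict, cwd: Optional[str] = None
) -> Tuple[bool, str]:
    """Return (allowed, reason). Only a clean non-zero exit blocks the call."""
    payload = json.dumps({"tool": tool, "input": tool_input})
    for hook in hooks:
        if hook.get("matcher", "") not in tool:
            continue  # substring match; an empty matcher matches every tool
        try:
            done = subprocess.run(
                hook["command"],
                shell=True,
                input=payload,
                capture_output=True,
                text=True,
                timeout=hook.get("timeout_secs", 10),
                cwd=cwd,
            )
        except (OSError, subprocess.TimeoutExpired):
            continue  # fail open: a broken guardrail does not block the agent
        if done.returncode != 0:
            return False, (done.stderr or done.stdout).strip()
    return True, ""
```

An allowed result here only means no hook vetoed the call; the normal permissions policy would still run afterwards.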

"hooks": {
 "user_prompt_submit": [
 { "command": "git branch --show-current", "timeout_secs": 5 },
 { "command": "cat .blumi/context.md" }
 ],
 "pre_tool_use": [
 { "command": "jq -e '.input.command | test(\"rm -rf\")|not' >/dev/null", "matcher": "Bash" }
 ]
}

Hooks are trusted — they're your own commands from settings.json (the same trust model as cron), so blumi runs them as written. Each is bounded by timeout_secs (default 10s) so a hung hook can't stall a turn. A user_prompt_submit hook that times out or exits non-zero is simply skipped; a pre_tool_use hook fails open (allows) on a spawn error or timeout, and blocks only on a clean non-zero exit.

Hooks are read when the session is built, so a freshly added hook takes effect on the next reload_self / restart. The agent can add its own hooks via self_config + reload_self.
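
For guardrails that outgrow a one-line jq filter, a hook can be any executable that reads the JSON payload on stdin. The Python sketch below is one hedged example of such a script: it assumes the payload shape documented above and that a Bash call carries its shell command in `input.command` (as the jq example implies); the function names and the denial message are illustrative.

```python
import json
import sys
from typing import Tuple


def decide(payload: dict) -> Tuple[int, str]:
    """Return (exit_code, reason); a non-zero code blocks the tool call."""
    if payload.get("tool") != "Bash":
        return 0, ""
    tool_input = payload.get("input")
    command = str(tool_input.get("command", "")) if isinstance(tool_input, dict) else ""
    if "rm -rf" in command:
        return 1, "blocked by pre_tool_use hook: rm -rf is not allowed"
    return 0, ""


def main() -> int:
    """Read the payload from stdin and report the decision via the exit code."""
    try:
        payload = json.load(sys.stdin)
    except json.JSONDecodeError:
        return 0  # unreadable payload: fall through to the permissions policy
    if not isinstance(payload, dict):
        return 0
    code, reason = decide(payload)
    if reason:
        print(reason, file=sys.stderr)  # surfaces as the denial reason
    return code


# When saved as a standalone script, finish the file with: sys.exit(main())
```

Saved under a path of your choosing and made executable, it would be referenced from the `command` field of a `pre_tool_use` entry in the same way as the jq one-liner.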

Completion notifications

When an autonomous run finishes — blumi loop or an always-on discovery pass — blumi can fan out a short alert so you don't have to babysit it. Off by default. It's split into two kinds of channel:

Server-side push (reaches you with no app open), configured under notify:

Desktop — an OS notification on the box blumi runs on (macOS osascript, Linux notify-send).
Gateway bot — a proactive message to one of your configured gateway bots (Telegram / Discord / Slack / WhatsApp), reusing that transport's credentials. You just pick the transport + destination.
Browser Web Push (web_push: true) — push to subscribed browsers via VAPID. Each browser opts in with the Enable button in the web Control Center (Discovery tab); blumi keeps the VAPID keypair + subscriptions in ~/.blumi/push.json. ⚠️ Browser Web Push only works from a secure context — HTTPS or http://localhost — so a browser loading the gateway over a plain-HTTP LAN IP can't subscribe, and the Enable button explains why. It lights up once you put the gateway behind TLS (or open it on localhost).

"notify": {
 "enabled": true,
 "on": ["loop", "discovery"], // which completions fire (also: "turn"); empty = loop+discovery
 "desktop": true,
 "bot": { "transport": "telegram", "target": "123456789" },
 "web_push": true
}

target is a Telegram chat id, Discord channel id, Slack channel, or WhatsApp recipient phone.
A half-configured bot (missing token/target) is a silent no-op, never an error. Every channel is best-effort — a failure is logged, never fatal to the run.
blumi loop --notify still fires a one-off desktop notification for that run even when notify is off — handy for a quick foreground run.
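
The selection rules above can be condensed into a small Python sketch that decides which server-side channels fire for a given completion. It is an approximation with hypothetical names (`channels_for`, the `cli_notify` flag standing in for `blumi loop --notify`); the bot check only looks at `transport` and `target`, since the token lives with the gateway bot's own credentials.

```python
from typing import List


def channels_for(notify: dict, kind: str, cli_notify: bool = False) -> List[str]:
    """Return the server-side channels to try for a completion of this kind."""
    if not notify.get("enabled", False):
        # A foreground `blumi loop --notify` still gets a one-off desktop alert.
        return ["desktop"] if cli_notify and kind == "loop" else []
    wanted = notify.get("on") or ["loop", "discovery"]  # empty = loop+discovery
    if kind not in wanted:
        return []
    channels = []
    if notify.get("desktop"):
        channels.append("desktop")
    bot = notify.get("bot") or {}
    if bot.get("transport") and bot.get("target"):
        channels.append("bot:" + bot["transport"])  # half-configured bot: no-op
    if notify.get("web_push"):
        channels.append("web_push")
    return channels
```

Each returned channel would then be attempted on a best-effort basis, with a failure logged rather than propagated to the run.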

Live-stream surfaces (no config — they ride the event stream a client is already watching, so they cover the interactive turn you started in that client):

Browser in-tab alert — when a turn finishes while the web tab is backgrounded, blumi flashes the page title, badges the favicon, plays a short ping, and drops a click-to-focus toast. It stays silent while the tab is focused, so it never interrupts active use.
blugo phone notification — the blugo app fires a heads-up local notification when a turn finishes while the app is backgrounded. Android 13+ asks for the notification permission on first launch; deny it and this is simply a no-op.

The server-side channels (desktop / bot / Web Push) are what reach you for background runs (blumi loop, always-on discovery), which execute off-session; the in-tab/phone surfaces cover the turn you kicked off and then walked away from.

Safety

Config writes are validated + atomic (temp file → rename), mode 0600.
Skill names are slug-jailed (no path traversal).
These tools are powerful; under non-YOLO sessions, mutating actions still surface an approval card. Keep destructive operations behind "ask" in your permissions.
