ankurCES edited this page Jun 8, 2026 · 2 revisions

Voice

Voice lets blumi — a local-first, bring-your-own-key (BYOK) AI coding agent — speak its replies (text-to-speech, TTS) and accept spoken input (speech-to-text, STT). It works in the web UI and the blugo phone app; configuration lives in the voice section of settings.json and is editable from the in-app Control Center → Voice.

TL;DR — Voice key facts

  • Two directions. Voice covers both TTS (hear blumi's replies aloud) and STT (talk to blumi by voice).
  • TTS providers. Text-to-speech runs through ElevenLabs (recommended) or OpenAI (or an OpenAI-compatible endpoint).
  • STT provider. Speech-to-text uses an OpenAI-compatible Whisper endpoint.
  • Bring your own key (BYOK). Voice needs your own provider API key(s); it is the one feature beyond the LLM that may require a key.
  • Keys stay on your machine. TTS is synthesized on the gateway (which holds the key) and streamed to the phone, and keys are write-only over the API — saved but never returned.
  • Optional. Voice is entirely optional — everything else in blumi works without it.

How do I hear blumi's replies? (text-to-speech / TTS)

To hear blumi speak, enable text-to-speech (TTS) and pick a provider. Two providers are supported:

ElevenLabs (recommended)

  1. Open Control Center → Voice, enable voice, and pick the provider elevenlabs.
  2. Paste your ElevenLabs API key.
  3. Tap "Authenticate & load voices" — this validates the key and fills a dropdown of your account's voices. Pick one.
  4. Save. Tap the 🔊 on any assistant message to hear it.
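
Step 3 can be pictured as a single authenticated request. The sketch below is an assumption about what "Authenticate & load voices" might do, not blumi's actual code. It calls ElevenLabs' public voice-listing endpoint with the key in the `xi-api-key` header and turns the response into (voice_id, name) pairs for a dropdown. The function names `parse_voices` and `fetch_voices` are hypothetical.

```python
import json
import urllib.request

ELEVENLABS_VOICES_URL = "https://api.elevenlabs.io/v1/voices"


def parse_voices(payload: dict) -> list[tuple[str, str]]:
    """Turn a voices response into (voice_id, name) pairs sorted by name."""
    voices = payload.get("voices", [])
    pairs = [(v["voice_id"], v.get("name", v["voice_id"])) for v in voices]
    return sorted(pairs, key=lambda pair: pair[1].lower())


def fetch_voices(api_key: str, timeout: float = 10.0) -> list[tuple[str, str]]:
    """List the account's voices; a rejected key surfaces as urllib.error.HTTPError."""
    request = urllib.request.Request(
        ELEVENLABS_VOICES_URL,
        headers={"xi-api-key": api_key, "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_voices(json.load(response))
```

The voice_id half of each pair is the kind of value that would go into `tts_voice` in settings.json.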

OpenAI (or compatible)

Pick provider openai, paste a TTS API key, and set a voice (e.g. alloy). Save.

Equivalent settings.json:

"voice": {
 "enabled": true,
 "tts_provider": "elevenlabs",
 "tts_api_key": "...",
 "tts_voice": "<voice_id>",
 "tts_model": "eleven_multilingual_v2"
}
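
If you prefer scripting the change over editing by hand, the fragment above can be merged into an existing settings.json without disturbing other sections. This is a minimal sketch under stated assumptions: the helper name `set_tts` is hypothetical, and the file path is passed in by the caller because this page does not say where settings.json lives. Only the field names come from the example above.

```python
import json
from pathlib import Path
from typing import Optional

ALLOWED_TTS_PROVIDERS = {"elevenlabs", "openai"}  # the two TTS providers named on this page


def set_tts(settings_path: Path, provider: str, api_key: str, voice: str,
            model: Optional[str] = None) -> dict:
    """Merge TTS fields into the "voice" section and write the file back."""
    if provider not in ALLOWED_TTS_PROVIDERS:
        raise ValueError(f"unsupported tts_provider: {provider!r}")

    # Start from the existing file so unrelated sections are preserved.
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    voice_section = settings.setdefault("voice", {})
    voice_section.update({
        "enabled": True,
        "tts_provider": provider,
        "tts_api_key": api_key,
        "tts_voice": voice,
    })
    if model is not None:
        voice_section["tts_model"] = model

    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return voice_section
```

For example, `set_tts(Path("settings.json"), "elevenlabs", "...", "<voice_id>", "eleven_multilingual_v2")` would produce the same "voice" fields as the fragment above.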

How do I talk to blumi by voice? (speech-to-text / STT)

To talk to blumi, use speech-to-text (STT), which transcribes your microphone input via an OpenAI-compatible Whisper endpoint. In Control Center → Voice, set the Mic key; the app fills in the Whisper endpoint and model for you. Then tap the 🎤 in the composer and speak: the transcript is dropped into the message box.

Equivalent settings.json:

"voice": {
 "voice_api_key": "sk-...",
 "stt_base_url": "https://api.openai.com/v1",
 "stt_model": "whisper-1"
}
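
To make the three STT fields concrete, here is a rough sketch of the request an OpenAI-compatible Whisper endpoint expects: a multipart POST to `{stt_base_url}/audio/transcriptions` carrying `model` and `file`, with the key as a Bearer token. It assumes `voice_api_key` is the Mic key, as the fragment above suggests. The function names are hypothetical, and this is not blumi's own implementation.

```python
import json
import urllib.request
import uuid


def build_transcription_request(voice: dict, audio: bytes,
                                filename: str = "speech.wav") -> urllib.request.Request:
    """Build a multipart POST to {stt_base_url}/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{voice['stt_model']}\r\n"
    ).encode()
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio + closing

    url = voice["stt_base_url"].rstrip("/") + "/audio/transcriptions"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {voice['voice_api_key']}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(voice: dict, audio: bytes, timeout: float = 30.0) -> str:
    """Send the audio and return the "text" field of the JSON response."""
    request = build_transcription_request(voice, audio)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["text"]
```

Because the base URL is a setting, the same request shape should work against any server that mirrors this endpoint, which is presumably why the page says "OpenAI-compatible" rather than naming a single vendor.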

Notes

  • Keys are write-only over the API: the app shows saved ✓ but never returns the stored key. Because the key cannot be read back, changing a voice later means re-entering the key to re-authenticate and reload the dropdown.
  • TTS is synthesized on the gateway (which holds the key) and streamed to the phone, so the key stays on your machine.
  • Voice is optional — everything else works without it.
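
The write-only behaviour in the first note can be illustrated with a small sketch. It shows one plausible way a gateway could expose voice settings without ever returning a key: reads replace each key with a "saved" flag, and writes with an empty key leave the stored one untouched. The `*_saved` field names and both function names are hypothetical, and the empty-key rule is an assumption. Only `tts_api_key` and `voice_api_key` come from this page.

```python
SECRET_FIELDS = ("tts_api_key", "voice_api_key")  # key fields from the examples on this page


def public_view(voice: dict) -> dict:
    """Return the voice section as an API response: keys replaced by a saved flag."""
    view = {k: v for k, v in voice.items() if k not in SECRET_FIELDS}
    for field in SECRET_FIELDS:
        view[f"{field}_saved"] = bool(voice.get(field))
    return view


def apply_update(voice: dict, update: dict) -> dict:
    """Apply a write; an empty key field leaves the stored key unchanged."""
    merged = dict(voice)
    for key, value in update.items():
        if key in SECRET_FIELDS and not value:
            continue  # the client did not re-enter the key, so keep the stored one
        merged[key] = value
    return merged
```

Under this design a client can tell whether a key is saved without ever being able to read it back.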

FAQ

What speech engines does blumi use?

blumi uses ElevenLabs or OpenAI (or an OpenAI-compatible endpoint) for text-to-speech (TTS), and an OpenAI-compatible Whisper endpoint for speech-to-text (STT). ElevenLabs is the recommended TTS provider because the app can validate your key and load a dropdown of your account's voices.

Does blumi's voice need an API key?

Yes. Voice is bring-your-own-key (BYOK): you supply your own provider API key for TTS, STT, or both. It is the one feature beyond the LLM that may require a key — blumi's memory and code search do not.

Where are my voice API keys stored, and are they safe?

Keys stay on your machine. TTS is synthesized on the gateway (which holds the key) and streamed to the phone, so the key never leaves your machine. Keys are also write-only over the API — the app shows saved ✓ but never returns the stored key, so to change a voice later you re-enter the key to re-authenticate and reload the voice dropdown.

Where can I use voice — web UI or phone?

Both. Voice works in the web UI and in the blugo phone app. You configure it from the in-app Control Center → Voice, or directly in the voice section of settings.json.

Is voice required to use blumi?

No. Voice is entirely optional — everything else in blumi works without it. Enable it only when you want to hear replies (TTS) or speak your input (STT).
