Real-time speech translator for video/voice calls. Translates both sides of the conversation live — you speak your language, the other person hears theirs, and vice versa.
How it works: Your mic audio goes through Speech-to-Text, gets translated by an LLM, then synthesized back to speech and routed into your call. The same happens in reverse for the other person's audio.
Supports 29 languages for STT and translation; 27 of them also have TTS (Japanese and Korean have no Piper voice). Voice models come from Piper, and any available language can be downloaded directly from the web UI.
Note: macOS (14+) is the production platform today. A Windows port is in active development (see `docs/windows.md`). A Linux port has started on the `feat/linux` branch; contributions are welcome.

    git clone https://github.com/org-event/OpenPolySphere.git
    cd OpenPolySphere
    ./scripts/bootstrap                          # dev deps + git hooks (like bun install)
    cargo run --release -p translator -- setup   # download models (first time)
    cargo run --release -p translator            # start server

Open http://127.0.0.1:5050 in Google Chrome.
Local mode (default): Whisper STT + Opus-MT translation — no API keys required. Cloud STT/translation optional via Settings.
./scripts/bootstrap is the first command after git clone — like bun install in a JS project. It installs the just task runner if needed, then runs just install.
What just install does:
| Step | macOS | Linux / Windows |
|---|---|---|
| Rust toolchain + rustfmt/clippy | yes | yes |
| Homebrew: espeak-ng, onnxruntime, bun, pre-commit | yes | skipped (see manual install) |
| `bun install --frozen-lockfile` (ESLint for `web/static/js`) | yes | yes |
| `pre-commit install` → runs `just check` on commit | yes | yes |
If just is already installed, you can run just install directly instead of ./scripts/bootstrap.
Common commands (run from the repo root):
| Command | Purpose |
|---|---|
| `./scripts/bootstrap` | First-time dev setup after clone |
| `just install` | Same as bootstrap (without installing just) |
| `just install-linux-deps` | One-time apt packages (Linux only) |
| `just fetch-ort` | ONNX Runtime download / path hints |
| `just check` | rustfmt, clippy, ESLint, Swift build (macOS only) |
| `just check-linux-clippy` | Full Linux clippy (native Linux, CI parity) |
| `just check-windows-clippy` | Full Windows clippy (native Windows, CI parity) |
| `just prepush` | fmt + JS + static cfg guards (all OS, pre-push hook) |
| `just build` | `cargo build --release -p translator` |
| `just run` | Start the server |
| `just setup` | Download Whisper, Opus-MT, and default Piper voices |
| `just` | List all recipes |
After install: just setup once, then just run (or cargo run --release -p translator). Optional: cp .env.example .env for cloud API keys.
Single Rust binary (translator): Axum web server on :5050 + in-process audio engine (STT, translation, TTS).
Browser (app.js) ←SSE→ Axum ←→ audio-core Engine ←→ CoreAudio / models
| Dependency | Purpose | Install |
|---|---|---|
| macOS 14+ | CoreAudio for audio I/O | — |
| Homebrew | Package manager | see brew.sh |
| Rust | App + audio engine | brew install rustup && rustup-init |
| espeak-ng | TTS phonemization | brew install espeak-ng |
| ONNX Runtime | Model inference | brew install onnxruntime |
| BlackHole | Virtual audio routing | Manual download |
| Xcode CLT | C compiler | xcode-select --install |
Optional API keys (cloud STT/translation): Deepgram, OpenRouter
If you prefer step-by-step setup:

    xcode-select --install
    brew install rustup espeak-ng onnxruntime
    rustup-init -y --default-toolchain stable
    source ~/.cargo/env

Download and install BlackHole from existential.audio/blackhole.
You need both:
- BlackHole 16ch — captures audio from your call app (Google Meet, Zoom, etc.)
- BlackHole 2ch — sends translated audio back to the call
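Before configuring the call app, it can help to confirm that both devices are actually installed. The following is a minimal sketch, not part of the project: `has_both_blackhole` is a hypothetical helper that reads a device listing on stdin, and it assumes the devices are listed under the names "BlackHole 16ch" and "BlackHole 2ch".

```shell
# Hypothetical helper: read an audio device listing on stdin and
# succeed only if both BlackHole devices appear in it.
has_both_blackhole() {
  local listing
  listing="$(cat)"
  printf '%s\n' "$listing" | grep -q 'BlackHole 16ch' &&
    printf '%s\n' "$listing" | grep -q 'BlackHole 2ch'
}
```

On macOS the listing can come from `system_profiler SPAudioDataType`, for example `system_profiler SPAudioDataType | has_both_blackhole && echo "BlackHole devices found"`.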
Setup in your call app (Google Meet, Zoom, etc.):
- Open the call in Google Chrome (not Safari)
- Set BlackHole 2ch as the microphone in the call app
- Set BlackHole 16ch as the speakers in the call app
Note: Do NOT use a Multi-Output Device — it may cause audio issues. Set BlackHole devices directly in the call app settings.
TTS voices come from Piper. Run cargo run --release -p translator -- setup to download default voices, Whisper, and Opus-MT models. Additional voices can be downloaded from the web UI.
To download manually:

    mkdir -p models/piper-en models/piper-ru

    # English (default)
    curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx \
      -o models/piper-en/en_US-ryan-medium.onnx
    curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json \
      -o models/piper-en/en_US-ryan-medium.onnx.json

    # Russian (default)
    curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/ru/ru_RU/denis/medium/ru_RU-denis-medium.onnx \
      -o models/piper-ru/ru_RU-denis-medium.onnx
    curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/ru/ru_RU/denis/medium/ru_RU-denis-medium.onnx.json \
      -o models/piper-ru/ru_RU-denis-medium.onnx.json

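The two default voices above share one URL layout, so the manual download can be wrapped in a small function. This is a sketch under the assumption that other voices follow the same `<lang>/<locale>/<voice>/<quality>/` layout as the two examples; `piper_voice_url` and `fetch_piper_voice` are hypothetical names, not project commands.

```shell
# Build the model URL for a Piper voice.
# The layout is inferred from the two default voices; other voices are assumed to match it.
piper_voice_url() {
  local lang="$1" locale="$2" voice="$3" quality="$4"
  printf 'https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/%s/%s/%s/%s/%s-%s-%s.onnx\n' \
    "$lang" "$locale" "$voice" "$quality" "$locale" "$voice" "$quality"
}

# Download the model and its JSON config into models/piper-<lang>/.
fetch_piper_voice() {
  local lang="$1" locale="$2" voice="$3" quality="$4"
  local url dir file
  url="$(piper_voice_url "$lang" "$locale" "$voice" "$quality")"
  dir="models/piper-$lang"
  file="$locale-$voice-$quality.onnx"
  mkdir -p "$dir"
  # -f makes curl fail on HTTP errors instead of saving an error page as a model file.
  curl -fsSL "$url" -o "$dir/$file" &&
    curl -fsSL "$url.json" -o "$dir/$file.json"
}
```

For example, `fetch_piper_voice en en_US ryan medium` fetches the same two files as the English commands above.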
Browse all available voices at rhasspy.github.io/piper-samples.
cp .env.example .env
Edit .env:
DEEPGRAM_API_KEY=your_key_here
GROQ_API_KEY=your_key_here
ORT_DYLIB_PATH=/opt/homebrew/lib/libonnxruntime.dylib
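A wrong `ORT_DYLIB_PATH` is easy to miss, so a quick check before starting the server can save time. The sketch below is a shell-side sanity check only: `env_value` and `check_ort_path` are hypothetical helpers, they assume plain `KEY=value` lines without quoting, and they say nothing about how the application itself loads `.env`.

```shell
# Print the value of KEY from a .env-style file (plain KEY=value lines assumed).
env_value() {
  local key="$1" file="${2:-.env}"
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Fail with a message if ORT_DYLIB_PATH is unset or does not point to an existing file.
check_ort_path() {
  local path
  path="$(env_value ORT_DYLIB_PATH "${1:-.env}")"
  if [ -z "$path" ]; then
    echo "ORT_DYLIB_PATH is not set" >&2
    return 1
  elif [ ! -f "$path" ]; then
    echo "ORT_DYLIB_PATH points to a missing file: $path" >&2
    return 1
  fi
  echo "ONNX Runtime library found: $path"
}
```

Run `check_ort_path` from the repo root after editing `.env`.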

    cargo run --release -p translator -- setup   # first time: download models
    cargo run --release -p translator            # start server

Open http://127.0.0.1:5050 in Chrome.
- Live transcript — chat-style bubbles with original text and translation
- 29 languages — switch language pair from Settings, download voices with one click
- Voice selection — multiple voices per language with preview playback
- Audio monitor — hear translations in your browser (Chrome only)
- Start/Stop — control the engine without restarting
- Mute — independently mute outgoing or incoming pipelines
- Bookmarks — star important phrases, filter to show only starred
- Export — download the full transcript as a text file
- Compact/Full view — toggle between detailed and compact transcript
- Latency metrics — per-phrase STT, translation, TTS, and total latency
- Dark/Light theme — toggle with persistence
| Language | STT | Translation | TTS |
|---|---|---|---|
| Arabic | + | + | + |
| Catalan | + | + | + |
| Chinese | + | + | + |
| Czech | + | + | + |
| Danish | + | + | + |
| Dutch | + | + | + |
| English | + | + | + |
| Finnish | + | + | + |
| French | + | + | + |
| German | + | + | + |
| Greek | + | + | + |
| Hindi | + | + | + |
| Hungarian | + | + | + |
| Indonesian | + | + | + |
| Italian | + | + | + |
| Japanese | + | + | — |
| Korean | + | + | — |
| Latvian | + | + | + |
| Norwegian | + | + | + |
| Persian | + | + | + |
| Polish | + | + | + |
| Portuguese | + | + | + |
| Romanian | + | + | + |
| Russian | + | + | + |
| Spanish | + | + | + |
| Swedish | + | + | + |
| Turkish | + | + | + |
| Ukrainian | + | + | + |
| Vietnamese | + | + | + |
TTS requires downloading a Piper voice model for the language (one-click from the web UI). Japanese and Korean have STT and translation but no Piper TTS voice available.
"Engine not starting"
- Press Start after the page loads (server runs idle until then)
- For local mode: models must be in `models/`; run `cargo run --release -p translator -- setup`
- Verify `ORT_DYLIB_PATH` points to your onnxruntime library
- Run `cargo build -p translator` to check for build errors
"No audio from call"
- Ensure BlackHole 16ch is set as the speakers directly in the call app (not through a Multi-Output Device)
- Check that your call app uses BlackHole 2ch as its microphone
"TTS not working"
- Verify `espeak-ng` is installed: `espeak-ng --version`
- Check that voice model files exist in `models/piper-{lang}/`
- Download voices from Settings in the web UI
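The voice file check can be scripted. This sketch assumes a usable voice consists of a `.onnx` model plus a matching `.onnx.json` config, as in the manual download commands; `has_piper_voice` and the `MODELS_DIR` override are hypothetical, not part of the project.

```shell
# Succeed if models/piper-<lang>/ holds at least one .onnx model
# with a matching .onnx.json config next to it.
has_piper_voice() {
  local dir="${MODELS_DIR:-models}/piper-$1" f
  for f in "$dir"/*.onnx; do
    # If the glob matches nothing, $f is the literal pattern and -f fails.
    [ -f "$f" ] && [ -f "$f.json" ] && return 0
  done
  return 1
}
```

For example, `has_piper_voice ru || echo "No complete Russian voice found"` run from the repo root.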
"No sound in monitor"
- Use Chrome; Safari does not support the audio output routing required for the monitor
- Check your system audio output is set to speakers (not BlackHole)
"OpenRouter key shows invalid"
- Only needed when cloud translation is enabled
- Keys in `.env` work even if the Settings field is empty
MIT — see LICENSE. Copyright (c) 2026 Kai Letov (original author).