Realtime Call Translator

README in Russian (README.ru.md)

Real-time speech translator for video/voice calls. Translates both sides of the conversation live — you speak your language, the other person hears theirs, and vice versa.

How it works: Your mic audio goes through Speech-to-Text, is translated (locally by Opus-MT by default, or by a cloud LLM when enabled), then is synthesized back to speech and routed into your call. The same happens in reverse for the other person's audio.
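
The three stages described above can be pictured as a small pipeline. This is only a sketch to illustrate the data flow, not the project's actual code: the trait names, `Pipeline`, and the stub types are all hypothetical, and real audio would arrive as a stream rather than one buffer.

```rust
// Hypothetical stage boundaries: each trait stands for one step of the flow.
trait SpeechToText {
    fn transcribe(&self, audio: &[f32]) -> String;
}
trait Translator {
    fn translate(&self, text: &str) -> String;
}
trait TextToSpeech {
    fn synthesize(&self, text: &str) -> Vec<f32>;
}

/// One direction of the call (for example, your mic to the other person).
struct Pipeline<S, T, V> {
    stt: S,
    mt: T,
    tts: V,
}

impl<S: SpeechToText, T: Translator, V: TextToSpeech> Pipeline<S, T, V> {
    /// Runs audio through STT, translation, then TTS, in that order.
    fn process(&self, audio: &[f32]) -> Vec<f32> {
        let text = self.stt.transcribe(audio);
        let translated = self.mt.translate(&text);
        self.tts.synthesize(&translated)
    }
}

// Stubs so the sketch compiles and can be exercised without any models.
struct StubStt;
impl SpeechToText for StubStt {
    fn transcribe(&self, _audio: &[f32]) -> String {
        "hello".to_string()
    }
}

struct StubTranslator;
impl Translator for StubTranslator {
    fn translate(&self, text: &str) -> String {
        format!("[ru] {}", text)
    }
}

struct StubTts;
impl TextToSpeech for StubTts {
    /// Emits one silent sample per character, just to produce some output.
    fn synthesize(&self, text: &str) -> Vec<f32> {
        vec![0.0; text.chars().count()]
    }
}
```

A full session would hold two such pipelines, one per direction, for example `Pipeline { stt: StubStt, mt: StubTranslator, tts: StubTts }` for outgoing audio and a second instance with the language pair reversed for incoming audio.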

Supports 29 languages for STT and translation, 27 of them also with TTS. Voice models come from Piper; download any language directly from the web UI.

Note: macOS 14+ is the only production-ready platform today. A Windows port is in active development (see docs/windows.md). A Linux port has been started on the feat/linux branch; contributions are welcome.

Quick Start

git clone https://github.com/org-event/OpenPolySphere.git
cd OpenPolySphere
./scripts/bootstrap # dev deps + git hooks (like bun install)
cargo run --release -p translator -- setup # download models (first time)
cargo run --release -p translator # start server

Open http://127.0.0.1:5050 in Google Chrome.

Local mode (default): Whisper STT + Opus-MT translation — no API keys required. Cloud STT/translation optional via Settings.

After clone (developers)

./scripts/bootstrap is the first command after git clone — like bun install in a JS project. It installs the just task runner if needed, then runs just install.

What just install does:

Step	macOS	Linux / Windows
Rust toolchain + rustfmt/clippy	yes	yes
Homebrew: espeak-ng, onnxruntime, bun, pre-commit	yes	skipped (see manual install)
`bun install --frozen-lockfile` (ESLint for `web/static/js`)	yes	yes
`pre-commit install` → runs `just check` on commit	yes	yes

If just is already installed, you can run just install directly instead of ./scripts/bootstrap.

Common commands (run from the repo root):

Command	Purpose
`./scripts/bootstrap`	First-time dev setup after clone
`just install`	Same as bootstrap (without installing `just`)
`just install-linux-deps`	One-time apt packages (Linux only)
`just fetch-ort`	ONNX Runtime download / path hints
`just check`	rustfmt, clippy, ESLint, Swift build (macOS only)
`just check-linux-clippy`	Full Linux clippy (native Linux, CI parity)
`just check-windows-clippy`	Full Windows clippy (native Windows, CI parity)
`just prepush`	fmt + JS + static cfg guards (all OS, pre-push hook)
`just build`	`cargo build --release -p translator`
`just run`	Start the server
`just setup`	Download Whisper, Opus-MT, and default Piper voices
`just`	List all recipes

After install: just setup once, then just run (or cargo run --release -p translator). Optional: cp .env.example .env for cloud API keys.

Architecture

Single Rust binary (translator): Axum web server on :5050 + in-process audio engine (STT, translation, TTS).

Browser (app.js) ←SSE→ Axum ←→ audio-core Engine ←→ CoreAudio / models
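
For readers unfamiliar with SSE (Server-Sent Events), the sketch below shows what one event looks like on the wire. It is an illustration of the standard text format only: the server itself would normally rely on Axum's own SSE support, and the `sse_frame` helper and the `transcript` event name are hypothetical. SSE carries server-to-browser events; how the browser sends commands back is not described here.

```rust
/// Formats one Server-Sent Events frame.
/// Each line of `data` becomes its own `data:` field, and a blank line
/// terminates the event, as the SSE text format requires.
fn sse_frame(event: &str, data: &str) -> String {
    let mut frame = format!("event: {}\n", event);
    for line in data.split('\n') {
        frame.push_str("data: ");
        frame.push_str(line);
        frame.push('\n');
    }
    frame.push('\n'); // blank line marks the end of the event
    frame
}
```

For example, `sse_frame("transcript", "hi")` yields `"event: transcript\ndata: hi\n\n"`, which a browser `EventSource` would deliver to a listener registered for `transcript`.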

Requirements

Dependency	Purpose	Install
macOS 14+	CoreAudio for audio I/O	—
Homebrew	Package manager	see brew.sh
Rust	App + audio engine	`brew install rustup && rustup-init`
espeak-ng	TTS phonemization	`brew install espeak-ng`
ONNX Runtime	Model inference	`brew install onnxruntime`
BlackHole	Virtual audio routing	Manual download
Xcode CLT	C compiler	`xcode-select --install`

Optional API keys (cloud STT/translation): Deepgram, OpenRouter

Manual Installation

If you prefer step-by-step setup:

1. System packages

xcode-select --install
brew install rustup espeak-ng onnxruntime
rustup-init -y --default-toolchain stable
source ~/.cargo/env

2. BlackHole audio driver

Download and install from existential.audio/blackhole.

You need both:

BlackHole 16ch — captures audio from your call app (Google Meet, Zoom, etc.)
BlackHole 2ch — sends translated audio back to the call

Setup in your call app (Google Meet, Zoom, etc.):

Open the call in Google Chrome (not Safari)
Set BlackHole 2ch as the microphone in the call app
Set BlackHole 16ch as the speakers in the call app

Note: Do NOT use a Multi-Output Device — it may cause audio issues. Set BlackHole devices directly in the call app settings.

3. Download voice models

TTS voices come from Piper. Run cargo run --release -p translator -- setup to download default voices, Whisper, and Opus-MT models. Additional voices can be downloaded from the web UI.

To download manually:

mkdir -p models/piper-en models/piper-ru
# English (default)
curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx \
 -o models/piper-en/en_US-ryan-medium.onnx
curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json \
 -o models/piper-en/en_US-ryan-medium.onnx.json
# Russian (default)
curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/ru/ru_RU/denis/medium/ru_RU-denis-medium.onnx \
 -o models/piper-ru/ru_RU-denis-medium.onnx
curl -sL https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/ru/ru_RU/denis/medium/ru_RU-denis-medium.onnx.json \
 -o models/piper-ru/ru_RU-denis-medium.onnx.json
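
The two downloads above follow the same URL layout, so the pattern can be written as a small helper. This is a sketch that assumes other voices use the same layout as the English and Russian examples; `piper_voice_urls` is a hypothetical name, and it is worth checking the voice list before relying on it for a different locale.

```rust
/// Base URL taken from the curl commands above.
const PIPER_BASE: &str = "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0";

/// Builds the model (`.onnx`) and config (`.onnx.json`) URLs for a voice.
/// `locale` is e.g. "en_US"; the language directory is the part before `_`.
fn piper_voice_urls(locale: &str, voice: &str, quality: &str) -> (String, String) {
    let lang = locale.split('_').next().unwrap_or(locale);
    let model = format!(
        "{}/{}/{}/{}/{}/{}-{}-{}.onnx",
        PIPER_BASE, lang, locale, voice, quality, locale, voice, quality
    );
    let config = format!("{}.json", model);
    (model, config)
}
```

Calling `piper_voice_urls("ru_RU", "denis", "medium")` reproduces the two Russian URLs used in the commands above.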

Browse all available voices at rhasspy.github.io/piper-samples.

4. Environment variables

cp .env.example .env

Edit .env:

DEEPGRAM_API_KEY=your_key_here
GROQ_API_KEY=your_key_here
ORT_DYLIB_PATH=/opt/homebrew/lib/libonnxruntime.dylib
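
A wrong ORT_DYLIB_PATH is an easy mistake to make, so it can help to sanity-check the file before starting the server. The sketch below is a simplified stand-in, not the application's own loader: `parse_env` and `ort_path_problem` are hypothetical helpers, and the parser ignores quoting, `export` prefixes, and variable expansion.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Parses `KEY=value` lines; blank lines and `#` comments are skipped.
fn parse_env(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Returns a description of what is wrong with ORT_DYLIB_PATH,
/// or `None` if it is set and points at an existing file.
fn ort_path_problem(env: &HashMap<String, String>) -> Option<String> {
    match env.get("ORT_DYLIB_PATH") {
        None => Some("ORT_DYLIB_PATH is not set".to_string()),
        Some(path) if !Path::new(path).is_file() => Some(format!("no file at {}", path)),
        Some(_) => None,
    }
}
```

A typical use would be `ort_path_problem(&parse_env(&std::fs::read_to_string(".env")?))`, printing the message if one is returned.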

5. Build and run

cargo run --release -p translator -- setup # first time: download models
cargo run --release -p translator # start server

Open http://127.0.0.1:5050 in Chrome.

Web UI Features

Live transcript — chat-style bubbles with original text and translation
29 languages — switch language pair from Settings, download voices with one click
Voice selection — multiple voices per language with preview playback
Audio monitor — hear translations in your browser (Chrome only)
Start/Stop — control the engine without restarting
Mute — independently mute outgoing or incoming pipelines
Bookmarks — star important phrases, filter to show only starred
Export — download the full transcript as a text file
Compact/Full view — toggle between detailed and compact transcript
Latency metrics — per-phrase STT, translation, TTS, and total latency
Dark/Light theme — toggle with persistence
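
To make the per-phrase latency breakdown concrete, here is a minimal sketch of how such a record could be modeled. The `PhraseLatency` type and its field names are hypothetical, and it assumes the total is simply the sum of the three stages; the real total may also include queueing or audio routing time.

```rust
use std::time::Duration;

/// Latency of one translated phrase, split by pipeline stage.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PhraseLatency {
    stt: Duration,
    translation: Duration,
    tts: Duration,
}

impl PhraseLatency {
    /// Sum of the three stage latencies (an assumption, see above).
    fn total(&self) -> Duration {
        self.stt + self.translation + self.tts
    }

    /// One-line summary in milliseconds, suitable for a log or tooltip.
    fn summary(&self) -> String {
        format!(
            "stt={}ms translation={}ms tts={}ms total={}ms",
            self.stt.as_millis(),
            self.translation.as_millis(),
            self.tts.as_millis(),
            self.total().as_millis()
        )
    }
}
```

With stage timings of 420 ms, 180 ms, and 250 ms, `total()` is 850 ms, which shows at a glance which stage dominates a slow phrase.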

Supported Languages

Language	STT	Translation	TTS
Arabic	+	+	+
Catalan	+	+	+
Chinese	+	+	+
Czech	+	+	+
Danish	+	+	+
Dutch	+	+	+
English	+	+	+
Finnish	+	+	+
French	+	+	+
German	+	+	+
Greek	+	+	+
Hindi	+	+	+
Hungarian	+	+	+
Indonesian	+	+	+
Italian	+	+	+
Japanese	+	+	—
Korean	+	+	—
Latvian	+	+	+
Norwegian	+	+	+
Persian	+	+	+
Polish	+	+	+
Portuguese	+	+	+
Romanian	+	+	+
Russian	+	+	+
Spanish	+	+	+
Swedish	+	+	+
Turkish	+	+	+
Ukrainian	+	+	+
Vietnamese	+	+	+

TTS requires downloading a Piper voice model for the language (one-click from the web UI). Japanese and Korean have STT and translation but no Piper TTS voice available.

Troubleshooting

"Engine not starting"

Press Start after the page loads (server runs idle until then)
For local mode, the models must be present in models/; if they are missing, run cargo run --release -p translator -- setup
Verify ORT_DYLIB_PATH points to your onnxruntime library
Run cargo build -p translator to check for build errors

"No audio from call"

Ensure BlackHole 16ch is set as the speakers in the call app (directly, not through a Multi-Output Device)
Check that your call app uses BlackHole 2ch as its microphone

"TTS not working"

Verify espeak-ng is installed: espeak-ng --version
Check that voice model files exist in models/piper-{lang}/
Download voices from Settings in the web UI
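
The model file check above can be automated. The sketch below lists the voices in models/piper-{lang}/ that have both the `.onnx` model and its `.onnx.json` config, which is the pairing the manual download step produces. `complete_voices` is a hypothetical helper, not part of the translator binary.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Returns the `.onnx` files in `<models_dir>/piper-<lang>/` that also have
/// a matching `.onnx.json` next to them, sorted by path.
/// A missing or unreadable directory yields an empty list.
fn complete_voices(models_dir: &Path, lang: &str) -> Vec<PathBuf> {
    let dir = models_dir.join(format!("piper-{}", lang));
    let entries = match fs::read_dir(&dir) {
        Ok(entries) => entries,
        Err(_) => return Vec::new(),
    };
    let mut voices: Vec<PathBuf> = entries
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        // Keep only model files; `x.onnx.json` has extension "json".
        .filter(|path| path.extension().map_or(false, |ext| ext == "onnx"))
        // Require the config file `<model>.json` alongside the model.
        .filter(|path| {
            let mut config = path.clone().into_os_string();
            config.push(".json");
            Path::new(&config).is_file()
        })
        .collect();
    voices.sort();
    voices
}
```

An empty result from `complete_voices(Path::new("models"), "ru")` means no usable Russian voice is installed yet.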

"No sound in monitor"

Use Chrome: Safari does not support the audio output routing required for the monitor
Check that your system audio output is set to speakers (not BlackHole)

"OpenRouter key shows invalid"

Only needed when cloud translation is enabled
Keys in .env work even if the Settings field is empty
