A real-time text-to-speech server with HTTP API and local audio playback.

    cmake -Bbuild -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$PWD/install -DCMAKE_POLICY_VERSION_MINIMUM=3.5
    cmake --build build
    cmake --install build

- CMake 3.26+
- C++17 compiler
- PortAudio (`brew install portaudio` on macOS, `sudo apt install portaudio19-dev` on Linux)

    ./build/tts_server <model.onnx> <model.onnx.json> <espeak-ng-data>

The server listens on http://0.0.0.0:9999.

    # Speak immediately
    curl -X POST http://localhost:9999/ -d "Hello world"

    # Stream text (buffers until punctuation)
    curl -X POST http://localhost:9999/stream -d "Hello, "
    curl -X POST http://localhost:9999/stream -d "world."
    curl -X POST http://localhost:9999/flush

    # Cancel playback
    curl -X POST http://localhost:9999/cancel

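
The same streaming calls can be scripted from a program instead of typed by hand. The following is a minimal Python sketch, not part of this project: the helper names `build_request`, `post`, and `stream_chunks` are hypothetical, and it assumes the server accepts the raw text as the POST body exactly as in the curl commands and answers with an ordinary HTTP status.

```python
import urllib.request

# Host and port follow the curl examples; adjust if the server runs elsewhere.
BASE_URL = "http://localhost:9999"


def build_request(path: str, text: str = "") -> urllib.request.Request:
    """Build a POST request whose body is the text encoded as UTF-8."""
    return urllib.request.Request(
        BASE_URL + path,
        data=text.encode("utf-8"),
        method="POST",
    )


def post(path: str, text: str = "") -> int:
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError if the server is not reachable.
    """
    with urllib.request.urlopen(build_request(path, text), timeout=10) as resp:
        return resp.status


def stream_chunks(chunks) -> None:
    """Feed text fragments to /stream, then flush whatever is still buffered."""
    for chunk in chunks:
        post("/stream", chunk)
    post("/flush")
```

With the server running, `stream_chunks(["Hello, ", "world."])` would issue the same three requests as the streaming curl example, and `post("/cancel")` would mirror the cancel call.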

    curl -X POST http://localhost:9999/synthesize -d "Hello world" -o output.raw

    # Convert to MP3
    ffmpeg -f f32le -ar 22050 -ac 1 -i output.raw output.mp3

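
If ffmpeg is not available, the raw output can also be wrapped in a WAV container with the Python standard library. This is a sketch under assumptions taken from the ffmpeg flags above: the response body is headerless 32-bit float little-endian PCM, mono, at 22050 Hz. The function name `f32le_to_wav` is hypothetical, and the samples are reduced to 16-bit integers because the `wave` module writes integer PCM only.

```python
import array
import sys
import wave

SAMPLE_RATE = 22050  # from the ffmpeg flag -ar 22050
CHANNELS = 1         # from the ffmpeg flag -ac 1


def f32le_to_wav(raw: bytes, wav_path: str) -> int:
    """Write raw float32 little-endian PCM as a 16-bit WAV file.

    Returns the number of samples written.
    """
    floats = array.array("f")
    # Ignore a trailing partial sample if the byte count is not a multiple of 4.
    floats.frombytes(raw[: len(raw) - len(raw) % 4])
    if sys.byteorder == "big":
        floats.byteswap()  # the input is little-endian

    # Clip each sample to [-1.0, 1.0], then scale to the signed 16-bit range.
    ints = array.array(
        "h", (int(max(-1.0, min(1.0, s)) * 32767) for s in floats)
    )

    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(ints.tobytes())
    return len(ints)
```

For example, `f32le_to_wav(open("output.raw", "rb").read(), "output.wav")` would convert the file saved by the curl command above.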
Copyright (c) 2026 Edge AI, LLC. All rights reserved.