
Note

This project was previously named faster-whisper-server. I've decided to rename it, as the project has evolved to support more than just ASR.

Speaches

speaches is an OpenAI API-compatible server supporting streaming transcription, translation, and speech generation. Speech-to-Text is powered by faster-whisper, and Text-to-Speech is powered by piper and Kokoro. This project aims to be Ollama, but for TTS/STT models.

Try it out on the HuggingFace Space

See the documentation for installation instructions and usage: speaches.ai
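Because the server follows OpenAI's API shape, a text-to-speech call can be sketched with nothing but the Python standard library. The snippet below is a minimal illustration, not the project's official client: the base URL `http://localhost:8000`, the model id `example-tts-model`, and the voice `example-voice` are placeholders assumed for the example, and the `/v1/audio/speech` path is taken from OpenAI's API rather than confirmed against the speaches documentation.

```python
import json
import urllib.request

# Assumption: a speaches server is reachable at this address.
BASE_URL = "http://localhost:8000"


def build_speech_request(text, model, voice, base_url=BASE_URL):
    """Build (but do not send) a POST request for the OpenAI-style speech endpoint."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(request, output_path):
    """Send the request and write the returned audio bytes to output_path."""
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(output_path, "wb") as out:
        out.write(audio)
    return len(audio)


# Placeholder identifiers: replace them with a model and voice your server provides.
request = build_speech_request("Hello from speaches!", "example-tts-model", "example-voice")
```

Calling `save_speech(request, "speech.mp3")` would send the request to a running server. The official OpenAI SDKs should work the same way once their base URL points at the server.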

Features:

- OpenAI API compatible. All tools and SDKs that work with OpenAI's API should work with speaches.
- Audio generation (chat completions endpoint) | OpenAI Documentation
  - Generate a spoken audio summary of a body of text (text in, audio out)
  - Perform sentiment analysis on a recording (audio in, text out)
  - Async speech-to-speech interactions with a model (audio in, audio out)
- Streaming support. Transcription is sent via SSE as the audio is transcribed, so you don't need to wait for the audio to be fully transcribed before receiving results.
- Dynamic model loading / offloading. Just specify which model you want to use in the request and it will be loaded automatically. It will then be unloaded after a period of inactivity.
- Text-to-Speech via kokoro (ranked #1 in the TTS Arena) and piper models.
- GPU and CPU support.
- Deployable via Docker Compose / Docker.
- Highly configurable.
- Realtime API.
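The streaming feature above relies on server-sent events (SSE), where each event arrives as one or more `data:` lines followed by a blank line. As a rough sketch of how a client might consume such a stream, the hypothetical helper `iter_sse_data` below extracts the payload of each event from an iterable of text lines. The payload format speaches emits is not described here, so the helper treats it as opaque text.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event found in `lines`."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space from the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        # Comment lines (starting with ":") and other fields are ignored here.
    if buffer:
        # Leniency: flush a final event even if the stream ended without a blank line.
        yield "\n".join(buffer)


sample_stream = [
    "data: Hello\n",
    "\n",
    ": keep-alive\n",
    "data: wor\n",
    "data: ld\n",
    "\n",
]
for payload in iter_sse_data(sample_stream):
    print(repr(payload))
```

In practice the lines would come from a streaming HTTP response rather than a list, and each payload could be appended to the transcript as it arrives.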

Please create an issue if you find a bug, have a question, or a feature suggestion.
