STT (Speech to text) 🎙️

A lightweight, fully offline hold-to-record voice-to-text app for Windows
(macOS support included; Linux untested but should work).

Sits in your system tray. Hold Right Alt, speak, release → transcribed text
is pasted into whatever window is focused.

🌐 Project page: marksoft.ro/stt

📥 Download

Pre-built binaries for the latest release are listed below; no Python install is required. You can also grab them from marksoft.ro/stt.

Platform	Download	Run
Windows 10/11 (x64)	STT-windows-x64.exe	Right-click → Run as administrator (global hotkeys need it)
macOS (Apple Silicon)	STT-macos.zip	Unzip → grant Accessibility in System Settings → Privacy & Security
Linux (x64)	STT-linux-x64	`chmod +x STT-linux-x64 && sudo ./STT-linux-x64`

Browse every version on the Releases page.

First launch downloads the Whisper base model (~150 MB) into your HuggingFace cache. Everything after that is fully offline.

Windows SmartScreen warning

On first launch Windows shows the "Windows protected your PC" dialog with an unrecognized-app notice. This happens because the exe isn't code-signed (signing certificates cost roughly $100–500 per year, which is a lot for an open-source hobby project, so for now it's unsigned).

To run it anyway — once per version:

In the blue SmartScreen dialog, click More info (under the big message).
A new Run anyway button appears in the bottom-right. Click it.
Windows remembers your choice; subsequent launches skip the dialog.

The exe is built in public on GitHub Actions. You can inspect every build step in the Actions tab and compare the checksum of your download against the checksum of the artifact that Actions produced.
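To make the checksum comparison concrete, here is a minimal sketch that computes a SHA-256 digest with Python's standard library. The helper name `sha256_of` is hypothetical and not part of `stt.py`; the choice of SHA-256 is also an assumption, since this README does not say which hash to compare.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example (hypothetical paths): hash the file you downloaded and the artifact
# you fetched from the Actions run, then compare the two strings.
# sha256_of(Path("STT-windows-x64.exe")) == sha256_of(Path("artifact/STT-windows-x64.exe"))
```

If the two digests match, the file you downloaded is byte-for-byte the one the workflow produced.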

Quick-start (from source)

1. Prerequisites

Requirement	Notes
Python 3.10+	3.11 recommended
pip	bundled with Python
Windows: Run as Administrator	The `keyboard` library needs elevated privileges for global hotkey hooks
macOS: Accessibility permission	System Settings → Privacy & Security → Accessibility → add Terminal / your app

2. Install dependencies

pip install -r requirements.txt

Windows note: if you see a CTranslate2 / ONNX error, make sure you have the
Microsoft Visual C++ Redistributable (x64) installed.

3. Run

# Windows – must be run as Administrator for global hotkeys
python stt.py
# macOS
python3 stt.py

The app will:

Start and appear in the system tray (blue microphone icon)
Download the base Whisper model on first run (~150 MB, one-time)
Turn amber while loading the model
Turn blue (idle) once ready

Usage

Action	What happens
Hold Right Alt	Recording starts (icon turns red)
Release Right Alt	Recording stops; transcription runs (icon turns green)
Transcription complete	Text is pasted into the focused window; icon returns to blue
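The icon colours above behave like a small state machine. The sketch below models it in plain Python; the class, the event names (`model_ready`, `hotkey_down`, and so on) and the transition table are illustrative assumptions, not code taken from `stt.py`.

```python
from enum import Enum


class TrayState(Enum):
    """Tray icon states and the colour this README associates with each."""
    LOADING = "amber"
    IDLE = "blue"
    RECORDING = "red"
    TRANSCRIBING = "green"


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (TrayState.LOADING, "model_ready"): TrayState.IDLE,
    (TrayState.IDLE, "hotkey_down"): TrayState.RECORDING,
    (TrayState.RECORDING, "hotkey_up"): TrayState.TRANSCRIBING,
    (TrayState.TRANSCRIBING, "text_pasted"): TrayState.IDLE,
}


def next_state(state: TrayState, event: str) -> TrayState:
    """Return the next state, ignoring events that do not apply."""
    return TRANSITIONS.get((state, event), state)
```

Ignoring unknown events means, for example, that pressing the hotkey while the model is still loading leaves the icon amber in this sketch.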

Tray menu (right-click the icon)

Hotkey – shows the currently configured hotkey (read-only)
Switch Model → base / small – hot-switch the Whisper model (downloads if needed)
Quit – exit the app

Configuration (`config.json`)

{
 "model": "base",
 "hotkey": "right alt",
 "language": "en",
 "sample_rate": 16000,
 "channels": 1,
 "beam_size": 5
}

Key	Values	Description
`model`	`"base"` / `"small"`	Whisper model size. `base` ≈ 150 MB, `small` ≈ 500 MB
`hotkey`	Any key name	See keyboard key names
`language`	`"en"`, `"fr"`, ... or `null`	Force language or `null` for auto-detect
`sample_rate`	`16000`	Do not change unless your mic requires it
`channels`	`1`	Mono recommended
`beam_size`	`1`–`10`	Higher = more accurate but slower. `5` is a good default
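As an illustration of how these keys fit together, the following sketch merges a user's `config.json` over the defaults and checks the documented ranges. `load_config` and `DEFAULTS` are hypothetical names; the real loader in `stt.py` may behave differently.

```python
import json
from pathlib import Path

# Defaults mirror the sample config.json shown above.
DEFAULTS = {
    "model": "base",
    "hotkey": "right alt",
    "language": "en",
    "sample_rate": 16000,
    "channels": 1,
    "beam_size": 5,
}


def load_config(path: Path) -> dict:
    """Merge user settings over DEFAULTS and sanity-check a few values."""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    # Only the two models listed in the table are accepted in this sketch.
    if config["model"] not in ("base", "small"):
        raise ValueError(f"unsupported model: {config['model']!r}")
    if not 1 <= config["beam_size"] <= 10:
        raise ValueError("beam_size must be between 1 and 10")
    return config
```

A JSON `null` for `language` arrives in Python as `None`, which is how the auto-detect setting would be represented here.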

Model sizes & trade-offs

Model	Size	Speed	Accuracy
`base`	~150 MB	Very fast	Good
`small`	~500 MB	Fast	Better
`medium`	~1.5 GB	Moderate	Very good
`large-v3`	~3 GB	Slow	Best

To use medium or large-v3, add them to the Switch Model sub-menu in
_build_menu() inside stt.py.

Packaging with PyInstaller

Install PyInstaller

pip install pyinstaller==6.10.0

Windows – single `.exe`

pyinstaller --onefile --noconsole --name stt --icon stt.ico --add-data "config.json;." stt.py

Remove --icon stt.ico if you don't have an icon file.
The --noconsole flag suppresses the console window on Windows.

The executable will be at dist\stt.exe. Optionally copy config.json next to it;
if the file is absent, it is created automatically on first run.

macOS – `.app` bundle

pyinstaller \
 --onefile \
 --windowed \
 --name STT \
 --add-data "config.json:." \
 stt.py

Bundle is at dist/STT.app.
You may need to sign it. For an ad-hoc signature, run: codesign --deep --force --sign - dist/STT.app

Notes on frozen paths

stt.py detects whether it is running as a PyInstaller bundle via
sys._MEIPASS / sys.frozen and resolves config.json relative to the
executable (not the bundle's temp directory) so settings persist between
runs.
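The idea can be sketched as follows. This is a minimal illustration of the frozen-path check, assuming a helper named `app_dir` (hypothetical); the actual code in `stt.py` may differ.

```python
import sys
from pathlib import Path


def app_dir(script_file: str) -> Path:
    """Return the directory where config.json should live."""
    if getattr(sys, "frozen", False):
        # PyInstaller bundle: sys.executable is the built executable itself,
        # whereas sys._MEIPASS is the temp directory a one-file build
        # unpacks into, which does not persist between runs.
        return Path(sys.executable).resolve().parent
    # Plain `python stt.py`: use the directory containing the script.
    return Path(script_file).resolve().parent


# Typical use inside a script:
# CONFIG_PATH = app_dir(__file__) / "config.json"
```

Using `getattr` with a default avoids an `AttributeError` when running from source, where `sys.frozen` is not set.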

Releasing (maintainers)

Pre-built binaries for Windows, macOS, and Linux are produced automatically by .github/workflows/release.yml whenever a v* tag is pushed.

# 1. Tag the commit you want to ship
git tag v0.1.0
git push origin v0.1.0
# 2. GitHub Actions builds STT on windows-latest / macos-latest / ubuntu-latest
# and attaches the three binaries to a new release for that tag.
# Watch progress: https://github.com/NYOGamesCOM/STT/actions

Resulting release assets:

Asset	Platform	Build command
`STT-windows-x64.exe`	Windows 10/11 x64	`pyinstaller --onefile --noconsole`
`STT-macos.zip`	macOS (arm64)	`pyinstaller --onefile --windowed`
`STT-linux-x64`	Linux x64	`pyinstaller --onefile`

The workflow can also be triggered manually from the Actions tab (Run workflow → enter the tag name).

Permanent "latest" download URLs

GitHub exposes a releases/latest/download/<asset> redirect that always points to the newest tagged release — use these on your website so links never go stale:

https://github.com/NYOGamesCOM/STT/releases/latest/download/STT-windows-x64.exe
https://github.com/NYOGamesCOM/STT/releases/latest/download/STT-macos.zip
https://github.com/NYOGamesCOM/STT/releases/latest/download/STT-linux-x64

Version-pinned URLs are also available at releases/download/v0.1.0/<asset> if you need reproducibility.
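If you generate these links programmatically (for a website template, say), a small helper along these lines covers both URL patterns. The function name `download_url` is hypothetical; the repository and asset names are copied from the URLs above.

```python
from typing import Optional

REPO = "NYOGamesCOM/STT"  # taken from the URLs above
ASSETS = ("STT-windows-x64.exe", "STT-macos.zip", "STT-linux-x64")


def download_url(asset: str, tag: Optional[str] = None) -> str:
    """Build a release download URL; tag=None means 'latest'."""
    if asset not in ASSETS:
        raise ValueError(f"unknown asset: {asset!r}")
    base = f"https://github.com/{REPO}/releases"
    if tag is None:
        # Redirects to the newest tagged release.
        return f"{base}/latest/download/{asset}"
    # Pinned to one specific tag.
    return f"{base}/download/{tag}/{asset}"
```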

Troubleshooting

Symptom	Fix
Hotkey not detected (Windows)	Run as Administrator
"Accessibility" error (macOS)	Grant Accessibility permission to Terminal / the `.app`
Model download hangs	Check internet connection; model is cached in `~/.cache/huggingface`
Pasted text is garbled	Try setting `"language": "en"` in `config.json`
Nothing pasted	Make sure the target app accepts Ctrl+V; try clicking it first
`sounddevice` error on Windows	Install portaudio or use the conda package

Benchmarks

We publish reproducible transcription benchmarks — latency, real-time factor, and word-error rate — for the tiny, base, and small models on CPU with int8 quantisation. The benchmark is a regular Python script; anyone can record their own reference clips and run it. See BENCHMARKS.md for methodology, how to run it, and results.
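For readers unfamiliar with the two derived metrics, here is a minimal sketch of how word-error rate and real-time factor are commonly computed. It is not the project's benchmark script; the function names and the simple lowercase, whitespace-split normalisation are assumptions, and BENCHMARKS.md remains the reference for the actual methodology.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Below 1.0 means transcription ran faster than the clip's duration."""
    return processing_seconds / audio_seconds
```

For example, one substituted word in a three-word reference gives a word-error rate of one third, and a 10-second clip transcribed in 2 seconds gives a real-time factor of 0.2.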

Contributors

NYOGamesCOM — creator & maintainer
Claude (Anthropic Opus 4.7) — overlay indicator, history UI, performance pass, build/release plumbing
- assisted by Cursor — editor & pair-programming companion

Built with ❤️ by MarkSoft.

License

MIT – do whatever you like.
