SpeakEasy

A desktop app for practicing foreign languages with AI. Speak, listen, and get corrections, with an offline-first design and optional cloud TTS for higher-quality voices. Includes a built-in web server for remote access via Tailscale or a local network.

Supports 16 languages — English, Spanish, French, Chinese, Japanese, German, Korean, Portuguese (BR), Italian, Russian, Arabic, Hindi, Turkish, Indonesian, Vietnamese, and Polish — with two practice modes (Free Talk and Scenario Role-Play) and an optional Corrections toggle. The entire interface is localized in all 16 languages.

Features

Free Talk — open conversation practice in the target language
Scenario Mode — 20+ real-world situations per language (cafe, hotel, dentist, etc.) with scenario picker
Native Language — choose any of the 16 supported languages as your native language; all UI, corrections, translations, and scenario descriptions adapt accordingly
Corrections Toggle — enable in either mode to get grammar/meaning feedback in your native language
Replay — re-listen to any message (yours or the assistant's) via TTS
Translate — one-tap translation of assistant messages into your native language, pre-fetched during TTS playback for instant display
Word Lookup — click any target-language word for instant dictionary lookup; select multiple words for contextual explanation with grammar and examples
Personal Dictionary — save looked-up words to your dictionary; browse by language, replay pronunciation, and delete entries
Sample Responses — get 2 suggested replies with native language translations
AI Tutor — speak or type in your native language to get translations into the target language (auto-detected)
CEFR Difficulty — set your proficiency level (A1–C2) per language; AI adapts vocabulary and grammar complexity accordingly
Speaking Courage — gamified scoring that tracks word count, turn count, complexity, and response speed across sessions
External LLM — use Gemini API or any OpenAI-compatible endpoint as an alternative to local LLM
Dual TTS Engine — Edge TTS (online, high quality) or Kokoro (offline, fully private); switchable in settings
Web Interface — access from any device on your network via built-in Axum web server (port 3456); ideal for remote practice over Tailscale
Streaming TTS — sentence-by-sentence audio with natural pauses between sentences
Voice Preview — hear a sample phrase when selecting a voice in settings
Language Reset — switching practice language resets conversation and returns to the initial screen
UI Localization — interface language follows your native language setting (all 16 languages)
Japanese support — MeCab-based kanji-to-kana conversion for accurate TTS pronunciation
CJK support — CJK-aware punctuation handling and word counting
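
As an illustration of what CJK-aware word counting can look like, here is a minimal sketch. It is not taken from the SpeakEasy source: the function name `countWords` and the counting rule (one word per Han or kana character, one word per whitespace-delimited token otherwise) are assumptions chosen for the example. Korean is left to the whitespace rule because Hangul text is space-delimited.

```typescript
// Hypothetical helper, not the project's actual implementation.
// Han ideographs and Japanese kana are counted one character = one word,
// because Chinese and Japanese text has no spaces between words.
const CJK_CHARS = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff]/g;

// CJK punctuation and fullwidth forms act as separators.
const CJK_PUNCT = /[\u3000-\u303f\uff00-\uffef]/g;

// ASCII punctuation plus the Unicode "General Punctuation" block.
const PUNCT = /[\x21-\x2f\x3a-\x40\x5b-\x60\x7b-\x7e\u2000-\u206f]/g;

export function countWords(text: string): number {
  // Each Han/kana character counts as one word.
  const cjkCount = (text.match(CJK_CHARS) || []).length;

  // Blank out CJK characters and punctuation, then count the remaining
  // whitespace-delimited tokens that still contain a non-punctuation character.
  const rest = text.replace(CJK_CHARS, " ").replace(CJK_PUNCT, " ");
  const tokens = rest
    .split(/\s+/)
    .filter((token) => token.replace(PUNCT, "").length > 0);

  return cjkCount + tokens.length;
}
```

With this rule, a mixed sentence such as `Hello 世界!` counts the English token once and each of the two Han characters once. A dictionary-based segmenter would give more natural counts for Chinese and Japanese, at the cost of an extra dependency.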

Architecture

Built with Tauri 2 (Rust backend + React frontend) and three embedded AI engines (STT, LLM, and TTS), plus a built-in web server and a SQLite dictionary store:

Component	Purpose	Technology
STT	Speech-to-text	whisper.cpp via `whisper-rs` (bilingual detection)
LLM	Conversation	llama.cpp (`llama-server` sidecar) or Gemini API
TTS	Text-to-speech	Edge TTS (online) or Kokoro (offline)
Web	Remote access	Axum HTTP/WebSocket server with shared state
Dictionary	Word lookup cache + personal vocabulary	SQLite via `rusqlite`
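
The LLM row mentions a `llama-server` sidecar, and the feature list mentions OpenAI-compatible endpoints. `llama-server` exposes an OpenAI-style `/v1/chat/completions` route, so a request to either backend can be shaped the same way. The sketch below only builds the request and does not send it. The function name `buildChatRequest`, the system prompt wording, and the base URL and model name in the usage note are assumptions for illustration, not values from the SpeakEasy source.

```typescript
// Illustrative types; field names follow the OpenAI chat completions format.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build (but do not send) a streaming chat request.
// The prompt text is a placeholder, not the app's real prompt.
export function buildChatRequest(
  baseUrl: string,
  model: string,
  targetLanguage: string,
  history: ChatMessage[],
  apiKey?: string,
): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Local servers usually need no key; hosted endpoints expect a bearer token.
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;

  const system: ChatMessage = {
    role: "system",
    content: `You are a conversation partner. Reply only in ${targetLanguage}.`,
  };
  const messages: ChatMessage[] = [system, ...history];

  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({ model, messages, stream: true }),
    },
  };
}
```

A caller could pass the result straight to `fetch(req.url, req.init)`. For example, `buildChatRequest("http://127.0.0.1:8080", "local-model", "Spanish", history)` targets `llama-server` on its default port; the port SpeakEasy actually uses for the sidecar is not stated here.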

Prerequisites

Rust (1.70+): https://rustup.rs
Node.js (18+): https://nodejs.org
espeak-ng — required for Kokoro TTS phonemization (the setup wizard can install it automatically):
- macOS: brew install espeak-ng
- Windows: downloaded automatically from the official release
- Linux: sudo apt install espeak-ng or equivalent
MeCab (optional) — improves Japanese TTS pronunciation by converting kanji to kana:
- macOS: brew install mecab mecab-ipadic
- Linux: sudo apt install mecab libmecab-dev mecab-ipadic-utf8
Tauri 2 system dependencies:
- macOS: Xcode Command Line Tools (xcode-select --install)
- Linux: See Tauri prerequisites
- Windows: See Tauri prerequisites

Getting Started

# Clone the repo
git clone https://github.com/yeonsh/speak-easy.git
cd speak-easy
# Install dependencies
npm install
# Run in development mode (desktop + web server)
npm run serve

This builds the frontend and starts the Tauri app with an embedded web server on port 3456.

Remote Access via Tailscale

1. Install Tailscale on both machines.
2. Run npm run serve on your home machine.
3. Open http://<tailscale-ip>:3456 from any device on your tailnet.

The web interface shares all state with the desktop app, so models load once and are available to both interfaces. The web server port defaults to 3456 and is configurable via the SPEAKEASY_WEB_PORT environment variable.
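
The port override can be sketched as follows. The real lookup presumably lives in the Rust backend, so treat this as an illustration only: the helper names `resolveWebPort` and `remoteUrl`, and the choice to fall back to 3456 when the value is missing or invalid, are assumptions.

```typescript
const DEFAULT_WEB_PORT = 3456;

// Hypothetical helper: read SPEAKEASY_WEB_PORT from an environment map and
// fall back to the default when it is unset, non-numeric, or out of range.
export function resolveWebPort(env: Record<string, string | undefined>): number {
  const raw = (env["SPEAKEASY_WEB_PORT"] || "").trim();
  if (!/^\d+$/.test(raw)) return DEFAULT_WEB_PORT;

  const port = parseInt(raw, 10);
  return port >= 1 && port <= 65535 ? port : DEFAULT_WEB_PORT;
}

// Build the URL another device on the tailnet would open.
export function remoteUrl(host: string, env: Record<string, string | undefined>): string {
  return `http://${host}:${resolveWebPort(env)}`;
}
```

For example, `remoteUrl("100.64.0.1", { SPEAKEASY_WEB_PORT: "8080" })` yields `http://100.64.0.1:8080`, where the address is a placeholder for your machine's Tailscale IP.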

On first launch, the setup wizard guides you through downloading the required models and tools:

Whisper model (~150 MB) — for speech recognition
llama-server binary (~45 MB) — the LLM inference engine
GGUF language model — pick one:
- Qwen3 4B (~2.5 GB) — fast, good for casual practice
- Qwen3 30B-A3B (~17 GB) — higher quality conversations
espeak-ng — phonemizer for TTS (auto-install via Homebrew on macOS or MSI on Windows)
Kokoro TTS — two files covering all languages:
- Kokoro model (~325 MB) — the neural TTS engine
- Voice pack (~28 MB) — 50+ voices across all supported languages

Everything downloads with one click. All files are stored in ~/.speakeasy/.
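
To estimate how much disk space the first-launch download needs, the approximate sizes listed above can be summed. This is a rough sketch: it treats 1 GB as 1000 MB and leaves out espeak-ng, whose size is not stated. The constant and function names are made up for the example.

```typescript
// Approximate sizes in MB, copied from the list above.
const BASE_DOWNLOADS_MB = {
  whisper: 150,
  llamaServer: 45,
  kokoroModel: 325,
  voicePack: 28,
};

// The two selectable GGUF models (1 GB treated as 1000 MB).
const LLM_MODELS_MB = {
  "Qwen3 4B": 2500,
  "Qwen3 30B-A3B": 17000,
};

// Total download for a given model choice, in GB rounded to one decimal place.
export function totalDownloadGb(model: keyof typeof LLM_MODELS_MB): number {
  const baseMb =
    BASE_DOWNLOADS_MB.whisper +
    BASE_DOWNLOADS_MB.llamaServer +
    BASE_DOWNLOADS_MB.kokoroModel +
    BASE_DOWNLOADS_MB.voicePack;
  const totalMb = baseMb + LLM_MODELS_MB[model];
  return Math.round(totalMb / 100) / 10;
}
```

Under these assumptions the smaller model brings the total to roughly 3.0 GB and the larger one to roughly 17.5 GB.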

Building for Production

npm run tauri build

The output is in src-tauri/target/release/bundle/.

Project Structure

src/ # React frontend
 components/ # UI: ChatView, MicButton, SetupWizard, etc.
 hooks/ # useLlm, useStt, useTts, useAudioRecorder
 lib/ # Types, per-language prompts, i18n, backend adapter
src-tauri/src/ # Rust backend
 lib.rs # Tauri command registration
 llm.rs # llama-server lifecycle management
 chat.rs # Streaming chat + TTS pipeline, explain/suggest/lookup commands
 gemini.rs # Gemini API integration (streaming + non-streaming)
 dictionary.rs # SQLite dictionary cache + personal vocabulary store
 courage.rs # Speaking courage scoring algorithm and trend analysis
 session.rs # Session persistence and review generation
 stt.rs # Whisper transcription with bilingual detection
 tts.rs # TTS engine dispatch (Kokoro/Edge), text cleaning, sentence splitting
 edge_tts.rs # Edge TTS via msedge-tts (online)
 downloads.rs # Model download with progress events
 settings.rs # Settings persistence
 web.rs # Axum web server (REST API, WebSocket, static files)
 event_bus.rs # Broadcast channel for Tauri-to-WebSocket event bridging
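
The tree above describes `chat.rs` as a streaming chat + TTS pipeline and `tts.rs` as handling sentence splitting. One way such a pipeline can work is to buffer streamed LLM text and release each sentence to TTS as soon as it is complete. The sketch below shows that idea in TypeScript. It is an assumption-laden illustration, not a port of the Rust code: the class name `SentenceBuffer` and the terminator set are invented for the example, and the splitter is naive (it would split a decimal such as `3.5`).

```typescript
// Sentence terminators: ASCII . ! ? plus the CJK full stop and fullwidth ! ?
const TERMINATORS = ".!?\u3002\uff01\uff1f";

function isTerminator(ch: string): boolean {
  return TERMINATORS.indexOf(ch) !== -1;
}

// Hypothetical buffer that turns streamed text chunks into whole sentences.
export class SentenceBuffer {
  private pending = "";

  // Append a streamed chunk and return the sentences it completed.
  // A sentence is released only once a non-terminator character follows its
  // final terminator, so "Wait." + ".. what?" is not cut inside the ellipsis.
  push(chunk: string): string[] {
    this.pending += chunk;
    const done: string[] = [];
    let start = 0;

    for (let i = 0; i + 1 < this.pending.length; i++) {
      if (isTerminator(this.pending[i]) && !isTerminator(this.pending[i + 1])) {
        const sentence = this.pending.slice(start, i + 1).trim();
        if (sentence.length > 0) done.push(sentence);
        start = i + 1;
      }
    }

    this.pending = this.pending.slice(start);
    return done;
  }

  // Call when the stream ends to release whatever text remains.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Each string returned by `push` could be handed to the TTS engine immediately, which is what lets audio start before the full reply has been generated.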

Data Directories

Path	Contents
`~/.speakeasy/models/`	Whisper models (`.bin`) and LLM models (`.gguf`)
`~/.speakeasy/voices/`	Kokoro TTS model (`kokoro-v1.0.onnx`) and voice pack (`voices-v1.0.bin`)
`~/.speakeasy/bin/`	Downloaded `llama-server` binary
`~/.speakeasy/settings.json`	User preferences (persisted across sessions)
`~/.speakeasy/dictionary.db`	SQLite cache for word lookups, personal vocabulary, sessions, and courage scores
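
For scripts that need to inspect or back up these files, the layout in the table can be captured in a small helper. This is a sketch only: `dataPaths` and the `DataPaths` interface are hypothetical names, the home directory is passed in rather than detected, and forward slashes are used throughout, so Windows callers may want to normalize the separators.

```typescript
// Hypothetical description of the ~/.speakeasy layout from the table above.
export interface DataPaths {
  root: string;
  models: string;
  kokoroModel: string;
  voicePack: string;
  bin: string;
  settings: string;
  dictionary: string;
}

// Resolve every documented path relative to the given home directory.
export function dataPaths(homeDir: string): DataPaths {
  // Drop trailing separators so the joined paths have no doubled slash.
  const root = `${homeDir.replace(/[\\/]+$/, "")}/.speakeasy`;
  return {
    root,
    models: `${root}/models`,
    kokoroModel: `${root}/voices/kokoro-v1.0.onnx`,
    voicePack: `${root}/voices/voices-v1.0.bin`,
    bin: `${root}/bin`,
    settings: `${root}/settings.json`,
    dictionary: `${root}/dictionary.db`,
  };
}
```

For instance, `dataPaths("/home/ana").dictionary` resolves to `/home/ana/.speakeasy/dictionary.db`, where `/home/ana` is a placeholder home directory.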

License

See LICENSE.
