LocalSoundsAPI

License: MIT · Platform: Windows · Python

The ultimate portable, offline all-in-one audio studio
Text-to-Speech · Transcription · Subtitles · Music Generation · Sound Effects · Video Production · AI Chatbot

LocalSoundsAPI gives you both a full-featured browser-based web interface and a complete local REST API — use it interactively or call it from scripts, other apps, or automation tools.

Everything runs locally from one folder — no installation, no internet needed after setup.

Included Engines (all fully local & offline)

XTTS v2 – Top-tier multilingual voice cloning with speaker embeddings
Fish Speech – Extremely fast and expressive cloned voices
Kokoro 82M – Lightning-fast English TTS with 20 premium built-in voices
Stable Audio Open 1.0 – Text-to-music and sound effects (CLAP-scored variants)
ACE-Step 3.5B – Advanced multi-line prompt music generation (style + lyrics)
Whisper – On-demand transcription & quality verification for every generated chunk
Local LLM Chatbot – Built-in llama.cpp assistant for writing prompts, scripts, lyrics, stories, and full projects
OpenRouter / LM Studio support – Optional cloud or external local backends for the chatbot
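
The Whisper entry above describes verifying each generated chunk, and the feature list mentions automatic retries. A minimal sketch of that verify-and-retry idea follows. It is not the project's actual code: `synthesize` and `transcribe` are hypothetical callables standing in for a TTS engine and Whisper, and the threshold and retry budget are assumed values.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional, Tuple


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so only the words are compared."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def similarity(expected: str, heard: str) -> float:
    """Return a 0..1 ratio of how closely the transcript matches the text."""
    return SequenceMatcher(None, normalize(expected), normalize(heard)).ratio()


def generate_verified(
    text: str,
    synthesize: Callable[[str], bytes],  # hypothetical: text -> audio bytes
    transcribe: Callable[[bytes], str],  # hypothetical: audio bytes -> transcript
    threshold: float = 0.9,              # assumed acceptance threshold
    max_attempts: int = 3,               # assumed retry budget
) -> Tuple[Optional[bytes], float, int]:
    """Retry synthesis until the transcript is close enough to the text.

    Returns (best audio so far, its score, number of attempts used).
    """
    best_audio, best_score = None, -1.0
    for attempt in range(1, max_attempts + 1):
        audio = synthesize(text)
        score = similarity(text, transcribe(audio))
        if score > best_score:
            best_audio, best_score = audio, score
        if score >= threshold:
            return best_audio, best_score, attempt
    # Budget exhausted: hand back the closest attempt instead of failing.
    return best_audio, best_score, max_attempts
```

Keeping the best-scoring attempt means a chunk that never reaches the threshold still yields usable audio, which a caller could flag for manual review.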

Key Features

Professional post-processing on every engine
De-reverb, de-essing, loudness normalization (-23 LUFS), intelligent silence trimming, peak limiting, and optional Whisper verification with automatic retries.
Full project system
Save jobs with progress tracking, automatic recovery (##recover##), and persistent job.json files.
Powerful built-in Chatbot
Helps you write perfect prompts, lyrics, stories, or entire scripts. Responses can be sent directly to any TTS or music engine with one click.
Per-model device selection
Every model (XTTS, Fish, Kokoro, Stable Audio, ACE-Step, Whisper, local LLM) can be loaded on CPU or any available GPU independently — perfect for mixing heavy and light models.
Run multiple instances
Use (portable) LocalSoundsAPI-Multi.bat or the Launcher GUI to launch several copies on different ports, which is useful for parallel generation or different model setups.
GUI Launcher
A tkinter desktop app (launcher.bat) that auto-detects GPUs, manages multiple instances, downloads models and tools, and consolidates all server logs into one window, so there are no separate cmd windows to manage.
Video production tool
Turn any audio + transcription into a subtitled video (horizontal/vertical, solid color, transparent, or image/video background).
Settings presets
Save and load all your favorite parameters instantly.
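
To illustrate the loudness normalization (-23 LUFS) and peak limiting listed under post-processing, here is a small arithmetic sketch. It assumes the integrated loudness has already been measured by a proper meter, which is not shown. The hard clip is a crude stand-in for a real limiter, and the `ceiling` value is an assumption. This is not the project's implementation.

```python
from typing import List, Sequence


def gain_to_target_db(measured_lufs: float, target_lufs: float = -23.0) -> float:
    """Gain in dB needed to move the measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


def normalize_and_limit(
    samples: Sequence[float],
    measured_lufs: float,
    target_lufs: float = -23.0,
    ceiling: float = 0.98,  # assumed peak ceiling for float samples in [-1, 1]
) -> List[float]:
    """Apply the normalization gain, then clamp peaks to +/- ceiling."""
    gain = db_to_linear(gain_to_target_db(measured_lufs, target_lufs))
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]
```

For example, a clip measured at -29 LUFS needs +6 dB, which is roughly a 2x amplitude multiplier. Any sample that would then exceed the ceiling is clamped.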

Quick Start – Fully Portable (No Installation)

Download the repository code
Go to the main repo → Code → Download ZIP.
Extract it to any folder you like (e.g., Desktop, Documents, or a USB drive). This is your main project folder.
Download the portable binaries from Releases
Go to Releases and download:
- portable-python-env-v1.7z
- bin.zip
Extract the binaries correctly
- Extract portable-python-env-v1.7z directly into your main project folder → it creates the python/ subfolder.
- Extract bin.zip into the existing bin/ folder (inside your main project folder) → it populates bin/ffmpeg/, bin/rubberband/, and bin/espeak-ng/.
Launch the app
- Launcher GUI (recommended): Double-click launcher.bat → Opens a desktop app where you can select GPUs, add instances on any port, start/stop them, view all logs in one place, and download models or tools.
- Single instance (simple): Double-click (portable) LocalSoundsAPI-Single.bat → Starts on port 5006 and opens http://127.0.0.1:5006 in your browser.
- Multiple instances (command-line): Double-click (portable) LocalSoundsAPI-Multi.bat → Asks how many instances and starting port, then opens separate cmd windows for each.
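
Once one or more instances are running, a short script can confirm that each one is reachable. The sketch below only requests the root URL given above (http://127.0.0.1:5006). It assumes that additional instances use consecutive ports counting up from the starting port, which the launcher scripts may or may not do. No API endpoints are assumed.

```python
import urllib.error
import urllib.request
from typing import List


def instance_urls(start_port: int = 5006, count: int = 1,
                  host: str = "127.0.0.1") -> List[str]:
    """Build base URLs for `count` instances.

    Assumption: instances occupy consecutive ports starting at start_port.
    """
    return [f"http://{host}:{start_port + i}" for i in range(count)]


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False
```

Calling `is_up(url)` for each entry of `instance_urls(5006, 3)` would report which of the three assumed ports currently has a server listening.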

First run only: The app auto-downloads each model the first time it is needed (~8–12 GB in total for all models). Each download happens only once, and the downloads can take 10–40 minutes. Just let them finish.

That's it – completely offline and portable after the first run!

Important Folders

models/ – Place or auto-download TTS/music models here
voices/ – Your reference voice samples for cloning
projects_output/ – All saved jobs and final outputs
brain/ – Chatbot history, archives, and system prompts
settings/ – Your saved parameter presets
bin/ – Bundled ffmpeg, rubberband, eSpeak-ng
python/ – Complete portable Python environment

Project Structure

project-root/
├── ACE-Step/ # Bundled ACE-Step repo (music generation)
├── bin/ # Portable tools
│ ├── ffmpeg/
│ ├── rubberband/
│ └── espeak-ng/
├── brain/ # Chatbot memory
│ ├── context_history/ # Current + archived chats
│ └── system_prompt.json
├── fish-speech/ # Bundled Fish Speech repo
├── models/ # All models (auto-downloaded or placed here)
│ ├── XTTS-v2/
│ ├── fish-speech-1.5/
│ ├── kokoro-82m/
│ ├── stable-audio-open-1.0/
│ ├── ace_step/
│ └── clap-htsat-unfused/
├── projects_output/ # Saved jobs and final outputs
├── voices/ # Your reference voice samples
├── settings/ # Saved parameter presets
├── static/ # Web UI (CSS, JS, icons)
├── templates/ # HTML pages
├── routes/ # All Flask endpoints
├── python/ # Portable Python environment (from the 7z)
├── launcher.py # GUI launcher (instance manager, model downloads)
├── launcher.bat # Runs the launcher with portable Python
├── (portable) LocalSoundsAPI-Single.bat
├── (portable) LocalSoundsAPI-Multi.bat
├── main.py
├── config.py
└── requirements.txt
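
If the app fails to start after extraction, a quick check against the tree above can reveal a misplaced archive. This is a sketch, not a script shipped with the project. The names in `EXPECTED_DIRS` and `EXPECTED_FILES` are taken from the structure listed above, and the helper `missing_entries` is hypothetical.

```python
from pathlib import Path
from typing import List, Union

# Folders that should exist after extracting the repo ZIP,
# portable-python-env-v1.7z, and bin.zip as described in Quick Start.
EXPECTED_DIRS = [
    "python",
    "bin/ffmpeg",
    "bin/rubberband",
    "bin/espeak-ng",
    "models",
    "voices",
    "routes",
    "static",
    "templates",
]
EXPECTED_FILES = ["main.py", "config.py", "launcher.py", "launcher.bat"]


def missing_entries(root: Union[str, Path]) -> List[str]:
    """Return the expected folders and files that are absent under root."""
    root = Path(root)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Running `missing_entries(".")` from the main project folder should return an empty list. An entry such as `python/` in the result suggests the 7z archive was extracted into the wrong location.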

Why This Feels So Smooth

Completely self-contained – The bundled portable Python environment is isolated from your system Python. No pip installs, no conda environments, no dependency conflicts, no PATH headaches. Just extract and run.
Truly offline – After the one-time model downloads, everything works 100% without internet.
No admin rights needed – Perfect for work/school computers or USB stick setups.
Instant multi-GPU support – Load heavy models on your best GPU and lighter ones (Whisper, Kokoro, Fish) on another or on CPU — all from the same interface.

Tips for the Best Experience

First run? Let the app auto-download the models you need (XTTS, Fish, Kokoro, Stable Audio, ACE-Step, CLAP, Whisper). It only happens once per model.
Low VRAM? Use the per-model device selectors — keep big models on your strongest GPU and run Whisper/Kokoro on CPU or a smaller card.
Want to generate faster? Launch multiple instances with (portable) LocalSoundsAPI-Multi.bat or the Launcher GUI: one for TTS, one for music, one for the chatbot, etc.
Chatbot for content creation – Stuck on a prompt or lyric? Ask the built-in assistant — then click the little icons under its reply to send the text straight to XTTS, Fish, Kokoro, Stable Audio, or ACE-Step.
Save everything you like – Use the "Save Path" field to create permanent projects in projects_output/. Temporary generations disappear when you close the app (unless saved).

Enjoy a clean, powerful, completely local creative workflow — no cloud, no subscriptions, no compromises! 🎧✨
