LocalSoundsAPI

The ultimate portable, offline all-in-one audio studio
Text-to-Speech · Transcription · Subtitles · Music Generation · Sound Effects · Video Production · AI Chatbot

LocalSoundsAPI gives you both a full-featured browser-based web interface and a complete local REST API — use it interactively or call it from scripts, other apps, or automation tools.

Everything runs locally from one folder — no installation, no internet needed after setup.

Included Engines (all fully local & offline)

XTTS v2 – Top-tier multilingual voice cloning with speaker embeddings
Fish Speech – Extremely fast and expressive cloned voices
Kokoro 82M – Lightning-fast English TTS with 20 premium built-in voices
Stable Audio Open 1.0 – Text-to-music and sound effects (CLAP-scored variants)
ACE-Step 3.5B – Advanced multi-line prompt music generation (style + lyrics)
Whisper – On-demand transcription & quality verification for every generated chunk
Local LLM Chatbot – Built-in llama.cpp assistant for writing prompts, scripts, lyrics, stories, and full projects
OpenRouter / LM Studio support – Optional cloud or external local backends for the chatbot
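
The Whisper entry above mentions quality verification for every generated chunk. A minimal sketch of how such a verify-and-retry loop could work is shown here. The `synthesize` and `transcribe` callables are hypothetical stand-ins, not LocalSoundsAPI functions, and the threshold and attempt count are assumed values rather than the app's settings.

```python
import difflib
import re
from typing import Callable


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so only the words are compared."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


def similarity(expected: str, heard: str) -> float:
    """Return a 0..1 ratio between the intended text and the transcription."""
    return difflib.SequenceMatcher(None, normalize(expected), normalize(heard)).ratio()


def generate_verified(
    text: str,
    synthesize: Callable[[str], bytes],   # hypothetical: text -> audio bytes
    transcribe: Callable[[bytes], str],   # hypothetical: audio bytes -> text
    threshold: float = 0.9,               # assumed value, not taken from the app
    max_attempts: int = 3,                # assumed value, not taken from the app
) -> tuple[bytes, float, int]:
    """Regenerate a chunk until its transcription is close enough to the text.

    Returns (audio, best_score, attempts_used). If no attempt reaches the
    threshold, the best-scoring attempt is returned.
    """
    best_audio, best_score = b"", -1.0
    for attempt in range(1, max_attempts + 1):
        audio = synthesize(text)
        score = similarity(text, transcribe(audio))
        if score > best_score:
            best_audio, best_score = audio, score
        if score >= threshold:
            return best_audio, best_score, attempt
    return best_audio, best_score, max_attempts
```

Keeping the best-scoring attempt means a chunk is still produced when every retry falls short, at the cost of possibly returning imperfect audio.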

Key Features

Professional post-processing on every engine – De-reverb, de-essing, loudness normalization (-23 LUFS), intelligent silence trimming, peak limiting, and optional Whisper verification with automatic retries.
Full project system – Save jobs with progress tracking, automatic recovery (##recover##), and persistent job.json files.
Powerful built-in Chatbot – Helps you write perfect prompts, lyrics, stories, or entire scripts. Responses can be sent directly to any TTS or music engine with one click.
Per-model device selection – Every model (XTTS, Fish, Kokoro, Stable Audio, ACE-Step, Whisper, local LLM) can be loaded on CPU or any available GPU independently, which is ideal for mixing heavy and light models.
Run multiple instances – Use (portable) LocalSoundsAPI-Multi.bat to launch several copies on different ports, for parallel generation or different model setups.
Video production tool – Turn any audio + transcription into a subtitled video (horizontal/vertical; solid color, transparent, or image/video background).
Settings presets – Save and load all your favorite parameters instantly.
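
The feature list above names loudness normalization to -23 LUFS with peak limiting. As a rough illustration only, the sketch below builds an ffmpeg command that uses ffmpeg's `loudnorm` filter in single-pass mode. This README does not show the app's actual processing chain, so the choice of filter, the true-peak ceiling, and the executable name are all assumptions.

```python
from pathlib import Path


def loudnorm_command(
    src: Path,
    dst: Path,
    target_lufs: float = -23.0,   # target stated in the feature list
    true_peak_db: float = -1.5,   # assumed ceiling, not taken from the app
    ffmpeg: str = "ffmpeg",       # assumed executable name or path
) -> list[str]:
    """Build an ffmpeg command that applies single-pass loudness normalization."""
    audio_filter = f"loudnorm=I={target_lufs:g}:TP={true_peak_db:g}"
    # -y overwrites dst if it already exists.
    return [ffmpeg, "-y", "-i", str(src), "-af", audio_filter, str(dst)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building the command separately from running it makes it easy to inspect or log before anything is written to disk.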

Quick Start – Fully Portable (No Installation)

1. Download the repository code
   Go to the main repo → Code → Download ZIP.
   Extract it to any folder you like (e.g., Desktop, Documents, or a USB drive). This is your main project folder.
2. Download the portable binaries from Releases
   Go to Releases and download:
   - portable-python-env-v1.7z
   - bin.zip
3. Extract the binaries correctly
   - Extract portable-python-env-v1.7z directly into your main project folder → it creates the python/ subfolder.
   - Extract bin.zip into the existing bin/ folder (inside your main project folder) → it populates bin/ffmpeg/, bin/rubberband/, and bin/espeak-ng/.
4. Launch the app
   - Single instance (recommended for most users):
     Double-click (portable) LocalSoundsAPI-Single.bat
     → It always starts on port 5006 and opens http://127.0.0.1:5006 in your browser.
   - Multiple instances (for running several generations in parallel):
     Double-click (portable) LocalSoundsAPI-Multi.bat
     → It will ask you:
     • How many instances do you want?
     • Starting from which port? (e.g., 5006, 5007, 5008...)
     Each instance gets its own port and browser tab.
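
When scripting against one or more running instances, it can help to compute their base URLs and check that each one answers. The sketch below assumes the multi-instance launcher assigns consecutive ports from the starting port, which the example ports above suggest but do not guarantee. It probes only the base URL, since this README does not list the API endpoints.

```python
import urllib.error
import urllib.request


def instance_urls(count: int, start_port: int = 5006, host: str = "127.0.0.1") -> list[str]:
    """Return one base URL per instance, assuming consecutive ports."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [f"http://{host}:{start_port + i}" for i in range(count)]


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at the URL, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server responded, just not with a 2xx status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timed out, or host unreachable
```

For example, `[u for u in instance_urls(3) if is_up(u)]` lists the instances that are currently reachable.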

First run only: The app auto-downloads the models it needs (~8–12 GB in total). Each model is downloaded once, the first time it is used, and the downloads can take 10–40 minutes. Just let them finish.

That's it – completely offline and portable after the first run!

Important Folders

models/ – Place or auto-download TTS/music models here
voices/ – Your reference voice samples for cloning
projects_output/ – All saved jobs and final outputs
brain/ – Chatbot history, archives, and system prompts
settings/ – Your saved parameter presets
bin/ – Bundled FFmpeg, Rubber Band, and eSpeak NG
python/ – Complete portable Python environment

Project Structure

project-root/
├── ACE-Step/ # Bundled ACE-Step repo (music generation)
├── bin/ # Portable tools
│ ├── ffmpeg/
│ ├── rubberband/
│ └── espeak-ng/
├── brain/ # Chatbot memory
│ ├── context_history/ # Current + archived chats
│ └── system_prompt.json
├── fish-speech/ # Bundled Fish Speech repo
├── models/ # All models (auto-downloaded or placed here)
│ ├── XTTS-v2/
│ ├── fish-speech-1.5/
│ ├── kokoro-82m/
│ ├── stable-audio-open-1.0/
│ ├── ace_step/
│ └── clap-htsat-unfused/
├── projects_output/ # Saved jobs and final outputs
├── voices/ # Your reference voice samples
├── settings/ # Saved parameter presets
├── static/ # Web UI (CSS, JS, icons)
├── templates/ # HTML pages
├── routes/ # All Flask endpoints
├── python/ # Portable Python environment (from the 7z)
├── (portable) LocalSoundsAPI-Single.bat
├── (portable) LocalSoundsAPI-Multi.bat
├── main.py
├── config.py
└── requirements.txt
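
A quick way to confirm that the archives were extracted into the right places is to check for a few paths from the tree above. This is a minimal sketch: the list of paths is an illustrative selection, not a check the app itself performs.

```python
from pathlib import Path

# Paths taken from the tree above that the portable setup relies on.
EXPECTED = [
    "python",
    "bin/ffmpeg",
    "bin/rubberband",
    "bin/espeak-ng",
    "main.py",
]


def missing_paths(project_root: Path, expected: list[str] = EXPECTED) -> list[str]:
    """Return the expected relative paths that do not exist under project_root."""
    return [rel for rel in expected if not (project_root / rel).exists()]
```

Calling `missing_paths(Path("."))` from the main project folder returns an empty list when everything is in place.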

Why This Feels So Smooth

Completely self-contained – The bundled portable Python environment is isolated from your system Python. No pip installs, no conda environments, no dependency conflicts, no PATH headaches. Just extract and run.
Truly offline – After the initial model downloads, which happen only once per model, everything works 100% without internet.
No admin rights needed – Perfect for work/school computers or USB stick setups.
Instant multi-GPU support – Load heavy models on your best GPU and lighter ones (Whisper, Kokoro, Fish) on another or on CPU — all from the same interface.
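
The heavy/light split described above can be written down as a simple placement rule. The sketch below is illustrative only: the device labels (`cuda:0`, `cpu`) follow the common PyTorch naming and are an assumption, and so is the grouping of models into "heavy". In the app itself, devices are chosen through the per-model selectors in the interface.

```python
HEAVY = ["XTTS", "Stable Audio", "ACE-Step", "local LLM"]   # assumed grouping
LIGHT = ["Whisper", "Kokoro", "Fish"]                        # named as lighter above


def plan_devices(gpu_memory_gb: dict[str, float]) -> dict[str, str]:
    """Map each model to a device label.

    gpu_memory_gb maps a device label (e.g. "cuda:0") to its memory in GB.
    Heavy models go to the largest GPU; light models go to the second-largest
    GPU if there is one, otherwise to the CPU.
    """
    ranked = sorted(gpu_memory_gb, key=gpu_memory_gb.get, reverse=True)
    heavy_device = ranked[0] if ranked else "cpu"
    light_device = ranked[1] if len(ranked) > 1 else "cpu"
    plan = {model: heavy_device for model in HEAVY}
    plan.update({model: light_device for model in LIGHT})
    return plan
```

With `{"cuda:0": 8.0, "cuda:1": 24.0}`, the heavy models land on `cuda:1` and the light ones on `cuda:0`. With a single GPU, the light models fall back to the CPU.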

Tips for the Best Experience

First run? Let the app auto-download the models you need (XTTS, Fish, Kokoro, Stable Audio, ACE-Step, CLAP, Whisper). It only happens once per model.
Low VRAM? Use the per-model device selectors — keep big models on your strongest GPU and run Whisper/Kokoro on CPU or a smaller card.
Want to generate faster? Launch multiple instances with (portable) LocalSoundsAPI-Multi.bat: one for TTS, one for music, one for the chatbot, etc.
Chatbot for content creation – Stuck on a prompt or lyric? Ask the built-in assistant — then click the little icons under its reply to send the text straight to XTTS, Fish, Kokoro, Stable Audio, or ACE-Step.
Save everything you like – Use the "Save Path" field to create permanent projects in projects_output/. Temporary generations disappear when you close the app (unless saved).

Enjoy a clean, powerful, completely local creative workflow — no cloud, no subscriptions, no compromises! 🎧✨
