Stop paying humans to listen to corrupted audio files. Fix them automatically.
Sonic Gate is a CLI-first audio/video quality gate that uses deterministic audio analysis to catch corrupted, invalid, or low-quality audio files before they reach human reviewers or downstream pipelines.
Optional AI Probe: Includes an experimental Whisper-based speech quality probe (disabled by default) for users who want to detect language mismatches or speech quality issues.
- Traditional Analysis (Fast & Deterministic):
  - LUFS loudness measurement (FFmpeg ebur128)
  - Silence detection (pydub)
  - Duration validation
  - Format/corruption checking
- Video Support: Auto-extract audio from MP4, MOV, AVI, MKV, WebM
- Fix Mode: Auto-trim silence, normalize LUFS, non-destructive repairs
- Multiple Formats: Table, JSON, CSV, Markdown output
- Optional AI Probe: Whisper-based speech detection (off by default)
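To make the LUFS check above concrete, here is a minimal sketch of how integrated loudness can be read from FFmpeg's `ebur128` filter. It is not Sonic Gate's actual implementation; the function names `parse_integrated_lufs` and `measure_lufs` are hypothetical, and the sketch assumes `ffmpeg` is on the `PATH`.

```python
import re
import subprocess

# Matches the integrated-loudness line of the ebur128 summary,
# e.g. "    I:         -18.3 LUFS". Per-frame log lines start with
# "[Parsed_ebur128_0 ..." and therefore do not match the line anchor.
_INTEGRATED_RE = re.compile(r"^\s*I:\s*(-?\d+(?:\.\d+)?)\s*LUFS", re.MULTILINE)


def parse_integrated_lufs(ffmpeg_stderr: str) -> float:
    """Extract the integrated loudness (LUFS) from ebur128 log output."""
    matches = _INTEGRATED_RE.findall(ffmpeg_stderr)
    if not matches:
        raise ValueError("no integrated loudness found in FFmpeg output")
    # The summary is printed last, so take the final match.
    return float(matches[-1])


def measure_lufs(path: str) -> float:
    """Run FFmpeg's ebur128 filter on a file and return integrated LUFS.

    Hypothetical helper: requires the ffmpeg binary to be installed.
    """
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-nostats", "-i", path,
         "-filter_complex", "ebur128", "-f", "null", "-"],
        capture_output=True, text=True, check=True,
    )
    # FFmpeg writes filter logs to stderr, not stdout.
    return parse_integrated_lufs(result.stderr)
```

Keeping the parsing separate from the subprocess call means the regex can be tested against captured log text without FFmpeg installed.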
```bash
# Install the deterministic core (fast, no AI dependencies)
pip install sonic-gate

# With optional AI probe (includes Whisper)
pip install "sonic-gate[ai]"
```
Or install from source:
```bash
git clone https://github.com/Codinglone/sonic-gate.git
cd sonic-gate
pip install -e .
```
```bash
# Analyze a single file (deterministic only, fast)
sonic-gate interview.wav

# Analyze a directory
sonic-gate ./recordings/

# With custom config
sonic-gate --config gate.yaml ./podcasts/

# Fix failed files automatically
sonic-gate --fix ./recordings/

# JSON output for CI
sonic-gate --format json ./files/ > report.json

# Demo mode
sonic-gate demo
```
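When the CLI is driven from a Python pipeline instead of a shell, the invocations above can be assembled programmatically. The sketch below uses only the flags shown in this README (`--config`, `--fix`, `--format`); the helper names are hypothetical, and combining several flags in one call is an assumption. This README does not say whether the exit code signals failed files, so the sketch returns it without interpreting it.

```python
import subprocess
from typing import List, Optional


def build_command(target: str,
                  config: Optional[str] = None,
                  fix: bool = False,
                  output_format: Optional[str] = None) -> List[str]:
    """Build a sonic-gate argv list from the flags documented above."""
    cmd = ["sonic-gate"]
    if config:
        cmd += ["--config", config]
    if fix:
        cmd.append("--fix")
    if output_format:
        cmd += ["--format", output_format]
    cmd.append(target)
    return cmd


def write_json_report(target: str, report_path: str) -> int:
    """Run the gate with JSON output and save stdout to report_path.

    Returns the process exit code unchanged; its meaning is not
    documented here, so callers should not assume it reflects failures.
    """
    cmd = build_command(target, output_format="json")
    with open(report_path, "w", encoding="utf-8") as fh:
        return subprocess.run(cmd, stdout=fh, text=True).returncode
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.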
```yaml
rules:
  traditional:
    max_silence_seconds: 3.0
    lufs_range: [-24, -16]
  ai_probe:
    enabled: false  # Whisper is OFF by default
output:
  format: table
  show_passed: false
```
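As an illustration of what the two `traditional` thresholds mean, the following sketch checks a measured loudness and a list of silent ranges against them. It is not the tool's own code: `TraditionalRules` and `check_traditional` are hypothetical names, silent ranges are assumed to be `(start_ms, end_ms)` pairs (the shape pydub's silence detection returns), and `max_silence_seconds` is assumed to limit the longest continuous silent stretch.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TraditionalRules:
    # Defaults mirror the example config above.
    max_silence_seconds: float = 3.0
    lufs_range: Tuple[float, float] = (-24.0, -16.0)


def check_traditional(lufs: float,
                      silent_ranges_ms: Sequence[Tuple[int, int]],
                      rules: TraditionalRules) -> List[str]:
    """Return a list of human-readable problems; empty means the file passes."""
    problems: List[str] = []

    low, high = rules.lufs_range
    if not (low <= lufs <= high):
        problems.append(f"loudness {lufs:.1f} LUFS outside [{low}, {high}]")

    # Assumption: the rule applies to the longest single silent stretch.
    longest_s = max((end - start for start, end in silent_ranges_ms),
                    default=0) / 1000.0
    if longest_s > rules.max_silence_seconds:
        problems.append(
            f"silence of {longest_s:.1f}s exceeds {rules.max_silence_seconds}s"
        )
    return problems
```

For example, `check_traditional(-30.0, [(500, 4500)], TraditionalRules())` reports both a loudness problem and a 4-second silence.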
```yaml
rules:
  traditional:
    max_silence_seconds: 3.0
    lufs_range: [-24, -16]
  ai_probe:
    enabled: true                   # Enable Whisper
    whisper_model: base             # tiny/base/small/medium/large
    min_confidence: -1.0            # Logprob threshold (negative values)
    expected_language: en           # Optional language check
    speaking_rate_range: [100, 180]
fix:
  enabled: false
  output_dir: ./fixed
  normalize_lufs: -16.0
output:
  format: table
  show_passed: false
```
Note: The AI probe uses Whisper's log-probability confidence scores, which are always negative; values closer to zero indicate higher confidence. Typical values range from -0.5 (good) to -5.0 (poor). Adjust `min_confidence` based on your audio quality and language.
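The AI probe settings can be read as three independent threshold checks. The sketch below shows one plausible interpretation and is not Sonic Gate's implementation: `check_speech` is a hypothetical function, its inputs are assumed to come from a Whisper transcription (average log-probability, detected language, word count), and `speaking_rate_range` is assumed to be in words per minute.

```python
from typing import List, Optional, Tuple


def check_speech(avg_logprob: float,
                 detected_language: str,
                 word_count: int,
                 duration_seconds: float,
                 min_confidence: float = -1.0,
                 expected_language: Optional[str] = "en",
                 speaking_rate_range: Tuple[float, float] = (100, 180)) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems: List[str] = []

    # Log-probabilities are negative: -0.4 is better than -2.5.
    if avg_logprob < min_confidence:
        problems.append(f"confidence {avg_logprob:.2f} below {min_confidence}")

    # The language check is optional; None disables it.
    if expected_language and detected_language != expected_language:
        problems.append(
            f"language '{detected_language}' != expected '{expected_language}'"
        )

    # Assumption: speaking rate is measured in words per minute.
    if duration_seconds > 0:
        wpm = word_count / (duration_seconds / 60.0)
        low, high = speaking_rate_range
        if not (low <= wpm <= high):
            problems.append(f"speaking rate {wpm:.0f} wpm outside [{low}, {high}]")
    return problems
```

Because the scores are negative, raising `min_confidence` toward zero makes the gate stricter, and lowering it (for example to -2.0) makes it more lenient for noisy recordings.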
| Analyzer | Speed | Notes |
|---|---|---|
| Traditional (LUFS, silence, format) | ~4ms/file | Deterministic, always accurate |
| AI Probe (Whisper tiny) | ~200ms/file | Optional, experimental |
| Video extraction | +100ms/file | One-time FFmpeg extract |
Recommendation: Use traditional analysis for batch processing. Enable AI probe only when you need speech-specific checks.
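To see why the recommendation matters at batch scale, the per-file figures from the table can be turned into a rough estimate. This is back-of-the-envelope arithmetic that assumes files are processed one after another and that the table's timings hold on your hardware; `estimate_seconds` is a hypothetical helper.

```python
# Approximate per-file costs taken from the table above (milliseconds).
TRADITIONAL_MS = 4
WHISPER_TINY_MS = 200
VIDEO_EXTRACT_MS = 100


def estimate_seconds(n_files: int, per_file_ms: float) -> float:
    """Estimate sequential processing time in seconds."""
    return n_files * per_file_ms / 1000.0


if __name__ == "__main__":
    n = 10_000
    print(f"traditional:  {estimate_seconds(n, TRADITIONAL_MS):.0f} s")
    print(f"whisper tiny: {estimate_seconds(n, WHISPER_TINY_MS) / 60:.1f} min")
    print(f"video extract overhead: {estimate_seconds(n, VIDEO_EXTRACT_MS) / 60:.1f} min")
```

For 10,000 files, this gives about 40 seconds for traditional analysis versus roughly 33 minutes for the Whisper tiny probe.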
- Python 3.9+
- FFmpeg (for LUFS and video support)
MIT