Releases: itsdevcoffee/mojo-audio

v0.2.0 - macOS Support 🍎

04 Feb 22:57
@itsdevcoffee itsdevcoffee


🎉 macOS Support Arrives!

mojo-audio now runs natively on macOS, processing 30s of audio in 8.34ms on Apple Silicon!

✨ What's New

Platform Support:

  • ✅ macOS Apple Silicon (M1/M2/M3/M4)
  • ✅ macOS Intel (x86_64)
  • ✅ Linux x86_64

Performance (Apple Silicon M-series):

  • 30s audio: 8.34ms (3,598x realtime)
  • 10s audio: 2.81ms (3,562x realtime)
  • 1s audio: 0.40ms (2,530x realtime)
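
The realtime factor is the audio duration divided by the processing time. A minimal sketch of that arithmetic, using the timings listed above (the function name `realtime_factor` is illustrative and not part of mojo-audio):

```python
def realtime_factor(audio_seconds, processing_ms):
    """Seconds of audio processed per second of compute."""
    return audio_seconds / (processing_ms / 1000.0)


# (audio duration in seconds, processing time in milliseconds) from the list above
timings = [(30, 8.34), (10, 2.81), (1, 0.40)]

for seconds, ms in timings:
    print(f"{seconds:>2}s audio: {realtime_factor(seconds, ms):,.0f}x realtime")
```

Recomputing from the rounded timings lands within about 1-2% of the published factors; the small gap is presumably because the published figures were derived from unrounded measurements.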

vs Python librosa:

  • 1.75x faster on 30s audio
  • 4.5x faster on 10s audio
  • 1.6x faster on 1s audio

🔧 Technical Improvements

  • Fixed Mojo 0.26.2+ API compatibility
  • Switched to OpenBLAS for cross-platform BLAS support
  • Inlined FFI types for reliable shared library builds
  • All tests passing on macOS and Linux

📦 Installation

macOS (Apple Silicon, arm64):

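curl -L https://github.com/itsdevcoffee/mojo-audio/releases/download/v0.2.0/mojo-audio-v0.2.0-macos-arm64.tar.gz | tar xz
# /usr/local/lib and /usr/local/include may not exist on a fresh macOS install
sudo mkdir -p /usr/local/lib /usr/local/include
sudo cp libmojo_audio.dylib /usr/local/lib/
sudo cp mojo_audio.h /usr/local/include/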

Build from source:

pixi install
pixi run build-ffi-optimized

See README.md for detailed build instructions.


mojo-audio v0.1.0 - First Public Release

22 Jan 09:57
@itsdevcoffee itsdevcoffee


mojo-audio v0.1.0 - Release Notes

Release Date: January 22, 2026

High-performance mel spectrogram preprocessing library in Mojo that beats Python's librosa by 1.5-3.6x on short/medium audio.


🎉 What's New

First Public Release

Complete Whisper-compatible audio preprocessing pipeline built from scratch in Mojo, achieving production-ready performance and beating Python's librosa at all audio durations.

Performance Highlights

Duration      mojo-audio   librosa    Result
1 second      1.1 ms       4.0 ms     3.6x faster
10 seconds    7.5 ms       15.3 ms    2.0x faster
30 seconds    27.4 ms      30.4 ms    1.1x faster
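
The "Result" column is the librosa time divided by the mojo-audio time. A short sketch that reproduces it from the timings in the table (the helper name `speedup` is illustrative):

```python
def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# (duration, mojo-audio ms, librosa ms) copied from the table above
rows = [
    ("1 second", 1.1, 4.0),
    ("10 seconds", 7.5, 15.3),
    ("30 seconds", 27.4, 30.4),
]

for label, mojo_ms, librosa_ms in rows:
    print(f"{label}: {speedup(librosa_ms, mojo_ms):.1f}x faster")
# 1 second: 3.6x faster
# 10 seconds: 2.0x faster
# 30 seconds: 1.1x faster
```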

Key advantages:

  • 1.5-3.6x faster on short/medium audio (most common use case)
  • Far more consistent (5-10% variance vs librosa's 22-39%)
  • ~1100x realtime throughput on 30s audio
  • Zero dependencies, pure Mojo implementation

📦 Release Assets

Pre-built Binaries (FFI)

mojo-audio-v0.1.0-linux-x86_64.tar.gz (15KB)

Pre-built shared library for Linux x86_64:

  • libmojo_audio.so - Optimized shared library (26KB, -O3)
  • mojo_audio.h - C header file
  • INSTALL.md - Installation and usage guide

Use from:

  • ✅ C/C++
  • ✅ Rust
  • ✅ Python (ctypes/cffi)
  • ✅ Go (cgo)
  • ✅ Any language with C FFI support
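
As a rough illustration of the Python (ctypes) route, the sketch below only locates and loads the shared library. It does not call any exported function, because the function names and signatures are defined in mojo_audio.h and are not reproduced in these notes; the helper `load_mojo_audio` is a hypothetical name.

```python
import ctypes
import ctypes.util


def load_mojo_audio(path=None):
    """Return a ctypes handle to libmojo_audio, or None if it cannot be loaded."""
    # Use an explicit path if given, otherwise ask the system linker cache.
    candidate = path or ctypes.util.find_library("mojo_audio")
    if candidate is None:
        return None
    try:
        return ctypes.CDLL(candidate)
    except OSError:
        return None


lib = load_mojo_audio()
print("libmojo_audio loaded" if lib else "libmojo_audio not found; install it first")
```

Once the library is loaded, each function declared in mojo_audio.h would need its `argtypes` and `restype` set to match the header before it is called.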

mojo-audio-ffi-examples.tar.gz (3.1KB)

FFI usage examples:

  • demo_c.c - Complete C example
  • demo_rust.rs - Complete Rust example
  • Makefile - Build system for examples

Source Code

  • Source code (zip) - Auto-generated by GitHub
  • Source code (tar.gz) - Auto-generated by GitHub

✨ Features

Core Audio Processing

  • Mel Spectrogram Pipeline - Complete Whisper-compatible preprocessing
  • Window Functions - Hann and Hamming with SIMD optimization
  • FFT Operations - Radix-2/4 iterative FFT + true RFFT for real signals
  • STFT - Parallelized short-time Fourier transform across CPU cores
  • Mel Filterbank - Sparse-optimized triangular filters (80 or 128 bands)
  • Normalization - Multiple modes (Whisper, min-max, z-score, raw)
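
For readers unfamiliar with the window functions named above, here is a plain-Python sketch of the Hann and Hamming formulas. It shows the math only, not mojo-audio's SIMD implementation; the function names and the `periodic` flag are assumptions made for this example.

```python
import math


def _cosine_window(n, a0, a1, periodic):
    """Generalized cosine window: a0 - a1 * cos(2*pi*i / denom)."""
    if n == 1:
        return [1.0]
    # Periodic windows divide by n (common for STFT); symmetric ones by n - 1.
    denom = n if periodic else n - 1
    return [a0 - a1 * math.cos(2.0 * math.pi * i / denom) for i in range(n)]


def hann(n, periodic=True):
    return _cosine_window(n, 0.5, 0.5, periodic)


def hamming(n, periodic=True):
    return _cosine_window(n, 0.54, 0.46, periodic)


window = hann(400)
print(f"first={window[0]:.3f} middle={window[200]:.3f}")
```

The Hann window tapers to zero at the edges, while the Hamming window keeps a small non-zero pedestal (0.08) there.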

FFI Support

  • Zero-overhead C API - Same performance as native Mojo
  • Type-safe interface - Proper error handling and memory management
  • Comprehensive examples - C, Rust, Python usage samples
  • Easy installation - Pre-built binaries for Linux x86_64

Testing & Benchmarking

  • 17 test cases - All passing
  • Robust methodology - Deterministic signals, warmup, outlier exclusion
  • Comparison scripts - Back-to-back mojo-audio vs librosa
  • Web UI - Interactive benchmark visualization

🚀 Quick Start

Native Mojo

git clone https://github.com/itsdevcoffee/mojo-audio.git
cd mojo-audio
pixi install
pixi run demo-mel

FFI (C/Rust/Python)

# Download and install pre-built library
wget https://github.com/itsdevcoffee/mojo-audio/releases/download/v0.1.0/mojo-audio-v0.1.0-linux-x86_64.tar.gz
tar xzf mojo-audio-v0.1.0-linux-x86_64.tar.gz
cd linux-x86_64
sudo cp libmojo_audio.so /usr/local/lib/
sudo cp mojo_audio.h /usr/local/include/
sudo ldconfig
# Test it
wget https://github.com/itsdevcoffee/mojo-audio/releases/download/v0.1.0/mojo-audio-ffi-examples.tar.gz
tar xzf mojo-audio-ffi-examples.tar.gz
gcc demo_c.c -lmojo_audio -lm -o demo
./demo

🔧 Requirements

Native Mojo

  • Mojo 0.26.1 or later
  • pixi package manager
  • Linux (primary platform)

FFI (Pre-built Binaries)

  • Linux x86_64
  • glibc 2.31 or later
  • No Mojo installation required!

🎯 Whisper Compatibility

Fully compatible with OpenAI Whisper models:

  • ✅ Sample rate: 16kHz
  • ✅ FFT size: 400
  • ✅ Hop length: 160 (10ms frames)
  • ✅ Mel bands: 80 (Whisper models through large-v2) or 128 (large-v3)
  • ✅ Output shape: (n_mels, ~3000) for 30s
  • ✅ Normalization: Whisper-compatible mode
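
The numbers above are related by simple arithmetic. The sketch below derives the frame duration, window duration, and frame count from the listed sample rate, FFT size, and hop length. It counts frames by hop only and ignores padding conventions, which is why the output shape above is given as approximately 3000; the constant and function names are illustrative.

```python
SAMPLE_RATE = 16_000  # Hz
N_FFT = 400           # samples per analysis window
HOP_LENGTH = 160      # samples between consecutive frames


def frame_count(duration_s):
    """Number of hops that fit in the signal (padding ignored)."""
    return int(duration_s * SAMPLE_RATE) // HOP_LENGTH


hop_ms = 1000 * HOP_LENGTH / SAMPLE_RATE
window_ms = 1000 * N_FFT / SAMPLE_RATE
freq_bins = N_FFT // 2 + 1  # non-redundant bins of a real-input FFT

print(hop_ms)           # 10.0
print(window_ms)        # 25.0
print(freq_bins)        # 201
print(frame_count(30))  # 3000
```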

📚 Documentation


🏆 Achievements

  • 🥇 Beats librosa at all durations (1.1-3.6x faster)
  • 🚀 17-68x speedup from naive to optimized implementation
  • ~1100x realtime throughput on 30s audio
  • 📊 Far more consistent than librosa (5-10% vs 22-39% variance)
  • 100% from scratch in Mojo - no black-box dependencies
  • 🎓 Complete learning resource with educational examples

🔬 Technical Highlights

9 Major Optimizations

  1. Iterative FFT (3.0x) - Cache-friendly Cooley-Tukey
  2. Twiddle Precompute (1.7x) - Pre-computed rotation factors
  3. Sparse Mel Filterbank (1.24x) - Skip zero-weight bins
  4. Twiddle Caching (2.0x) - Reuse across frames
  5. Float32 Precision (1.07x) - 2x SIMD width
  6. True RFFT (1.43x) - Exploits real signal symmetry
  7. Parallelization (1.3-1.7x) - Multi-core frame processing
  8. Radix-4 FFT (1.1-1.2x) - Optimized butterflies
  9. Compiler Optimization (1.2-1.5x) - -O3 flag

Total: 17-68x faster than naive implementation!
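
To make the first two items concrete, here is a small pure-Python sketch of an iterative radix-2 Cooley-Tukey FFT with a precomputed twiddle table. It illustrates the technique only and is not mojo-audio's code: there is no SIMD, radix-4, RFFT, or parallelism, and the name `fft_iterative` is hypothetical.

```python
import cmath


def fft_iterative(x):
    """In-order iterative radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [complex(v) for v in x]

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            out[i], out[j] = out[j], out[i]

    # Twiddle factors are computed once and reused by every stage.
    twiddles = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

    size = 2
    while size <= n:
        half = size // 2
        step = n // size  # stride into the shared twiddle table
        for start in range(0, n, size):
            for k in range(half):
                a = out[start + k]
                b = out[start + k + half] * twiddles[k * step]
                out[start + k] = a + b
                out[start + k + half] = a - b
        size *= 2
    return out


spectrum = fft_iterative([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print([round(abs(v), 6) for v in spectrum])
```

The iterative form replaces recursion with loops over contiguous blocks, and the shared twiddle table avoids recomputing the same complex exponentials in every stage.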


🐛 Known Issues

None reported. This is the initial stable release.


🤝 Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Areas where we'd love help:

  • Additional FFI language bindings (Go, Julia, Zig, Swift)
  • ARM platform support and optimization
  • Additional audio features (MFCC, CQT, etc.)
  • Documentation improvements

📝 License

MIT License - See LICENSE


🔗 Links


📊 Benchmark Methodology

All benchmarks use improved methodology:

  • Signal: Deterministic chirp (20Hz-8000Hz, reproducible)
  • Warmup: 5 runs (JIT stabilization)
  • Iterations: 20 (statistical significance)
  • Analysis: Outlier exclusion, mean ± std dev
  • Comparison: Back-to-back mojo-audio vs librosa
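
A minimal sketch of a harness following the steps above, in plain Python. Several details are assumptions, since the notes do not specify them: the chirp is taken to be linear at a 16kHz sample rate, and outliers are defined as samples more than two standard deviations from the mean. The names `chirp` and `benchmark` are illustrative and are not the project's benchmark scripts.

```python
import math
import statistics
import time


def chirp(duration_s=1.0, sample_rate=16_000, f0=20.0, f1=8_000.0):
    """Deterministic linear chirp sweeping f0 -> f1 (assumed linear sweep)."""
    n = int(duration_s * sample_rate)
    out = []
    for i in range(n):
        t = i / sample_rate
        phase = 2.0 * math.pi * (f0 * t + 0.5 * (f1 - f0) * t * t / duration_s)
        out.append(math.sin(phase))
    return out


def benchmark(fn, *args, warmup=5, iterations=20, max_sigma=2.0):
    """Time fn(*args): warm up, collect samples, drop outliers, summarize."""
    for _ in range(warmup):
        fn(*args)

    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn(*args)
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    # Assumed outlier rule: keep samples within max_sigma std devs of the mean.
    mean = statistics.mean(samples_ms)
    std = statistics.pstdev(samples_ms)
    kept = [s for s in samples_ms if std == 0 or abs(s - mean) <= max_sigma * std]

    return {
        "mean_ms": statistics.mean(kept),
        "std_ms": statistics.pstdev(kept),
        "kept": len(kept),
        "total": len(samples_ms),
    }


signal = chirp(1.0)
stats = benchmark(lambda s: sum(v * v for v in s), signal)
print(f"{stats['mean_ms']:.3f} ms ± {stats['std_ms']:.3f} ms "
      f"({stats['kept']}/{stats['total']} samples kept)")
```

Using a deterministic input means every run processes identical data, so differences between runs reflect timing noise and not changes in the signal.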

Run benchmarks yourself:

pixi run bench-compare # Full comparison
pixi run bench-stable 5 # Stable (5 runs, median)

Built with Mojo 🔥 | Faster than Python | Production-Ready
