Otosaku DSP
- 9 followers
- Cyprus
- otosaku.dsp@gmail.com
Pinned
- OtosakuKWS-iOS Public
Lightweight on-device keyword spotting engine for iOS using CoreML and real-time audio streaming.
- OtosakuStreamingASR-iOS Public
OtosakuStreamingASR-iOS is a real-time speech recognition engine for iOS, built with Swift and Core ML. It uses a fast and lightweight streaming Conformer model optimized for on-device inference. D...
- OtosakuTTS-iOS Public
Swift library for offline text-to-speech synthesis on iOS/macOS. Generate natural speech directly on device using CoreML-optimized FastPitch and HiFiGAN models. No internet required, fully private.
- NeMoConformerASR-iOS Public
On-device speech-to-text for iOS/macOS powered by NVIDIA NeMo Conformer CTC Small (13M params). Pure Swift + CoreML implementation with automatic audio padding, chunking for long audio, and real-ti...
Swift 2
- NeMoSpeaker-iOS Public
Swift library for Speaker Embedding extraction and verification using NVIDIA NeMo TitaNet model converted to CoreML. Extract 192-dim speaker embeddings, verify speakers, and perform real-time speak...
Swift 3
- NeMoVAD-iOS Public
Swift library for Voice Activity Detection (VAD) using NVIDIA NeMo MarbleNet model converted to CoreML. Detect speech segments in real-time on iOS/macOS with high accuracy and low latency.
Swift 2
Repositories
- NeMoConformerASR-Android Public
Kotlin library for on-device speech recognition using NVIDIA NeMo Conformer CTC model with ONNX Runtime
- NeMoFeatureExtractor-Android Public
- NeMoConformerASR-iOS Public
On-device speech-to-text for iOS/macOS powered by NVIDIA NeMo Conformer CTC Small (13M params). Pure Swift + CoreML implementation with automatic audio padding, chunking for long audio, and real-time recognition.
- NeMoSpeaker-iOS Public
Swift library for Speaker Embedding extraction and verification using NVIDIA NeMo TitaNet model converted to CoreML. Extract 192-dim speaker embeddings, verify speakers, and perform real-time speaker diarization on iOS/macOS.
- NeMoVAD-iOS Public
Swift library for Voice Activity Detection (VAD) using NVIDIA NeMo MarbleNet model converted to CoreML. Detect speech segments in real-time on iOS/macOS with high accuracy and low latency.
- NeMoFeatureExtractor-iOS Public
- OtosakuTTS-iOS Public
Swift library for offline text-to-speech synthesis on iOS/macOS. Generate natural speech directly on device using CoreML-optimized FastPitch and HiFiGAN models. No internet required, fully private.
- OtosakuPOSTagger-iOS Public
Swift library for Part-of-Speech tagging using BERT-based CoreML models. Fast, accurate POS tagging for iOS/macOS with automatic model management and a clean API.
- OtosakuFeatureExtractor-iOS Public
Lightweight Swift library for log-Mel spectrogram extraction with Accelerate & CoreML