Hi Kolja,
First off, thank you so much for your amazing work on RealtimeSTT - I really love what you put together! 🙌 (And I am trying really hard to make it work for me. 😄)
Quick question: I’ve been testing on an M1 Max MacBook Pro using faster-whisper-large-v3-turbo-ct2, but I’m consistently getting around 4 seconds latency per transcription.
With superwhisper (using ggml-large-v3-turbo), I’m seeing <1s latency for the same spoken text, at the same time and on the same hardware.
Could this difference be due to the model format (CT2 vs GGML)? Or is there something I might be missing in the RealtimeSTT config? (Which would be my hope. 🙃)
I’ve tried enabling and adjusting so many settings 😅 but haven’t had luck reducing latency.
If you have any recommended settings for fastest response on Apple Silicon, I’d really appreciate it!
Thanks again 🙏
Jil
Superwhisper uses Apple-specific optimizations and takes full advantage of MPS and CoreML. faster-whisper does not use these Apple-specific low-level tools. It's fast in the cross-platform sense, but I doubt it will ever reach superwhisper inference speed.
That said, maybe I can optimize RealtimeSTT more for Apple. I will make a new release soon that exposes faster-whisper's cpu_threads and num_workers parameters. With compute_type="int8" we can probably make use of multithreading on a Mac. I'm hoping that will speed things up on Mac, but I'm not sure whether it will, or by how much.
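Until those parameters are exposed in RealtimeSTT, here is a rough sketch of what passing them to faster-whisper directly could look like. `cpu_threads`, `num_workers`, `compute_type` and `device` are existing `WhisperModel` arguments; the helper names `whisper_cpu_kwargs` and `load_model` are made up for illustration, and splitting the cores evenly across workers is only an assumption, not a tuned recommendation for Apple Silicon.

```python
import os
import sys


def whisper_cpu_kwargs(cores=None, workers=1):
    """Build CPU-oriented keyword arguments for faster_whisper.WhisperModel.

    Hypothetical helper: dividing the cores evenly between workers is an
    assumption for illustration, not a measured optimum.
    """
    cores = cores or os.cpu_count() or 1
    workers = max(1, workers)
    return {
        "device": "cpu",                          # faster-whisper has no MPS/CoreML backend
        "compute_type": "int8",                   # quantized inference on CPU
        "cpu_threads": max(1, cores // workers),  # threads per worker
        "num_workers": workers,                   # parallel transcription workers
    }


def load_model(model_name_or_path, **overrides):
    """Load a model with the settings above; overrides win over the defaults."""
    from faster_whisper import WhisperModel  # imported lazily so the helper works without it

    return WhisperModel(model_name_or_path, **{**whisper_cpu_kwargs(), **overrides})


if __name__ == "__main__":
    # Usage: python sketch.py <model name or path> <audio file>
    model = load_model(sys.argv[1])
    segments, info = model.transcribe(sys.argv[2], beam_size=1)
    for segment in segments:
        print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```

Whether this helps on an M1 is something to measure rather than assume. Timing the same audio clip with a few different `cpu_threads` values would show if multithreading makes a difference.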
Thanks @KoljaB – that sounds really promising! 🙌 Appreciate the insight and all the work you're putting into this. Looking forward to the next release – but no pressure on my account, only if it makes sense for you 😊