About Karaoke-style movie generation #1206

ashyrv started this conversation in General

Hey everyone! I have a question about the karaoke-style speech recognition. Do you think it will work in real time? It seems to work really well on recorded audio, but will it also work on live speech? Any feedback or answers would be appreciated.

Thank you in advance!


Replies: 2 comments


Whisper is not real-time, nor is it likely to ever be real-time. It works by processing audio 30 seconds at a time, and that processing, even when GPU-accelerated, can take a significant fraction of a second, or even several seconds.
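To make the 30-second windowing concrete, here is a rough sketch in plain Python. It is not Whisper's actual code: the helper names `split_into_windows` and `worst_case_delay` are made up for illustration, the 16 kHz sample rate is an assumption about the input audio, and the 2-second processing time is a placeholder rather than a measurement. It also assumes the naive approach of waiting for a full window before transcribing.

```python
SAMPLE_RATE = 16_000   # assumption: 16 kHz mono input
WINDOW_SECONDS = 30    # window length mentioned in the reply above


def split_into_windows(samples, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Split samples into fixed-length windows, zero-padding the last one."""
    window_size = sample_rate * window_seconds
    windows = []
    for start in range(0, len(samples), window_size):
        window = list(samples[start:start + window_size])
        # Pad a short final window with silence so every window has the same length.
        window.extend([0.0] * (window_size - len(window)))
        windows.append(window)
    return windows


def worst_case_delay(window_seconds, processing_seconds):
    """Delay for a word spoken at the very start of a window,
    assuming we wait for the whole window before processing it."""
    return window_seconds + processing_seconds


# 70 seconds of silence as stand-in audio.
audio = [0.0] * (70 * SAMPLE_RATE)
windows = split_into_windows(audio)
print(len(windows))                           # 3 (30 s + 30 s + 10 s padded to 30 s)
print(worst_case_delay(WINDOW_SECONDS, 2.0))  # 32.0, with a hypothetical 2 s processing time
```

Under these assumptions, a word spoken at the start of a window would not show up as text until roughly half a minute later, which is why the per-window design is a poor fit for live captions.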


Thanks
On Wed, 5 Jun 2024, 00:05 ulatekh wrote:
Whisper is not real-time, nor is it likely to ever be real-time. It works by processing audio 30 seconds at a time, and that processing, even when GPU-accelerated, can take a significant fraction of a second, or even several seconds.