Stack Overflow
0 votes
0 answers
149 views

I'm working on a real-time speech processing pipeline using pyannote-audio, and I’m using the pyannote/speaker-diarization-3.1 pipeline with Hugging Face token authentication. My code captures live ...
2 votes
1 answer
267 views

Specifically, I am trying to build an application that runs an HTML/JavaScript file that can recognize speech input from a microphone, transcribe it, and assign it to a speaker, continuously ...
0 votes
0 answers
19 views

I’m using Azure’s Speaker Recognition API for speaker identification in my Python script, but I’m encountering a 404 error with the message "Resource not found". This error occurs when I try to identify ...
0 votes
0 answers
262 views

How can I use the data from speech_recognition's listen() function as an embedding to compare with previously recorded .wav files of different speakers talking so that I can print (speaker): (...
0 votes
0 answers
395 views

I am trying to fine-tune a pretrained pyannote model for the VAD task. I can fine-tune it for the segmentation task, where everything goes well and the model's results improve. Here is the code I use to fine-tune it: ...
0 votes
1 answer
200 views

How do I cache the NVIDIA NeMo model diar_msdd_telephonic.nemo so that it is pre-downloaded and can be referenced in my config file? I am using msdd_model = NeuralDiarizer(cfg=create_config(temp_path)).to("...
1 vote
0 answers
164 views

pyannote/segmentation-3.0 suggests using pyannote/speaker-diarization-3.0, since it has a better embedding model for diarization. I am trying to use this in client-side JS. It seems like I am supposed ...
1 vote
2 answers
5k views

I am trying to use pyannote models offline. I was loading and applying models like this: from pyannote.audio import Pipeline access_token = 'xxxxxxxxxxx' model = Pipeline.from_pretrained( "...
0 votes
1 answer
165 views

I have a set of audio samples, each matched to a specific speaker, like nick_sample1.mp3 nick_sample2.mp3 ... nick_sampleN.mp3 john_sample1.mp3 john_sample2.mp3 ... john_sampleK.mp3 The task is to match a ...
2 votes
1 answer
348 views

I have a task where, given an audio file, I have to perform speaker diarization and then transcribe the audio accordingly. For speaker diarization I am using pyannote, ...
2 votes
2 answers
9k views

I was trying to use whisperx to do speaker diarization. I did it successfully on Google Colab, but I'm encountering this error while trying to transcribe the audio file. Traceback (most recent call last)...
1 vote
0 answers
562 views

I'm using this script to diarize and then transcribe speech using pyannote.audio and whisper. Using pyannote 2.1, it works perfectly, but when I change the version to the latest (3.1), I get ...
1 vote
1 answer
195 views

The Azure Speech private preview for diarization used to set an "unknown" speaker tag until it recognized a long, 7-second statement from a speaker; with the API in public preview, it started tagging ...
1 vote
0 answers
300 views

This question is about speaker diarization. I'm trying to make a script that splits an mp4 file into different segments by speaker. (The input mp4 file contains the dialogue of ...
5 votes
1 answer
6k views

I am looking for an offline, locally saved model for speaker diarization with Hugging Face that works without authentication. I have searched Google and found no relevant links. Is there any link/...
