Newest 'speech' Questions


955 questions

0 votes

0 answers

54 views

PyInstaller executable throws "OSError: [WinError 50] The request is not supported" when using speech_recognition (FLAC error)

I’m building a Python voice assistant using the speech_recognition library. Everything works perfectly when I run the code from PyCharm or the terminal, but when I convert it to an .exe using Auto Py ...

Konstantin_Violinov

asked Oct 25 at 15:36

0 votes

1 answer

53 views

Android: Google Recognizer Intent: EXTRA_PREFER_OFFLINE and API 33+

Consider this Kotlin code to init a Google speech recognizer: recognizerIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH) .apply { putExtra( ...

Yanay Lehavi

asked Sep 15 at 0:25

0 votes

1 answer

45 views

save audio file in iOS 18 instead of iOS 12

I'm able to save text-to-speech output to an audio file using the following code on iOS 12 (iPhone 8) to create a CAF file: audioFile = try AVAudioFile( forWriting: saveToURL, settings: pcmBuffer.format.settings, ...

daniel

1,046

asked Mar 31 at 5:50

1 vote

1 answer

81 views

Google Speech returning all words ever spoken, instead of just the words from the transcript

Using Google Speech in Python, I'm able to get a transcript for each phrase spoken using result.alternatives[0].transcript, but when I try to look at the words for the phrase, result.alternatives[0]....

JackKalish

1,609

asked Mar 4 at 0:03

0 votes

0 answers

42 views

How does PESQ handle time alignment for trimmed degraded audio?

I am working on speech quality assessment and using PESQ (Perceptual Evaluation of Speech Quality) to calculate MOS scores for different audio samples. I tested PESQ by providing a reference and a ...

mohammadjavad taheri

asked Feb 24 at 12:58

0 votes

0 answers

53 views

Google Cloud Speech To Text Streaming from audio input in Python

I am having trouble with streaming speech-to-text using the Google Speech-to-Text API. It works well when transcribing English and returns the final_transcript correctly. The problem is the other languages ...

David Wayne

asked Feb 10 at 13:00

0 votes

0 answers

20 views

Cannot recognize speech from an audio file with the Speech framework

I've been using PyObjC to recognize text from an audio file with the Speech framework. I checked the documentation and wrote this small script, but it returns an error. What am I doing wrong? import ...

Алексндр Босов

asked Feb 1 at 18:44

1 vote

0 answers

65 views

Speech Recognition Profile Training using SAPI 5.4

I am developing a C++ DLL that is used by a C# app via interop. The DLL's purpose is to train the "Default" speech recognition profile. The C# app only sends Training Text and receives ...

Slip

asked Oct 6, 2024 at 21:25

2 votes

2 answers

413 views

SFSpeechRecognitionResult discards previous transcripts when making long pauses

I've encountered the same problem that is described in this thread. Since iOS 18, when I use SFSpeechAudioBufferRecognitionRequest, the returned non-final SFSpeechRecognitionResult discards previously ...

Robert Dresler

11.2k

asked Sep 20, 2024 at 6:06

0 votes

0 answers

166 views

Get the lyrics of a song with Google Speech-to-Text

In my Node.js server I am using Google's Speech-to-Text API to get the lyrics of a song, but it doesn't seem to work well with music. I lose most of the words, so my question is: does this API work with ...

Armen Sanoyan

2,072

asked Sep 19, 2024 at 12:37

1 vote

0 answers

97 views

How to Integrate Google Cloud Speech-to-Text with Pusher in a Laravel Application?

I'm working on a Laravel 11 application where I need to stream audio from the frontend to Google Cloud Speech-to-Text and then broadcast the transcriptions using Pusher. Frontend Code: let ...

Peter

asked Sep 3, 2024 at 13:09

-1 votes

1 answer

99 views

Concatenating buffers from Azure Speech Service tts

I have a huge text, coming from an academic paper, that I want to transform into audio. Because the audio is too big, I split it into 4096-character chunks. Then I send it to the OpenAI TTS API chunk ...

Flavius Biras

asked Aug 26, 2024 at 16:13

3 votes

1 answer

94 views

How to detect audio retakes using Python?

I have a lot of audio recordings for lectures where I say the same thing multiple times, mostly it's incomplete statements like: "this is the part" (and then retrying) "this is the part ...

Joan Venge

334k

asked Aug 12, 2024 at 12:14

2 votes

0 answers

30 views

LDA is predicting same topics for all data

I'm using the German political speech dataset to train the LDA model. My goal here is to categorize each speech into some topics. But the problem is that the generated topics are too similar, and all ...

Ryu Ahmed

asked Jul 25, 2024 at 18:07

1 vote

0 answers

48 views

How to increase the amount of time Android speech recognition listens for?

I want to increase the amount of time Android speech recognition listens for. I tried these 3 tags, but they are not working. TAG 1: EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS TAG 2: ...

Akshay Vadchhak

asked Jun 21, 2024 at 7:32
