Feature Request: Add SenseVoice/FunASR as STT option #40

Hi! Verbi's modular architecture for experimenting with different STT/LLM/TTS components is excellent.

I'd like to suggest adding SenseVoice as a new STT option. It fits Verbi's modular philosophy well:

Why SenseVoice:

Integration example:

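from funasr import AutoModel

# Load the SenseVoiceSmall checkpoint by its model ID.
model = AutoModel(model="iic/SenseVoiceSmall")
# generate() returns a list with one result dict per input.
result = model.generate(input="audio.wav")
text = result[0]["text"]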
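
To make the proposal a bit more concrete, here is a minimal sketch of how the snippet above might be wrapped as a single transcription function. The function name `transcribe_with_sensevoice`, its parameters, and the tag-stripping step are all hypothetical; in particular, the assumption that the raw text can carry markers such as `<|en|>` should be checked against real model output, and none of this reflects Verbi's actual interface.

```python
import re

# Assumption: raw SenseVoice text may start with markers such as <|en|> or
# <|NEUTRAL|>; this pattern removes anything of the form <|...|>.
_TAG_PATTERN = re.compile(r"<\|[^|]*\|>")


def transcribe_with_sensevoice(audio_file_path, model=None):
    """Return plain text for one audio file (hypothetical helper, not Verbi's API)."""
    if model is None:
        # Imported lazily so the module still loads when funasr is not installed.
        from funasr import AutoModel

        model = AutoModel(model="iic/SenseVoiceSmall")

    # generate() returns a list of result dicts; take the text of the first one.
    result = model.generate(input=audio_file_path)
    raw_text = result[0]["text"]
    return _TAG_PATTERN.sub("", raw_text).strip()
```

Passing a preloaded `model` avoids reloading the checkpoint on every call, which is probably what a real integration would want.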

Or via an OpenAI-compatible API:

funasr-server --device cuda
# POST http://localhost:8000/v1/audio/transcriptions
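
A client for that endpoint could look roughly like the sketch below. It assumes the server accepts the same multipart form fields as the OpenAI transcription API (`file` and `model`) and returns JSON with a `text` key; the helper names are hypothetical, and only the URL comes from the line above.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Endpoint taken from the issue text; adjust host and port to your server.
ENDPOINT = "http://localhost:8000/v1/audio/transcriptions"


def build_transcription_request(audio_bytes, filename,
                                model="iic/SenseVoiceSmall", url=ENDPOINT):
    """Build a multipart/form-data POST shaped like an OpenAI transcription call."""
    boundary = uuid.uuid4().hex
    parts = [
        # "model" form field (assumed to be accepted by the server).
        f'--{boundary}\r\nContent-Disposition: form-data; name="model"'
        f'\r\n\r\n{model}\r\n'.encode(),
        # "file" form field carrying the raw audio bytes.
        (f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\n'
         'Content-Type: application/octet-stream\r\n\r\n').encode()
        + audio_bytes + b"\r\n",
        # Closing boundary.
        f"--{boundary}--\r\n".encode(),
    ]
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=b"".join(parts),
                                  headers=headers, method="POST")


def transcribe_via_server(audio_path, url=ENDPOINT):
    """Send one audio file to the server and return the transcribed text."""
    path = Path(audio_path)
    request = build_transcription_request(path.read_bytes(), path.name, url=url)
    with urllib.request.urlopen(request) as response:
        # Assumes an OpenAI-style JSON body: {"text": "..."}.
        return json.loads(response.read())["text"]
```

With the server running, `transcribe_via_server("audio.wav")` would return the transcript string, provided the response shape matches the assumption above.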

This would be a great addition to the existing Deepgram/AssemblyAI/Groq STT options.
