streaming http server #3306

glaszig started this conversation in General

hi folks. for the past week i've been dabbling with c++ for the first time since school because i wanted to make the example server capable of streaming transcribed segments instead of having to wait for the entire transcription. i wanted to leverage the new_segment_callback and got quite far, but i can't figure out some pointer and/or memory-management-related stuff.

i wrote a transcriber class which runs in a detached thread. it gets passed in via new_segment_callback_user_data so that the callback can send each new segment back to the transcriber. the transcriber adds the segment to its std::queue, then takes it off the queue and writes it to httplib's DataSink instance, which you get by using httplib::Response::set_chunked_content_provider().
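to make the queue part concrete, here is a minimal sketch of the producer/consumer hand-off described above. it is not the code from my branch: SegmentQueue and stream_segments are hypothetical names, a plain worker thread stands in for the whisper callback, and a std::vector stands in for httplib's DataSink. it needs c++17 for std::optional.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread-safe queue of transcribed segments.
class SegmentQueue {
public:
    // Producer side: would be called from the new-segment callback.
    void push(std::string segment) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(segment));
        }
        ready_.notify_one();
    }

    // Producer side: signals that no more segments will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Consumer side: blocks until a segment is available.
    // Returns std::nullopt once the queue is closed and drained.
    std::optional<std::string> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return std::nullopt;
        std::string segment = std::move(queue_.front());
        queue_.pop();
        return segment;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
    bool closed_ = false;
};

// Stand-in for the pipeline: the worker thread plays the transcription,
// the calling thread plays the chunked content provider.
std::vector<std::string> stream_segments(const std::vector<std::string>& segments) {
    SegmentQueue queue;
    std::thread producer([&queue, &segments] {
        for (const auto& s : segments) queue.push(s);
        queue.close();
    });

    std::vector<std::string> written;
    while (auto segment = queue.pop()) {
        written.push_back(*segment);  // a real server would write to the sink here
    }
    producer.join();
    return written;
}
```

the explicit close() matters: without it the consumer has no way to tell "no segment yet" apart from "transcription finished", and it would block forever on the last pop().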

it works for the first segment but crashes during the second:

# build
cmake --build build --target whisper-server
# run
lldb build/bin/whisper-server -- --model ggml-medium-q5_0.bin -l auto -pr -pc
# upload mp3
curl 127.0.0.1:8080/inference -F file="@call.mp3" -F response_format=json -v
[00:00:00.000 --> 00:00:15.000] [Ringtone]

crash:

whisper server listening at http://127.0.0.1:8080
Received request: call.mp3
Successfully loaded call.mp3
system_info: n_threads = 4 / 8 | WHISPER : COREML = 0 | OPENVINO = 0 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | ACCELERATE = 1 | REPACK = 1 |
operator(): processing 'call.mp3' (2476800 samples, 154.8 sec), 4 threads, 1 processors, lang = auto, task = transcribe, timestamps = 1 ...
Running whisper.cpp inference on call.mp3
set_chunked_content_provider
Transcriber::stream
Transcriber::getNextData
whisper_full_with_state: auto-detected language: de (p = 0.989747)
Transcriber::handleSegment
stream_new_segment 1
stream_new_segment 2
stream: [00:00:00.000 --> 00:00:15.000] [Ringtone]
Transcriber::publishSegment
Transcriber::handleSegment
stream_new_segment 1
stream_new_segment 2
stream: [00:00:15.000 --> 00:00:17.000] Hello?
Transcriber::publishSegment
Process 43124 stopped
* thread #13, stop reason = EXC_BAD_ACCESS (code=1, address=0x2e6a0001aee12bcc)
 frame #0: 0x00000001aeeba7cc libsystem_pthread.dylib`pthread_mutex_lock + 12
libsystem_pthread.dylib`pthread_mutex_lock:
-> 0x1aeeba7cc <+12>: ldr x8, [x0]
 0x1aeeba7d0 <+16>: mov w9, #0x545a
 0x1aeeba7d4 <+20>: movk w9, #0x4d55, lsl #16
 0x1aeeba7d8 <+24>: cmp x8, x9
Target 0: (whisper-server) stopped.
(lldb)

i can't figure out what's wrong with the mutex. must be some threading/scoping issue. maybe somebody with actual experience wants to help me out...
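one possibility, not verified against the branch linked below: the address in the crash looks like garbage, which would fit a mutex living inside an object that was already destroyed when the thread tried to lock it. below is a minimal sketch of keeping such shared state alive with std::shared_ptr. StreamState, on_segment and run_request are hypothetical names, and on_segment is a simplified stand-in with a void* user_data parameter, not whisper.cpp's actual callback signature.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical state shared between the request handler and the worker.
struct StreamState {
    std::mutex mutex;
    std::vector<std::string> segments;
};

// Simplified C-style callback: receives the state through void* user_data.
void on_segment(const char* text, void* user_data) {
    auto* state = static_cast<StreamState*>(user_data);
    std::lock_guard<std::mutex> lock(state->mutex);  // crashes if state is dangling
    state->segments.emplace_back(text);
}

std::size_t run_request() {
    std::size_t count = 0;
    std::thread worker;
    {
        auto state = std::make_shared<StreamState>();
        // The lambda captures the shared_ptr by value, so the worker owns a
        // reference of its own and the state outlives this inner scope.
        worker = std::thread([state, &count] {
            on_segment("[Ringtone]", state.get());
            on_segment("Hello?", state.get());
            std::lock_guard<std::mutex> lock(state->mutex);
            count = state->segments.size();
        });
    }  // the handler's reference is gone here; the worker's copy remains
    worker.join();
    return count;
}
```

the same by-value capture would apply to a detached thread. if the state were a local object captured by reference or passed as a raw pointer instead, the second callback could easily run after the enclosing scope had ended.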

code: https://github.com/glaszig/whisper.cpp/tree/server-streaming
diff: https://github.com/ggml-org/whisper.cpp/compare/master...glaszig:server-streaming?expand=1

Replies: 0 comments
