hi folks. for the past week i've been dabbling with c++ for the first time since school, because i wanted to make the server in the examples capable of streaming transcribed segments instead of having to wait for the entire transcription. i wanted to leverage the new_segment_callback and got quite far, but i can't figure out some pointer and/or memory-management-related stuff.
i wrote a transcriber class which launches in a detached thread. it gets passed as the new_segment_callback_user_data so the callback can send each segment back to the transcriber, which adds it to its std::queue, then takes it off the queue and writes it to httplib's DataSink instance, which you get by using httplib::Response::set_chunked_content_provider().
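the hand-off described above (callback pushes a segment, another thread pops it and writes it out) can be sketched with plain standard-library pieces. this is a minimal sketch, not the code from the branch: SegmentQueue, push, pop, close and drain_demo are hypothetical names, it assumes C++17 for std::optional, and it leaves out whisper.cpp and httplib entirely.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// hypothetical helper, not part of whisper.cpp or httplib
class SegmentQueue {
public:
    // producer side: would be called from the new-segment callback
    void push(std::string text) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(text));
        }
        cv_.notify_one();
    }

    // producer side: signal that no more segments will arrive
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // consumer side: blocks until a segment is available;
    // returns std::nullopt once the queue is closed and drained
    std::optional<std::string> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return std::nullopt;
        std::string text = std::move(queue_.front());
        queue_.pop();
        return text;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool closed_ = false;
};

// usage: the producer thread stands in for the whisper callback,
// the calling thread stands in for the chunked content provider
std::vector<std::string> drain_demo() {
    SegmentQueue q;
    std::thread producer([&q] {
        q.push("[Ringtone]");
        q.push("Hello?");
        q.close();
    });
    std::vector<std::string> out;
    while (auto seg = q.pop()) out.push_back(*seg);
    producer.join();
    return out;
}
```

in this sketch the queue outlives both threads because drain_demo joins the producer before returning; with a detached thread that guarantee has to come from somewhere else.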
it works for the first segment but crashes during the second:
# build
cmake --build build --target whisper-server
# run
lldb build/bin/whisper-server -- --model ggml-medium-q5_0.bin -l auto -pr -pc
# upload mp3
curl 127.0.0.1:8080/inference -F file="@call.mp3" -F response_format=json -v
[00:00:00.000 --> 00:00:15.000] [Ringtone]
crash:
whisper server listening at http://127.0.0.1:8080
Received request: call.mp3
Successfully loaded call.mp3
system_info: n_threads = 4 / 8 | WHISPER : COREML = 0 | OPENVINO = 0 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | DOTPROD = 1 | ACCELERATE = 1 | REPACK = 1 |
operator(): processing 'call.mp3' (2476800 samples, 154.8 sec), 4 threads, 1 processors, lang = auto, task = transcribe, timestamps = 1 ...
Running whisper.cpp inference on call.mp3
set_chunked_content_provider
Transcriber::stream
Transcriber::getNextData
whisper_full_with_state: auto-detected language: de (p = 0.989747)
Transcriber::handleSegment
stream_new_segment 1
stream_new_segment 2
stream: [00:00:00.000 --> 00:00:15.000] [Ringtone]
Transcriber::publishSegment
Transcriber::handleSegment
stream_new_segment 1
stream_new_segment 2
stream: [00:00:15.000 --> 00:00:17.000] Hello?
Transcriber::publishSegment
Process 43124 stopped
* thread #13, stop reason = EXC_BAD_ACCESS (code=1, address=0x2e6a0001aee12bcc)
frame #0: 0x00000001aeeba7cc libsystem_pthread.dylib`pthread_mutex_lock + 12
libsystem_pthread.dylib`pthread_mutex_lock:
-> 0x1aeeba7cc <+12>: ldr x8, [x0]
0x1aeeba7d0 <+16>: mov w9, #0x545a
0x1aeeba7d4 <+20>: movk w9, #0x4d55, lsl #16
0x1aeeba7d8 <+24>: cmp x8, x9
Target 0: (whisper-server) stopped.
(lldb)
i can't figure out what's wrong with the mutex. must be some threading/scoping issue. maybe somebody with actual experience wants to help me out...
code: https://github.com/glaszig/whisper.cpp/tree/server-streaming
diff: https://github.com/ggml-org/whisper.cpp/compare/master...glaszig:server-streaming?expand=1
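a fault on the first load inside pthread_mutex_lock, at an address like 0x2e6a0001aee12bcc, suggests the mutex pointer itself does not point at valid memory. one common cause is that the object owning the mutex has already been destroyed while a detached thread or callback still uses it; whether that applies to the linked branch is not verified here. a minimal sketch of one way to rule it out is to give every thread its own std::shared_ptr copy of the owner. Worker, publish and start_detached are hypothetical names, not taken from the branch.

```cpp
#include <future>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// hypothetical stand-in for the transcriber; not the code from the branch
struct Worker {
    std::mutex mutex;
    std::string last;

    void publish(const std::string & text) {
        std::lock_guard<std::mutex> lock(mutex);
        last = text;
    }
};

// starts a detached thread that keeps the worker (and its mutex) alive
// through its own shared_ptr copy, even after this function has returned
std::future<std::string> start_detached() {
    auto worker = std::make_shared<Worker>();
    std::promise<std::string> done;
    std::future<std::string> result = done.get_future();

    std::thread([worker, done = std::move(done)]() mutable {
        worker->publish("Hello?");      // locks a mutex that is guaranteed to exist
        done.set_value(worker->last);   // report back to the caller
    }).detach();

    return result; // the local shared_ptr goes away here, the thread's copy remains
}
```

calling start_detached().get() waits for the detached thread's result. the same idea would apply to anything captured by the content provider lambda or passed as callback user data: a raw pointer or a reference to a stack object does not extend its lifetime, a shared_ptr copy does.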