@remotion/whisper-web

warning

Unstable API: This package is currently experimental. As we test it, we may change the API, and we may switch to a WebGPU-based backend in the future.

Similar to @remotion/install-whisper-cpp but for the browser. Allows you to transcribe audio locally in the browser, with the help of WASM.

Installation

npm i --save-exact @remotion/whisper-web@4.0.398

pnpm i @remotion/whisper-web@4.0.398

bun i @remotion/whisper-web@4.0.398

yarn --exact add @remotion/whisper-web@4.0.398

This assumes you are currently using v4.0.398 of Remotion.
Also update remotion and all `@remotion/*` packages to the same version.
Remove any ^ character in front of the version numbers, as it can lead to a version conflict.

Required configuration

@remotion/whisper-web uses a WebAssembly (WASM) backend that requires SharedArrayBuffer support. To enable this functionality, you need to configure cross-origin isolation headers in your application. This is a security requirement for using SharedArrayBuffer in modern browsers.

See: Important Considerations
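Before loading the package, it can help to confirm that the page is actually cross-origin isolated. The following is a minimal sketch, not part of @remotion/whisper-web: `checkIsolation` and its types are hypothetical names, and the helper only inspects the standard `crossOriginIsolated` and `SharedArrayBuffer` globals. The package's own `canUseWhisperWeb()` remains the authoritative check.

```typescript
// Hypothetical helper, not part of @remotion/whisper-web.
// It only reads two standard browser globals.
type IsolationEnv = {
  crossOriginIsolated?: boolean;
  SharedArrayBuffer?: unknown;
};

type IsolationResult = {ok: boolean; reason: string};

function checkIsolation(env: IsolationEnv): IsolationResult {
  if (typeof env.SharedArrayBuffer === 'undefined') {
    return {ok: false, reason: 'SharedArrayBuffer is not available'};
  }
  if (env.crossOriginIsolated !== true) {
    return {ok: false, reason: 'Page is not cross-origin isolated; check the COOP/COEP headers'};
  }
  return {ok: true, reason: 'Cross-origin isolated'};
}

// In a browser you would pass the global object:
// const result = checkIsolation(globalThis);
```

Taking the environment as a parameter keeps the helper easy to exercise outside a browser; in the page itself, `globalThis` is the natural argument.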

Vite

To use @remotion/whisper-web with Vite, you need to make two important changes:

Exclude the package from Vite's dependency optimization to prevent known issues
Configure the required security headers for SharedArrayBuffer support

vite.config.ts
tsx
import {defineConfig} from 'vite';

export default defineConfig({
  optimizeDeps: {
    // turn off dependency optimization: https://github.com/vitejs/vite/issues/11672#issuecomment-1397855641
    exclude: ['@remotion/whisper-web'],
  },
  // required by SharedArrayBuffer
  server: {
    headers: {
      'Cross-Origin-Embedder-Policy': 'require-corp',
      'Cross-Origin-Opener-Policy': 'same-origin',
    },
  },
  // ...
});
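The `server.headers` option in this config applies to Vite's dev server. If the app is served by a different server, that server has to send the same two headers. As a rough sketch, assuming a Node.js-style response object with a `setHeader` method (`applyIsolationHeaders`, `HeaderSink` and `ISOLATION_HEADERS` are hypothetical names, not part of Vite or Remotion):

```typescript
// The two header values are the ones used in the Vite config on this page.
const ISOLATION_HEADERS: Record<string, string> = {
  'Cross-Origin-Embedder-Policy': 'require-corp',
  'Cross-Origin-Opener-Policy': 'same-origin',
};

// Minimal structural type so the helper works with Node's http.ServerResponse
// or any other object exposing a compatible setHeader().
type HeaderSink = {setHeader(name: string, value: string): unknown};

// Hypothetical helper: call it before sending each response.
function applyIsolationHeaders(res: HeaderSink): void {
  for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
    res.setHeader(name, value);
  }
}
```

Hosting platforms that do not run custom server code usually have their own way of declaring response headers; the header names and values stay the same.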

Example usage

Transcribing with @remotion/whisper-web
tsx
import {transcribe, canUseWhisperWeb, resampleTo16Khz, downloadWhisperModel} from '@remotion/whisper-web';

const file = new File([], 'audio.wav');
const modelToUse = 'tiny.en';

const {supported, detailedReason} = await canUseWhisperWeb(modelToUse);
if (!supported) {
  throw new Error(`Whisper Web is not supported in this environment: ${detailedReason}`);
}

console.log('Downloading model...');
await downloadWhisperModel({
  model: modelToUse,
  onProgress: ({progress}) => console.log(`Downloading model (${Math.round(progress * 100)}%)...`),
});

console.log('Resampling audio...');
const channelWaveform = await resampleTo16Khz({
  file,
  onProgress: (p) => console.log(`Resampling audio (${Math.round(p * 100)}%)...`),
});

console.log('Transcribing...');
const {transcription} = await transcribe({
  channelWaveform,
  model: modelToUse,
  onProgress: (p) => console.log(`Transcribing (${Math.round(p * 100)}%)...`),
});

console.log(transcription.map((t) => t.text).join(' '));
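
The three `onProgress` callbacks in this example format their progress value the same way. As an optional tidy-up, a small helper could build those messages. This is only a sketch: `formatProgress` is a hypothetical name, and the assumption that progress is a fraction between 0 and 1 is inferred from the multiplication by 100 in the example.

```typescript
// Hypothetical helper, not part of @remotion/whisper-web.
// Assumes `fraction` is a progress value between 0 and 1.
function formatProgress(label: string, fraction: number): string {
  // Clamp so slightly out-of-range values never print e.g. "101%".
  const clamped = Math.min(1, Math.max(0, fraction));
  return `${label} (${Math.round(clamped * 100)}%)...`;
}

// Usage with the callbacks from the example:
// onProgress: (p) => console.log(formatProgress('Transcribing', p)),
```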