@remotion/install-whisper-cpp
Available from v4.0.115
With Whisper.cpp, you can transcribe audio locally on your machine.
This package provides easy-to-use, cross-platform functions to install Whisper.cpp and a model.
- npm: `npm i --save-exact @remotion/install-whisper-cpp@4.0.398`
- pnpm: `pnpm i @remotion/install-whisper-cpp@4.0.398`
- bun: `bun i @remotion/install-whisper-cpp@4.0.398`
- yarn: `yarn --exact add @remotion/install-whisper-cpp@4.0.398`
Also update `remotion` and all `@remotion/*` packages to the same version. Remove any `^` characters in front of their version numbers, as they can lead to a version conflict.
Example usage
Install Whisper.cpp 1.5.5 (the latest version at the time of writing that we find works well and supports token-level timestamps) and the `medium.en` model into the `whisper.cpp` folder.
```tsx title="install-whisper.cpp"
import path from 'path';
import {
  downloadWhisperModel,
  installWhisperCpp,
  transcribe,
  convertToCaptions,
} from '@remotion/install-whisper-cpp';

const to = path.join(process.cwd(), 'whisper.cpp');

await installWhisperCpp({
  to,
  version: '1.5.5',
});

await downloadWhisperModel({
  model: 'medium.en',
  folder: to,
});

// Convert the audio to a 16 kHz wav file first if needed:
// import {execSync} from 'child_process';
// execSync('ffmpeg -i /path/to/audio.mp4 -ar 16000 /path/to/audio.wav -y');

const {transcription} = await transcribe({
  model: 'medium.en',
  whisperPath: to,
  whisperCppVersion: '1.5.5',
  inputPath: '/path/to/audio.wav',
  tokenLevelTimestamps: true,
});

for (const token of transcription) {
  console.log(token.timestamps.from, token.timestamps.to, token.text);
}

// Optional: Apply our recommended postprocessing
const {captions} = convertToCaptions({
  transcription,
  combineTokensWithinMilliseconds: 200,
});

for (const line of captions) {
  console.log(line.text, line.startInSeconds);
}
```
Functions
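As a follow-up to the example usage above, the captions could be printed with a readable timestamp instead of raw seconds. The sketch below is only an illustration: it assumes nothing about a caption beyond the two fields the example reads (`text` and `startInSeconds`), and `CaptionLike`, `formatTimestamp` and `formatCaptions` are hypothetical names that are not part of this package.

```typescript
// Hypothetical shape: mirrors only the two fields read in the example usage.
type CaptionLike = {
  text: string;
  startInSeconds: number;
};

// Left-pad a non-negative integer with zeros up to the given width.
const pad = (n: number, width: number): string => {
  let s = String(n);
  while (s.length < width) {
    s = '0' + s;
  }
  return s;
};

// Format a time in seconds as mm:ss.mmm (rounded to the nearest millisecond).
const formatTimestamp = (seconds: number): string => {
  const totalMs = Math.round(seconds * 1000);
  const minutes = Math.floor(totalMs / 60000);
  const secs = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return pad(minutes, 2) + ':' + pad(secs, 2) + '.' + pad(ms, 3);
};

// Turn each caption into a "[mm:ss.mmm] text" line, trimming stray whitespace.
const formatCaptions = (captions: CaptionLike[]): string[] =>
  captions.map((c) => '[' + formatTimestamp(c.startInSeconds) + '] ' + c.text.trim());
```

With the `captions` array from the example, `formatCaptions(captions).forEach((l) => console.log(l))` would print one timestamped line per caption.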
License
MIT
See also