WhisperCLI is a command-line tool for transcribing audio from files or microphone input using OpenAI's Whisper speech recognition models via the Whisper.net library.
- Transcribe audio and video files to subtitles or text
- Generate subtitles for all media files in a folder
- Record and transcribe audio directly from microphone
- Support for various audio and video formats (mp3, mp4, mkv, avi, etc.)
- Automatic downloading of Whisper models
- Automatic downloading of FFmpeg
- Support for different Whisper model sizes (default: LargeV3Turbo)
- Cross-platform functionality (Windows and Unix)
- Progress reporting during conversion and transcription
- .NET 9.0
Download the latest release from the releases page and extract it to your preferred location.
- Clone the repository
- Navigate to the Sources directory
- Build the project:
dotnet build
- Publish the project (optional):
dotnet publish -c Release
WhisperCLI [options] [inputFilePath]
- -m, --model: Model to use for transcription (default: LargeV3Turbo)
- -i, --microphone-index: Index of microphone to use for recording (default: 0)
- -s, --stop-key: Key to stop recording when using microphone input (default: Spacebar)
- -f, --format: Output format for file/folder transcription: srt, vtt, or txt (default: srt; microphone always writes txt)
- --folder: Process all media files in the specified folder
- -r, --recursive: Include subfolders when using --folder
- inputFilePath: Path to the audio/video file or folder to transcribe (if omitted without --folder, uses microphone input)
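For readers choosing between the srt and vtt formats: the two subtitle formats differ, among other things, in how cue timestamps are written. SubRip (srt) separates milliseconds with a comma, while WebVTT (vtt) uses a period. The sketch below only illustrates that difference; `format_timestamp` is a hypothetical helper and is not part of WhisperCLI.

```shell
# Hypothetical helper, for illustration only (not part of WhisperCLI).
# Formats a millisecond offset as HH:MM:SS<sep>mmm.
# $1 = offset in milliseconds, $2 = separator ("," for srt, "." for vtt)
format_timestamp() {
  printf '%02d:%02d:%02d%s%03d\n' \
    $(( $1 / 3600000 )) \
    $(( $1 / 60000 % 60 )) \
    $(( $1 / 1000 % 60 )) \
    "$2" \
    $(( $1 % 1000 ))
}

# Example calls:
#   format_timestamp 3661500 ,    # srt style
#   format_timestamp 3661500 .    # vtt style
```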
# Transcribe an audio file
WhisperCLI input.mp3
# Transcribe an audio file as plain text
WhisperCLI -f txt input.mp3
# Generate subtitles for all media files in the current folder
WhisperCLI --folder .
# Generate subtitles for all media files in a folder and its subfolders
WhisperCLI --folder "D:\Media" -r
# Transcribe a video file with a specific model
WhisperCLI -m Small video.mp4
# Record from microphone and transcribe
WhisperCLI
# Use a specific microphone (device index 2)
WhisperCLI -i 2
# Use a different key to stop recording (Enter key)
WhisperCLI -s Enter
- TinyEn
- Tiny
- BaseEn
- Base
- SmallEn
- Small
- MediumEn
- Medium
- LargeV1
- LargeV2
- LargeV3
- LargeV3Turbo (default)
- The program downloads the specified Whisper model if not already present (stored in your temp directory)
- FFmpeg is downloaded automatically if not already present
- The input audio/video file is converted to the proper WAV format using FFmpeg
- The audio is processed using the Whisper model
- The transcription is saved in the selected output format in the same location as the input file
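The conversion step above can be approximated by hand with the FFmpeg command line. This is a minimal sketch, assuming the target is a 16 kHz, mono, 16-bit PCM WAV file (the input that Whisper-based libraries generally expect). The exact arguments WhisperCLI passes through Xabe.FFmpeg are not documented here, and `to_whisper_wav` is a hypothetical helper name.

```shell
# Hypothetical helper: convert any FFmpeg-readable file to a WAV file
# suitable for Whisper. Requires ffmpeg on PATH.
# Assumption: 16 kHz, mono, 16-bit PCM; WhisperCLI's real arguments may differ.
# $1 = input media file, $2 = output WAV path
to_whisper_wav() {
  # -y overwrites the output, -vn drops any video stream,
  # -ar sets the sample rate, -ac the channel count,
  # -c:a pcm_s16le selects signed 16-bit little-endian PCM.
  ffmpeg -y -i "$1" -vn -ar 16000 -ac 1 -c:a pcm_s16le "$2"
}

# Example call:
#   to_whisper_wav video.mp4 video.wav
```

Running the conversion manually can help when diagnosing a file that fails to transcribe, since it shows whether FFmpeg can read the input at all.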
- Pass --folder <path> to process every FFmpeg-detectable media file in that folder
- Add -r or --recursive to include subfolders
- Each output file is saved next to its source media file
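To preview which files a folder run might pick up, the traversal can be sketched with `find`. This is only an approximation: WhisperCLI detects media through FFmpeg, whereas the hypothetical `list_media_files` helper below assumes a fixed extension list (mp3, mp4, mkv, avi) as a stand-in.

```shell
# Hypothetical helper: list candidate media files in a folder.
# Assumption: an extension filter stands in for FFmpeg-based detection,
# so the result may differ from what WhisperCLI actually processes.
# $1 = folder, $2 = "recursive" to include subfolders (like -r)
list_media_files() {
  if [ "${2:-}" = "recursive" ]; then
    find "$1" -type f \
      \( -iname '*.mp3' -o -iname '*.mp4' -o -iname '*.mkv' -o -iname '*.avi' \)
  else
    # -maxdepth 1 restricts the search to the folder itself
    find "$1" -maxdepth 1 -type f \
      \( -iname '*.mp3' -o -iname '*.mp4' -o -iname '*.mkv' -o -iname '*.avi' \)
  fi
}

# Example calls:
#   list_media_files .             # current folder only
#   list_media_files . recursive   # include subfolders
```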
- The program downloads the specified Whisper model if not already present
- Audio is recorded from the selected microphone until the stop key is pressed
- The recording is saved as a WAV file in your temp directory
- The audio is processed using the Whisper model
- The transcription is saved as plain text alongside the recording
- Whisper.net
- Xabe.FFmpeg
- NAudio (for microphone recording)
- Serilog
- CommandLineParser
- CUDA runtime support (optional for GPU acceleration)
This project is licensed under the MIT License.