bvdcode/WhisperCLI

WhisperCLI

WhisperCLI is a command-line tool for transcribing audio from files or microphone input using OpenAI's Whisper speech recognition models via the Whisper.net library.

Features

  • Transcribe audio and video files to subtitles or text
  • Generate subtitles for all media files in a folder
  • Record and transcribe audio directly from microphone
  • Support for various audio and video formats (mp3, mp4, mkv, avi, etc.)
  • Automatic downloading of Whisper models
  • Automatic downloading of FFmpeg
  • Support for different Whisper model sizes (default: LargeV3Turbo)
  • Cross-platform functionality (Windows and Unix)
  • Progress reporting during conversion and transcription

Requirements

  • .NET 9.0

Installation

Using Published Release

Download the latest release from the releases page and extract it to your preferred location.

Building from Source

  1. Clone the repository
  2. Navigate to the Sources directory
  3. Build the project:
    dotnet build
    
  4. Publish the project (optional):
    dotnet publish -c Release
    

Usage

Basic Usage

WhisperCLI [options] [inputFilePath]

Command Line Options

  • -m, --model: Model to use for transcription (default: LargeV3Turbo)
  • -i, --microphone-index: Index of microphone to use for recording (default: 0)
  • -s, --stop-key: Key to stop recording when using microphone input (default: Spacebar)
  • -f, --format: Output format for file/folder transcription: srt, vtt, or txt (default: srt; microphone always writes txt)
  • --folder: Process all media files in the specified folder
  • -r, --recursive: Include subfolders when using --folder
  • inputFilePath: Path to the audio/video file or folder to transcribe (if omitted without --folder, uses microphone input)

Examples

# Transcribe an audio file
WhisperCLI input.mp3
# Transcribe an audio file as plain text
WhisperCLI -f txt input.mp3
# Generate subtitles for all media files in the current folder
WhisperCLI --folder .
# Generate subtitles for all media files in a folder and its subfolders
WhisperCLI --folder "D:\Media" -r
# Transcribe a video file with a specific model
WhisperCLI -m Small video.mp4
# Record from microphone and transcribe
WhisperCLI
# Use a specific microphone (device index 2)
WhisperCLI -i 2
# Use a different key to stop recording (Enter key)
WhisperCLI -s Enter

Available Models

  • TinyEn
  • Tiny
  • BaseEn
  • Base
  • SmallEn
  • Small
  • MediumEn
  • Medium
  • LargeV1
  • LargeV2
  • LargeV3
  • LargeV3Turbo (default)
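
When WhisperCLI is called from a script, it can help to check the model name before starting a long run. The following is a minimal sketch, not part of WhisperCLI: `is_known_model` and `WHISPER_MODELS` are hypothetical names, and the exact-case comparison is an assumption (the tool itself may be more lenient about casing).

```shell
# Model names copied from the "Available Models" list in this README.
WHISPER_MODELS="TinyEn Tiny BaseEn Base SmallEn Small MediumEn Medium LargeV1 LargeV2 LargeV3 LargeV3Turbo"

# is_known_model NAME
# Hypothetical helper: succeeds only if NAME exactly matches a listed model.
is_known_model() {
    for known in $WHISPER_MODELS; do
        if [ "$1" = "$known" ]; then
            return 0
        fi
    done
    return 1
}

# Example use in a wrapper script (WhisperCLI must be on PATH):
#   is_known_model "Small" && WhisperCLI -m Small video.mp4
```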

How It Works

For File Input

  1. The program downloads the specified Whisper model if not already present (stored in your temp directory)
  2. FFmpeg is downloaded automatically if not already present
  3. The input audio/video file is converted to the proper WAV format using FFmpeg
  4. The audio is processed using the Whisper model
  5. The transcription is saved in the selected output format in the same location as the input file
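
Steps 3 and 5 above can be sketched in shell. This is an illustration under stated assumptions, not the tool's actual implementation: `to_whisper_wav` and `transcript_path` are hypothetical helpers, the FFmpeg arguments (16 kHz, mono, 16-bit PCM, the input format Whisper models expect) are an assumption about what "proper WAV format" means here, and the output name is assumed to be the input name with its extension replaced.

```shell
# to_whisper_wav INPUT OUTPUT
# Assumed equivalent of step 3: convert any media file to 16 kHz mono 16-bit PCM WAV.
# Requires ffmpeg on PATH; WhisperCLI downloads its own copy automatically.
to_whisper_wav() {
    ffmpeg -y -i "$1" -ar 16000 -ac 1 -c:a pcm_s16le "$2"
}

# transcript_path INPUT [FORMAT]
# Assumed naming for step 5: same directory and base name as INPUT,
# with the extension replaced by FORMAT (srt by default).
transcript_path() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    case $base in
        ?*.*) base=${base%.*} ;;   # strip only the last extension, if any
    esac
    printf '%s/%s.%s\n' "$dir" "$base" "${2:-srt}"
}

# Example: show where the subtitles for a video would be expected.
transcript_path /media/talk.final.mp4 vtt
```

The path helper is useful mainly for follow-up steps in a script, such as checking whether a transcript already exists before running WhisperCLI again.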

For Folder Input

  1. Pass --folder <path> to process every FFmpeg-detectable media file in that folder
  2. Add -r or --recursive to include subfolders
  3. Each output file is saved next to its source media file
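
To preview roughly which files a folder run might pick up, a shell sketch like the one below can help. It is only an approximation: WhisperCLI detects media through FFmpeg, whereas this hypothetical `list_media` helper filters by a few file extensions taken from the Features list, so the two can disagree. It relies on the `-maxdepth` and `-iname` options available in GNU and BSD `find`.

```shell
# list_media DIR [recursive]
# Hypothetical helper: print candidate media files under DIR, sorted.
# Without "recursive" only DIR itself is scanned, mirroring --folder without -r.
list_media() {
    dir=$1
    if [ "${2:-}" = "recursive" ]; then
        depth=""
    else
        depth="-maxdepth 1"
    fi
    # $depth is left unquoted on purpose so it expands to zero or two arguments.
    find "$dir" $depth -type f \
        \( -iname '*.mp3' -o -iname '*.mp4' -o -iname '*.mkv' -o -iname '*.avi' \) | sort
}

# Example use:
#   list_media .              # roughly what "WhisperCLI --folder ." would see
#   list_media . recursive    # roughly what "WhisperCLI --folder . -r" would see
```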

For Microphone Input

  1. The program downloads the specified Whisper model if not already present
  2. Audio is recorded from the selected microphone until the stop key is pressed
  3. The recording is saved as a WAV file in your temp directory
  4. The audio is processed using the Whisper model
  5. The transcription is saved as plain text alongside the recording

Dependencies

License

This project is licensed under the MIT License.

Acknowledgements
