Set Up and Run OpenAI’s Whisper for Speech Recognition & Sentiment Analysis

This project utilizes OpenAI’s Whisper for speech recognition and applies sentiment analysis on transcribed text. Whisper is a powerful automatic speech recognition (ASR) system capable of transcribing and understanding audio input with high accuracy.

Skill Tags

Python, OpenAI Whisper, Speech Recognition, Sentiment Analysis, Natural Language Processing (NLP), Deep Learning, Audio Processing, Automation, Machine Learning, Voice-to-Text, Real-time Transcription, Command Line Tools, AI

Relevant Links

  • Visual Studio Code: https://code.visualstudio.com/
  • Python: https://www.python.org/downloads/
  • Homebrew: https://brew.sh/


Getting Started

Follow these steps to set up Whisper and perform speech recognition and sentiment analysis.

1. Install Dependencies (about 500 MB of disk space required)

Whisper requires several dependencies, including PyTorch and ffmpeg. Install them before proceeding.

For Windows (using Chocolatey)

First, install Chocolatey if you haven’t already (run the following in an administrator PowerShell):

Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))

Then, install ffmpeg using:

choco install ffmpeg

For Mac/Linux

Use Homebrew:

brew install ffmpeg

2. Clone the Repository

Download the project from GitHub:

git clone https://github.com/codeIntrovert/GDG-APL
cd ./GDG-APL/

3. Set Up Python Environment

Create a virtual environment to manage dependencies:

py -m venv env

Note: py is the Windows Python launcher. On Mac/Linux, use python3 -m venv env instead.

Activate the virtual environment:

  • Windows:
    env\Scripts\activate
  • Mac/Linux:
    source env/bin/activate

4. Install Required Packages

pip install -r requirements.txt

5. Set Up Your API Key

To securely store your API key, create a .env file in the project directory and add:

API_KEY=your_gemini_api_key_here
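For reference, here is a minimal sketch of how a .env file like this can be read using only the standard library. The repository may instead use a package such as python-dotenv; this loader is purely illustrative:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ.

    Lines that are blank, comments (#), or missing '=' are skipped.
    Existing environment variables are not overwritten.
    """
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())

load_env()
api_key = os.environ.get("API_KEY")
```

If you prefer a maintained package, python-dotenv provides the same behavior via load_dotenv().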

Usage

Speech Recognition with Whisper

Run the following command to transcribe an audio file:

py src/whisper_main.py

On Mac/Linux, use python3 src/whisper_main.py instead.
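For orientation, a typical transcription call with the openai-whisper package looks like the sketch below. The model name and audio path are placeholders, not taken from this repository, and the first load_model call downloads the model weights:

```python
# Hedged sketch of a Whisper transcription step (not the repo's actual script).
def transcribe(audio_path, model_name="base"):
    """Transcribe an audio file and return the recognized text."""
    import whisper  # requires `pip install openai-whisper` and ffmpeg on PATH

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)   # returns a dict with "text", "segments", ...
    return result["text"]
```

Usage would be something like transcribe("data/sample.wav"), where data/sample.wav is a hypothetical path.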

Project Structure

πŸ“‚ whisper-project
β”œβ”€β”€ πŸ“‚ env/ # Virtual environment
β”œβ”€β”€ πŸ“‚ models/ # Whisper models (optional)
β”œβ”€β”€ πŸ“‚ data/ # Audio files & transcripts
β”œβ”€β”€ πŸ“œ whisper_transcribe.py # Speech-to-text script
β”œβ”€β”€ πŸ“œ sentiment_analysis.py # Sentiment analysis script
β”œβ”€β”€ πŸ“œ requirements.txt # Dependencies
β”œβ”€β”€ πŸ“œ README.md # Documentation
└── πŸ“œ .env # API keys (if needed)

Troubleshooting

Common Issues and Fixes

❌ ModuleNotFoundError: No module named 'whisper'

βœ… Ensure you have installed Whisper:

pip install openai-whisper

❌ ffmpeg not found

βœ… Ensure ffmpeg is installed and available in your system’s PATH.
Try running:

ffmpeg -version

If not found, reinstall using Chocolatey (Windows) or Homebrew (Mac/Linux).

❌ API Errors with Sentiment Analysis

βœ… Ensure you are using a valid API key (e.g., for the Gemini-based sentiment analysis) and that it is stored securely in .env.
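As an illustration only (the repository's actual sentiment code may differ), a Gemini-based sentiment call could look like this, assuming the google-generativeai package and the API_KEY from .env; the model name and prompt are hypothetical:

```python
# Hedged sketch of a Gemini sentiment-analysis step (illustrative, not the repo's code).
import os

def analyze_sentiment(text):
    """Ask a Gemini model to classify the sentiment of transcribed text."""
    import google.generativeai as genai  # requires `pip install google-generativeai`

    genai.configure(api_key=os.environ["API_KEY"])  # key loaded from .env
    model = genai.GenerativeModel("gemini-1.5-flash")  # placeholder model name
    prompt = (
        "Classify the sentiment of the following text as "
        f"positive, negative, or neutral:\n{text}"
    )
    return model.generate_content(prompt).text
```

A missing or invalid API_KEY surfaces here as an authentication error, which is the most common cause of the failure above.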


License

This project is licensed under the MIT License.


[Work in Progress]

This project is actively being developed, and additional features will be added soon.


Happy Coding! πŸŽ€βž‘οΈπŸ“œπŸ€–

About

Reference Codes for GDG SSTC AI Premier League Workshop
