A minimal Python CLI tool for transcribing Thai audio files using the Typhoon ASR API.
- CLI interface for transcribing Thai audio files
- Support for multiple audio formats (.wav, .mp3, .flac, .ogg, .opus)
- Configurable output formats (plain text or JSON with metadata)
- Environment-based configuration
- Comprehensive error handling and logging
- Optimized for Thai language transcription
- Python 3.11 or higher
- Typhoon ASR API key (get from https://playground.opentyphoon.ai/asr)
- Clone or download this repository:
git clone <repository-url>
cd ThaiTranscriber
- Create and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Configure your API key:
# Copy the example environment file
cp .env.example .env
# Edit .env and add your Typhoon ASR API key
# Get your API key from: https://playground.opentyphoon.ai/asr
Important: Always activate the virtual environment before running the script:
source venv/bin/activate
Or run directly with the venv Python:
./venv/bin/python transcribe.py --file audio.wav
Transcribe an audio file to text:
python transcribe.py --file audio.wav
Transcription JSON output goes to transcriptions/ and summaries to summaries/.
Save transcription with metadata in JSON format:
python transcribe.py --file audio.mp3 --output-format json
Generate both text and JSON outputs:
python transcribe.py --file audio.wav --output-format both
Specify a custom output file:
python transcribe.py --file audio.wav --output transcript.txt
Save to a specific directory:
python transcribe.py --file audio.wav --output-dir ./transcriptions/
# Use a custom .env file
python transcribe.py --file audio.wav --env-file production.env

# Override language setting
python transcribe.py --file audio.wav --language th

# Adjust temperature for sampling
python transcribe.py --file audio.wav --temperature 0.0

# Enable debug logging
python transcribe.py --file audio.wav --log-level DEBUG

# Quiet mode (no console output)
python transcribe.py --file audio.wav --quiet
Create a .env file in the project directory with the following variables:
| Variable | Required | Default | Description |
|---|---|---|---|
| TYPHOON_API_KEY | Yes | - | Your Typhoon ASR API key |
| TYPHOON_BASE_URL | No | https://api.opentyphoon.ai/v1 | API endpoint |
| TYPHOON_MODEL | No | typhoon-asr-realtime | Model name |
| TYPHOON_LANGUAGE | No | th | Language code (Thai) |
| TYPHOON_RESPONSE_FORMAT | No | json | API response format |
| TYPHOON_TEMPERATURE | No | 0.0 | Sampling temperature (0.0-1.0) |
| TYPHOON_ENABLE_TIMESTAMPS | No | true | Enable word-level timestamps |
| TYPHOON_ENABLE_WORD_CONFIDENCE | No | true | Enable confidence scores |
| TYPHOON_LOG_LEVEL | No | INFO | Logging level |
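
As an illustration of the table above, here is a minimal sketch of reading these variables with their documented defaults, using only the standard library. The `Settings` dataclass and `load_settings` function are hypothetical names chosen for this example; the real `src/config.py` may be organized differently, and loading the `.env` file itself is left out.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the variables in the table above."""
    api_key: str
    base_url: str
    model: str
    language: str
    response_format: str
    temperature: float
    enable_timestamps: bool
    enable_word_confidence: bool
    log_level: str


def _as_bool(value: str) -> bool:
    # Treat common truthy spellings as True; anything else is False.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Strip whitespace so stray spaces around the key do not break authentication.
    api_key = env.get("TYPHOON_API_KEY", "").strip()
    if not api_key:
        raise ValueError("TYPHOON_API_KEY is not set; add it to your .env file")

    temperature = float(env.get("TYPHOON_TEMPERATURE", "0.0"))
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("TYPHOON_TEMPERATURE must be between 0.0 and 1.0")

    return Settings(
        api_key=api_key,
        base_url=env.get("TYPHOON_BASE_URL", "https://api.opentyphoon.ai/v1"),
        model=env.get("TYPHOON_MODEL", "typhoon-asr-realtime"),
        language=env.get("TYPHOON_LANGUAGE", "th"),
        response_format=env.get("TYPHOON_RESPONSE_FORMAT", "json"),
        temperature=temperature,
        enable_timestamps=_as_bool(env.get("TYPHOON_ENABLE_TIMESTAMPS", "true")),
        enable_word_confidence=_as_bool(env.get("TYPHOON_ENABLE_WORD_CONFIDENCE", "true")),
        log_level=env.get("TYPHOON_LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check in isolation, for example `load_settings({"TYPHOON_API_KEY": "your-key"})`.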
All configuration can be overridden via command-line arguments:
python transcribe.py --help
ThaiTranscriber/
├── transcribe.py # Main CLI entry point (requires venv)
├── src/
│ ├── __init__.py # Package initialization
│ ├── client.py # Typhoon ASR API client wrapper
│ ├── config.py # Configuration management
│ └── utils.py # Utility functions
├── transcriptions/ # JSON transcription outputs (gitignored)
├── summaries/ # Summary and translation documents (gitignored)
├── venv/ # Python virtual environment (gitignored)
├── requirements.txt # Python dependencies
├── .env.example # Environment configuration template
├── .gitignore # Git ignore rules
└── README.md # This file
- WAV (.wav)
- MP3 (.mp3)
- FLAC (.flac)
- OGG (.ogg)
- OPUS (.opus)
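
A format check against the list above could look roughly like the following sketch. The names `SUPPORTED_EXTENSIONS` and `is_supported_audio` are assumptions made for illustration, and the check only inspects the file extension, not the audio content.

```python
from pathlib import Path

# Extensions taken from the supported-format list above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".opus"}


def is_supported_audio(path) -> bool:
    """Return True if the file extension is a supported format (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```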
Plain text transcription:
สวัสดีครับ ยินดีต้อนรับ
Transcription with metadata:
{
"text": "สวัสดีครับ ยินดีต้อนรับ",
"language": "th",
"duration": 2.5
}
The tool provides clear error messages for common issues:
- Missing API Key: Prompts to configure TYPHOON_API_KEY
- Authentication Errors: Validates API key
- Rate Limits: Informs about API rate limits (100 requests/minute)
- Invalid Audio Format: Lists supported formats
- File Not Found: Validates file paths
- Network Errors: Reports timeout and connection issues
Logging is configured to show:
- Timestamp
- Module name
- Log level
- Message
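
A log format carrying those four fields can be set up with the standard `logging` module. This is a sketch under assumptions: the exact format string and the `configure_logging` helper are illustrative and may not match what the tool uses internally.

```python
import logging

# Timestamp, module (logger) name, log level, message.
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

# Only the four levels documented for this tool.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def configure_logging(level_name: str = "INFO") -> int:
    """Configure the root logger and return the numeric level that was applied."""
    try:
        level = LEVELS[level_name.upper()]
    except KeyError:
        raise ValueError(f"Unknown log level: {level_name!r}") from None
    # force=True replaces handlers left over from an earlier configuration.
    logging.basicConfig(level=level, format=LOG_FORMAT, force=True)
    return level
```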
Available log levels:
- DEBUG: Detailed diagnostic information
- INFO: General information (default)
- WARNING: Warning messages
- ERROR: Error messages
Set the level via the TYPHOON_LOG_LEVEL environment variable or the --log-level command-line option:
python transcribe.py --file audio.wav --log-level DEBUG
- Provider: OpenTyphoon AI
- Endpoint: https://api.opentyphoon.ai/v1
- Model: typhoon-asr-realtime
- Rate Limit: 100 requests per minute
- Documentation: https://docs.opentyphoon.ai/th/asr/
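
Since this tool uses the OpenAI SDK for API communication, a direct call to the endpoint above might look like the sketch below. It assumes the `openai` package (v1 or later) is installed and that the Typhoon endpoint accepts the SDK's standard `audio.transcriptions.create` request; `make_client` and `transcribe_file` are hypothetical helper names, not the contents of `src/client.py`.

```python
import os


def make_client():
    """Create an OpenAI SDK client pointed at the Typhoon endpoint."""
    # Imported here so transcribe_file can be used with any compatible client object.
    from openai import OpenAI

    return OpenAI(
        api_key=os.environ["TYPHOON_API_KEY"],
        base_url=os.environ.get("TYPHOON_BASE_URL", "https://api.opentyphoon.ai/v1"),
    )


def transcribe_file(client, path, model="typhoon-asr-realtime", language="th", temperature=0.0):
    """Send one audio file for transcription and return the recognized text."""
    with open(path, "rb") as audio:
        response = client.audio.transcriptions.create(
            model=model,
            file=audio,
            language=language,
            temperature=temperature,
        )
    return response.text
```

Usage would then be along the lines of `print(transcribe_file(make_client(), "audio.wav"))`, with `TYPHOON_API_KEY` set in the environment.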
- Visit https://playground.opentyphoon.ai/asr
- Sign up or log in
- Generate an API key
- Add it to your .env file
- Audio Quality: Use high-quality audio recordings
- Format: WAV or FLAC for best quality
- Sample Rate: 16kHz or higher recommended
- Background Noise: Minimize background noise
- Temperature: Keep at 0.0 for deterministic results
- Check API documentation for file size limits
- Consider splitting very long audio files
- Use appropriate timeouts for large files
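
One way to split a very long recording is to cut it into fixed-length chunks before transcribing each one. The sketch below does this for WAV files only, using the standard-library `wave` module; the `split_wav` helper, the 60-second default, and the `_partNNN` naming scheme are assumptions for illustration. Cutting at fixed boundaries can split a word in half, so overlapping chunks or silence-based splitting may give better results.

```python
import wave
from pathlib import Path


def split_wav(path, chunk_seconds=60, out_dir=None):
    """Split a WAV file into consecutive chunks and return the chunk paths."""
    src = Path(path)
    out = Path(out_dir) if out_dir else src.parent
    out.mkdir(parents=True, exist_ok=True)

    chunks = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(reader.getframerate() * chunk_seconds)
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds must be positive")

        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            target = out / f"{src.stem}_part{index:03d}.wav"
            with wave.open(str(target), "wb") as writer:
                # Copy channels, sample width and rate; the frame count in the
                # header is corrected automatically when the data is written.
                writer.setparams(params)
                writer.writeframes(frames)
            chunks.append(target)
            index += 1
    return chunks
```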
Use pip3 instead:
pip3 install -r requirements.txt
- Verify the .env file exists in the project directory
- Check that TYPHOON_API_KEY is set in .env
- Ensure no typos in variable name
- Verify no extra spaces around the API key
- Get a new API key from https://playground.opentyphoon.ai/asr
- Update your .env file
- Ensure the API key is copied correctly
Wait 60 seconds before making more requests. The API allows 100 requests per minute.
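
When transcribing many files in a script, the wait can be automated with a small retry wrapper. This is a sketch, not part of the tool: `RateLimited` is a stand-in exception and `call_with_retry` is a hypothetical helper. With the OpenAI SDK, passing `retry_on=(openai.RateLimitError,)` would be the likely choice, assuming the API signals rate limiting with an HTTP 429 response.

```python
import time


class RateLimited(Exception):
    """Stand-in for the rate-limit error raised by the API client (assumption)."""


def call_with_retry(func, max_attempts=3, wait_seconds=60.0,
                    sleep=time.sleep, retry_on=(RateLimited,)):
    """Call func(); on a rate-limit error, wait and try again up to max_attempts times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            sleep(wait_seconds)
```

The `sleep` parameter is injectable so the wrapper can be exercised without actually waiting a minute.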
Ensure your audio file is in a supported format: .wav, .mp3, .flac, .ogg, or .opus
This project is provided as-is for use with the Typhoon ASR API.
- Typhoon ASR API: OpenTyphoon AI (https://opentyphoon.ai)
- OpenAI SDK: Used for API communication
For issues related to:
- This tool: Check the troubleshooting section above
- Typhoon ASR API: Visit https://docs.opentyphoon.ai/th/asr/
- API access: Contact OpenTyphoon AI support