
ThaiTranscriber

A minimal Python CLI tool for transcribing Thai audio files using the Typhoon ASR API.

Features

CLI interface for transcribing Thai audio files
Support for multiple audio formats (.wav, .mp3, .flac, .ogg, .opus)
Configurable output formats (plain text, JSON with metadata, or both)
Environment-based configuration
Comprehensive error handling and logging
Optimized for Thai language transcription

Requirements

Python 3.11 or higher
Typhoon ASR API key (get one from https://playground.opentyphoon.ai/asr)

Installation

Clone or download this repository:

git clone <repository-url>
cd ThaiTranscriber

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Configure your API key:

# Copy the example environment file
cp .env.example .env
# Edit .env and add your Typhoon ASR API key
# Get your API key from: https://playground.opentyphoon.ai/asr

Usage

Important: Always activate the virtual environment before running the script:

source venv/bin/activate

Or run directly with the venv Python:

./venv/bin/python transcribe.py --file audio.wav

Basic Transcription

Transcribe an audio file to text:

python transcribe.py --file audio.wav

JSON transcription output is written to transcriptions/, and summaries are written to summaries/.

JSON Output with Metadata

Save transcription with metadata in JSON format:

python transcribe.py --file audio.mp3 --output-format json

Save Both Formats

Generate both text and JSON outputs:

python transcribe.py --file audio.wav --output-format both

Custom Output Path

Specify a custom output file:

python transcribe.py --file audio.wav --output transcript.txt

Custom Output Directory

Save to a specific directory:

python transcribe.py --file audio.wav --output-dir ./transcriptions/

Advanced Options

# Use a custom .env file
python transcribe.py --file audio.wav --env-file production.env
# Override language setting
python transcribe.py --file audio.wav --language th
# Adjust temperature for sampling
python transcribe.py --file audio.wav --temperature 0.0
# Enable debug logging
python transcribe.py --file audio.wav --log-level DEBUG
# Quiet mode (no console output)
python transcribe.py --file audio.wav --quiet

Configuration

Environment Variables

Create a .env file in the project directory with the following variables:

Variable	Required	Default	Description
`TYPHOON_API_KEY`	Yes	-	Your Typhoon ASR API key
`TYPHOON_BASE_URL`	No	`https://api.opentyphoon.ai/v1`	API endpoint
`TYPHOON_MODEL`	No	`typhoon-asr-realtime`	Model name
`TYPHOON_LANGUAGE`	No	`th`	Language code (Thai)
`TYPHOON_RESPONSE_FORMAT`	No	`json`	API response format
`TYPHOON_TEMPERATURE`	No	`0.0`	Sampling temperature (0.0-1.0)
`TYPHOON_ENABLE_TIMESTAMPS`	No	`true`	Enable word-level timestamps
`TYPHOON_ENABLE_WORD_CONFIDENCE`	No	`true`	Enable confidence scores
`TYPHOON_LOG_LEVEL`	No	`INFO`	Logging level
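The table above can be read as a small settings loader. The following is a minimal sketch, not the code in src/config.py: the names `Settings`, `load_settings`, and `_as_bool` are hypothetical, and the accepted spellings for boolean values are an assumption. Only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings (assumed); anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented variables."""
    api_key: str
    base_url: str
    model: str
    language: str
    response_format: str
    temperature: float
    enable_timestamps: bool
    enable_word_confidence: bool
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, applying the defaults from the table."""
    api_key = env.get("TYPHOON_API_KEY", "").strip()
    if not api_key:
        # TYPHOON_API_KEY is the only required variable.
        raise ValueError("TYPHOON_API_KEY environment variable is required")
    return Settings(
        api_key=api_key,
        base_url=env.get("TYPHOON_BASE_URL", "https://api.opentyphoon.ai/v1"),
        model=env.get("TYPHOON_MODEL", "typhoon-asr-realtime"),
        language=env.get("TYPHOON_LANGUAGE", "th"),
        response_format=env.get("TYPHOON_RESPONSE_FORMAT", "json"),
        temperature=float(env.get("TYPHOON_TEMPERATURE", "0.0")),
        enable_timestamps=_as_bool(env.get("TYPHOON_ENABLE_TIMESTAMPS", "true")),
        enable_word_confidence=_as_bool(
            env.get("TYPHOON_ENABLE_WORD_CONFIDENCE", "true")
        ),
        log_level=env.get("TYPHOON_LOG_LEVEL", "INFO"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise without touching the real environment.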

Command-Line Arguments

Every setting above can be overridden with a command-line argument. To list all available arguments:

python transcribe.py --help
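For orientation, the flags used in the examples in this README could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the parser in transcribe.py: the `text` choice for `--output-format`, the `.env` default for `--env-file`, and the function name `build_parser` are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags that appear in this README's usage examples."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Transcribe Thai audio files with the Typhoon ASR API.",
    )
    parser.add_argument("--file", required=True, help="Path to the audio file")
    # "json" and "both" appear in the README; "text" is an assumed name
    # for the plain-text format.
    parser.add_argument(
        "--output-format", choices=["text", "json", "both"], default="text"
    )
    parser.add_argument("--output", help="Custom output file path")
    parser.add_argument("--output-dir", help="Directory for output files")
    parser.add_argument("--env-file", default=".env", help="Path to a .env file")
    # None means "fall back to the value from the environment".
    parser.add_argument("--language", default=None)
    parser.add_argument("--temperature", type=float, default=None)
    parser.add_argument(
        "--log-level", choices=["DEBUG", "INFO", "WARNING", "ERROR"], default=None
    )
    parser.add_argument("--quiet", action="store_true", help="No console output")
    return parser
```

Leaving the overridable options at `None` lets the caller distinguish "not given on the command line" from an explicit value, so environment defaults apply only when the flag is absent.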

Project Structure

ThaiTranscriber/
├── transcribe.py # Main CLI entry point (requires venv)
├── src/
│ ├── __init__.py # Package initialization
│ ├── client.py # Typhoon ASR API client wrapper
│ ├── config.py # Configuration management
│ └── utils.py # Utility functions
├── transcriptions/ # JSON transcription outputs (gitignored)
├── summaries/ # Summary and translation documents (gitignored)
├── venv/ # Python virtual environment (gitignored)
├── requirements.txt # Python dependencies
├── .env.example # Environment configuration template
├── .gitignore # Git ignore rules
└── README.md # This file

Supported Audio Formats

WAV (.wav)
MP3 (.mp3)
FLAC (.flac)
OGG (.ogg)
OPUS (.opus)
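A format check against this list can be done on the file extension alone. The sketch below assumes a case-insensitive comparison; `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical names, and the tool's actual validation may differ.

```python
from pathlib import Path

# Extensions taken from the list above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".opus"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

This checks only the name, not the file contents, so a mislabeled file would still pass.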

Output Formats

Text Format (.txt)

Plain text transcription:

สวัสดีครับ ยินดีต้อนรับ

JSON Format (.json)

Transcription with metadata:

{
 "text": "สวัสดีครับ ยินดีต้อนรับ",
 "language": "th",
 "duration": 2.5
}
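When writing Thai text to JSON from Python, `json.dumps` escapes non-ASCII characters by default, which would turn the transcription into `\u....` sequences. A minimal sketch of producing the document above with readable Thai text follows; the function name `to_json_document` is hypothetical.

```python
import json


def to_json_document(text: str, language: str, duration: float) -> str:
    """Serialize a transcription, keeping Thai characters unescaped."""
    payload = {"text": text, "language": language, "duration": duration}
    # ensure_ascii=False keeps Thai characters as-is instead of \uXXXX escapes.
    return json.dumps(payload, ensure_ascii=False, indent=2)
```

The resulting string should be written to disk with an explicit `encoding="utf-8"` so the output does not depend on the platform default.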

Error Handling

The tool provides clear error messages for common issues:

Missing API Key: Prompts to configure TYPHOON_API_KEY
Authentication Errors: Reports an invalid or rejected API key
Rate Limits: Informs about API rate limits (100 requests/minute)
Invalid Audio Format: Lists supported formats
File Not Found: Reports when the input file path does not exist
Network Errors: Reports timeout and connection issues

Logging

Logging is configured to show:

Timestamp
Module name
Log level
Message

Available log levels:

DEBUG: Detailed diagnostic information
INFO: General information (default)
WARNING: Warning messages
ERROR: Error messages

Set the level via the TYPHOON_LOG_LEVEL environment variable or the --log-level command-line argument:

python transcribe.py --file audio.wav --log-level DEBUG
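The four fields listed in this section map directly onto a standard `logging` format string. The sketch below shows one way to configure it; the exact separator and the name `configure_logging` are assumptions, not the tool's actual format.

```python
import logging

# Timestamp, module (logger) name, log level, message.
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

# The four levels documented in this README.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def configure_logging(level_name: str = "INFO") -> None:
    """Configure the root logger with the documented fields."""
    try:
        level = LEVELS[level_name.upper()]
    except KeyError:
        raise ValueError(f"Unknown log level: {level_name!r}") from None
    # force=True replaces any handlers installed by an earlier call.
    logging.basicConfig(level=level, format=LOG_FORMAT, force=True)
```

Calling `configure_logging("DEBUG")` before any other logging call makes module loggers such as `logging.getLogger("src.client")` emit diagnostic messages in this layout.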

API Information

Provider: OpenTyphoon AI
Endpoint: https://api.opentyphoon.ai/v1
Model: typhoon-asr-realtime
Rate Limit: 100 requests per minute
Documentation: https://docs.opentyphoon.ai/th/asr/
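Because the Credits section notes that the OpenAI SDK is used for API communication, a direct call could look roughly like the sketch below. It assumes the endpoint accepts the SDK's standard audio transcription request, including the `language` and `temperature` parameters; the helper names `build_request` and `transcribe` are hypothetical, and this is not the code in src/client.py.

```python
from typing import Any, Dict


def build_request(
    model: str = "typhoon-asr-realtime",
    language: str = "th",
    temperature: float = 0.0,
) -> Dict[str, Any]:
    """Collect the request parameters, using the defaults listed above."""
    return {"model": model, "language": language, "temperature": temperature}


def transcribe(
    path: str,
    api_key: str,
    base_url: str = "https://api.opentyphoon.ai/v1",
) -> str:
    """Send one audio file for transcription and return the text."""
    # Imported lazily so build_request works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key, base_url=base_url)
    with open(path, "rb") as audio:
        # Assumes the endpoint is compatible with the SDK's transcription call.
        result = client.audio.transcriptions.create(file=audio, **build_request())
    return result.text
```

Keeping the parameters in a separate function makes them easy to inspect or override without making a network request.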

Getting an API Key

Visit https://playground.opentyphoon.ai/asr
Sign up or log in
Generate an API key
Add it to your .env file

Best Practices

For Best Transcription Accuracy

Audio Quality: Use high-quality audio recordings
Format: WAV or FLAC for best quality
Sample Rate: 16 kHz or higher recommended
Background Noise: Minimize background noise
Temperature: Keep at 0.0 for deterministic results

For Large Files

Check API documentation for file size limits
Consider splitting very long audio files
Use appropriate timeouts for large files
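If a long recording needs to be split, the chunk boundaries can be computed up front. The sketch below is only the arithmetic: `chunk_ranges` is a hypothetical helper, the chunk length is an arbitrary choice rather than a documented API limit, and cutting the audio itself is left to an external tool.

```python
from typing import List, Tuple


def chunk_ranges(
    total_seconds: float, chunk_seconds: float
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs in seconds that cover the whole recording."""
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds > 0")
    total = float(total_seconds)
    ranges: List[Tuple[float, float]] = []
    start = 0.0
    while start < total:
        # The last chunk is shortened so it ends exactly at the total length.
        end = min(start + chunk_seconds, total)
        ranges.append((start, end))
        start = end
    return ranges
```

Fixed-length cuts can land in the middle of a word, so splitting at silences where possible is likely to give cleaner transcriptions.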

Troubleshooting

"pip not found"

Use pip3 instead:

pip3 install -r requirements.txt

"TYPHOON_API_KEY environment variable is required"

Verify .env file exists in project directory
Check that TYPHOON_API_KEY is set in .env
Ensure no typos in variable name
Verify no extra spaces around the API key
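The checks above can be automated with a few lines of Python. This is a hypothetical helper, not part of the tool: `check_env_text` and its messages are made up for illustration, and it flags spaces around the `=` sign only because this README advises against them.

```python
from typing import List


def check_env_text(text: str, key: str = "TYPHOON_API_KEY") -> List[str]:
    """Return a list of problems found for `key` in the text of a .env file."""
    problems: List[str] = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and lines that are not assignments.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        name, value = line.split("=", 1)
        if name.strip() != key:
            continue
        found = True
        if name != name.strip() or value != value.strip():
            problems.append("extra spaces around the variable name or value")
        if not value.strip():
            problems.append("value is empty")
    if not found:
        problems.append(f"{key} is not set")
    return problems
```

A typo in the variable name shows up as "not set", since the misspelled line does not match the expected key.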

"Authentication failed"

Get a new API key from https://playground.opentyphoon.ai/asr
Update your .env file
Ensure the API key is copied correctly

"Rate limit exceeded"

Wait 60 seconds before making more requests. The API allows 100 requests per minute.
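For batch jobs, the wait can be built into the calling code. The sketch below retries a call after a fixed pause; `RateLimitedError` is a hypothetical stand-in for whatever exception the client raises on a rate-limit response, and `call_with_retry` is not part of the tool.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical stand-in for the client's rate-limit exception."""


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, pausing and retrying when it reports a rate limit."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitedError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            sleep(wait_seconds)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the pause can be replaced in tests; in normal use the default `time.sleep` applies the 60-second wait.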

"Invalid audio format"

Ensure your audio file is in a supported format: .wav, .mp3, .flac, .ogg, or .opus

License

This project is provided as-is for use with the Typhoon ASR API.

Credits

Typhoon ASR API: OpenTyphoon AI (https://opentyphoon.ai)
OpenAI SDK: Used for API communication

Support

For issues related to:

This tool: Check the troubleshooting section above
Typhoon ASR API: Visit https://docs.opentyphoon.ai/th/asr/
API access: Contact OpenTyphoon AI support
