Caption Extractors for Podcasts - A Rust CLI tool to extract transcripts from Apple Podcasts on macOS.
- List all episodes with available transcripts in your Apple Podcasts library
- Export transcripts to plain text with optional speaker labels and timestamps
- Batch export all transcripts at once
- Search across all transcripts for keywords
    # Clone or navigate to the project
    cd ~/Developers/experiments/capers

    # Build and install
    cargo install --path .

    # Or just build
    cargo build --release
    # Binary will be at: target/release/capers
- macOS (Apple Podcasts desktop app)
- Rust toolchain (for building from source)
    # List all episodes that have transcripts
    capers list

    # Filter by podcast or episode title
    capers list --filter "My Favorite Podcast"
Output:
ID EPISODE PODCAST DURATION
------------------------------------------------------------------------------------------
14 Episode Title Here Podcast Name 01:34:14
...
    # Export by episode ID (shown in list output)
    capers export 14

    # Export by title search
    capers export "Episode Title"

    # Save to file
    capers export 14 -o transcript.txt

    # Include timestamps
    capers export 14 --timestamps

    # Without speaker labels
    capers export 14 --no-speakers

    # Both timestamps and no speakers
    capers export 14 --timestamps --no-speakers
Output formats:
Default (with speakers):
[SPEAKER_1]
Hello and welcome to the show...
[SPEAKER_2]
Thanks for having me...
With timestamps:
[SPEAKER_1] [0:00] Hello and welcome to the show...
[SPEAKER_2] [0:15] Thanks for having me...
Plain text (no speakers):
Hello and welcome to the show... Thanks for having me...
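The three layouts above can be sketched as one rendering function driven by two flags. This is a minimal illustration, not capers' actual code: the `Segment` type, `format_timestamp`, and `render` are hypothetical names. The timestamps-without-speakers layout and the one-segment-per-line plain layout are assumptions, since the examples above do not show them unambiguously.

```rust
/// One transcript segment (hypothetical type, for illustration only).
struct Segment {
    speaker: String,
    start_secs: u64,
    text: String,
}

/// Format seconds as M:SS, switching to H:MM:SS past one hour (assumed format).
fn format_timestamp(secs: u64) -> String {
    let (h, m, s) = (secs / 3600, (secs % 3600) / 60, secs % 60);
    if h > 0 {
        format!("{h}:{m:02}:{s:02}")
    } else {
        format!("{m}:{s:02}")
    }
}

/// Render segments according to the speaker and timestamp flags.
fn render(segments: &[Segment], speakers: bool, timestamps: bool) -> String {
    let mut out = String::new();
    for seg in segments {
        let ts = format_timestamp(seg.start_secs);
        let line = match (speakers, timestamps) {
            // Default: speaker label on its own line, then the text.
            (true, false) => format!("[{}]\n{}\n", seg.speaker, seg.text),
            // --timestamps: label, timestamp, and text on one line.
            (true, true) => format!("[{}] [{}] {}\n", seg.speaker, ts, seg.text),
            // --timestamps --no-speakers: assumed layout, not shown above.
            (false, true) => format!("[{}] {}\n", ts, seg.text),
            // --no-speakers: text only, one segment per line (assumption).
            (false, false) => format!("{}\n", seg.text),
        };
        out.push_str(&line);
    }
    out
}
```

Calling `render(&segments, true, true)` corresponds to `capers export 14 --timestamps`, and `render(&segments, false, false)` to `--no-speakers`.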
    # Export to default ./transcripts directory
    capers export-all

    # Custom output directory
    capers export-all -o ~/Documents/podcast-transcripts

    # With timestamps
    capers export-all --timestamps
Files are named: Podcast Name - Episode Title.txt
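As a rough sketch of that naming scheme, the function below builds a file name from the two titles. The replacement of `/`, `:`, and control characters with `_` is an assumption for illustration. This document does not say whether or how capers sanitizes titles, and `transcript_file_name` is a hypothetical name.

```rust
/// Build "Podcast Name - Episode Title.txt" from the two titles.
/// The sanitization rule here is an assumption, not capers' documented behavior.
fn transcript_file_name(podcast: &str, episode: &str) -> String {
    let raw = format!("{podcast} - {episode}");
    let safe: String = raw
        .chars()
        // '/' separates path components and ':' is awkward in Finder,
        // so both are replaced along with control characters.
        .map(|c| if c == '/' || c == ':' || c.is_control() { '_' } else { c })
        .collect();
    format!("{}.txt", safe.trim())
}
```

With this rule, a podcast titled `A/B` and an episode titled `Part 1: Intro` would map to `A_B - Part 1_ Intro.txt`.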
    # Search for a keyword
    capers search "artificial intelligence"

    # Adjust context window (characters around match)
    capers search "keyword" --context 100
Output:
Podcast Name - Episode Title (3 matches)
------------------------------------------------------------
...context around the first match with keyword highlighted...
...context around the second match...
... and 1 more matches
Total: 3 matches
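The context window can be pictured as a slice of up to `--context` characters on each side of every match. The sketch below is one way to compute it, assuming a case-sensitive match and counting Unicode characters rather than bytes. The function name `contexts` is hypothetical, and capers may match or trim differently.

```rust
/// Return each match of `keyword` in `text` with up to `context`
/// characters on either side. Case-sensitive by assumption.
fn contexts(text: &str, keyword: &str, context: usize) -> Vec<String> {
    if keyword.is_empty() {
        return Vec::new();
    }
    let mut out = Vec::new();
    for (start, matched) in text.match_indices(keyword) {
        let end = start + matched.len();
        // Walk back `context` characters, stopping at the start of the text.
        let from = if context == 0 {
            start
        } else {
            text[..start]
                .char_indices()
                .rev()
                .nth(context - 1)
                .map(|(i, _)| i)
                .unwrap_or(0)
        };
        // Walk forward `context` characters, stopping at the end of the text.
        let to = text[end..]
            .char_indices()
            .nth(context)
            .map(|(i, _)| end + i)
            .unwrap_or(text.len());
        out.push(text[from..to].to_string());
    }
    out
}
```

Working on character boundaries keeps the slicing from panicking on non-ASCII transcripts, which matters for any language other than plain English.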
Apple Podcasts stores transcripts locally in TTML (Timed Text Markup Language) format. capers reads:
- SQLite database at ~/Library/Group Containers/243LU875E5.groups.com.apple.podcasts/Documents/MTLibrary.sqlite - contains episode metadata and transcript file paths
- TTML cache at ~/Library/Group Containers/243LU875E5.groups.com.apple.podcasts/Library/Cache/Assets/TTML/ - contains the actual transcript files
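For orientation, the two locations can be derived from the user's home directory as shown below. The function names are hypothetical and only the paths come from this document. The sketch takes the home directory as a parameter rather than reading `$HOME`, so it makes no claim about how capers locates it.

```rust
use std::path::{Path, PathBuf};

/// Group container used by Apple Podcasts (taken from the paths above).
const GROUP_CONTAINER: &str = "243LU875E5.groups.com.apple.podcasts";

/// Root of the Apple Podcasts group container under the given home directory.
fn podcasts_container(home: &Path) -> PathBuf {
    home.join("Library/Group Containers").join(GROUP_CONTAINER)
}

/// SQLite database with episode metadata and transcript file paths.
fn database_path(home: &Path) -> PathBuf {
    podcasts_container(home).join("Documents/MTLibrary.sqlite")
}

/// Directory holding the cached TTML transcript files.
fn ttml_cache_dir(home: &Path) -> PathBuf {
    podcasts_container(home).join("Library/Cache/Assets/TTML")
}
```

Checking `database_path(home).exists()` before doing anything else is a cheap way to tell a missing Apple Podcasts installation apart from a missing transcript.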
Transcripts are only available for episodes you've opened in Apple Podcasts. The app downloads transcripts on-demand when you view an episode. To get a transcript:
- Open Apple Podcasts
- Navigate to the episode
- Play it or view the transcript in the app
- Now capers can access it
- Make sure you've opened some episodes in Apple Podcasts that have transcripts
- Not all podcasts have transcripts - Apple generates them automatically for supported languages
- This tool only works on macOS with the Apple Podcasts app installed
- Make sure you've used Apple Podcasts at least once
- The transcript may not be cached yet - open the episode in Apple Podcasts first
- The cache may have been cleared - reopen the episode in Apple Podcasts
MIT