Pi-first post-production tooling for podcast and video-podcast editors who want transcripts, chapters, clip candidates, cuts, and social review exports from local media.
- Launches pi with podcast-specific skills, prompts, and a startup widget.
- Transcribes local audio/video files with Whisper-compatible backends.
- Prepares long transcripts into timecoded chunks for editorial review.
- Scans video episodes for likely interstitials and non-host inserts.
- Generates chapters, clip candidates, cut reports, show notes, quotes, and proper noun checks.
- Cuts selected highlight ranges into review exports for TikTok, Reels, YouTube Shorts, trailers, or social posts.
- Uploads finished episodes to YouTube with metadata composed from analysis artifacts.
Generated transcripts, scans, thumbnails, notes, and clip exports go under gitignored dist/ by default.
Clone the repo and install the local Node tooling:
git clone https://github.com/modem-dev/podguy.git
cd podguy
npm install
Install the required system tools:
brew install uv ffmpeg
Notes:
- ffmpeg is used for fixtures, transcription backends, and clip cutting.
- uv runs the Python scripts and optional transcription dependency groups. The first uv command creates a local .venv/ and may download Python automatically. This is normal and only happens once.
- The video scanner is macOS-only and uses Swift / AVFoundation / Vision. Everything else works cross-platform (Linux: install uv and ffmpeg via your package manager).
Set up a real transcription backend when you are ready to transcribe episodes:
uv sync --group transcribe-mlx      # Apple Silicon
uv sync --group transcribe-faster   # Cross-platform
uv sync --group transcribe-whisper  # OpenAI Whisper package
Create an optional show profile:
cp podguy.example.toml podguy.toml
Start podguy from the repo root:
./podguy
On first launch, type /login inside pi to connect your model provider (or use your usual API key setup).
Then ask pi for a concrete episode task:
Analyze "episode-006-draft.mp4" as ep006.
Generate chapters for ep006 in timestamp-title format.
Find likely TikTok/Shorts clips for ep006 and cut vertical review exports.
For broad requests, podguy should clarify between:
- quick pass: optional video scan + transcript + prepared transcript artifacts + short summary
- full review: quick pass + chapters + clips + cuts + show notes + quotes + proper noun review
You don't need to memorize the commands below; pi runs them for you when you ask in natural language. They are listed here for reference and debugging.
swift scripts/scan_podcast.swift "episode-006-draft.mp4" dist/analysis/ep006/scan 0.5
open dist/analysis/ep006/scan/report.html
Key outputs:
- interstitial_candidates.csv
- non_host_candidates.csv
- report.html
- thumbs/
Scanner results are heuristic review aids, not exact edit points.
uv run python scripts/transcribe_video.py --list-backends
uv run --group transcribe-mlx python scripts/transcribe_video.py \
"episode-006-draft.mp4" \
dist/analysis/ep006/transcript \
--backend mlx-whisper
Key outputs:
- segments.json
- transcript.txt
- transcript.srt
- transcript.vtt
- summary.txt
Use --backend mock only for tests and setup validation.
uv run python scripts/prepare_transcript_analysis.py \
dist/analysis/ep006/transcript \
--output-dir dist/analysis/ep006 \
--slug ep006 \
--plain-output-names
Key outputs:
- dist/analysis/ep006/transcript_chunks.md
- dist/analysis/ep006/transcript_index.json
These are the main inputs for chaptering and editorial analysis.
After pi writes dist/analysis/ep006/clips.md, cut original-aspect review exports:
uv run python scripts/cut_clips.py \
"episode-006-draft.mp4" \
dist/analysis/ep006/clips.md \
dist/analysis/ep006/clips/cuts
For simple vertical Shorts/TikTok/Reels review exports:
uv run python scripts/cut_clips.py \
"episode-006-draft.mp4" \
dist/analysis/ep006/clips.md \
dist/analysis/ep006/clips/shorts \
--aspect vertical \
--pad-start 1 \
--pad-end 1
The cutter writes generated media plus manifest.json. Vertical and square modes use center-crop framing, so treat them as review exports unless the framing has been checked.
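To see why center-cropped exports need a framing check, here is a minimal sketch of how a centered crop rectangle can be computed for a target aspect ratio. It is an illustration only, not the code in scripts/cut_clips.py: the function name `center_crop` and the rounding to even dimensions are assumptions.

```python
def center_crop(src_w: int, src_h: int, aspect_w: int, aspect_h: int) -> tuple[int, int, int, int]:
    """Return (width, height, x, y) of the largest centered crop with the target aspect.

    Hypothetical helper for illustration; the real cutter may compute this differently.
    """
    # Compare aspect ratios with integer cross-multiplication to avoid float rounding.
    if src_w * aspect_h > src_h * aspect_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = src_h
        crop_w = src_h * aspect_w // aspect_h
    else:
        # Source is taller than (or equal to) the target: keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = src_w * aspect_h // aspect_w

    # Assumption: round down to even dimensions, which common H.264 pixel formats need.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2

    # Center the crop rectangle inside the source frame.
    x = (src_w - crop_w) // 2
    y = (src_h - crop_h) // 2
    return crop_w, crop_h, x, y


if __name__ == "__main__":
    print(center_crop(1920, 1080, 9, 16))  # vertical crop from a 1080p frame
    print(center_crop(1920, 1080, 1, 1))   # square crop from a 1080p frame
```

Under these assumptions, a vertical crop of a 1920x1080 frame keeps a column only 606 pixels wide, so a host seated off-center can fall partly or fully outside the frame.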
One-time setup: create a Google Cloud project with the YouTube Data API v3 enabled, create a Desktop-app OAuth client, and save the downloaded JSON to ~/.config/podguy/youtube/client_secret.json. On the OAuth consent screen, keep the app in Testing mode and add your own Google account as a test user — otherwise the auth flow fails with access_denied. Then authenticate:
uv sync --group youtube
uv run --group youtube python scripts/youtube_publish.py auth
Upload an episode (private by default; use --dry-run first to preview the request):
uv run --group youtube python scripts/youtube_publish.py upload \
"episode-006-final.mp4" \
--title "Ep 6: Why this market flipped" \
--description-file dist/analysis/ep006/youtube-description.md \
--chapters-file dist/analysis/ep006/chapters.md
Other subcommands cover thumbnails, SRT captions, playlists, scheduled publishing (--publish-at), status checks, and metadata updates. Defaults like privacy, category, tags, and a description footer come from the [youtube] section of podguy.toml.
Each upload costs 1600 of the default 10000 daily YouTube API quota units, and videos uploaded through unverified API projects may stay locked private until the project passes a YouTube API audit.
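The quota figures above translate into a hard daily upload ceiling. The short sketch below works through the arithmetic using only the two numbers stated here; the function names are hypothetical, and it ignores the smaller costs of other API calls (thumbnails, captions, metadata updates), which would reduce the headroom further.

```python
UPLOAD_COST_UNITS = 1600           # cost of one upload, as stated above
DEFAULT_DAILY_QUOTA_UNITS = 10000  # default daily quota, as stated above


def max_uploads_per_day(daily_quota: int = DEFAULT_DAILY_QUOTA_UNITS,
                        upload_cost: int = UPLOAD_COST_UNITS) -> int:
    """Number of whole uploads that fit in one day's quota."""
    return daily_quota // upload_cost


def units_left_after(uploads: int,
                     daily_quota: int = DEFAULT_DAILY_QUOTA_UNITS,
                     upload_cost: int = UPLOAD_COST_UNITS) -> int:
    """Quota units remaining after a number of uploads; raises if over quota."""
    remaining = daily_quota - uploads * upload_cost
    if remaining < 0:
        raise ValueError(f"{uploads} uploads exceed the daily quota of {daily_quota} units")
    return remaining


if __name__ == "__main__":
    n = max_uploads_per_day()
    print(f"max uploads per day: {n}")
    print(f"units left after {n} uploads: {units_left_after(n)}")
```

With the default quota, that is six uploads per day with 400 units to spare for other calls.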
Use the Cordkillers open-license video-podcast excerpt for local evaluation:
scripts/download_sample_media.sh
This writes to:
dist/test-fixtures/open-license/cordkillers-572/
The default sample is a 3m50s excerpt from 00:08:00 of Cordkillers 572, licensed CC BY-SA 4.0. The range includes multiple podcast layouts, lower thirds, chat/sidebar graphics, a Patreon bumper, and an outro/interstitial card. The script writes ATTRIBUTION.md next to the generated media.
podguy.toml lets you define show-specific context without changing the workflow:
show_name = "Example Podcast"
show_slug = "example"
hosts = ["Host One", "Host Two"]
tone = "curious, direct, practical"
audience = "builders and technical operators"
chapter_style = "concise descriptive titles"
preferred_review = "quick_pass"
podcast.toml is also accepted as a compatible profile name.
- podguy: launcher for pi with repo-local skills, prompts, and startup extension.
- src/podguy-post-production/SKILL.md: main editorial workflow skill.
- src/podguy-clip-cutter/SKILL.md: social clip export workflow skill.
- src/podguy-youtube-publisher/SKILL.md: YouTube upload workflow skill.
- src/podguy-startup.ts: pi startup widget.
- prompts/: optional prompt shortcuts.
- scripts/: deterministic scanner, transcript, prep, fixture, sample, clip-cutting, and YouTube publishing tools.
- tests/: smoke tests wrapped by Vitest.
- AGENTS.md: repo guidance for coding agents.
Run the full validation surface:
npm run format:check
npm run lint
npm run typecheck
npm test
Run the shell smoke tests directly:
bash scripts/test.sh
CI runs the same checks on macOS because the scanner depends on macOS media APIs.
Small, focused PRs are welcome. Before opening a PR, run the validation commands above.
For workflow or heuristic changes, include:
- media type and OS/backend details
- expected vs actual output
- relevant transcript, scan, or manifest paths when available
See CHANGELOG.md for user-visible changes and AGENTS.md for repo maintenance guidance.
This repo does not have a published security policy yet. If you find a sensitive issue, do not open a public issue. Contact the maintainers privately first.
Sponsored by Modem.
MIT. See LICENSE.
Use GitHub issues for bugs, questions, and workflow discussion.