Screen Studio-style effects for screen recordings, from the command line — auto-zoom on clicks, automatic speed-up of idle time, keystroke overlays, cursor smoothing, and a 9:16 vertical export that follows the action. Headless, scriptable: Python + ffmpeg + a single-file Swift event logger. No app, no subscription.
macOS-only, end to end. This is not just the logger:
render.pyencodes with Apple'sh264_videotoolbox,polish.pyloads/System/Library/Fonts/Helvetica.ttc, and the capture path relies onscreencaptureand the Swiftevents-loglogger — all macOS. It is not portable to Linux/Windows as written.
Screen Studio's signature effects need to know where you clicked and what you typed. Pixels alone can't tell you that — so this tool has two halves:
events-log(Swift) runs while you record and logs cursor positions (60 Hz), clicks, and keystrokes to a JSONL file. It drops all key events while macOS secure input is active (password fields, sudo) and is meant to run only for the duration of a recording.polish.py(Python) post-processes the recording using that log:
| Flag | Effect | Needs events? |
|---|---|---|
--speedup |
Compress idle spans (no input AND frozen pixels — a playing animation is never compressed) | no (pixel-only fallback) |
--zoom |
Eased auto-zoom onto click clusters | yes |
--keys |
Accumulating keystroke chips, rendered like real typing | yes |
--smooth-cursor |
Replace the jittery real cursor with an eased synthetic one (record cursor-free for best results) | yes |
--vertical |
Additional ×ばつ1920 output whose crop window follows the action | best with |
pip install -r requirements.txt # numpy + Pillow (both required) brew install ffmpeg # H.264 encode/decode (system dependency, not pip) swiftc -O events-log.swift -o events-log # build the Swift event logger (once)
numpy is required: render.py — the renderer every export shells out to — uses it for
per-frame camera math, backdrop gradients, and click-sound mixing. ffmpeg is a system
dependency (install via Homebrew), not a pip package.
# build the logger (once) swiftc -O events-log.swift -o events-log # record anything (screencapture, OBS, ScreenCaptureKit...) while logging: ./events-log demo.events.jsonl & LOGGER=$! screencapture -v -V 30 demo.mov # or any recorder kill $LOGGER # polish: python3 polish.py demo.mov --events demo.events.jsonl \ --speedup --zoom --keys --vertical # -> demo.polished.mov + demo.polished.vertical.mov (1080x1920)
--speedup alone works on any existing screen recording — no event log needed.
Requirements: macOS (see the macOS-only note above), Python 3 with numpy + Pillow
(pip install -r requirements.txt), and ffmpeg. The --speedup-only polish.py path
itself uses no numpy, but the headline renderer (render.py) does, so numpy is required
for any export.
Pairs with macos-screen-recorder-system-audio (sck-record --no-cursor) for system-audio capture and cursor-free footage for --smooth-cursor.
render.py is a non-destructive, single-pass camera renderer — the Screen Studio
approach. The recording is never modified; a render spec (zoom regions + ease ramp +
fps + aspect) drives a virtual camera that zooms/pans over the original high-res
frames, sampled with LANCZOS, exported to a smaller target — so a zoom still
reads ≥1:1 source pixels and stays crisp (measured ×ばつ sharper than cropping a finished
video, more with native-retina capture). Easing is cosine ease-in/out per region
(smooth, predictable, no overshoot), tuned by --ramp (ease-in/out duration in seconds),
output at 60fps, and it does horizontal (16:9/1:1) and 9:16 vertical
(--aspect 9:16, zoom + follow) from the same code.
python3 render.py SRC.mp4 --regions regions.json --out out.mp4 \
--aspect 16:9 --fps 60 --ramp 0.5 --cursor --speedup
# vertical: same command with --aspect 9:16regions.json = [{"t0","t1","z","cx","cy"}]. (polish.py is the older ffmpeg-filter
path — it does the --speedup/idle-detection work and needs no numpy itself, but it is
not a substitute for render.py: every Studio export and every high-quality zoom/pan
render goes through render.py, which requires numpy.)
python3 studio.py [recording.mp4] opens a local web UI: an NLE-style timeline with a
fixed ruler (bar width = source duration; edits never rescale it, so upstream content is
always planted). Auto-detected zoom regions are draggable blocks — drag to move, drag
edges to retime, scroll over one to set its zoom level, double-click an empty track to add,
double-click a block to delete (or select + Delete key). Idle spans are auto-detected as the
intersection of input-gaps and frozen pixels (no keyboard/mouse input and no pixel
change — a playing animation is never flagged idle) and become
speed blocks: their source range is locked (which footage is sped never changes) but
their rate is editable — select one for an inspector speed slider, or drag its right
edge = rate-stretch (FCP retime handle / Premiere Rate Stretch). Changing a rate ripples
everything downstream along the planted ruler; empty track grows at the right as the cut
gets shorter. Configurable ease transition, default zoom, and frame styling (backdrop
gradient + padding + rounded corners + drop shadow). Preview is live canvas compositing;
export uses render.py and outputs 60fps at chosen aspect. Synthetic cursor is always
smooth; click ripple + a real recorded click sound (CC0, freesound #735771) are included.
OS-chosen free port, all local.
The logger needs Accessibility / Input Monitoring for clicks + keys (System Settings → Privacy & Security); without the grant it degrades to cursor-move sampling. The recorder needs Screen Recording. This is a demo-production tool: run the logger only while recording, and treat the events file like the recording itself.
make-fixture.py synthesizes a fake screen recording with scripted cursor travel, clicks, typing, and idle spans, plus a ground-truth events.jsonl — every feature is validated against known truth:
python3 make-fixture.py fixture.mp4 fixture.events.jsonl python3 polish.py fixture.mp4 --events fixture.events.jsonl --speedup --zoom --keys --vertical
MIT