Native Rust browser automation CLI for Chrome/Chromium via CDP. gsd-browser keeps a persistent background daemon, auto-starts on first use, and exposes 90+ top-level commands for navigation, interaction, authenticated live viewing, annotations, recording bundles, snapshots with versioned refs, assertions, structured extraction, network control, visual diffing, tracing, and stateful auth flows.
Built for AI agents, CI pipelines, and developers who want deterministic browser control without adopting a full browser test framework.
gsd-browser mcp runs gsd-browser as a Model Context Protocol (MCP) server for agents. It exposes 50+ tools, live resources (real snapshot, ref, state, and timeline data), and executable prompts over local stdio or hosted Streamable HTTP.
Completed advancements include:
- Full coverage of the rich surface: versioned refs + multiple snapshot modes, semantic `act` + `find_best`, advanced forms, robust assertions + waits, `browser_batch` for atomic flows, live viewer + full human collaboration (takeover, annotations, goal banners, step/abort/pause/resume, sensitive mode), first-class recording & evidence bundles, visual regression, HAR/trace/PDF export, network mocking & blocking, device emulation, encrypted auth vault + state save/restore, structured extraction, prompt injection scanning, action cache for long-term self-healing, multi-tab/frame management, rich diagnostics (`debug_bundle` etc.), and more.
- Resources that actually query the daemon for live context (`gsd-browser://latest-snapshot`, `current-state`, `active-recordings`, `timeline`, etc.).
- Executable prompts encoding best-practice multi-step workflows (`robust_login_flow`, `full_page_audit`, `autonomous_research_task`, `evidence_creation_workflow`, `debug_stuck_agent_flow`, etc.).
- Standardized high-value response envelopes on every tool call: `summary`, `structured_data`, `suggested_next_actions`, `evidence_refs`.
- Seamless reuse of the proven daemon client (auto-start, named sessions for isolation + persistent cache/state, robust error handling).
This is designed to be the high-end browser backend for serious agent platforms.
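To make the response envelope concrete, here is a minimal Python sketch that reads the four documented fields into a small dataclass. Only the field names come from this README; the field types, the `ToolEnvelope` and `parse_envelope` names, and the sample payload are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolEnvelope:
    """Hypothetical container for the four documented envelope fields."""
    summary: str = ""
    structured_data: dict[str, Any] = field(default_factory=dict)
    suggested_next_actions: list[str] = field(default_factory=list)
    evidence_refs: list[str] = field(default_factory=list)


def parse_envelope(payload: dict[str, Any]) -> ToolEnvelope:
    """Build an envelope, tolerating missing keys and ignoring unknown ones."""
    return ToolEnvelope(
        summary=str(payload.get("summary", "")),
        structured_data=dict(payload.get("structured_data") or {}),
        suggested_next_actions=list(payload.get("suggested_next_actions") or []),
        evidence_refs=list(payload.get("evidence_refs") or []),
    )


# Sample payload invented for illustration; real tool output may differ.
sample = {
    "summary": "Clicked @v1:e1",
    "structured_data": {"url": "https://www.iana.org/"},
    "suggested_next_actions": ["browser_view"],
    "extra": "ignored",
}
envelope = parse_envelope(sample)
```

Reading the envelope defensively like this lets an agent keep working if a tool omits an optional field such as `evidence_refs`.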
Get started in seconds:

    gsd-browser mcp

Point Cursor, Claude Desktop, VS Code + Copilot, or any MCP client at it.
Host it for remote/cloud clients:

    export GSD_BROWSER_MCP_AUTH_TOKEN="$(openssl rand -hex 32)"
    gsd-browser mcp --http --host 0.0.0.0 --port 8788

Expose /mcp over HTTPS and send Authorization: Bearer <token> from the remote MCP client.
To use OpenGSD console tokens and usage tallying:

    export GSD_BROWSER_MCP_AUTH_VERIFY_URL="https://mcp.opengsd.dev/api/mcp/tokens/verify"
    gsd-browser mcp --http --host 0.0.0.0 --port 8788

Each tools/call request is validated against the console, counted against the user's free quota, and rejected when the console returns a throttle response.
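For a remote client, the request that gets validated might look like the following Python sketch. It builds, but does not send, a JSON-RPC 2.0 `tools/call` request with the bearer token. The host name, token value, and empty arguments object are placeholders, `build_tools_call` is a hypothetical helper, and a real MCP client would first complete the protocol's `initialize` handshake.

```python
import json
import urllib.request


def build_tools_call(
    base_url: str,
    token: str,
    tool: str,
    arguments: dict,
    request_id: int = 1,
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 tools/call request for the /mcp endpoint."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mcp",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Streamable HTTP clients accept both plain JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
        },
    )


# Placeholder host and token; browser_view is one of the tool names listed in this README.
request = build_tools_call("https://browser.example.com", "example-token", "browser_view", {})
```

In practice an MCP client library handles this framing; the sketch only shows where the token travels.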
Tailored setup + config snippets:

    ./scripts/mcp-quickstart.sh cursor   # claude | vscode | generic

Key documentation:
- docs/mcp.md: Full capabilities, architecture, client configs, quickstart script.
- docs/AGENT-BEST-PRACTICES.md: Golden rules, workflow patterns, "When to Use What" table, self-healing, response envelopes, prompt/resource usage (essential reading for agents).
- docs/examples/mcp-client-config.json: Ready-to-paste example.
- Root SKILL.md and the `gsd-browser-skill/` pack: Complete underlying command semantics and curated workflows (the MCP tools are a direct mapping).
Run gsd-browser mcp to make this surface available to your agent.

    npm install -g @opengsd/gsd-browser

Download from GitHub Releases:
| Platform | Asset |
|---|---|
| macOS (Apple Silicon) | gsd-browser-darwin-arm64 |
| macOS (Intel) | gsd-browser-darwin-x64 |
| Linux (ARM64) | gsd-browser-linux-arm64 |
| Linux (x64) | gsd-browser-linux-x64 |
| Windows (x64) | gsd-browser-windows-x64.exe |

    git clone https://github.com/open-gsd/gsd-browser.git
    cd gsd-browser
    cargo install --path cli

Install the CLI and register the Codex Plugin in one pass:

    curl -fsSL https://raw.githubusercontent.com/open-gsd/gsd-browser/main/install.sh | bash -s -- --codex-plugin

The installer writes the plugin to ~/plugins/gsd-browser, updates the personal Codex marketplace at ~/.agents/plugins/marketplace.json, and runs codex plugin add gsd-browser@<marketplace> when the Codex CLI is available. Without --codex-plugin, the interactive installer offers OpenAI Codex Plugin alongside the agent skill options.
The crates.io package (gsd-browser) is not published yet. Use GitHub release assets or a source build.
The one-line installer (curl -fsSL https://raw.githubusercontent.com/open-gsd/gsd-browser/main/install.sh | bash) also sets up the gsd-browser-skill/ pack for coding agents and documents the MCP path in its header.
The daemon starts automatically on first use.

    # Navigate to a page
    gsd-browser navigate https://example.com

    # Snapshot interactive elements and assign refs like @v1:e1
    gsd-browser snapshot

    # On example.com the only interactive element is the "More information..." link
    gsd-browser click-ref @v1:e1

    # Wait for navigation and assert the result
    gsd-browser wait-for --condition network_idle
    gsd-browser assert --checks '[{"kind":"url_contains","text":"iana.org"}]'

    # Capture a PNG
    gsd-browser screenshot --output page.png --format png

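When a script drives these steps rather than a shell, it can help to build each command as an argument list so the `--checks` JSON never needs manual quoting. The following Python sketch only constructs and prints the commands; `assert_command` and `quick_start_plan` are hypothetical helper names, and the commands and flags are the ones shown in the quick start.

```python
import json
import shlex


def assert_command(checks: list[dict]) -> list[str]:
    """Build argv for `gsd-browser assert`, serializing checks into the --checks JSON string."""
    return ["gsd-browser", "assert", "--checks", json.dumps(checks, separators=(",", ":"))]


def quick_start_plan(url: str, ref: str, expected_url_part: str) -> list[list[str]]:
    """Return the quick-start steps as argv lists, in execution order."""
    return [
        ["gsd-browser", "navigate", url],
        ["gsd-browser", "snapshot"],
        ["gsd-browser", "click-ref", ref],
        ["gsd-browser", "wait-for", "--condition", "network_idle"],
        assert_command([{"kind": "url_contains", "text": expected_url_part}]),
        ["gsd-browser", "screenshot", "--output", "page.png", "--format", "png"],
    ]


plan = quick_start_plan("https://example.com", "@v1:e1", "iana.org")
for argv in plan:
    # To execute for real, pass each argv to subprocess.run(argv, check=True).
    print(shlex.join(argv))
```

Passing argument lists to `subprocess.run` avoids shell quoting problems with the JSON payload.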
For the modern agent experience, prefer the MCP server (see top of this README).
gsd-browser view starts an authenticated localhost workbench for the active session. The URL is bound to the session, viewer id, loopback origin, expiry, and viewer capabilities. Use view --print-only when another tool needs the URL.

    gsd-browser view
    gsd-browser view --print-only
    gsd-browser control-state
    gsd-browser takeover
    gsd-browser release-control
    gsd-browser sensitive-on
    gsd-browser sensitive-off

The viewer streams the real Chrome page, forwards pointer, wheel, keyboard, text, and paste input while in Control mode, creates annotations in Annotate mode, and starts/stops local recording bundles in Record mode. Sensitive mode keeps local human control available while cloud frame capture and evidence surfaces use redaction policy.
Annotations and recordings stay local to the daemon state directory:

    gsd-browser annotations
    gsd-browser annotation-get <id>
    gsd-browser annotation-clear <id>
    gsd-browser annotation-resolve <id>
    gsd-browser annotation-export --output annotations.json

    gsd-browser record-start --name checkout-bug
    gsd-browser record-stop
    gsd-browser recordings
    gsd-browser recording-get <id>
    gsd-browser recording-export <id> --output <path>
    gsd-browser recording-discard <id>
    gsd-browser recording-validate <id-or-path> --json

(MCP equivalents: browser_view, browser_annotation_request, browser_record_start, etc. — among the highest-leverage features for collaborative agent work.)
gsd-browser currently exposes 90+ top-level commands (the MCP server exposes the most valuable subset as 50+ discoverable tools with agent-optimized descriptions and envelopes):
| Area | Commands |
|---|---|
| Navigation | navigate, back, forward, reload |
| Logs & JavaScript | console, network, dialog, eval |
| Interaction | click, type, press, hover, scroll, select-option, set-checked, drag, set-viewport, upload-file |
| Inspection | accessibility-tree, find, page-source |
| Waits | wait-for |
| Snapshots & refs | snapshot, get-ref, click-ref, hover-ref, fill-ref |
| Assertions & batching | assert, diff, batch |
| Pages & frames | list-pages, switch-page, close-page, list-frames, select-frame |
| Forms & semantic actions | analyze-form, fill-form, find-best, act |
| Live workbench | goal, view, control-state, takeover, release-control, pause, resume, step, abort, sensitive-on, sensitive-off |
| Annotations | annotations, annotation-get, annotation-clear, annotation-resolve, annotation-export, annotation-request |
| Recording bundles | record-start, record-stop, record-pause, record-resume, recordings, recording-get, recording-export, recording-discard, recording-validate |
| Diagnostics | timeline, session-summary, debug-bundle |
| Screenshots & document output | screenshot, zoom-region, save-pdf |
| Visual regression | visual-diff |
| Structured extraction | extract |
| Network control | mock-route, block-urls, clear-routes |
| Device & browser state | emulate-device, save-state, restore-state |
| Auth vault | vault-save, vault-login, vault-list |
| Recording & traces | generate-test, har-export, trace-start, trace-stop |
| Safety, caching & daemon management | action-cache, check-injection, daemon |
| MCP, cloud & updates | mcp, cloud-methods, update |
- Persistent daemon with automatic startup for fast repeated commands
- Durable named sessions with explicit health reporting and no silent session replacement
- Versioned refs from `snapshot` for deterministic interaction (`@v1:e1`, `@v2:e3`)
- Explicit assertions with `assert` and multi-step automation with `batch`
- Shared inspection semantics across `snapshot`, `find`, `wait-for`, `assert`, and ref-driven actions
- Semantic `find-best` and `act` flows covering 15 built-in intents
- Named sessions via `--session` for isolated parallel browser workers
- Authenticated local viewer with human takeover, pause/step/abort, annotations, sensitive mode, and bounded recording bundles
- Structured JSON output on every command via `--json`
- Visual diffing, HAR export, PDF generation, and CDP tracing in the same tool
- Saved browser state plus encrypted credential replay through the auth vault
- Prompt injection scanning for agent-facing browsing workflows
- Action cache for self-healing intent mappings across sessions (especially powerful with named MCP sessions)
- Full MCP server with resources, prompts, and agent-optimized envelopes
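A client that stores versioned refs may want to check them before reuse. The Python sketch below parses the `@v1:e1` form and flags refs from an older snapshot. The pattern is inferred only from the two examples in this README, and the staleness rule (an older version means the ref should be refreshed) is an assumption, not documented behavior.

```python
import re
from typing import NamedTuple

# Pattern inferred from the examples @v1:e1 and @v2:e3; the real grammar may be broader.
REF_PATTERN = re.compile(r"^@v(\d+):e(\d+)$")


class Ref(NamedTuple):
    version: int  # snapshot version the ref belongs to
    element: int  # element index within that snapshot


def parse_ref(text: str) -> Ref:
    """Split a ref such as '@v1:e1' into its version and element numbers."""
    match = REF_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a versioned ref: {text!r}")
    return Ref(int(match.group(1)), int(match.group(2)))


def is_stale(ref: Ref, latest_version: int) -> bool:
    """Assumed rule: a ref from an older snapshot version should be refreshed."""
    return ref.version < latest_version
```

For example, `is_stale(parse_ref("@v1:e1"), latest_version=2)` would signal that a new `snapshot` is needed before acting on the ref.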
gsd-browser merges configuration in this order:
- Built-in defaults
- User config: `~/.gsd-browser/config.toml`
- Project config: `./gsd-browser.toml`
- Environment variables: `GSD_BROWSER_*`
- CLI flags
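The merge order can be pictured as layered dictionaries. The Python sketch below assumes that later sources override earlier ones field by field, the usual meaning of such an order, though the actual merge granularity is not stated here. All values are illustrative and `merge_layers` is a hypothetical helper.

```python
from typing import Any

Config = dict[str, dict[str, Any]]


def merge_layers(*layers: Config) -> Config:
    """Merge section/field dictionaries; later layers override earlier ones per field."""
    merged: Config = {}
    for layer in layers:
        for section, fields in layer.items():
            merged.setdefault(section, {}).update(fields)
    return merged


# Values below are illustrative; only the layer order comes from the list above.
defaults = {"browser": {"headless": True}, "screenshot": {"quality": 80, "format": "png"}}
user_config = {"screenshot": {"quality": 90}}
project_config = {"browser": {"path": "/usr/bin/chromium"}}
env_overrides = {"daemon": {"port": 9333}}
cli_flags = {"browser": {"headless": False}}

effective = merge_layers(defaults, user_config, project_config, env_overrides, cli_flags)
```

Under this assumption a CLI flag wins over every other source, and a project file only overrides the fields it actually sets.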
Example gsd-browser.toml:

    [browser]
    path = "/usr/bin/chromium"
    cdp_url = "http://localhost:9222"  # attach to existing Chrome instead of launching
    headless = true

    [daemon]
    port = 9222
    host = "127.0.0.1"

    [screenshot]
    quality = 90
    format = "png"
    full_page = false

    [settle]
    timeout_ms = 500
    poll_ms = 40
    quiet_window_ms = 100

    [logs]
    max_buffer_size = 1000

    [artifacts]
    dir = "./browser-artifacts"

    [timeline]
    enabled = true
    max_entries = 500

Supported environment variable overrides use GSD_BROWSER_<SECTION>_<FIELD> naming:

    export GSD_BROWSER_BROWSER_PATH=/usr/bin/chromium
    export GSD_BROWSER_BROWSER_CDP_URL=http://localhost:9222
    export GSD_BROWSER_BROWSER_HEADLESS=true
    export GSD_BROWSER_DAEMON_PORT=9333
    export GSD_BROWSER_SCREENSHOT_QUALITY=90
    export GSD_BROWSER_SETTLE_TIMEOUT_MS=1000
    export GSD_BROWSER_ARTIFACTS_DIR=./browser-artifacts
    export GSD_BROWSER_VAULT_KEY=your-encryption-key

For MCP usage, place the relevant GSD_BROWSER_* variables in your MCP client's server env configuration.
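The naming scheme can be read back into sections and fields roughly as follows. This Python sketch assumes the section is the first word after the `GSD_BROWSER_` prefix and that everything after it is the field name. It keeps values as strings, since any type coercion the tool performs is not described here, and `env_to_config` is a hypothetical helper.

```python
PREFIX = "GSD_BROWSER_"


def env_to_config(environ: dict[str, str]) -> dict[str, dict[str, str]]:
    """Group GSD_BROWSER_<SECTION>_<FIELD> variables into {section: {field: value}}."""
    config: dict[str, dict[str, str]] = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        # Assumption: the section is the first word after the prefix, the rest is the field.
        section, sep, field = name[len(PREFIX):].partition("_")
        if not sep or not field:
            continue  # no field part, so the name does not fit the scheme
        config.setdefault(section.lower(), {})[field.lower()] = value
    return config


parsed = env_to_config({
    "GSD_BROWSER_BROWSER_CDP_URL": "http://localhost:9222",
    "GSD_BROWSER_DAEMON_PORT": "9333",
    "GSD_BROWSER_SETTLE_TIMEOUT_MS": "1000",
    "PATH": "/usr/bin",
})
```

The same mapping explains why the executable path is spelled `GSD_BROWSER_BROWSER_PATH`: the second `BROWSER` is the `[browser]` section.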
gsd-browser defaults to the stable chromiumoxide CDP client for maximum compatibility.

    gsd-browser --stealth navigate https://bot.sannysoft.com
    # or
    gsd-browser --backend stealth navigate ...

    # config.toml
    [browser]
    stealth = true
    backend = "chaser-oxide"  # or "stealth", "chromey"

Effects when enabled:
- Anti-detection Chrome flags (`--disable-blink-features=AutomationControlled`, IsolateOrigins, etc.)
- Realistic UA + hardware (cores, memory, platform) spoofing
- CDP signal patches (webdriver, cdc_ markers, chrome object, permissions, WebGL)
- Client Hints and locale/language consistency
- (Future) human-like mouse curves via input_dispatch when chaser-oxide backend active
The following require explicit cargo features (the published binary always ships the stable default):
- `chromiumoxide-backend` (default): current stable
- `chromey-backend`: fresher CDP definitions, adblock, fingerprint crate (drop-in, same `use chromiumoxide`)
- `chaser-backend` / `stealth` feature: protocol-level stealth, `ChaserPage` human input (compile with `--features stealth`)
- `ferrous-backend`: ergonomic locator/wait API (launch path experimental)
To build with an alternative:

    cargo install --path cli --no-default-features --features chromey-backend
    # or for full stealth
    cargo install --path cli --no-default-features --features stealth

See also the audit and superpowers plans for the "dependency/stealth refresh" item.
Trade-off: stealth backends may lag the main chromiumoxide feature surface or have different perf characteristics. The daemon handlers/refs/viewer remain unchanged regardless of backend.
- The CLI parses commands and sends them to a local daemon over a loopback HTTP channel.
- The daemon maintains the browser lifecycle, page/frame routing, network hooks, action timeline, and session manifest state.
- `--session <name>` creates isolated daemon and browser instances for parallel workflows.
- The MCP stdio server (`gsd-browser mcp`) is a thin, high-fidelity adapter over the exact same daemon client used by the CLI.
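Because each named session gets its own daemon and browser, independent jobs can run side by side. The Python sketch below fans commands out over a thread pool with an injectable runner. Placing `--session` before the subcommand is an assumption, and `session_argv`, `run_in_sessions`, and `real_runner` are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess
from typing import Callable, Sequence


def session_argv(session: str, *command: str) -> list[str]:
    """Prefix a command with --session; placing it before the subcommand is an assumption."""
    return ["gsd-browser", "--session", session, *command]


def run_in_sessions(
    jobs: dict[str, Sequence[str]],
    runner: Callable[[list[str]], int],
) -> dict[str, int]:
    """Run one command per named session concurrently and collect the runner's results."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {
            name: pool.submit(runner, session_argv(name, *command))
            for name, command in jobs.items()
        }
        return {name: future.result() for name, future in futures.items()}


def real_runner(argv: list[str]) -> int:
    """Execute the CLI and return its exit code (requires gsd-browser on PATH)."""
    return subprocess.run(argv, check=False).returncode
```

A call such as `run_in_sessions({"worker-1": ["navigate", "https://example.com"], "worker-2": ["snapshot"]}, real_runner)` would then return one exit code per session.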
Recommended 2026+ path: Connect via the MCP server (gsd-browser mcp). It gives you automatic discovery of 50+ tools, resources, and prompts with rich envelopes and best-practice guidance. See the dedicated sections at the top of this README, plus docs/mcp.md and especially docs/AGENT-BEST-PRACTICES.md.
When using the CLI directly (or for reference):
- The daemon auto-starts. You almost never need `gsd-browser daemon start`.
- `gsd-browser daemon health` reports the current session state and does not auto-start the daemon.
- Use `--json` when you need structured output.
- Prefer `snapshot` then `click-ref` or `fill-ref` for stable interaction, and re-snapshot after page changes. (MCP: read the `latest-snapshot` resource.)
- Use `assert` and `batch` when you need deterministic pass/fail automation.
- `find-best` and `act` cover 15 built-in semantic intents for common navigation, form, dialog, auth, and pagination actions.
- The live viewer + annotations + recordings + human takeover are first-class superpowers for collaborative or auditable work.
- Read SKILL.md for the full command reference and workflow patterns (this is the source of truth for MCP tool semantics).
- Install the curated `gsd-browser-skill/` pack (via the main installer) for coding agents.
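For direct CLI use from an agent harness, a thin wrapper around `--json` keeps parsing in one place. The Python sketch below appends `--json` and decodes whatever JSON document the command prints. The output schema is not assumed, and the fake runner's output is invented purely so the example runs without the CLI installed.

```python
import json
import subprocess
from typing import Any, Callable


def cli_runner(argv: list[str]) -> str:
    """Run the CLI and return stdout; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def run_json(command: list[str], runner: Callable[[list[str]], str] = cli_runner) -> Any:
    """Append --json to a command and decode whatever JSON document it prints."""
    return json.loads(runner(["gsd-browser", *command, "--json"]))


def fake_runner(argv: list[str]) -> str:
    # Stand-in for the real CLI; this output shape is invented for the example.
    return json.dumps({"argv": argv})


result = run_json(["snapshot"], runner=fake_runner)
```

Swapping `fake_runner` for the default `cli_runner` runs the real binary, which is why the runner is a parameter rather than hard-coded.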
Licensed under either of:
at your option.