

# Decodo Scraper OpenClaw Skill

## Overview

This OpenClaw skill integrates Decodo's Web Scraping API into any OpenClaw-compatible AI agent or LLM pipeline. It exposes the following tools, which agents can call directly:

| Tool | Description | Perfect for |
| --- | --- | --- |
| `google_search` | Real-time Google Search (SERP) results as structured JSON. | Market research, competitor analysis, news monitoring, fact-checking, RAG pipelines. |
| `universal` | Scrape and parse any public webpage into clean Markdown. | Summarizing articles, content aggregation, building custom datasets, general web browsing for AI agents. |
| `amazon` | Parse Amazon product page data (price, reviews, specs, ASIN). | eCommerce monitoring, price tracking, competitive intelligence, product research. |
| `amazon_search` | Search Amazon for products by keyword and get parsed results. | Discovering products, tracking trends, broad market analysis. |
| `youtube_subtitles` | Extract subtitles/transcripts from YouTube videos (by video ID). | Video summarization, content analysis, sentiment tracking, accessibility. |
| `reddit_post` | Fetch a Reddit post's content, comments, and metadata (by post URL). | Social listening, community sentiment analysis, trend tracking, gathering user feedback. |
| `reddit_subreddit` | Scrape Reddit subreddit listings (by subreddit URL). | Monitoring specific communities, content discovery, niche market research. |

Backed by Decodo's residential and datacenter proxy infrastructure, the skill handles JavaScript rendering, bot detection bypass, and geo-targeting out of the box.

## Why use Decodo for your OpenClaw agent?

- **Zero blocks & CAPTCHAs.** Backed by Decodo's proxy infrastructure spanning 125M+ locations, the skill automatically handles JavaScript rendering, bot detection, and CAPTCHA bypass.
- **Real-time data.** Access fresh, up-to-the-minute web data directly within your AI agent's workflow.
- **LLM-optimized output.** Data is returned as structured JSON or clean Markdown, making it easy for LLMs to understand and process.
- **Scalability.** Designed for high-volume data collection, so your agent can scale from small tasks to complex projects.
- **Minimal friction.** Easy setup with a single authentication token.

## Features

- Real-time Google Search results scraping
- Universal URL scraping
- Amazon product page parsing (by URL)
- Amazon search (by query)
- YouTube subtitles/transcript by video ID
- Reddit post content by URL
- Reddit subreddit listing by URL
- Structured JSON or Markdown results
- Simple CLI interface compatible with any OpenClaw agent runtime
- Designed for scalable AI agent web scraping
- Minimal dependencies: just Python with Requests
- Authentication via a single Base64 token from the Decodo dashboard

## Prerequisites

- Python 3 with the Requests library
- A Decodo Web Scraping API auth token (a single Base64 token from the Decodo dashboard)

## Setup

1. Clone this repo:

   ```bash
   git clone https://github.com/Decodo/decodo-openclaw-skill.git
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set your Decodo auth token as an environment variable (or create a `.env` file in the project root):

   ```bash
   # Linux/macOS terminal
   export DECODO_AUTH_TOKEN="your_base64_token"
   ```

   ```powershell
   # Windows (PowerShell)
   $env:DECODO_AUTH_TOKEN="your_base64_token"
   ```

   ```ini
   # .env file
   DECODO_AUTH_TOKEN=your_base64_token
   ```
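If you are wiring the token into your own Python tooling around this skill, a minimal sketch of reading it might look like the following. This helper is not part of the repo; it only assumes the `DECODO_AUTH_TOKEN` variable name used above:

```python
import os


def get_auth_token() -> str:
    """Read the Decodo auth token from the environment, failing early if absent."""
    token = os.environ.get("DECODO_AUTH_TOKEN")
    if not token:
        raise RuntimeError(
            "DECODO_AUTH_TOKEN is not set - export it or add it to a .env file"
        )
    return token
```

Failing early with a clear message beats letting an unauthenticated request surface a cryptic API error later.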

## OpenClaw agent integration

This skill ships with a `SKILL.md` file that defines all tools in the OpenClaw skill format. OpenClaw-compatible agents automatically discover and invoke the tools from this file without additional configuration.

To register the skill with your OpenClaw agent, point it at the repo root: the agent will read `SKILL.md` and expose `google_search`, `universal`, `amazon`, `amazon_search`, `youtube_subtitles`, `reddit_post`, and `reddit_subreddit` as callable tools.

## Usage

### Google Search

Search Google and receive structured JSON. Results are grouped by type: `organic` (main results), `ai_overviews` (AI-generated summaries), `paid` (ads), `related_questions`, `related_searches`, `discussions_and_forums`, and others depending on the query.

```bash
python3 tools/scrape.py --target google_search --query "your query"
```
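An agent will usually want only one group from the response. As an illustrative sketch (the exact schema is defined by Decodo's API docs; the `organic`, `title`, and `url` field names here are assumptions based on the grouping described above), the organic results could be filtered like this:

```python
def organic_results(serp: dict) -> list[tuple[str, str]]:
    """Return (title, url) pairs for the organic results in a grouped SERP dict.

    Assumes the hypothetical shape {"organic": [{"title": ..., "url": ...}, ...]}.
    """
    return [
        (item.get("title", ""), item.get("url", ""))
        for item in serp.get("organic", [])
    ]
```

Keeping ads (`paid`) and AI summaries (`ai_overviews`) out of a RAG pipeline's context is usually desirable, which is why filtering by group is a natural first step.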

### Scrape a URL

Fetch and convert any webpage to a clean Markdown file:

```bash
python3 tools/scrape.py --target universal --url "https://example.com/article"
```

### Amazon product page

Fetch parsed data (e.g., price, reviews, specs) from an Amazon product page. Use the product URL:

```bash
python3 tools/scrape.py --target amazon --url "https://www.amazon.com/dp/B09H74FXNW"
```
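If your agent receives Amazon URLs in varying shapes, it can help to normalize them to the ASIN before building the `/dp/` URL above. This helper is a sketch, not part of the skill; it covers the two most common URL patterns:

```python
import re
from typing import Optional


def extract_asin(url: str) -> Optional[str]:
    """Pull the 10-character ASIN out of /dp/ or /gp/product/ Amazon URLs."""
    match = re.search(r"/(?:dp|gp/product)/([A-Z0-9]{10})", url)
    return match.group(1) if match else None
```

For example, `extract_asin("https://www.amazon.com/dp/B09H74FXNW")` yields `B09H74FXNW`, which can then be formatted back into a canonical product URL for the `amazon` tool.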

### Amazon search

Search Amazon and get parsed results (e.g., results list, delivery_postcode):

```bash
python3 tools/scrape.py --target amazon_search --query "laptop"
```

### YouTube subtitles

Fetch subtitles/transcript for a YouTube video (use the video ID, e.g., from `?v=VIDEO_ID`):

```bash
python3 tools/scrape.py --target youtube_subtitles --query "dFu9aKJoqGg"
```
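Since the tool takes a bare video ID rather than a URL, an agent that receives full YouTube links may want to normalize them first. A minimal sketch (not part of the skill) covering bare IDs, `watch?v=` URLs, and `youtu.be` short links:

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url_or_id: str) -> str:
    """Normalize a bare ID, watch URL, or youtu.be short link to a video ID."""
    if "/" not in url_or_id:
        return url_or_id  # already a bare ID like "dFu9aKJoqGg"
    parsed = urlparse(url_or_id)
    if parsed.hostname in ("youtu.be", "www.youtu.be"):
        return parsed.path.lstrip("/")
    # watch URLs carry the ID in the "v" query parameter
    return parse_qs(parsed.query).get("v", [""])[0]
```

The normalized ID is what gets passed to `--query` in the command above.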

### Reddit post

Fetch a Reddit post's content (use the full post URL):

```bash
python3 tools/scrape.py --target reddit_post --url "https://www.reddit.com/r/nba/comments/17jrqc5/serious_next_day_thread_postgame_discussion/"
```
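When logging or deduplicating scraped posts, it can be useful to break the post URL into its parts before calling the tool. This helper is a hypothetical sketch based on Reddit's standard `/r/<subreddit>/comments/<id>/<slug>/` URL layout, not code from this repo:

```python
import re
from typing import Optional


def parse_post_url(url: str) -> Optional[dict]:
    """Split a Reddit post URL into subreddit, post ID, and slug."""
    match = re.search(r"reddit\.com/r/([^/]+)/comments/([^/]+)/([^/]+)", url)
    if not match:
        return None
    subreddit, post_id, slug = match.groups()
    return {"subreddit": subreddit, "post_id": post_id, "slug": slug}
```

The post ID (`17jrqc5` in the example above) is stable even if the slug changes, so it makes a good deduplication key.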

### Reddit subreddit

Fetch a Reddit subreddit listing (use the subreddit URL):

```bash
python3 tools/scrape.py --target reddit_subreddit --url "https://www.reddit.com/r/nba/"
```

## Related resources

- Decodo Web Scraping API documentation
- OpenClaw documentation
- ClaWHub – OpenClaw skill registry

## License

All code is released under the MIT License.
