Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

code2ahm/crawlscope

Repository files navigation

CrawlScope logo

CrawlScope

See it. Fix it. Ship it.

Free, instant website auditing — SEO · Performance · Accessibility · Technical Health · Core Web Vitals


CrawlScope


CrawlScope screenshot


What is CrawlScope?

CrawlScope is a lightweight, opinionated website auditing tool. Enter a URL — get a prioritised, actionable audit report in under 30 seconds. No accounts, no database, no subscriptions, no saved history. Just scan and fix.

It runs a real Lighthouse audit inside a headless Chromium browser via Puppeteer, combines it with deep HTML analysis via Cheerio, and surfaces findings across 5 categories with severity-ranked fixes.


Features

  • 50+ checks across SEO, Performance, Accessibility, Technical Health, and Content Structure
  • Real Lighthouse scores — Performance, SEO, Accessibility, Best Practices
  • Core Web Vitals — LCP, CLS, INP, TTFB, FCP with thresholds and recommendations
  • Priority Fixes — every issue includes why it matters, how to fix it, and expected impact
  • Desktop & mobile screenshots captured during the live scan
  • Page metadata — title, H1, description, word count, technologies detected
  • Export to Markdown, HTML, PDF (print), and JSON
  • Zero auth · Zero database · Zero tracking

Tech Stack

Frontend

Technology Version Purpose
Next.js ^15.0.0 App Router, SSR, API Routes
React ^18.3.1 UI framework
TypeScript ^5.5.4 Type safety throughout
Tailwind CSS ^3.4.10 Utility-first styling
Framer Motion ^11.3.0 Animations & transitions
Lucide React ^0.460.0 Icon library

Backend / Scanning

Technology Version Purpose
Lighthouse ^12.2.0 Performance, SEO, A11y, Best Practices scoring
Puppeteer ^25.1.0 Headless Chromium — screenshots + page load
Cheerio ^1.0.0 Fast HTML parsing for SEO & content checks

UI Primitives

Technology Purpose
@radix-ui/react-tabs Accessible tab components
@radix-ui/react-progress Progress indicators
@radix-ui/react-tooltip Tooltip primitives
clsx + tailwind-merge Conditional class merging

Project Structure

crawlscope/
├── src/
│ ├── app/
│ │ ├── api/
│ │ │ └── scan/
│ │ │ └── route.ts # POST /api/scan — runs Lighthouse + Puppeteer + Cheerio
│ │ ├── features/
│ │ │ └── page.tsx # /features page
│ │ ├── docs/
│ │ │ └── page.tsx # /docs page
│ │ ├── api-page/
│ │ │ └── page.tsx # /api-page (API reference)
│ │ ├── privacy/
│ │ │ └── page.tsx # /privacy page
│ │ ├── globals.css
│ │ ├── layout.tsx # Root layout with Geist font
│ │ └── page.tsx # Landing page + scan orchestration
│ ├── components/
│ │ ├── shared/
│ │ │ ├── Nav.tsx # Sticky nav with active link state
│ │ │ └── Footer.tsx # Shared footer
│ │ ├── LandingPage.tsx # Hero, URL input, feature cards, mock dashboard
│ │ ├── ScanLoader.tsx # 12-step animated progress screen
│ │ ├── ResultsDashboard.tsx # Full audit report with exports
│ │ ├── ScoreRing.tsx # Animated SVG score ring
│ │ ├── VitalsSection.tsx # Core Web Vitals cards
│ │ ├── AuditTabs.tsx # SEO / Perf / A11y / Technical / Content tabs
│ │ ├── PriorityFixCard.tsx # Expandable fix card with why/fix/impact
│ │ ├── CheckRow.tsx # Pass / warn / fail check row
│ │ └── ScreenshotsSection.tsx # Desktop / mobile screenshot toggle
│ ├── lib/
│ │ ├── scanner.ts # Core scan engine (Puppeteer + Lighthouse + Cheerio)
│ │ └── utils.ts # cn(), scoreColor(), vitalStatusClasses(), etc.
│ └── types/
│ └── audit.ts # All TypeScript interfaces for AuditReport
├── public/
├── next.config.mjs
├── tailwind.config.ts
├── tsconfig.json
└── package.json

Getting Started

Prerequisites

  • Node.js 18.x or higher
  • npm / yarn / pnpm

Installation

# Clone the repo
git clone https://github.com/code2ahm/crawlscope.git
cd crawlscope
# Install dependencies
npm install
# Puppeteer downloads Chromium automatically on install (~170 MB)
# If it fails, run: npx puppeteer browsers install chrome

Development

npm run dev

Open http://localhost:3000.

Production Build

npm run build
npm run start

How a Scan Works

User submits URL
 │
 ▼
POST /api/scan
 │
 ├── 1. Puppeteer launches headless Chromium
 │ • Navigates to URL (networkidle2)
 │ • Captures desktop screenshot (1280px)
 │ • Captures mobile screenshot (390px)
 │ • Measures load time + page size
 │
 ├── 2. Lighthouse runs on the open Chrome port
 │ • Performance score
 │ • SEO score
 │ • Accessibility score
 │ • Best Practices score
 │ • Core Web Vitals (LCP, CLS, TBT→INP, TTFB, FCP)
 │ • 20+ individual audits (render-blocking, WebP, etc.)
 │
 ├── 3. Cheerio parses the raw HTML
 │ • Title, meta description, H1–H6 structure
 │ • Canonical, OG tags, JSON-LD, lang attribute
 │ • Image alt text coverage
 │ • Internal link count
 │ • Form label coverage
 │ • Technology fingerprinting
 │
 └── 4. Results assembled into AuditReport
 • Overall score (weighted average)
 • Priority fixes ranked by severity
 • 50+ pass/warn/fail checks across 5 categories
 • Returned as JSON to the client

API Reference

The scan endpoint is a standard Next.js Route Handler.

POST /api/scan

Request body

{
 "url": "https://example.com"
}

Response — success

{
 "success": true,
 "report": {
 "url": "https://example.com",
 "domain": "example.com",
 "scannedAt": "2025年06月07日T10:30:00.000Z",
 "scanDuration": 18420,
 "overall": 84,
 "lighthouse": {
 "performance": 71,
 "seo": 92,
 "accessibility": 68,
 "bestPractices": 87
 },
 "vitals": { "lcp": {...}, "cls": {...}, "inp": {...}, "ttfb": {...}, "fcp": {...} },
 "stats": { "critical": 2, "warnings": 4, "passed": 31, "total": 37 },
 "priorityFixes": [...],
 "seoChecks": [...],
 "performanceChecks": [...],
 "accessibilityChecks": [...],
 "technicalChecks": [...],
 "contentChecks": [...],
 "screenshots": { "desktop": "data:image/jpeg;base64,...", "mobile": "data:image/jpeg;base64,..." },
 "meta": {
 "title": "Example Domain",
 "description": "...",
 "h1": "Example Domain",
 "wordCount": 312,
 "loadTime": 1840,
 "pageSize": 48,
 "statusCode": 200,
 "technologies": ["Next.js", "Google Analytics"]
 }
 }
}

Response — error

{
 "success": false,
 "error": "Invalid URL format"
}

Status codes

Code Meaning
200 Scan completed successfully
400 Missing or invalid URL
500 Scan failed (timeout, unreachable host, etc.)

Note: Scans typically take 15–45 seconds depending on page complexity and server speed. The route handler timeout is set to 120 seconds (maxDuration = 120).


Export Formats

From the results dashboard, every report can be exported as:

Format Contents
Markdown Full report — scores table, all checks by category, priority fixes, vitals
HTML Standalone styled page, no dependencies, renders in any browser
PDF Opens print dialog in a new tab, formatted for A4 via @page CSS
JSON Raw AuditReport object, pretty-printed — useful for integrations

Deployment

Vercel (recommended)

npm i -g vercel
vercel

Set the function timeout in vercel.json for the scan route:

{
 "functions": {
 "src/app/api/scan/route.ts": {
 "maxDuration": 120
 }
 }
}

Important: Puppeteer requires a Chromium binary. On Vercel, use @sparticuz/chromium with puppeteer-core instead of the full puppeteer package to stay within the serverless function size limit.

Self-hosted / VPS

Works out of the box with standard puppeteer as long as the server has:

  • Node.js 18+
  • Chrome dependencies: chromium-browser or equivalent system packages
# Ubuntu / Debian — install Chrome dependencies
sudo apt-get install -y \
 libgbm-dev libnss3 libatk-bridge2.0-0 libdrm2 \
 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 \
 libgbm1 libasound2
npm run build && npm start

Docker

FROM node:18-slim
# Install Chromium dependencies
RUN apt-get update && apt-get install -y \
 chromium libgbm-dev libnss3 libatk-bridge2.0-0 \
 libdrm2 libxcomposite1 libxdamage1 libxfixes3 \
 libxrandr2 libasound2 --no-install-recommends \
 && rm -rf /var/lib/apt/lists/*
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium
WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]

Environment Variables

No environment variables are required to run CrawlScope. All scanning happens server-side at request time.

Optional variables for future extension:

# .env.local (optional)
NEXT_PUBLIC_SITE_URL=https://crawlscope.vercel.app

Audit Categories

Category Checks Scoring weight
SEO 12 checks — title, meta desc, H1, canonical, JSON-LD, OG tags, lang, robots, internal links, HTTPS 30%
Performance 11 checks — render-blocking, WebP, compression, caching, minification, unused JS/CSS 30%
Accessibility 12 checks — alt text, contrast, ARIA, button names, form labels, skip nav 20%
Best Practices Lighthouse native best-practices score 20%
Technical 9 checks — HTTPS, status code, viewport, deprecated HTML, vulnerable libraries Info
Content 6 checks — title/desc length, heading order, word count, internal linking Info

Overall score = Performance ×ばつ 0.3 + SEO ×ばつ 0.3 + Accessibility ×ばつ 0.2 + BestPractices ×ばつ 0.2


Contributing

Pull requests are welcome. For major changes, open an issue first.

# Fork and clone
git clone https://github.com/your-username/crawlscope.git
cd crawlscope
# Create a feature branch
git checkout -b feat/your-feature
# Make changes, then
npm run lint
npm run build
# Push and open a PR
git push origin feat/your-feature

Roadmap

  • Vercel-compatible deployment with @sparticuz/chromium
  • Batch URL scanning
  • Shareable report URLs (short-lived, no database)
  • CLI mode (npx crawlscope https://example.com)
  • GitHub Action for CI auditing
  • Score trend comparison (re-scan diff)

About

Free website SEO, performance & accessibility audit tool powered by Lighthouse. No signup required.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

Contributors

AltStyle によって変換されたページ (->オリジナル) /