websource

Turn websites into reusable structured data sources — through conversation, not code.

websource is a local-first CLI tool that analyzes a URL, detects extractable fields (title, price, image, date...), and generates a reusable extraction config that runs on demand or on a schedule. All data stays on your machine in SQLite.

Quick start

git clone https://github.com/2fe2000/websource.git
cd websource
npm install
npx playwright install chromium
# Interactive setup wizard
npx tsx bin/websource.ts init https://books.toscrape.com

Commands

| Command | Description |
| --- | --- |
| `init [url]` | Guided setup for a new data source |
| `scan <url>` | Analyze a page without saving |
| `sources list` | List all sources |
| `sources show <id>` | Show source details |
| `preview <id>` | Dry-run extraction (no DB write) |
| `extract <id>` | Run extraction and save |
| `diff <id>` | Show changes since last run |
| `schedule <id> <expr>` | Set a cron refresh schedule |
| `serve` | Start local REST API + scheduler |
| `export <id>` | Export to JSON/CSV |
| `doctor` | Run health checks |
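
The `<expr>` argument of `schedule` is a cron expression. As a rough illustration of how such an expression breaks down, the sketch below splits a standard five-field expression into named parts. The helper `describeCron` is hypothetical and not part of websource; it assumes the classic five-field form, and the scheduler may accept other variants.

```typescript
// The five fields of a classic cron expression, in order.
const CRON_FIELDS = ["minute", "hour", "dayOfMonth", "month", "dayOfWeek"] as const;
type CronField = (typeof CRON_FIELDS)[number];

// Hypothetical helper: split an expression into named fields.
// It only checks the field count; it does not validate field values.
function describeCron(expr: string): Record<CronField, string> {
  const parts = expr.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected ${CRON_FIELDS.length} fields, got ${parts.length}`);
  }
  return Object.fromEntries(
    CRON_FIELDS.map((name, i) => [name, parts[i]]),
  ) as Record<CronField, string>;
}
```

For example, `describeCron("0 */6 * * *")` yields minute `0` and hour `*/6`, which reads as "at minute 0 of every sixth hour". When passing such an expression on the command line, quote it so the shell does not expand the `*` characters.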

Claude Code skill (optional)

If you use Claude Code, you can run the interactive wizard from any chat session with /websource or natural language like "scrape this URL".

bash scripts/install-skill.sh

Configuration

All config is optional. Copy .env.example to .env to override defaults:

| Variable | Default | Description |
| --- | --- | --- |
| `WEBSOURCE_DATA_DIR` | `~/.local/share/websource` | Database and log location |
| `WEBSOURCE_CONFIG_DIR` | `~/.config/websource` | Config file location |
| `LOG_LEVEL` | `warn` | `trace` / `debug` / `info` / `warn` / `error` |
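
To make the defaults concrete, here is a minimal sketch of how these variables could be resolved. The variable names and default values are taken from the table above. The `ResolvedConfig` shape, the `resolveConfig` and `expandHome` helpers, and the choice to reject unknown log levels are assumptions for illustration, not websource's actual implementation. It also assumes `.env` has already been loaded into the environment.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Log levels listed in the configuration table.
const LOG_LEVELS = ["trace", "debug", "info", "warn", "error"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// Hypothetical shape; the real internal config object may differ.
interface ResolvedConfig {
  dataDir: string;
  configDir: string;
  logLevel: LogLevel;
}

// Expand a leading "~" to the current user's home directory.
function expandHome(p: string): string {
  return p === "~" || p.startsWith("~/") ? path.join(os.homedir(), p.slice(1)) : p;
}

// Apply the documented defaults to any variable that is not set.
function resolveConfig(
  env: Record<string, string | undefined> = process.env,
): ResolvedConfig {
  const level = env.LOG_LEVEL ?? "warn";
  if (!(LOG_LEVELS as readonly string[]).includes(level)) {
    // Assumption: unknown levels are rejected rather than silently ignored.
    throw new Error(`Unsupported LOG_LEVEL: ${level}`);
  }
  return {
    dataDir: expandHome(env.WEBSOURCE_DATA_DIR ?? "~/.local/share/websource"),
    configDir: expandHome(env.WEBSOURCE_CONFIG_DIR ?? "~/.config/websource"),
    logLevel: level as LogLevel,
  };
}
```

Calling `resolveConfig({})` returns the defaults with `~` expanded, while `resolveConfig({ WEBSOURCE_DATA_DIR: "/tmp/ws" })` overrides only the data directory.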

Data storage

All extracted data is stored locally in a single SQLite database:

~/.local/share/websource/
├── websource.db ← all data
└── logs/ ← log files (production mode only)

| Table | Contents |
| --- | --- |
| `sources` | Source list (name, URL, status) |
| `extraction_configs` | Field selectors, fetchMode, and other settings |
| `runs` | Extraction run history (time, record counts, status) |
| `snapshots` | The actual extracted records |
| `diffs` | Added / changed / removed records between runs |
| `schedules` | Cron schedule settings |
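
The `diffs` table records added, changed, and removed records between runs. The sketch below shows one way such a comparison could be computed from two snapshots. It illustrates the idea only and is not websource's algorithm: the `Rec` and `SnapshotDiff` types, the `diffSnapshots` function, and the idea of matching records on a caller-chosen key field are all assumptions.

```typescript
// Hypothetical record shape: a flat map of field name to extracted value.
type Rec = Record<string, string | number | null>;

interface SnapshotDiff {
  added: Rec[];
  changed: { before: Rec; after: Rec }[];
  removed: Rec[];
}

// Two records are equal when every field present in either one matches.
function sameRecord(a: Rec, b: Rec): boolean {
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const k of keys) {
    if (a[k] !== b[k]) return false;
  }
  return true;
}

// Compare two runs, matching records on an assumed key field such as "url".
function diffSnapshots(previous: Rec[], current: Rec[], keyField: string): SnapshotDiff {
  const prevByKey = new Map(previous.map((r) => [String(r[keyField]), r] as const));
  const diff: SnapshotDiff = { added: [], changed: [], removed: [] };
  const seen = new Set<string>();

  for (const rec of current) {
    const key = String(rec[keyField]);
    seen.add(key);
    const before = prevByKey.get(key);
    if (!before) diff.added.push(rec);
    else if (!sameRecord(before, rec)) diff.changed.push({ before, after: rec });
  }
  // Anything in the previous run that was not seen in the current run was removed.
  for (const [key, rec] of prevByKey) {
    if (!seen.has(key)) diff.removed.push(rec);
  }
  return diff;
}
```

Keying on a stable field keeps a price change on the same item classified as "changed" rather than as one removal plus one addition.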

Export extracted data:

# JSON
npx tsx bin/websource.ts export <sourceId> --format json
# CSV
npx tsx bin/websource.ts export <sourceId> --format csv
# REST API
npx tsx bin/websource.ts serve
# GET http://localhost:3847/sources/:id/data
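
Once `serve` is running, the data endpoint can be read from any HTTP client. The following is a minimal sketch using the global `fetch` available in Node.js 18 and later. The port and path come from the comment above; the helper names `sourceDataUrl` and `fetchSourceData` are hypothetical, and the response is assumed to be JSON, whose exact shape is not documented here.

```typescript
// Port and path are taken from the README; everything else is an assumption.
const BASE_URL = "http://localhost:3847";

// Build the data URL for a source, escaping the id for safe use in a path.
function sourceDataUrl(sourceId: string, baseUrl: string = BASE_URL): string {
  return `${baseUrl}/sources/${encodeURIComponent(sourceId)}/data`;
}

// Fetch the extracted data for one source from the local API.
async function fetchSourceData(sourceId: string): Promise<unknown> {
  const res = await fetch(sourceDataUrl(sourceId));
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  // Assumed to be JSON; typed as unknown because the schema is not specified.
  return res.json();
}
```

A caller would `await fetchSourceData("<sourceId>")` and then narrow the result to whatever shape the API actually returns.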

Change the storage location — add to .env:

WEBSOURCE_DATA_DIR=/your/custom/path

Architecture

Node.js + TypeScript (ESM, strict)
Cheerio for static HTML parsing, Playwright for JS-rendered pages
SQLite (better-sqlite3) for all local persistence
Fastify for the local REST API
node-cron for scheduling

See docs/ARCHITECTURE.md for details.

Documentation

Contributing

See CONTRIBUTING.md.

License

MIT — see LICENSE.
