crawlio-js is a Node.js SDK for interacting with the Crawlio web scraping and crawling API. It provides programmatic access to scraping, crawling, and batch processing endpoints with built-in error handling.
```bash
npm install crawlio.js
```
```ts
import { Crawlio } from 'crawlio.js'

const client = new Crawlio({ apiKey: 'your-api-key' })
const result = await client.scrape({ url: 'https://example.com' })
console.log(result.html)
```
Creates a new Crawlio client.
Options:
| Name | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Your Crawlio API key |
| baseUrl | string | No | API base URL (default: https://crawlio.xyz) |
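For example, to point the client at a different deployment (a minimal sketch; baseUrl can be omitted to use the default):

```ts
import { Crawlio } from 'crawlio.js'

const client = new Crawlio({
  apiKey: 'your-api-key',
  baseUrl: 'https://crawlio.xyz', // optional; this is the default
})
```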
Scrapes a single page.
```ts
await client.scrape({ url: 'https://example.com' })
```
ScrapeOptions:
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Target URL |
| exclude | string[] | No | CSS selectors to exclude |
| includeOnly | string[] | No | CSS selectors to include |
| markdown | boolean | No | Convert HTML to Markdown |
| returnUrls | boolean | No | Return all discovered URLs |
| workflow | Workflow[] | No | Custom workflow steps to execute |
| normalizeBase64 | boolean | No | Normalize base64 content |
| cookies | CookiesInfo[] | No | Cookies to include in the request |
| userAgent | string | No | Custom User-Agent header for the request |
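A sketch combining several of these options (selector and User-Agent values are placeholders):

```ts
const result = await client.scrape({
  url: 'https://example.com',
  markdown: true,             // also return a Markdown conversion
  exclude: ['nav', 'footer'], // drop boilerplate elements by CSS selector
  returnUrls: true,           // collect URLs discovered on the page
  userAgent: 'my-crawler/1.0',
})

console.log(result.markdown)
console.log(result.urls)
```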
Initiates a site-wide crawl.
CrawlOptions:
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Root URL to crawl |
| count | number | No | Number of pages to crawl |
| sameSite | boolean | No | Limit crawl to same domain |
| patterns | string[] | No | URL patterns to match |
| exclude | string[] | No | CSS selectors to exclude |
| includeOnly | string[] | No | CSS selectors to include |
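A sketch of starting a crawl (it assumes the method is named crawl and resolves with a job id matching the job status shape shown below):

```ts
const job = await client.crawl({
  url: 'https://example.com',
  count: 50,      // stop after 50 pages
  sameSite: true, // stay on the root domain
  patterns: ['/blog/*'],
})

console.log(job.id)
```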
Checks the status of a crawl job.
Gets results from a completed crawl.
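A hypothetical polling loop built on the two calls above (crawlStatus and crawlResults are assumed method names; the status values come from the job status shape shown below):

```ts
let status
do {
  await new Promise((resolve) => setTimeout(resolve, 2000)) // wait between polls
  status = await client.crawlStatus(job.id)
} while (status.status === 'IN_QUEUE' || status.status === 'RUNNING')

if (status.status === 'SUCCESS') {
  const results = await client.crawlResults(job.id)
  console.log(results)
}
```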
Performs a search on scraped content.
SearchOptions:
| Name | Type | Description |
|---|---|---|
| site | string | Limit search to a specific domain |
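A sketch of a search call (it assumes a search method that takes a query string alongside SearchOptions; verify the exact signature against the SDK):

```ts
// Hypothetical signature: search(query, options)
const hits = await client.search('pricing', { site: 'example.com' })
console.log(hits)
```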
Initiates scraping for multiple URLs in one request.
BatchScrapeOptions:
| Name | Type | Description |
|---|---|---|
| url | string[] | List of URLs |
| options | Omit<ScrapeOptions, 'url'> | Common options for all URLs |
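A sketch of a batch request (it assumes the method is named batchScrape and returns a job id, mirroring crawl):

```ts
const batch = await client.batchScrape({
  url: ['https://example.com/a', 'https://example.com/b'],
  options: { markdown: true }, // applied to every URL in the batch
})
```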
Checks the status of a batch scrape job.
Fetches results from a completed batch scrape.
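As with crawls, results are fetched once the job succeeds (batchScrapeStatus and batchScrapeResults are assumed method names; check the SDK's exports for the exact ones):

```ts
const status = await client.batchScrapeStatus(batch.id)
if (status.status === 'SUCCESS') {
  const results = await client.batchScrapeResults(batch.id)
  console.log(results)
}
```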
All Crawlio errors extend from CrawlioError. You can catch and inspect these for more context.
- CrawlioError
- CrawlioRateLimit
- CrawlioLimitExceeded
- CrawlioAuthenticationError
- CrawlioInternalServerError
- CrawlioFailureError
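For example (a sketch; it assumes the error classes are exported from the package root):

```ts
import { CrawlioRateLimit, CrawlioAuthenticationError } from 'crawlio.js'

try {
  await client.scrape({ url: 'https://example.com' })
} catch (err) {
  if (err instanceof CrawlioRateLimit) {
    // back off and retry later
  } else if (err instanceof CrawlioAuthenticationError) {
    // the API key is missing or invalid
  } else {
    throw err
  }
}
```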
Scrape result:

```ts
{
  jobId: string
  html: string
  markdown: string
  meta: Record<string, string>
  urls?: string[]
  url: string
}
```
Job status (returned for crawl and batch scrape jobs):

```ts
{
  id: string
  status: 'IN_QUEUE' | 'RUNNING' | 'LIMIT_EXCEEDED' | 'ERROR' | 'SUCCESS'
  error: number
  success: number
  total: number
}
```
CookiesInfo:

```ts
{
  name: string
  value: string
  path: string
  expires?: number
  httpOnly: boolean
  secure: boolean
  domain: string
  sameSite: 'Strict' | 'Lax' | 'None'
}
```
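A sketch of passing a cookie in this shape to scrape (all values are placeholders):

```ts
const result = await client.scrape({
  url: 'https://example.com/account',
  cookies: [
    {
      name: 'session',
      value: 'abc123',
      path: '/',
      domain: 'example.com',
      httpOnly: true,
      secure: true,
      sameSite: 'Lax',
    },
  ],
})
```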