portia

Visual scraping for Scrapy

Popularity
8.8
Declining
Activity
0.0
Stable
9,475
491
1,401

Description

Portia is a tool that allows you to visually scrape websites without any programming knowledge required. With Portia you can annotate a web page to identify the data you wish to extract, and Portia will understand based on these annotations how to scrape data from similar pages.

Code Quality Rank: L2
Programming language: Python
License: BSD 3-clause "New" or "Revised" License
Latest version: v2.0.8

portia alternatives and similar packages

Based on the "Web Crawling" category.
Alternatively, view portia alternatives based on common mentions on social networks and blogs.

* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.

Do you think we are missing an alternative of portia or a related project?

Add another 'Web Crawling' Package

README

Portia

Portia is a tool that allows you to visually scrape websites without any programming knowledge required. With Portia you can annotate a web page to identify the data you wish to extract, and Portia will understand based on these annotations how to scrape data from similar pages.

Running Portia

The easiest way to run Portia is using Docker:

You can run Portia using Docker & official Portia-image by running:

docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001 scrapinghub/portia

You can also set up a local instance with Docker-compose by cloning this repo & running from the root of the folder:

docker-compose up

For more detailed instructions, and alternatives to using Docker, see the Installation docs.

Documentation

Documentation can be found from Read the docs. Source files can be found in the docs directory.

Do not miss the trending, packages, news and articles with our weekly report.

Awesome Python is part of the LibHunt network. Terms. Privacy Policy.

(CC)
BY-SA
We recommend Spin The Wheel Of Names for a cryptographically secure random name picker.

AltStyle によって変換されたページ (->オリジナル) /