Rightian/PSpider

PSpider

A very concise web crawler framework for Python 3. The repository links to a brief introduction and to a worked example.

utilities module

Defines the utility classes and utility functions the crawler needs.

instances module

Defines the fetcher, parser, and saver classes used during crawling.
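To illustrate how the three roles divide the work, here is a minimal sketch. The class names `Fetcher`, `Parser`, and `Saver`, their method signatures, the `crawl` helper, and the in-memory `PAGES` dictionary are all assumptions made for illustration. They are not PSpider's actual API, and a real fetcher would issue HTTP requests instead of a dictionary lookup.

```python
import re

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "http://example.com/": '<title>Home</title><a href="http://example.com/a">a</a>',
    "http://example.com/a": "<title>Page A</title>",
}


class Fetcher:
    """Turns a URL into raw content (here: a dictionary lookup)."""

    def fetch(self, url):
        return PAGES.get(url)


class Parser:
    """Extracts the links to follow and the item to keep from raw content."""

    def parse(self, url, content):
        links = re.findall(r'href="([^"]+)"', content)
        match = re.search(r"<title>(.*?)</title>", content)
        title = match.group(1) if match else None
        return links, {"url": url, "title": title}


class Saver:
    """Persists parsed items (here: appends them to a list)."""

    def __init__(self):
        self.items = []

    def save(self, item):
        self.items.append(item)


def crawl(start_url):
    """Run fetch -> parse -> save in a single thread, visiting each URL once."""
    fetcher, parser, saver = Fetcher(), Parser(), Saver()
    pending, seen = [start_url], {start_url}
    while pending:
        url = pending.pop(0)
        content = fetcher.fetch(url)
        if content is None:
            continue  # unknown URL: nothing to parse or save
        links, item = parser.parse(url, content)
        saver.save(item)
        for link in links:
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return saver.items
```

Keeping the three roles in separate classes means each one can be swapped independently, for example replacing the list-backed saver with one that writes to a file.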

concurrent module

Defines the multi-threaded and multi-process crawling strategies, and keeps data synchronized between workers.
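As a rough illustration of what synchronized multi-threaded crawling involves, the sketch below shares a task queue and a "seen" set between worker threads. The function name `crawl_concurrently`, its parameters, and the dictionary-based link graph are hypothetical and are not taken from PSpider. Only the thread-based variant is shown; a multi-process version would need process-safe queues instead.

```python
import queue
import threading


def crawl_concurrently(pages, start_url, num_workers=4):
    """Visit every URL reachable from start_url.

    `pages` is a hypothetical link graph: a dict mapping URL -> list of linked URLs.
    """
    tasks = queue.Queue()        # thread-safe queue of URLs waiting to be processed
    seen = {start_url}           # URLs already queued, so none is processed twice
    visited = []                 # URLs that have been processed
    lock = threading.Lock()      # guards `seen` and `visited`

    def worker():
        while True:
            url = tasks.get()
            if url is None:      # sentinel: no more work
                tasks.task_done()
                return
            try:
                for link in pages.get(url, []):
                    with lock:
                        is_new = link not in seen
                        if is_new:
                            seen.add(link)
                    if is_new:
                        tasks.put(link)
                with lock:
                    visited.append(url)
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()

    tasks.put(start_url)
    tasks.join()                 # returns once every queued URL has been processed
    for _ in threads:
        tasks.put(None)          # one sentinel per worker
    for thread in threads:
        thread.join()
    return sorted(visited)
```

New links are queued before the current task is marked done, so `tasks.join()` cannot return while work is still being discovered.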

others

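  • setup.py is the installation script; it installs the framework into the system environment
  • test.py is the test file; it runs simple functional tests
  • pylint.conf is the configuration file used for code checking
  • demos_yundama wraps the yundama API so that it is easy to call
  • demos_nbastats scrapes the data of all players from the official NBA website and serves as a case study showing how to use the framework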

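Known issues

  • Installation fails with: Could not find suitable distribution for Requirement.parse('pybloom>=2.0.0')

The bloomfilter dependency (pybloom) must be installed manually. The address of its source code is given in the setup file; download it from GitHub and install it.

Questions and suggestions are welcome under "Issues"; you can also fork the repository and submit "Pull requests".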

About

simple python spider
