zzhengyang/crawler_script

Folders and files

.idea
AppUuid
CSDN
Others
aso
bangkok
data
siam2nite
tripAdvisor_cn
tripAdvisor_en
vpn_switch
.gitignore
README.md
vpn_switch.zip

README

introduction ------------------------
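The crawler.py file collects data from Mafengwo (蚂蜂窝), Dianping (大众点评), and Ctrip (携程) for five regions: Bangkok, Chiang Mai, Phuket, Koh Samui, and Pattaya.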

library ------------------------
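1. BeautifulSoup package: extracts element information from the static page at a given URL. References:
   - Python crawler HTML parsing, BeautifulSoup documentation (Chinese): http://beautifulsoup.readthedocs.io/zh_CN/latest/#
   - Scraping Zhihu data with Python and BeautifulSoup: http://blog.csdn.net/u012286517/article/details/51212268
   - BeautifulSoup usage: http://cuiqingcai.com/1319.html
2. Ghost package: loads a page's JavaScript dynamically from its URL, returns the page source after loading, and reads the src attribute of image tags. Reference: http://www.2cto.com/kf/201401/273914.html
3. urllib2 package: fetches a page and returns its HTML.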
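
The fetch-then-parse flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in crawler.py. It assumes Python 3, where `urllib.request` takes the place of `urllib2`, and it swaps in the standard library's `html.parser` for BeautifulSoup so that it runs without third-party packages. The names `fetch_html`, `extract_image_sources`, `ImageSrcParser`, and `SAMPLE_HTML` are hypothetical. Pages that only fill in their images after JavaScript runs would still need a rendering tool such as Ghost, which this sketch does not cover.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class ImageSrcParser(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def fetch_html(url, timeout=10):
    """Fetch a page and return its HTML as text.

    Python 3 counterpart of calling urllib2.urlopen(url).read() in Python 2.
    """
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_image_sources(html):
    """Return the src values of all <img> tags in a static HTML string."""
    parser = ImageSrcParser()
    parser.feed(html)
    parser.close()
    return parser.sources


# Hypothetical sample page, used so the example runs without network access.
SAMPLE_HTML = """
<html><body>
  <h1>Bangkok</h1>
  <img src="/images/a.jpg" alt="a">
  <img alt="no source">
  <img src="/images/b.jpg"/>
</body></html>
"""

if __name__ == "__main__":
    print(extract_image_sources(SAMPLE_HTML))
```

For a live page, `extract_image_sources(fetch_html(url))` chains the two steps. With BeautifulSoup installed, the parsing step would typically be written with `soup.find_all("img")` instead of a custom parser class.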

About

crawler dictionary
