Frank-qlu / recruit Public


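recruit: recruitment crawler + data analysis. 1. Crawler: uses Scrapy distributed crawling with MongoDB as the data store; the demo target site is 51job, and a few thousand records have been crawled so far. 2. Data processing: pandas is used to clean and process the crawled data. 3. Data analysis: a Flask backend reads the data from MongoDB, and the front end uses Bootstrap 3, ECharts, and a D3 word cloud. If you like the project, please Star or Fork it. For a preview, see

www.xunguo.site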

License

Apache-2.0 license


recruit

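Recruitment crawler + data analysis

1. Crawler: uses Scrapy distributed crawling with MongoDB as the data store. The demo target site is 51job, and tens of thousands of records have been crawled so far.
2. Data processing: pandas is used to clean and process the crawled data.
3. Data analysis: a Flask backend reads the data from MongoDB, and the front end uses Bootstrap 3, ECharts, and a D3 word cloud.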
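The pandas cleaning step is not spelled out in this README. As a minimal sketch of what it might look like, the function below trims whitespace, drops rows with missing key fields, and removes duplicate postings. The field names `title`, `company`, and `city` are assumptions for illustration, not the project's actual schema.

```python
import pandas as pd


def clean_jobs(records):
    """Return a cleaned DataFrame built from a list of scraped job dicts."""
    df = pd.DataFrame(records)
    # Trim stray whitespace left over from HTML extraction.
    for col in ("title", "company", "city"):
        df[col] = df[col].str.strip()
    # Rows without a title or company are not usable for analysis.
    df = df.dropna(subset=["title", "company"])
    # The same posting can be crawled more than once.
    df = df.drop_duplicates(subset=["title", "company", "city"])
    return df.reset_index(drop=True)


# Hypothetical sample data: one duplicate and one row with a missing title.
sample = [
    {"title": " Python Developer ", "company": "Acme", "city": "Shanghai "},
    {"title": "Python Developer", "company": "Acme", "city": "Shanghai"},
    {"title": None, "company": "Acme", "city": "Beijing"},
]
cleaned = clean_jobs(sample)
```

Whitespace is stripped before deduplication so that postings differing only in padding collapse into one row.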

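Notes:

1. The pymongo version must be <= 3.0. Recommended: pip install pymongo==2.8
2. If Scrapy fails to install, look up the Twisted build that matches your environment at https://www.lfd.uci.edu/~gohlke/pythonlibs/ and install it first; Scrapy should then install without problems.
3. To start MongoDB, go to the bin directory of your MongoDB installation and run: mongod --dbpath=<path to your data folder>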
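The version constraint in note 1 can be checked before running the crawler. The helper below is a small sketch, not part of the project; `pymongo_version_ok` is a hypothetical name, and it reads "<= 3.0" as "major.minor at or below 3.0", which is an assumption about the author's intent.

```python
def pymongo_version_ok(version, max_supported=(3, 0)):
    """Return True if a version string such as '2.8' is at or below max_supported.

    Only the major and minor components are compared, so '3.0.3' counts as 3.0.
    """
    parts = version.split(".")
    major_minor = tuple(int(p) for p in parts[:2])
    return major_minor <= max_supported
```

With pymongo installed, the installed version string is available as `pymongo.version`, so the check would be `pymongo_version_ok(pymongo.version)`.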

Starting the project

Crawler:

1. cd into the project directory
2. pip install pymongo==2.8
3. scrapy crawl zlzp
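When the crawl command runs, Scrapy passes each scraped item through the configured item pipelines, which is where items are usually written to MongoDB. The class below is a minimal sketch of such a pipeline, not the project's actual code. The host, port, database name `recruit`, and collection name `jobs` are assumptions. It uses `Collection.insert`, which exists in the recommended pymongo 2.8 but was removed in pymongo 4.0.

```python
class MongoPipeline(object):
    """Scrapy item pipeline that stores each item in a MongoDB collection."""

    def __init__(self, collection=None):
        # A collection can be injected (handy for tests); otherwise a real
        # one is opened when the spider starts.
        self.client = None
        self.collection = collection

    def open_spider(self, spider):
        if self.collection is None:
            # Imported here so the class can be exercised without pymongo.
            import pymongo

            # Assumed connection details; adjust to your MongoDB instance.
            self.client = pymongo.MongoClient("localhost", 27017)
            self.collection = self.client["recruit"]["jobs"]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # pymongo 2.x API; with pymongo >= 3.0 insert_one() is the usual call.
        self.collection.insert(dict(item))
        # Returning the item lets later pipelines keep processing it.
        return item
```

To enable a pipeline like this, it would be registered in the Scrapy settings, for example `ITEM_PIPELINES = {"myproject.pipelines.MongoPipeline": 300}`, where the module path is hypothetical.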
Data visualization:

1. Activate the virtual environment: cd venv/Scripts, then run activate
2. python zlzpView.py
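The README says the Flask backend reads MongoDB data and the front end renders it with ECharts, but it does not show how the data is shaped. As a rough sketch under that assumption, the function below turns a list of job documents into an ECharts bar-chart option. The `city` field and the name `city_bar_option` are hypothetical.

```python
from collections import Counter


def city_bar_option(jobs, top_n=10):
    """Build an ECharts bar-chart option counting job postings per city."""
    # Skip documents that have no city value.
    counts = Counter(job["city"] for job in jobs if job.get("city"))
    top = counts.most_common(top_n)
    return {
        "xAxis": {"type": "category", "data": [city for city, _ in top]},
        "yAxis": {"type": "value"},
        "series": [{"type": "bar", "data": [n for _, n in top]}],
    }
```

In a Flask view, a dictionary like this would typically be returned with `flask.jsonify` and passed to `setOption` on the ECharts instance in the browser.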

version 1.0:

Initial release of the project.

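version 2.0 (updated 2019-06-17):

1. Improved the UI and adopted the Flask blueprint design pattern
2. Added advanced search (aggregation queries)
3. Added separate front-end and admin sites, and added a Redis database
4. Added an admin setting for the expiry time of job postings
5. Added user management in the admin site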
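The advanced search item mentions aggregation queries without giving details. One possible reading is a MongoDB aggregation pipeline built from the search form. The sketch below builds such a pipeline as plain Python data; the field names `title`, `city`, and `company` and the function name are assumptions for illustration.

```python
import re


def build_search_pipeline(keyword=None, city=None, limit=10):
    """Build a MongoDB aggregation pipeline counting matching jobs per company."""
    match = {}
    if keyword:
        # Escape the keyword so user input is matched literally.
        match["title"] = {"$regex": re.escape(keyword), "$options": "i"}
    if city:
        match["city"] = city

    pipeline = []
    if match:
        pipeline.append({"$match": match})
    pipeline.extend([
        {"$group": {"_id": "$company", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
        {"$limit": limit},
    ])
    return pipeline
```

The result would be passed to `collection.aggregate(pipeline)`. Note that pymongo 2.x returns a dict with a `result` key from `aggregate`, while pymongo 3.0 and later return a cursor.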

version 3.0 (planned):

1. Adopt Flask-RESTful
2. Improve the data analysis module
3. Add interest tags and a recommendation system that suggests similar job postings

Project preview
