This repository was archived by the owner on Oct 10, 2019. It is now read-only.
Douyin data crawler: a practice project from when I was first learning Python and the Scrapy framework. Unfinished version.

gisShield/douyin

[Deprecated] Douyin Crawler


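Project background

This is mainly a practice project from my first steps with Python and the Scrapy framework. The crawler is for learning purposes only and must not be used for anything else.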

Development dependencies

  • Python 3.6.1
  • Scrapy 1.5.0
  • MongoDB
  • APScheduler

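Project overview

The crawler fetches links to obtain the "Hot Challenges" and "Hot Music" lists from search, sorts those lists by number of participants, and then crawls video data from a subset of the most popular links. The lists and the video data are each stored in a MongoDB database. Data is refreshed every day in the early morning. The data is currently incomplete, and no full-update feature has been implemented.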
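
The "sort by number of participants, then crawl only the hottest links" step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `ListEntry` shape, its field names, the `pick_hottest` and `to_document` helpers, and the example.com URLs are all assumptions made for the example. It uses only the standard library so it runs without Scrapy or MongoDB installed.

```python
from typing import Iterable, List, NamedTuple


class ListEntry(NamedTuple):
    """One row of a hot-challenge or hot-music list (hypothetical shape)."""
    kind: str         # "challenge" or "music"
    name: str
    url: str          # link that would be crawled for video data
    user_count: int   # number of participants


def pick_hottest(entries: Iterable[ListEntry], limit: int) -> List[ListEntry]:
    """Return at most `limit` entries, ordered by participant count, highest first."""
    return sorted(entries, key=lambda e: e.user_count, reverse=True)[:limit]


def to_document(entry: ListEntry) -> dict:
    """Convert an entry to a plain dict, the form a MongoDB insert would accept."""
    return dict(entry._asdict())


# Made-up sample data; real entries would come from the crawled lists.
entries = [
    ListEntry("challenge", "a", "https://example.com/challenge/a", 1200),
    ListEntry("music", "b", "https://example.com/music/b", 98000),
    ListEntry("challenge", "c", "https://example.com/challenge/c", 45000),
]

hottest = pick_hottest(entries, limit=2)
documents = [to_document(e) for e in hottest]
```

In a real pipeline, the URLs in `hottest` would be handed to the spider as follow-up requests and `documents` would be written to a MongoDB collection. The daily early-morning refresh would be scheduled separately, for example with APScheduler.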

Future goals (no promises...)

  1. Write a separate client and web front end for displaying and filtering the data

Changelog:

  • 20180914 The crawler's links appear to have stopped working; I will update them soon.
