Web crawlers and data-analysis code for data journalism


Prevalence/DataNews

DataNews

Web crawlers and data-analysis code for data journalism.

The Corpus folder holds the raw corpus files, gathered by the crawlers and by various other means.

  • xxx讲话.txt ("xxx speeches") is the compiled collection of xxx's speeches.

The Code folder holds the code that was used.

The Data2Analyse folder holds the processed files used for visualization.

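  • xxx讲话分词版.txt ("xxx speeches, segmented version") is the text after word segmentation, with some meaningless words forcibly removed.
  • xxx讲话.xls is the word-frequency result, with a ratio before adjustment and a ratio after adjustment. Adjusted ratio = unadjusted ratio * 10.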
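
The word-frequency step described above could look roughly like the sketch below. This is not the repository's actual code: it assumes the segmented file holds whitespace-separated tokens, that the "ratio" is a word's count divided by the total number of remaining tokens, and that the stop-word list is supplied by hand. The names `word_frequencies` and `STOP_WORDS` are hypothetical.

```python
from collections import Counter

# Hypothetical stop-word list; the repository does not state which words it removed.
STOP_WORDS = {"的", "了", "和"}


def word_frequencies(segmented_text, stop_words=STOP_WORDS, scale=10):
    """Return rows of (word, count, ratio, adjusted_ratio), most frequent first.

    Assumes tokens are separated by whitespace. The ratio is taken to be
    count / total tokens after stop-word removal, which is an assumption.
    The adjusted ratio follows the README: adjusted = ratio * 10.
    """
    tokens = [t for t in segmented_text.split() if t not in stop_words]
    total = len(tokens)
    if total == 0:
        return []
    counts = Counter(tokens)
    return [(word, count, count / total, count / total * scale)
            for word, count in counts.most_common()]


if __name__ == "__main__":
    sample = "发展 的 改革 发展 和 人民 发展 改革"
    for word, count, ratio, adjusted in word_frequencies(sample):
        print(f"{word}\t{count}\t{ratio:.4f}\t{adjusted:.4f}")
```

Each row could then be written out as one line of a spreadsheet such as xxx讲话.xls; the scaling factor is kept as a parameter so the adjustment stays visible rather than hard-coded.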
