Commit c784ea8

Create README.md
1 parent 3662e28 commit c784ea8

1 file changed: README.md (35 additions & 0 deletions)

# End-to-end tutorial to tackle topic mining and interactive visualizations in Python

In this tutorial we'll dive into topic mining. We'll analyze a dataset of newsfeeds extracted from more than 60 sources using a web service called <a href="https://newsapi.org">newsapi.org</a>.

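To make the data source a little more concrete, here is a minimal sketch of how a request URL for a single news source could be built. The endpoint (News API v2 `top-headlines`), the `sources` and `apiKey` query parameters, the source id `bbc-news`, and the helper name `build_headlines_url` are assumptions made for illustration; the notebook may rely on a different endpoint or API version.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2.7

# Assumption: the News API v2 "top-headlines" endpoint.
BASE_URL = "https://newsapi.org/v2/top-headlines"


def build_headlines_url(source, api_key):
    """Build the request URL for the headlines of one news source.

    `source` is a source id such as "bbc-news" (hypothetical example),
    `api_key` is the personal key issued by the web service.
    """
    # A list of pairs keeps the parameter order stable.
    params = [("sources", source), ("apiKey", api_key)]
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder, not a working key.
    print(build_headlines_url("bbc-news", "YOUR_API_KEY"))
```

Looping this helper over a list of source ids and fetching each URL would be one way to collect articles from many sources into a single dataset.
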
<p align="center">
<img src="./images/article_2/news_sources.PNG">
</p>

We'll show how to process this text data, analyze it, and extract visual clusters of topics from it.

We'll put into practice some great Python tools for interactive visualization, topic mining and text analytics: **scikit-learn** and **gensim** for the modeling, **Bokeh** and **pyLDAvis** for the plots.

All the code is available for you to run and test.

You can view the notebook either on <a href="https://github.com/ahmedbesbes/How-to-mine-newsfeed-data-and-extract-interactive-insights-in-Python/blob/master/article_2.ipynb">GitHub</a> or on <a href="https://ahmedbesbes.com/how-to-mine-newsfeed-data-and-extract-interactive-insights-in-python.html">my website</a>.


### Environment setup

In this tutorial, I'll be using Python 2.7.

I recommend downloading the Anaconda distribution for Python 2.7 from this link. This distribution bundles Python with the packages commonly used in data science, such as NumPy, pandas, SciPy and scikit-learn.

```shell
# Progress bars for long-running loops
pip install tqdm
# Text processing
conda install -c anaconda nltk=3.2.2
# Interactive plots
conda install bokeh
# Topic modeling and topic visualization
pip install --upgrade gensim
pip install pyldavis
# Word clouds
pip install wordcloud
```

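Once the packages are installed, a quick sanity check can confirm that they import. The snippet below is a minimal sketch, not part of the original notebook: the helper name `check_modules` is hypothetical, and the list uses the import names of the packages (for example `pyLDAvis`), which can differ from the names passed to `pip` or `conda`.

```python
import importlib

# Import names (not pip/conda package names) of the tools installed above.
REQUIRED_MODULES = ["tqdm", "nltk", "bokeh", "gensim", "pyLDAvis", "wordcloud"]


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # Missing package, or an installation broken badly enough to fail on import.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in sorted(check_modules(REQUIRED_MODULES).items()):
        print("{:<10} {}".format(name, "OK" if ok else "MISSING"))
```

Any module reported as `MISSING` can be reinstalled with the matching command from the block above before running the notebook.
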
<blockquote class="twitter-tweet" data-lang="fr"><p lang="en" dir="ltr">How to mine newsfeed data and extract interactive insights in <a href="https://twitter.com/hashtag/Python?src=hash&amp;ref_src=twsrc%5Etfw">#Python</a> <a href="https://twitter.com/hashtag/DataScience?src=hash&amp;ref_src=twsrc%5Etfw">#DataScience</a> <a href="https://twitter.com/hashtag/NLP?src=hash&amp;ref_src=twsrc%5Etfw">#NLP</a> <a href="https://t.co/KLNI4CVvfi">https://t.co/KLNI4CVvfi</a> <a href="https://t.co/pd5XiC7N0o">pic.twitter.com/pd5XiC7N0o</a></p>&mdash; KDnuggets (@kdnuggets) <a href="https://twitter.com/kdnuggets/status/989191823523504128?ref_src=twsrc%5Etfw">25 avril 2018</a></blockquote>