This repository was archived by the owner on Dec 22, 2023. It is now read-only.

Commit 0b36cfb

Added twitter topic modeling and sentiment analysis
1 parent 95893da commit 0b36cfb

13 files changed: 344 additions, 0 deletions
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
# Twitter Topic Modeling and Sentiment Analysis

A Flask web app that uses Twitter OAuth to analyse the topics and sentiment of a user's tweets.

The user logs in with their Twitter account.

The application retrieves the user's tweets and applies topic modeling, an unsupervised clustering technique in NLP, to group the tweets into topics. The model used here is the Biterm Topic Model (BTM), which is well suited to short texts such as tweets.

The application also performs sentiment analysis using the VADER sentiment analysis module.

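To make the BTM idea concrete: a biterm is an unordered pair of words that co-occur in the same short text, and BTM learns topics from these pairs rather than from whole documents. The sketch below is a minimal, library-free illustration of biterm extraction. The function name `extract_biterms` is hypothetical and is not part of the biterm package this app uses.

```python
from itertools import combinations


def extract_biterms(text):
    """Return all unordered pairs of distinct words in a short text."""
    # Lowercase and split on whitespace; a real pipeline would also drop stop words
    words = sorted(set(text.lower().split()))
    return list(combinations(words, 2))


print(extract_biterms("apple releases new phone"))
```

A four-word tweet yields six biterms, which is why the approach copes better with very short texts than models that rely on per-document word counts.
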
## Prerequisites

* Clone the [BTM repo](https://github.com/markoarnauto/biterm) and copy the `biterm` folder from it into the current folder.

* Create a Twitter developer account.

* Create an app from the [developer's home page](https://developer.twitter.com/en/apps).

* Fill in the app details as shown below. You can change the other fields, but keep the callback URL the same.
```
Callback url : http://127.0.0.1:5000/login/twitter/authorized
```

![twitter-app-details](images/twitter-app-details.JPG)

* From this page, go to "Keys and tokens", copy your tokens, and paste them into the `app.py` file.
```
twitter_blueprint = make_twitter_blueprint(
    api_key="<your_api_key>", api_secret="<your_api_secret>")
```

The website URL can be any URL.

* Install the dependencies.
```
pip install -r requirements.txt
```

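Instead of pasting the keys directly into `app.py`, one option is to read them from environment variables so they stay out of version control. This is a sketch and not part of the original project: the variable names `TWITTER_API_KEY` and `TWITTER_API_SECRET` and the helper `load_twitter_credentials` are assumptions chosen for illustration.

```python
import os


def load_twitter_credentials(env=os.environ):
    """Read the Twitter API key and secret from environment variables."""
    # TWITTER_API_KEY / TWITTER_API_SECRET are hypothetical names, not defined by the project
    api_key = env.get("TWITTER_API_KEY", "")
    api_secret = env.get("TWITTER_API_SECRET", "")
    if not api_key or not api_secret:
        raise RuntimeError(
            "Set TWITTER_API_KEY and TWITTER_API_SECRET before starting the app")
    return api_key, api_secret
```

The returned pair could then be passed to `make_twitter_blueprint(api_key=..., api_secret=...)` in `app.py`.
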
## Usage
* In the main directory, run the following command:
```
python app.py
```
* The command prints a URL; open it in your browser.
In most cases it is:
```
http://127.0.0.1:5000/
```

* The app is now running locally.

* Log in with your Twitter credentials.

![login](images/app-twitter-oauth.JPG)

* Enter your profile display name/ID.

## Screenshots

* Final user dashboard

![user dashboard](images/twitter-app-dashboard.JPG)

1. Your username/display name

2. Topics clustered by BTM

3. Each tweet classified into one of the 3 topics

4. VADER sentiment analysis for each tweet

Each negative tweet is highlighted with a grey background.

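Which tweets count as negative depends on the app's `sentiment` module. As a rough sketch, VADER reports a `compound` score between -1 and 1, and a commonly used convention treats scores at or below -0.05 as negative and at or above 0.05 as positive. The helper `label_from_compound` below is hypothetical and only illustrates that thresholding; the cut-offs the app actually uses may differ.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


for score in (0.6, 0.0, -0.4):
    print(score, label_from_compound(score))
```
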
## Author name

Priya Mane
1.02 KB
Binary file not shown.
568 Bytes
Binary file not shown.
1.21 KB
Binary file not shown.
Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
from flask import Flask, redirect, url_for, render_template
from flask_dance.contrib.twitter import make_twitter_blueprint, twitter
import btm_model
import text_cleaning
import sentiment

app = Flask(__name__)
# Placeholder secret; replace it with a long random value outside local testing
app.config['SECRET_KEY'] = "youareawesomethiscanbeanything"

# Paste your Twitter API key and secret here (see the README)
twitter_blueprint = make_twitter_blueprint(
    api_key="", api_secret="")

app.register_blueprint(twitter_blueprint, url_prefix='/login')


@app.route('/')
def index():
    # Home page
    # If the user is not authorized, redirect to the Twitter login page
    if not twitter.authorized:
        return redirect(url_for('twitter.login'))
    # Build the dashboard URL from the view name instead of hard-coding host and port
    return redirect(url_for('twitter_login'))


@app.route('/twitter')
def twitter_login():
    # If the user is not authorized, redirect to the Twitter login page
    if not twitter.authorized:
        return redirect(url_for('twitter.login'))
    # Retrieve the authorized user's account details
    account_info = twitter.get('account/settings.json')
    # Retrieve the authorized user's tweets
    user_tweets = twitter.get(
        "statuses/user_timeline.json")

    # Proceed only if both the account information and the tweets were retrieved
    if account_info.ok and user_tweets.ok:
        # Decode the JSON responses
        user_tweets_json = user_tweets.json()
        account_info_json = account_info.json()

        # Get the tweet text from the objects returned
        all_tweets = [tweet['text'] for tweet in user_tweets_json]

        # Text cleaning for tweets
        all_tweets_cleaned = text_cleaning.clean_tweets(all_tweets)

        # BTM model for topic modeling results
        classified_tweets, topics = btm_model.categorize(all_tweets_cleaned)

        # Sentiment analysis
        tweet_sentiment = sentiment.get_sentiment(all_tweets_cleaned)

        # Prepare data to be rendered on the user dashboard template
        data = {
            "all_tweets": all_tweets,
            "account_info_json": account_info_json,
            "classified_tweets": classified_tweets,
            "topics": topics,
            "sentiment": tweet_sentiment
        }

        # Render template with user data
        return render_template('user_dash.html', data=data)

    # If either request failed, return an error message
    return '<h2>Error</h2>'


if __name__ == '__main__':
    # Debug mode is intended for local development only
    app.run(debug=True)
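
`text_cleaning.clean_tweets` is called above, but its source is not shown in this diff. The sketch below is one plausible cleaning step, assuming the goal is to strip URLs, @mentions, and the `#` sign and to normalise whitespace. It is an illustration under those assumptions, not the project's implementation.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_tweets(tweets):
    """Return lowercased tweets with URLs, mentions and '#' signs removed."""
    cleaned = []
    for text in tweets:
        text = URL_RE.sub(" ", text)
        text = MENTION_RE.sub(" ", text)
        text = text.replace("#", " ")
        # Collapse the runs of whitespace left behind by the removals
        cleaned.append(" ".join(text.lower().split()))
    return cleaned


print(clean_tweets(["Loving #Flask! https://t.co/abc @someone"]))
```
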
Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
import numpy as np
from biterm.biterm.btm import oBTM
from sklearn.feature_extraction.text import CountVectorizer
from biterm.biterm.utility import vec_to_biterms, topic_summuary


def categorize(tweets_list, number_of_topics=3):
    """Return (topic index per tweet, top words per topic) using online BTM."""

    # vectorize texts
    vec = CountVectorizer(stop_words='english')
    X = vec.fit_transform(tweets_list).toarray()

    # get vocabulary
    # (scikit-learn >= 1.0; older releases expose get_feature_names() instead)
    vocab = np.array(vec.get_feature_names_out())

    # get biterms
    biterms = vec_to_biterms(X)

    # create btm
    btm = oBTM(num_topics=number_of_topics, V=vocab)

    # train online BTM, processing the texts in chunks of 100
    for i in range(0, len(biterms), 100):
        biterms_chunk = biterms[i:i + 100]
        btm.fit(biterms_chunk, iterations=50)
    topics = btm.transform(biterms)

    # topic coherence and the top 6 words of each topic
    res = topic_summuary(btm.phi_wz.T, X, vocab, 6)

    topics_top_words = res['top_words']

    # assign each tweet to its most probable topic
    topic_classification = []
    for i in range(len(tweets_list)):
        topic_classification.append(topics[i].argmax())

    return topic_classification, topics_top_words
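
The last loop in `categorize` picks, for each tweet, the topic with the highest probability. Below is a dependency-free sketch of that step, assuming `topic_probs` is a list of per-tweet probability rows (in the real code this is the array returned by `btm.transform`). The name `assign_topics` is hypothetical.

```python
def assign_topics(topic_probs):
    """Return the index of the most probable topic for each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in topic_probs]


probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
print(assign_topics(probs))
```

With the default `number_of_topics=3`, every tweet therefore receives a label of 0, 1, or 2.
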
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
flask
requests
Flask-Dance
numpy
scikit-learn
vaderSentiment
nltk
