
Commit aea09f4

Added scraper and updated readme

1 parent 4bcc841 commit aea09f4

File tree

3 files changed: +8 −7 lines changed

README.md

Lines changed: 1 addition & 0 deletions

@@ -31,4 +31,5 @@
 28. Ecommerce Scraper: Scrapes product data from ecommerce websites and displays it to user in CLI.
 29. Lyrics Scraper: Scrape lyrics from atozlyrics website by specifying artist name.
 30. Walmart Scraper: Scrape data from walmart website and store it in database using MySQLdb.
+31. Twitter Scraper: Scrapes tweets from popular hashtags and saves them to csv file

twitter-scraper/myfile.csv

2.16 MB (binary file not shown)

twitter-scraper/twitter_scraper.py

Lines changed: 7 additions & 7 deletions

@@ -9,19 +9,19 @@
 #This code is using AppAuthHandler, not OAuthHandler to get higher limits, 2.5 times.
 auth = tweepy.AppAuthHandler('j2UAZfXuk6iitAjnLjbFcmn0y', 'Q9X7g4eAhyElO8u5VI183QwRCUF1sXrZs8m9poGt6Q1pmN4cOw')
 api = tweepy.API(auth, wait_on_rate_limit=True,
-	wait_on_rate_limit_notify=True)
+                 wait_on_rate_limit_notify=True)
 
 
 if (not api):
     print ("Can't Authenticate")
     sys.exit(-1)
 def clean(val):
-	clean = ""
-	if val:
-		clean = val.encode('utf-8')
-	return clean
+    clean = ""
+    if val:
+        clean = val.encode('utf-8')
+    return clean
 
-searchQuery = '' #This is for your hasthag(s), separate by comma
+searchQuery = '#techsytalk' #This is for your hasthag(s), separate by comma
 maxTweets = 80000 # Large max nr
 tweetsPerQry = 100 # the max the API permits
 fName = 'myfile.csv' #The CSV file where your tweets will be stored
@@ -62,7 +62,7 @@ def clean(val):
         print("No more tweets found")
         break
     for tweet in new_tweets:
-	csvwriter.writerow([tweet.created_at, clean(tweet.user.screen_name), clean(tweet.text), tweet.user.created_at, tweet.user.followers_count, tweet.user.friends_count, tweet.user.statuses_count, clean(tweet.user.location), tweet.user.geo_enabled, tweet.user.lang, clean(tweet.user.time_zone), tweet.retweet_count]);
+        csvwriter.writerow([tweet.created_at, clean(tweet.user.screen_name), clean(tweet.text), tweet.user.created_at, tweet.user.followers_count, tweet.user.friends_count, tweet.user.statuses_count, clean(tweet.user.location), tweet.user.geo_enabled, tweet.user.lang, clean(tweet.user.time_zone), tweet.retweet_count]);
 
     tweetCount += len(new_tweets)
     #print("Downloaded {0} tweets".format(tweetCount))
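For context, the diff's `clean()` helper guards the CSV rows against profile fields that come back as None (location, time_zone, and so on). A minimal, self-contained Python 3 sketch of that normalization, using a hypothetical `FakeTweet` as a stand-in for the tweepy Status objects the real script fetches over the network (the original helper also encoded to UTF-8 bytes, which Python 2's csv module required; in Python 3 plain str is sufficient):

```python
import csv
import io

def clean(val):
    # Mirrors the helper added in twitter_scraper.py: empty string
    # for missing/None fields, the value itself otherwise.
    return val if val else ''

# Hypothetical stand-in for a tweepy Status object; the real scraper
# receives these from the Twitter search API.
class FakeTweet:
    text = 'hello #techsytalk'
    location = None  # profile fields are often unset

buf = io.StringIO()
csvwriter = csv.writer(buf)
tweet = FakeTweet()
csvwriter.writerow([clean(tweet.text), clean(tweet.location)])
print(buf.getvalue().strip())  # hello #techsytalk,
```

Without the guard, `csv.writer` would serialize the missing field as the literal string "None" rather than leaving the column empty.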

0 commit comments