Commit 1bcfdfe

Merge pull request avinashkranjan#88 from Kreateer/redditmemescraper
Reddit Meme Scraper
2 parents af94fea + 7e719d0 commit 1bcfdfe

File tree: 8 files changed, 147 additions, 0 deletions

Reddit Meme Scraper/.gitignore

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
venv/
.idea/
*.csv
Test/
reddit_tokens.json
scriptcopy.py
*.jpg
*.jpeg

Reddit Meme Scraper/README.md

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
# Reddit Meme Scraper

This script locates and downloads images from several subreddits (r/deepfriedmemes, r/surrealmemes, r/nukedmemes, r/bigbangedmemes, r/wackytictacs, r/bonehurtingjuice) onto your local system.

For the sake of simplicity (and so that your system doesn't get stuffed full of images), the **download is limited to 25 (total) images per run**.
However, you are **welcome to modify that limit** to whatever amount you'd like, or **remove it** altogether; if you do, make sure you also **update `sg.ProgressBar()`** so it properly represents the download progress (see the sketch below this file).

## Usage

Make sure you have installed the **necessary packages** listed in **`requirements.txt`**, then simply run **`script.py`**.
You'll be greeted by a popup window asking where to download the images, after which the download will begin.

## Screenshots

Some screenshots showing how the script works:

![Popup Asking For Destination Folder](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Popup_Win_01.PNG)

![Progress Bar Window](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Popup_Win_03.PNG)

![Popup Informing The User Where The Files Are Located](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Popup_Win_02.PNG)

![Console Output](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Console.PNG)
Reddit Meme Scraper/images/: 4 screenshot files added (58 KB, 5.69 KB, 6.82 KB, 3.45 KB)
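
If you raise or remove the 25-image cap mentioned in the README, two places in script.py have to stay in sync: the limit passed to subreddit.hot() and the maximum passed to sg.ProgressBar(). A minimal sketch of that change, assuming the rest of script.py is left as committed; MEME_LIMIT is an illustrative name introduced here, not something the script defines:

MEME_LIMIT = 100  # hypothetical constant for this sketch; script.py hard-codes 25 in both spots

# Fetch at most MEME_LIMIT hot posts from the combined subreddits
posts = subreddit.hot(limit=MEME_LIMIT)

# Give the progress bar the same maximum so update_bar(index + 1) fills it correctly
layout = [[sg.Text("Downloading files...", key='textkey')],
          [sg.ProgressBar(MEME_LIMIT, orientation='h', size=(20, 20), key='progbar')],
          [sg.Cancel()]]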

Reddit Meme Scraper/requirements.txt

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
praw
pandas
PySimpleGUI
wget

Reddit Meme Scraper/script.py

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
import praw
import PySimpleGUI as sg
import wget
import pandas as pd
import datetime as dt
import json
import os

destination_folder = sg.popup_get_folder('Choose where to download files:\n\n'
                                         'NOTE: A folder to store the files will be created within the directory!',
                                         default_path='', title='Choose destination')
folder_lst = [destination_folder]
if folder_lst[0] is None:
    sg.Popup('Destination not specified!\nProgram terminated!', title='ERROR: No destination!',
             custom_text='Close', button_type=0)
    raise SystemExit()


class RedditCred:
    def __init__(self):
        self.text_file = 'reddit_tokens.json'

    # Functions made to read the reddit app id and secret from file
    def read_id(self):
        file = self.text_file
        with open(file, 'r') as f:
            data = json.load(f)
            keys = data.keys()
            return str(*keys)

    def read_secret(self):
        file = self.text_file
        with open(file, 'r') as f:
            data = json.load(f)
            value = data.values()
            return str(*value)


red_cred = RedditCred()
u_agent = 'Script that downloads memes from various subreddits'

reddit = praw.Reddit(client_id=red_cred.read_id(),
                     client_secret=red_cred.read_secret(),
                     user_agent=u_agent)

subreddit = reddit.subreddit('deepfriedmemes+surrealmemes+nukedmemes+bigbangedmemes+wackytictacs+bonehurtingjuice')
posts = subreddit.hot(limit=25)

# Empty lists to hold data

image_urls = []
image_titles = []
image_scores = []
image_timestamps = []
image_ids = []
image_extensions = ['.jpg', '.jpeg', '.png']

# This iterates through posts and collects their data into lists

for post in posts:
    image_urls.append(post.url.encode('utf-8'))
    image_titles.append(post.title.encode('utf-8'))
    image_scores.append(post.score)
    image_timestamps.append(dt.datetime.fromtimestamp(post.created))
    image_ids.append(post.id)

# This creates a GUI window with a progress bar to keep track of the download

layout = [[sg.Text("Downloading files...", key='textkey')],
          [sg.ProgressBar(25, orientation='h', size=(20, 20), key='progbar')],
          [sg.Cancel()]]

window = sg.Window('Download in Progress', layout)

# This iterates through the URLs, checks whether each one ends in one of the specified image extensions, and downloads the image

for index, url in enumerate(image_urls):
    path = str(folder_lst[0])
    # The URLs were stored as bytes above, so str(url)[2:-1] strips the surrounding b'...' wrapper
    file_ending = str(url)[2:-1]
    event, values = window.read(timeout=0)
    _, extension = os.path.splitext(file_ending)
    if extension in image_extensions:
        try:
            if os.path.exists(path + '/' + 'Downloaded Images'):
                pass
            else:
                os.mkdir(path + '/' + 'Downloaded Images')
            if event == 'Cancel' or event == sg.WIN_CLOSED:
                break

            destination = str(folder_lst[0]) + '/' + 'Downloaded Images' + '/'
            window['progbar'].update_bar(index + 1)
            print(f"Downloading '{str(image_titles[index])[2:-1]}' to '{path}' from '{str(image_urls[index])[2:-1]}'")
            download = wget.download(str(image_urls[index])[2:-1], out=destination)
        except Exception:
            print(f"Something went wrong while downloading '{str(image_urls[index])[2:-1]}'\n")
else:
    # for-else: runs only if the loop finished without being cancelled
    print("\nDownload complete!")
    window.close()
    sg.Popup(f"Files downloaded into:\n\n'{path}/Downloaded Images'", title='Download complete!')


# Optional saving of collected data to .csv file

dataframe = pd.DataFrame({
    'Title': image_titles,
    'Score': image_scores,
    'URL': image_urls,
    'Timestamp': image_timestamps,
    'ID': image_ids
})
csv = dataframe.to_csv('./images.csv', index=True, header=True)
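
script.py reads its credentials from reddit_tokens.json (which .gitignore deliberately keeps out of version control), but the expected layout of that file isn't documented. Judging from RedditCred.read_id() and read_secret(), which return the JSON object's keys and values respectively, the file appears to hold a single pair whose key is the Reddit app's client ID and whose value is its client secret. A minimal sketch of generating such a file under that assumption; the placeholder strings are not real credentials and should be replaced with your own from your Reddit app settings:

import json

# Sketch only: one key/value pair, as RedditCred appears to expect.
# "YOUR_CLIENT_ID" and "YOUR_CLIENT_SECRET" are placeholders.
tokens = {"YOUR_CLIENT_ID": "YOUR_CLIENT_SECRET"}

with open('reddit_tokens.json', 'w') as f:
    json.dump(tokens, f)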

0 commit comments
