Commit 1bcfdfe

Merge pull request avinashkranjan#88 from Kreateer/redditmemescraper
Reddit Meme Scraper
2 parents af94fea + 7e719d0 commit 1bcfdfe

File tree: 8 files changed, 147 additions, 0 deletions

Reddit Meme Scraper/.gitignore

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
venv/
.idea/
*.csv
Test/
reddit_tokens.json
scriptcopy.py
*.jpg
*.jpeg

Reddit Meme Scraper/README.md

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
# Reddit Meme Scraper

This script locates and downloads images from several subreddits (r/deepfriedmemes, r/surrealmemes, r/nukedmemes, r/bigbangedmemes, r/wackytictacs, r/bonehurtingjuice) onto your local system.

For the sake of simplicity (and so that your system doesn't get stuffed full of images), the **download is limited to 25 (total) images per run**.
However, you are **welcome to modify that limit** to whatever amount you'd like, or **remove it** altogether; if you do, make sure you also **update `sg.ProgressBar()`** so it properly represents the download progress (see the sketch below this file).

## Usage

Make sure you have installed the **necessary packages** listed in **`requirements.txt`**, then simply run **`script.py`**.
You'll be greeted by a popup window asking where to download the images, after which the download will begin.

## Screenshots

Some screenshots showing how the script works:

![Popup Asking For Destination Folder](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Popup_Win_01.PNG)

![Progress Bar Window](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Popup_Win_03.PNG)

![Popup Informing The User Where The Files Are Located](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Popup_Win_02.PNG)

![Console Output](https://raw.githubusercontent.com/Kreateer/Amazing-Python-Scripts/redditmemescraper/Reddit%20Meme%20Scraper/images/RM_Scraper_Console.PNG)
Reddit Meme Scraper/images/: 4 screenshot files added (58 KB, 5.69 KB, 6.82 KB, 3.45 KB)
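
If you raise or remove the 25-image cap mentioned in the README, two places in script.py have to stay in sync: the limit passed to subreddit.hot() and the maximum passed to sg.ProgressBar(). A minimal sketch of that change, assuming the rest of script.py is left as committed; MEME_LIMIT is an illustrative name introduced here, not something the script defines:

MEME_LIMIT = 100  # hypothetical constant for this sketch; script.py hard-codes 25 in both spots

# Fetch at most MEME_LIMIT hot posts from the combined subreddits
posts = subreddit.hot(limit=MEME_LIMIT)

# Give the progress bar the same maximum so update_bar(index + 1) fills it correctly
layout = [[sg.Text("Downloading files...", key='textkey')],
          [sg.ProgressBar(MEME_LIMIT, orientation='h', size=(20, 20), key='progbar')],
          [sg.Cancel()]]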

Reddit Meme Scraper/requirements.txt

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
praw
pandas
PySimpleGUI
wget

Reddit Meme Scraper/script.py

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
import praw
import PySimpleGUI as sg
import wget
import pandas as pd
import datetime as dt
import json
import os

destination_folder = sg.popup_get_folder('Choose where to download files:\n\n'
                                         'NOTE: A folder to store the files will be created within the directory!',
                                         default_path='', title='Choose destination')
folder_lst = [destination_folder]
if folder_lst[0] is None:
    sg.Popup('Destination not specified!\nProgram terminated!', title='ERROR: No destination!',
             custom_text='Close', button_type=0)
    raise SystemExit()


class RedditCred:
    def __init__(self):
        self.text_file = 'reddit_tokens.json'

    # Functions made to read the reddit app id and secret from file
    def read_id(self):
        file = self.text_file
        with open(file, 'r') as f:
            data = json.load(f)
            keys = data.keys()
            return str(*keys)

    def read_secret(self):
        file = self.text_file
        with open(file, 'r') as f:
            data = json.load(f)
            value = data.values()
            return str(*value)


red_cred = RedditCred()
u_agent = 'Script that downloads memes from various subreddits'

reddit = praw.Reddit(client_id=red_cred.read_id(),
                     client_secret=red_cred.read_secret(),
                     user_agent=u_agent)

subreddit = reddit.subreddit('deepfriedmemes+surrealmemes+nukedmemes+bigbangedmemes+wackytictacs+bonehurtingjuice')
posts = subreddit.hot(limit=25)

# Empty lists to hold data

image_urls = []
image_titles = []
image_scores = []
image_timestamps = []
image_ids = []
image_extensions = ['.jpg', '.jpeg', '.png']

# This iterates through posts and collects their data into lists

for post in posts:
    image_urls.append(post.url.encode('utf-8'))
    image_titles.append(post.title.encode('utf-8'))
    image_scores.append(post.score)
    image_timestamps.append(dt.datetime.fromtimestamp(post.created))
    image_ids.append(post.id)

# This creates a GUI window with a progress bar to keep track of the download

layout = [[sg.Text("Downloading files...", key='textkey')],
          [sg.ProgressBar(25, orientation='h', size=(20, 20), key='progbar')],
          [sg.Cancel()]]

window = sg.Window('Download in Progress', layout)

# This iterates through the URLs, checks whether each one ends in one of the specified image extensions, and downloads the image

for index, url in enumerate(image_urls):
    path = str(folder_lst[0])
    # The URLs were stored as bytes above, so str(url)[2:-1] strips the surrounding b'...' wrapper
    file_ending = str(url)[2:-1]
    event, values = window.read(timeout=0)
    _, extension = os.path.splitext(file_ending)
    if extension in image_extensions:
        try:
            if os.path.exists(path + '/' + 'Downloaded Images'):
                pass
            else:
                os.mkdir(path + '/' + 'Downloaded Images')
            if event == 'Cancel' or event == sg.WIN_CLOSED:
                break

            destination = str(folder_lst[0]) + '/' + 'Downloaded Images' + '/'
            window['progbar'].update_bar(index + 1)
            print(f"Downloading '{str(image_titles[index])[2:-1]}' to '{path}' from '{str(image_urls[index])[2:-1]}'")
            download = wget.download(str(image_urls[index])[2:-1], out=destination)
        except Exception:
            print(f"Something went wrong while downloading '{str(image_urls[index])[2:-1]}'\n")
else:
    # for-else: runs only if the loop finished without being cancelled
    print("\nDownload complete!")
    window.close()
    sg.Popup(f"Files downloaded into:\n\n'{path}/Downloaded Images'", title='Download complete!')


# Optional saving of collected data to .csv file

dataframe = pd.DataFrame({
    'Title': image_titles,
    'Score': image_scores,
    'URL': image_urls,
    'Timestamp': image_timestamps,
    'ID': image_ids
})
csv = dataframe.to_csv('./images.csv', index=True, header=True)
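
script.py reads its credentials from reddit_tokens.json (which .gitignore deliberately keeps out of version control), but the expected layout of that file isn't documented. Judging from RedditCred.read_id() and read_secret(), which return the JSON object's keys and values respectively, the file appears to hold a single pair whose key is the Reddit app's client ID and whose value is its client secret. A minimal sketch of generating such a file under that assumption; the placeholder strings are not real credentials and should be replaced with your own from your Reddit app settings:

import json

# Sketch only: one key/value pair, as RedditCred appears to expect.
# "YOUR_CLIENT_ID" and "YOUR_CLIENT_SECRET" are placeholders.
tokens = {"YOUR_CLIENT_ID": "YOUR_CLIENT_SECRET"}

with open('reddit_tokens.json', 'w') as f:
    json.dump(tokens, f)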

0 commit comments
