
Commit 0456129

Added Reddit-scraping-and-flair-detection folder
1 parent ce7e971 commit 0456129

13 files changed: +4559 -0 lines changed

‎.DS_Store

12 KB
Binary file not shown.
6 KB
Binary file not shown.

‎Reddit-scraping-and-flair-detection/Exploratory-Data-Analysis(EDA).ipynb

Lines changed: 1015 additions & 0 deletions
Large diffs are not rendered by default.

‎Reddit-scraping-and-flair-detection/Modelling.ipynb

Lines changed: 1364 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Reddit Flair Detector
## Steps followed:

Each step is described, along with its code, in the notebooks.

### Step 1: Extraction of r/india data
Used the PRAW library for Python to extract the data.

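A minimal sketch of how the extraction might look with PRAW. The `FIELDS` list, the helper names, the `limit` value, and the choice of the `top` listing are assumptions for illustration; the notebook holds the code that was actually used.

```python
import csv

# Hypothetical choice of columns; the notebook may keep different fields.
FIELDS = ["id", "title", "selftext", "url", "score", "num_comments", "link_flair_text"]


def submission_to_row(submission):
    """Copy the chosen attributes of a submission into a plain dict."""
    return {field: getattr(submission, field, None) for field in FIELDS}


def fetch_rows(client_id, client_secret, user_agent, limit=100):
    """Fetch posts from r/india; needs network access and Reddit API credentials."""
    import praw  # imported here so the helpers around it work without PRAW installed

    reddit = praw.Reddit(
        client_id=client_id, client_secret=client_secret, user_agent=user_agent
    )
    return [submission_to_row(s) for s in reddit.subreddit("india").top(limit=limit)]


def write_csv(rows, path):
    """Write the rows to a CSV file with one column per field."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Keeping `submission_to_row` separate from the network call makes it possible to check the row layout without Reddit credentials.
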
### Step 2: Exploratory Data Analysis
Analysed the data with graphs, scatter plots, and correlation measures, using the matplotlib library for plotting.

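A rough sketch of this kind of analysis. Which columns were compared is not stated here, so the inputs are left generic, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def flair_counts(flairs):
    """Count how many posts carry each flair, most common first."""
    return Counter(flairs).most_common()


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def scatter_plot(xs, ys, x_label, y_label, path):
    """Save a scatter plot of ys against xs, titled with their correlation."""
    import matplotlib.pyplot as plt  # imported here so the helpers above need no matplotlib

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_title(f"r = {pearson(xs, ys):.2f}")
    fig.savefig(path)
    plt.close(fig)
```
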
### Step 3: Built the Reddit Flair Detector. Performed the following steps:
- Preprocessing the data: Removed stopwords and performed stemming on the data.
- Dividing into training and test sets: Split the dataset into training and test sets using a standard 0.7:0.3 ratio.
- Testing across classifiers: Tested three classifiers (Naive Bayes, SVM, and Logistic Regression) and checked the accuracy of each.
- Saving the model: Saved the model with the highest accuracy in a .sav file to use for prediction.
- Model testing: Takes an input URL from the user and returns the predicted and actual flairs. The saved model is called to produce the predicted flair.

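The steps above can be sketched roughly as follows. This is an assumption-laden outline, not the notebook's code: the tiny `STOPWORDS` set, the TF-IDF features, the specific scikit-learn estimators (`MultinomialNB`, `LinearSVC`, `LogisticRegression`), the NLTK `PorterStemmer`, and the file name `model.sav` are all illustrative choices.

```python
import pickle
import re

# Tiny illustrative stopword list; a fuller one (for example NLTK's) is more realistic.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "to", "and", "for"}


def preprocess(text, stem=lambda word: word):
    """Lowercase, keep alphabetic tokens, drop stopwords, then apply `stem` to each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(stem(t) for t in tokens if t not in STOPWORDS)


def train_and_save(texts, flairs, model_path="model.sav", seed=42):
    """Split 70:30, score three classifiers on the test set, pickle the most accurate."""
    # Third-party imports are kept local so `preprocess` works without them installed.
    from nltk.stem import PorterStemmer
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    stemmer = PorterStemmer()
    cleaned = [preprocess(t, stemmer.stem) for t in texts]
    x_train, x_test, y_train, y_test = train_test_split(
        cleaned, flairs, test_size=0.3, random_state=seed
    )
    candidates = {
        "naive_bayes": MultinomialNB(),
        "svm": LinearSVC(),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    scores, fitted = {}, {}
    for name, classifier in candidates.items():
        model = make_pipeline(TfidfVectorizer(), classifier)
        model.fit(x_train, y_train)
        scores[name] = model.score(x_test, y_test)  # accuracy on the held-out 30%
        fitted[name] = model
    best = max(scores, key=scores.get)
    with open(model_path, "wb") as handle:
        pickle.dump(fitted[best], handle)
    return best, scores
```

At prediction time the saved pipeline can be restored with `pickle.load`, and new text should pass through the same `preprocess` call before `predict`.
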
### How it works:
The model reads the URLs in the file line by line and predicts the flair for each.
- The results are stored in a JSON file.

### Output:

Each entry is a key with the predicted flair as its value.

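A small sketch of the batch prediction flow described in the last two sections. It assumes the key is the post URL, which the text does not state explicitly; `predict_flair` is a hypothetical callable standing in for the saved model, and the output file name is an illustrative default.

```python
import json


def predict_from_file(url_path, predict_flair, output_path="predictions.json"):
    """Read one URL per line, predict a flair for each, and save {url: flair} as JSON."""
    predictions = {}
    with open(url_path, encoding="utf-8") as handle:
        for line in handle:
            url = line.strip()
            if url:  # skip blank lines
                predictions[url] = predict_flair(url)
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(predictions, handle, indent=2)
    return predictions
```

Passing the predictor in as a function keeps the file handling independent of how the model is loaded.
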

‎Reddit-scraping-and-flair-detection/WebScrapping and PreProcessing.ipynb

Lines changed: 936 additions & 0 deletions
Large diffs are not rendered by default.
140 KB

‎Reddit-scraping-and-flair-detection/data.csv

Lines changed: 1217 additions & 0 deletions
Large diffs are not rendered by default.
6.72 MB
Binary file not shown.
58.3 KB

0 commit comments