Commit f0b2aec

Merge pull request avinashkranjan#122 from KGene1901/issue115/KGene1901
Add LinkedIn Email Scraper program
2 parents 4bd2f63 + 46c69d1 commit f0b2aec

4 files changed: +94 -0 lines changed

‎LinkedIn Email Scraper/README.md

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
# LinkedIn Email Scraper

## Prerequisites

1. Run `pip install -r requirements.txt` to install the required libraries.
2. Make sure **chromedriver** is installed and on your PATH.
3. Have the **URL** of the LinkedIn post you want to scrape ready (*make sure the post has some emails in the comments section*).
4. Have your **LinkedIn** account credentials ready.

## Executing Application

1. Replace the values of the URL, email and password variables in the code with your own data.
2. Either hit **run** if your IDE has the option, or type `python main.py` in the terminal.
3. The names and corresponding email addresses scraped from the post should appear in the **emails.csv** file.
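Step 1 of "Executing Application" asks for the account email and password to be pasted into the source file. As a hedged alternative, the sketch below reads them from environment variables so they stay out of the code. The variable names `LINKEDIN_EMAIL` and `LINKEDIN_PASSWORD` and the helper `load_credentials` are assumptions made for this illustration; nothing in the commit defines them.

```python
import os


def load_credentials(env=None):
    """Return (email, password) read from a mapping of environment variables.

    LINKEDIN_EMAIL and LINKEDIN_PASSWORD are hypothetical names chosen for
    this sketch; they do not appear anywhere in the commit.
    """
    env = os.environ if env is None else env
    try:
        return env["LINKEDIN_EMAIL"], env["LINKEDIN_PASSWORD"]
    except KeyError as missing:
        # Fail with a readable message instead of a bare KeyError.
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Usage with an explicit mapping standing in for os.environ:
user_email, user_password = load_credentials(
    {"LINKEDIN_EMAIL": "user@example.com", "LINKEDIN_PASSWORD": "secret"}
)
```

The returned pair could then be passed to `LinkedInEmailScraper(userEmail, userPassword)` in place of the hard-coded placeholders.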

‎LinkedIn Email Scraper/emails.csv

Whitespace-only changes.
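The file is committed empty; after a run it is expected to hold a `name,email` header followed by one row per commenter. A minimal sketch of reading such a file back is shown here, using an in-memory sample. The helper `read_contacts` and the sample names and addresses are invented for illustration and are not part of the commit.

```python
import csv
import io

# Hypothetical sample of what emails.csv might contain after a run;
# the names and addresses are made up for this sketch.
sample = io.StringIO(
    "name,email\n"
    "Jane Doe,jane@example.com\n"
    "John Roe,john@example.com\n"
)


def read_contacts(handle):
    """Return a list of (name, email) tuples from a name,email CSV file."""
    return [(row["name"], row["email"]) for row in csv.DictReader(handle)]


contacts = read_contacts(sample)
print(contacts)
```

With a real file, the same function could be called on `open('emails.csv', newline='')` instead of the in-memory sample.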

‎LinkedIn Email Scraper/main.py

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import WebDriverException
from email_validator import validate_email, EmailNotValidError
import csv


def LinkedInEmailScraper(userEmail, userPassword):
    emailList = {}

    browser = webdriver.Chrome()
    url = '[INSERT URL TO LINKEDIN POST]'  # example => 'https://www.linkedin.com/posts/faangpath_hiring-womxn-ghc2020-activity-6721287139721650176-QFCV/'
    browser.get(url)  # visits page of the desired post

    # implicitly_wait does not pause; it sets how long element lookups keep retrying
    browser.implicitly_wait(5)

    commentDiv = browser.find_element(By.XPATH, '/html/body/main/section[1]/section[1]/div/div[3]/a[2]')  # finds comment button
    loginLink = commentDiv.get_attribute('href')
    browser.get(loginLink)

    email = browser.find_element(By.XPATH, '//*[@id="username"]')
    password = browser.find_element(By.XPATH, '//*[@id="password"]')
    email.send_keys(userEmail)  # inputs email in email field
    password.send_keys(userPassword)  # inputs password in password field
    submit = browser.find_element(By.XPATH, '//*[@id="app__container"]/main/div[3]/form/div[3]/button')
    submit.submit()  # submits form

    browser.implicitly_wait(5)

    commentSection = browser.find_element(By.CSS_SELECTOR, '.comments-comments-list')  # finds the comments section

    for _ in range(3):  # this can also be set to any number or "while True" if you want it to search through the whole comment section of the post
        try:
            moreCommentsButton = commentSection.find_element(By.CLASS_NAME, 'comments-comments-list__show-previous-container').find_element(By.TAG_NAME, 'button')
            moreCommentsButton.click()
            browser.implicitly_wait(5)
        except WebDriverException:  # no "show previous comments" button left, or it could not be clicked
            print('End of checking comments')
            break

    browser.implicitly_wait(20)

    comments = commentSection.find_elements(By.TAG_NAME, 'article')  # finds all individual comments

    for comment in comments:
        try:
            commenterName = comment.find_element(By.CLASS_NAME, 'hoverable-link-text')  # finds name of commenter
            commentText = comment.find_element(By.TAG_NAME, 'p')
            commenterEmail = commentText.find_element(By.TAG_NAME, 'a').get_attribute('innerHTML')  # finds email of commenter
            validEmail = validate_email(commenterEmail)  # validates email address
            commenterEmail = validEmail.email
        except (WebDriverException, EmailNotValidError):  # skips comments without a valid email link
            continue

        emailList[commenterName.get_attribute('innerHTML')] = commenterEmail

    browser.quit()
    return emailList


def DictToCSV(input_dict):
    '''
    Converts dictionary into csv
    '''
    # This path is relative to the current working directory, not to main.py
    with open('./LinkedIn Email Scraper/emails.csv', 'w') as f:
        f.write('name,email\n')
        for key in input_dict:
            f.write('%s,%s\n' % (key, input_dict[key]))
    # no explicit f.close() needed: the with block closes the file


if __name__ == '__main__':
    userEmail = '[INSERT YOUR EMAIL ADDRESS FOR LINKEDIN ACCOUNT]'
    userPassword = '[INSERT YOUR PASSWORD FOR LINKEDIN ACCOUNT]'

    emailList = LinkedInEmailScraper(userEmail, userPassword)
    DictToCSV(emailList)
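`DictToCSV` writes each row with plain string formatting, so a name that contains a comma would spill into an extra column. The sketch below shows one possible alternative built on the standard `csv` module, which quotes such fields automatically. The function name `dict_to_csv` and the sample entry are assumptions for illustration, and the function writes to any file-like object rather than to the fixed path used in the commit.

```python
import csv
import io


def dict_to_csv(input_dict, handle):
    """Write a {name: email} dictionary to handle as CSV with a header row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["name", "email"])
    for name, email in input_dict.items():
        # csv.writer quotes any field that contains the delimiter.
        writer.writerow([name, email])


# Hypothetical entry whose name contains a comma.
buffer = io.StringIO()
dict_to_csv({"Doe, Jane": "jane@example.com"}, buffer)
print(buffer.getvalue())
```

When writing to a real file with this approach, opening it with `newline=''` is the mode the `csv` documentation recommends.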

‎LinkedIn Email Scraper/requirements.txt

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
selenium
email-validator
