This repository was archived by the owner on Dec 22, 2023. It is now read-only.

Commit 935dcc9

Merge pull request #565 from kartavyashankar/kartavyashankar
LinkedIn Posts Scrapping
2 parents 94bcd88 + 664bad5 commit 935dcc9

4 files changed: +116 -0 lines changed
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# LinkedIn - Latest Posts (Based on User Interaction)

Signs in to the user's LinkedIn account and displays the latest posts from the feed as text (images and videos are not included).
You need the `Google Chrome` browser and a matching `chromedriver`; set the path to the chromedriver in `main.py`.
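
One way to avoid editing `main.py` on each machine is to read the chromedriver path from an environment variable. The following is a minimal sketch, assuming a hypothetical variable name `CHROMEDRIVER_PATH` and a hypothetical helper `resolve_driver_path`; neither is part of this commit.

```python
import os

# The path that main.py passes to webdriver.Chrome in this commit.
DEFAULT_DRIVER_PATH = "./chromedriver"


def resolve_driver_path(env=None):
    """Return the chromedriver path to use.

    CHROMEDRIVER_PATH is a hypothetical variable name chosen for this sketch.
    If it is unset or blank, fall back to the default path.
    """
    env = os.environ if env is None else env
    path = env.get("CHROMEDRIVER_PATH", "").strip()
    return path or DEFAULT_DRIVER_PATH


if __name__ == "__main__":
    # Prints the path that would be handed to the driver constructor.
    print(resolve_driver_path())
```

In `main.py`, the returned value could then replace the literal `'./chromedriver'` in the `webdriver.Chrome(...)` call.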

## Pre-Requisites

Run the command `pip install -r requirements.txt`

## To Run the File

For Windows - `python main.py`

For Ubuntu/Linux - `python3 main.py`

## Screenshots

### Screenshot of the console interaction

![Screenshot](image1.png)

## *Author Name*

[Kartavya Shankar](https://github.com/kartavyashankar)
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.options import Options
import re
import time
import getpass

# Sign in and validation part
print('Please sign in to your LinkedIn Account:')
u = input("Email or phone number: ")
p = getpass.getpass('Password: ')
print("Validating...")
chrome_options = Options()
chrome_options.add_argument("--window-size=1360,768")
chrome_options.add_argument("--headless")  # run Chrome without a visible window
driver = webdriver.Chrome('./chromedriver', options=chrome_options)  # Can replace this path with your chromedriver path
driver.get("https://www.linkedin.com")
unme = driver.find_element_by_id('session_key')
passw = driver.find_element_by_id('session_password')
unme.send_keys(u)
passw.send_keys(p)
passw.send_keys(Keys.ENTER)
cond = True
time.sleep(2)
# Still on a login page after submitting: treat the credentials as rejected
if driver.title in ("LinkedIn Login, Sign in | LinkedIn", "LinkedIn: Log In or Sign Up"):
    print('Invalid Username or Password')
    print('The program will now exit')
    driver.quit()
    cond = False
if cond:
    time.sleep(2)
    print('Fetching Info (This might take a while)...')
    body = driver.find_element_by_tag_name('body')
    # Scroll to the bottom repeatedly so the feed loads more posts
    for i in range(50):
        body.send_keys(Keys.CONTROL, Keys.END)
        time.sleep(5)
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    driver.quit()
    print('Done')
    # Fetching Posts
    divs = soup.find_all('div', attrs={'class': re.compile('feed-shared-update-v2 feed-shared-update-v2--minimal-padding full-height relative feed-shared-update-v2--e2e artdeco-card ember-view')})
    authors = []
    sdesc = []
    timestamp = []
    posts = []
    print('Fetching the latest posts for you...')
    for d in divs:
        author = d.find('div', attrs={'class': re.compile('feed-shared-actor__meta relative')})
        content = d.find('div', attrs={'class': re.compile('feed-shared-update-v2__description-wrapper ember-view')})
        try:
            name = author.find('span', attrs={'dir': 'ltr'})
            adesc = author.find('span', attrs={'class': 'feed-shared-actor__description t-12 t-normal t-black--light'})
            added = author.find('span', attrs={'class': 'visually-hidden'})
            post = content.find('span', attrs={'dir': 'ltr'})
            n = name.text
            ad = added.text
            ads = adesc.text
            po = post.text
        except AttributeError:
            # A lookup returned None, so this update lacks a field; skip it
            continue
        authors.append(n)
        sdesc.append(ads)
        timestamp.append(ad)
        posts.append(po)
    if len(authors) == 0:
        # Bots can be caught by linkedin website if used very frequently
        print("Oops! Seems the bot has crashed due to over-usage :(")
        print("Please try after 10 mins.")
        cond = False
if cond:
    print('Done')
    print('Choose the post you want to see :')
    for i in range(len(authors)):
        print("\t" + str(i + 1) + ". " + authors[i] + ". Added: " + timestamp[i])
    ans = "y"
    while ans == "y":
        try:
            ch = int(input("Enter your choice: "))
        except ValueError:
            ch = 0  # non-numeric input is reported as an invalid choice below
        if ch > len(authors) or ch < 1:
            print("Invalid Choice.")
        else:
            print(authors[ch - 1])
            print("Posted: " + timestamp[ch - 1])
            print("Author Description: " + sdesc[ch - 1])
            print(posts[ch - 1])
        ans = input('Want to see other posts? (y/n) ')
        print('')
print("Thank You")
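
The script keeps each post's author, description, timestamp and text in four parallel lists that are indexed together. As a possible alternative, the sketch below holds the same fields in one record and isolates the menu logic so it can be exercised without a browser. The `Post` dataclass and the `format_menu` and `pick` helpers are hypothetical names introduced for illustration, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Post:
    """One scraped feed entry (hypothetical record, not part of main.py)."""
    author: str
    description: str
    added: str
    text: str


def format_menu(posts: List[Post]) -> List[str]:
    """Build the numbered menu lines in the same layout main.py prints."""
    return [
        "\t{}. {}. Added: {}".format(i, p.author, p.added)
        for i, p in enumerate(posts, start=1)
    ]


def pick(posts: List[Post], raw_choice: str) -> Optional[Post]:
    """Return the post for a 1-based choice, or None if the input is invalid."""
    try:
        choice = int(raw_choice)
    except ValueError:
        return None
    if 1 <= choice <= len(posts):
        return posts[choice - 1]
    return None


if __name__ == "__main__":
    # Made-up sample data standing in for scraped posts.
    demo = [
        Post("Author A", "Engineer", "2h", "First post text"),
        Post("Author B", "Designer", "1d", "Second post text"),
    ]
    print("\n".join(format_menu(demo)))
    chosen = pick(demo, "2")
    if chosen is not None:
        print(chosen.text)
```

Keeping the fields in one record means a post can never end up with a timestamp from a different entry, which is possible when four lists are appended to separately.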
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
beautifulsoup4==4.9.3
bs4==0.0.1
selenium==3.141.0
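
The script relies on the `find_element_by_id` style of locator, which the pinned Selenium 3.141.0 provides. The sketch below shows one way the `name==version` pins could be read, for example to confirm that a 3.x Selenium is the one requested. It is a rough illustration only: `parse_pins` and `major` are hypothetical helpers, not part of this commit, and only the simple `==` form used in this file is handled.

```python
def parse_pins(text):
    """Map package name to pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if sep:
            pins[name.strip().lower()] = version.strip()
    return pins


def major(version):
    """Return the leading integer component of a dotted version string."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    requirements = "beautifulsoup4==4.9.3\nbs4==0.0.1\nselenium==3.141.0\n"
    pins = parse_pins(requirements)
    print(pins["selenium"], major(pins["selenium"]))
```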
