Commit 316695d

Add files via upload
0 parents commit 316695d

2 files changed: +26 -0 lines changed

‎README.md‎

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+Web Scraping in Python
+======================
+
+extract.py:
+
+- This code uses the BeautifulSoup library to extract the links from any web page.
+
+- The user enters the website from which links are to be extracted, without the "http://" prefix, which the script adds itself.
+
+- This code looks up every "a" tag in the HTML and prints its "href" value, which lists all the links embedded in the web page.
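To make the "a" tag idea concrete, here is a minimal sketch that collects `href` values from a fixed HTML string. It is an illustration, not the code in this commit: it swaps BeautifulSoup for the standard library's `html.parser` so it runs without extra packages or network access, and the names `LinkCollector`, `extract_links`, and `sample` are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        if tag == "a":
            # Like link.get('href'), this yields None when href is absent.
            self.links.append(dict(attrs).get("href"))


def extract_links(html):
    """Return the href of each <a> tag found in the given HTML text."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical sample markup: one relative link, one anchor without
# an href, and one absolute link written in upper case.
sample = (
    '<p><a href="/about">About</a> '
    '<a name="top">Top</a> '
    '<A HREF="http://example.com/">Example</A></p>'
)
print(extract_links(sample))
```

An anchor without an `href` attribute shows up as `None`, which is also what `link.get('href')` returns in extract.py for such tags.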

‎extract.py‎

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
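+# Taken from http://www.pythonforbeginners.com/python-on-the-web/web-scraping-with-beautifulsoup/
+
+from bs4 import BeautifulSoup
+
+import requests
+# Ask for a host name such as "example.com"; the scheme is prepended below.
+url = input("Enter a website to extract the URLs from: ")
+
+r = requests.get("http://" + url)
+
+data = r.text
+# Name the parser explicitly so the result does not depend on installed parsers.
+soup = BeautifulSoup(data, "html.parser")
+# Print the href of every <a> tag; tags without an href print None.
+for link in soup.find_all('a'):
+    print(link.get('href'))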
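The script prints each `href` exactly as it appears in the page, so relative links and `None` (for anchors without an `href`) come through unchanged. One possible follow-up step, sketched here under the assumption that absolute URLs are wanted, is to resolve each value against the page address with `urllib.parse.urljoin`. The helper name `absolutize` and the sample values are hypothetical and not part of this commit.

```python
from urllib.parse import urljoin


def absolutize(base_url, hrefs):
    """Resolve each href against base_url, skipping missing or empty ones."""
    return [urljoin(base_url, href) for href in hrefs if href]


# Hypothetical hrefs of the kind link.get('href') can return.
hrefs = ["/about", None, "contact.html", "http://example.org/x", "#top"]

for link in absolutize("http://example.com/docs/", hrefs):
    print(link)
```

In extract.py the base would be the address that was fetched, that is `"http://" + url`.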

