Commit 057e4ef

Added DOM Extraction Script

1 parent a9f85d5 commit 057e4ef

Lines changed: 26 additions & 0 deletions

`@@ -0,0 +1,26 @@`
`+import requests`
`+from bs4 import BeautifulSoup`
`+`
`+# Define the URL of the website you want to extract the DOM from`
`+url = 'https://www.facebook.com'`
`+`
`+response = requests.get(url)`
`+`
`+if response.status_code == 200:`
`+    soup = BeautifulSoup(response.text, 'html.parser')`
`+`
`+`
`+    title = soup.title`
`+    if title:`
`+        print("Page Title:", title.text)`
`+    else:`
`+        print("No title tag found.")`
`+`
`+`
`+    links = soup.find_all('a')`
`+    print("Links in the page:")`
`+    for link in links:`
`+        print(link.get('href'))`
`+`
`+else:`
`+    print("Failed to retrieve the page. Status code:", response.status_code)`
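The script above prints every `href` as-is, so anchors without an `href` show up as `None` and relative links stay relative. A minimal sketch of one way to handle both cases follows. It is not part of this commit: it swaps BeautifulSoup for the standard library's `html.parser.HTMLParser` so it runs without third-party packages, and the names `TitleAndLinkParser` and `extract_title_and_links` are hypothetical, introduced only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TitleAndLinkParser(HTMLParser):
    """Collects the first <title> text and every non-empty <a href> value."""

    def __init__(self):
        super().__init__()
        self.title = None       # stays None when the page has no <title>
        self.hrefs = []         # raw href strings, in document order
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            # Only the first <title> is recorded.
            self._in_title = True
            self.title = ""
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors with a missing or empty href
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title_and_links(html, base_url):
    """Return (title, links), with each link resolved against base_url."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    parser.close()
    title = parser.title.strip() if parser.title is not None else None
    links = [urljoin(base_url, href) for href in parser.hrefs]
    return title, links
```

Keeping the parsing step separate from the network call means it can be exercised on a literal HTML string. In the committed script, `response.text` and `url` would be the natural arguments, and passing a `timeout` to `requests.get` would keep the request from hanging indefinitely.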
