Commit 375151d

Create webscrapper.py
1 parent 46e81a1 commit 375151d

File tree

1 file changed: +61 -0 lines changed


R.LOKESH/task6/webscrapper.py

import requests
from bs4 import BeautifulSoup
import csv


def scrape_website(url):
    # Send a GET request to the URL (with a timeout so a hung server cannot block the script)
    response = requests.get(url, timeout=10)

    # Check if the request was successful (status code 200)
    if response.status_code == 200:
        # Parse the HTML content of the page
        soup = BeautifulSoup(response.text, 'html.parser')

        # Find the elements containing the data you want to extract
        # Replace 'example' with the actual HTML tags and classes/IDs
        data_elements = soup.find_all('div', class_='example')

        # Extract data from the elements and store it in a list of dictionaries
        scraped_data = []
        for element in data_elements:
            title = element.find('h2')
            description = element.find('p')

            # Skip elements missing the expected tags rather than
            # crashing on a None result from find()
            if title is None or description is None:
                continue

            scraped_data.append({
                'title': title.text.strip(),
                'description': description.text.strip()
            })

        return scraped_data
    else:
        print("Error: Failed to fetch website")
        return []


def save_to_csv(data, filename):
    # An empty list has no first row to take the header from
    if not data:
        print("No data to save")
        return

    # Define the CSV header based on the keys of the first dictionary in the list
    fields = list(data[0].keys())

    # Write data to the CSV file
    with open(filename, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fields)

        # Write header
        writer.writeheader()

        # Write rows
        for row in data:
            writer.writerow(row)


def main():
    url = "https://example.com"
    filename = "data.csv"

    # Scrape the website
    scraped_data = scrape_website(url)

    # Save the data to CSV
    save_to_csv(scraped_data, filename)

    print(f"Data has been scraped and saved to {filename}")


if __name__ == "__main__":
    main()
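
Note that scrape_website only distinguishes "status 200" from "anything else"; network-level failures such as timeouts, DNS errors, or refused connections would instead raise an exception from requests and crash the script. A more defensive fetch could wrap the request as in the sketch below — a minimal illustration, not part of this commit, and the fetch_html helper name is an assumption:

import requests

def fetch_html(url):
    # Return the page HTML, or None if the request fails for any reason.
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # raise HTTPError for 4xx/5xx responses
        return response.text
    except requests.exceptions.RequestException as exc:
        # Covers timeouts, connection errors, and HTTP errors alike
        print(f"Error: failed to fetch {url}: {exc}")
        return None

scrape_website could then call fetch_html and return [] when it gets None back, leaving the BeautifulSoup parsing unchanged.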

