Commit 25cb2a6

committed

udpate README

1 parent dc2b9ba commit 25cb2a6

File tree

1 file changed

+49

-0

lines changed

README.md

`README.md`

Lines changed: 49 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -250,3 +250,52 @@ for i in range(40):`
`250`	`250`	How do I know that we have to increment by ```30```? I checked the pattern of the
`251`	`251`	URLs by visiting the pages. We stop at ```270``` so that we only request 10 pages.
`252`	`252`	You can use whatever stop value you want, but it should be a multiple of ```30```.
	`253`	`+`
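As a minimal sketch of the pagination pattern described above, the snippet below builds the search URLs for the first 10 pages by stepping the ```start``` parameter in increments of ```30```. The helper name ```page_urls``` and the ```PAGE_SIZE``` constant are illustrative assumptions, not part of the original ```formatting_url.py```; the query parameters are taken from the search URL used in this README.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.yelp.com/search"
PAGE_SIZE = 30  # assumed results per page, based on the observed URL pattern


def page_urls(pages, desc="Restaurants", loc="los angeles"):
    """Return one search URL per page (hypothetical helper)."""
    urls = []
    # start takes the values 0, 30, 60, ... one per requested page
    for start in range(0, pages * PAGE_SIZE, PAGE_SIZE):
        query = urlencode({"find_desc": desc, "find_loc": loc, "start": start})
        urls.append(f"{BASE_URL}?{query}")
    return urls


urls = page_urls(10)
print(urls[1])   # second page, start=30
print(urls[-1])  # tenth page, start=270
```

With 10 pages the last offset is ```270```, which matches the stop value mentioned above.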
	`254`	`+# Reading Restaurant Title`
	`255`	+Now we will reuse the code that we wrote in ```formatting_url.py``` and
	`256`	`+extract the piece of text that we need from the HTML tags: the title of`
	`257`	`+each restaurant on a search page.`
	`258`	`+`
	`259`	`+Visit the [url](https://www.yelp.com/search?find_desc=Restaurants&find_loc=los+angeles&start=30), open the developer tools,`
	`260`	`+point at a restaurant block (title, rating, reviews, etc.), and find the`
	`261`	+```li``` tag with class ```regular-search-result```.
	`262`	`+`
	`263`	+We will use this class to find the matching ```li``` tags in the response
	`264`	+using ```BeautifulSoup```.
	`265`	`+`
	`266`	`+reading_name.py`
	`267`	+```
	`268`	`+import requests`
	`269`	`+...`
	`270`	`+info_block = soup.findAll('li', {'class': 'regular-search-result'})`
	`271`	`+print(info_block)`
	`272`	+```
	`273`	`+`
	`274`	`+Run the file and you should see the whole li tag and its inner tags printed. But we want`
	`275`	`+to extract the title of the restaurant from each li tag. For that, we have to find`
	`276`	`+the class used for the restaurant title.`
	`277`	`+`
	`278`	+The title is wrapped inside an anchor tag with class ```biz-name```.
	`279`	`+`
	`280`	+```
	`281`	`+info_block = soup.findAll('a', {'class': 'biz-name'})`
	`282`	`+print(info_block)`
	`283`	`+`
	`284`	`+count = 0`
	`285`	`+for info in info_block:`
	`286`	`+    print(info.text)`
	`287`	`+    count += 1`
	`288`	`+`
	`289`	`+print(count)`
	`290`	+```
	`291`	`+`
	`292`	+Printing the ```text``` of each HTML tag gives us the title of the restaurant. These are
	`293`	+not all the titles, because some blocks don't have the ```biz-name``` class, but we have what we
	`294`	`+need.`
	`295`	`+`
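The sketch below illustrates the point about blocks that lack a ```biz-name``` link. It walks each ```li``` with class ```regular-search-result``` and skips the ones where no title anchor is found. To stay runnable without a network request it parses a small, made-up HTML sample; the sample markup, the restaurant names, and the helper name ```restaurant_titles``` are assumptions for illustration, and the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup standing in for a real search page.
SAMPLE_HTML = """
<ul>
  <li class="regular-search-result"><a class="biz-name" href="/biz/a">Cafe A</a></li>
  <li class="regular-search-result"><span>Block without a title link</span></li>
  <li class="regular-search-result"><a class="biz-name" href="/biz/b">Diner B</a></li>
</ul>
"""


def restaurant_titles(html):
    """Collect the title text from every result block that has one."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for block in soup.find_all("li", class_="regular-search-result"):
        link = block.find("a", class_="biz-name")
        if link is None:
            # Some blocks have no biz-name anchor, so there is no title to read.
            continue
        titles.append(link.get_text(strip=True))
    return titles


titles = restaurant_titles(SAMPLE_HTML)
print(titles)
print(len(titles))
```

Searching inside each ```li``` block, rather than across the whole page, keeps each title tied to its result block, which may be convenient once more fields are extracted per restaurant.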
	`296`	`+# Advanced Extraction`
	`297`	`+In this section we will go a little further and extract the name, address,`
	`298`	`+and phone number of each restaurant.`
	`299`	`+`
	`300`	+This time we will look for the ```div``` tag with class ```biz-listing-large```,
	`301`	`+which contains the restaurant details.`

0 commit comments
