
I've been trying to write a web scraper that collects the name, price, and district of each listing on a website, but I can't get anywhere with it because it raises an error:

AttributeError: 'NoneType' object has no attribute 'strip'.

What should I do? Also, how do I get to the second div? When I do districtcontainers = souped.find_all("div",{"class":"announcement-block-link"}) and then districtcontainers[0].div.div, I get no output. How can I solve that? Thank you very much for your attention and answers :).

import urllib.request as uReq
from bs4 import BeautifulSoup as soup
url = uReq.urlopen("https://www.bazaraki.com/real-estate/houses-and-villas-rent/larnaka-district-larnaca/")
html = url.read()
souped = soup(html,"html.parser")
containers = souped.find_all("div",{"class":"announcement-block-text-container"})
districtcontainers = souped.find_all("div",{"class":"announcement-block__location"})
for container in containers:
    for districtcontainer in districtcontainers:
        title = container.a
        price = container.p
        district = districtcontainer
        print("{}:\n Costs: \n District:{}".format(title.string.strip(),price.string.strip(),district.string.strip()))
coldspeed95
asked Jul 24, 2017 at 15:40
  • Try title.text or title.content? Commented Jul 24, 2017 at 15:41
  • trying, it says EOF error yet Commented Jul 24, 2017 at 15:53
  • It works,but it returns every item like a hundred times Commented Jul 24, 2017 at 15:58
  • What do you mean? Commented Jul 24, 2017 at 15:59
  • Are you really using soup.find_all()? Instead you should be using soup.findAll() Commented Jul 24, 2017 at 16:02

1 Answer


First, the format string in your last print statement has only two {} placeholders, but you pass three values to format(). That is just a typo: format() ignores the extra argument without complaint, so it isn't the cause of the error.
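To see the placeholder point in isolation, here is a minimal sketch. The sample values are made up for illustration and do not come from the site; the three-placeholder layout is only one guess at what the output was meant to look like.

```python
# Sample values are invented for illustration; they are not real listings.
title, price, district = "Villa with pool", "1200 EUR", "Larnaca"

# Two placeholders, three arguments: str.format silently drops the extra one,
# so the price lands in the "District" slot and the district is never printed.
two_fields = "{}:\n Costs: \n District:{}".format(title, price, district)

# Three placeholders, so each value ends up where it was presumably intended.
three_fields = "{}:\n Costs: {}\n District: {}".format(title, price, district)

print(two_fields)
print(three_fields)
```

No exception is raised in either case, which is why this typo cannot explain the AttributeError.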

Second, it looks like one of your variables is being assigned None. BeautifulSoup returns None instead of raising an error when you ask for something that isn't there: container.a is None if the container has no a tag, and .string is None if the tag has more than one child. Check the page's HTML to make sure each value you are looking for is where your code expects it.
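A small sketch of the ways a lookup can come back as None. The markup below is hypothetical (the real page's structure may well differ); it only reuses the class name from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; not copied from the real page.
html = """
<div class="announcement-block-text-container">
  <a href="/adv/1/">Villa <b>with pool</b></a>
  <p>1200 EUR</p>
</div>
"""
container = BeautifulSoup(html, "html.parser").div

print(container.a.string)   # None: <a> holds a text node plus a <b> tag
print(container.p.string)   # the single text child of <p>
print(container.span)       # None: the container has no <span> at all

# get_text() collects all nested text, so it still works for the <a> tag.
print(container.a.get_text(" ", strip=True))
```

Calling .strip() on either of the None results reproduces the AttributeError from the question.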

Since the exception is raised while running the print call, check that title.string, price.string and district.string each return a value before you call .strip() on them.
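One way to do that check is a small guard around each lookup. This is a sketch, not a drop-in fix: text_or_default is a hypothetical helper, the markup is invented, and pairing the two lists with zip assumes they line up one-to-one on the real page, which has not been verified. Looping over both lists in nested for loops instead prints every combination, which matches the repeated output mentioned in the comments.

```python
from bs4 import BeautifulSoup

def text_or_default(tag, default="n/a"):
    """Return the tag's stripped text, or `default` when the lookup gave None."""
    return tag.get_text(" ", strip=True) if tag is not None else default

# Hypothetical markup standing in for the listing page.
html = """
<div class="announcement-block-text-container"><a>House A</a><p>900 EUR</p></div>
<div class="announcement-block__location">Larnaca</div>
<div class="announcement-block-text-container"><a>House B</a></div>
<div class="announcement-block__location">Aradippou</div>
"""
souped = BeautifulSoup(html, "html.parser")
containers = souped.find_all("div", {"class": "announcement-block-text-container"})
locations = souped.find_all("div", {"class": "announcement-block__location"})

rows = []
# zip pairs the i-th container with the i-th location (assumed to correspond).
for container, location in zip(containers, locations):
    rows.append((
        text_or_default(container.a),
        text_or_default(container.p),   # the second listing has no <p>
        text_or_default(location),
    ))

for title, price, district in rows:
    print("{}:\n Costs: {}\n District: {}".format(title, price, district))
```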

As for getting to the second div in a page, you may want to check out the next_sibling attribute.
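A short sketch of sibling navigation on invented markup (the inner class names "first" and "second" are placeholders). Whitespace between two tags is itself a text node, so a plain next_sibling often lands on "\n"; find_next_sibling skips ahead to the next matching tag.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the outer class name comes from the question.
html = """<div class="announcement-block-link">
<div class="first">one</div>
<div class="second">two</div>
</div>"""
outer = BeautifulSoup(html, "html.parser").div
first = outer.div

# The newline between the two inner divs is the immediate next sibling.
print(repr(first.next_sibling))

# find_next_sibling("div") skips text nodes and returns the next <div> tag.
second = first.find_next_sibling("div")
print(second.get_text(strip=True))

# Alternative: list the direct child divs and index into them.
children = outer.find_all("div", recursive=False)
print(children[1].get_text(strip=True))
```

Both find_next_sibling and an out-of-range lookup can still yield None or raise IndexError when the element is missing, so the result is worth checking before use.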

answered Jul 24, 2017 at 16:18

2 Comments

Could you please help me with the .div.div thing? .next_sibling returns "\n". I can't parse any page because of this
You can chain next_sibling several times on a single element, so it may take more than one hop to get to the actual text or to the next HTML element; the "\n" you are seeing is the whitespace between two tags. I suggest next_sibling and possibly next_element because they walk the tree one node at a time. They can still return None at the end of the tree, so check the result to avoid another NoneType error like the one you're getting.
