Commit f65dbab

Merge pull request avinashkranjan#1060 from Himanshi2997/him
Real Estate Property Data avinashkranjan#1003
2 parents 8302b5b + 010d37a commit f65dbab

3 files changed

+112
-0
lines changed


Real Estate Webscrapper/README.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
## Real Estate Webscrapper
- Scrapes property listings from the Magicbricks real estate site and stores them in a CSV file, so the data is organised and available locally.

___

## Requirements
- requests
- BeautifulSoup (beautifulsoup4)
- Pandas

---
## How To install
> pip install requests

> pip install pandas

> pip install beautifulsoup4

---
- Run real_estate_webscrapper.py from the repository root to create scraped.csv.
- The script writes scraped.csv into the Real Estate Webscrapper folder, and the file can be opened using Microsoft Excel.
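
The script writes to `./Real Estate Webscrapper/scraped.csv`, a path relative to the current working directory, so the location of the CSV depends on where the script is launched from. A minimal sketch, assuming one wants the file to land next to the script regardless of the working directory; `output_path_for` is a hypothetical helper and is not part of this commit:

```python
from pathlib import Path


def output_path_for(script_file, filename="scraped.csv"):
    """Return the path of `filename` in the same folder as `script_file`."""
    return Path(script_file).resolve().parent / filename


# Inside the script one would typically pass __file__:
#     df.to_csv(output_path_for(__file__))
# The path below is only an illustration and does not need to exist.
example = output_path_for("/tmp/Real Estate Webscrapper/real_estate_webscrapper.py")
print(example)
```

Anchoring the path to the script's own location avoids a failed write when the `Real Estate Webscrapper` folder is not a subfolder of the working directory.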

---
### Step 1
- Load the website https://www.magicbricks.com/ready-to-move-flats-in-new-delhi-pppfs in your code using requests.

### Step 2
- Use the browser's Inspect tool on the website to find which div contains the information that we need.
- Use Beautiful Soup to parse the page and store the fields of each property in a dictionary.
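
Step 2 can be tried offline before touching the live site. The sketch below parses a small inline HTML string rather than the real page. The markup in `SAMPLE_HTML` is a hypothetical stand-in that borrows class names from the script, and `text_or_none` and `parse_cards` are illustrative helpers, not functions from this commit:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for two listing cards; the real page markup may differ.
SAMPLE_HTML = """
<div class="m-srp-card__container">
  <span class="m-srp-card__title">2 BHK Flat for Sale in Dwarka, New Delhi</span>
  <div class="m-srp-card__price">₹ 85 Lac</div>
</div>
<div class="m-srp-card__container">
  <span class="m-srp-card__title">3 BHK Flat for Sale in Rohini, New Delhi</span>
</div>
"""


def text_or_none(card, tag, css_class):
    """Return the stripped text of the first matching element, or None."""
    element = card.find(tag, class_=css_class)
    return element.get_text(strip=True) if element is not None else None


def parse_cards(html):
    """Build one dictionary per listing card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for card in soup.find_all("div", class_="m-srp-card__container"):
        records.append({
            "Title": text_or_none(card, "span", "m-srp-card__title"),
            "Price": text_or_none(card, "div", "m-srp-card__price"),
        })
    return records


records = parse_cards(SAMPLE_HTML)
print(records)
```

Returning `None` for a missing element keeps every dictionary on the same set of keys, which makes the later conversion to a table straightforward.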

### Step 3
- Use pandas to convert the list of dictionaries to a CSV file.
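
As an illustration of Step 3, the sketch below converts a hand-written list of dictionaries to CSV text. The records are made-up examples, and the CSV is written to an in-memory buffer so that nothing is created on disk. It also passes `index=False`, which is an assumption of this sketch: the script in this commit keeps the default and writes the row index as the first column.

```python
import io

import pandas

# Hypothetical records shaped like the per-property dictionaries from Step 2.
records = [
    {"Price": "85", "Size": "2 BHK", "Address": "Dwarka, New Delhi"},
    {"Price": "1.2", "Size": "3 BHK", "Address": None},
]

# Each dictionary becomes one row; the keys become the column names.
df = pandas.DataFrame(records)

# Write to an in-memory buffer; pass a file path instead to create a file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Missing values such as the `None` address are written as empty fields, so rows with incomplete data still line up with the header.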

---

## Author
[Himanshi2997](https://github.com/Himanshi2997)

---

## Output
![output2](https://user-images.githubusercontent.com/67272318/118381259-b8b71f80-b606-11eb-983d-5d8094d05f06.PNG)
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
import requests
from bs4 import BeautifulSoup
import pandas


# Send a browser-like User-Agent with the request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0',
}

r = requests.get("https://www.magicbricks.com/ready-to-move-flats-in-new-delhi-pppfs", headers=headers)
c = r.content
soup = BeautifulSoup(c, "html.parser")

complete_dataset = []

# Each search-result card holds one property listing.
all_containers = soup.find_all("div", {"class": "flex relative clearfix m-srp-card__container"})
for item in all_containers:
    item_data = {}

    # Price: try the standard card markup first, then the luxury card markup.
    try:
        Price = item.find("div", {"class": "m-srp-card__price"}).text.replace("\n", "").replace(" ", "").replace("₹", "")
        p = Price.split()
        item_data["Price"] = p[0]
    except (AttributeError, IndexError):
        try:
            Price = item.find("span", {"class": "luxury-srp-card__price"}).text.replace("\n", "").replace(" ", "").replace("₹", "")
            p = Price.split()
            item_data["Price"] = p[0]
        except (AttributeError, IndexError):
            item_data["Price"] = None

    # Price per square foot, with the same fallback.
    try:
        Pricepersqft = item.find("div", {"class": "m-srp-card__area"}).text.replace("₹", "")
        pr = Pricepersqft.split()
        item_data["Pricepersqft"] = pr[0]
    except (AttributeError, IndexError):
        try:
            Pricepersqft = item.find("span", {"class": "luxury-srp-card__sqft"}).text.replace("\n", "").replace(" ", "").replace("₹", "")
            pr = Pricepersqft.split()
            item_data["Pricepersqft"] = pr[0]
        except (AttributeError, IndexError):
            item_data["Pricepersqft"] = None

    # Size: the first five characters of the BHK label.
    try:
        item_data["Size"] = item.find("span", {"class": "m-srp-card__title__bhk"}).text.replace("\n", "").strip()[0:5]
    except AttributeError:
        item_data["Size"] = None

    # Address: the words that follow "sale"/"Sale" in the card title.
    title = item.find("span", {"class": "m-srp-card__title"})
    # Drop "in" as a whole word; removing the substring "in" would also alter names such as "Rohini".
    words = [w for w in title.text.split() if w != "in"]
    s = ""
    for i in range(len(words)):
        if words[i] == "sale" or words[i] == "Sale":
            s = " ".join(words[i + 1:])
            break
    item_data["Address"] = s

    # Carpet area, with the luxury card markup as a fallback.
    try:
        item_data["Carpet Area"] = item.find("div", {"class": "m-srp-card__summary__info"}).text
    except AttributeError:
        try:
            item_data["Carpet Area"] = item.find("div", {"class": "luxury-srp-card__area__value"}).text
        except AttributeError:
            item_data["Carpet Area"] = None

    complete_dataset.append(item_data)

# One row per listing; the path is relative to the current working directory.
df = pandas.DataFrame(complete_dataset)
df.to_csv("./Real Estate Webscrapper/scraped.csv")
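
The string handling for the address and price fields can be checked in isolation, without fetching the page. One pitfall is worth noting: removing the substring "in" from a title also alters place names, for example `"Rohini".replace("in", "")` gives `"Rohi"`. The sketch below works on whole words instead. `extract_address` and `clean_price` are hypothetical helper names, and the sample titles are made up for illustration:

```python
def extract_address(title):
    """Return the words that follow "sale" in a listing title, minus a leading "in"."""
    words = title.split()
    for index, word in enumerate(words):
        if word.lower() == "sale":
            rest = words[index + 1:]
            # Drop the connecting word only when it is a whole token.
            if rest and rest[0].lower() == "in":
                rest = rest[1:]
            return " ".join(rest)
    return ""  # the title has no "sale" marker


def clean_price(raw):
    """Strip newlines, spaces and the rupee sign from a price string."""
    return raw.replace("\n", "").replace(" ", "").replace("₹", "")


print(extract_address("3 BHK Flat for Sale in Rohini, New Delhi"))
print(clean_price("\n ₹ 85 Lac \n"))
```

Keeping these steps in small functions makes it possible to test them against sample titles whenever the site's wording changes.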
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
requests==2.25.1
pandas==1.2.4
beautifulsoup4==4.9.3
