I am trying to run a simple script on my AWS instance. The same script works fine on Windows 7 and Ubuntu (Python 2.7), but when I run it on my server, the website redirects me to an error page that says "you must enable js on your browser".
I have tried several things so far (setting a User-Agent, a redirect handler, mechanize). I get this redirect only with the domain below; all other JavaScript-enabled websites work fine.
Do you have any idea what is going on?
import urllib2
req = urllib2.Request("http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay")
response = urllib2.urlopen(req)
the_page = response.read()
print the_page
EDIT: It turns out the website is blocking my server's IP address. Thanks for the help.
1 Answer
There's no error in your code.
What it lacks is a JavaScript interpreter.
urllib2 only fetches the raw data; it does not interpret the JavaScript code in the page.
See this question: How to interpret JavaScript with Python
Also, it works fine with the following code:
import requests
session = requests.Session()
session.get('http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay').content.decode('utf8')
It returns a lot of HTML, like this:
<li class="">\n Çamaşır Makinesi</li>\n <li class="">\n Çamaşır Odası</li>\n <li class="selected">\n Çelik Kapı</li>\n <li class="">\n Şofben</li>\n <li class="">\n Şömine</li>\n </ul>\n <h3>Dış Özellikler</h3>\n <ul>\n <li class="">\n Asansör</li>\n <li class="">\n Engelliye Uygun</li>\n <li class="">\n Güvenlik</li>\n <li class="selected">\n Hidrofor</li>\n <li class="selected">\n Isı Yalıtım</li>\n <li class="">\n Jeneratör</li>\n <li class="selected">\n Kablo TV - Uydu</li>\n <li class="">\n Kapalı Garaj</li>\n <li class="">\n Kapıcı</li>\n <li class="">\n Kreş</li>\n <li class="">\n Otopark</li>\n <li class="">\n Oyun Parkı</li>\n <li class="selected">\n Ses Yalıtımı</li>\n <li class="">\n Siding</li>\n <li class="">\n Spor Alanı</li>\n <li class="selected">\n Su Deposu</li>\n <li class="">\n Tenis Kortu</li>\n <li class="">\n Yangın Merdiveni</li>\n <li class="">\n Yüzme Havuzu (Açık)</li>\n <li class="">\n Yüzme Havuzu (Kapalı)</li>\n </ul>\n <h3>Muhit</h3>\n <ul>\n <li class="selected">\n Alışveriş Merkezi</li>\n <li class="">\n Belediye</li>\n <li class="selected">\n Cami</li>\n <li class="">\n Cemevi</li>\n <li class="">\n Denize Sıfır</li>\n <li class="selected">\n Eczane</li>\n <li class="">\n Eğlence Merkezi</li>\n <li class="">\n Fuar</li>\n <li class="selected">\n Hastane</li>\n <li class="">\n Havra</li>\n <li class="">\n Kilise</li>\n <li class="">\n Lise</li>\n <li class="selected">\n Market</li>\n <li class="selected">\n Park</li>\n <li class="">\n Polis Merkezi</li>\n <li class="selected">\n Sağlık Ocağı</li>\n <li class="selected">\n Semt Pazarı</li>\n <li class="">\n Spor Salonu</li>\n <li class="">\n Üniversite</li>\n <li class="selected">\n İlköğretim</li>\n <li class="">\n İtfaiye</li>\n <li class="">\n Şehir Merkezi</li>\n </ul>\n <h3>Ulaşım</h3>\n <ul>\n <li class="">\n Anayol</li>\n <li class="">\n Boğaz Köprüleri</li>\n <li class="selected">\n Cadde</li>\n <li class="">\n Deniz Otobüsü</li>\n <li class="">\n Dolmuş</li>\n <li 
class="selected">\n E-5</li>\n <li class="">\n Havaalanı</li>\n <li class="">\n Marmaray</li>\n <li class="selected">\n Metro</li>\n <li class="">\n Metrobüs</li>\n <li class="selected">\n Minibüs</li>\n <li class="">\n Otobüs Durağı</li>\n <li class="">\n Sahil</li>\n <li class="">\n TEM</li>\n <li class="">\n Tramvay</li>\n <li class="">\n Tren İstasyonu</li>\n <li class="">\n İskele</li>\n </ul>\n <h3>Manzara</h3>\n <ul>\n <li class="">\n Boğaz</li>\n <li class="">\n Deniz</li>\n <li class="">\n Doğa</li>\n <li class="">\n Göl</li>\n <li class="selected">\n Şehir</li>\n </ul>\n <h3>Konut Tipi</h3>\n <ul>\n <li class="">\n Ara Kat Dubleks</li>\n <li class="">\n Bahçe Dubleksi</li>\n <li class="">\n Bahçe Katı</li>\n <li class="">\n Bahçeli</li>\n <li class="">\n Müstakil Girişli</li>\n <li class="">\n Tripleks</li>\n <li class="">\n Çatı Dubleksi</li>\n </ul>\n </div>\n </div>\n<script type="text/javascript">\n var bannerZoneId = "101";\n</script>\n\n<div class="uiBox">\n <div class="uiBoxTitle">\n <h3>Hadi Taşının!</h3>\n </div>\n <div class="uiBoxContainer" id="adHelperBoxMov">\n <div class="helper">\n <ul>\n <script type="text/javascript">\n var classifiedFooterZone9 = "&PAGE_NAME=ilan_detay_zone_9&CATEGORY_ID=16633&PARENT_ID=16623&CATEGORY_LEVEL_0=3518&CATEGORY_LEVEL_1=3613&CATEGORY_LEVEL_2=16623&CATEGORY_LEVEL_3=16633&CATEGORY_LEVEL_4=0&CATEGORY_LEVEL_5=0&CATEGORY_LEVEL_6=0&LANGUAGE=tr&CITY_ID=34&DISTRICT_ID=2177&TOWN_ID=446&QUARTER_ID=23171" + cAttributes;\n var classifiedFooterZone10 = "&PAGE_NAME=ilan_detay_zone_10&CATEGORY_ID=16633&PARENT_ID=16623&CATEGORY_LEVEL_0=3518&CATEGORY_LEVEL_1=3613&CATEGORY_LEVEL_2=16623&CATEGORY_LEVEL_3=16633&CATEGORY_LEVEL_4=0&CATEGORY_LEVEL_5=0&CATEGORY_LEVEL_6=0&LANGUAGE=tr&CITY_ID=34&DISTRICT_ID=2177&TOWN_ID=446&QUARTER_ID=23171" + cAttributes;\n var classifiedFooterZone11 = 
"&PAGE_NAME=ilan_detay_zone_11&CATEGORY_ID=16633&PARENT_ID=16623&CATEGORY_LEVEL_0=3518&CATEGORY_LEVEL_1=3613&CATEGORY_LEVEL_2=16623&CATEGORY_LEVEL_3=16633&CATEGORY_LEVEL_4=0&CATEGORY_LEVEL_5=0&CATEGORY_LEVEL_6=0&LANGUAGE=tr&CITY_ID=34&DISTRICT_ID=2177&TOWN_ID=446&QUARTER_ID=23171" + cAttributes;\n var classifiedFooterZone12 = "&PAGE_NAME=ilan_detay_zone_12&CATEGORY_ID=16633&PARENT_ID=16623&CATEGORY_LEVEL_0=3518&CATEGORY_LEVEL_1=3613&CATEGORY_LEVEL_2=16623&CATEGORY_LEVEL_3=16633&CATEGORY_LEVEL_4=0&CATEGORY_LEVEL_5=0&CATEGORY_LEVEL_6=0&LANGUAGE=tr&CITY_ID=34&DISTRICT_ID=2177&TOWN_ID=446&QUARTER_ID=23171" + cAttributes;\n\n getBanner(bannerZoneId, classifiedFooterZone9);\n getBanner(bannerZoneId, classifiedFooterZone10);\n getBanner(bannerZoneId, classifiedFooterZone11);\n getBanner(bannerZoneId, classifiedFooterZone12);\n </script>\n </ul>\n </div>\n
You could use the geturl() method to determine whether your URL was redirected (the website might really be generating the message you got based on your server's IP address, etc.).
If it really is redirected, you can prevent the redirect or handle it some other way. See How do I prevent Python's urllib(2) from following a redirect
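As a rough sketch of the geturl() check: compare the URL you requested with the URL the response actually came from. The helper names final_url_if_redirected and check_redirect are made up for this illustration, and the import fallback is only there so the same snippet runs on Python 3 as well as on the Python 2.7 used in the question.

```python
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2, as in the question


def final_url_if_redirected(requested_url, response):
    """Return the final URL if it differs from the requested one, else None.

    Hypothetical helper: `response` is anything with a geturl() method,
    such as the object returned by urlopen().
    """
    final_url = response.geturl()
    if final_url != requested_url:
        return final_url
    return None


def check_redirect(url):
    """Fetch `url` and report where urllib ended up, if anywhere else."""
    response = urlopen(Request(url))
    try:
        return final_url_if_redirected(url, response)
    finally:
        response.close()
```

Calling check_redirect() with the listing URL on both machines should show whether only the server is being sent to the error page. A plain string comparison is an assumption here; it only tells you that the final URL differs from the requested one, not why.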
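One possible way to stop urllib from following the redirect is to subclass HTTPRedirectHandler and refuse every redirect, so the 3xx response surfaces as an HTTPError you can inspect. This is only a sketch: NoRedirectHandler and fetch_without_redirects are illustrative names, not something from the linked question, and the list of status codes is an assumption about which redirects you care about.

```python
try:
    from urllib.request import HTTPRedirectHandler, build_opener  # Python 3
    from urllib.error import HTTPError
except ImportError:
    from urllib2 import HTTPRedirectHandler, build_opener, HTTPError  # Python 2


class NoRedirectHandler(HTTPRedirectHandler):
    """Refuse every redirect instead of following it."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "do not build a follow-up request";
        # urllib then reports the 3xx status as an HTTPError.
        return None


def fetch_without_redirects(url):
    """Return (status_code, location); location is None if no redirect was attempted."""
    opener = build_opener(NoRedirectHandler())
    try:
        response = opener.open(url)
    except HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            # The Location header says where the site wanted to send us.
            return err.code, err.headers.get("Location")
        raise
    try:
        return response.getcode(), None
    finally:
        response.close()
```

If the server answers with a redirect, the returned location shows the error page it is trying to send you to, which you can compare against what you get from a machine that is not blocked.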