
I am trying to run a simple script on my AWS instance. The same script works on Windows 7 and Ubuntu (Python 2.7), but when I run it on the server, the website redirects me to an error page that says "you must enable js on your browser".

I have tried several things so far (setting a User-Agent, a redirect handler, mechanize). I get this redirect only with the domain below; all other JavaScript-enabled websites work fine.

Any idea what is causing this?

import urllib2
req = urllib2.Request("http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay")
response = urllib2.urlopen(req)
the_page = response.read()
print the_page

EDIT: It turns out the website is blocking my server's IP. Thanks for the help.

asked Feb 16, 2015 at 18:52

1 Answer

There's no error in your code.

To run the page's JavaScript, you need a JS interpreter.

urllib2 only fetches the raw response; it does not execute the JavaScript in the page.

See: How to interpret JavaScript with Python


That said, the following code works fine for me:

import requests
session = requests.Session()
session.get('http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay').content.decode('utf8')

It returns a large amount of HTML, for example:

<li class="">\n Çamaşır Makinesi</li>\n <li class="">\n Çamaşır Odası</li>\n <li class="selected">\n Çelik Kapı</li>\n <li class="">\n Şofben</li>\n <li class="">\n Şömine</li>\n </ul>\n <h3>Dış Özellikler</h3>\n <ul>\n <li class="">\n Asansör</li>\n <li class="">\n Engelliye Uygun</li>\n <li class="">\n Güvenlik</li>\n <li class="selected">\n Hidrofor</li>\n <li class="selected">\n Isı Yalıtım</li>\n <li class="">\n Jeneratör</li>\n <li class="selected">\n Kablo TV - Uydu</li>\n <li class="">\n Kapalı Garaj</li>\n <li class="">\n Kapıcı</li>\n <li class="">\n Kreş</li>\n <li class="">\n Otopark</li>\n <li class="">\n Oyun Parkı</li>\n <li class="selected">\n Ses Yalıtımı</li>\n <li class="">\n Siding</li>\n <li class="">\n Spor Alanı</li>\n <li class="selected">\n Su Deposu</li>\n <li class="">\n Tenis Kortu</li>\n <li class="">\n Yangın Merdiveni</li>\n <li class="">\n Yüzme Havuzu (Açık)</li>\n <li class="">\n Yüzme Havuzu (Kapalı)</li>\n </ul>\n <h3>Muhit</h3>\n <ul>\n <li class="selected">\n Alışveriş Merkezi</li>\n <li class="">\n Belediye</li>\n <li class="selected">\n Cami</li>\n <li class="">\n Cemevi</li>\n <li class="">\n Denize Sıfır</li>\n <li class="selected">\n Eczane</li>\n <li class="">\n Eğlence Merkezi</li>\n <li class="">\n Fuar</li>\n <li class="selected">\n Hastane</li>\n <li class="">\n Havra</li>\n <li class="">\n Kilise</li>\n <li class="">\n Lise</li>\n <li class="selected">\n Market</li>\n <li class="selected">\n Park</li>\n <li class="">\n Polis Merkezi</li>\n <li class="selected">\n Sağlık Ocağı</li>\n <li class="selected">\n Semt Pazarı</li>\n <li class="">\n Spor Salonu</li>\n <li class="">\n Üniversite</li>\n <li class="selected">\n İlköğretim</li>\n <li class="">\n İtfaiye</li>\n <li class="">\n Şehir Merkezi</li>\n </ul>\n <h3>Ulaşım</h3>\n <ul>\n <li class="">\n Anayol</li>\n <li class="">\n Boğaz Köprüleri</li>\n <li class="selected">\n Cadde</li>\n <li class="">\n Deniz Otobüsü</li>\n <li class="">\n Dolmuş</li>\n <li 
class="selected">\n E-5</li>\n <li class="">\n Havaalanı</li>\n <li class="">\n Marmaray</li>\n <li class="selected">\n Metro</li>\n <li class="">\n Metrobüs</li>\n <li class="selected">\n Minibüs</li>\n <li class="">\n Otobüs Durağı</li>\n <li class="">\n Sahil</li>\n <li class="">\n TEM</li>\n <li class="">\n Tramvay</li>\n <li class="">\n Tren İstasyonu</li>\n <li class="">\n İskele</li>\n </ul>\n <h3>Manzara</h3>\n <ul>\n <li class="">\n Boğaz</li>\n <li class="">\n Deniz</li>\n <li class="">\n Doğa</li>\n <li class="">\n Göl</li>\n <li class="selected">\n Şehir</li>\n </ul>\n <h3>Konut Tipi</h3>\n <ul>\n <li class="">\n Ara Kat Dubleks</li>\n <li class="">\n Bahçe Dubleksi</li>\n <li class="">\n Bahçe Katı</li>\n <li class="">\n Bahçeli</li>\n <li class="">\n Müstakil Girişli</li>\n <li class="">\n Tripleks</li>\n <li class="">\n Çatı Dubleksi</li>\n </ul>\n </div>\n </div>\n<script type="text/javascript">\n var bannerZoneId = "101";\n</script>\n\n<div class="uiBox">\n <div class="uiBoxTitle">\n <h3>Hadi Taşının!</h3>\n </div>\n <div class="uiBoxContainer" id="adHelperBoxMov">\n <div class="helper">\n <ul>\n <script type="text/javascript">\n var classifiedFooterZone9 = "&amp;PAGE_NAME=ilan_detay_zone_9&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n var classifiedFooterZone10 = "&amp;PAGE_NAME=ilan_detay_zone_10&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n var classifiedFooterZone11 = 
"&amp;PAGE_NAME=ilan_detay_zone_11&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n var classifiedFooterZone12 = "&amp;PAGE_NAME=ilan_detay_zone_12&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n\n getBanner(bannerZoneId, classifiedFooterZone9);\n getBanner(bannerZoneId, classifiedFooterZone10);\n getBanner(bannerZoneId, classifiedFooterZone11);\n getBanner(bannerZoneId, classifiedFooterZone12);\n </script>\n </ul>\n </div>\n 

You could use the geturl() method to check whether your request was redirected (the website may well be generating the message you got based on your server's IP, etc.). If it really is redirected, you can prevent the redirect or handle it another way. See: How do I prevent Python's urllib(2) from following a redirect
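
As a rough illustration of the geturl() check, here is a minimal sketch. The helper names was_redirected and fetch are made up for this example, and the import shim is only there so the same code runs on Python 2 (urllib2, as in the question) and Python 3 (urllib.request).

```python
try:
    import urllib2 as urlrequest          # Python 2, as used in the question
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent


def was_redirected(requested_url, final_url):
    """Return True when the final URL differs from the requested one.

    A trailing slash is ignored so that '/page' and '/page/' compare equal.
    """
    return requested_url.rstrip("/") != final_url.rstrip("/")


def fetch(url):
    """Hypothetical helper: return (final_url, status_code, body_bytes)."""
    response = urlrequest.urlopen(url)
    try:
        # geturl() reports the URL that was actually retrieved,
        # i.e. the target after any redirects were followed.
        return response.geturl(), response.getcode(), response.read()
    finally:
        response.close()


# Example use (performs a real network request):
#   final_url, status, body = fetch(url)
#   print(was_redirected(url, final_url))
```

If was_redirected returns False but the body still differs between machines, the server is probably returning different content for the same URL, which would fit an IP-based block.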

answered Feb 16, 2015 at 18:56

7 Comments

I know my code works on Windows and Ubuntu. The problem is that the same script gets redirected on the AWS Amazon Linux server; I am not getting the same response.
@konjuge I cannot reproduce your situation, so I have no solid idea how to solve this. Did you try sending exactly the same request from your server and your laptop? You could inspect the request with the developer tools in Firefox or Chrome and copy it into your code. You could also try the requests package instead of urllib2.
Yes, I tried that. I also tried the requests package and got the same results. Maybe it is a permission issue on the AWS server, but I have no idea.
@konjuge Did you use geturl() to confirm that your request is being redirected?
Interesting: when I use geturl(), it seems there is no redirection, but the content of the web page is still different.
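
For anyone following the suggestions in these comments, here is a minimal sketch of sending browser-like headers and checking whether the body is the error page even when geturl() shows no redirect. The header values and the marker strings are assumptions for illustration: copy the real headers from your browser's network panel and the real wording from the error page. If the site is blocking the server's IP, as the asker's edit suggests, changing headers will not help.

```python
try:
    import urllib2 as urlrequest          # Python 2, as used in the question
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent

# Hypothetical values: replace with the headers your browser really sends.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "tr,en;q=0.8",
}

# Assumed marker text: check the actual error page for the exact wording.
BLOCK_MARKERS = ("enable js", "enable javascript")


def build_request(url, headers=BROWSER_HEADERS):
    """Build a Request object that carries the given headers."""
    return urlrequest.Request(url, headers=headers)


def looks_like_block_page(body_text, markers=BLOCK_MARKERS):
    """Return True if the body contains any marker (case-insensitive)."""
    lowered = body_text.lower()
    return any(marker in lowered for marker in markers)
```

Running the same request from the laptop and from the server, then comparing looks_like_block_page on both bodies, shows whether the difference comes from where the request originates and not from the code.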