This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
| Author | cosoleto |
|---|---|
| Recipients | cosoleto |
| Date | 2007-09-26.07:55:12 |
| SpamBayes Score | 0.02637627 |
| Marked as misclassified | No |
| Message-id | <1190793332.9.0.29899287923.issue1205@psf.upfronthosting.co.za> |
| In-reply-to | |
| Content | |
|---|---|
urllib fails to read URL contents, urllib2 crashes Python

Python version:
-------------------------
Python 2.5.1 (r251:54863, May 18 2007, 16:56:43)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)]

Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32

Python 2.4.4 (#2, Aug 16 2007, 00:34:54)
[GCC 4.1.3 20070812 (prerelease) (Debian 4.1.2-15)] on linux2
-------------------------

Working with GNU wget:
-------------------------
$ wget -S http://www.recherche.fr/encyclopedie/Thomas-Robert_Bugeaud
--08:42:21--  http://www.recherche.fr/encyclopedie/Thomas-Robert_Bugeaud
           => `Thomas-Robert_Bugeaud'
Resolving www.recherche.fr... 88.191.11.214
Connecting to www.recherche.fr|88.191.11.214:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Wed, 26 Sep 2007 06:42:53 GMT
  Server: Apache/2.2.3 (Debian) PHP/5.2.3-0.dotdeb.1 with Suhosin-Patch
  X-Powered-By: PHP/5.2.3-0.dotdeb.1
  Keep-Alive: timeout=15, max=100
  Connection: Keep-Alive
  Transfer-Encoding: chunked
  Content-Type: text/html; charset=UTF-8
Length: unspecified [text/html]

    [ <=> ] 267,080       --.--K/s

08:42:42 (14.11 KB/s) - "Thomas-Robert_Bugeaud" saved [267080]
-------------------------

Python:
-------------------------
>>> import urllib
>>> a = urllib.urlopen('http://www.recherche.fr/encyclopedie/Thomas-Robert_Bugeaud')
>>> c = a.read(1024*1024*2)
>>> len(c)
1035220
>>> c[63000:64000]
'he.fr en page d\'accueil</a><br>\n <span>Partenaires :</span> <a href="http://www.cartes.fr/" target="_blank">Cartes\n postales</a> <a href="http://www.deux.fr/script/" target="_blank">Rencontres\n gratuites\n </a> <a href="http://www.new.fr/" target="_blank">Noms\n de domaine gratuits</a> <a href="http://www.netencyclo.com/" target="_blank">Encyclopedia</a> </p>\n <p style="text-align:center;"><a href="http://www.futureobject.com/" target="_blank"><img src="http://www.recherche.fr/images/logo_fo.gif" border="0" height="25" width="96"></a></p>\n\n </p>\n </div>\n </div><!-- site -->\n</body>\n</html>\n\r\n\x00\x00\x00\x00\x00\x00\x00\x00\x00[...omission...]\x00\x00\x00\x00'
-------------------------

As above, but with the urllib2 module instead of urllib:
-------------------------
  File "/usr/lib/python2.5/socket.py", line 291, in read
    data = self._sock.recv(recv_size)
  File "/usr/lib/python2.5/httplib.py", line 509, in read
    return self._read_chunked(amt)
  File "/usr/lib/python2.5/httplib.py", line 548, in _read_chunked
    chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00[...omission...]\x00\x00\x00\x00\x00\x00\x00\
-------------------------

As above, but with Python 2.4:
-------------------------
>>> import urllib2
>>> a = urllib2.urlopen('http://www.recherche.fr/encyclopedie/Thomas-Robert_Bugeaud')
>>>
>>> c = a.read(1024*1024*2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/socket.py", line 295, in read
    data = self._sock.recv(recv_size)
  File "/usr/lib/python2.4/httplib.py", line 460, in read
    return self._read_chunked(amt)
  File "/usr/lib/python2.4/httplib.py", line 499, in _read_chunked
    chunk_left = int(line, 16)
ValueError: invalid literal for int():
-------------------------

Regards,
Francesco Cosoleto
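The tracebacks in the report end at `chunk_left = int(line, 16)`, the point where a chunked-transfer reader parses the hexadecimal chunk-size line. The sketch below illustrates that step in isolation. It is written for modern Python 3 and is not the `httplib` code from the report. The helper name `read_chunked` and the sample payloads are hypothetical. The NUL-padded input assumes that the server sent NUL bytes where a chunk-size line was expected, which is one plausible reading of the traceback, not a confirmed diagnosis.

```python
import io


def read_chunked(stream):
    """Decode an HTTP/1.1 chunked body from a binary file-like object.

    Hypothetical helper for illustration only; trailers are ignored.
    """
    body = bytearray()
    while True:
        line = stream.readline()
        # Drop any chunk extension after ';' and surrounding whitespace
        # before parsing the hexadecimal chunk size.
        size_field = line.split(b";", 1)[0].strip()
        # int() raises ValueError when the size line is not valid hex,
        # for example when it consists of NUL bytes.
        chunk_size = int(size_field, 16)
        if chunk_size == 0:
            break  # a zero-sized chunk marks the end of the body
        body += stream.read(chunk_size)
        stream.read(2)  # discard the CRLF that terminates each chunk
    return bytes(body)


# A well-formed chunked body decodes without error.
good = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
print(read_chunked(good))

# Assumed failure scenario: NUL bytes appear where a size line should be.
bad = io.BytesIO(b"5\r\nhello\r\n" + b"\x00" * 16 + b"\r\n")
try:
    read_chunked(bad)
except ValueError as exc:
    print("ValueError:", exc)
```

A reader built this way has no means of recovering once the size line is garbage, which would be consistent with urllib2 raising instead of returning partial data.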
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2007-09-26 07:55:35 | cosoleto | set | spambayes_score: 0.0263763 -> 0.02637627 recipients: + cosoleto |
| 2007-09-26 07:55:33 | cosoleto | set | spambayes_score: 0.0263763 -> 0.0263763 messageid: <1190793332.9.0.29899287923.issue1205@psf.upfronthosting.co.za> |
| 2007-09-26 07:55:32 | cosoleto | link | issue1205 messages |
| 2007-09-26 07:55:13 | cosoleto | create | |