urllib.urlretrieve never returns???

John Nagle nagle at animats.com
Tue Mar 20 16:18:24 EDT 2012


On 3/17/2012 9:34 AM, Chris Angelico wrote:
> On March 18, 2012, Laszlo Nagy <gandalf at shopzeus.com> wrote:
>> In the latter case, "log.txt" only contains "#1" and nothing else. If I look
>> at pythonw.exe from task manager, then it shows +1 thread every time I
>> click the button, and "#1" is appended to the file.

 Does it fail to retrieve on all URLs, or only on some of them?
 Running a web crawler, I've seen some pathological cases.
A very few sites emit data very, very slowly, but never time
out because each read still makes some progress. There are
also some sites where negotiating an SSL connection reaches
the point where the host end is supposed to finish the
handshake, but it never does.
 The odds are against this being the problem. I see problems
like that in maybe 1 in 100,000 URLs.
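
 If it does turn out to be one of those slow-drip sites, one possible
guard is an overall deadline on the transfer, in addition to the socket
timeout. A socket timeout only limits each individual blocking
operation, so a server that sends a few bytes now and then never trips
it. The sketch below is only an illustration, not what the original
program does: it is written for Python 3 (urllib.request.urlopen rather
than urllib.urlretrieve), and the names read_with_deadline, fetch and
DownloadTooSlow, as well as the default limits, are made up for this
example.

```python
import io
import time
import urllib.request


class DownloadTooSlow(Exception):
    """Raised when the whole transfer takes longer than the allowed time."""


def read_with_deadline(stream, max_seconds, chunk_size=8192, clock=time.monotonic):
    """Read `stream` to EOF, giving up once `max_seconds` have elapsed in total.

    The deadline is checked between reads, so a single read can still
    block for as long as the stream's own timeout allows.
    """
    deadline = clock() + max_seconds
    chunks = []
    while True:
        if clock() > deadline:
            raise DownloadTooSlow("gave up after %.1f seconds" % max_seconds)
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means EOF
            return b"".join(chunks)
        chunks.append(chunk)


def fetch(url, socket_timeout=30.0, max_seconds=120.0):
    """Hypothetical helper: fetch `url` with both kinds of limit applied."""
    # socket_timeout bounds each blocking socket operation;
    # max_seconds bounds the body transfer as a whole.
    with urllib.request.urlopen(url, timeout=socket_timeout) as response:
        return read_with_deadline(response, max_seconds)


if __name__ == "__main__":
    # Offline demonstration with an in-memory stream instead of a real socket.
    print(read_with_deadline(io.BytesIO(b"hello"), max_seconds=5.0))
```

 The worst case here is roughly max_seconds plus one socket_timeout,
since the deadline is only looked at between reads. Whether the socket
timeout also cuts off a stalled SSL handshake is something I would
check on the Python version in use rather than assume.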
				John Nagle


