Issue849407
Created on 2003年11月26日 03:41 by allanbwilson, last changed 2022年04月11日 14:56 by admin. This issue is now closed.
| Files |
| File name |
Uploaded |
Description |
Edit |
|
urllibdiff.txt
|
allanbwilson,
2003年11月27日 20:03
|
diff -c ... >urllibdiff.txt |
| Messages (4) |
|
msg44947 - (view) |
Author: Allan B. Wilson (allanbwilson) |
Date: 2003年11月26日 03:41 |
A reporthook in urllib.urlretrieve() (in 2.3.2) is
given the max number of characters accepted ("bs") per
.read() as its second argument. It would be more
helpful to receive the number of characters actually
retrieved in the most recent block.
While perhaps this would break some existing code
(though I can't imagine how), the minor patches below
will allow giving progess updates, etc. that are accurate.
Thanks
Allan Wilson
------------
*** urllib.py.old Tue Nov 25 17:42:55 2003
--- urllib.py Tue Nov 25 18:00:50 2003
***************
*** 236,248 ****
reporthook(0, bs, size)
block = fp.read(bs)
if reporthook:
! reporthook(1, bs, size)
while block:
tfp.write(block)
block = fp.read(bs)
blocknum = blocknum + 1
if reporthook:
! reporthook(blocknum, bs, size)
fp.close()
tfp.close()
del fp
--- 236,248 ----
reporthook(0, bs, size)
block = fp.read(bs)
if reporthook:
! reporthook(1, len(block), size)
while block:
tfp.write(block)
block = fp.read(bs)
blocknum = blocknum + 1
if reporthook:
! reporthook(blocknum, len(block), size)
fp.close()
tfp.close()
del fp
|
|
msg44948 - (view) |
Author: Martin v. Löwis (loewis) * (Python committer) |
Date: 2003年11月27日 19:41 |
Logged In: YES
user_id=21627
Can you please attach the patch, instead of pasting it?
|
|
msg44949 - (view) |
Author: Guido van Rossum (gvanrossum) * (Python committer) |
Date: 2006年11月19日 16:44 |
I notice that the patch doesn't apply to the svn head (2.6a0). But that's easily fixed and the idea still applies.
As the original author of the code being patched I believe my reason for doing it the old way was that I wanted the report hook to be called before the first block, which would let a GUI open up a dialog box before anything was read. The idea was that if the reads are really slow, you'd want the dialog box there right from the start. But this was rather naive, since the most likely source of delay is making the connection and getting the response header back, and the report hook isn't being called at all until all the headers have been seen.
The changed API to reporthook() needs to be documented very clearly. There's one call to reporthook() that still passes the block size instead of the actual data size. A naive implementation could be confused by this call, although it is easily recognized because it is the first call and the only one with blocknum equal to zero.
I think this is a fine change -- as long as it isn't backported, since it is clearly a feature change. I do wonder "why bother", since most people using urllib don't care all that much about extreme details (I can't remember the last time I specified a reporthook), and most people caring about details don't like urllib and use something else (e.g. httplib, or urllib2).
So I guess I'm somewhere between +0 and -0 on this on this.
|
|
msg44950 - (view) |
Author: Martin v. Löwis (loewis) * (Python committer) |
Date: 2006年11月21日 18:29 |
Discussion on python-dev revealed that read() on a socket will always give you blocksize data, except for the last block. So this doesn't really change anything in practice; applications that find that the data read (blocksize*blocknumber) exceeds the amount of data expected should conclude that they saw the last block.
Rejecting this patch.
|
|
History
|
|---|
| Date |
User |
Action |
Args |
| 2022年04月11日 14:56:01 | admin | set | github: 39609 |
| 2003年11月26日 03:41:59 | allanbwilson | create |