This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2008年12月04日 18:30 by terry.reedy, last changed 2022年04月11日 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| fileio_buffer.patch | christian.heimes, 2008年12月04日 22:35 | |||
| fileio_buffer2.patch | christian.heimes, 2008年12月04日 23:37 | |||
| Messages (10) | |||
|---|---|---|---|
| msg76915 - (view) | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2008年12月04日 18:30 | |
C.l.p poster reported that 3.0 file.read is orders of magnitude slower
than with 2.5 (but confused issue with buffer = 0). Jerry Hill reported
"Here's a quick comparison between 2.5 and
3.0 on a relatively small 17 meg file:
C:\>c:\Python30\python -m timeit -n 1
"open('C:\\work\\temp\\bppd_vsub.csv', 'rb').read()"
1 loops, best of 3: 36.8 sec per loop
C:\>c:\Python25\python -m timeit -n 1
"open('C:\\work\\temp\\bppd_vsub.csv', 'rb').read()"
1 loops, best of 3: 33 msec per loop
That's 3 orders of magnitude slower on python3.0!"
I verified this informally on WinXP by opening and then reading
Doc/Pythonxy.chm (about 4 megs) -- an eye blink versus 3 seconds,
repeated. Even the open seemed slower but I did not time it.
>>> f=open('Doc/Python30.chm','rb')
>>> d=f.read()
|
|||
| msg76920 - (view) | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2008年12月04日 20:21 | |
This needs definitely some testing! |
|||
| msg76934 - (view) | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2008年12月04日 21:56 | |
The small buffer size in Modules/_fileio.c is one reason for the slowness. $ dd if=/dev/zero of=zeros bs=1MB count=50 $ cat testread.py open("zeros", "rb").read() $ ./python -m cProfile testread.py 40 function calls (39 primitive calls) in 4.246 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.016 0.016 4.246 4.246 <string>:1(<module>) 1 0.000 0.000 0.000 0.000 io.py:277(__new__) 2 0.000 0.000 0.000 0.000 io.py:355(flush) 2 0.000 0.000 0.000 0.000 io.py:364(close) 2 0.000 0.000 0.000 0.000 io.py:376(__del__) 1 0.000 0.000 0.000 0.000 io.py:413(_checkReadable) 1 0.000 0.000 0.000 0.000 io.py:614(__init__) 2 0.000 0.000 0.000 0.000 io.py:618(close) 1 0.000 0.000 0.000 0.000 io.py:708(__init__) 1 0.000 0.000 0.000 0.000 io.py:733(flush) 1 0.000 0.000 0.000 0.000 io.py:736(close) 1 0.000 0.000 0.000 0.000 io.py:755(closed) 1 0.000 0.000 0.000 0.000 io.py:82(open) 1 0.000 0.000 0.000 0.000 io.py:896(__init__) 2 0.000 0.000 0.000 0.000 io.py:905(_reset_read_buf) 1 0.021 0.021 4.230 4.230 io.py:909(read) 1 0.000 0.000 4.209 4.209 io.py:920(_read_unlocked) 1 0.000 0.000 0.000 0.000 {built-in method allocate_lock} 2/1 0.000 0.000 4.246 4.246 {built-in method exec} 1 0.000 0.000 0.000 0.000 {built-in method fstat} 2 0.000 0.000 0.000 0.000 {built-in method isinstance} 3 0.000 0.000 0.000 0.000 {built-in method len} 1 0.000 0.000 0.000 0.000 {method '__enter__' of '_thread.lock' objects} 1 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 1 0.000 0.000 0.000 0.000 {method 'fileno' of '_FileIO' objects} 1 0.000 0.000 0.000 0.000 {method 'isatty' of '_FileIO' objects} 1 0.825 0.825 0.825 0.825 {method 'join' of 'bytes' objects} 2 3.384 1.692 3.384 1.692 {method 'read' of '_FileIO' objects} 1 0.000 0.000 0.000 0.000 {method 'readable' of '_FileIO' objects} $ vi Modules/_fileio.c -#define DEFAULT_BUFFER_SIZE (8*1024) +#define DEFAULT_BUFFER_SIZE (80*1024) $ ./python -m cProfile testread.py 40 function calls (39 primitive calls) in 1.273 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.019 0.019 1.273 1.273 <string>:1(<module>) 1 0.000 0.000 0.000 0.000 io.py:277(__new__) 2 0.000 0.000 0.000 0.000 io.py:355(flush) 2 0.000 0.000 0.000 0.000 io.py:364(close) 2 0.000 0.000 0.000 0.000 io.py:376(__del__) 1 0.000 0.000 0.000 0.000 io.py:413(_checkReadable) 1 0.000 0.000 0.000 0.000 io.py:614(__init__) 2 0.000 0.000 0.000 0.000 io.py:618(close) 1 0.000 0.000 0.000 0.000 io.py:708(__init__) 1 0.000 0.000 0.000 0.000 io.py:733(flush) 1 0.000 0.000 0.000 0.000 io.py:736(close) 1 0.000 0.000 0.000 0.000 io.py:755(closed) 1 0.000 0.000 0.000 0.000 io.py:82(open) 1 0.000 0.000 0.000 0.000 io.py:896(__init__) 2 0.000 0.000 0.000 0.000 io.py:905(_reset_read_buf) 1 0.016 0.016 1.254 1.254 io.py:909(read) 1 0.000 0.000 1.238 1.238 io.py:920(_read_unlocked) 1 0.000 0.000 0.000 0.000 {built-in method allocate_lock} 2/1 0.000 0.000 1.273 1.273 {built-in method exec} 1 0.000 0.000 0.000 0.000 {built-in method fstat} 2 0.000 0.000 0.000 0.000 {built-in method isinstance} 3 0.000 0.000 0.000 0.000 {built-in method len} 1 0.000 0.000 0.000 0.000 {method '__enter__' of '_thread.lock' objects} 1 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 1 0.000 0.000 0.000 0.000 {method 'fileno' of '_FileIO' objects} 1 0.000 0.000 0.000 0.000 {method 'isatty' of '_FileIO' objects} 1 1.156 1.156 1.156 1.156 {method 'join' of 'bytes' objects} 2 0.081 0.041 0.081 0.041 {method 'read' of '_FileIO' objects} 1 0.000 0.000 0.000 0.000 {method 'readable' of '_FileIO' objects} |
|||
| msg76936 - (view) | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2008年12月04日 22:35 | |
The fileio_buffer.patch implements the same progressive buffer as Python 2.x' Object/fileobject.c. |
|||
| msg76940 - (view) | Author: Gregory P. Smith (gregory.p.smith) * (Python committer) | Date: 2008年12月04日 22:50 | |
patch looks good to me. nitpick comments: use += instead of = and + in: newsize = newsize + newsize and newsize = newsize + BIGCHUNK. As for the XXX about overflow, so long as BUFSIZ is not defined to be an insanely large number (it should never be) this will be fine. add a preprocessor test for that in. #if (BUFSIZ >= 2**30) #error "unreasonable BUFSIZ defined" #endif |
|||
| msg76944 - (view) | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2008年12月04日 23:37 | |
The preprocessor doesn't handle power. 2 << 24 (64MB) sounds sufficient for me. |
|||
| msg76971 - (view) | Author: Gregory P. Smith (gregory.p.smith) * (Python committer) | Date: 2008年12月05日 07:43 | |
fileio_buffer2.patch looks good other than minor touchups: Turn the XXX comment into: /* NOTE: overflow impossible due to limits on BUFSIZ * Also, 2 << 24 is 32MB yet your error message test says >= 64MB. I think you meant 1 << 26. fix those and commit. :) |
|||
| msg76981 - (view) | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2008年12月05日 08:33 | |
The updated patch has been committed to 3.0 and 3.1. I'm going to backport the patch to 2.x later. |
|||
| msg78098 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2008年12月20日 12:48 | |
Since it is solved for 3.x and only needs to be bacported to 2.x (where the "io" module isn't the default), downgrading to critical. |
|||
| msg90024 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009年07月02日 16:06 | |
This has been fixed as part of the big IO update in trunk. I assume nobody really cares about making a separate patch for 2.6, please re-open if you are interested! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:56:42 | admin | set | github: 48783 |
| 2009年07月02日 16:06:30 | pitrou | set | status: open -> closed resolution: accepted -> fixed messages: + msg90024 |
| 2008年12月20日 12:48:44 | pitrou | set | priority: release blocker -> critical nosy: + pitrou messages: + msg78098 |
| 2008年12月20日 02:40:30 | loewis | set | priority: deferred blocker -> release blocker |
| 2008年12月10日 08:22:55 | loewis | set | priority: release blocker -> deferred blocker |
| 2008年12月05日 08:33:25 | christian.heimes | set | resolution: accepted stage: patch review -> commit review messages: + msg76981 versions: + Python 2.6, Python 2.7, - Python 3.0, Python 3.1 |
| 2008年12月05日 07:43:31 | gregory.p.smith | set | messages: + msg76971 |
| 2008年12月04日 23:37:50 | christian.heimes | set | files:
+ fileio_buffer2.patch messages: + msg76944 |
| 2008年12月04日 22:50:34 | gregory.p.smith | set | nosy:
+ gregory.p.smith messages: + msg76940 |
| 2008年12月04日 22:35:12 | christian.heimes | set | files:
+ fileio_buffer.patch keywords: + patch messages: + msg76936 stage: test needed -> patch review |
| 2008年12月04日 21:56:20 | christian.heimes | set | messages: + msg76934 |
| 2008年12月04日 20:21:30 | christian.heimes | set | versions:
+ Python 3.1 nosy: + christian.heimes messages: + msg76920 priority: release blocker components: + Extension Modules stage: test needed |
| 2008年12月04日 18:30:19 | terry.reedy | create | |