homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: binary buffered reading is quadratic
Type: performance Stage:
Components: Library (Lib) Versions: Python 3.0
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: alexandre.vassalotti, gregory.p.smith, loewis, pitrou
Priority: high Keywords: patch

Created on 2008年03月31日 18:11 by pitrou, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
binaryio1.patch pitrou, 2008年05月08日 02:14
binaryio2.patch pitrou, 2008年05月08日 14:35
binaryio3.patch alexandre.vassalotti, 2008年06月08日 05:42
Messages (20)
msg64788 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年03月31日 18:11
In py3k, buffered binary IO can be quadratic when e.g. reading a whole file.
This is a small test on 50KB, 100KB and 200KB files:
-> py3k with buffering:
./python -m timeit -s "f = open('50KB', 'rb')" "f.seek(0); f.read()"
1000 loops, best of 3: 286 usec per loop
./python -m timeit -s "f = open('100KB', 'rb')" "f.seek(0); f.read()"
1000 loops, best of 3: 1.07 msec per loop
./python -m timeit -s "f = open('200KB', 'rb')" "f.seek(0); f.read()"
100 loops, best of 3: 4.85 msec per loop
-> py3k without buffering (just the raw FileIO layer):
./python -m timeit -s "f = open('50KB', 'rb', buffering=0)" "f.seek(0);
f.read()"
10000 loops, best of 3: 46 usec per loop
./python -m timeit -s "f = open('100KB', 'rb', buffering=0)" "f.seek(0);
f.read()"
10000 loops, best of 3: 88.7 usec per loop
./python -m timeit -s "f = open('200KB', 'rb', buffering=0)" "f.seek(0);
f.read()"
10000 loops, best of 3: 156 usec per loop
-> for comparison, Python 2.5:
python -m timeit -s "f = open('50KB', 'rb')" "f.seek(0); f.read()"
10000 loops, best of 3: 34.4 usec per loop
python -m timeit -s "f = open('100KB', 'rb')" "f.seek(0); f.read()"
10000 loops, best of 3: 62.3 usec per loop
python -m timeit -s "f = open('200KB', 'rb')" "f.seek(0); f.read()"
10000 loops, best of 3: 119 usec per loop
I'm posting this issue as a reminder, but perhaps someone is already
working on this, or the goal is to translate it to C ultimately?
msg65236 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年04月09日 11:21
By the way, a simple way to fix it would be to use a native BytesIO
object (as provided by Alexandre's patch in #1751) rather than a str
object for the underlying buffer.
msg66390 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年05月08日 02:14
Here is a pure Python patch removing the quadratic behaviour and trying
to make read operations generally faster.
Here are some numbers:
./python -m timeit -s "f = open('50KB', 'rb')" "f.seek(0)" "while
f.read(11): pass"
-> py3k without patch: 23.6 msec per loop
-> py3k with patch: 14.5 msec per loop
-> Python 2.5: 4.72 msec per loop
./python -m timeit -s "f = open('50KB', 'rb')" "f.seek(0); f.read()"
-> py3k without patch: 284 usec per loop
-> py3k with patch: 90.1 usec per loop
-> Python 2.5: 33.8 usec per loop
./python -m timeit -s "f = open('100KB', 'rb')" "f.seek(0); f.read()"
-> py3k without patch: 828 usec per loop
-> py3k with patch: 142 usec per loop
-> Python 2.5: 62.5 usec per loop
./python -m timeit -s "f = open('200KB', 'rb')" "f.seek(0); f.read()"
-> py3k without patch: 3.67 msec per loop
-> py3k with patch: 375 usec per loop
-> Python 2.5: 131 usec per loop
And, for the record, with a 10MB file:
./python -m timeit -s "f = open('10MB', 'rb')" "f.seek(0); f.read()"
-> py3k without patch: still running after more than one minute, gave up
-> py3k with patch: 38.6 msec per loop
-> Python 2.5: 20.4 msec per loop
msg66393 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008年05月08日 02:48
I see that the code is still using the immutable bytes object for its
buffer (which forces Python to create a new buffer every time its
modified). Also, I think it worthwhile to check if using a pre-allocated
bytearray object (i.e., bytearray(buffer_size) where `buffer_size` is an
integer) would have any performance benefits.
msg66396 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年05月08日 03:12
Hi Alexandre,
I first tried to use a (non-preallocated) bytearray object and, after
trying several optimization schemes, I found out that the best one
worked as well with an immutable bytes object :) I also found out that
the bytes <-> bytearray conversion costs can be noticeable in some
benchmarks.
The internal buffer is rarely reallocated because the current offset
inside it is remembered instead; also, when reading bytes from the
underlying unbuffered stream, a list of bytes objects is accumulated and
then joined at the end.
I think a preallocated bytearray would not make a lot of sense since we
can't readinto() an arbitrary position, so we still have a memory copy
from the bytes object returned by raw.read() to the bytearray buffer,
and then when returning the result to the user as a bytes object we have
another memory copy. In other words each read byte is copied twice more.
Of course, if this code was rewritten in C, different compromises would
be possible.
cheers
Antoine.
msg66417 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年05月08日 14:35
Some code relies on -1 being usable as the default value for read()
(instead of None), this new patch conforms to this expectation. It fixes
some failures in test_mailbox.
msg67817 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年06月07日 21:30
I recommend not letting this issue rot too much :) Eating 20+ seconds to
read the contents of a 10MB binary file in one pass is not very good
marketing-wise, and the betas are coming soon...
msg67818 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008年06月07日 23:18
I am going to go through your patch as soon as I get the time -- i.e.,
later today or tomorrow morning.
msg67822 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008年06月08日 05:42
I reviewed the patch and I found a few bugs -- i.e., peek() was
replacing the buffer content, read() wasn't written in consideration of
non-blocking streams, the removal of the None check in
BufferedRandom.read() was wrong.
Here's an updated patch.
msg67830 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年06月08日 10:08
Thanks for the fixes.
By the way, I don't know much about non-blocking streams, but it seems
to me that "optimal" non-blocking read() would require that the chunks
we ask to the OS are block-aligned, which is not the case currently (we
use 2*avail, which can be an arbitrary value).
msg67887 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008年06月10日 02:22
Oh, that is simple to fix. You can round the value 2*avail to the
nearest block by doing something like (2*avail) & ~(bksize-1) where
bksize is a power of 2, or the less magic (2*avail//bksize) * bksize.
msg67900 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年06月10日 08:34
Yup. However, if you try it, you'll probably notice that it decreases
performance of normal (blocking) reads as well :-)
Anyway, non-blocking file objects are pretty much second-class citizens
in Py3k right now, so my remark was theoretical.
msg69885 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年07月17日 16:04
If nobody objects I'll commit Alexandre's patch in a few days (after
beta 2 though).
msg70028 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008年07月19日 13:47
I don't understand the second loop (where n is given). If n is given,
there should be only a single read operation, using
 max(buffer_size, n-avail)
(i.e. the way it is in patch 2). In particular, if the stream is
unbuffered, it shouldn't ever end up with buffered data.
msg70100 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年07月21日 09:30
Selon "Martin v. Löwis" <report@bugs.python.org>:
>
> Martin v. Löwis <martin@v.loewis.de> added the comment:
>
> I don't understand the second loop (where n is given). If n is given,
> there should be only a single read operation, using
>
> max(buffer_size, n-avail)
I mimicked the original logic rather than rethink the algorithm. I'm not totally
sure what motivates the original logic but the purpose seems to be that
non-blocking streams can return at least a few bytes rather than an empty string
when less than N bytes are ready at OS level.
> (i.e. the way it is in patch 2). In particular, if the stream is
> unbuffered, it shouldn't ever end up with buffered data.
Hmm, what do you mean by "if the stream is unbuffered"? Unbuffered streams
should use the raw unbuffered objects (e.g. FileIO) rather than wrap them with
BufferedReader.
msg70119 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008年07月21日 21:18
>> max(buffer_size, n-avail)
> 
> I mimicked the original logic rather than rethink the algorithm. I'm not totally
> sure what motivates the original logic but the purpose seems to be that
> non-blocking streams can return at least a few bytes rather than an empty string
> when less than N bytes are ready at OS level.
IIUC, a read of the full requested size would achieve exactly that: on a
non-blocking stream (IIUC), a read will always return
min(bytes_available, bytes_requested).
> Hmm, what do you mean by "if the stream is unbuffered"? Unbuffered streams
> should use the raw unbuffered objects (e.g. FileIO) rather than wrap them with
> BufferedReader.
IIUC, io.open will always return a BufferedReader, potentially with
buffer_size=0 for unbuffered IO. This case must be supported.
msg70127 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年07月21日 22:59
Le lundi 21 juillet 2008 à 21:18 +0000, Martin v. Löwis a écrit :
> IIUC, a read of the full requested size would achieve exactly that: on a
> non-blocking stream (IIUC), a read will always return
> min(bytes_available, bytes_requested).
Hmm, it seems logical indeed... Alexandre, do you have other information
on the subject?
> IIUC, io.open will always return a BufferedReader, potentially with
> buffer_size=0 for unbuffered IO. This case must be supported.
No, io.open returns the raw object without wrapping it:
 if buffering == 0:
 if binary:
 raw._name = file
 raw._mode = mode
 return raw
 raise ValueError("can't have unbuffered text I/O")
We could even decide to raise a ValueError when trying to construct a
BufferedReader with a buffer_size < 1.
msg70166 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008年07月23日 03:02
Antoine wrote:
> Le lundi 21 juillet 2008 à 21:18 +0000, Martin v. Löwis a écrit :
> > IIUC, a read of the full requested size would achieve exactly that: 
on a
> > non-blocking stream (IIUC), a read will always return
> > min(bytes_available, bytes_requested).
>
> Hmm, it seems logical indeed... Alexandre, do you have other 
information
> on the subject?
Martin is right. However, I don't how Python handle the case where 
bytes_available is zero (in C, an error value is returned and errno is 
set to EWOULDBLOCK).
When I revised the patch I had a weak understanding of nonblocking I/O. 
I thought the "exponential" reads were for nonblocking I/O, but I see 
now that is non-sense.
I am not sure, but I think Martin is also right about the second loop. 
The max() call should be changed back to "max(self.buffer_size, n))", 
like in the 2nd patch.
msg70171 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年07月23日 09:19
> When I revised the patch I had a weak understanding of nonblocking I/O.
> I thought the "exponential" reads were for nonblocking I/O, but I see
> now that is non-sense.
Fine, so it will make the patch simpler.
As for non-blocking IO, I think we should raise the general issue on
python-3000. There is no real support for it right now, by which I mean (1) no
easy and portable way of enable non-blocking IO on a file object and (2) no test
cases of non-blocking IO in real-world conditions (rather than with mock
objects). This shouldn't stop us from fixing the present bug though.
> I am not sure, but I think Martin is also right about the second loop.
> The max() call should be changed back to "max(self.buffer_size, n))",
> like in the 2nd patch.
Ok. Could you produce an updated patch? :)
msg70369 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008年07月28日 19:49
Following our discussion and Guido's answer on python-3000 (*), I
committed a modified fix in r65264.
(*) http://mail.python.org/pipermail/python-3000/2008-July/014466.html 
History
Date User Action Args
2022年04月11日 14:56:32adminsetgithub: 46775
2008年07月28日 19:49:18pitrousetstatus: open -> closed
resolution: fixed
messages: + msg70369
2008年07月23日 09:19:56pitrousetmessages: + msg70171
2008年07月23日 03:02:30alexandre.vassalottisetmessages: + msg70166
2008年07月21日 22:59:12pitrousetmessages: + msg70127
2008年07月21日 21:18:51loewissetmessages: + msg70119
2008年07月21日 09:30:17pitrousetmessages: + msg70100
2008年07月19日 13:47:33loewissetnosy: + loewis
messages: + msg70028
2008年07月17日 16:04:05pitrousetmessages: + msg69885
2008年06月10日 08:34:37pitrousetmessages: + msg67900
2008年06月10日 02:22:58alexandre.vassalottisetmessages: + msg67887
2008年06月08日 10:08:27pitrousetmessages: + msg67830
2008年06月08日 05:42:15alexandre.vassalottisetfiles: + binaryio3.patch
messages: + msg67822
2008年06月07日 23:18:01alexandre.vassalottisetmessages: + msg67818
2008年06月07日 21:30:20pitrousetmessages: + msg67817
2008年05月25日 09:02:35gregory.p.smithsetpriority: high
nosy: + gregory.p.smith
2008年05月08日 14:35:36pitrousetfiles: + binaryio2.patch
messages: + msg66417
2008年05月08日 03:12:46pitrousetmessages: + msg66396
2008年05月08日 02:48:48alexandre.vassalottisetmessages: + msg66393
2008年05月08日 02:14:55pitrousetfiles: + binaryio1.patch
keywords: + patch
messages: + msg66390
2008年05月06日 19:28:51alexandre.vassalottisetnosy: + alexandre.vassalotti
2008年04月09日 11:21:25pitrousetmessages: + msg65236
2008年03月31日 18:11:24pitroucreate

AltStyle によって変換されたページ (->オリジナル) /