Message206779
| Author |
belopolsky |
| Recipients |
belopolsky |
| Date |
2013-12-21.20:58:58 |
| Message-id |
<1387659539.36.0.629484655922.issue20048@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
This problem happens when I unpack a file from a 200+ MB zip archive as follows:
    with zipfile.ZipFile(archive) as z:
        data = b''
        with z.open(filename, 'rU') as f:
            for line in f:
                data += line
I cannot reduce it to a test case suitable for posting here, but the culprit is the following code in zipfile.py:
    def peek(self, n=1):
        """Returns buffered bytes without advancing the position."""
        if n > len(self._readbuffer) - self._offset:
            chunk = self.read(n)
            self._offset -= len(chunk)
See http://hg.python.org/cpython/file/81f8375e60ce/Lib/zipfile.py#l605
The problem occurs when peek() is called at the boundary of the uncompressed buffer and read() goes through more than one readbuffer. As a result, self._offset is smaller than len(chunk), which leaves a nonsensical negative self._offset upon return from peek().
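The failure mode can be illustrated with a toy model. This is only a sketch, not zipfile's actual code: `ToyReader`, its `_read1` helper, and the chunk contents are hypothetical stand-ins, and it assumes that a refill replaces `_readbuffer` and resets `_offset` to 0. Only the rewind logic in `peek()` mirrors the excerpt quoted above; returning the offset from `peek()` is purely for demonstration.

```python
class ToyReader(object):
    """Hypothetical stand-in for a buffered reader; not zipfile's real class."""

    def __init__(self, chunks):
        self._chunks = list(chunks)  # pretend each item is one decompressed buffer
        self._readbuffer = b""
        self._offset = 0

    def _read1(self, n):
        # Assumption: an exhausted buffer is replaced and the offset reset to 0.
        if self._offset >= len(self._readbuffer) and self._chunks:
            self._readbuffer = self._chunks.pop(0)
            self._offset = 0
        data = self._readbuffer[self._offset:self._offset + n]
        self._offset += len(data)
        return data

    def read(self, n):
        # Keep calling _read1 until n bytes are collected or data runs out.
        buf = b""
        while len(buf) < n:
            data = self._read1(n - len(buf))
            if not data:
                break
            buf += data
        return buf

    def peek(self, n=1):
        # Same rewind logic as the quoted excerpt.
        if n > len(self._readbuffer) - self._offset:
            chunk = self.read(n)
            self._offset -= len(chunk)
        return self._offset  # returned only so the demo can inspect it


def demo_negative_offset():
    reader = ToyReader([b"abcd", b"efgh"])
    reader.read(3)         # leaves one byte in the first buffer
    return reader.peek(4)  # read() crosses into the second buffer


print(demo_negative_offset())
```

In this model, read(4) inside peek() takes one byte from the first buffer and three from the second, so _offset is 3 in the new buffer when len(chunk) == 4 is subtracted, and it ends up at -1.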
This problem does not seem to appear in 3.x since 028e8e0b03e8. |
|
History
| Date | User | Action | Args |
|---|---|---|---|
| 2013-12-21 20:58:59 | belopolsky | set | recipients: + belopolsky |
| 2013-12-21 20:58:59 | belopolsky | set | messageid: <1387659539.36.0.629484655922.issue20048@psf.upfronthosting.co.za> |
| 2013-12-21 20:58:59 | belopolsky | link | issue20048 messages |
| 2013-12-21 20:58:58 | belopolsky | create | |