Message 245368 - Python tracker

Author	nczeczulin
Recipients	Ericg, martin.panter, nczeczulin, ned.deily
Date	2015-06-15.06:58:53
Message-id	<1434351534.33.0.29525164967.issue24301@psf.upfronthosting.co.za>

Content

The gzip spec allows for multi-member files. Some libraries and utilities seem to solve this problem (incorrectly?) by simply ignoring everything past the first member -- even when it is valid (e.g., DotNetZip, 7-Zip).
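To illustrate the multi-member point, here is a minimal sketch. It assumes a current Python 3 where `gzip.compress` and `gzip.decompress` are available; the payloads are made up for the example.

```python
import gzip

# Two independently compressed members, simply concatenated.
blob = gzip.compress(b"first ") + gzip.compress(b"second")

# The gzip module reads past the first member and joins the payloads.
combined = gzip.decompress(blob)
print(combined)  # b'first second'
```

A reader that stops after the first member would return only b'first ' here.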
For 2.7 and 3.4, the data that has been decompressed but not yet read before the exception was raised is still available in the GzipFile's internal buffer (extrabuf).
Modifying Martin's example slightly:
>>> from io import BytesIO
>>> from gzip import GzipFile
>>> f = BytesIO()
>>> with GzipFile(fileobj=f, mode="wb") as z:
...     z.write(b"data")
...
4
>>> f.write(b"garbage")
7
>>> f.seek(0)
0
>>> with GzipFile(fileobj=f, mode="rb") as z:
...     try:
...         z.read(1)
...         z.read()
...     except OSError as e:
...         z.extrabuf[z.offset - z.extrastart:]
...         e
...
b'd'
b'ata'
OSError('Not a gzipped file',)
My issue is that catching and handling this specific exception is a little more involved because there are 3(?) different OSErrors (IOError on 2.7) that could potentially be raised during the read. But mostly:
OSError('CRC check failed 0x447ba3f9 != 0x225cb2a3',) -- would be a bad one to mistake for it.
Maybe a specific Exception type to catch for an invalid header, and a better method to read the remaining buffer when handling it?
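One way the suggestion could look, as a rough sketch only: the exception class `InvalidMemberHeader` and the helper `decompress_members` below are hypothetical names, not part of the gzip module. The sketch works on an in-memory bytes object and relies on `zlib.decompressobj` in gzip mode, whose `eof` and `unused_data` attributes expose whatever follows a complete member. A CRC mismatch surfaces as `zlib.error`, so it cannot be mistaken for the invalid-header case.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"


class InvalidMemberHeader(OSError):
    """Hypothetical: bytes after a complete member are not a gzip header."""

    def __init__(self, payload, trailing):
        super().__init__("Not a gzipped file")
        self.payload = payload    # data decompressed before the bad header
        self.trailing = trailing  # raw bytes starting at the bad header


def decompress_members(raw):
    """Decompress every gzip member in raw (hypothetical helper)."""
    chunks = []
    while raw:
        if raw[:2] != GZIP_MAGIC:
            raise InvalidMemberHeader(b"".join(chunks), raw)
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        d = zlib.decompressobj(zlib.MAX_WBITS | 16)
        chunks.append(d.decompress(raw))
        if not d.eof:
            raise EOFError("compressed data ended before the end-of-stream marker")
        raw = d.unused_data  # whatever follows this member
    return b"".join(chunks)


try:
    decompress_members(gzip.compress(b"data") + b"garbage")
except InvalidMemberHeader as e:
    recovered, leftover = e.payload, e.trailing
    print(recovered, leftover)  # b'data' b'garbage'
```

With a dedicated type like this, the caller can decide whether to keep the recovered payload without having to match on the exception message.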

History
Date	User	Action	Args
2015-06-15 06:58:54	nczeczulin	set	recipients: + nczeczulin, ned.deily, martin.panter, Ericg
2015-06-15 06:58:54	nczeczulin	set	messageid: <1434351534.33.0.29525164967.issue24301@psf.upfronthosting.co.za>
2015-06-15 06:58:54	nczeczulin	link	issue24301 messages
2015-06-15 06:58:53	nczeczulin	create
