Message 379720 - Python tracker


In-reply-to
Author	vstinner
Recipients	aroussel, bckohan, gregory.p.smith, iritkatriel, vstinner
Date	2020-10-27.03:07:15
Message-id	<1603768036.28.0.40612189058.issue42096@roundup.psfhosted.org>

Content

ZipFile.open() checks the first 4 bytes:
    # Skip the file header:
    fheader = zef_file.read(sizeFileHeader)
    if len(fheader) != sizeFileHeader:
        raise BadZipFile("Truncated file header")
    fheader = struct.unpack(structFileHeader, fheader)
    if fheader[_FH_SIGNATURE] != stringFileHeader:
        raise BadZipFile("Bad magic number for file header")
But is_zipfile() does not perform this check. The code could be shared between the two.
.gz and .zip files don't start with the same bytes, so this check should reduce the number of false positives.
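
As a rough illustration of the check described above (a sketch only, not the actual CPython code or a proposed patch), the first four bytes can be compared with the local file header signature. The helper name `starts_with_local_header` and the constant names are hypothetical. Such a check would be stricter than the current is_zipfile(): an empty archive, or one with data prepended before the first entry, would not pass it.

```python
import gzip
import io
import zipfile

# ZIP local file header signature (the value zipfile compares against
# as stringFileHeader in the snippet quoted above).
LOCAL_FILE_HEADER_MAGIC = b"PK\x03\x04"
# gzip magic bytes (RFC 1952), shown only for comparison.
GZIP_MAGIC = b"\x1f\x8b"


def starts_with_local_header(data: bytes) -> bool:
    """Hypothetical helper: True if data begins with a ZIP local file header."""
    return data[:4] == LOCAL_FILE_HEADER_MAGIC


# Build a small ZIP archive and a small gzip stream in memory.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("hello.txt", "hello")
zip_bytes = zip_buffer.getvalue()
gzip_bytes = gzip.compress(b"hello")

print(starts_with_local_header(zip_bytes))   # ZIP data
print(starts_with_local_header(gzip_bytes))  # gzip data
```

The two streams differ from the very first byte, which is why a signature check alone is enough to tell them apart.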
--
You may want to look at the validate() methods of my old Hachoir project: they check a few bytes to decide whether a file looks like a valid gzip or ZIP archive.
gzip:
https://github.com/vstinner/hachoir/blob/0f56883d7cea7082e784bfbdd2882e0f2dd2f34b/hachoir/parser/archive/gzip_parser.py#L51-L62
zip:
https://github.com/vstinner/hachoir/blob/0f56883d7cea7082e784bfbdd2882e0f2dd2f34b/hachoir/parser/archive/zip.py#L411-L430
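
For illustration, a "check a few bytes" validator for gzip might look like the sketch below. This is not Hachoir's code: the function name `looks_like_gzip` is hypothetical, and the fields tested (magic bytes, compression method, reserved flag bits) are taken from the gzip header layout in RFC 1952.

```python
import gzip


def looks_like_gzip(data: bytes) -> bool:
    """Hypothetical helper: cheap plausibility test on a gzip header."""
    # The fixed part of a gzip member header is 10 bytes long.
    if len(data) < 10:
        return False
    # ID1, ID2: magic bytes.
    if data[:2] != b"\x1f\x8b":
        return False
    # CM: compression method, 8 means deflate (the only one in common use).
    if data[2] != 8:
        return False
    # FLG: bits 5-7 are reserved and must be zero.
    if data[3] & 0xE0:
        return False
    return True


sample = gzip.compress(b"hello")
print(looks_like_gzip(sample))
print(looks_like_gzip(b"PK\x03\x04" + b"\x00" * 26))
```

Testing more than the magic number costs little and lowers the chance that arbitrary data starting with the same two bytes is accepted.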

History
Date	User	Action	Args
2020-10-27 03:07:16	vstinner	set	recipients: + vstinner, gregory.p.smith, iritkatriel, aroussel, bckohan
2020-10-27 03:07:16	vstinner	set	messageid: <1603768036.28.0.40612189058.issue42096@roundup.psfhosted.org>
2020-10-27 03:07:16	vstinner	link	issue42096 messages
2020-10-27 03:07:15	vstinner	create
