classification
Title: tarfile keeps excessive dir structure in compressed files
Components: Library (Lib)
Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed
Resolution: accepted
Assigned To: lars.gustaebel
Nosy List: lars.gustaebel, loewis, tarek, techtonik
Priority: normal
Keywords: patch

Created on 2008-12-26 13:19 by techtonik, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name, Uploaded, Description
test_tarfile.extrapath.zip, techtonik, 2008-12-26 13:19, test tar.gz validness
4750.gzip.basename.fix.diff, techtonik, 2008-12-29 22:45, patch
python25.issue4750.diff, techtonik, 2008-12-30 07:20, python 2.5 patch
Messages (11)
msg78296 - Author: anatoly techtonik (techtonik) Date: 2008-12-26 13:19
When tarfile is directed to create a tar.gz compressed archive in a
path other than the current directory, it saves the full path in the .gz
header, where only the filename is required.
This causes problems with decompression utilities such as 7zip. The
test suite and a patch are attached.
{{{
tar -czf dist\create_tar.tar.gz package
7z l dist\create_tar.tar.gz > tar.out
python test_create.tar.gz.py
7z l dist\create_py.tar.gz > py.out
diff -pu3 tar.out py.out
}}}
{{{
--- tar.out Fri Dec 26 15:12:42 2008
+++ py.out Fri Dec 26 15:12:42 2008
@@ -1,10 +1,10 @@
 7-Zip 4.57 Copyright (c) 1999-2007 Igor Pavlov 2007-12-06
-Listing archive: dist\create_tar.tar.gz
+Listing archive: dist\create_py.tar.gz
 Date Time Attr Size Compressed Name
 ------------------- ----- ------------ ------------ ------------------------
-2008-12-26 15:12:41 10240 170 create_tar.tar
+2008-12-26 15:03:39 10240 141 dist/create_py.tar
 ------------------- ----- ------------ ------------ ------------------------
- 10240 170 1 files, 0 folders
+ 10240 141 1 files, 0 folders
}}}
See also issue 1886 and msg61515 in particular
msg78344 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-12-27 08:44
Lars, what do you think?
msg78372 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2008-12-27 17:55
Anatoly is right, the gzip file format specification (RFC 1952) says
that the FNAME header field must be the basename of the original
filename. So, this behaviour is not tarfile's fault but that of the gzip
module and should be fixed there.
7zip can still decompress these files, right?
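To check what a given Python version actually writes, the FNAME field can be read straight from the gzip header. The following is a minimal sketch based on the header layout in RFC 1952; the helper names `read_gzip_fname` and `demo` are made up for illustration, and the archive name mirrors the one used in the report.

```python
import os
import tarfile
import tempfile


def read_gzip_fname(path):
    """Return the FNAME field of a gzip file as str, or None if it is absent."""
    with open(path, "rb") as f:
        header = f.read(10)  # fixed part: ID1 ID2 CM FLG MTIME(4) XFL OS
        if len(header) < 10 or header[:2] != b"\x1f\x8b":
            raise ValueError("not a gzip file")
        flags = header[3]
        if flags & 0x04:  # FEXTRA: 2-byte little-endian length, then payload
            xlen = int.from_bytes(f.read(2), "little")
            f.read(xlen)
        if not flags & 0x08:  # FNAME bit not set
            return None
        name = bytearray()
        while True:
            byte = f.read(1)
            if byte in (b"", b"\x00"):  # NUL terminator (or truncated file)
                break
            name += byte
        return name.decode("latin-1")


def demo():
    """Create dist/create_py.tar.gz in a temp dir and return its FNAME field."""
    with tempfile.TemporaryDirectory() as tmp:
        dist = os.path.join(tmp, "dist")
        os.mkdir(dist)
        archive = os.path.join(dist, "create_py.tar.gz")
        with tarfile.open(archive, "w:gz"):
            pass  # an empty archive is enough to inspect the header
        return read_gzip_fname(archive)
```

On an interpreter that includes the fix, `demo()` should return a name without any directory component; on an affected version the "dist/" prefix would be expected to show up in the result.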
msg78414 - Author: anatoly techtonik (techtonik) Date: 2008-12-28 15:46
7zip can decompress both, but it still creates a "dist/" directory when
decompressing a file made with Python.
I've noticed that the same extra path component also appears with "tar" +
"gzip" under Windows. If they are executed separately and a Windows path
with backslashes is used, the directory prefix is not stripped. I.e., this
creates an archive with an invalid header:
{{{
tar -cf dist\create_tar_sep.tar package
gzip -f9 dist\create_tar_sep.tar
}}}
This command is ok:
{{{
tar -cf dist\create_tar_sep.tar package
gzip -f9 dist/create_tar_sep.tar
}}}
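One plausible explanation for the difference between the two commands is that the tool only treats "/" as a path separator when it strips the directory part. That is an assumption, not something confirmed in this thread. Python's `posixpath` and `ntpath` modules show the same contrast on the path from the example:

```python
import ntpath
import posixpath

backslash_path = r"dist\create_tar_sep.tar"
slash_path = "dist/create_tar_sep.tar"

# POSIX rules: "\" is an ordinary character, so nothing is stripped.
posix_result = posixpath.basename(backslash_path)

# Windows rules: both "\" and "/" are separators, so the prefix is removed.
nt_result_backslash = ntpath.basename(backslash_path)
nt_result_slash = ntpath.basename(slash_path)

print(repr(posix_result))
print(repr(nt_result_backslash))
print(repr(nt_result_slash))
```

Under POSIX rules the whole string comes back unchanged, which matches the symptom of the directory prefix ending up in the header.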
msg78449 - Author: anatoly techtonik (techtonik) Date: 2008-12-29 12:01
I filed a bug report for MSYS gzip here:
https://sourceforge.net/tracker2/index.php?func=detail&aid=2474481&group_id=2435&atid=102435 
msg78493 - Author: anatoly techtonik (techtonik) Date: 2008-12-29 22:45
I'm attaching a patch for the Python 2.6 gzip module.
I clarified the meaning of self.name to be the basename corresponding to
the FNAME field in the gzip file header.
There is a trace of a deprecated gzip.filename API. I haven't found any
references to it in the documentation, so I removed it. In Python 2.5 it
seemed to mean just the filename in read mode, and filename + .gz in write
mode even if the opened filename did not end with .gz.
The FNAME field from the gzip header is ignored in read mode, so if we want
to make self.filename or self.name available via the API, we need to agree
on what it should be: the basename of the archived file or the path of the
archive itself.
msg78510 - Author: anatoly techtonik (techtonik) Date: 2008-12-30 07:20
I'm attaching a patch for Python 2.5 as well. People will keep using the
gzip module to build packages for a long time, and the patch will help them
get correct archives.
msg78515 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-12-30 09:07
No further bug fixes are accepted for 2.5 (unless they fix security
problems), so I reject the 2.5 patch.
msg94603 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-10-28 06:54
Lars, is this still accurate?
msg94648 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-10-29 08:52
The latest patch (4750.gzip.basename.fix.diff) cannot be used the way it
is. The problem is that it uses the name attribute to store the basename
with the .gz extension stripped. This breaks compatibility.
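One way to meet both requirements is to leave the name attribute untouched and derive the FNAME value only when the header is written. The sketch below illustrates that idea. `header_fname` and `GzipWriterSketch` are hypothetical names, and this is not the code from the patch or from the eventual commits.

```python
import os


def header_fname(name):
    """Derive the FNAME value from the archive path without modifying it."""
    fname = os.path.basename(name)
    if fname.endswith(".gz"):
        fname = fname[:-3]  # FNAME names the original, uncompressed file
    return fname


class GzipWriterSketch:
    """Hypothetical stand-in for a gzip writer; not the real gzip.GzipFile."""

    def __init__(self, name):
        # Keep the path exactly as passed in, so existing callers that read
        # .name still see what they saw before.
        self.name = name

    def fname_field(self):
        # Computed on demand for the header only.
        return header_fname(self.name)


writer = GzipWriterSketch(os.path.join("dist", "create_py.tar.gz"))
print(writer.name)
print(writer.fname_field())
```

Keeping the two values separate means the header can follow RFC 1952 while code that relies on the name attribute being the opened path keeps working.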
msg94651 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-10-29 09:43
I fixed it in r75935 and r75937.
History
Date, User, Action, Args
2022-04-11 14:56:43, admin, set, github: 49000
2009-10-29 09:43:49, lars.gustaebel, set, status: open -> closed; resolution: accepted; messages: + msg94651
2009-10-29 08:52:04, lars.gustaebel, set, messages: + msg94648
2009-10-28 06:55:13, tarek, set, nosy: loewis, lars.gustaebel, techtonik, tarek; components: - Distutils
2009-10-28 06:54:58, tarek, set, nosy: + tarek; messages: + msg94603; versions: + Python 3.1, Python 3.2, - Python 2.5
2008-12-30 09:07:19, loewis, set, messages: + msg78515
2008-12-30 09:06:38, loewis, set, files: - tarfile.directory.fix.diff
2008-12-30 07:20:20, techtonik, set, files: + python25.issue4750.diff; messages: + msg78510
2008-12-29 22:45:52, techtonik, set, files: + 4750.gzip.basename.fix.diff; messages: + msg78493
2008-12-29 12:01:53, techtonik, set, messages: + msg78449
2008-12-28 15:46:30, techtonik, set, messages: + msg78414
2008-12-27 17:55:45, lars.gustaebel, set, messages: + msg78372
2008-12-27 08:44:53, loewis, set, assignee: lars.gustaebel; messages: + msg78344; nosy: + loewis, lars.gustaebel
2008-12-26 13:21:17, techtonik, set, files: + tarfile.directory.fix.diff; keywords: + patch
2008-12-26 13:19:59, techtonik, create
