This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: tarfile normalizes arcname
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.2, Python 2.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: lars.gustaebel
Nosy List: lars.gustaebel, mkv, srid
Priority: normal
Keywords:

Created on 2009-05-18 16:19 by mkv, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (7)
msg88033 - Author: (mkv) Date: 2009-05-18 16:19
When creating tar archives using the tarfile module, requested arc names
are not respected.
It is currently impossible to create a tar archive whose content listing
would be:
$ tar tvf test.tar
./
./control
./prerm
./postinst
The actual result is:
$ tar tvf test.tar
./
control
prerm
postinst
This is caused by TarInfo's tobuf method calling normpath() on all file
names, even when the user has explicitly specified a certain name.
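The effect described above can be reproduced without tarfile at all. The following is a minimal sketch, assuming the normalization applied to member names behaved like posixpath.normpath (the sample names other than those in the report are made up for illustration):

```python
import posixpath

# posixpath.normpath is the POSIX flavour of os.path.normpath; using it
# directly keeps the result independent of the host platform.
requested = ["./control", "./prerm", "debian/./control", "../control"]

# Map each requested archive name to what normalization turns it into.
normalized = {name: posixpath.normpath(name) for name in requested}

for name, result in normalized.items():
    print(f"{name!r} -> {result!r}")
```

A leading "./" and an embedded "/./" are both collapsed, while a leading "../" is left alone, which is why "./control" could not survive as a member name.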
msg88150 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-05-21 09:18
So, what exactly are you trying to accomplish? Why do you need that?
msg88157 - Author: (mkv) Date: 2009-05-21 14:44
I'm creating a Debian package (.deb) for a system which uses busybox's
dpkg. A deb is an ar archive (Unix ar, not tar), which in turn
contains two tar archives. dpkg will first extract a tar archive called
control.tar.gz (or bz2) from the package, and from that tar it will
extract a file stored with the path "./control".
The problem is that with the current implementation of tarfile it's
impossible to create a tar archive which would contain a file stored
with the path "./control". This means it's impossible to use tarfile to
create deb packages which would work with busybox's dpkg.
I'm not 100% sure if that precise path is a requirement of the deb file
format, or if it is because of how busybox's dpkg is implemented. However,
I have not seen a packaging guide or a deb package which wouldn't have
the control file stored as ./control in the tar archive.
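For reference, here is a rough sketch of what the requested behaviour looks like on a tarfile version that no longer normalizes names (such as current Python 3 releases). The helper names build_control_tar and list_names, and the file contents, are hypothetical; only the "./control" member name comes from the report.

```python
import io
import tarfile


def build_control_tar(files):
    """Return the bytes of a gzip-compressed tar whose member names are used as given."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)  # name is stored verbatim
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_names(archive_bytes):
    """Read the archive back and return its member names in order."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        return tar.getnames()


# Placeholder contents; a real control file needs more fields.
control_tar = build_control_tar({
    "./control": b"Package: example\nVersion: 1.0\n",
    "./postinst": b"#!/bin/sh\nexit 0\n",
})
print(list_names(control_tar))
```

This only covers the inner control.tar.gz; wrapping it in the outer ar archive is a separate step that tarfile does not handle.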
msg88230 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-05-23 12:03
Apparently, the .deb file format is not explicit about that, but it
seems to be common practice to have all files prefixed with './'.
normpath is used all over tarfile; the crucial occurrences are in
TarFile.add() and TarInfo.get_info(). As you're using a unix-like system
the easiest workaround is to replace the module level tarfile.normpath
function with a no-op.
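A possible shape for that workaround is sketched below. The context manager normpath_disabled is a hypothetical helper, not part of tarfile; it assumes the affected tarfile versions look up normpath as a module-level name, as described above. On versions where tarfile has no normpath attribute, the helper does nothing.

```python
import contextlib
import tarfile


@contextlib.contextmanager
def normpath_disabled():
    """Temporarily replace tarfile.normpath with a no-op, if it exists."""
    original = getattr(tarfile, "normpath", None)
    if original is None:
        # Nothing to patch on versions without a module-level normpath.
        yield
        return
    tarfile.normpath = lambda path: path  # return names unchanged
    try:
        yield
    finally:
        tarfile.normpath = original  # always restore the original function
```

Archives would then be written inside a `with normpath_disabled():` block. Restoring the function afterwards keeps the patch from leaking into unrelated code that imports tarfile.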
The original rationale for using normpath on all pathnames was to keep
the names in an archive clean and in their canonical form. Most
occurrences of normpath date back to the original 2003 version (cf.
r30613) and have never been touched.
However, I found nothing in POSIX about normalizing pathnames. GNU tar and
star both strip different leading path components like "./" and "../"
from pathnames, but neither removes "./" components from inside a
pathname, for example. This means that the usage of normpath seems more
or less unnecessary in tarfile.
I will create a patch that addresses these issues.
Thanks for your report.
msg88233 - Author: (mkv) Date: 2009-05-23 12:39
Great, thanks for the speedy work :) 
Now if only issue4750 would get fixed for 2.7 as well ;)
msg92044 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-08-28 20:56
I have done some research in order to find a suitable behaviour for
tarfile. I wrote a script to test to what extent all the different tar
implementations transform input pathnames. The results can be found at
http://www.gustaebel.de/lars/tarfile/wwgtd.html.
My conclusion is the following: tarfile now does no pathname
transformation whatsoever except for converting absolute to relative
paths (to stay backwards compatible). This way tarfile is closer to
POSIX, applies less magic and gives more responsibility to the user.
Fixed in r74571 (trunk) and r74573 (py3k). Thanks for your report.
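The resulting behaviour can be checked with a short script. This is a sketch against a current Python 3 tarfile, not the test script mentioned above; archived_name is a hypothetical helper and the arcname values are examples.

```python
import io
import os
import tarfile
import tempfile


def archived_name(arcname):
    """Add one temporary file under `arcname` and return the stored member name."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "control")
        with open(source, "wb") as handle:
            handle.write(b"Package: example\n")
        buffer = io.BytesIO()
        with tarfile.open(fileobj=buffer, mode="w") as tar:
            tar.add(source, arcname=arcname)
    # Re-open the finished archive and read the name back.
    with tarfile.open(fileobj=io.BytesIO(buffer.getvalue())) as tar:
        return tar.getnames()[0]


for requested in ["./control", "debian/./control", "/control"]:
    print(f"{requested!r} stored as {archived_name(requested)!r}")
```

Relative names, including "./" components, should come back unchanged, while the absolute name loses its leading slash.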
msg105922 - Author: Sridhar Ratnakumar (srid) Date: 2010-05-17 17:54
Apparently this fix introduced a regression. See issue8741 
History
Date                 User            Action  Args
2022-04-11 14:56:49  admin           set     github: 50304
2010-05-17 17:54:16  srid            set     nosy: + srid; messages: + msg105922
2009-08-28 20:56:24  lars.gustaebel  set     status: open -> closed; resolution: fixed; messages: + msg92044; versions: + Python 3.2, - Python 3.1
2009-05-23 12:39:14  mkv             set     messages: + msg88233
2009-05-23 12:03:19  lars.gustaebel  set     messages: + msg88230; versions: + Python 3.1, Python 2.7, - Python 2.6
2009-05-21 14:44:25  mkv             set     messages: + msg88157
2009-05-21 09:18:37  lars.gustaebel  set     messages: + msg88150
2009-05-18 20:31:43  loewis          set     assignee: lars.gustaebel; nosy: + lars.gustaebel
2009-05-18 16:19:55  mkv             create