This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: TarFile.getmember on directory requires trailing slash iff over 100 chars
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: af, andrei.avk, lars.gustaebel, miss-islington, moloney, puppet, r.david.murray, serhiy.storchaka, vstinner, zigg
Priority: normal Keywords: patch

Created on 2014-07-16 03:40 by moloney, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
tarfile_issue.py moloney, 2014-07-16 19:57
issue21987.diff lars.gustaebel, 2014-07-23 09:57
issue21987_py3.5_with_test.patch zigg, 2014-07-26 00:06
issue21987_py2.7_with_test.patch puppet, 2014-08-02 08:54
Pull Requests
URL Status Linked
PR 30283 merged andrei.avk, 2021-12-28 16:58
PR 30737 merged miss-islington, 2022-01-21 07:40
PR 30738 merged miss-islington, 2022-01-21 07:40
Messages (15)
msg223167 - Author: Brendan Moloney (moloney) Date: 2014-07-16 03:40
If a directory path is under 100 characters, you have to omit the trailing slash from the name passed to 'getmember'. If it is over 100 characters, you have to include the trailing slash.
As a workaround I can use the private '_getmember' with 'normalize=True'.
I tested on 2.7.2 and searched the release notes for a related fix since then. I couldn't find anything there, or here in the issue tracker.
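The following is a minimal sketch of how the reported behavior could be probed. It is not the attached tarfile_issue.py. The helper names (build_archive, can_find) and the directory names are made up for illustration. GNU_FORMAT is requested explicitly because the default archive format changed to PAX_FORMAT in Python 3.8. Which lookups succeed depends on the Python version in use, so the sketch only prints what it observes.

```python
import io
import tarfile


def build_archive(dir_names, fmt=tarfile.GNU_FORMAT):
    """Return an in-memory tar archive containing only directory entries."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        for name in dir_names:
            info = tarfile.TarInfo(name)
            info.type = tarfile.DIRTYPE
            info.mode = 0o755
            tar.addfile(info)
    buf.seek(0)
    return buf


def can_find(tar, name):
    """Return True if getmember() finds the name, False on KeyError."""
    try:
        tar.getmember(name)
    except KeyError:
        return False
    return True


SHORT_DIR = "short_dir"  # illustrative name, well under 100 characters
LONG_DIR = "d" * 120     # illustrative name, over 100 characters

results = {}
with tarfile.open(fileobj=build_archive([SHORT_DIR, LONG_DIR]), mode="r") as tar:
    for label, base in (("short", SHORT_DIR), ("long", LONG_DIR)):
        results[label, "no slash"] = can_find(tar, base)
        results[label, "slash"] = can_find(tar, base + "/")

# Print one line per lookup; the True/False values vary by Python version.
for key, found in sorted(results.items()):
    print(key, found)
```

On an interpreter affected by the report, the "short" and "long" rows would be expected to disagree about whether the trailing slash is needed.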
msg223174 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-07-16 05:50
Could you please provide an example?
msg223264 - Author: Brendan Moloney (moloney) Date: 2014-07-16 19:57
Here is a script illustrating the issue.
msg223432 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-07-18 20:50
There is indeed special logic that triggers if the name is longer than 100 characters. Presumably it has a bug. Marking this as easy since it shouldn't be too hard, given the failure example, to figure out what is wrong and fix it (and turn the example into a unit test).
It doesn't look like the relevant code has changed in Python 3, so the bug probably exists there as well.
msg223732 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2014-07-23 09:57
Apparently, the problem is located in TarInfo._proc_gnulong(). I attached a patch.
When tarfile reads an archive, it strips trailing slashes from all filenames, except GNUTYPE_LONGNAME headers, which is a bug. tarfile creates GNU_FORMAT tar files by default, hence it uses an additional GNUTYPE_LONGNAME header for filenames >100 chars. That's why tarfile_issue.py fails if used with PAX_FORMAT, because PAX_FORMAT doesn't have this bug.
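The format difference described above can be observed by looking at the first header block that tarfile writes. This is a rough sketch, not part of the attached patch. The helper name first_typeflag and the directory names are invented for illustration. The offset 156 is the position of the type flag byte in a 512-byte tar header.

```python
import io
import tarfile

TYPEFLAG_OFFSET = 156  # position of the type flag byte in a tar header block


def first_typeflag(dir_name, fmt):
    """Write one directory entry and return the type flag of the first header."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo(dir_name)
        info.type = tarfile.DIRTYPE
        tar.addfile(info)
    raw = buf.getvalue()
    return raw[TYPEFLAG_OFFSET:TYPEFLAG_OFFSET + 1]


LONG_DIR = "d" * 120  # illustrative name longer than 100 characters

gnu_long = first_typeflag(LONG_DIR, tarfile.GNU_FORMAT)
pax_long = first_typeflag(LONG_DIR, tarfile.PAX_FORMAT)
gnu_short = first_typeflag("short_dir", tarfile.GNU_FORMAT)

# GNU format puts a GNUTYPE_LONGNAME header in front of the long-named entry.
print(gnu_long == tarfile.GNUTYPE_LONGNAME)
# PAX format stores the long path in an extended header instead.
print(pax_long == tarfile.XHDTYPE)
# A short name needs no extra header, so the first block is the directory itself.
print(gnu_short == tarfile.DIRTYPE)
```

Only the GNU long-name case goes through the GNUTYPE_LONGNAME code path that the message identifies as the source of the bug.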
msg224014 - Author: Matt Behrens (zigg) * Date: 2014-07-26 00:06
Here is a 3.5 fix based on Lars Gustäbel's, with test.
msg224542 - Author: Daniel Eriksson (puppet) * Date: 2014-08-02 08:54
Added Matt Behrens' test to Lars Gustäbel's 2.7 version.
msg348618 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-29 11:32
This issue is 5 years old and has 4 patches: it's far from being "newcomer friendly", so I am removing the "Easy" label.
msg376370 - Author: af (af) Date: 2020-09-04 15:01
Any updates on this?
msg376373 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-09-04 15:18
> Any updates on this?
So far, nobody has proposed a pull request. So no, there is no update.
Someone has to step in, dig into the issue, propose a fix, then someone else has to review the PR, and finally the PR should be merged.
msg409261 - Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-12-28 17:08
The original issue was twofold:
1. below 100 chars, lookup not working WITH a trailing slash
2. over 100 chars, lookup not working WITHOUT a trailing slash
The second part is no longer an issue; I tested in 3.9 and 3.11 on macOS.
Currently the issue is that a trailing slash doesn't work for lookup of dirs, regardless of the length of the name.
This is inconsistent with the way shell commands work, as well as with various Python path-related modules that tolerate a trailing slash for dirs.
This can cause users to wrongly assume a dir is absent from a tarfile, so I think it's worth fixing, and I've added a PR with a test for both the old and the new issue.
msg409265 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-12-28 18:51
Well, the tar command strips trailing slashes (even from file paths), so it is reasonable to do this in getmember().
$ mkdir dir
$ touch dir/file
$ tar cf archive.tar dir
$ tar tf archive.tar dir
dir/
dir/file
$ tar tf archive.tar dir/
dir/
dir/file
$ tar tf archive.tar dir/file
dir/file
$ tar tf archive.tar dir/file/
dir/file
$ tar tf archive.tar dir/file////
dir/file
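For interpreters that do not yet include a fix, the same normalization the tar command performs can be approximated in user code. This is only a sketch of the idea, not the change that was merged. The wrapper getmember_tolerant is a hypothetical helper, and the archive contents mirror the shell session above.

```python
import io
import tarfile


def getmember_tolerant(tar, name):
    """Hypothetical helper: look up a member, ignoring trailing slashes."""
    stripped = name.rstrip("/")
    # Fall back to the original name if stripping leaves nothing (e.g. "/").
    return tar.getmember(stripped or name)


# Build an in-memory archive with a directory "dir" and a file "dir/file".
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    dir_info = tarfile.TarInfo("dir")
    dir_info.type = tarfile.DIRTYPE
    tar.addfile(dir_info)

    data = b"hello"
    file_info = tarfile.TarInfo("dir/file")
    file_info.size = len(data)
    tar.addfile(file_info, io.BytesIO(data))
buf.seek(0)

queries = ["dir", "dir/", "dir/file", "dir/file/", "dir/file////"]
with tarfile.open(fileobj=buf, mode="r") as tar:
    found = {query: getmember_tolerant(tar, query).name for query in queries}

for query in queries:
    print(query, "->", found[query])
```

Stripping the slashes from the query rather than from the stored names keeps the helper independent of how a given tarfile version normalizes member names for short paths.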
msg411089 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-01-21 07:40
New changeset cfadcc31ea84617b1c73022ce54d4ae831333e8d by andrei kulakov in branch 'main':
bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283)
https://github.com/python/cpython/commit/cfadcc31ea84617b1c73022ce54d4ae831333e8d
msg411092 - Author: miss-islington (miss-islington) Date: 2022-01-21 08:06
New changeset 1d11fdd3eeff77ba600278433b7ab0ce4d2a7f3b by Miss Islington (bot) in branch '3.10':
bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283)
https://github.com/python/cpython/commit/1d11fdd3eeff77ba600278433b7ab0ce4d2a7f3b
msg411391 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-01-23 17:54
New changeset 94d6434ba7ec3e4b154e515c5583b0b665ab0b09 by Miss Islington (bot) in branch '3.9':
[3.9] bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283) (GH-30738)
https://github.com/python/cpython/commit/94d6434ba7ec3e4b154e515c5583b0b665ab0b09
History
Date User Action Args
2022-04-11 14:58:06 admin set github: 66186
2022-01-23 17:54:59 serhiy.storchaka set status: open -> closed
    stage: patch review -> resolved
    resolution: fixed
    versions: + Python 3.9, Python 3.10, Python 3.11, - Python 2.7, Python 3.5
2022-01-23 17:54:22 serhiy.storchaka set messages: + msg411391
2022-01-21 08:06:05 miss-islington set messages: + msg411092
2022-01-21 07:40:46 miss-islington set pull_requests: + pull_request28925
2022-01-21 07:40:44 serhiy.storchaka set messages: + msg411089
2022-01-21 07:40:41 miss-islington set nosy: + miss-islington
    pull_requests: + pull_request28924
2021-12-28 18:51:01 serhiy.storchaka set messages: + msg409265
2021-12-28 17:08:06 andrei.avk set messages: + msg409261
2021-12-28 16:58:04 andrei.avk set nosy: + andrei.avk
    pull_requests: + pull_request28497
2020-09-04 15:18:45 vstinner set messages: + msg376373
2020-09-04 15:01:43 af set nosy: + af
    messages: + msg376370
2019-07-29 11:32:48 vstinner set keywords: - easy
    nosy: + vstinner
    messages: + msg348618
2014-08-04 14:07:02 ezio.melotti set stage: test needed -> patch review
2014-08-02 08:54:22 puppet set files: + issue21987_py2.7_with_test.patch
    nosy: + puppet
    messages: + msg224542
2014-07-26 00:06:33 zigg set files: + issue21987_py3.5_with_test.patch
    versions: + Python 3.5
    nosy: + zigg
    messages: + msg224014
2014-07-23 09:57:06 lars.gustaebel set files: + issue21987.diff
    keywords: + patch
    messages: + msg223732
2014-07-18 20:50:56 r.david.murray set keywords: + easy
    nosy: + r.david.murray
    messages: + msg223432
2014-07-16 19:57:48 moloney set files: + tarfile_issue.py
    messages: + msg223264
2014-07-16 05:50:13 serhiy.storchaka set nosy: + lars.gustaebel, serhiy.storchaka
    messages: + msg223174
    stage: test needed
2014-07-16 03:40:59 moloney create
