This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2016-04-12 08:32 by Tomas Tomecek, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Messages (5) | |||
|---|---|---|---|
| msg263237 - (view) | Author: Tomas Tomecek (Tomas Tomecek) | Date: 2016-04-12 08:32 | |
I have a tarball (generated by docker-1.10 via `docker export`) and am trying to extract it with Python 2.7's `tarfile` module:
```
import tarfile

with tarfile.open(name=tarball_path) as tar_fd:
    tar_fd.extractall(path=path)
```
Output from a pytest run:
```
/usr/lib64/python2.7/tarfile.py:2072: in extractall
    for tarinfo in members:
/usr/lib64/python2.7/tarfile.py:2507: in next
    tarinfo = self.tarfile.next()
/usr/lib64/python2.7/tarfile.py:2355: in next
    tarinfo = self.tarinfo.fromtarfile(self)
/usr/lib64/python2.7/tarfile.py:1254: in fromtarfile
    return obj._proc_member(tarfile)
/usr/lib64/python2.7/tarfile.py:1276: in _proc_member
    return self._proc_pax(tarfile)
/usr/lib64/python2.7/tarfile.py:1406: in _proc_pax
    value = value.decode("utf8")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
input = '\x01\x00\x00\x02\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', errors = 'strict'
    def decode(input, errors='strict'):
>       return codecs.utf_8_decode(input, errors, True)
E       UnicodeDecodeError: 'utf8' codec can't decode byte 0xc0 in position 4: invalid start byte
/usr/lib64/python2.7/encodings/utf_8.py:16: UnicodeDecodeError
```
Since I know nothing about tars, I have no idea whether this is a bug or whether there is a proper solution or workaround.
When using GNU tar, I'm able to list and extract the tarball.
|
|||
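The decoding failure in the traceback above can be reproduced without the tarball. The following is a minimal sketch, assuming Python 3 and a value that begins with the same bytes as the `input` shown in the traceback; the name `raw_value` and the zero padding to 20 bytes are illustrative assumptions, not taken from the reporter's archive.

```python
# Bytes modelled on the `input` value in the traceback; the 15 trailing
# zero bytes are an assumption about its exact length.
raw_value = b"\x01\x00\x00\x02\xc0" + b"\x00" * 15

try:
    raw_value.decode("utf8")  # strict decoding, like the failing call
    failure = None
except UnicodeDecodeError as exc:
    failure = exc

# 0xc0 can never start a valid UTF-8 sequence, so decoding stops there.
print(failure.start, hex(raw_value[failure.start]))
```

The value is binary data rather than text, which is why a strict UTF-8 decode cannot succeed regardless of how the archive was produced.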
| msg263239 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2016-04-12 08:42 | |
Can you give a link to the tar archive, or for example the first 256 KB of the archive? |
|||
| msg263241 - (view) | Author: Tomas Tomecek (Tomas Tomecek) | Date: 2016-04-12 10:00 | |
Unfortunately I can't, since it's an internal Docker image. I have found a bug report in Red Hat Bugzilla with more info: https://bugzilla.redhat.com/show_bug.cgi?id=1194473 There's even a commit with a fix (via monkeypatching): https://github.com/goldmann/docker-squash/commit/81d1c4c18960a5d940be9b986ccbfaa7853aceb1 If needed, I can construct a minimal reproducer. |
|||
| msg329285 - (view) | Author: Sławomir Nizio (snizio) | Date: 2018-11-05 08:01 | |
I had the same problem with the entries SCHILY.xattr.system.posix_acl_default and SCHILY.xattr.system.posix_acl_access in a tarball with a pax header. This seems to be fixed for Python 3 in issue 8633, commit 1465cc2 in cpython. Tarfile from Python 2 assumes (in _proc_pax) that the values can always be decoded as UTF-8 strings. |
|||
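One way to avoid the strict-decoding assumption described in the message above is to fall back to a lossless error handler when a pax value is not valid UTF-8. The following is a minimal sketch for Python 3; `decode_pax_value` is a hypothetical helper written for illustration, not a function from `tarfile`, and the sample values are made up.

```python
def decode_pax_value(raw: bytes) -> str:
    """Decode a pax header value, tolerating binary payloads."""
    try:
        return raw.decode("utf-8", "strict")
    except UnicodeDecodeError:
        # surrogateescape maps each undecodable byte to a lone surrogate,
        # so the original bytes can be recovered later.
        return raw.decode("utf-8", "surrogateescape")


text_value = decode_pax_value(b"usr/share/doc")
binary_value = decode_pax_value(b"\x01\x00\x00\x02\xc0\x00")

# Encoding with the same handler restores the binary payload unchanged.
restored = binary_value.encode("utf-8", "surrogateescape")
print(text_value, restored == b"\x01\x00\x00\x02\xc0\x00")
```

The fallback keeps ordinary text values readable while letting binary extended-attribute values pass through without raising.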
| msg394670 - (view) | Author: Irit Katriel (iritkatriel) * (Python committer) | Date: 2021-05-28 16:44 | |
Python 2.7 is no longer maintained. There aren't enough details here to tell whether the issue was fixed in python 3. If you are having this problem with python 3.9+, please create a new issue. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:29 | admin | set | github: 70927 |
| 2021-05-28 16:44:34 | iritkatriel | set | status: open -> closed; nosy: + iritkatriel; messages: + msg394670; resolution: out of date; stage: resolved |
| 2018-11-05 08:01:21 | snizio | set | nosy: + snizio; messages: + msg329285 |
| 2016-04-12 10:00:44 | Tomas Tomecek | set | messages: + msg263241 |
| 2016-04-12 08:42:52 | vstinner | set | messages: + msg263239 |
| 2016-04-12 08:36:25 | SilentGhost | set | nosy: + lars.gustaebel; type: behavior |
| 2016-04-12 08:32:18 | Tomas Tomecek | create | |