homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: The tarfile module crashes when tarfile contains a symlink and unpack directory contain it too
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: SilentGhost, barry, lars.gustaebel, martin.panter, mvo, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2015年01月13日 08:55 by mvo, last changed 2022年04月11日 14:58 by admin.

Files
File name Uploaded Description Edit
tarfile-extract-crash.py mvo, 2015年01月13日 08:55
possible-fix-37688.diff mvo, 2015年01月13日 09:03 review
possible-fix-23228-with-test.diff mvo, 2015年01月13日 09:10 review
possible-fix-23228-with-test2.diff mvo, 2015年01月19日 09:27 review
windowserror.diff lars.gustaebel, 2016年05月08日 15:16 review
Messages (9)
msg233907 - (view) Author: Michael Vogt (mvo) * Date: 2015年01月13日 08:55
The tarfile.makelink() code crashes with a maximum recursion error when it unpacks a tarfile that contains a symlink into a directory that already contains this symlink.
Attached is a standalone testcase (that probably better explains whats going on :) and a possible fix.
msg233910 - (view) Author: Michael Vogt (mvo) * Date: 2015年01月13日 09:03
A possible fix that works with the previous testcase for this bug. It does not break a tarfile tests.
msg233914 - (view) Author: Michael Vogt (mvo) * Date: 2015年01月13日 09:10
This patch contains a regression test as well.
msg233918 - (view) Author: SilentGhost (SilentGhost) * (Python triager) Date: 2015年01月13日 09:36
Perhaps I'm missing something here, but it doesn't seem to be a problem with valid links. Only invalid symlinks are causing this issue.
msg234309 - (view) Author: Michael Vogt (mvo) * Date: 2015年01月19日 09:27
Thanks SilentGhost for your feedback and sorry for my slow reply.
I looked into this some more and attached a updated patch with a more complete test. It also covers a crash now that happens when there is a symlink cycle in the tar and on disk. 
My fix is to remove existing links before unpack and replace them with the link in the tar. This is what gnu tar is also doing (according to http://www.gnu.org/software/tar/manual/html_node/Dealing-with-Old-Files.html). However this seems to be a behavior change of what the tarfile module has traditionally been done so feel free to reject it, I'm open to alternative ideas of course (even though I feel like having the same behavior as gnu tar is desirable).
Thanks,
 Michael
msg264808 - (view) Author: Michael Vogt (mvo) * Date: 2016年05月04日 12:39
Anything I can do to help moving this issue forward?
msg265146 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2016年05月08日 15:16
TarFile.makelink() has a fallback mode in case the platform does not support links. Instead of a symlink or a hardlink it extracts the file it points to as long as it exists in the current archive.
More precisely, makelink() calls os.symlink() and if one of the exceptions in the symlink_exception tuple is raised, it goes into fallback mode. r80944 introduced a regression because it replaced the WindowsError in symlink_exception with an OSError which is much less specific than a WindowsError. Since that change, the fallback is used everytime an OSError occurs, in Michael's case it is a FileExistsError, because the symlink is already there.
The attached patch restores the old behavior. This might not be what you wanted, Michael, but at least, tarfile no longer crashes.
msg265147 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2016年05月08日 15:27
I suck :-) It is hg revision bb94f6222fef.
msg331911 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2018年12月16日 01:34
The problem with WindowsError should only exist in 3.4+. 2.7 doesn’t support creating symlinks on Windows.
Michael’s fix is the same as already done in 2.7 for Issue 10761 and (part of) Issue 12088. However I’m not sure that is the best approach for a bug fix. Also see Issue 19974 proposing to replace existing directory entries in all cases, including replacing empty subdirectories, and not just when extracting symlinks.
I suspect Michael has only fixed the recursive loop on Unix. What happens if an exception is raised because symlinks are not supported (e.g. Windows)? Possible test case:
data = BytesIO()
writer = tarfile.TarFile(fileobj=data, mode='w')
selflink = tarfile.TarInfo('self')
selflink.size = 0
selflink.type = tarfile.SYMTYPE
selflink.linkname = selflink.name
writer.addfile(selflink)
writer.close()
data.seek(0)
tarfile.TarFile(fileobj=data).extractall()
History
Date User Action Args
2022年04月11日 14:58:11adminsetgithub: 67417
2018年12月16日 01:45:59martin.panterlinkissue35483 superseder
2018年12月16日 01:34:29martin.pantersetnosy: + martin.panter
messages: + msg331911
2018年07月11日 07:43:11serhiy.storchakasettype: crash -> behavior
2016年05月08日 15:27:16lars.gustaebelsetmessages: + msg265147
2016年05月08日 15:16:30lars.gustaebelsetfiles: + windowserror.diff

messages: + msg265146
2016年05月04日 12:54:59ppperrysettitle: Crashes when tarfile contains a symlink and unpack directory contain it too -> The tarfile module crashes when tarfile contains a symlink and unpack directory contain it too
2016年05月04日 12:39:01mvosetmessages: + msg264808
2015年01月19日 09:27:20mvosetfiles: + possible-fix-23228-with-test2.diff

messages: + msg234309
2015年01月13日 15:07:44barrysetnosy: + barry
2015年01月13日 10:04:57serhiy.storchakasetnosy: + lars.gustaebel, serhiy.storchaka
stage: patch review

versions: + Python 2.7, Python 3.4, Python 3.5
2015年01月13日 09:36:36SilentGhostsetnosy: + SilentGhost
messages: + msg233918
2015年01月13日 09:10:57mvosetfiles: + possible-fix-23228-with-test.diff

messages: + msg233914
2015年01月13日 09:03:16mvosetfiles: + possible-fix-37688.diff
keywords: + patch
messages: + msg233910
2015年01月13日 08:55:02mvocreate

AltStyle によって変換されたページ (->オリジナル) /