homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: os.makedirs breaks under autofs directories
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: amcnabb, hynek, loewis, neologix, pitrou, r.david.murray
Priority: normal Keywords:

Created on 2012年04月30日 19:14 by amcnabb, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
mkdir-p.out amcnabb, 2012年05月16日 18:03 strace from mkdir -p
makedirs.out amcnabb, 2012年05月16日 18:05 strace of python script running os.makedirs
Messages (35)
msg159728 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年04月30日 19:14
When a os.makedirs is used under an autofs directory, it crashes. For example, on my machine, `os.makedirs('/net/prodigy/tmp')` crashes with the following traceback:
Traceback (most recent call last):
 ...
 File "/usr/lib64/python2.7/os.py", line 157, in makedirs
 mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/net/prodigy/tmp'
In this case, '/net' is an autofs directory that automatically mounts the "prodigy" directory by connecting to a host called "prodigy" using NFS. The problem seems to be related to the fact that the "/net/prodigy" directory does not actually exist until it is first accessed.
I tried running `mkdir -p /net/prodigy/tmp`, and it succeeds even though the "/net/prodigy" directory did not exist before the "mkdir" command was run.
I'm not sure exactly how `mkdir -p` is implemented, but one potential workaround for Python's makedirs would be to add the following at the top of the function:
 os.stat(name)
This stat call really only needs to be run the first time makedirs is called (it does not need to be used for each recursive call).
msg159730 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年04月30日 19:32
As makedirs in 3.x doesn’t handle EPERM and is otherwise the same, I presume the error is there as well.
I also presume, that after the failed makedirs(), the directory is mounted? I'd just handle the error just we handle EEXIST in 3.x now.
msg159977 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年05月05日 01:45
Hynek: it doesn't seem like that would work, since legitimate EPERM errors are much more likely.
msg159980 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月05日 08:23
David: What do you mean? I'd just catch EPERM and look whether the directory "appeared" in the meantime. If it exists, continue. If not: re-raise. Am I missing something? I may have a look at GNU mkdir if it doesn't help.
msg159983 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年05月05日 10:05
To me, this doesn't look like a os.makedirs() bug, but rather an autofs one (didn't know it was still in use).
I don't like the idea of adding such a kludge (i.e. catching EPERM and checking whether the directory appeared in between).
Could you please provide an strace output of both "os.makedirs()" and "mkdir -p"?
Thanks.
msg159988 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月05日 11:37
Charles, I don't think you can blame autofs here. The problem at hand is that makedirs() never checks whether the directory exists (that would trigger the mount too I presume). Instead, it tries a mkdir and looks if it gets an EEXIST.
If you try that approach in this case where /net is non-writable and /net/prodigy appears only on demand, it fails with an EPERM instead.
Have a look at http://hg.python.org/cpython/file/8215aaccc9dd/Lib/os.py#l136 especially line 151.
msg159990 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年05月05日 13:27
> Charles, I don't think you can blame autofs here. The problem at hand is that makedirs() never checks whether the directory exists (that would trigger the mount too I presume).
Yes, it does. Have a look at line 148:
"""
 if head and tail and not path.exists(head):
"""
> Instead, it tries a mkdir and looks if it gets an EEXIST.
Actually, EEXIST is just caught to cope with race conditions (i.e. the
directory got created in between, TOCTTOU race).
> If you try that approach in this case where /net is non-writable and /net/prodigy appears only on demand, it fails with an EPERM instead.
Actually, no.
makedirs() does a recursive depth-first traversal:
makedirs('/net/prodigy/foo') will actually do something like:
"""
stat('/net/prodigy/foo') == ENOENT
stat('/net/prodigy') == ENONENT
mkdir('/net/prodigy') == EPERM
"""
The NFS mount should appear upon the first - or second - stat() call,
before mkdir().
But I'd like to be sure about that, that's why I think an strace()
output would be useful ;-)
msg159991 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年05月05日 13:30
Hynek: you said "just like EEXIST", which doesn't check to see if the directory exists before continuing, thus my confusion.
If mkdir -p does a stat first, then changing the makedirs algorithm to match is probably not a bad idea for OS compatibility reasons, with autofs just being an example of why.
msg159993 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月05日 14:08
>> Charles, I don't think you can blame autofs here. The problem at hand is that makedirs() never checks whether the directory exists (that would trigger the mount too I presume).
> 
> Yes, it does. Have a look at line 148:
> """
> if head and tail and not path.exists(head):
> """
Now that's embarrassing. I take everything back and claim the opposite.
 
>> If you try that approach in this case where /net is non-writable and /net/prodigy appears only on demand, it fails with an EPERM instead.
> 
> Actually, no.
> makedirs() does a recursive depth-first traversal:
> makedirs('/net/prodigy/foo') will actually do something like:
> """
> stat('/net/prodigy/foo') == ENOENT
> stat('/net/prodigy') == ENONENT
> mkdir('/net/prodigy') == EPERM
> """
> The NFS mount should appear upon the first - or second - stat() call,
> before mkdir().
> 
> But I'd like to be sure about that, that's why I think an strace()
> output would be useful ;-)
I glanced over coreutil's mkdir code and it seems that it changes the directory while creating the directories. I suspect it's related to that. At least it does no fstab parsing. ;) The code is wayyy too intricate to be grokable at glancing though. mkdir -p seems to be quite a can of worms.
Andrew, please send straces. :)
msg160850 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月16日 12:40
Andrew, are you still with us?
msg160892 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月16日 17:31
> Andrew, are you still with us?
I'm here, but it's been a busy few weeks. I'll see if I can spend some time on this today.
msg160899 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月16日 17:59
Some interesting information. If I do `os.mkdir('/net/prodigy/tmp')`, it gives "OSError: [Errno 13] Permission denied: '/net/prodigy/tmp'". However, if I instead do `os.mkdir('/net/prodigy/tmp/hi')`, it succeeds. (Note that I'm being careful to start from a fresh unmounted state in both cases).
I will post straces in a few minutes.
msg160920 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月16日 19:21
I guess this is the magic in mkdir -p:
mkdir("expert", 0755) = -1 EACCES (Permission denied)
chdir("expert") = 0
mkdir("tmp", 0755) = -1 EEXIST (File exists)
I'm not sure how I feel about that. I think we have more or less a policy in os & shutil not to change directories unless really necessary (like make_archive).
msg160979 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年05月17日 16:35
> I guess this is the magic in mkdir -p:
>
> mkdir("expert", 0755)          = -1 EACCES (Permission denied)
> chdir("expert")             = 0
> mkdir("tmp", 0755)           = -1 EEXIST (File exists)
>
> I'm not sure how I feel about that. I think we have more or less a policy in os & shutil not to change directories unless really necessary (like make_archive).
Calling chdir() should be avoided, as it is not thread safe.
Actually, the real problem is that stat() alone doesn't trigger
automount, to avoid mount storms e.g. when using find, see:
https://bugzilla.redhat.com/show_bug.cgi?id=497830
http://lkml.indiana.edu/hypermail/linux/kernel/1109.1/00100.html
Since calling chdir() is not an option, if we want to fix this, I
guess the way to go would be to open() the directory, and not use
path.exists().
But I find really uncomfortable with adding such a kludge, especially
since this problem is probably present in many different places
throughout the stdlib.
Furthermore, it seems that kernel from 2.6.38 do trigger automount
upon stat() (see
http://lkml.indiana.edu/hypermail/linux/kernel/1109.1/00210.html), so
I'd suggest to drop this as "won't fix".
msg160995 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月17日 18:24
I agree, closing.
Thank you for your valuable assistance Andrew!
msg160997 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月17日 18:31
This isn't fixed. All of the examples I've given were with a 3.3.0 kernel. Doing a stat would be a fix.
msg161003 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月17日 18:59
If I read it correctly, we _do_ a stat:
stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
ISTM that this is a different problem altogether, or am I missing something?
If mkdir'ing of deeper directories works, it sounds like a race condition to me.
msg161006 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月17日 19:07
My suspicion re the race condition is that a chdir waits for autofs to mount but mkdir doesn't. You could check that yourself if a chdir/sleep would help.
msg161007 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月17日 19:22
Hmm. Maybe there's a difference between doing stat('/net/prodigy') vs. stat('/net/prodigy/tmp'). Just a guess, but maybe the former can succeed before the mount completes, but the latter has to wait for the mount to complete.
msg161008 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月17日 19:24
By the way, if my hunch about the difference in stat of '/net/prodigy' vs. '/net/prodigy/tmp' is correct, then this would explain why makedirs on deeper directories work. Specifically, one of the shallower stat calls would force the mount to complete before proceeding to the next deeper level.
msg161018 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年05月17日 21:34
Yes, creating the directories in a bottom-up way (i.e. '/net', '/net/prodigy', '/net/prodigy/foo') could maybe avoid this problem.
But this is definietely an autofs bug, and there are probably many other places where such code might break.
msg161024 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月17日 22:52
I sense a very easy fix: just do a stat() on the final path before we do anything (and quickly return if present). Moving current makedirs into a private _makedirs that gets called if the stat fails.
I'm still feeling uneasy about that though. It _is_ a kludge.
msg161028 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012年05月18日 01:19
I agree that the proposed solutions are hacks, and still propose to close this as "won't fix".
I think this is a bug in Linux. mkdir("/net/prodigy/tmp") ought to trigger the mount, then fail with EEXIST.
msg161038 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年05月18日 06:46
Alright, closing for good then.
Andrew, if you want to get this fixed, you should report this to the autofs folks, because it's definitely not a Python bug.
msg161055 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月18日 15:33
I see no evidence that this is a bug in Linux, and I think it's ridiculous to close it when a trivial one-line fix is available. I won't reopen it because it's obvious no one wants to address this. :(
msg161057 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月18日 15:45
I posted a bug report with the kernel here:
https://bugzilla.kernel.org/show_bug.cgi?id=43262 
msg161116 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年05月19日 13:57
> I see no evidence that this is a bug in Linux,
"""
stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
"""
As you can see, a stat() is already done on /net/prodigy.
Event if it wasn't done, calling mkdir() ought to trigger the mount.
So this is *definitely* as bug in autofs.
> and I think it's ridiculous to close it when a trivial one-line fix is available.
Which one-line fix do you propose?
> I won't reopen it because it's obvious no one wants to address this. :(
It's not that we don't want to address this, but rather that we want
to avoid introducing hacks to work around OS bugs.
msg161118 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月19日 14:11
> """
> stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
> """
> 
> As you can see, a stat() is already done on /net/prodigy.
To be fair, that shouldn’t trigger a mount. Otherwise a `ls -l` on /net
would mount all volumes.
> Event if it wasn't done, calling mkdir() ought to trigger the mount.
I’m not sure if/where the behavior is defined – let’s what the Linux
people say.
msg161120 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012年05月19日 14:17
>> """
>> stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
>> mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
>> """
>>
>> As you can see, a stat() is already done on /net/prodigy.
>
> To be fair, that shouldn’t trigger a mount. Otherwise a `ls -l` on /net
> would mount all volumes.
Not sure what "that" is: my view is that mkdir should trigger the mount,
/net/prodigy is already there and available. ls -l doesn't invoke mkdir(2),
so you wouldn't get a mount storm when it is mkdir that triggers the mount.
msg161121 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年05月19日 14:20
>>> """
>>> stat("/net/prodigy", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
>>> mkdir("/net/prodigy/tmp", 0777) = -1 EACCES (Permission denied)
>>> """
>>>
>>> As you can see, a stat() is already done on /net/prodigy.
>>
>> To be fair, that shouldn’t trigger a mount. Otherwise a `ls -l` on /net
>> would mount all volumes.
> 
> Not sure what "that" is: my view is that mkdir should trigger the mount,
> /net/prodigy is already there and available. ls -l doesn't invoke mkdir(2),
> so you wouldn't get a mount storm when it is mkdir that triggers the mount.
Sure, I was refering (as I hoped that my quoting would indicate) to the
stat on /net/prodigy, not the mkdir. I commented on the mkdir in the
next paragraph.
msg161280 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月21日 15:35
> Which one-line fix do you propose?
Doing a stat("/net/prodigy/tmp") before mkdir("/net/prodigy/tmp") is an extremely simple workaround.
Of course, I would love to see clear documentation of how the kernel is defined to behave in this situation. It would certainly be intuitive for mkdir("/net/prodigy/tmp") to force a mount of "/net/prodigy", but defined behavior isn't always intuitive. :)
msg161284 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012年05月21日 17:48
> Of course, I would love to see clear documentation of how the kernel 
> is defined to behave in this situation. It would certainly be 
> intuitive for mkdir("/net/prodigy/tmp") to force a mount of 
> "/net/prodigy", but defined behavior isn't always intuitive. :)
Maybe I'm confused, but the presence of "/net/prodigy" is *not*
the issue here, and what gets mounted is *not* "/net/prodigy",
but "/net/prodigy/tmp" (do "mount" to confirm or dispute).
msg161285 - (view) Author: Andrew McNabb (amcnabb) Date: 2012年05月21日 17:58
> Maybe I'm confused, but the presence of "/net/prodigy" is *not*
> the issue here, and what gets mounted is *not* "/net/prodigy",
> but "/net/prodigy/tmp" (do "mount" to confirm or dispute).
No, /net/prodigy is the mountpoint in this case:
amcnabb@sage:~ :) mount |grep prodigy
prodigy:/ on /net/prodigy type nfs4 ([redacted])
msg163997 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012年06月25日 17:37
I still don't like the idea of adding such a kludge to work around OS
bugs, so I'd suggest closing as won't fix.
msg163998 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年06月25日 17:46
+1 for closing.
History
Date User Action Args
2022年04月11日 14:57:29adminsetgithub: 58907
2012年06月25日 17:46:33pitrousetnosy: + pitrou
messages: + msg163998
2012年06月25日 17:37:10neologixsetmessages: + msg163997
2012年05月21日 17:58:18amcnabbsetmessages: + msg161285
2012年05月21日 17:48:24loewissetmessages: + msg161284
2012年05月21日 15:35:12amcnabbsetmessages: + msg161280
2012年05月19日 14:20:32hyneksetmessages: + msg161121
2012年05月19日 14:17:19loewissetmessages: + msg161120
2012年05月19日 14:11:27hyneksetmessages: + msg161118
2012年05月19日 13:57:34neologixsetmessages: + msg161116
2012年05月18日 15:45:34amcnabbsetmessages: + msg161057
2012年05月18日 15:33:40amcnabbsetmessages: + msg161055
2012年05月18日 06:46:23neologixsetstatus: open -> closed

messages: + msg161038
stage: resolved
2012年05月18日 01:19:30loewissetnosy: + loewis
messages: + msg161028
2012年05月17日 22:53:00hyneksetmessages: + msg161024
2012年05月17日 21:34:31neologixsetmessages: + msg161018
2012年05月17日 19:24:54amcnabbsetmessages: + msg161008
2012年05月17日 19:22:47amcnabbsetmessages: + msg161007
2012年05月17日 19:07:06hyneksetmessages: + msg161006
2012年05月17日 18:59:02hyneksetmessages: + msg161003
2012年05月17日 18:31:07amcnabbsetstatus: closed -> open

messages: + msg160997
2012年05月17日 18:24:40hyneksetstatus: open -> closed
resolution: wont fix
messages: + msg160995

stage: needs patch -> (no value)
2012年05月17日 16:35:20neologixsetmessages: + msg160979
2012年05月16日 19:21:43hyneksetmessages: + msg160920
2012年05月16日 18:05:09amcnabbsetfiles: + makedirs.out
2012年05月16日 18:03:07amcnabbsetfiles: + mkdir-p.out
2012年05月16日 17:59:46amcnabbsetmessages: + msg160899
2012年05月16日 17:31:47amcnabbsetmessages: + msg160892
2012年05月16日 12:40:41hyneksetmessages: + msg160850
2012年05月05日 14:08:57hyneksetkeywords: - easy

messages: + msg159993
2012年05月05日 13:30:32r.david.murraysetmessages: + msg159991
2012年05月05日 13:27:29neologixsetmessages: + msg159990
2012年05月05日 11:37:24hyneksetmessages: + msg159988
2012年05月05日 10:05:13neologixsetnosy: + neologix
messages: + msg159983
2012年05月05日 08:23:50hyneksetmessages: + msg159980
2012年05月05日 01:45:53r.david.murraysetnosy: + r.david.murray
messages: + msg159977
2012年04月30日 20:02:11hyneksetkeywords: + easy
2012年04月30日 19:32:16hyneksetversions: + Python 3.2, Python 3.3
nosy: + hynek

messages: + msg159730

type: behavior
stage: needs patch
2012年04月30日 19:14:17amcnabbcreate

AltStyle によって変換されたページ (->オリジナル) /