This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: new os.path function to extract common prefix based on path components
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: Paddy McCarthy, Roman.Evstifeev, eric.araujo, eric.smith, eric.snow, ezio.melotti, loewis, paul.moore, python-dev, r.david.murray, rafik, rhettinger, ronaldoussoren, santoso.wijaya, serhiy.storchaka
Priority: low
Keywords: patch

Created on 2010-11-12 15:14 by ronaldoussoren, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name | Uploaded | Description
patch10395 | rafik, 2012-11-04 16:10 | Implementation of os.path.commonpath
patch10395-2 | rafik, 2012-11-05 20:45 | Updated implementation of {posix,nt}path.commonpath
patch10395-3 | rafik, 2012-11-13 00:18 | New version of ntpath.commonpath
ospath_commonpath.patch | serhiy.storchaka, 2014-07-13 18:56 |
Messages (18)
msg121038 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-11-12 15:14
The documentation for os.path.commonprefix notes:
os.path.commonprefix(list)
Return the longest path prefix (taken character-by-character) that is a prefix of all paths in list. If list is empty, return the empty string (''). Note that this may return invalid paths because it works a character at a time.
And indeed:
>>> os.path.commonprefix(['/usr/bin', '/usr/bicycle'])
'/usr/bi'
This is IMHO useless behaviour for a function in the os.path namespace. I'd expect os.path.commonprefix to work with path elements (e.g. that the call above would have returned '/usr').
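To make the contrast concrete, here is a minimal sketch that reproduces the example above and compares it with a component-wise result. It assumes Python 3.5 or later, where os.path.commonpath() exists as the eventual outcome of this issue, and it uses posixpath directly so the output does not depend on the host platform. The variable names are illustrative only.

```python
import posixpath

paths = ['/usr/bin', '/usr/bicycle']

# Character-by-character comparison: the result stops in the middle of a
# path component and does not name a real directory.
char_prefix = posixpath.commonprefix(paths)

# Component-wise comparison, using commonpath() (available from Python 3.5).
component_prefix = posixpath.commonpath(paths)

print(char_prefix)       # /usr/bi
print(component_prefix)  # /usr
```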
msg121039 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-11-12 15:37
Indeed, that behavior seems completely useless.
I've verified that it works the same in 2.5.1.
msg121040 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-11-12 15:48
Although there are test cases in test_genericpath that verify this behavior, so apparently it's intentional.
msg121043 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-11-12 16:18
That's why I wrote 'broken by design' in the title.
A "fix" for this will have to be a new function, if any gets added. (I've written a Unix implementation that finds the longest shared path several times, and can provide an implementation and tests if others agree that this would be useful.)
msg121063 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-11-12 19:35
This goes back to issue400788 and
http://mail.python.org/pipermail/python-dev/2000-July/005897.html
http://mail.python.org/pipermail/python-dev/2000-August/008385.html
Skip changed it to do something meaningful (more than ten years ago), Mark Hammond complained that it was backwards incompatible, Tim Peters argued that you shouldn't change a function if the documented behavior matches the implementation, and Skip reverted the change and added more documentation to make the actual behavior more explicit.
It may be useless, but it's certainly not broken. In addition, it's very likely that applications of it rely on the very semantics that it has.
In any case, anybody proposing a change should go back and re-read the old threads.
msg121106 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-11-13 03:12
Indeed, as I remember it there are people using commonprefix as a string function in situations having nothing to do with os paths.
I'm changing the title to reflect the fact that this is really a feature request for a new function. IMO it is a reasonable feature request. Finding a name for it ought to be an interesting exercise.
I think that this should only be accepted if there is also a Windows implementation.
msg141535 - Author: Eric Snow (eric.snow) * (Python committer) Date: 2011-08-01 21:21
You can already get the better prefix using os.path, albeit less efficiently. Here's an example:
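import os

def commondirname(paths):
    # Start from the character-wise prefix, which may end mid-component.
    subpath = os.path.commonprefix(paths)
    for path in paths:
        # If the prefix is itself one of the inputs, return it unchanged.
        if path == subpath:
            return subpath
    else:
        # Otherwise drop the trailing partial component and keep the directory.
        return os.path.join(os.path.split(subpath)[0], "")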
However, would it be better to implicitly normalize paths first rather than doing a character-by-character comparison? Here is an unoptimized demonstration of what I mean:
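import os

def commondirname(paths):
    result = ""
    for path in paths:
        # Normalize so that equivalent spellings of a path compare equal.
        path = os.path.normcase(os.path.abspath(path))
        if not result:
            result = path
        else:
            # Strip trailing components until result is an ancestor of path.
            while not path.startswith(result + os.path.sep):
                result, _ = os.path.split(result)
                # Stop once only the root (after any drive) is left.
                if os.path.splitdrive(result)[1] == os.path.sep:
                    return result
    return result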
msg174663 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-11-03 18:34
Rafik is working on os.path.commonpath for the bug day.
msg174818 - Author: Rafik Draoui (rafik) Date: 2012-11-04 16:10
Here is a patch with an implementation of os.path.commonpath, along with tests and documentation. At the moment, this is only implemented for POSIX, as I don't feel like I know enough about Windows to tackle drive letters and UNC in paths without spending some more time on it.
This probably needs more tests for corner cases.
msg174819 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-04 16:44
> At the moment, this is only implemented for POSIX, as I don't feel like I know enough about Windows to tackle drive letters and UNC in paths without spending some more time on it.
Just use splitdrive(): first ensure that all drivespecs are the same, then find the common prefix of the pathspecs.
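The splitdrive() suggestion can be sketched roughly as follows. This is only an illustration of the idea, not the code from any of the attached patches: sketch_commonpath is a hypothetical name, the function lowercases its result because it relies on ntpath.normcase(), and it assumes all inputs are absolute paths on a drive.

```python
import ntpath

def sketch_commonpath(paths):
    """Hypothetical sketch: compare drive specs, then path components."""
    # Normalize case and separators, then split 'c:' from '\\users\\...'.
    split = [ntpath.splitdrive(ntpath.normcase(p)) for p in paths]
    drives = {drive for drive, _ in split}
    if len(drives) != 1:
        raise ValueError("paths are on different drives")
    tails = [tail for _, tail in split]

    # Compare the path specs component by component, ignoring empty parts.
    split_tails = [[c for c in tail.split('\\') if c] for tail in tails]
    common = []
    for parts in zip(*split_tails):
        if len(set(parts)) != 1:
            break
        common.append(parts[0])

    root = '\\' if tails[0].startswith('\\') else ''
    return drives.pop() + root + '\\'.join(common)

print(sketch_commonpath([r'C:\Users\alice', r'c:\Users\bob']))
```

Raising ValueError for mismatched drives is a choice made for this sketch; mixing absolute and relative path specs is not handled here and would need a further check.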
msg174941 - Author: Rafik Draoui (rafik) Date: 2012-11-05 20:45
Here is a new patch addressing some of storchaka's review comments, and implementing a version in ntpath.
For the Windows version, I did as proposed in msg174819, but as I am not familiar with the semantics and subtleties of paths in Windows maybe this version of ntpath.commonpath is too simplistic and would return wrong results in some cases. I would like someone more knowledgeable in Windows to take care of it, or maybe just provide a test suite with lots of different corner cases that I could use to provide a better implementation.
msg175493 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-13 09:24
Some conclusions from the discussion on Python-ideas (http://comments.gmane.org/gmane.comp.python.ideas/17719):
1. commonpath() should eat double slashes in input (['/usr/bin', '/usr//bin'] -> '/usr/bin'). In any case the current implementation eats slashes on output (['/usr//bin', '/usr//bin'] -> '/usr/bin', not '/usr//bin').
2. commonpath() should raise an exception instead of returning None on incompatible input.
3. Maybe commonpath() should also eat '.' components and return '.' instead of '' when relative paths have no common prefix. I am not sure.
In general the current patch looks good enough.
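For reference, the points above can be checked against the function as it exists in Python 3.5 and later. The sketch below uses posixpath.commonpath() so that it behaves the same on any platform; the variable names are illustrative, and the behaviour shown is that of the released function as I understand it, not necessarily that of the patch under discussion at this point in the thread.

```python
import posixpath

# 1. Repeated slashes in the input are ignored.
collapsed = posixpath.commonpath(['/usr/bin', '/usr//bin'])

# 2. Incompatible input (absolute mixed with relative) raises ValueError
#    rather than returning None.
try:
    posixpath.commonpath(['/usr/bin', 'usr/bin'])
    mixed_error = None
except ValueError as exc:
    mixed_error = exc

# 3. '.' components are ignored; relative paths with nothing in common
#    give the empty string.
dot_ignored = posixpath.commonpath(['./a/b', 'a/c'])
no_common = posixpath.commonpath(['a', 'b'])

print(repr(collapsed), repr(dot_ignored), repr(no_common))
print(type(mixed_error).__name__)
```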
msg222966 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-07-13 18:56
Here is a revised patch. The behavior has been changed in accordance with the results of the Python-ideas discussion, the tests have been extended, and several bugs have been fixed.
msg222986 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-07-14 00:49
This patch looks reasonable except for the doc change to os.path.commonprefix(). Remember, that function IS working as documented and that our policy is to document in an affirmative manner (here is what the function does and how to use it versus being preachy about "broken-by-design" etc.)
msg238766 - Author: Paddy McCarthy (Paddy McCarthy) Date: 2015-03-21 05:42
Can we now:
 1. Move os.path.commonprefix to str.commonprefix or string.commonprefix
 2. Deprecate the use of os.path.commonprefix
 3. Add os.path.commonpath
 4. Update the documentation.
This seems to have lingered for too long and yet people have been willing to do the work it seems (from 1999).
msg239675 - Author: Paul Moore (paul.moore) * (Python committer) Date: 2015-03-31 09:25
The patch looks good to me.
rhettinger: I'm not sure I see a problem with the doc changes in the latest patch - noting that commonprefix may return an invalid path is fine, and what the current docs say. Directing people to commonpath if they don't want invalid paths also seems fine.
Paddy McCarthy: I don't think that the backward compatibility cost of moving os.path.commonprefix is worth it.
msg239691 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-31 12:14
The patch only adds a reference to commonpath() in the commonprefix() documentation. The note about invalid paths was already there.
msg239694 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-03-31 12:32
New changeset ec6c812fbc1f by Serhiy Storchaka in branch 'default':
Issue #10395: Added os.path.commonpath(). Implemented in posixpath and ntpath.
https://hg.python.org/cpython/rev/ec6c812fbc1f 
History
Date | User | Action | Args
2022-04-11 14:57:08 | admin | set | github: 54604
2017-05-05 21:53:15 | martin.panter | link | issue4755 superseder
2015-04-02 12:42:22 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved
2015-03-31 12:32:56 | python-dev | set | nosy: + python-dev; messages: + msg239694
2015-03-31 12:14:27 | serhiy.storchaka | set | messages: + msg239691
2015-03-31 09:25:18 | paul.moore | set | nosy: + paul.moore; messages: + msg239675
2015-03-21 05:42:33 | Paddy McCarthy | set | nosy: + Paddy McCarthy; messages: + msg238766
2014-07-14 00:49:09 | rhettinger | set | nosy: + rhettinger; messages: + msg222986
2014-07-13 18:56:44 | serhiy.storchaka | set | files: + ospath_commonpath.patch; keywords: + patch; messages: + msg222966; stage: commit review -> patch review
2014-07-05 19:56:38 | eric.araujo | set | stage: patch review -> commit review; versions: + Python 3.5, - Python 3.4
2012-12-29 22:12:25 | serhiy.storchaka | set | assignee: serhiy.storchaka
2012-11-13 09:24:17 | serhiy.storchaka | set | messages: + msg175493
2012-11-13 00:18:05 | rafik | set | files: + patch10395-3
2012-11-05 20:45:58 | rafik | set | files: + patch10395-2; messages: + msg174941
2012-11-04 16:44:13 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg174819
2012-11-04 16:13:09 | serhiy.storchaka | set | stage: needs patch -> patch review
2012-11-04 16:10:38 | rafik | set | files: + patch10395; nosy: + rafik; messages: + msg174818
2012-11-03 18:34:30 | eric.araujo | set | messages: + msg174663; versions: + Python 3.4, - Python 3.3
2011-08-01 21:21:59 | eric.snow | set | nosy: + eric.snow; messages: + msg141535
2011-08-01 18:11:05 | santoso.wijaya | set | nosy: + santoso.wijaya; versions: + Python 3.3, - Python 3.2
2011-07-31 02:39:52 | Roman.Evstifeev | set | nosy: + Roman.Evstifeev
2010-11-19 15:49:24 | eric.araujo | set | nosy: + eric.araujo
2010-11-13 03:12:31 | r.david.murray | set | nosy: + r.david.murray; title: os.path.commonprefix broken by design -> new os.path function to extract common prefix based on path components; messages: + msg121106; type: enhancement; stage: needs patch
2010-11-12 23:53:31 | ezio.melotti | set | nosy: + ezio.melotti
2010-11-12 19:35:22 | loewis | set | nosy: + loewis; messages: + msg121063
2010-11-12 16:18:01 | ronaldoussoren | set | messages: + msg121043
2010-11-12 15:48:04 | eric.smith | set | messages: + msg121040
2010-11-12 15:37:44 | eric.smith | set | nosy: + eric.smith; messages: + msg121039
2010-11-12 15:14:04 | ronaldoussoren | create |
