I need to find the parent path of a sub-path, and I have written the following function for it:
#! /usr/bin/env python3

from os.path import sep


def pathbase(main, sub):
    if main.endswith(sub):
        parent = main[:-len(sub)]
        while parent.endswith(sep):
            parent = parent[:-1]
        return parent
    else:
        raise ValueError('Main path does not end on sub-path')


path1 = '/foo/bar/spamm/eggs'
path2 = 'spamm/eggs'
print(pathbase(path1, path2))
But I find this awfully verbose, and I feel that I have re-invented the wheel. It also breaks if the paths have an unequal number of trailing slashes, so my function would need to normalize both paths first, which would make it even more verbose. Is there already a library function for this that I may have overlooked?
Edit: Because of the undesired partial node match mentioned in some answers and comments, I refactored the function to this:
from os.path import basename, dirname, normpath


def pathbase(main, sub):
    """Extract basename of main
    relative to its sub-path sub
    """
    norm_main = normpath(main)
    norm_sub = normpath(sub)
    while basename(norm_main) == basename(norm_sub):
        norm_main = dirname(norm_main)
        norm_sub = dirname(norm_sub)
    else:
        if basename(norm_sub):
            raise ValueError('"{sub}" is not a sub-path of {main}'.format(
                sub=sub, main=main))
    return norm_main
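
The component-by-component idea behind this refactoring can also be sketched with pathlib. This is only an illustration, not a standard-library function for the task: the name pathbase_parts is made up here, PurePosixPath is assumed so that '/' is the separator on every platform, and '..' segments are not resolved.

```python
from pathlib import PurePosixPath


def pathbase_parts(main, sub):
    """Return the part of `main` that precedes the trailing sub-path `sub`.

    Hypothetical helper for illustration; it compares whole path
    components, so 'mm/eggs' does not match '.../spamm/eggs'.
    """
    main_parts = PurePosixPath(main).parts
    sub_parts = PurePosixPath(sub).parts
    n = len(sub_parts)
    # An empty sub-path or a mismatching tail is treated as an error here.
    if n == 0 or n > len(main_parts) or main_parts[-n:] != sub_parts:
        raise ValueError('"{}" is not a sub-path of "{}"'.format(sub, main))
    return str(PurePosixPath(*main_parts[:-n]))


print(pathbase_parts('/foo/bar/spamm/eggs', 'spamm/eggs'))  # /foo/bar
```

Because PurePosixPath drops trailing slashes when it splits a path into parts, an unequal number of trailing slashes on the two arguments should not matter in this sketch.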
2 Answers
First, keep in mind what @Daerdemandt suggested in the comments: using os.path.relpath (although I think this returns the matching part, but I might be wrong).
From the docs:
Return a relative filepath to path either from the current directory or from an optional start directory. This is a path computation: the filesystem is not accessed to confirm the existence or nature of path or start.
start defaults to os.curdir.
Availability: Unix, Windows.
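
To make the quoted behaviour concrete, here is a small sketch of what relpath computes. It uses posixpath (the module behind os.path on Unix) so the result does not depend on the platform; the variable names are only for illustration. Note that it goes from a known parent to the sub-path, which is the opposite direction of what the question asks for.

```python
import posixpath

main = '/foo/bar/spamm/eggs'
parent = '/foo/bar'

# relpath needs the parent (start) and yields the sub-path,
# not the other way round.
sub = posixpath.relpath(main, start=parent)
print(sub)  # spamm/eggs
```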
After some reading of the os module documentation, I don't think this is possible with it either. Honestly, this is an unusual path manipulation, which is probably why the module does not provide it.
Apart from this, we can cut down your method and make it straightforward:
import sys


def pathbase(path1, path2):
    try:
        if path1.endswith(path2):
            return path1[:-len(path2)]
        raise TypeError
    except TypeError:
        print('Main path does not end on sub-path')
        sys.exit(0)


path1 = '/foo/bar/spamm/eggs'
path2 = 'spamm/eggs'
print(pathbase(path1, path2))
There's not much to say about this. I just simplified your code by removing the while part, which didn't make much sense to me. I also removed the else part and raised a TypeError, which I catch in a try-except block.
Some minor style things:
- you could have called your method get_base_path()
- you could have named your arguments main_path and sub_path
In the end, everything would look like this:
import sys


def get_base_path(main_path, sub_path):
    try:
        if main_path.endswith(sub_path):
            return main_path[:-len(sub_path)]
        raise TypeError
    except TypeError:
        print('Main path does not end on sub-path')
        sys.exit(0)


main_path = '/foo/bar/spamm/eggs'
sub_path = 'spamm/eggs'
print(get_base_path(main_path, sub_path))
The above returns:
/foo/bar/
Your function doesn't behave the way I would expect: pathbase('/foo/bar/spamm/eggs', 'mm/eggs') would return /foo/bar/spa instead of raising ValueError.
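
The cause is that str.endswith compares characters, not path components. A minimal reproduction using only the string operations from the original function:

```python
main = '/foo/bar/spamm/eggs'
sub = 'mm/eggs'

# The suffix test succeeds although 'mm' is not a directory in main ...
print(main.endswith(sub))  # True

# ... so the slice cuts through the middle of the 'spamm' component.
print(main[:-len(sub)])  # /foo/bar/spa
```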
I suggest the following solution. It has a side-effect of normalizing the paths (as demonstrated in the fourth doctest below), but that's probably a good thing.
import os.path


def pathbase(main, sub):
    """
    Strip trailing path components.

    >>> pathbase('/foo/bar/spamm/eggs', 'spamm/eggs')
    '/foo/bar'
    >>> pathbase('/foo/bar/spamm/eggs', 'spam/eggs')
    Traceback (most recent call last):
        ...
    ValueError: Main path does not end with sub-path
    >>> pathbase('/foo/bar/spamm/eggs', 'eggs')
    '/foo/bar/spamm'
    >>> pathbase('/foo/./bar/baz/../spamm/eggs', 'bar/soap/../spamm/eggs/')
    '/foo'
    """
    main = os.path.normpath(main)
    sub = os.path.normpath(os.path.join(os.path.sep, sub))
    if main.endswith(sub):
        return main[:-len(sub)]
    else:
        raise ValueError('Main path does not end with sub-path')
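
To see why joining the separator onto sub rules out partial matches, it may help to look at the intermediate values. The sketch below uses posixpath with a hard-coded '/' so the output is the same on every platform; the variable names good and bad are only for illustration.

```python
import posixpath

# normpath collapses repeated slashes and drops trailing ones.
main = posixpath.normpath('/foo/bar////spamm/eggs')

# With a leading separator, the suffix must start at a component boundary.
good = posixpath.normpath(posixpath.join('/', 'spamm/eggs/'))
bad = posixpath.normpath(posixpath.join('/', 'mm/eggs'))

print(main)                 # /foo/bar/spamm/eggs
print(good)                 # /spamm/eggs
print(bad)                  # /mm/eggs
print(main.endswith(good))  # True
print(main.endswith(bad))   # False
```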
You can get rid of the else and keep only the raise ValueError('Main path does not end with sub-path') part. – Grajdeanu Alex, Aug 31, 2016 at 18:07
'/foo/bar/'? Take a look at os.path.relpath.
/foo/bar in the example. @Dex' ter: To normalize the path in a case like pathbase('/foo/bar////spamm/eggs', 'spamm/eggs')