3
\$\begingroup\$

I have the use case of needing to find the parent path of a sub path for which I have written the following function:

#! /usr/bin/env python3
from os.path import sep

def pathbase(main, sub):
    if main.endswith(sub):
        parent = main[:-len(sub)]
        while parent.endswith(sep):
            parent = parent[:-1]
        return parent
    else:
        raise ValueError('Main path does not end on sub-path')

path1 = '/foo/bar/spamm/eggs'
path2 = 'spamm/eggs'
print(pathbase(path1, path2))

But I find this awfully verbose, and I suspect I have reinvented the wheel. It also breaks if the two paths have a different number of trailing slashes, so my function would need to normalize both paths first, which would make it even more verbose. Is there already a library function that does this and that I may have overlooked?

Edit: Due to the problems regarding the undesired partial node-match mentioned in some answers and comments I refactored the method to this:

from os.path import basename, dirname, normpath

def pathbase(main, sub):
    """Extract basename of main
    relative to its sub-path sub
    """
    norm_main = normpath(main)
    norm_sub = normpath(sub)
    while basename(norm_main) == basename(norm_sub):
        norm_main = dirname(norm_main)
        norm_sub = dirname(norm_sub)
    else:
        if basename(norm_sub):
            raise ValueError('"{sub}" is not a sub-path of {main}'.format(
                sub=sub, main=main))
        return norm_main
asked Aug 31, 2016 at 15:00
\$\endgroup\$
3
  • 1
    \$\begingroup\$ What is the desired output? '/foo/bar/'? Take a look at os.path.relpath. \$\endgroup\$ Commented Aug 31, 2016 at 15:52
  • \$\begingroup\$ What's that while for? \$\endgroup\$ Commented Aug 31, 2016 at 16:52
  • \$\begingroup\$ @Daerdemandt: It would be /foo/bar in the example. @Dex' ter: To normalize the path in a case like pathbase('/foo/bar////spamm/eggs', 'spamm/eggs') \$\endgroup\$ Commented Sep 1, 2016 at 8:18

2 Answers

3
\$\begingroup\$

First, keep in mind what @Daerdemandt suggested in the comments: os.path.relpath (although I think it returns the matching part rather than the parent, but I might be wrong).

From the docs:

Return a relative filepath to path either from the current directory or from an optional start directory. This is a path computation: the filesystem is not accessed to confirm the existence or nature of path or start.

start defaults to os.curdir.

Availability: Unix, Windows.

After reading through the os module, I don't think it offers this either. Honestly, this is an unusual path manipulation, which is probably why the module doesn't provide it.
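To illustrate why relpath does not fit here, a minimal sketch. Using posixpath is my own choice so that the output is the same on every platform, and the base directory is assumed to be known already, which is exactly the value the question wants to compute:

```python
import posixpath  # POSIX flavour of os.path, so the result is the same on any OS

main = '/foo/bar/spamm/eggs'
base = '/foo/bar'  # assumed to be known here; the question wants to derive it

# relpath answers "how do I get from base to main?", so it needs the base as input
# and returns the sub-path, which is the opposite of what pathbase() computes.
relative = posixpath.relpath(main, base)
print(relative)  # spamm/eggs
```

In other words, relpath goes from (main, base) to the sub-path, while the question goes from (main, sub-path) to the base.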

Apart from this, we can cut down your function and make it more straightforward:

import sys

def pathbase(path1, path2):
    try:
        if path1.endswith(path2):
            return path1[:-len(path2)]
        raise TypeError
    except TypeError:
        print('Main path does not end on sub-path')
        sys.exit(0)

path1 = '/foo/bar/spamm/eggs'
path2 = 'spamm/eggs'
print(pathbase(path1, path2))

There's not much to say about this. I just simplified your code by removing the while part, which didn't make much sense to me. I also removed the else part and raised a TypeError, which I handle in a try-except block.

Some minor style things

  • you could have called your function get_base_path()
  • you could have named your arguments main_path and sub_path

In the end, everything would look just like:

import sys

def get_base_path(main_path, sub_path):
    try:
        if main_path.endswith(sub_path):
            return main_path[:-len(sub_path)]
        raise TypeError
    except TypeError:
        print('Main path does not end on sub-path')
        sys.exit(0)

main_path = '/foo/bar/spamm/eggs'
sub_path = 'spamm/eggs'
print(get_base_path(main_path, sub_path))

The above returns:

/foo/bar/

answered Aug 31, 2016 at 17:14
\$\endgroup\$
3
\$\begingroup\$

Your function doesn't behave the way I would expect: pathbase('/foo/bar/spamm/eggs', 'mm/eggs') would return /foo/bar/spa instead of raising ValueError.
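A quick way to see this for yourself: the snippet below reproduces the original function from the question and calls it with a sub-path that starts in the middle of a component. Importing sep from posixpath instead of os.path is my assumption, to keep the example's '/' separators valid on any platform:

```python
from posixpath import sep  # assumption: POSIX separators, as in the question's example

def pathbase(main, sub):
    # The question's original function, reproduced for the demonstration.
    if main.endswith(sub):
        parent = main[:-len(sub)]
        while parent.endswith(sep):
            parent = parent[:-1]
        return parent
    else:
        raise ValueError('Main path does not end on sub-path')

# 'mm/eggs' is only a string suffix, not a sequence of whole path components,
# yet the plain endswith() check accepts it.
result = pathbase('/foo/bar/spamm/eggs', 'mm/eggs')
print(result)  # /foo/bar/spa
```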

I suggest the following solution. It has a side-effect of normalizing the paths (as demonstrated in the fourth doctest below), but that's probably a good thing.

import os.path

def pathbase(main, sub):
    """
    Strip trailing path components.

    >>> pathbase('/foo/bar/spamm/eggs', 'spamm/eggs')
    '/foo/bar'
    >>> pathbase('/foo/bar/spamm/eggs', 'spam/eggs')
    Traceback (most recent call last):
        ...
    ValueError: Main path does not end with sub-path
    >>> pathbase('/foo/bar/spamm/eggs', 'eggs')
    '/foo/bar/spamm'
    >>> pathbase('/foo/./bar/baz/../spamm/eggs', 'bar/soap/../spamm/eggs/')
    '/foo'
    """
    main = os.path.normpath(main)
    sub = os.path.normpath(os.path.join(os.path.sep, sub))
    if main.endswith(sub):
        return main[:-len(sub)]
    else:
        raise ValueError('Main path does not end with sub-path')
answered Aug 31, 2016 at 17:58
\$\endgroup\$
1
  • 1
    \$\begingroup\$ You can get rid of the else and keep only the raise ValueError('Main path does not end with sub-path') part \$\endgroup\$ Commented Aug 31, 2016 at 18:07
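For reference, a sketch of what that comment suggests, with the else dropped so the raise is simply the fall-through case. I use posixpath here instead of os.path, which is my own assumption, so that the '/'-based examples behave the same on every platform:

```python
import posixpath  # assumption: POSIX-style paths, so results match on any OS

def pathbase(main, sub):
    """Strip the trailing path components `sub` from `main`."""
    main = posixpath.normpath(main)
    # Prefixing the separator forces the match to start at a component boundary.
    sub = posixpath.normpath(posixpath.join(posixpath.sep, sub))
    if main.endswith(sub):
        return main[:-len(sub)]
    # No else needed: this line is only reached when the check above failed.
    raise ValueError('Main path does not end with sub-path')

print(pathbase('/foo/bar/spamm/eggs', 'spamm/eggs'))  # /foo/bar
```

The behaviour is unchanged; the only difference is one less level of nesting.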
