The question that sparked this one was a Stack Overflow post in which the OP was looking for a way to find a common prefix among file names (a list of strings). While an answer there said to use something from the os library, I began to wonder how one might implement a common_prefix function.
I decided to try my hand at finding out, and along with creating a common_prefix function, I also created a common_suffix function. After verifying that the functions worked, I decided to go the extra mile: I documented my functions and made them into a package of sorts, as I'm sure they will come in handy later.
But before sealing up the package for good, I decided I would try to make my code as "Pythonic" as possible, which led me here.
I made sure to document my code heavily, so I feel confident that I shouldn't have to explain how the functions work or how to use them:
from itertools import zip_longest


def all_same(items: (tuple, list, str)) -> bool:
    '''
    A helper function to test if
    all items in the given iterable
    are identical.

    Arguments:
    items -> the given iterable to be used

    eg.
    >>> all_same([1, 1, 1])
    True
    >>> all_same([1, 1, 2])
    False
    >>> all_same((1, 1, 1))
    True
    >>> all_same((1, 1, 2))
    False
    >>> all_same("111")
    True
    >>> all_same("112")
    False
    '''
    return all(item == items[0] for item in items)


def common_prefix(strings: (list, tuple), _min: int=0, _max: int=100) -> str:
    '''
    Given a list or tuple of strings, find the common prefix
    among them. If a common prefix is not found, an empty string
    will be returned.

    Arguments:
    strings -> the string list or tuple to
    be used.

    _min, _max -> If a common prefix is found,
    its length will be tested against the range _min
    and _max. If its length is not in the range, an
    empty string will be returned, otherwise the prefix
    is returned

    eg.
    >>> common_prefix(['hello', 'hemp', 'he'])
    'he'
    >>> common_prefix(('foobar', 'foobaz', 'foobam'))
    'fooba'
    >>> common_prefix(['foobar', 'foobaz', 'doobam'])
    ''
    '''
    prefix = ""
    for tup in zip_longest(*strings):
        if all_same(tup):
            prefix += tup[0]
        else:
            if _min <= len(prefix) <= _max:
                return prefix
            else:
                return ''


def common_suffix(strings: (list, tuple), _min: int=0, _max: int=100) -> str:
    '''
    Given a list or tuple of strings, find the common suffix
    among them. If a common suffix is not found, an empty string
    will be returned.

    Arguments:
    strings -> the string list or tuple to
    be used.

    _min, _max -> If a common suffix is found,
    its length will be tested against the range _min
    and _max. If its length is not in the range, an
    empty string will be returned, otherwise the suffix
    is returned

    eg.
    >>> common_suffix(['rhyme', 'time', 'mime'])
    'me'
    >>> common_suffix(('boo', 'foo', 'goo'))
    'oo'
    >>> common_suffix(['boo', 'foo', 'goz'])
    ''
    '''
    suffix = ""
    strings = [string[::-1] for string in strings]
    for tup in zip_longest(*strings):
        if all_same(tup):
            suffix += tup[0]
        else:
            if _min <= len(suffix) <= _max:
                return suffix[::-1]
            else:
                return ''
3 Answers
common_suffix can be written as return common_prefix([s[::-1] for s in strings])[::-1] because the two operations are symmetric to one another, and this avoids duplication.
Also, I think you should not handle max or min inside the common_prefix function, because it gives the function a double responsibility: finding prefixes plus checking a length interval.
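The two suggestions above can be sketched roughly as follows. This is a minimal illustration, not the asker's final code: each string is reversed individually before the prefix search, and length_in_range is a hypothetical helper name introduced here to hold the length check outside the search itself.

```python
def common_prefix(strings):
    """Return the longest prefix shared by every string in `strings`."""
    prefix = []
    # zip() stops at the shortest string, which bounds the prefix length.
    for chars in zip(*strings):
        if len(set(chars)) != 1:
            break
        prefix.append(chars[0])
    return "".join(prefix)


def common_suffix(strings):
    """Reverse each string, reuse common_prefix, then reverse the result."""
    return common_prefix([s[::-1] for s in strings])[::-1]


def length_in_range(text, minimum=0, maximum=100):
    """Hypothetical helper: the length check, kept separate from the search."""
    return minimum <= len(text) <= maximum
```

A caller that still wants the original behaviour can combine them, for example: prefix = common_prefix(names), then use prefix only if length_in_range(prefix, 2, 10) holds.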
Why are you limiting yourself to strings? Python allows general functions very easily.
Why do you build the whole result and then return it? You should yield the result item by item.
Why do you write so much yourself? Using the itertools module is much more efficient and simple:
import itertools

def common_prefix(its):
    # all_equal is a predicate that is True when every element of a tuple is equal
    yield from itertools.takewhile(all_equal, zip(*its))
PS: common_suffix will now need to use reversed(list(...)) instead of [::-1].
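A runnable sketch of this generator approach might look like the following. It assumes an all_equal helper defined as shown (the answer does not spell it out), and it differs from the one-liner in one respect: it yields the first element of each matching group rather than the whole tuple, so the output is the prefix itself.

```python
import itertools


def all_equal(items):
    """Assumed helper: True when every element of a non-empty tuple is equal."""
    return all(item == items[0] for item in items)


def common_prefix(iterables):
    """Lazily yield the leading elements shared by all the iterables."""
    for group in itertools.takewhile(all_equal, zip(*iterables)):
        yield group[0]


def common_suffix(sequences):
    """Walk each sequence backwards, then restore the original order."""
    shared = list(common_prefix(reversed(list(seq)) for seq in sequences))
    return shared[::-1]
```

Because nothing here is specific to strings, the same functions work on lists of numbers; for strings, "".join(common_prefix(names)) turns the yielded characters back into text.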
- By the way: while I disagree on many aspects of this code, I find the documentation outstanding, and I could review it quickly and easily because of it. – Caridorc, Oct 31, 2016 at 19:20
- Thanks! As a side note, I was considering using yield, but I would have had no way of testing my prefix/suffix length, which was important to my project. – Chris, Oct 31, 2016 at 19:28
- @Pythonic After you call the function, do len(list(common_prefix)) in range(min, max). It may cost you some efficiency though. If you want to take really short parts of really long prefixes, you can use take to preserve efficiency (take is islice). – Caridorc, Oct 31, 2016 at 19:31
- Alright, I'll see how that works out. – Chris, Oct 31, 2016 at 19:32
- @Pythonic Did you implement another version using islice? Was in range fast enough for you? – Caridorc, Nov 2, 2016 at 22:58
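The length check discussed in these comments could be sketched with itertools.islice as below. This is only an illustration under stated assumptions: prefix_within is a hypothetical helper name, the bounds are treated as inclusive (as in the question's code, unlike range(min, max)), and common_prefix is a lazy variant that yields one element at a time.

```python
from itertools import islice, takewhile


def common_prefix(iterables):
    """Lazily yield the leading elements shared by all the iterables."""
    groups = takewhile(lambda g: all(x == g[0] for x in g), zip(*iterables))
    return (g[0] for g in groups)


def prefix_within(iterables, minimum, maximum):
    """Hypothetical helper: return the prefix as a list if its length lies in
    [minimum, maximum], otherwise an empty list."""
    # Reading at most maximum + 1 items is enough to detect an overlong prefix.
    head = list(islice(common_prefix(iterables), maximum + 1))
    return head if minimum <= len(head) <= maximum else []
```

With islice the generator is never consumed past maximum + 1 elements, so a very long shared prefix does not have to be built in full just to be rejected.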
If you want to use a type annotation for all_same(items: (tuple, list, str)), I suggest declaring items to be a typing.Sequence.
I don't understand why you want to use zip_longest() when the length of the common prefix is certainly limited by the shortest input. A simple zip() should do.
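Both points can be illustrated with a short sketch. It assumes the _min/_max arguments are left out for brevity; the rest follows the structure of the question's functions, with typing.Sequence as the annotation and zip() in place of zip_longest().

```python
from typing import Sequence


def all_same(items: Sequence) -> bool:
    """True when every element of a sequence equals the first (or it is empty)."""
    return all(item == items[0] for item in items)


def common_prefix(strings: Sequence[str]) -> str:
    """Longest common prefix; zip() stops at the shortest string."""
    prefix = ""
    for chars in zip(*strings):
        if not all_same(chars):
            break
        prefix += chars[0]
    return prefix
```

Since zip() ends when the shortest string runs out, the loop simply finishes in that case and the accumulated prefix is returned, including when every input string is identical.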
#!/usr/bin/env python3

# common prefix and common suffix of a list of strings
# https://stackoverflow.com/a/6719272/10440128
# https://codereview.stackexchange.com/a/145762/205605

import itertools


def all_equal(it):
    x0 = it[0]
    return all(x0 == x for x in it)


def common_prefix(strings):
    char_tuples = zip(*strings)
    prefix_tuples = itertools.takewhile(all_equal, char_tuples)
    return "".join(x[0] for x in prefix_tuples)


def common_suffix(strings):
    return common_prefix(map(reversed, strings))[::-1]


strings = ["aa1zz", "aaa2zzz", "aaaa3zzzz"]
assert common_prefix(strings) == "aa"
assert common_suffix(strings) == "zz"
print("ok")
- You did not review the existing solution. You provided an alternative answer without explaining how it is better than the existing one. Please edit your answer to comply with how to answer. – Billal BEGUERADJ, Jul 2, 2024 at 18:58
- bla bla bla. i have converted the existing solution into actual code – milahu, Jul 2, 2024 at 19:14
- @milahu Please add an explanation of how your answer has improved the code. If you point to one thing (a what) and explain the improvement (a how) you should be clearly on the correct side of our rules. – Jul 2, 2024 at 19:30