So the problem of verifying if a list is a subsequence of another came up in a discussion, and I wrote code that seems to work (I haven't rigorously tested it).
IsSubequence.py
def is_subsequence(lst1, lst2):
"""
* Finds if a list is a subsequence of another.
* Params:
* `lst1` (`list`): The candidate subsequence.
* `lst2` (`list`): The parent list.
* Return:
* (`bool`): A boolean variable indicating whether `lst1` is a subsequence of `lst2`.
"""
l1, l2 = len(lst1), len(lst2)
if l1 > l2: #`l1` must be <= `l2` for `lst1` to be a subsequence of `lst2`.
return False
i = j = 0
d1, d2 = l1, l2
while i < l1 and j < l2:
while lst1[i] != lst2[j]:
j += 1
d2 -= 1
if d1 > d2: #At this point, `lst1` cannot a subsequence of `lst2`.
return False
i, j, d1, d2 = i+1, j+1, d1-1, d2-1
if d1 > d2:
return False
return True
I'm primarily concerned about performance.
1 Answer 1
Review
Testing
(I haven't rigorously tested it).
Well you should write some test to ensure validity of the function. So even after changes you can be sure it will still work.
doctest
is a pretty nice module to use, and is a nice extension of your docstringNaming
Variables should have descriptive names!
lst1
,lst2
if it wasn't for that docstring I would not have known what is the subseq and the parent, so instead I propose to rename them toneedle
andhaystack
here the intent is more clearSame goes for
d1
,d2
... I can see that they are the remaining length of the list, but it is hard to tell from the variable name.for
is considered more Pythonic vswhile
For loops are Pythons greatest feature IMHO, they are easy to read and short to write
You should start writing for loops instead of a while, "Loop like a native" might be an interesting talk to view
Too many assignments in a line
Might be preference, but I find this line hard to read:
i, j, d1, d2 = i+1, j+1, d1-1, d2-1
There are too many values with not enough descriptive names on this line
Alternative
We can instead loop over the haystack
and use slicing to compare the sliced haystack
with the needle
, lastly top it off with the any
keyword and write some tests with the doctest module
import doctest
def is_subsequence(needle, haystack):
"""
Finds if a list is a subsequence of another.
* args
needle: the candidate subsequence
haystack: the parent list
* returns
boolean
>>> is_subsequence([1, 2, 3, 4], [1, 2, 3, 4, 5, 6])
True
>>> is_subsequence([1, 2, 3, 4], [1, 2, 3, 5, 6])
False
>>> is_subsequence([6], [1, 2, 3, 5, 6])
True
>>> is_subsequence([5, 6], [1, 2, 3, 5, 6])
True
>>> is_subsequence([[5, 6], 7], [1, 2, 3, [5, 6], 7])
True
>>> is_subsequence([1, 2, 3, 4, 5, 6, 7, 8], [1, 2, 3, [5, 6], 7])
False
"""
return any(
haystack[i:i+len(needle)] == needle
for i in range(len(haystack) - len(needle) + 1)
)
if __name__ == '__main__':
doctest.testmod()
-
1\$\begingroup\$ Thanks for the suggestions. I don't think
while
is compatible with the algorithm I chose. Also, the algorithm you used is wrong; it tests if the needle is a contiguous subsequence of the haystack, and that's not what I'm trying to do (my use case doesn't require the needle to be contiguous. e.g[1, 1]
is a subsequence of[1, 0, 1]
). \$\endgroup\$Tobi Alafin– Tobi Alafin2019年03月15日 14:03:19 +00:00Commented Mar 15, 2019 at 14:03 -
3\$\begingroup\$ You should have clarified the problem statement in the Question, and if there were any tests, it would have been more obvious ;) \$\endgroup\$Ludisposed– Ludisposed2019年03月15日 15:08:21 +00:00Commented Mar 15, 2019 at 15:08
-
\$\begingroup\$ Indeed, that's one of the most important functions of tests: conveying to other users what results are expected. \$\endgroup\$Toby Speight– Toby Speight2022年07月20日 18:30:37 +00:00Commented Jul 20, 2022 at 18:30
Explore related questions
See similar questions with these tags.
[1,1]
a subsequence of[1,0,1]
? According to your code, it is. If you only want to consider continuous subsequences (eg.[1,1]
is not a subsequence of[1,0,1]
), you can use a string matching algorithm. The items of the two lists can be viewed as characters of two strings. \$\endgroup\$[1, 1]
is supposed to be a subsequence of[1, 0, 1]
. \$\endgroup\$