Implementation of Longest Common Subsequence

Question 1

So I implemented the LCS algorithm once using a 2-D matrix to store values while the other one used a python dictionary. I found the dictionary implementation was easier to implement and was a more natural and intuitive way of solving the problem.

I just wanted to make sure if it's a correct way of implementing the LCS algorithm and how can I improve on it.

Using 2-D matrix

def LowestCommonSubstring(s1, s2):
 LCS = [[0 for x in range(len(s2) + 1)] for x in range(len(s1) + 1)]
 for i in range(1, len(s1)+1):
 for j in range(1, len(s2)+1):
 if s1[i - 1] == s2[j - 1]:
 LCS[i][j] = 1 + LCS[i-1][j-1]
 else:
 LCS[i][j] = max(LCS[i-1][j], LCS[i][j-1])
 return LCS[i][j]

using dictionary

cache = {}
def lcs(s1, s2):
 global cache
 if len(s1) == 0 or len(s2) == 0:
 return 0
 if (s1, s2) in cache:
 return cache[(s1, s2)]
 else:
 if s1[-1] == s2[-1]:
 cache[(s1, s2)] = 1 + lcs(s1[:-1], s2[:-1])
 else:
 cache[(s1, s2)] = max(lcs(s1[:-1], s2), lcs(s1, s2[:-1]))
 return cache[(s1, s2)]

This is the problem I'm trying to implement and for now my solution only calculates the length of the longest common substring.

Question 2

You could use an explicit memoizer rather than incorporating caching into your function. This will make your function easier to understand. You can use functools lru_cache if you are using Python3.
You should evaluate based on truth, rather than length. len(s1) == 0 is 'bad'.
You can use a turnery operator to reduce the repetition of your code.
Your else can be removed, to prevent the arrow anti-pattern. And reduce the amount of indentation.

def memoize(fn)
 def inner(*args):
 try:
 return cache[args]
 except IndexError:
 cache[args] = fn(*args)
 return cache[args]
 return inner
# Natural abstracted algorithm.
@memoize
def lcs(s1, s2):
 if not s1 or not s2:
 return 0
 return (
 1 + lcs(s1[:-1], s2[:-1])
 if s1[-1] == s2[-1] else
 max(lcs(s1[:-1], s2), lcs(s1, s2[:-1]))
 )

Peilonrayz ♦Peilonrayz 44.4k7 gold badges80 silver badges157 bronze badges · Accepted Answer · 2015-11-10 13:01:20Z

You could use an explicit memoizer rather than incorporating caching into your function. This will make your function easier to understand. You can use functools lru_cache if you are using Python3.
You should evaluate based on truth, rather than length. len(s1) == 0 is 'bad'.
You can use a turnery operator to reduce the repetition of your code.
Your else can be removed, to prevent the arrow anti-pattern. And reduce the amount of indentation.

def memoize(fn)
 def inner(*args):
 try:
 return cache[args]
 except IndexError:
 cache[args] = fn(*args)
 return cache[args]
 return inner
# Natural abstracted algorithm.
@memoize
def lcs(s1, s2):
 if not s1 or not s2:
 return 0
 return (
 1 + lcs(s1[:-1], s2[:-1])
 if s1[-1] == s2[-1] else
 max(lcs(s1[:-1], s2), lcs(s1, s2[:-1]))
 )

Stack Exchange Network

Implementation of Longest Common Subsequence

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Implementation of Longest Common Subsequence

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions