4
\$\begingroup\$

I was working on a problem set and I came across this problem:

Assume s is a string of lower case characters.

Write a program that prints the longest substring of s in which the letters occur in alphabetical order. For example, if s = 'azcbobobegghakl', then your program should print:

Longest substring in alphabetical order is: beggh

In the case of ties, print the first substring. For example, if s = 'abcbcd', then your program should print

Longest substring in alphabetical order is: abc

I actually was able to get a solution, but it was the most inefficient ever! And here's what I came up with:

def test():
 index = 1
 prev_index = 0
 count = 0
 global largest
 largest = ''
 test = s[prev_index]
 while count < len(s):
 if ord(s[index]) > ord(s[prev_index]):
 test += s[index]
 index += 1
 prev_index += 1
 elif ord(s[index]) == ord(s[prev_index]):
 test += s[index]
 index += 1
 prev_index += 1
 else:
 if len(largest) < len(test):
 largest = test[:]
 test = s[index]
 prev_index += 1
 index += 1
 count += 1
 return largest
try:
 test()
except IndexError:
 pass
finally:
 print largest

Can anyone give me an example of better code that produces the same output?

asked Oct 26, 2013 at 13:14
\$\endgroup\$

3 Answers 3

2
\$\begingroup\$

Your algorithm is basically fine, but it is expressed in the code clumsily.

The biggest problem is the interface of the function: it should accept a string parameter and return a string. Instead, your function takes s from its environment and uses the global variable largest to store the answer. Although there is a return largest statement, it is very misleading. What actually happens is that s[index] raises an IndexError, so that the return largest is never reached, and the answer is passed back using the global variable instead.

The root cause for the IndexError is that there are too many variables, causing confusion. With each iteration through the loop, you increment index, prev_index, and count. That means that those three variables could be reduced to one, which could just be called i. (If you think about it, you'll find that count is always one less than index. Since the loop condition is count < len(s), s[index] will go one position beyond the end of the string.)

The next observation to make is that your if and elif blocks are almost identical. You can combine them by using >= for the test.

Python strings are immutable, so largest = test[:] could just be largest = test.

Putting everything together...

def longest_nondecreasing_substring(s):
 if s is None or len(s) <= 1:
 return s
 longest = test = s[0]
 for i in range(1, len(s)):
 if ord(s[i]) >= ord(s[i - 1]):
 test += s[i]
 else:
 if len(longest) < len(test):
 longest = test
 test = s[i]
 if len(longest) < len(test):
 longest = test
 return longest
answered Oct 26, 2013 at 22:16
\$\endgroup\$
1
  • \$\begingroup\$ What, no docstring? \$\endgroup\$ Commented Oct 28, 2013 at 10:26
1
\$\begingroup\$

This answer doesn't have docstrings (Python isn't my main language), but the basic idea is to just start at every character and see how far you can go:

def longest_nondecreasing(s):
 start = -1
 maxlen = -1
 for i in range(len(s)): 
 for j in range(i+1, len(s)):
 if s[j] < s[i]: 
 break
 else: # we made it to the end!
 j = len(s)
 #so now our nondecreasing string goes from i to j
 if j - i > maxlen:
 maxlen = j - i
 start = i
 return s[start:start + maxlen]

Once we have a working solution, we can work on making some optimizations. For instance, if we have one contiguous non-decreasing sequence from say... 4 to 9, we definitely know that the longest non-decreasing sequence starting at 5 will also end at 9. Hence after the j loop we can just set i to j.

answered Oct 26, 2013 at 21:52
\$\endgroup\$
1
-1
\$\begingroup\$

It's the famous Longest increasing subsequence problem. Convert the chars to ints using the ord function

def longest_increasing_subsequence(X):
 """
 Find and return longest increasing subsequence of S.
 If multiple increasing subsequences exist, the one that ends
 with the smallest value is preferred, and if multiple
 occurrences of that value can end the sequence, then the
 earliest occurrence is preferred.
 """
 n = len(X)
 X = [None] + X # Pad sequence so that it starts at X[1]
 M = [None]*(n+1) # Allocate arrays for M and P
 P = [None]*(n+1)
 L = 0
 for i in range(1,n+1):
 if L == 0 or X[M[1]] >= X[i]:
 # there is no j s.t. X[M[j]] < X[i]]
 j = 0
 else:
 # binary search for the largest j s.t. X[M[j]] < X[i]]
 lo = 1 # largest value known to be <= j
 hi = L+1 # smallest value known to be > j
 while lo < hi - 1:
 mid = (lo + hi)//2
 if X[M[mid]] < X[i]:
 lo = mid
 else:
 hi = mid
 j = lo
 P[i] = M[j]
 if j == L or X[i] < X[M[j+1]]:
 M[j+1] = i
 L = max(L,j+1)
 # Backtrack to find the optimal sequence in reverse order
 output = []
 pos = M[L]
 while L > 0:
 output.append(X[pos])
 pos = P[pos]
 L -= 1
 output.reverse()
 return output
def main():
 TEST_STRING = 'azcbobobegghakl'
 print(''.join(map(chr, longest_increasing_subsequence(map(ord, TEST_STRING)))))
if __name__ == '__main__':
 main()
answered Oct 26, 2013 at 18:29
\$\endgroup\$
3
  • \$\begingroup\$ In the OP's problem it's clear that the sequence must be contiguous (consider the first example, where the answer is beggh, not abegghkl). So your answer is solving a different problem. \$\endgroup\$ Commented Oct 26, 2013 at 18:54
  • \$\begingroup\$ The example was invisible and there aren't any word about contiguous sequence only. \$\endgroup\$ Commented Oct 26, 2013 at 19:16
  • \$\begingroup\$ The OP may have been poorly formatted but the example was not invisible if you read it carefully. \$\endgroup\$ Commented Oct 26, 2013 at 19:20

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.