Given a string sequence and a string, the function should output the next string of that sequence.
For example, for sequence "abc":
a b c aa ab ac ba .. cc aaa aab
How can I improve my code while maintaining a balance between performance and readability?
data Sequence = Sequence {
getSequence :: String,
getFirst :: Char,
getLast :: Char
} deriving (Show, Eq, Ord)
getNextChar :: Sequence -> Char -> Char
getNextChar s c
| c == getLast s = getFirst s
| otherwise = findChar (getSequence s)
where findChar (x:r@(y:xs))
| x == c = y
| otherwise = findChar r
getNextStr :: Sequence -> String -> String
getNextStr s xs
| l /= getLast s = init xs ++ [getNextChar s l]
| otherwise = nextAny
where l = last xs
fstChar = getFirst s
lstChar = getLast s
(f, t) = span (== lstChar) . reverse $ xs
nextAny
| length t == 0 = replicate (length f + 1) fstChar
| otherwise = reverse $ replicate (length f) fstChar ++ [getNextChar s (head t)] ++ tail t
2 Answers 2
The primary issue I see is that your code is focused on the wrong sort of state. What you have written would be used something like—
iterate (getNextStr (Sequence "abc" 'a' 'c')) "a"
But this is very slow, you can see from the repeated usage of last
, (++)
, and reverse
that the actual state that matters (which character is next in the sequence) is getting parsed out of the previous state anew with every iteration. You are relying on repeating many \$O(n)\$ operations, which ends up causing your code to have a terrible constant factor performance hit.
The key insight or trick is to take advantage of lazy evaluation to produce an infinite stream of values which can hide all of the messy state machinery you need away in some thunks.
stringSequence :: [Char] -> [String]
We start with a clean slate and a new type signature. In this case we have a function that takes a sequence of Char
values, and returns a list (which we know will be infinite) of String
values made by applying the sequence generating procedure to the Char
s. The magic obviously happens in the sequence generating procedure, so let's concentrate on that.
stringSequence chars = map (:"") chars
Starting is easy, we simply make a String
out of each Char
in our given sequence. Next step is to generate the two character String
s.
stringSequence chars =
let
strings = map (:"") chars
front = strings
next = concatMap (\f -> map (f ++) strings) front
in
front ++ next
This is the biggest jump I'll make, so make sure you understand it. The two character strings are made by taking each string we've already generated and appending a Char
from the sequence to it (for convenience, strings
makes it easy to combine previous values generated by the sequence generating function and the next element of the given Char
s).
stringSequence chars =
let
strings = map (:"") chars
front = strings
next = concatMap (\f -> map (f ++) strings) front
after = concatMap (\s -> map (s ++) strings) next
in
front ++ next ++ after
Now we add the three character strings, and notice a pattern. Strings of length \$n+1\$ are made by appending the sequence characters to the strings of length \$n\$. Leveraging this insight and a trick called Tying the Knot, we can write the final version of stringSequence
that produces an infinite lazy list of sequenced values.
stringSequence chars =
let
strings = map (:"") chars
sequence = strings ++ concatMap (\s -> map (s ++) strings) sequence
in
sequence
Compare this version to your original. The Sequence
type you had is now replaced with a simple list. Instead of getNextChar
we use the whole sequence in one go so we don't have to perform additional bookkeeping to maintain correct ordering. Rather than parsing the previous result value with getNextStr
, we maintain our state hidden away from the user by providing them an infinite list, instead of out in the open where the user has to keep track of it and we have to keep parsing it.
-
\$\begingroup\$ I loved your answer. Now, maybe add a function to get the next string of that sequence given a string? \$\endgroup\$Afonso Matos– Afonso Matos2015年07月28日 10:12:44 +00:00Commented Jul 28, 2015 at 10:12
-
\$\begingroup\$ Well, you could do something like
tail . dropWhile (/= string) $ sequence
to resume the sequence after an arbitrary member. \$\endgroup\$R B– R B2015年07月29日 03:45:58 +00:00Commented Jul 29, 2015 at 3:45 -
\$\begingroup\$ @catgocat You could also interpret the string as a number where chars are the digits and directly calculate the position:
seqPos chars string = sum $ concatMap (\(c, n) -> (map (\i -> n * 4^i) . findIndices (== c) $ string)) (zip chars [1..])
(ugly code optimized for brevity). The lookup the becomestail . drop (seqPos "abc" string) $ sequence
. This should usually be faster than comparing the string with every of its preceding elements. \$\endgroup\$tarleb– tarleb2015年07月29日 19:34:55 +00:00Commented Jul 29, 2015 at 19:34
A very good solution has already been given above. I'd like to add some comments on the current code without introducing a new algorithm.
getNextChar :: Sequence -> Char -> Char
getNextChar s c
| c == getLast s = getFirst s
| otherwise = findChar (getSequence s)
where findChar (x:r@(y:xs))
| x == c = y
| otherwise = findChar r
findChar
's name doesn't quite capture its intend. It would more aptly be named successorChar or something similar. The variable xs
isn't used in the function, one could get away without binding the remaining list to a variable ((x:r@(y:_))
). A much simpler way to write it would be to use dropWhile
:
findChar xs = head . tail . dropWhile (/= c) $ xs
One could get rid of the special case testing if we are looking at the last char by making the sequence of chars cyclic:
nextChar s c = head . tail . dropWhile (/= c) $ cycle (getSequence s)
The main function suffers from the way our problem is posed. Everything would be much simpler if the sequence was reversed (meaning that 'aa' would be followed by 'ba'). This allows us to replace list appends (++
) with more efficient (:)
and to do pattern matching on the chars that matter.
seqSuccessor s [] = getFirst s
seqSuccessor s cs'@(c:cs) =
| c /= getLast s = (getNextChar s c):cs
| otherwise = let (ls, ts) = span (== (getLast s)) cs'
in (replicate (length ls) $ getFirst s)
++ seqSuccessor s ts
The original function can be recovered simply by doing some list reversions:
getNextStr s = reverse . seqSuccessor s . reverse