I am learning Haskell and I am doing 99 Haskell problems. There is a problem to check if a list if a palindrome. It's obvious solution is
isPalindrome x = x == reverse x
There are several other solutions with some kind of reverse
. All of them use two passes -- one for building reverse list and one more for ==
. I wanted to write a single pass palindrome check.
walk :: (Eq a) => [a] -> Int -> Either (Either Int [a]) Bool
walk (x:xs) n = case walk xs (n + 1) of
Left (Left back) | (back + 1) == n -> Left (Right xs)
| back == n -> let (y:ys) = xs in if y == x then (if null ys then Right True else Left (Right ys)) else Right False
| otherwise -> Left (Left (back + 1))
Left (Right (y:ys)) -> if y == x then (if n > 0 then Left (Right ys) else Right True) else Right False
Left (Right []) -> error "Impossible"
Right ans -> Right ans
walk [] _ = Left (Left (-1))
isPalindrome :: (Eq a) => [a] -> Bool
-- isPalindrome x = x == reverse x
isPalindrome [] = True
isPalindrome [_] = True
isPalindrome x = ans where (Right ans) = walk x 0
This looks quite ugly. Is there a better way to check if a string is a palindrome using only one pass? Is it worth trying to write a single pass check?
-
1\$\begingroup\$ Keep in mind that the compiler might optimize away the second pass. Plus, unless the word is a palindrome, it should return as soon one of the equality tests fails; meaning that it won't necessarily even do a full single pass. \$\endgroup\$Carcigenicate– Carcigenicate2014年10月05日 18:16:32 +00:00Commented Oct 5, 2014 at 18:16
-
\$\begingroup\$ @carcigenicate There are simple worst-case tests "baaaaaaaaac" and "aaaaabcaaaaa" \$\endgroup\$daniil– daniil2014年10月05日 18:21:05 +00:00Commented Oct 5, 2014 at 18:21
-
1\$\begingroup\$ I would think that "baaaaaaaac" would return false immediately. "b" == "c" will evaluate to False, which means right off the bat that the strings are unequal, so palindrome should return False without having to look at the rest of either of the strings (because it didn't have to). \$\endgroup\$Carcigenicate– Carcigenicate2014年10月05日 18:28:18 +00:00Commented Oct 5, 2014 at 18:28
-
1\$\begingroup\$ Actually, in retrospect, it probably will have to do at least 1 pass to get to the last element of the reversed list. The first of your examples still isn't worst case though, because the first element of the first list will be compared to the last element of the second list, and fail. \$\endgroup\$Carcigenicate– Carcigenicate2014年10月05日 18:37:38 +00:00Commented Oct 5, 2014 at 18:37
-
\$\begingroup\$ @Carcigenicate "baaaaaac" is worst case for my implementation because it diverges from the middle of the list. "aaaabcaaaa" is worst case for reverse implementation, obviously. \$\endgroup\$daniil– daniil2014年10月05日 18:56:04 +00:00Commented Oct 5, 2014 at 18:56
2 Answers 2
From the readability point of view, I'd suggest to keep some maximum line length, like 72 or 80 characters. Beyond that it's difficult to read. Also instead of having Either (Either X Y) Z
, declare your own data type with 3 constructors. Not only it'll be simpler, but the constructor names will also be more descriptive and it'll be easier to understand what's going on. You should also document the function better, it's not clear what the Int
argument is without thoroughly inspecting the code. A small improvement is also replacing nested if
s with a case
statement.
In general, I'd say that it's not really possible to make a genuine single-pass solution.
Obviously we can't avoid checking the last element, if the list is a palindrome, and we need to have the first element to compare it to. So we'll always have to either get to the last element at the beginning, or keep the elements as we traverse the list, to have them for comparison when we get to the end, So we can't get to the situation when we just traverse the list from the beginning to the end, releasing the elements already traversed.
In your case, you're also traversing the list twice, although it's somewhat hidden. The first pass is the recursive call to walk
, which gets to the end of the list. And the second traversal is occurring in the line
Left (Right (y:ys)) -> if y == x then
(if n > 0 then Left (Right ys) else Right True)
else Right False
where we're traversing ys
as walk
returns to its parent call.
(The above nested if
s can be rewritten as
... -> case () of
_ | y /= x -> Right False
| n > 0 -> Left (Right ys)
| otherwise -> Right True
using case
.)
Furthermore, since walk
calls itself recursively and examines the result of the recursive call, it can't be optimized to tail recursion, so you're building a chain of n
calls on the stack.
Another thing to notice is that the whole original list is kept in memory until the whole chain of walk
s finishes.
My attempt to solve this problem would look like this:
isPal' :: (Eq a) => [a] -> Bool
isPal' xs = f [] xs xs
where
f ss xs [] = ss == xs
f ss (_:xs) [_] = ss == xs
f ss (x:xs) (_:_:es) = f (x:ss) xs es
We're traversing the list at two different "speeds" at the same time in a tail-recursive loop, to get to the middle. During that we compute the reverse of the traversed part in ss
. When we hit the middle after n/2 steps, we just compare the accumulated reversed first half with the rest. The combined length of ss
and xs
is always n and if we find out in the middle a difference, like in "aaabcaaa"
, we finish and release resources early.
Since all suggested algorithms are O(n), it's hard to decide which one will be more efficient just by reasoning. We'd have to do some measurements, for example using the criterion package.
Update: The problem could be actually solved in a genuine one pass using a rolling checksum, at the cost of having false positives in (rare) cases of hash collisions. Just compute two rolling checksums while traversing the list, one forwards and one backwards, and compare them at the end.
import Control.Arrow ((&&&), second) import Data.Foldable (foldMap) import Data.Function (on) import Data.Hashable import Data.Monoid import Data.Word -- see https://en.wikipedia.org/wiki/Rolling_hash -- and https://en.wikipedia.org/wiki/Lehmer_random_number_generator#Parameters_in_common_use modulus :: Word64 modulus = 2^31 - 1 expG :: Word64 expG = 7^5 data RKHash = RKHash { rkExp :: Word64, rkVal :: Word64 } deriving (Eq, Show) inject :: (Hashable a) => a -> RKHash inject = RKHash expG . (`mod` modulus) . fromIntegral . hash instance Monoid RKHash where mempty = RKHash 1 0 mappend (RKHash e1 v1) (RKHash e2 v2) = RKHash ((e1 * e2) `mod` modulus) ((v1 * e2 + v2) `mod` modulus) isPalindrome :: (Hashable a) => [a] -> Bool isPalindrome = uncurry (on (==) rkVal) . second getDual . foldMap ((id &&& Dual) . inject)
I think this recursive solution is one pass:
:{
isPalindrome inp =
if length inp < 2 then True else
(if (head inp) /= (last inp) then False else
isPalindrome $ init $ tail inp)
:}
Also, it stops as soon as it found it's not a palindrome (if two opposite characters are not identical).
It checks the first and last element are equal, then it removes those elements (and calls itself on that new string), until the remaining string has length 1 or 0.
EDIT: If interested in time efficiency: tail
and last
are really not efficient, so an extension like Data.Sequence is needed for efficient (random) access to the list's elements:
import qualified Data.Sequence as Seq
isPalindrome cpt lenn inp
| (lenn `div` 2 - cpt) < 1 = True
| Seq.lookup cpt inp /= Seq.lookup (lenn - cpt - 1) inp = False
| otherwise = isPalindrome (cpt+1) lenn inp
## calling the function:
isPalindrome 0 (length myPalindrome) $ Seq.fromList myPalindrome