4
\$\begingroup\$

I am trying to solve a Hacker Rank problem about string suffixes:

For two strings A and B, we define the similarity of the strings to be the length of the longest prefix common to both strings. For example, the similarity of strings "abc" and "abd" is 2, while the similarity of strings "aaa" and "aaab" is 3.

Calculate the sum of similarities of a string S with each of its suffixes.

I have the following code:

import System.IO
import Control.Monad
import Data.Maybe
import Data.List
import qualified Data.ByteString.Char8 as BSC
main = do
 n <- liftM getIntFromBS BSC.getLine
 replicateM_ n $ do
 s <- BSC.getLine
 putStrLn . show $ sum $ map (cntPrefix s) $ tail $ sort $ BSC.tails s
 where
 getIntFromBS = fst.fromJust.BSC.readInt
 cntPrefix str pref = length $ takeWhile (\t -> fst t == snd t) $ BSC.zip pref str

But performance is really bad. I am doing this challenges to learn Haskell and thus my skills aren't good enough to optimize this program. Any help is appreciated.

asked Nov 13, 2014 at 1:33
\$\endgroup\$
2

1 Answer 1

2
\$\begingroup\$

As far as I can tell your solution is not only slow but incorrect because of your usage of sort. Dropping that single function from your pipeline nets an exponential improvement, and it stopped including the similarity of s to itself in the answer.

Make sure also you're compiling with -O2, otherwise sum won't be optimized into a strict fold and you'll blow up the stack with thunks.

I also came up with a version targeted toward readability, but it's ~3x slower than your (corrected) ByteString based version so I'll just include it here as a curiosity.

import Control.Monad (replicateM_)
import Data.List (tails)
main :: IO ()
main = do
 n <- readLn
 replicateM_ n $ do
 s <- getLine
 print $ sum $ map (similarity s) (suffixes s)
similarity :: (Eq a) => [a] -> [a] -> Int
similarity s = length . takeWhile id . zipWith (==) s
suffixes :: [a] -> [[a]]
suffixes = tail . tails
answered Nov 13, 2014 at 13:49
\$\endgroup\$
3
  • \$\begingroup\$ The sort isn't there for fun, is necessary to give the correct answer. Your answer gives wrong output. \$\endgroup\$ Commented Nov 13, 2014 at 14:21
  • 1
    \$\begingroup\$ Only if you consider xs :: [a] to be a suffix of xs :: [a], which depends on how pedantic you're willing to be. It also is completely unnecessary, because the only thing tail . sort . tails will ever do is drop an instance of the empty list which in your case would have a similarity of 0 anyway and not affect the sum. Remove tail $ sort $ from your version and you'll see the answer you expect along with the speedup. \$\endgroup\$ Commented Nov 13, 2014 at 14:50
  • \$\begingroup\$ That's different indeed to what you posted since with your answer tail will drop an arbitrary suffix and not the empty list, that as you said is redundant to drop since it equals zero. Unfortunately tried with the suggested modifications but still too slow. I'll update the original post. \$\endgroup\$ Commented Nov 13, 2014 at 17:15

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.