Solution to Hackerrank challenge "Sherlock and Queries"

Question 1

First, here is the full challenge description:

Watson gives to Sherlock an array: \$A_1, A_2, ..., A_N\$. He also gives to Sherlock two other arrays: \$B_1, B_2, ..., B_M\$ and \$C_1, C_2, ..., C_M\$.

Then Watson asks Sherlock to perform the following program:
for i = 1 to M do
 for j = 1 to N do
 if j % B[i] == 0 then
 A[j] = A[j] * C[i]
 endif
 end do
end do
Can you help Sherlock and tell him the resulting array A? You should print all the array elements modulo \$(10^9 + 7)\$.

Input Format

The first line contains two integer numbers \$N\$ and \$M\$. The next line contains \$N\$ integers, the elements of array A. The next two lines contains \$M\$ integers each, the elements of array B and C.

Output Format

Print \$N\$ integers, the elements of array A after performing the program modulo \$(10^9 + 7)\$.

Constraints

\1ドル ≤ N, M ≤ 10^5\$
\1ドル ≤ B[i] ≤ N\$
\1ドル ≤ A[i], C[i] ≤ 10^5\$

Sample Input
4 3
1 2 3 4
1 2 3
13 29 71
Sample Output
13 754 2769 1508

And here is the code of my solution:

import Control.Applicative
import Control.Arrow
import Control.Monad.Primitive
import Control.Monad.ST
import Data.Int
import Data.List.Split
import Data.Maybe
import qualified Data.ByteString.Char8 as B
import qualified Data.Map as M
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as Vmu
type Vec = V.Vector Int64
(|>) :: a -> (a -> b) -> b
(|>) x y = y x
infixl 0 |>
main :: IO ()
main = do
 [n, m] <- (splitOn " " >>> map read) <$> getLine
 inputLines <- B.split '\n' <$> B.getContents
 let [a, b, c] = map readInts inputLines
 solve n m a b c |> V.toList >>> map show >>> unwords |> putStrLn
-- http://stackoverflow.com/questions/25913481/read-numbers-from-stdin-into-a-data-vector-unboxed-vector-int64
readInts :: B.ByteString -> Vec
readInts = B.split ' ' >>> mapMaybe (B.readInt >>> liftA fst) >>>
 map fromIntegral >>> V.fromList
limit :: Integral a => a -> a
limit = (`mod` 1000000007)
solve :: Int -> Int -> Vec -> Vec -> Vec -> Vec
solve n m a b c = applyChanges changes a
 where
 changes = [(i, \x -> limit $ x * fact ) | (i, fact) <- idxsAndFactors]
 idxsAndFactors = [let ii = fromIntegral i
 in zip [ii-1, ii+ii-1 .. (n-1)] (repeat factor) |
 (i, factor) <- M.assocs factors] |> concat
 factors = buildFactors m b c
-- http://stackoverflow.com/questions/25872149/apply-a-list-of-changes-to-elements-of-a-mutable-vector
applyChanges :: [(Int, Int64 -> Int64)] -> Vec -> Vec
applyChanges changes v = runST $ do
 mV <- V.thaw v
 mapM_ (applyChange mV) changes
 V.freeze mV
applyChange :: (Control.Monad.Primitive.PrimMonad m, Vmu.Unbox t) =>
 Vmu.MVector (Control.Monad.Primitive.PrimState m) t
 -> (Int, t -> t) -> m ()
applyChange mvec (idx, f) = do
 val <- Vmu.read mvec idx
 Vmu.write mvec idx $ f val
buildFactors :: Int -> Vec -> Vec -> M.Map Int64 Int64
buildFactors m b c = M.empty |> comp inserts
 |> M.map fromIntegral >>> M.mapKeys fromIntegral
 where
 inserts = [M.insertWith multAndLimit (b V.! i) (c V.! i) |
 i <- [0 .. m-1]]
 where multAndLimit x y = x * y |> limit
-- http://stackoverflow.com/questions/19777555/most-idiomatic-implementation-of-a-a-a-a
comp :: [b -> b] -> b -> b
comp = foldr (.) id

It passes all tests, but the code looks way too complicated to me, especially when I compare it to my Python solution:

import collections
import sys
def main():
 lines = [[int(x) for x in line.split()] for line in sys.stdin]
 [n, m], a, b, c = lines
 def one():
 return 1
 # http://codereview.stackexchange.com/questions/62956/performance-in-hackerrank-challenge-sherlock-and-queries
 factors = collections.defaultdict(one)
 for i in range(0, m):
 factors[b[i]] = factors[b[i]] * c[i] % 1000000007
 for i, factor in factors.iteritems():
 for idx in xrange(i-1, n, i):
 a[idx] = a[idx] * factor % 1000000007
 print ' '.join(map(str, a))
if __name__ == "__main__":
 main()

Here is a large test case (#15):

Input
Output

On my system, the Python version takes about 0.5s and the Haskell version 0.7s.

Usually my Haskell code is shorter, but not in this case. Also, the Haskell solution runs a bit slower than the Python version. This also should be the other way around, I guess.

How could my Haskell code be improved, i.e. made more readable etc. without losing performance?

Question 2

I started following an approach very similar to the one used in bisserlis answer. When I noticed I was too slow I had a look at your python implementation and tried to come up with something using the same ideas.

import Control.Applicative
import Control.Arrow
import Control.Monad.Primitive
import Control.Monad.ST
import Data.Int
import Data.List
import Data.List.Split
import Data.Maybe
import qualified Data.ByteString.Char8 as B
updateA2 :: [Int64] -> [Int64] -> [Int64] -> Int64 -> [Int64]
updateA2 as bs cs n = result
 where factorsFromA = zip [1..n] as
 contributesFromBC = zip bs cs
 factorsFromBC = (compress contributesFromBC) >>= (generateMultiples n)
 allFactors = factorsFromA ++ factorsFromBC
 (_, result) = unzip (compress allFactors)
compress :: Integral a => [(a,a)] -> [(a,a)]
compress = compressInner . sort
compressInner :: Integral a => [(a,a)] -> [(a,a)]
compressInner (x: []) = [x]
compressInner ((current,currentValue) : xs) = if (current == i)
 then (i, limit (v * currentValue)) : tail
 else (current, currentValue) : (i, v) : tail
 where ((i, v):tail) = compressInner xs
limit :: Integral a => a -> a
limit x = mod x 1000000007 
generateMultiples :: Integral a => a -> (a,a) -> [(a,a)]
generateMultiples n (index, value) = map (\i -> (i,value)) indexValues
 where indexValues = [index * i | i <- [1..(div n index)]]

The core part is the generation of factors. As in your code, factors come from two different sources. The list A, which provides the first factor, and the lists B and C. The logic is quite simple. At first we generate the pairs of (indexFromB, valueFromC). Then we expand it to keep track of all the multiples of indexFromB.

When we have the two lists of factors we can combine and compress them. What compress does is actually to take all the factors that apply to the same index and generate the corresponding value obtained from their multiplication.

It is worth noticing that I/O is extremely important when we have to deal with huge files like the ones in problems #15, #16, and #17. I had to use your implementation based on Data.ByteString.Char8 to remove almost all the time spent in I/O.

(|>) :: a -> (a -> b) -> b
(|>) x y = y x
infixl 0 |>
readInts :: B.ByteString -> [Int64]
readInts = B.split ' ' >>> mapMaybe (B.readInt >>> liftA fst) >>>
 map fromIntegral
main :: IO ()
main = do
 [n, m] <- (splitOn " " >>> map read) <$> getLine 
 inputLines <- B.split '\n' <$> B.getContents
 let [a, b, c] = map readInts inputLines
 putStrLn $ concat . intersperse " " . map show $ updateA2 a b c n

Question 3

Thank you. I put your version into hackerrank and the result is this. It still is about 5 times slower than my solution but it passes all tests, and that is good, especially as your code is much easier to read. I will now work trough your solution and explanation to understand it fully.

Question 4

I think it is time to give it a run with the profiler and understand where it is spending much time than needed. I'm going to give it a go tomorrow. In the meanwhile let me know if something in my explanation or in my code is not too clear.

Question 5

Thank you. I think your code is quite clear. Hlint makes some suggestions, but that is mostly peanuts. Ah, and I especially like your clever usage of the list monad. :)

Question 6

Your Haskell code is quite a tangle. I think you may have jumped the gun a bit with all those mutable vectors. Before reaching for mutation and iterative solutions in a Haskell program, take a step back and look at the shape of the problem before you. In this case the execution of the algorithm doesn't actually depend on any intermediate values, that is, there aren't any conditionals or jumps predicated on the A array itself right? There's a straight transform of input to output which is where functional lazy solutions can really shine.

Here's my version.

module Main where
main :: IO ()
main = do
 input <- getContents
 let ((n, m), a, b, c) = readInput input
 putStrLn $ unwords . map show . map (`mod` (10^9 + 7)) $ solve n m a b c
readInput :: String -> ((Int, Int), [Int], [Int], [Int])
readInput s = 
 let 
 ( [n, m] : a : b : c : _) = map (map read) . map words . lines $ s
 in
 ( (n, m) , a , b , c )
solve :: Int -> Int -> [Int] -> [Int] -> [Int] -> [Int]
solve n _ as bs cs = map product . zipWith (:) as
 $ [ [ c | (b, c) <- zip bs cs, j `mod` b == 0] | j <- [1..n]]

You can see most of the program is devoted to input and output, and only two lines (three if you count the function type signature) are required to implement the algorithm itself. Here's the trick.

The loops can be reordered so that instead of mutating an input array, you're actually producing an output list of multiplicands.

$ [ [ _ ] | j <- [1..n]]

Iterating on the j index was the inner loop originally, but because j indexes the output list I move it to the outside to take advantage of the 1-1 correspondence with the input list.

[c | (b, c) <- zip bs cs, j `mod` b == 0]

By noticing that the B and C arrays are only ever used in lockstep, I can throw out the i index (and the m array size at that, I left it as an argument just for correspondence with the original problem statement and your version). bs and cs are then zipped so that I can walk over them pairwise. When the test on b succeeds, I just include c in the output as a multiplicand.

zipWith (:) as

Because the output of the list comprehensions now contains lists of multiplicands from C that correspond to indices in A, I just pop the multipliers onto the front of the multiplicand lists and then...

map product

Compute the product at each index. The rest of the code is just pretty-printing and IO.

On the very small sample input you provided my version actually runs a fraction of a second faster than yours, my guess is that's due to the overhead you incur from setting up your vectors. If you have a larger sample input I'd be very interested to see the performance. I would wager that our two versions stay within an order of magnitude of each other on running time, and they should use roughly the same amount of space due to lazy evaluation in my version. Theoretically. ;)

Question 7

Thank you very much for your review so far. You code is the kind of terseness I was looking for. I had to remove concat ( gist.github.com/Dobiasd/6f1c6e582ca4ddc6dae4 ) to make it compile, but that is no big deal. But your code is not only much slower, but also often gives the wrong results. Your result: gist.github.com/Dobiasd/a3e8c75b0f347062cc0a My result: gist.github.com/Dobiasd/63c3bd66f4bcb88c9f49 I added test case 15 to my original post in case you still want to help. :)

Question 8

I think his wrong answers are due to Int overflowing. I implemented a solution very similar to his and I had to use Integer. Int64 is probably a better choice. I'm trying to profile my code to see if I manage to make it faster. I'll let you know if I get any improvement

Question 9

Here is my attempt. It follows ideas from your code but uses accum function to update vectors. It is subject to fusion hence runs quite fast (test case #15 runs in 0.18s on hackerrank).

The code is a bit ugly because of index juggling. Python version does not require this and is much cleaner.

{-# LANGUAGE TupleSections #-}
import qualified Data.ByteString.Char8 as B
import qualified Data.Vector.Unboxed as V
main :: IO ()
main = do
 let readInt s = let Just (i,_) = B.readInt s in i
 inp <- fmap (map (map readInt . B.words) . B.lines) B.getContents
 let [[n,m],a,b,c] = inp :: [[Int]]
 let mulMod x y = x * y `mod` 1000000007
 let factors = V.accum mulMod (V.replicate (m+1) 1) $ zip b c
 let indexedFactors = zip [1..] $ tail $ V.toList factors
 let d = concat [map (,f) [i, i+i .. n] | (i,f) <- indexedFactors]
 let res = V.accum mulMod (V.fromList $ 0 : a) d
 putStrLn $ unwords $ map show $ tail $ V.toList res

score 3 · Accepted Answer · 2014-09-20 22:35:56Z

I started following an approach very similar to the one used in bisserlis answer. When I noticed I was too slow I had a look at your python implementation and tried to come up with something using the same ideas.

import Control.Applicative
import Control.Arrow
import Control.Monad.Primitive
import Control.Monad.ST
import Data.Int
import Data.List
import Data.List.Split
import Data.Maybe
import qualified Data.ByteString.Char8 as B
updateA2 :: [Int64] -> [Int64] -> [Int64] -> Int64 -> [Int64]
updateA2 as bs cs n = result
 where factorsFromA = zip [1..n] as
 contributesFromBC = zip bs cs
 factorsFromBC = (compress contributesFromBC) >>= (generateMultiples n)
 allFactors = factorsFromA ++ factorsFromBC
 (_, result) = unzip (compress allFactors)
compress :: Integral a => [(a,a)] -> [(a,a)]
compress = compressInner . sort
compressInner :: Integral a => [(a,a)] -> [(a,a)]
compressInner (x: []) = [x]
compressInner ((current,currentValue) : xs) = if (current == i)
 then (i, limit (v * currentValue)) : tail
 else (current, currentValue) : (i, v) : tail
 where ((i, v):tail) = compressInner xs
limit :: Integral a => a -> a
limit x = mod x 1000000007 
generateMultiples :: Integral a => a -> (a,a) -> [(a,a)]
generateMultiples n (index, value) = map (\i -> (i,value)) indexValues
 where indexValues = [index * i | i <- [1..(div n index)]]

The core part is the generation of factors. As in your code, factors come from two different sources. The list A, which provides the first factor, and the lists B and C. The logic is quite simple. At first we generate the pairs of (indexFromB, valueFromC). Then we expand it to keep track of all the multiples of indexFromB.

When we have the two lists of factors we can combine and compress them. What compress does is actually to take all the factors that apply to the same index and generate the corresponding value obtained from their multiplication.

It is worth noticing that I/O is extremely important when we have to deal with huge files like the ones in problems #15, #16, and #17. I had to use your implementation based on Data.ByteString.Char8 to remove almost all the time spent in I/O.

(|>) :: a -> (a -> b) -> b
(|>) x y = y x
infixl 0 |>
readInts :: B.ByteString -> [Int64]
readInts = B.split ' ' >>> mapMaybe (B.readInt >>> liftA fst) >>>
 map fromIntegral
main :: IO ()
main = do
 [n, m] <- (splitOn " " >>> map read) <$> getLine 
 inputLines <- B.split '\n' <$> B.getContents
 let [a, b, c] = map readInts inputLines
 putStrLn $ concat . intersperse " " . map show $ updateA2 a b c n

Thank you. I put your version into hackerrank and the result is this. It still is about 5 times slower than my solution but it passes all tests, and that is good, especially as your code is much easier to read. I will now work trough your solution and explanation to understand it fully.
I think it is time to give it a run with the profiler and understand where it is spending much time than needed. I'm going to give it a go tomorrow. In the meanwhile let me know if something in my explanation or in my code is not too clear.
Thank you. I think your code is quite clear. Hlint makes some suggestions, but that is mostly peanuts. Ah, and I especially like your clever usage of the list monad. :)

Stack Exchange Network

Solution to Hackerrank challenge "Sherlock and Queries"

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Solution to Hackerrank challenge "Sherlock and Queries"

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions