3
\$\begingroup\$

I tried to do the second part of the 1. December challenge of Advent of Code in Haskell. I'm fairly new to Haskell but I have plenty of experience in other (procedural) languages. I struggled with the challenge for hours because the program would hang and it wouldn't produce any output although I couldn't find a problem in my code. Eventually it turned out that my programm worked correctly but it just took very long. To be exact the program took 2 minutes and 40 seconds. According to Advent of Code every challenge should able to execute within 15 seconds though.

So what makes my code so slow?

The task:

You notice that the device repeats the same frequency change list over and over. To calibrate the device, you need to find the first frequency it reaches twice.

For example, using the same list of changes above, the device would loop as follows:

Current frequency 0, change of +1; resulting frequency 1.
Current frequency 1, change of -2; resulting frequency -1.
Current frequency -1, change of +3; resulting frequency 2.
Current frequency 2, change of +1; resulting frequency 3.
(At this point, the device continues from the start of the list.)
Current frequency 3, change of +1; resulting frequency 4.
Current frequency 4, change of -2; resulting frequency 2, which has already been seen.

In this example, the first frequency reached twice is 2. Note that your device might need to repeat its list of frequency changes many times before a duplicate frequency is found, and that duplicates might be found while in the middle of processing the list.

Here are other examples:

+1, -1 first reaches 0 twice.
+3, +3, +4, -2, -4 first reaches 10 twice.
-6, +3, +8, +5, -6 first reaches 5 twice.
+7, +7, -2, -7, -4 first reaches 14 twice.

What is the first frequency your device reaches twice?

My code:

module DayOnePartTwo where
 import System.IO
 import Data.Maybe
 inputFileName = "input.txt"
 input :: String -> [Integer]
 input contents = toNum (cleanNumbers (splitNumbers contents))
 where 
 cleanNumbers strs = map removeLeadingPlus strs
 splitNumbers string = words string
 toNum numbers = map read numbers
 removeLeadingPlus str = if str !! 0 == '+' then tail str else str
 accumulate :: [Integer] -> [Integer]
 accumulate list = accumulator list 0
 where
 accumulator :: Num a => [a] -> a -> [a]
 accumulator (x:xs) state = (x + state) : accumulator xs (x + state)
 accumulator [] state = []
 duplicate :: [Integer] -> Maybe Integer
 duplicate list = dup list [0]
 where
 dup (x:xs) visited =
 if elem x visited
 then Just x
 else dup xs (x:visited)
 dup [] _ = Nothing
 firstCyclicDuplicate :: [Integer] -> Maybe Integer
 firstCyclicDuplicate list = duplicate (accumulate cycledList)
 where
 cycledList = cycle list
 main :: IO ()
 main = do
 contents <- readFile inputFileName
 case (firstCyclicDuplicate (input contents)) of
 Just a -> print (show a)
 Nothing -> print "There is no first duplicate"

This seems to be related to slow advent of code 2018 day 1 part 2 solution in haskell although my algorithm is different.

200_success
145k22 gold badges190 silver badges478 bronze badges
asked Dec 1, 2018 at 23:17
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$

I'm not convinced that your algorithm is very different, but since it'll take me more than a comment to outline it, I figured I might as well do a (削除) quick (削除ここまで) code review.

I've also done some name switching -- in general I tried to keep my comments referring to the same named functions as yours, but if you see an unfamiliar name you can reference my sample solution. I probably used the name I gave in it.

Small things

input

Splitting input into a bunch of different functions is good, but there are some changes I'd make.

I would rewrite removeLeadingPlus as a more general function that isn't partial (I'm pretty sure your implementation errors on the empty string -- not that it matters that much since read will error on an empty string too)

removeLeadingPlus :: String -> String
removeLeadingPlus ('+':s) = s
removeLeadingPlus s = s

Note that both cleanNumbers and toNum can be defined without the helper variable strs/string as so

cleanNumbers = map removeLeadingPlus
toNum = map read

I also think that

toNum (cleanNumbers (splitNumbers contents))

is a bit hard on the eyes (though if you're coming from Lisp, I'd understand your inclination). You've correctly identified that input can be realized as the composition of a few functions, so why not define it as such?

input :: String -> [Integer]
input = toNum . cleanNumbers . lines
 where 
 cleanNumbers = map removeLeadingPlus
 toNum = map read
 removeLeadingPlus ('+':s) = s
 removeLeadingPlus s = s

And we can go one step further. Now that we've realized that toNum and cleanNumbers are both just maps, we can redefine input as

input :: String -> [Integer]
input = map (read . removeLeadingPlus) . lines
 where 
 removeLeadingPlus ('+':s) = s
 removeLeadingPlus s = s

Maybe this looks less clean to you. I think that

input = map (read . removeLeadingPlus) . lines

and

input contents = map (read . removeLeadingPlus) $ lines contents

are both fine and pretty readable.

accumulate

I like this function and I think it's a better approach to obtaining the frequencies (your inefficiency is still in duplicate). However, accumulate looks suspiciously similar to a function in Prelude.

accumulate takes in a list of integers and returns all of the prefix sums of them. So it is essentially foldl of every prefix of the input list.

You might be familiar with scanl which is a combinator that does this.

*DayOnePartTwo> accumulate [1,2,3,4]
[1,3,6,10]
*DayOnePartTwo> scanl (+) 0 [1,2,3]4
[0,1,3,6,10]

It even gives us the desired starting value of 0 (which you presently hardcode into duplicate).

So we can write a much shorter definition of accumulate as

accumulate :: [Integer] -> [Integer]
accumulate = scanl (+) 0

I also think a better name is freqs since I think accumulate is vague.

firstCyclicDuplicate

This can also be realized as the composition of a bunch of functions

firstCyclicDuplicate :: [Integer] -> Maybe Integer
firstCyclicDuplicate = duplicate . accumulate . cycle

Again, up to personal preference.

duplicate

I would rename this to firstDuplicate to be consistent with firstCyclicDuplicate.

Indentation

I'm not a Haskell pro, but your indentation seems a bit off and my syntax highlighting agrees with it (it doesn't highlight the function declarations when they're indented). I would keep everything top level unindented. Also my personal preference is indenting by two spaces, so that's what I've done in my sample solution.

Imports

Your imports are unnecessary as far as I can tell (my code compiles without them). If you thought you needed the imports for the types, I think they're in Prelude.

Use of Maybe in duplicate

I like the use of Maybe here and I suggested the same to dagda1, but I realize now that its use, while stylistically good, is somewhat pointless.

Since you're operating on an infinite list of frequencies, your code will run forever (or until the stack blows). Even though you have

 dup [] _ = Nothing

you'll never pattern match an empty list since you're running this function on an infinite one.

Stylistically, I think using Maybe is better than having a partial function, but I just wanted to point out that having pure functions doesn't necessarily save you from an infinite loop.

Efficiency

Your implementation is, for all intents and purposes, the same algorithm as dagda1's in the other post.

They chose to accumulate the frequencies inside of their function to find the duplicate, whereas you've decoupled the two. Due to laziness I would argue the approaches are the same.

Again, the inefficiency comes down to the use of lists over sets in duplicate. This is a great problem to demonstrate why elem is something you should be wary of using and why linked lists make poor sets.

Your issue is in your duplicate lookup

 if elem x visited

as elem is an O(n) operation. You can read the Efficiency section on my answer to the other post for a bit more details.

The solution is the same, and I'll copy paste my response for convenience.

Introducing Data.Set

Data.Set provides unordered sets with O(log n) lookup and insert time. Yeah, it's O(log n), but the total runtime for me when I checked the implementation was less than a second.

I'm including a sample implementation of day 1 below that I've tested and confirmed on my inputs if you want to compare. Here are some pointers if you want to fix it yourself

  • You can keep your code mostly the same (although I'd recommend making the style changes I suggested)
  • You will want to use two functions from Data.Set: member and insert.
  • Your only need to change duplicate, specifically your use of data structure (see section header, hint hint)

Sample solution

module DayOnePartTwo where
import Data.Set (Set)
import qualified Data.Set as S
inputFileName = "input.txt"
input :: String -> [Integer]
input = map (read . removeLeadingPlus) . lines
 where
 removeLeadingPlus ('+':s) = s
 removeLeadingPlus s = s
freqs :: [Integer] -> [Integer]
freqs = scanl (+) 0
firstDuplicate :: [Integer] -> Maybe Integer
firstDuplicate list = dup list S.empty
 where
 dup (x:xs) visited =
 if x `S.member` visited
 then Just x
 else dup xs (S.insert x visited)
 -- This will never be reached on the cycled list
 dup [] _ = Nothing
firstCyclicDuplicate :: [Integer] -> Maybe Integer
firstCyclicDuplicate = firstDuplicate . freqs . cycle
main :: IO ()
main = do
 contents <- input <$> readFile inputFileName
 case firstCyclicDuplicate contents of
 Just a -> print a
 Nothing -> print "There is no first duplicate"
answered Dec 2, 2018 at 22:42
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.