I tried to do the second part of the 1. December challenge of Advent of Code in Haskell. I'm fairly new to Haskell but I have plenty of experience in other (procedural) languages. I struggled with the challenge for hours because the program would hang and it wouldn't produce any output although I couldn't find a problem in my code. Eventually it turned out that my programm worked correctly but it just took very long. To be exact the program took 2 minutes and 40 seconds. According to Advent of Code every challenge should able to execute within 15 seconds though.
So what makes my code so slow?
The task:
You notice that the device repeats the same frequency change list over and over. To calibrate the device, you need to find the first frequency it reaches twice.
For example, using the same list of changes above, the device would loop as follows:
Current frequency 0, change of +1; resulting frequency 1. Current frequency 1, change of -2; resulting frequency -1. Current frequency -1, change of +3; resulting frequency 2. Current frequency 2, change of +1; resulting frequency 3. (At this point, the device continues from the start of the list.) Current frequency 3, change of +1; resulting frequency 4. Current frequency 4, change of -2; resulting frequency 2, which has already been seen.
In this example, the first frequency reached twice is 2. Note that your device might need to repeat its list of frequency changes many times before a duplicate frequency is found, and that duplicates might be found while in the middle of processing the list.
Here are other examples:
+1, -1 first reaches 0 twice. +3, +3, +4, -2, -4 first reaches 10 twice. -6, +3, +8, +5, -6 first reaches 5 twice. +7, +7, -2, -7, -4 first reaches 14 twice.
What is the first frequency your device reaches twice?
My code:
module DayOnePartTwo where
import System.IO
import Data.Maybe
inputFileName = "input.txt"
input :: String -> [Integer]
input contents = toNum (cleanNumbers (splitNumbers contents))
where
cleanNumbers strs = map removeLeadingPlus strs
splitNumbers string = words string
toNum numbers = map read numbers
removeLeadingPlus str = if str !! 0 == '+' then tail str else str
accumulate :: [Integer] -> [Integer]
accumulate list = accumulator list 0
where
accumulator :: Num a => [a] -> a -> [a]
accumulator (x:xs) state = (x + state) : accumulator xs (x + state)
accumulator [] state = []
duplicate :: [Integer] -> Maybe Integer
duplicate list = dup list [0]
where
dup (x:xs) visited =
if elem x visited
then Just x
else dup xs (x:visited)
dup [] _ = Nothing
firstCyclicDuplicate :: [Integer] -> Maybe Integer
firstCyclicDuplicate list = duplicate (accumulate cycledList)
where
cycledList = cycle list
main :: IO ()
main = do
contents <- readFile inputFileName
case (firstCyclicDuplicate (input contents)) of
Just a -> print (show a)
Nothing -> print "There is no first duplicate"
This seems to be related to slow advent of code 2018 day 1 part 2 solution in haskell although my algorithm is different.
1 Answer 1
I'm not convinced that your algorithm is very different, but since it'll take me more than a comment to outline it, I figured I might as well do a (削除) quick (削除ここまで) code review.
I've also done some name switching -- in general I tried to keep my comments referring to the same named functions as yours, but if you see an unfamiliar name you can reference my sample solution. I probably used the name I gave in it.
Small things
input
Splitting input
into a bunch of different functions is good, but there are some changes I'd make.
I would rewrite removeLeadingPlus
as a more general function that isn't partial (I'm pretty sure your implementation errors on the empty string -- not that it matters that much since read
will error on an empty string too)
removeLeadingPlus :: String -> String
removeLeadingPlus ('+':s) = s
removeLeadingPlus s = s
Note that both cleanNumbers
and toNum
can be defined without the helper variable strs
/string
as so
cleanNumbers = map removeLeadingPlus
toNum = map read
I also think that
toNum (cleanNumbers (splitNumbers contents))
is a bit hard on the eyes (though if you're coming from Lisp, I'd understand your inclination). You've correctly identified that input
can be realized as the composition of a few functions, so why not define it as such?
input :: String -> [Integer]
input = toNum . cleanNumbers . lines
where
cleanNumbers = map removeLeadingPlus
toNum = map read
removeLeadingPlus ('+':s) = s
removeLeadingPlus s = s
And we can go one step further. Now that we've realized that toNum
and cleanNumbers
are both just map
s, we can redefine input
as
input :: String -> [Integer]
input = map (read . removeLeadingPlus) . lines
where
removeLeadingPlus ('+':s) = s
removeLeadingPlus s = s
Maybe this looks less clean to you. I think that
input = map (read . removeLeadingPlus) . lines
and
input contents = map (read . removeLeadingPlus) $ lines contents
are both fine and pretty readable.
accumulate
I like this function and I think it's a better approach to obtaining the frequencies (your inefficiency is still in duplicate
). However, accumulate
looks suspiciously similar to a function in Prelude
.
accumulate
takes in a list of integers and returns all of the prefix sums of them. So it is essentially foldl
of every prefix of the input list.
You might be familiar with scanl
which is a combinator that does this.
*DayOnePartTwo> accumulate [1,2,3,4]
[1,3,6,10]
*DayOnePartTwo> scanl (+) 0 [1,2,3]4
[0,1,3,6,10]
It even gives us the desired starting value of 0 (which you presently hardcode into duplicate
).
So we can write a much shorter definition of accumulate
as
accumulate :: [Integer] -> [Integer]
accumulate = scanl (+) 0
I also think a better name is freqs
since I think accumulate
is vague.
firstCyclicDuplicate
This can also be realized as the composition of a bunch of functions
firstCyclicDuplicate :: [Integer] -> Maybe Integer
firstCyclicDuplicate = duplicate . accumulate . cycle
Again, up to personal preference.
duplicate
I would rename this to firstDuplicate
to be consistent with firstCyclicDuplicate
.
Indentation
I'm not a Haskell pro, but your indentation seems a bit off and my syntax highlighting agrees with it (it doesn't highlight the function declarations when they're indented). I would keep everything top level unindented. Also my personal preference is indenting by two spaces, so that's what I've done in my sample solution.
Imports
Your imports are unnecessary as far as I can tell (my code compiles without them). If you thought you needed the imports for the types, I think they're in Prelude
.
Use of Maybe
in duplicate
I like the use of Maybe
here and I suggested the same to dagda1, but I realize now that its use, while stylistically good, is somewhat pointless.
Since you're operating on an infinite list of frequencies, your code will run forever (or until the stack blows). Even though you have
dup [] _ = Nothing
you'll never pattern match an empty list since you're running this function on an infinite one.
Stylistically, I think using Maybe
is better than having a partial function, but I just wanted to point out that having pure functions doesn't necessarily save you from an infinite loop.
Efficiency
Your implementation is, for all intents and purposes, the same algorithm as dagda1's in the other post.
They chose to accumulate the frequencies inside of their function to find the duplicate, whereas you've decoupled the two. Due to laziness I would argue the approaches are the same.
Again, the inefficiency comes down to the use of lists over sets in duplicate
. This is a great problem to demonstrate why elem
is something you should be wary of using and why linked lists make poor sets.
Your issue is in your duplicate lookup
if elem x visited
as elem
is an O(n)
operation. You can read the Efficiency section on my answer to the other post for a bit more details.
The solution is the same, and I'll copy paste my response for convenience.
Introducing Data.Set
Data.Set provides unordered sets with O(log n)
lookup and insert time. Yeah, it's O(log n)
, but the total runtime for me when I checked the implementation was less than a second.
I'm including a sample implementation of day 1 below that I've tested and confirmed on my inputs if you want to compare. Here are some pointers if you want to fix it yourself
- You can keep your code mostly the same (although I'd recommend making the style changes I suggested)
- You will want to use two functions from
Data.Set
:member
andinsert
. - Your only need to change
duplicate
, specifically your use of data structure (see section header, hint hint)
Sample solution
module DayOnePartTwo where
import Data.Set (Set)
import qualified Data.Set as S
inputFileName = "input.txt"
input :: String -> [Integer]
input = map (read . removeLeadingPlus) . lines
where
removeLeadingPlus ('+':s) = s
removeLeadingPlus s = s
freqs :: [Integer] -> [Integer]
freqs = scanl (+) 0
firstDuplicate :: [Integer] -> Maybe Integer
firstDuplicate list = dup list S.empty
where
dup (x:xs) visited =
if x `S.member` visited
then Just x
else dup xs (S.insert x visited)
-- This will never be reached on the cycled list
dup [] _ = Nothing
firstCyclicDuplicate :: [Integer] -> Maybe Integer
firstCyclicDuplicate = firstDuplicate . freqs . cycle
main :: IO ()
main = do
contents <- input <$> readFile inputFileName
case firstCyclicDuplicate contents of
Just a -> print a
Nothing -> print "There is no first duplicate"
Explore related questions
See similar questions with these tags.