Wednesday, June 6, 2007
Solving Collatz Sequences
Steve, over at cod3po37ry, is learning Haskell and working through some contest problems, starting with the Collatz sequence. Steve doesn't mention it explicitly, but this is an interesting unsolved problem in mathematics. The problem starts out with a simple function over a single positive integer n:
Steve's solution is a little, um, cumbersome, and he makes it clear that his goal is to learn:
Unlike most languages I have used, Haskell has a very interesting property -- if you find yourself writing a lot of code, chances are you're doing something wrong. Generally, this is because the Prelude and the standard library have a rich set of tools for building solutions from the bottom up. Writing a lot of code usually means that you're ignoring a key abstraction somewhere. Haskell programs tend to be short because they can be short; many common problems have generalized solutions in the library already. If you find a general problem that's not solved in the standard library, implementing a general solution typically involves writing one missing function.
The first step to solving this problem involves the collatz function which implements the interesting 3n + 1 property:
Therefore, expanding this simple collatz function to a function that produces a collatz sequence is simply:
- if n is even, halve it
- if n is odd, triple it and add one
It's an open question as to whether or not this function always converges on a value of 1 for any positive integer n. Wikipedia claims that this property has been proven up through at least 10 × 258, but it's unclear if it's true for all positive integers.f 2 = [2,1]
f 3 = [3,10,5,16,8,4,2,1]
f 4 = [4,2,1]
f 5 = [5,16,8,4,2,1]
f 6 = [6,3,10,5,16,8,4,2,1]
f 7 = [7,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1]
Steve's solution is a little, um, cumbersome, and he makes it clear that his goal is to learn:
This post follows the Law of Usenet: if you post a question, nobody will answer it. If you post an answer, they'll all rush to correct it. I welcome corrections.In that spirit, here is my response. :-)
Unlike most languages I have used, Haskell has a very interesting property -- if you find yourself writing a lot of code, chances are you're doing something wrong. Generally, this is because the Prelude and the standard library have a rich set of tools for building solutions from the bottom up. Writing a lot of code usually means that you're ignoring a key abstraction somewhere. Haskell programs tend to be short because they can be short; many common problems have generalized solutions in the library already. If you find a general problem that's not solved in the standard library, implementing a general solution typically involves writing one missing function.
The first step to solving this problem involves the collatz function which implements the interesting 3n + 1 property:
Next, there's the function to produce a "collatz sequence" starting with a number n and (hopefully) terminating with the number 1. Fortunately, this behavior is precisely what the iterate function in the Prelude provides:collatz :: Int -> Int
collatz 1 = 1
collatz n = if (odd n)
then (3 * n + 1)
else n `div` 2
That is, iterate takes a function and a seed value, and returns a list that contains the seed, and an infinite sequence of values produced by applying the function to the previous value.iterate :: (a -> a) -> a -> [a]
Therefore, expanding this simple collatz function to a function that produces a collatz sequence is simply:
Actually, that's not quite correct, since a collatz sequence terminates with a value of 1:collatzSequence :: Int -> [Int]
collatzSequence = iterate collatz
Sure enough, this function works as expected:collatzSequence :: Int -> [Int]
collatzSequence = terminate . iterate collatz
where
terminate (1:_) = [1]
terminate (x:xs) = x:terminate xs
That solves the problem of generating collatz sequences. But the contest problem was to find the longest collatz sequence within a range of numbers. Given:*Main> collatzSequence 7
[7,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1]
produce this output:1 10
100 200
201 210
900 1000
Generating these results is simple enough. First, convert a collatz sequence to its length:1 10 20
100 200 125
201 210 89
900 1000 174
Next, convert a range of integers into their collatz lengths:*Main> length $ collatzSequence 7
17
Next, pick out the largest sequence in the range:
*Main> map (length . collatzSequence) [1..10]
[1,2,8,3,6,9,17,4,20,7]
Next, consume a line of input, perform this calculation, and produce a line of output:*Main> maximum $ map (length . collatzSequence) [1..10]
17
And finally, write the main function that consumes all input and produces the desired output:run :: String -> IO ()
run s = do let (i:j:_) = map read (words s)
m = maximum $ map (length . collatzSequence) [i..j]
putStrLn (concat [show i, " ", show j, " ", show m])
Here is the completed program in its entirety:main = do inp <- getContents
mapM_ run (lines inp)
Hope this helps!collatz :: Int -> Int
collatz 1 = 1
collatz n = if (odd n)
then (3 * n + 1)
else n `div` 2
collatzSequence :: Int -> [Int]
collatzSequence = terminate . iterate collatz
where
terminate (1:_) = [1]
terminate (x:xs) = x:terminate xs
run :: String -> IO ()
run s = do let (i:j:_) = map read (words s)
m = maximum $ map (length . collatzSequence) [i..j]
putStrLn (concat [show i, " ", show j, " ", show m])
main = do inp <- getContents
mapM_ run (lines inp)
Subscribe to:
Post Comments (Atom)
5 comments:
And, of course, your list-generating function can be written using 'unfoldr':
collatzSequence :: Int -> [Int]
collatzSequence = unfoldr collatz
where
collatz 0 = Nothing
collatz 1 = Just (1, 0)
collatz x = if odd x
then Just (x, 3 * x + 1)
else Just (x, x `div` 2)
Since zero cannot be generated by collatz, I used it as the terminating value for 'unfoldr'. Sorry for the poor formatting; I couldn't use the <pre> tag.
Keep up the good work! This is an informative blog...
Small world. I'm going through the same list of problems in an effort to pick up Haskell. I like the iterate/unfoldr solutions more than the one I came up with; not perverting the collatz function such that it returns a list is much more clear. Thanks for posting this! Very helpful.
Nice code. I'm still learning Haskell as well, and don't remember seeing "$" before; I'll have to see if I can use it to make my code less cumbersome.
Couple of nit-picks:
* I've always thought for some reason that the sequence ends in a 3-cycle:
1, 4, 2, 1, 4, 2 ...
(see the Wikipedia page). With this assumption, you can remove the pattern match
collatz 1 = 1
- the terminate function should still work (at least for positive inputs :)
* In the outputs, the length of the Collatz sequence for 9 is given as 20, but the maximum of the lengths when the inputs range from 1 to 10 is 17?
Is there a reason you prefer to define terminate rather than use takeWhile?
I defined terminate instead of using takeWhile, because (takeWhile (/= 1)) will remove the last element of the sequence, while terminate (as defined here) preserves it.
Post a Comment