Merge Sort: exercise

Question 1

I've decided to learn Haskell by following the current Coursera algorithms course using Haskell to implement the algorithms discussed.

First up is merge sort, which I've implemented as shown here. I'm brand new to Haskell so looking for any suggestions for improvements to the code - it seems quite verbose - and pointers on anything I may be doing wrong or in a weird way because I don't know what I'm doing.

module Main where
import System.Environment
main :: IO ()
main = do
 args <- getArgs
 mapM_ putStrLn $ sort args
sort :: (Ord a) => [a] -> [a]
sort xs = reduce True $ map (arr) xs
arr :: a -> [a]
arr x = [x]
reduce :: (Ord a) => Bool -> [[a]] -> [a]
reduce _ [] = []
reduce forwards (x:[]) = if forwards then x else reverse x
reduce forwards xs = reduce (not forwards) (combine forwards [] xs)
combine :: (Ord a) => Bool -> [[a]] -> [[a]] -> [[a]]
combine forwards acc [] = acc
combine forwards acc (x:[]) = (reverse x):acc
combine forwards acc (x:y:zs) = let merged = merge forwards x y
 acc' = merged:acc
 in combine forwards acc' zs
merge :: (Ord a) => Bool -> [a] -> [a] -> [a]
merge forwards xs ys = mergeIter forwards [] xs ys
mergeIter :: (Ord a) => Bool -> [a] -> [a] -> [a] -> [a]
mergeIter _ acc [] [] = acc
mergeIter forwards acc (x:xs) [] = mergeIter forwards (x:acc) xs []
mergeIter forwards acc [] (y:ys) = mergeIter forwards (y:acc) [] ys
mergeIter forwards acc (x:xs) (y:ys)
 | x <= y && forwards = mergeIter forwards (x:acc) xs (y:ys)
 | x > y && not forwards = mergeIter forwards (x:acc) xs (y:ys)
 | otherwise = mergeIter forwards (y:acc) (x:xs) ys

One specific question I have is around recursive function with accumulator arguments - you'll see in the listing that I've had to create a mergeIter function just to allow a default for the initial argument for the accumulator.

As far as I can see Haskell doesn't support default values for arguments - should I be doing something else instead? A fold maybe?

Question 2

OK, so let's see. arr x = [x] is usually called wrap, or just the operator section (:[]) is used. But map can also be written with list comprehension, giving us

sort :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x <- xs]

I find it much more visually apparent that way. We usually use short variable names in Haskell, with the same goal. When I see a variable fw and then not fw I get it that it is a toggle, a flag of some sorts, I don't need to see the long name like forwardorbackwardsflag explaining it to me in words while at the same time obscuring this visually. So we get

 where
 reduce _ [] = []
 reduce fw [xs] = if fw then xs else reverse xs
 reduce fw s = reduce (not fw) (combine fw [] s)

we make it an internal function to sort, because this is what it is. It has no otherwise uses on its own. It's too tailored for the specific use here. -- (x:[]) is written also [x], which is a lot simpler. -- We don't have to annotate types of everything. Here reduce is defined right under its use site, so we see that the initial value of fw is True. For me, it is self-documenting enough, but YMMV. Next, combine. There's no need to name all those interim entities. The name of the function itself isn't right, it's too generic. What it does, specifically, is process the elements of its input list in pairs:

 rpairs fw acc [] = acc
 rpairs fw acc [x] = reverse x : acc
 rpairs fw acc (x:y:zs) = rpairs fw (merge fw x y : acc) zs

It also builds its result in reversed order. We can make it an internal function to reduce, so that its argument fw could be referred to in rpairs body. At the same time there is no need in the separate accumulator really, we can just build the result naturally, in lazy fashion:

 where
 reduce _ [] = []
 reduce fw [xs] = if fw then xs else reverse xs
 reduce fw s = reduce (not fw) (reverse $ pairs s)
 where
 pairs [] = []
 pairs [x] = [reverse x]
 pairs (x:y:zs) = merge fw x y : pairs zs

For merge to be able to refer to fw it too should be put here as an internal function to reduce. It is usual to split such functions into two like you did, the interface function ("the wrapper") and the internal "worker" function, except that normally the worker is made internal to the wrapper. But here the wrapper itself is too specialized, it reverses its result so is unlikely to be useable in general, so no need for the split. Thus we arrive at:

sort :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x <- xs]
 where
 reduce _ [] = []
 reduce fw [xs] = if fw then xs else reverse xs
 reduce fw s = reduce (not fw) . reverse . pairs $ s
 where
 pairs (x:y:t) = rmerge [] x y : pairs t
 pairs [x] = [reverse x]
 pairs [] = []
 rmerge acc [] [] = acc
 rmerge acc xs [] = reverse xs ++ acc
 rmerge acc [] ys = reverse ys ++ acc
 rmerge acc (x:xs) (y:ys)
 | fw && x <= y || not fw && x > y
 = rmerge (x:acc) xs (y:ys)
 | otherwise = rmerge (y:acc) (x:xs) ys

There's no shame in calling rmerge with the initial value as we do here, it's just an internal function here, called from another internal function.

But actually the reversals that rmerge performs are unnecessary, because of Haskell's lazy evaluation. merge now becomes a general function, usable in general. pairs too. reduce performs the same processing step on its input iteratively until a condition is met - there's a higher order function until that already captures this pattern. This leads to some more simplification of the code:

sort :: (Ord a) => [a] -> [a]
sort [] = []
sort xs = head $ until (null . tail) (pairwise merge) [[x] | x <- xs]
 -- (reverse . pairwise merge) 
pairwise f (x:y:t) = f x y : pairwise f t
pairwise _ t = t
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys)
 | x <= y = x : merge xs (y:ys)
 | otherwise = y : merge (x:xs) ys

I think we would need to toggle between <= and < in merge to ensure the sort's stability, if we were to proceed by (reverse . pairwise merge) steps. But, perhaps we can just throw that reverse away.

Question 3

Heh, looks a lot nicer & shorter... but I don't really understand it. Could you maybe go into how you got from what I'd written to what you've ended up with and why you made the changes you made? I really don't know any Haskell so you might need to explain it in quite basic terms :-)

Question 4

I was hoping you'd ask something specific. Start by comparing my 1st code and yours. Convince yourself it does the same thing. If you have questions, ask. :) btw a . b . c $ x == (a . b . c) x == a $ b $ c $ x = a (b (c x)).

Question 5

@Russell I've added some derivation.

Question 6

Phew! That's exactly what I needed, thanks. There was just so much going on I didn't know what to ask. Genuinely amazed by Haskell - it's like learning to program all over again, which is exciting :-).

Question 7

you're welcome. :) yes, it isn't tail recursion, because of the laziness. In Haskell it's known as "guarded recursion", at times as corecursion. I think it is very close to the tail recursion modulo cons. cf. stackoverflow.com/questions/12057658/… :)

Question 8

The code looks way to complicated to me, in fact this isn't even the kind of merge sort I'm used to, which is "split the list in two parts, sort the parts, merge them back to one list". Here is a short implementation:

sort xs@(_:_:_) = merge $ map sort $ split xs 
sort xs = xs
split = foldr (\x [l,r]-> [x:r,l]) [[],[]]
merge [x:xs,y:ys]
 | x < y = x : merge [xs, y:ys]
 | otherwise = y : merge [x:xs, ys]
merge xss = concat xss

Question 9

It was a similar, but much more complicated version of this I did first, but as I was using a linked list I decided that finding the halfway point of the list and splitting it for every recursive call was too inefficient. That's why I decided to try working from the bottom up instead. This is really nice and concise though.

Question 10

[1..16] --> [1,3..16] # [2,4..16] --> ([1,5..15]#[3,7..15]) # ([2,6..16]#[4,8..16]) --> ( ([1,9]#[5,13]) # ([3,11]#[7,15]) ) # ( ([2,10]#[6,14]) # ([4,12]#[8,16]) ) ==> access patterns are all over the place, so it seems that this top-down variant will need more space to operate (and so will probably be slower too) than the bottom-up method where the initial access to the input list is one sequential pass. With split (x:xs) = g (x:) xs xs where g f xs [] = (f [], xs); g f (x:xs) (_:_:ys) = g (f.(x:)) xs ys top-down is sequential access, but it does repeated traversals of the halves.

Will Ness Will Ness 1,0361 gold badge8 silver badges25 bronze badges · Accepted Answer · 2013-07-06 19:55:12Z

OK, so let's see. arr x = [x] is usually called wrap, or just the operator section (:[]) is used. But map can also be written with list comprehension, giving us

sort :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x <- xs]

I find it much more visually apparent that way. We usually use short variable names in Haskell, with the same goal. When I see a variable fw and then not fw I get it that it is a toggle, a flag of some sorts, I don't need to see the long name like forwardorbackwardsflag explaining it to me in words while at the same time obscuring this visually. So we get

 where
 reduce _ [] = []
 reduce fw [xs] = if fw then xs else reverse xs
 reduce fw s = reduce (not fw) (combine fw [] s)

we make it an internal function to sort, because this is what it is. It has no otherwise uses on its own. It's too tailored for the specific use here. -- (x:[]) is written also [x], which is a lot simpler. -- We don't have to annotate types of everything. Here reduce is defined right under its use site, so we see that the initial value of fw is True. For me, it is self-documenting enough, but YMMV. Next, combine. There's no need to name all those interim entities. The name of the function itself isn't right, it's too generic. What it does, specifically, is process the elements of its input list in pairs:

 rpairs fw acc [] = acc
 rpairs fw acc [x] = reverse x : acc
 rpairs fw acc (x:y:zs) = rpairs fw (merge fw x y : acc) zs

It also builds its result in reversed order. We can make it an internal function to reduce, so that its argument fw could be referred to in rpairs body. At the same time there is no need in the separate accumulator really, we can just build the result naturally, in lazy fashion:

 where
 reduce _ [] = []
 reduce fw [xs] = if fw then xs else reverse xs
 reduce fw s = reduce (not fw) (reverse $ pairs s)
 where
 pairs [] = []
 pairs [x] = [reverse x]
 pairs (x:y:zs) = merge fw x y : pairs zs

For merge to be able to refer to fw it too should be put here as an internal function to reduce. It is usual to split such functions into two like you did, the interface function ("the wrapper") and the internal "worker" function, except that normally the worker is made internal to the wrapper. But here the wrapper itself is too specialized, it reverses its result so is unlikely to be useable in general, so no need for the split. Thus we arrive at:

sort :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x <- xs]
 where
 reduce _ [] = []
 reduce fw [xs] = if fw then xs else reverse xs
 reduce fw s = reduce (not fw) . reverse . pairs $ s
 where
 pairs (x:y:t) = rmerge [] x y : pairs t
 pairs [x] = [reverse x]
 pairs [] = []
 rmerge acc [] [] = acc
 rmerge acc xs [] = reverse xs ++ acc
 rmerge acc [] ys = reverse ys ++ acc
 rmerge acc (x:xs) (y:ys)
 | fw && x <= y || not fw && x > y
 = rmerge (x:acc) xs (y:ys)
 | otherwise = rmerge (y:acc) (x:xs) ys

There's no shame in calling rmerge with the initial value as we do here, it's just an internal function here, called from another internal function.

But actually the reversals that rmerge performs are unnecessary, because of Haskell's lazy evaluation. merge now becomes a general function, usable in general. pairs too. reduce performs the same processing step on its input iteratively until a condition is met - there's a higher order function until that already captures this pattern. This leads to some more simplification of the code:

sort :: (Ord a) => [a] -> [a]
sort [] = []
sort xs = head $ until (null . tail) (pairwise merge) [[x] | x <- xs]
 -- (reverse . pairwise merge) 
pairwise f (x:y:t) = f x y : pairwise f t
pairwise _ t = t
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys)
 | x <= y = x : merge xs (y:ys)
 | otherwise = y : merge (x:xs) ys

I think we would need to toggle between <= and < in merge to ensure the sort's stability, if we were to proceed by (reverse . pairwise merge) steps. But, perhaps we can just throw that reverse away.

Heh, looks a lot nicer & shorter... but I don't really understand it. Could you maybe go into how you got from what I'd written to what you've ended up with and why you made the changes you made? I really don't know any Haskell so you might need to explain it in quite basic terms :-)
I was hoping you'd ask something specific. Start by comparing my 1st code and yours. Convince yourself it does the same thing. If you have questions, ask. :) btw a . b . c $ x == (a . b . c) x == a $ b $ c $ x = a (b (c x)).
Phew! That's exactly what I needed, thanks. There was just so much going on I didn't know what to ask. Genuinely amazed by Haskell - it's like learning to program all over again, which is exciting :-).
you're welcome. :) yes, it isn't tail recursion, because of the laziness. In Haskell it's known as "guarded recursion", at times as corecursion. I think it is very close to the tail recursion modulo cons. cf. stackoverflow.com/questions/12057658/… :)

Stack Exchange Network

Merge Sort: exercise

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Merge Sort: exercise

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions