Haskell monad- and error-handling-style

Question 1

As an answer to excersise 2.3 in Chris Okasaki's book, Purely Functional Data Structures,

Inserting an existing element into a binary search tree copies the entire search path even though the copied nodes are indistinguishable from the originals. Rewrite insert using exceptions to avoid this copying. Establish one handler per insertion rather than one handler per iteration.

I've written the following haskell implementations of insertion in a binary heap:

import Data.Maybe
data Tree e = Leaf | T (Tree e) e (Tree e) deriving Show
empty = Leaf
insert x Leaf = T Leaf x Leaf
insert x tree@(T l e r)
 | x == e = tree
 | x < e = T (insert x l) e r
 | otherwise = T l e (insert x r)
ins x t =
 t >>= \t -> case t of
 Leaf -> Just $ T Leaf x Leaf
 tree@(T l e r)
 | x == e -> Nothing
 | x < e -> (ins x (Just l)) >>= \nt -> Just $ T nt e r
 | ot herwise -> (ins x (Just r)) >>= \nt -> Just $ T l e nt
fastInsert x t =
 fromMaybe t $ ins x $ Just t

The fastInsert function (using ins as a 'helper') and the insert function will return the same tree given the same input, but fastInsert wont copy sub-trees if no insertion is needed (if the tree already contains the node to be inserted).

The question is: how would this look if written more idiomatically, with regard to monadic operations/syntax in ins?

Another question: I'm using monads as a substitute for exceptions as suggested in the book. The point is to avoid rebuilding (reference copying) the tree when returning in the recursive calls. How should I have gone about this?

Question 2

I would use less indentation in the case statement (personal choice). I've found that aligning everything as close to the left as you can (with sufficient indentation) is neater. Including the type signatures would probably also help cleanliness. And I just made a near full Map implementation based on a binary tree, and I didn't use Monad at all. If you're using them just to play around, then OK, but they're probably unnecessary here in the first place.

Question 3

Noted about type annotations and indentation. Thanks. Monads here actually serve a function performance wise.

Question 4

I agree w/ Petr's answer for all essential parts.

I will show a more step-by-step refactoring.

In your ins:

ins x t =
 t >>= \t -> case t of
 -- omitted
 tree@(T l e r)

t stands for two different variables of two different types. You do not need to name the second matched case, if you are not going to use that name. I re-scan an expression when I see an unused name, to see if I missed anything, which impedes readability.
Even if you needed to refer to T l e r, it already had a name t, giving it a second name is also confusing.

Now the reason there seems to be to much indentation is that there is too much nesting. That huge anonymous lambda is the second argument to the >>=. That function deserves it's own name:

ins x t = t >>= ins'
 where
 ins' Leaf = Just $ T Leaf x Leaf
 ins' (T l e r)
 | x == e = Nothing
 | x < e = (ins x (Just l)) >>= \nt -> Just $ T nt e r
 | otherwise = (ins x (Just r)) >>= \nt -> Just $ T l e nt

Indentation is already easier on the eyes. Also vertically aligning symmetric cases helps understanding.

It is customary to rename variables x' if it stands for a modified version of a variable x. Let's rename variables and remove unnecessary parens:

ins' (T l e r)
 | x == e = Nothing
 | x < e = ins x (Just l) >>= \l' -> Just $ T l' e r
 | otherwise = ins x (Just r) >>= \r' -> Just $ T l e r'

Now it is easier to make ins' recursive, instead of ins and ins' co-recursive, as ins isn't doing much (ins x (Just l) ⊢ (Just l) >>= ins' ⊢ ins' l):

ins' (T l e r)
 | x == e = Nothing
 | x < e = ins' l >>= \l' -> Just $ T l' e r
 | otherwise = ins' r >>= \r' -> Just $ T l e r'

The question is: how would this look if written more idiomatically, with regard to monadic operations/syntax in ins?

The above function isn't terribly complicated, and can be left alone.

If there are more intermediate results, do notation may be indicated:

ins' (T l e r)
 | x == e = Nothing
 | x < e = do { l' <- ins' l; return (T l' e r) }
 | otherwise = do { r' <- ins' r; return (T l e r') }

Verbosity may be further reduced, using monad comprehension:

{-# LANGUAGE MonadComprehensions #-}
....
ins' (T l e r)
 | x == e = Nothing
 | x < e = [ T l' e r | l' <- ins' l]
 | otherwise = [ T l e r' | r' <- ins' r]

Now doing further refactorings, s.a. those suggested by Petr, are easier to see and make.

Lambda lift x in ins':

ins x t = t >>= (ins' x)
 where
 ins' x Leaf = ....

Move ins' top-level, inline ins in fastInsert, (ins x $ Just t ⊢ (Just l) >>= (ins' x) ⊢ ins' x l):

fastInsert x t = fromMaybe t $ ins' x t

After inlining ins, it can be deleted; and ins' can be renamed to ins.

Side note

If we now look at ins it doesn't depend on Maybe specifically. We can re-write it for any MonadPlus instead.

import Control.Monad
ins :: (Ord e, MonadPlus m) => e -> Tree e -> m (Tree e)
ins x Leaf = return $ T Leaf x Leaf
ins x (T l e r) = case compare x e of -- let's do case as Petr said this time
 EQ -> mzero
 LT -> [ T l' e r | l' <- ins x l] 
 GT -> [ T l e r' | r' <- ins x r]

Both of these, now, work:

foldM (flip ins) Leaf [1 .. 10] :: [Tree Integer]
foldM (flip ins) Leaf [1 .. 10] :: Maybe (Tree Integer)

This can also be written like this:

ins x (T l e r) = mplus
 [ T l' e r | x < e, l' <- ins x l] 
 [ T l e r' | x > e, r' <- ins x r]

EDIT

Another question: I'm using monads as a substitute for exceptions as suggested in the book.

Maybe is what to use to implement an operation that may or may not succeed.

The point is to avoid rebuilding (reference copying) the tree when returning in the recursive calls.

It won't allocate a new path of nodes as asked in the exercise.

How should I have gone about this?

My knowledge of Haskell is elementary, so I cannot say that there is no better way; but I do not know of any.

Question 5

Awesome step-by-step of the refactorization, Thanks! Is there anything you can say about the "error handling" currently done by the Maybe monad? - avoiding to copy the tree when inserting an existing element?

Question 6

@Bladt I edited in a few words.

Question 7

There are couple of points that I'd suggest:

Be sure to always include types for top-level functions. Without them, the code becomes very quickly unreadable. For example, would you know what ins does and what is its type, if you didn't write it yourself? In many cases you can immediately judge what a function does just looking at its name and its type.
If you do
```
 | x < e = ...
 | x == y = ...
 | otherwise = ...
```
you're performing the comparison twice. In most cases this shouldn't be a problem, but if writing
```
case compare x e of
 LT -> ...
 EQ -> ...
 GT -> ...
```
will compare only once, and also allows the compiler to do more optimizations.
This isn't a problem, just a note: Be aware that fastInsert trades speed for lazyiness. If you examine the root of the tree returned by a fastInsert call, the tree must be examined to check if it contains the inserted element or not. Under most circumstances you wouldn't mind that.
There is no need for ins to take Maybe (Tree e) as the argument, only to apply >>= on it. Most monadic functions take pure arguments and return a monadic one.
It's often convenient to use functions from Control.Applicative to lift pure functions/values to monadic computations. In particular pure, <$> and <*>. See also Functor, Applicative and Monad.

I wasn't sure what error handling you have in mind, as these operations should never fail.

The updated program:

import Control.Applicative
import Data.Maybe
data Tree e = Leaf | T (Tree e) e (Tree e) deriving Show
empty = Leaf
insert :: (Ord e) => e -> Tree e -> Tree e
insert x Leaf = T Leaf x Leaf
insert x tree@(T l e r) = case compare x e of
 EQ -> tree
 LT -> T (insert x l) e r
 GT -> T l e (insert x r)
ins :: (Ord e) => e -> Tree e -> Maybe (Tree e)
ins x Leaf = return $ T Leaf x Leaf
ins x tree@(T l e r) = case compare x e of
 EQ -> Nothing
 LT -> (\l' -> T l' e r) <$> ins x l
-- you could also write a point-free version, if you like,
-- which better preserves the order of arguments
-- LT -> T <$> ins x l <*> pure e <*> pure r
 GT -> T l e <$> ins x r
fastInsert :: (Ord e) => e -> Tree e -> Tree e
fastInsert x t =
 fromMaybe t $ ins x t

Question 8

Thank you for your answer. I should have explained what I meant by "error handling" better, sorry. I've updated my question. I'm afraid I don't understand point 3. What do you mean I have to examine the returned tree to see the node has been inserted?

abuzittin gillifirca abuzittin gillifirca 6,37715 silver badges25 bronze badges · Accepted Answer · 2014-10-13 12:48:24Z

I agree w/ Petr's answer for all essential parts.

I will show a more step-by-step refactoring.

In your ins:

ins x t =
 t >>= \t -> case t of
 -- omitted
 tree@(T l e r)

t stands for two different variables of two different types. You do not need to name the second matched case, if you are not going to use that name. I re-scan an expression when I see an unused name, to see if I missed anything, which impedes readability.
Even if you needed to refer to T l e r, it already had a name t, giving it a second name is also confusing.

Now the reason there seems to be to much indentation is that there is too much nesting. That huge anonymous lambda is the second argument to the >>=. That function deserves it's own name:

ins x t = t >>= ins'
 where
 ins' Leaf = Just $ T Leaf x Leaf
 ins' (T l e r)
 | x == e = Nothing
 | x < e = (ins x (Just l)) >>= \nt -> Just $ T nt e r
 | otherwise = (ins x (Just r)) >>= \nt -> Just $ T l e nt

Indentation is already easier on the eyes. Also vertically aligning symmetric cases helps understanding.

It is customary to rename variables x' if it stands for a modified version of a variable x. Let's rename variables and remove unnecessary parens:

ins' (T l e r)
 | x == e = Nothing
 | x < e = ins x (Just l) >>= \l' -> Just $ T l' e r
 | otherwise = ins x (Just r) >>= \r' -> Just $ T l e r'

Now it is easier to make ins' recursive, instead of ins and ins' co-recursive, as ins isn't doing much (ins x (Just l) ⊢ (Just l) >>= ins' ⊢ ins' l):

ins' (T l e r)
 | x == e = Nothing
 | x < e = ins' l >>= \l' -> Just $ T l' e r
 | otherwise = ins' r >>= \r' -> Just $ T l e r'

The question is: how would this look if written more idiomatically, with regard to monadic operations/syntax in ins?

The above function isn't terribly complicated, and can be left alone.

If there are more intermediate results, do notation may be indicated:

ins' (T l e r)
 | x == e = Nothing
 | x < e = do { l' <- ins' l; return (T l' e r) }
 | otherwise = do { r' <- ins' r; return (T l e r') }

Verbosity may be further reduced, using monad comprehension:

{-# LANGUAGE MonadComprehensions #-}
....
ins' (T l e r)
 | x == e = Nothing
 | x < e = [ T l' e r | l' <- ins' l]
 | otherwise = [ T l e r' | r' <- ins' r]

Now doing further refactorings, s.a. those suggested by Petr, are easier to see and make.

Lambda lift x in ins':

ins x t = t >>= (ins' x)
 where
 ins' x Leaf = ....

Move ins' top-level, inline ins in fastInsert, (ins x $ Just t ⊢ (Just l) >>= (ins' x) ⊢ ins' x l):

fastInsert x t = fromMaybe t $ ins' x t

After inlining ins, it can be deleted; and ins' can be renamed to ins.

Side note

If we now look at ins it doesn't depend on Maybe specifically. We can re-write it for any MonadPlus instead.

import Control.Monad
ins :: (Ord e, MonadPlus m) => e -> Tree e -> m (Tree e)
ins x Leaf = return $ T Leaf x Leaf
ins x (T l e r) = case compare x e of -- let's do case as Petr said this time
 EQ -> mzero
 LT -> [ T l' e r | l' <- ins x l] 
 GT -> [ T l e r' | r' <- ins x r]

Both of these, now, work:

foldM (flip ins) Leaf [1 .. 10] :: [Tree Integer]
foldM (flip ins) Leaf [1 .. 10] :: Maybe (Tree Integer)

This can also be written like this:

ins x (T l e r) = mplus
 [ T l' e r | x < e, l' <- ins x l] 
 [ T l e r' | x > e, r' <- ins x r]

EDIT

Another question: I'm using monads as a substitute for exceptions as suggested in the book.

Maybe is what to use to implement an operation that may or may not succeed.

The point is to avoid rebuilding (reference copying) the tree when returning in the recursive calls.

It won't allocate a new path of nodes as asked in the exercise.

How should I have gone about this?

My knowledge of Haskell is elementary, so I cannot say that there is no better way; but I do not know of any.

Awesome step-by-step of the refactorization, Thanks! Is there anything you can say about the "error handling" currently done by the Maybe monad? - avoiding to copy the tree when inserting an existing element?

Stack Exchange Network

Haskell monad- and error-handling-style

2 Answers 2

Side note

EDIT

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Haskell monad- and error-handling-style

2 Answers 2

Side note

EDIT

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions