Mini Binary Tree "library"

Question 1

This is my first attempt at learning binary trees. I'm looking for general feedback on how it could be made cleaner, more idiomatic, or more efficient.

data Tree a = ETree | Node (Tree a) (Tree a) a deriving (Show)
tInsert :: Ord o => o -> Tree o -> Tree o
tInsert x ETree = Node ETree ETree x
tInsert x (Node lT rT a)
 | x < a = Node (tInsert x lT) rT a
 | x > a = Node lT (tInsert x rT) a
 | otherwise = Node lT rT x
-- Folds a list into a tree
tFromListL, tFromListR :: Ord o => [o] -> Tree o
tFromListL = foldl (flip tInsert) ETree
tFromListR = foldr tInsert ETree
-- Turns a tree into a list
tToList :: Ord o => Tree o -> [o]
tToList ETree = []
tToList (Node lT rT a) = (tToList lT) ++ [a] ++ (tToList rT)
-- Splits a list roughly in half (as part of balancing)
splitInHalf :: [a] -> ([a],[a])
splitInHalf xs = splitAt (round $ (fromIntegral $ length xs) / 2.0) xs
-- Returns how unbalanced a node is
tUnbalancedBy :: Tree a -> Int
tUnbalancedBy ETree = 0
tUnbalancedBy (Node lT rT _) = absDiff (tDepth lT) (tDepth rT)
-- Arranges a list in such a way that it forms a more balanced tree
balanceList :: [a] -> [a]
balanceList xs = let (fH,sH) = splitInHalf xs in (reverse fH) ++ sH
-- "Inefficient balance"
tIneffBalance :: Ord o => Tree o -> Tree o
tIneffBalance = tFromListL . balanceList . tToList
-- Finds the min/max values of a tree
tMin, tMax :: Ord o => Tree o -> o
tMin ETree = error "tMin called on an Empty Tree"
tMin (Node lT _ a) = case lT of
 ETree -> a
 (Node lT' _ _) -> tMin lT'
tMax ETree = error "tMax called on an Empty Tree"
tMax (Node _ rT a) = case rT of
 ETree -> a
 (Node _ rT' _) -> tMax rT'
-- Find the max depth of a tree 
tDepth :: Tree a -> Int
tDepth ETree = 0
tDepth (Node lT rT _) = 1 + max (tDepth lT) (tDepth rT)
-- Finds how many nodes a tree contains
tSize :: Tree a -> Int
tSize ETree = 0
tSize (Node lT rT _) = 1 + (tSize lT) + (tSize rT)
absDiff :: Int -> Int -> Int
absDiff x y = abs $ x - y
-- Checks if a node is balanced
tIsBalanced :: Tree a -> Bool
tIsBalanced ETree = True
tIsBalanced n
 | tUnbalancedBy n > 2 = False
 | otherwise = True
-- Checks if a value is an element of the tree
tElem :: Ord o => o -> Tree o -> Bool
tElem x ETree = False
tElem x (Node lT rT a)
 | x < a = tElem x lT
 | x > a = tElem x rT
 | otherwise = True
tDelete :: Ord o => o -> Tree o -> Tree o
tDelete _ ETree = ETree
tDelete _ n@(Node ETree ETree _) = n -- Or give "Not found" error?
tDelete tD n@(Node lT rT a)
 | tD < a = Node (tDelete tD lT) rT a
 | tD > a = Node lT (tDelete tD rT) a
 | otherwise = case (lT,rT) of
 (ETree,t) -> t
 (t,ETree) -> t
 (t,t') -> let fMin = tMin t' in Node t (tDelete (tMin t') t') fMin

My main concerns are:

The balancing algorithm. I thought it would work beautifully at first, then I realized that it basically just turns the tree into a "V". It's more efficient then a linked list, but still not very balanced
The 2 "tToList*" functions. Should I use foldl or foldr? They seem to act fairly equivalently; although foldl is easier to use because you typically read a list from the left
The "tDelete" function. It works (from my testing), but it looks very ugly. I basically just wrote it case-by-case, which made it very long.

Question 2

Function Visibility

In the code there's some functions that seem to be intended for internal use only. To hide these from external modules you should have a "proper" module declaration that exports those functions you want to make "public".

module MyTree ( Tree(ETree, Node), tInsert, tFromListL, tFromListR, ... ) where

this allows you to hide (and change) implementation details of your exposed function.

Balancing Your Tree

Your algorithm to balance that tree is something along the lines of \$\mathcal{O}(3 * n)\$. That's bad compared to an almost constant time-effort when balancing in tree form. Then again why are you even exposing this functionality?

If you allow a tree to be unbalanced, you shouldn't expose an operation to balance it and vice versa. You should strongly consider extracting balanced behaviour into either a separate type and using instance to expose a common "interface" or extract this into a completely separate module.

Something like this:

class (Ord a) => Tree a where
 tInsert, tDelete :: a -> Tree a -> Tree a
 tElem :: a -> Tree a -> Bool
 tIsBalanced :: Tree a -> Bool
 tSize, tDepth :: Tree a -> Int
 tMin, tMax :: Tree a -> a
 tToList :: Tree a -> [a]
data BalancedTree a = Empty | Node (BalancedTree a) (BalancedTree a) a deriving (Show)
instance NormalTree BalancedTree where
-- | because we are an instance of NormalTree we can access tInsert'
 tInsert = balance . tInsert'
-- and so on and so forth
data NormalTree a = Empty | Node (Tree a) (Tree a) a deriving (Show)
instance Tree NormalTree where
 tInsert = tInsert'
 -- | default implementation for insert without balancing
 tInsert' :: a -> Tree a -> Tree a

Note that I haven't even used this myself and I'm not sure whether this works. The code should give you a gist as to how this could / should work and isn't even intended to be syntactically correct.

This defines the "interface" of Tree in a class and then uses data types to implement said "interface".

Errors In Expected Circumstances

When you call tMax or tMin respectively on an ETree (which would be better named as Empty) an error is raised. That's an IMO really ugly and declarative way of solving this. It'd be preferrable to have a pure function. This requires a small change to the type-signature:

tMin, tMax :: Ord o => Tree o -> Maybe o

Which in turn allows you to do the following:

tMin Empty = Nothing
tMin (Node lT _ a) = case lT of 
 Empty -> a
 (Node lT' _ _) -> tMin lT'

This function is total and pure. As such it's much easier and cleaner to use.

Vogel612 Vogel612 25.5k7 gold badges59 silver badges141 bronze badges · Accepted Answer · 2016-12-31 01:42:48Z

Function Visibility

In the code there's some functions that seem to be intended for internal use only. To hide these from external modules you should have a "proper" module declaration that exports those functions you want to make "public".

module MyTree ( Tree(ETree, Node), tInsert, tFromListL, tFromListR, ... ) where

this allows you to hide (and change) implementation details of your exposed function.

Balancing Your Tree

Your algorithm to balance that tree is something along the lines of \$\mathcal{O}(3 * n)\$. That's bad compared to an almost constant time-effort when balancing in tree form. Then again why are you even exposing this functionality?

If you allow a tree to be unbalanced, you shouldn't expose an operation to balance it and vice versa. You should strongly consider extracting balanced behaviour into either a separate type and using instance to expose a common "interface" or extract this into a completely separate module.

Something like this:

class (Ord a) => Tree a where
 tInsert, tDelete :: a -> Tree a -> Tree a
 tElem :: a -> Tree a -> Bool
 tIsBalanced :: Tree a -> Bool
 tSize, tDepth :: Tree a -> Int
 tMin, tMax :: Tree a -> a
 tToList :: Tree a -> [a]
data BalancedTree a = Empty | Node (BalancedTree a) (BalancedTree a) a deriving (Show)
instance NormalTree BalancedTree where
-- | because we are an instance of NormalTree we can access tInsert'
 tInsert = balance . tInsert'
-- and so on and so forth
data NormalTree a = Empty | Node (Tree a) (Tree a) a deriving (Show)
instance Tree NormalTree where
 tInsert = tInsert'
 -- | default implementation for insert without balancing
 tInsert' :: a -> Tree a -> Tree a

Note that I haven't even used this myself and I'm not sure whether this works. The code should give you a gist as to how this could / should work and isn't even intended to be syntactically correct.

This defines the "interface" of Tree in a class and then uses data types to implement said "interface".

Errors In Expected Circumstances

When you call tMax or tMin respectively on an ETree (which would be better named as Empty) an error is raised. That's an IMO really ugly and declarative way of solving this. It'd be preferrable to have a pure function. This requires a small change to the type-signature:

tMin, tMax :: Ord o => Tree o -> Maybe o

Which in turn allows you to do the following:

tMin Empty = Nothing
tMin (Node lT _ a) = case lT of 
 Empty -> a
 (Node lT' _ _) -> tMin lT'

This function is total and pure. As such it's much easier and cleaner to use.

Stack Exchange Network

Mini Binary Tree "library"

1 Answer 1

Function Visibility

Balancing Your Tree

Errors In Expected Circumstances

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Mini Binary Tree "library"

1 Answer 1

Function Visibility

Balancing Your Tree

Errors In Expected Circumstances

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions