While watching a sports tournament, I noticed that the tournament tree looks a lot like a heap. I came up with the following data structure: A complete binary tree where the leaves are elements of some set, and each internal node is the $\max$ of its two children. I came up with a BuildHeap algorithm that's $O(n)$, a GetMax algorithm that's $O(1)$, an Insert algorithm that's $O(\log n)$, and a Delete algorithm that's $O(\log n)$. The number of nodes in this "heap" is 2ドルn-1$ where $n$ is the number of elements in the underlying set. The structure is simpler than a binary heap.
Is there a name for this data structure?
[edit]
Code:
import Data.List
import Data.Ord
data Heap a = Leaf a | Branch a (Heap a) (Heap a) deriving Show
getMax :: Heap a -> a
getMax (Leaf x) = x
getMax (Branch x _ _) = x
leaf :: Ord a => a -> Heap a
leaf = Leaf
branch :: Ord a => Heap a -> Heap a -> Heap a
branch h1 h2 = Branch (max (getMax h1) (getMax h2)) h1 h2
buildHeap :: Ord a => [a] -> Heap a
buildHeap xs = fst $ buildSubHeap (length xs) xs
where buildSubHeap 1 (x:xs) = (Leaf x, xs)
buildSubHeap n xs = let (leftSubHeap, remainder1) = buildSubHeap (div n 2) xs
(rightSubHeap, remainder2) = buildSubHeap (n - div n 2) remainder1
in (branch leftSubHeap rightSubHeap, remainder2)
insertIntoHeap :: Ord a => a -> Heap a -> Heap a
insertIntoHeap x (Leaf y) = branch (leaf x) (leaf y)
insertIntoHeap x (Branch m h1 h2) = branch h2 (insertIntoHeap x h1)
deleteInsignificantElement :: Ord a => Heap a -> Heap a
deleteInsignificantElement (Branch _ (Leaf x) (Leaf y)) = Leaf (max x y)
deleteInsignificantElement (Branch _ h1 h2) = branch (deleteInsignificantElement h2) h1
getInsignificantElement :: Ord a => Heap a -> a
getInsignificantElement (Branch _ (Leaf x) (Leaf y)) = min x y
getInsignificantElement (Branch _ h1 h2) = getInsignificantElement h2
replaceMax :: Ord a => Heap a -> a -> Heap a
replaceMax (Leaf x) y = Leaf y
replaceMax (Branch m h1 h2) y | getMax h1 == m = branch (replaceMax h1 y) h2
| getMax h2 == m = branch h1 (replaceMax h2 y)
| otherwise = undefined
deleteMax :: Ord a => Heap a -> Heap a
deleteMax heap = replaceMax (deleteInsignificantElement heap) (getInsignificantElement heap)
data HeapObserver a = Singleton a | PushHeap a (Heap a) deriving Show
popHeap :: Ord a => Heap a -> HeapObserver a
popHeap (Leaf x) = Singleton x
popHeap heap = PushHeap (getMax heap) (deleteMax heap)
undown :: Down a -> a
undown (Down x) = x
heapsort :: Ord a => [a] -> [a]
heapsort [] = []
heapsort xs = map undown . flattenHeap . buildHeap . map Down $ xs
where flattenHeap heap = case popHeap heap of
Singleton y -> [y]
PushHeap y ys -> y : flattenHeap ys
example_list = [234,234245,13235,14223,12,5,41,24,132,4134,25,234, 94875937, 34059, 784, 34875, 234, 765, 909]
main :: IO ()
main = print (heapsort example_list == reverse (sort example_list))
2 Answers 2
This is essentially a Segment tree which is a data structure that augments an array with a binary tree as you describe such that:
- You have fast set and get at any index
- You have fast "aggregate" queries on ranges
- You can support fast update queries on ranges, for some combinations of updates and queries
The $j$th node at height $k$ in the tree "summarizes" a subarray $[j*2^k, (j+1)*2^k)$ of the original array. Since each element of the array appears in only logarithmically many such subarrays, we can do updates in $O(\log n)$ time.
The range queries can use any associative operation. In your example the operation is $\max$, but other examples include sum, product, even standard deviation (via sum and sum of squares).
I originally called this a Fenwick Tree (aka Binary Indexed Tree), which is a similar structure but which compresses the tree into only exactly $n$ storage with no overhead(but loses access to the original array).
-
$\begingroup$ I think you're right and I confuse them all the time $\endgroup$Curtis F– Curtis F2019年01月26日 22:39:08 +00:00Commented Jan 26, 2019 at 22:39
-
3$\begingroup$ For clarification: a Segment tree also uses linear storage. $\endgroup$Jakube– Jakube2019年01月26日 23:12:28 +00:00Commented Jan 26, 2019 at 23:12
You've just reinvented a range-query structure. This is an instance of the more general idea that, if you put all data at only the leaves of a binary tree, you can use internal nodes to represent substructures, and it will work for any kind of structure that can be recursively defined with an $O(1)$ recursive initialization.
In this case, you can see that the structure is just an unordered set with a maximum, which is obviously $O(1)$ recursively definable. It is also clear that you can support Insert and DeleteMax and $O(1)$ GetMax, but not $O(\log n)$ Search. Additionally, if you think a bit you should be able to figure out how to support $O(\log n)$ MergeHeap.
Explore related questions
See similar questions with these tags.