I am learning fundamental data structures, and I want to know if this is a valid implementation of a heap. Can any C# features be used to improve or extend the implementation? In addition to heapsort, what are some applications of a heap?
class BinaryHeap
{
/// <summary>
/// A binary heap implementation used to sort an integer array.
///
/// 1. The Max-Heapify and Min-Heapify methods maintain the heap properties.
///
/// 2. The Build-Heap methods produce the heap from an unordered array.
///
/// 3. The HeapSort methods sort the array in place and runs in O(n ln n) time.
/// </summary>
private int heapSize;
public BinaryHeap()
{
heapSize = 0;
}
private int ParentIndex(int currentIndex)
{
return currentIndex / 2;
}
private int LeftIndex(int currentIndex)
{
return currentIndex * 2;
}
private int RightIndex(int currentIndex)
{
return currentIndex * 2 + 1;
}
// Building the heap
#region
public void BuildMaxHeap(int[] A)
{
heapSize = A.Length - 1;
for(int i = A.Length/2; i >= 0; i--)
{
MaxHeapify(A, i);
}
}
public void BuildMinHeap(int[] A)
{
heapSize = A.Length - 1;
for (int i = A.Length / 2; i >= 0; i--)
{
MinHeapify(A, i);
}
}
#endregion
// Maintaining heap properties
// MaxHeapify: Ensure that parents are larger than children
// MinHeapify: Ensure that children are larger than parents
#region
public void MaxHeapify(int[] A, int i)
{
int leftIndex = LeftIndex(i);
int rightIndex = RightIndex(i);
int largestIndex = 0;
// Check to see which node in the tree subset has the largest value
if (leftIndex <= heapSize && A[leftIndex] > A[i])
{
largestIndex = leftIndex;
}
else
{
largestIndex = i;
}
if (rightIndex <= heapSize && A[rightIndex] > A[largestIndex])
{
largestIndex = rightIndex;
}
// Do not make any switches if the largest node is the parent
if(largestIndex != i)
{
int temp = A[largestIndex];
A[largestIndex] = A[i];
A[i] = temp;
MaxHeapify(A, largestIndex);
}
}
public void MinHeapify(int[] A, int i)
{
int leftIndex = LeftIndex(i);
int rightIndex = RightIndex(i);
int smallest;
if (leftIndex <= heapSize && A[leftIndex] < A[i])
{
smallest = leftIndex;
}
else
{
smallest = i;
}
if (rightIndex <= heapSize && A[rightIndex] < A[smallest])
{
smallest = rightIndex;
}
if (smallest != i)
{
int temp = A[i];
A[i] = A[smallest];
A[smallest] = temp;
MinHeapify(A, smallest);
}
}
#endregion
// Heapsort
#region
public int[] AscendingHeapSort(int[] A)
{
// Ensure all parents are greater than their children
BuildMaxHeap(A);
for (int i = A.Length - 1; i >= 0; i--)
{
int temp = A[0];
A[0] = A[i];
A[i] = temp;
heapSize--;
MaxHeapify(A, 0);
}
return A;
}
public int[] DescendingHeapSort(int[] A)
{
// Ensure all parents are less than their children
BuildMinHeap(A);
for (int i = A.Length - 1; i >= 0; i--)
{
int temp = A[0];
A[0] = A[i];
A[i] = temp;
heapSize--;
MinHeapify(A, 0);
}
return A;
}
#endregion
}
2 Answers 2
C#/Design related comments
- Use small-letters for passing arguments.
- Put your comments (The
<summary>
) about the class on top of the class definition, not the field. - The way of using # regions is not really done the way you do in C# usually. There should be a description of the region right after the region on same line.
Use C# comments styling when commenting classes, methods, their parameters.
There is not much sense in designing Heap class that stores just a single integer (heap size). You could simply have
public static
methods (utility methods) for that purpose, because it seems none of your methods really need to store a state- If you design this as a Heap, depends on use-cases you might want to design
Push
/Pop
methods and store the underlying array in the class. Then having a class would make sense. - Try to use C# Generics instead of strongly defined
int
. This way you can make your code re-usable for other types (e.g. short) and/or for any object that implementsIComparable
(check out MSDN for more info on these).
Where to use Heap
.
In addition to sorting you can use Heap
as PriorityQueue
.
If you add Push
/Pop
methods and slightly modify the implementation, then it can serve as a PriorityQueue
, which allows you to add elements(any object that implements IComparable
) in random order, and the higher/lower ones will get to the top/bottom of the queue. You can go even further by designing another method - Update(...)
, which can modify the priority of an element (technically you could implement it by removing the old value from queue and then adding the updated value back to the queue).
Didn't validate your code correctness, but you can/should validate it by testing on some regular and corner cases etc.
Hamo Abajyan's answer is good. I would add a few more thoughts.
heapSize
-- is fine, but my first thought was "what other kind of size is going to be stored in here?" I would have just named itsize
and hoped that it would be clear from context that thesize
of a heap is the size of the heap.As Hamo says, you could be more generic in the element type of the container; anything that can be compared to itself is sufficient. Similarly you could be more generic in the container kind itself. That is, Hamo suggests that instead of taking
int[]
, you takeT[] where T : IComparable<T>
. (Or, alternatively, pass in a comparator that can compare two Ts.) I suggest that you go one step further and take anIList<T>
. That way you can heapsort any mutable indexible collection, like aList<double>
, for instance.largestIndex
,leftIndex
,rightIndex
, eventemp
-- awesome. Names describe the purpose of the variable. What isi
? I have no idea unless I already know the algorithm. Make that more clear.Add preconditions and postconditions. The methods should start and end with
Debug.Assert
s. The ones at the start are the preconditions: the things that you know to be true before the method starts. The ones at the end are the postconditions: the thing that the method guarantees to be true when you're done. You believe, for example, that your sort method produces a sequence where every element is less than or equal to the one that comes after it. OK, then assert that. If the assertion is violated then you've automatically found a bug.