Operations on a heap

Question 1

Introduction

I started going through some basic algorithms as I'll have algorithms course next semester. The last time I've written any heap operations is around 5 years ago. This time around it took me an hour to debug the case where the right child wouldn't exist, but left child did. Otherwise it was straightforward.

The code below implements four basic operations on a heap:

Push (sift down)
Pop (swap with the last element and then perform push on truncated heap)
Build heap (push one by one starting from the middle, achieving linear complexity)
Sort heap (continuosly pop elements and truncate the heap, pushing greater elements to the end of the underlying container)

The code works on iterator ranges, requires RandomAccessIterator, Swappable value types of the iterator type.

Code

#include <vector>
#include <iostream>
#include <random>
#include <unordered_set>
#include <numeric>
#include <algorithm>
std::vector<int> generate_vector(std::size_t size)
{
 if (size == 0)
 return {};
 if (size == 1)
 return {0};
 static std::mt19937 twister{};
 std::vector<int> numbers(size);
 std::iota(numbers.begin(), numbers.end(), 0);
 std::shuffle(numbers.begin(), numbers.end(), twister);
 return numbers;
}
#include <iterator>
namespace shino
{
 template <typename RandomAccessIterator>
 void push_heap(RandomAccessIterator first,
 RandomAccessIterator last,
 RandomAccessIterator element)
 {
 auto dist = std::distance(first, element) + 1;
 auto left_child = first + dist * 2 - 1;
 if (left_child >= last)
 return;
 auto right_child = first + dist * 2;
 if (right_child >= last)
 {
 if (*element < *left_child)
 {
 std::iter_swap(element, left_child);
 push_heap(first, last, left_child);
 }
 return;
 }
 if (*element >= *left_child and *element >= *right_child)
 return;
 auto next_location =
 (*left_child >= *right_child) ? left_child : right_child;
 std::iter_swap(next_location, element);
 shino::push_heap(first, last, next_location);
 }
 template <typename RandomAccessIterator>
 void build_heap(RandomAccessIterator first, RandomAccessIterator last)
 {
 for (auto middle = first + std::distance(first, last) / 2;
 middle != first; --middle)
 {
 shino::push_heap(first, last, middle);
 }
 shino::push_heap(first, last, first);
 }
 template <typename RandomAccessIterator>
 void pop_heap(RandomAccessIterator first, RandomAccessIterator last)
 {
 if (std::distance(first, last) < 2)
 return;
 auto new_last = std::prev(last);
 std::iter_swap(first, new_last);
 shino::push_heap(first, new_last, first);
 }
 template <typename RandomAccessIterator>
 void sort_heap(RandomAccessIterator first, RandomAccessIterator last)
 {
 auto current_last = last;
 while (first != current_last)
 {
 shino::pop_heap(first, current_last);
 --current_last;
 }
 }
}
#include <algorithm>
int main()
{
 for (std::size_t i = 0; i <= 10'000; ++i)
 {
 std::vector<int> v(generate_vector(i));
 std::cout << "heapifying vector of size " << i << '\n';
 shino::build_heap(v.begin(), v.end());
 if (not std::is_heap(v.begin(), v.end()))
 std::cerr << "incorrect heapifying on size " << i << '\n';
 shino::sort_heap(v.begin(), v.end());
 if (not std::is_sorted(v.begin(), v.end()))
 std::cerr << "incorrect heap sorting on size " << i << '\n';
 }
}

Concerns

push_heap looks very ugly

The control flow is so obfuscated and hard to follow. I couldn't make it better though.
Indexing looks ugly too

One of the weakest point of iterators: algorithms where indexing is important. Is there a way to make it better?
Anything else

Question 2

Did you spot a bug? I had the feeling that something is wrong, but tests were silent. If there is something wrong with the post, it would be great to get a feedback so I could fix it.

Question 3

What class are you taking?

Question 4

@JDługosz, it’s in my university. I can’t recall the exact name, but it is something similar to intro to algorithms

Question 5

What level? If you already studied them 5 years ago, and you can write that now, you hardly need an "intro"!

Question 6

To whoever is voting to close this question: please clarify what should be improved.

Question 7

It looks like you know what you’re doing.

One thing I spotted: Don’t write std::iter_swap quallified like that. You have to use ADL to pick up the right version provided for the template arguments. You need the "two step":

using std::iter_swap;
iter_swap (a,b);

just like with regular swap.

You don’t need to qualify your own names when you are in that namespace. Or is that intentional to guard against ADL? I never see that technique used.

Check out Catch2 for beefing up your main into a comprehensive unit test.

generate_vector is missing out on NVRO due to your precondition tests. Declare numbers at the top; always return that.

Question 8

Qualified names and some other weird things are artifacts from multiple improvement iterations (at the start it wasn’t even a template and it lacked namespace). Thanks for Catch2, I’ve heard positive things about it, except compilation times.

Question 9

Putting CATCH_CONFIG_MAIN in a separate CPP file that never changes makes the compilation time OK. My current project takes 1.4 seconds for optimized build.

Question 10

that looks nice. I’ll give it a shot tomorrow.

Question 11

I've never seen iter_swap used as a customization point, so qualifying it looks perfectly reasonable to me.

Question 12

@Rakete1111 I wonder why they would bother with defining a function where using it is longer than not using it (swap(*a,*b) vs iter_swap(a,b)) unless it was there to abstract something, like a special optimal way of swapping when using a specific kind of iterator. What else would it be for?

Question 13

List all your #include directives at the top of the file. Then you'd see that you #include <algorithm> twice.

Question 14

Does it effect compile time too much? I mean, I know the practice has always been redundant.

Question 15

@AniketChowdhury It will have a very small (possibly minimal) effect on compile time. The exact amount will depend on how that header is implemented.

Question 16

@1201ProgramAlarm, I believe that is an artifact from a debugging session. I spent considerable time on finding a bug in the shino::push_heap().

JDługosz JDługosz 11.7k19 silver badges40 bronze badges · Accepted Answer · 2018-05-25 22:02:21Z

5

\$\begingroup\$

It looks like you know what you’re doing.

One thing I spotted: Don’t write std::iter_swap quallified like that. You have to use ADL to pick up the right version provided for the template arguments. You need the "two step":

using std::iter_swap;
iter_swap (a,b);

just like with regular swap.

You don’t need to qualify your own names when you are in that namespace. Or is that intentional to guard against ADL? I never see that technique used.

Check out Catch2 for beefing up your main into a comprehensive unit test.

generate_vector is missing out on NVRO due to your precondition tests. Declare numbers at the top; always return that.

Share

edited May 25, 2018 at 22:06

answered May 25, 2018 at 22:02

JDługosz's user avatar

JDługosz JDługosz

11.7k19 silver badges40 bronze badges

\$\endgroup\$

9

\$\begingroup\$ Qualified names and some other weird things are artifacts from multiple improvement iterations (at the start it wasn’t even a template and it lacked namespace). Thanks for Catch2, I’ve heard positive things about it, except compilation times. \$\endgroup\$

Incomputable
– Incomputable

2018年05月25日 22:05:00 +00:00
Commented May 25, 2018 at 22:05
\$\begingroup\$ Putting CATCH_CONFIG_MAIN in a separate CPP file that never changes makes the compilation time OK. My current project takes 1.4 seconds for optimized build. \$\endgroup\$

JDługosz
– JDługosz

2018年05月25日 22:08:42 +00:00
Commented May 25, 2018 at 22:08
\$\begingroup\$ that looks nice. I’ll give it a shot tomorrow. \$\endgroup\$

Incomputable
– Incomputable

2018年05月25日 22:10:07 +00:00
Commented May 25, 2018 at 22:10
\$\begingroup\$ I've never seen iter_swap used as a customization point, so qualifying it looks perfectly reasonable to me. \$\endgroup\$

Rakete1111
– Rakete1111

2018年05月30日 06:59:46 +00:00
Commented May 30, 2018 at 6:59
\$\begingroup\$ @Rakete1111 I wonder why they would bother with defining a function where using it is longer than not using it (swap(*a,*b) vs iter_swap(a,b)) unless it was there to abstract something, like a special optimal way of swapping when using a specific kind of iterator. What else would it be for? \$\endgroup\$

JDługosz
– JDługosz

2018年05月30日 20:57:56 +00:00
Commented May 30, 2018 at 20:57

| Show 4 more comments

Stack Exchange Network

Operations on a heap

Introduction

Code

Concerns

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Operations on a heap

Introduction

Code

Concerns

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions