Generating and re-using unique, sequential, positive integers

Question 1

(re-posting from StackOverflow as this is more of a review)

In the past, I've often encountered the following problem, and so far my go-to solution has been to combine a counter with a hash table (with open addressing).

The only operations I need to implement are the following:

next: generates a new integer or re-uses a previously freed one
free: frees a previously generated integer
size: returns the number of generated integers (excluding those that have been freed and not yet re-used)

Here's some working code that I use very often when this case pops up.

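#include <cstdint>
#include <unordered_set>

class IdGenerator
{
    typedef std::uint64_t Id;
public:
    IdGenerator():
        total_ids_(0)
    {
    }

    // Returns a previously freed id if one is available, otherwise a new one.
    Id next()
    {
        Id id;
        if (available_ids_.empty())
        {
            id = total_ids_;
            ++total_ids_;
        }
        else
        {
            auto it = available_ids_.begin();
            id = *it;
            available_ids_.erase(it);
        }
        return id;
    }

    // Makes id available for re-use by a later call to next().
    void free(Id id)
    {
        available_ids_.insert(id);
    }

    // Number of ids currently in use (generated and not freed).
    Id size() const
    {
        return total_ids_ - available_ids_.size();
    }

private:
    std::unordered_set<Id> available_ids_; // usually, I use open addressing here
    Id total_ids_;
};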

While this code generally performs well, I've always wondered whether there is a faster way of doing this.

Question 2

Do you need the set to protect against double frees? What about protection against freeing unallocated ids? If you need neither, I think the same could be done with std::forward_list, which guarantees O(1) insertion and removal of a single element, but you would have to count the free elements yourself...
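A minimal sketch of what that suggestion might look like, assuming no protection against double frees is wanted. The class name ForwardListIdGenerator and the member free_count_ are made up for illustration; they do not appear in the original code.

```cpp
#include <cassert>
#include <cstdint>
#include <forward_list>

// Sketch only: nothing here detects double frees.
class ForwardListIdGenerator
{
public:
    using Id = std::uint64_t;

    Id next()
    {
        if (free_ids_.empty())
            return total_ids_++;      // nothing to re-use: hand out a fresh id
        Id id = free_ids_.front();    // re-use the most recently freed id
        free_ids_.pop_front();        // O(1)
        --free_count_;
        return id;
    }

    void free(Id id)
    {
        assert(id < total_ids_);      // cheap sanity check, debug builds only
        free_ids_.push_front(id);     // O(1), but allocates one node per call
        ++free_count_;
    }

    Id size() const { return total_ids_ - free_count_; }

private:
    std::forward_list<Id> free_ids_;
    Id free_count_ = 0;   // std::forward_list has no size(), so count by hand
    Id total_ids_ = 0;
};
```

Each free() still allocates a list node, so this is not obviously faster than a contiguous container in practice; it only removes the hashing.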

Question 3

Thanks for the tip. To answer your questions: 1) I do not care about managing double frees, and 2) I do not care about the order of re-use, but counting is important, so a std::stack/std::vector would work just fine.

Question 4

Note that a vector reallocates (moving all existing elements to the new location) each time its capacity is exceeded, which makes the worst case of a single insertion O(n).

Question 5

@ArtemyVysotsky O(n) might be the worst single-alloc case, but overall performance will be guaranteed to be an amortized O(1), even with std::vector<>
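One way to see the amortized behaviour is to count how often the capacity actually changes while appending. The helper below, count_reallocations, is a hypothetical name used only for this illustration; the exact growth factor is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times a std::vector<int> changes capacity
// while n elements are appended one by one.
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity)   // storage was reallocated
        {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

With the geometric growth used by common standard library implementations, a million appends typically trigger only a few dozen reallocations, which is why the O(n) cost of each one averages out to O(1) per push_back.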

Question 6

The only improvement I can see here is pretty marginal:

Unordered containers have an average insert complexity of O(1), but that can become O(N) in degenerate cases (because of hash collision handling). That's pretty rare, but enough in my book to avoid the data structure if a simpler alternative would work just as well.

In your case, std::unordered_set doesn't provide anything you wouldn't get from a simple std::stack or std::queue anyways. So I would personally switch to one of these.
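As a rough sketch of that switch, assuming double frees do not need to be detected, the generator could look something like this. The name StackIdGenerator is invented here, and backing the stack with std::vector instead of the default std::deque is a choice made for this example only.

```cpp
#include <cassert>
#include <cstdint>
#include <stack>
#include <vector>

// Sketch only: same interface as the original IdGenerator,
// with the hash set replaced by a LIFO stack of freed ids.
class StackIdGenerator
{
public:
    using Id = std::uint64_t;

    Id next()
    {
        if (free_ids_.empty())
            return total_ids_++;     // no freed id available: generate a new one
        Id id = free_ids_.top();     // most recently freed id is re-used first
        free_ids_.pop();
        return id;
    }

    void free(Id id)
    {
        assert(id < total_ids_);     // sanity check only; double frees go unnoticed
        free_ids_.push(id);          // amortized O(1)
    }

    Id size() const { return total_ids_ - free_ids_.size(); }

private:
    std::stack<Id, std::vector<Id>> free_ids_;
    Id total_ids_ = 0;
};
```

Unlike the set-based version, freeing the same id twice here would hand it out twice and make size() undercount, so this only fits callers that never double free.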

That being said, depending on the context behind how and why you use such an index generator, there are sometimes better ways to do this.

For example, if you are using this logic to manage a small block allocator, you can store the free list inside the unallocated memory. See example.
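To illustrate the idea of keeping the free list inside the unallocated blocks themselves, here is a minimal sketch of a fixed-size block pool. It is not the example the answer linked to; BlockPool, FreeNode and all member names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical fixed-size block pool. Each free block stores the pointer
// to the next free block in its own bytes, so no side container is needed.
class BlockPool
{
    struct FreeNode { FreeNode* next; };

public:
    BlockPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(new unsigned char[block_size_ * block_count])
    {
        // Thread every block onto the free list; block 0 ends up at the head.
        for (std::size_t i = block_count; i-- > 0;)
            push(storage_.get() + i * block_size_);
    }

    BlockPool(const BlockPool&) = delete;
    BlockPool& operator=(const BlockPool&) = delete;

    // Returns a free block, or nullptr when the pool is exhausted.
    void* allocate()
    {
        if (head_ == nullptr)
            return nullptr;
        FreeNode* node = head_;
        head_ = node->next;
        --free_count_;
        return node;
    }

    // Returns a block to the pool; the caller must not free a block twice.
    void deallocate(void* block)
    {
        assert(block != nullptr);
        push(block);
    }

    std::size_t free_blocks() const { return free_count_; }

private:
    // Blocks must be able to hold a FreeNode and keep it properly aligned.
    static std::size_t round_up(std::size_t n)
    {
        const std::size_t unit = sizeof(FreeNode);
        if (n < unit)
            n = unit;
        return (n + unit - 1) / unit * unit;
    }

    void push(void* block)
    {
        head_ = ::new (block) FreeNode{head_};  // link lives inside the free block
        ++free_count_;
    }

    std::size_t block_size_;
    std::unique_ptr<unsigned char[]> storage_;
    FreeNode* head_ = nullptr;
    std::size_t free_count_ = 0;
};
```

The pointer to a block plays the role of the id, and the only bookkeeping left is the list head and a counter.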

Question 7

Come to think of it, I don't really care about managing double-frees, that is a nice improvement on my version. I am indeed writing a small block allocator here. Thanks for the info!

Question 8

@Jean-MarieComets I suspected as much, that's the most common place this pattern pops up.

Question 9

I would do this entirely differently.

class IdGenerator {
    unsigned long long counter = 0; // ids handed out so far
    unsigned long long freed = 0;   // ids released so far
public:
    unsigned long long next() { return ++counter; }
    void free() { ++freed; }
    unsigned long long size() const { return counter - freed; }
};

Since we don't need to reuse a previous ID, this doesn't--ever. Nor does it attempt to keep track of which IDs have been freed. It just uses a big enough number (at least 64 bits) that it'll never run out of new numbers to use, so every time you allocate an ID, it generates an entirely new one.

Storage is kept to a minimum (128 bits) and every operation is so trivial that it's guaranteed to be extremely fast.

In case you're worried about a 64-bit number not being big enough: let's assume you have a 5 GHz computer, and that each allocation takes only a single clock cycle. Assuming you do nothing but allocate new numbers as fast as possible (i.e., use them up at a rate of 5 billion per second), it still takes more than a century to use them up.
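Spelled out, under the same assumptions (one new ID per cycle at 5 GHz, roughly 3.156 × 10^7 seconds per year), the estimate works out as:

```latex
\[
\frac{2^{64}\ \text{ids}}{5\times10^{9}\ \text{ids/s}}
\approx \frac{1.845\times10^{19}}{5\times10^{9}}\ \text{s}
\approx 3.69\times10^{9}\ \text{s}
\approx \frac{3.69\times10^{9}\ \text{s}}{3.156\times10^{7}\ \text{s/year}}
\approx 117\ \text{years}
\]
```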

Question 10

I'd definitely go with that if I didn't need to re-use previously generated IDs, which is a big requirement of the generator.

user128454 · Accepted Answer · 2017-09-21 23:39:51Z
