(re-posting from StackOverflow as this is more of a review)
In the past, I've often encountered the following problem and so far my go-to solution was to combine a counter with a hashtable (with open-addressing).
The only operations I need to implement are the following:
- next: generates a new integer or re-uses a previously freed one
- free: frees a previously generated integer
- size: returns the number of generated integers (excluding those who have been freed and not yet re-used)
Here's some working code that I use very often when this case pops up.
class IdGenerator
{
typedef std::uint64_t Id;
public:
IdGenerator():
total_ids_(0)
{
}
Id next()
{
Id id;
if (available_ids_.empty())
{
id = total_ids_;
++total_ids_;
}
else
{
auto it = available_ids_.begin();
id = *it;
available_ids_.erase(it);
}
return id;
}
void free(Id id)
{
available_ids_.insert(id);
}
Id size() const
{
return total_ids_ - available_ids_.size();
}
private:
std::unordered_set<Id> available_ids_; // usually, I use open-adressing here
Id total_ids_;
};
While this code generally brings good performance, I've always wondered if there was a faster way of doing this.
2 Answers 2
The only improvement I can see here is pretty marginal:
unordered containers have an average insert complexity of O(1), but that can become O(N) in degenerate cases (because of hash collision handling). That's pretty rare, but enough in my book to not use the datastructure if a simpler alternative would work just as well.
In your case, std::unordered_set
doesn't provide anything you wouldn't get from a simple std::stack
or std::queue
anyways. So I would personally switch to one of these.
That being said, depending on the context behind how and why you use such an index generator, there are sometimes better ways to do this.
For example, if you are using this logic in order to manage a small block allocator, you can store the free list inside the unalocated memory. see example
-
\$\begingroup\$ Come to think of it, I don't really care about managing double-frees, that is a nice improvement on my version. I am indeed writing a small block allocator here. Thanks for the info! \$\endgroup\$Jean-Marie Comets– Jean-Marie Comets2017年09月22日 07:07:59 +00:00Commented Sep 22, 2017 at 7:07
-
\$\begingroup\$ @Jean-MarieComets I suspected as much, that's the most common place this pattern pops up. \$\endgroup\$user128454– user1284542017年09月22日 16:09:27 +00:00Commented Sep 22, 2017 at 16:09
I would do this entirely differently.
class IdGenerator {
unsigned long long counter = 0;
unsigned long long freed = 0;
public:
unsigned long long next() { return ++counter; }
void free() { ++freed; }
unsigned long long size() { return counter - freed; }
};
Since we don't need to reuse a previous ID, this doesn't--ever. Nor does it attempt to keep track of which IDs have been freed. It just uses a big enough number (at least 64 bits) that it'll never run out of new numbers to use, so every time you allocate an ID, it generates an entirely new one.
Storage is kept to a minimum (128 bits) and every operation is so trivial that it's guaranteed to be extremely fast.
In case you're worried about a 64-bit number not being big enough: let's assume you have a 5 GHz computer, and that each allocation takes only a single clock cycle. Assuming you do nothing but allocate new numbers as fast as possible (i.e., use them up at a rate of 5 billion per second), it still takes more than a century to use them up.
-
\$\begingroup\$ I'd definitely go with that if I didn't need to re-use previously generated IDs, which is a big requirement of the generator. \$\endgroup\$Jean-Marie Comets– Jean-Marie Comets2017年10月09日 09:39:37 +00:00Commented Oct 9, 2017 at 9:39
std::vector<>
\$\endgroup\$