Does this look good for combining/retrieving two 32 bit integers (a type and an index) into/from an unsigned 64 bit integer (a unique ID)?
The point is to hide in the API the composition of the unique ID and allow its calculation to change later without changing the API (e.g. possibility to change later to consecutive 64 bit ids without changing API).
#include <limits>
#include <stdint.h> // uint32_t, uint64_t
uint64_t combine(uint32_t low, uint32_t high)
{
return (((uint64_t) high) << 32) | ((uint64_t) low);
}
uint32_t high(uint64_t combined)
{
return combined >> 32;
}
uint32_t low(uint64_t combined)
{
uint64_t mask = std::numeric_limits<uint32_t>::max();
return mask & combined; // should I just do "return combined;" which gives same result?
}
Or would a union approach like below be better? Is this union guaranteed to fit into 64 bits (e.g. guarantee no padding in the struct)?
union Id
{
struct
{
uint32_t index; // lower 32 bits
uint32_t type; // upper 32 bits
} split;
uint64_t unique_id;
};
Here is some code I used to test:
#include <iostream>
#include <assert.h>
#include <sstream>
#include <string>
template <typename T>
std::string bits(T num)
{
const int num_bits = sizeof(num) * 8;
T maxPow = T(1) << (num_bits - 1);
std::stringstream ss;
for(int i=0; i < num_bits; ++i)
{
// print the most significant bit, then shift left.
ss << (num & maxPow ? 1 : 0);
if ((i+1) % 8 == 0) ss << " ";
num = num << 1;
}
return ss.str();
}
void test1()
{
{
int in_low = -3;
int in_high = 99;
uint64_t combined = combine(in_low, in_high);
assert( bits(in_high) + bits(in_low) == bits(combined) );
assert( in_low == low(combined) );
assert( in_high == high(combined) );
}
{
uint32_t in_low = 3;
int in_high = -99;
uint64_t combined = combine(in_low, in_high);
assert( bits(in_high) + bits(in_low) == bits(combined) );
assert( in_low == low(combined) );
assert( in_high == high(combined) );
}
{
uint32_t in_low = std::numeric_limits<uint32_t>::max();
int in_high = std::numeric_limits<int32_t>::min();
uint64_t combined = combine(in_low, in_high);
assert( bits(in_high) + bits(in_low) == bits(combined) );
assert( in_low == low(combined) );
assert( in_high == high(combined) );
}
}
void test2() {
Id in; // would "Id in = {-3, 99};" be better? Is this legal C++03? Would a constructor be better?
in.split.type = -3;
in.split.index = 99;
std::cout << in.unique_id << std::endl; // prints 18446744060824649827
std::cout << bits(in.split.type) << bits(in.split.index) << std::endl;
std::cout << bits(in.unique_id) << std::endl; // prints 11111111 11111111 11111111 11111101 00000000 00000000 00000000 01100011
assert(bits(in.split.type) + bits(in.split.index) == bits(in.unique_id));
Id out;
out.unique_id = in.unique_id;
std::cout << int(out.split.type) << ", " << out.split.index << std::endl; // prints -3, 99
assert(in.split.type == out.split.type);
assert(in.split.index == out.split.index);
}
A union provides no guarantee of memory layout for this. The C++ standard merely says that a union is large enough to hold its largest member, and that only one data member is active at a time.
If you are only targeting one compiler and one architecture it may be okay to use it to translate between types, but it is definitely skiing off-piste.
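If you do stay with the union on a single known compiler, a compile-time size check at least catches unexpected padding. This is only a sketch and assumes a C++11 compiler for static_assert (C++03 has no built-in equivalent). It verifies the size only. It does not say which half lands in which member, and it does not make reading the inactive member well-defined.

```cpp
#include <stdint.h>

// Same layout as the union in the question.
union Id
{
    struct
    {
        uint32_t index;
        uint32_t type;
    } split;
    uint64_t unique_id;
};

// Compilation fails if the compiler added padding to the struct or the union.
// Assumption: C++11 or later, since static_assert is not available in C++03.
static_assert(sizeof(Id) == sizeof(uint64_t), "Id does not fit into 64 bits");
```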
I'd be tempted to wrap the whole thing up in a class and provide getter methods. This insulates the user from changes in the implementation of the API in the most flexible way. Inside the class, use the bitwise operations (>>, <<, | and &). These operate on the integer's value rather than on its byte layout in memory, and therefore do not suffer from endianness issues.
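As a rough sketch of what such a wrapper might look like: the class name UniqueId and the methods type(), index() and raw() are hypothetical names chosen for illustration, not anything from the question. The 64-bit value stays private, so the encoding can change later without touching callers.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical wrapper; the class and method names are illustrative only.
class UniqueId
{
public:
    // Packs type into the upper 32 bits and index into the lower 32 bits.
    UniqueId(uint32_t type, uint32_t index)
        : value_((static_cast<uint64_t>(type) << 32) | index)
    {
    }

    // Upper 32 bits.
    uint32_t type() const { return static_cast<uint32_t>(value_ >> 32); }

    // Lower 32 bits; converting to uint32_t keeps the value modulo 2^32.
    uint32_t index() const { return static_cast<uint32_t>(value_); }

    // Opaque 64-bit value, e.g. for use as a map key.
    uint64_t raw() const { return value_; }

private:
    uint64_t value_;
};

// Round-trip check using the same values as test2 in the question.
inline void roundTripExample()
{
    UniqueId id(static_cast<uint32_t>(-3), 99u);
    assert(id.type() == 0xFFFFFFFDu);
    assert(id.index() == 99u);
    assert(id.raw() == 0xFFFFFFFD00000063ull);
}
```

Because index() relies on the unsigned conversion rule, no explicit mask is needed; the mask in the question's low() is harmless but redundant.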
If you are not going to wrap it in a class, I would suggest bit shifting. It is more portable.
Using a union is hard to make correct and portable, and it has no benefit over the bit-shift method. There is a #pragma pack directive for dictating byte alignment, but it is not a standard directive. You also need to deal with endianness, because the byte positions of the low and high 32 bits are swapped between little-endian and big-endian platforms.
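To make the endianness point concrete, here is a small sketch. The helper names lowHalfStoredFirst, lowByShift and highByShift are made up for this example. The first function inspects the in-memory layout with memcpy, and its result depends on the platform; the shift-based functions return the same halves everywhere.

```cpp
#include <cassert>
#include <cstring>
#include <stdint.h>

// Hypothetical helper: reports whether the low 32 bits of a uint64_t come
// first in memory. True on little-endian targets, false on big-endian ones.
inline bool lowHalfStoredFirst()
{
    const uint64_t value = (static_cast<uint64_t>(0xAAAAAAAAu) << 32) | 0x55555555u;
    uint32_t halves[2];
    assert(sizeof halves == sizeof value);
    std::memcpy(halves, &value, sizeof value); // copy the raw bytes
    return halves[0] == 0x55555555u;
}

// Shift-based extraction works on the value, so byte order does not matter.
inline uint32_t lowByShift(uint64_t v) { return static_cast<uint32_t>(v); }
inline uint32_t highByShift(uint64_t v) { return static_cast<uint32_t>(v >> 32); }
```

A union with index declared first matches the "lower 32 bits" comment only on platforms where lowHalfStoredFirst() returns true.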
… uint64_t? Why not use a union and struct, for example?
… union instead of casting.
… combined like this: cout << combined << end in function test1 you'll see the difference at different platforms, however you can see the difference in test2 when you printed unique_id