This is the function I've implemented in an answer to this question. I've tried to make this as simple and idiomatic as possible, using C++11. Could this still be improved in any way?
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <iterator>
#include <vector>
typedef std::vector<int> SplitValues;
SplitValues splitter(std::int32_t number)
{
SplitValues values(8);
for (auto& iter : values)
{
iter = static_cast<int>(number & 0xF);
number >>= 4;
}
return values;
}
int main()
{
std::int32_t number = 432214123;
SplitValues values = splitter(number);
std::reverse_copy(values.begin(), values.end(),
std::ostream_iterator<int>(std::cout, " "));
}
Questions:
- Is
SplitValues
an appropriate name for atypedef
, or does it sound more like a variable? - If I pass the hard-coded number instead of the
number
variable, will it still be treated as the same type as the parameter?
2 Answers 2
The first problem I see is that you're doing a right shift on a signed integer type (std::int32_t
).
If a negative value is passed, the result will be implementation defined. When you're going to treat something as a collection of bits instead of treating the whole thing together as a number, you usually want to use an unsigned type, not signed.
I'd also prefer to get rid of the magic (and inter-related) numbers sprinkled throughout the code, such as the 8
, 4
and the 0xf
. These three are inter-related--the 8
being the number of 4-bit fields contained in a single 32-bit item, and 0xf
being a mask with 4 bits set. I'd prefer to specify only one of these values in one place, and compute the others from that one.
As to the specific questions you've asked:
I don't think
SplitValues
does a good job of indicating a type. This has been a problem for a long time, and I don't think I have a silver bullet to finally put it to rest. In this case, I'd rather just avoid it completely (about which, more below).If you pass a literal to a function instead of passing a variable, that literal will start with some type--in this case, since it's digits and can fit in an
int
, its type will beint
. That will then be converted to the function's parameter type (std::int32_t
). In this case, at least on a typical compiler whereint
is a 32-bit type, that "conversion" will really be a NOP because the source and target types are basically identical.
As to how to avoid the problem with having to define names, C++ has two mechanisms that can help quite a bit: templates (especially function templates) and auto
. Both of these can deduce a result type based on input types, to avoid having to explicitly specify types throughout much of the code.
I'd also generally prefer to write splitter
as a generic algorithm. While we're at it, its name should be changed from a noun to a verb. Non-functor classes should have nouns for names, but functors and functions do things, so their names should normally be verbs.
template <classs T, class OutIt>
void split(T number, OutIt result) {
unsigned bits = 4;
unsigned mask = (1<<bits) - 1;
unsigned iter_count = sizeof(T) * CHAR_BIT / bits;
for (unsigned i=0; i<iter_count; i++) {
*result++ = number & mask;
number >>= bits;
}
}
This part is a little longer than the code in your answer, but most of that extra length stems from specifying the number of bits, then computing the mask and iteration count from that shift value instead of having interrelated magic numbers throughout.
With this code, we can change the type of the input (e.g., to a 64-bit unsigned long long) or the shift count (e.g., to 8 bits) independently of each other, without having to patch the rest of the code to compensate.
Although it's a rather minor point, all else being equal, I prefer to avoid creating variables with default values, then filling in the real values later. This is fairly innocuous when the values involved are at least initialized as they are in a vector, but it's still better to just initialize values when possible.
That leads us to the calling code. Since we changed the function to look like a standard algorithm, we need to change the calling code to suit.
int main() {
std::deque<int> values;
split(432214123, std::front_inserter(values));
std::copy(values.begin(), values.end(), std::ostream_iterator<int>(std::cout, " "));
}
Of course, there are a lot of other ways this job could be handled, but I guess that's enough for the moment anyway.
-
\$\begingroup\$ So I should be using
std::uint32_t
here, then? \$\endgroup\$Jamal– Jamal2014年06月18日 04:47:12 +00:00Commented Jun 18, 2014 at 4:47 -
\$\begingroup\$ @Jamal: Yes (probably anyway). \$\endgroup\$Jerry Coffin– Jerry Coffin2014年06月18日 04:47:48 +00:00Commented Jun 18, 2014 at 4:47
-
\$\begingroup\$ Should I still cast to an
int
, or should it be a signed type from<cstdint>
? \$\endgroup\$Jamal– Jamal2014年06月18日 04:49:55 +00:00Commented Jun 18, 2014 at 4:49 -
\$\begingroup\$ @Jamal: Doesn't really matter--you're grabbing only 4 bits at a time, and you're guaranteed that every standard integer type can represent at least 7 bits worth of non-negative numbers. \$\endgroup\$Jerry Coffin– Jerry Coffin2014年06月18日 04:52:34 +00:00Commented Jun 18, 2014 at 4:52
-
\$\begingroup\$ @Jamal - no, you should not use
std::uint32_t
here. Useunsigned long
, which will exist on all systems and will always have at least 32 bits. There is no need to restrict your code to systems that have a built-in 32-bit integral type. \$\endgroup\$Pete Becker– Pete Becker2014年06月18日 11:53:23 +00:00Commented Jun 18, 2014 at 11:53
I don't think that splitting into a vector is a right way to go. I'd rather opt for an operator[]
, along the lines of
struct Splitter {
unsigned long value;
Splitter(unsigned long value): value(value) {}
const unsigned long operator[](int i) const {
return (value >> (i*4)) & 0x0f;
}
};
with all necessary bells and whistles added upon request.