Optimization and Methods for Reversing Nibbles of a Byte

Question 1

I'm currently studying encryption and was coming up with ways of reversing the nibbles of a byte (e.g., 0xF5 => 0x5F). I came up with this solution:

byte >> 4 | (byte & 0x0F) << 4

Other solutions I found online were similar, but added one extra operation by masking the left nibble of the byte:

(byte & 0xF0) >> 4 | (byte & 0x0F) << 4

In binary, the second solution looks like this:

# Extract the right nibble of a byte and shift to the left
[0xF5] 1111 0101 # Original Value
[0x0F] 0000 1111 & # Mask Right Nibble
[0x05] 0000 0101 = # Extracted Right Nibble
[0x50] 0101 0000 << 4 # Shift Four Bits to the Left
# Extract the left nibble of a byte and shift to the right
[0xF5] 1111 0101 # Original Value
[0xF0] 1111 0000 & # Mask Left Nibble
[0xF0] 1111 0000 = # Extracted Left Nibble
[0x0F] 0000 1111 >> 4 # Shift Four Bits to the Right
# Combine the shifted nibbles together
[0x0F] 0000 1111 # Left Nibble Shifted to the Right
[0x50] 0101 0000 | # Right Nibble Shifted to the Left
[0x5F] 0101 1111 = # New Value

Correct me if I'm wrong, but the second solution is useful if you're dealing with a data type larger than one byte and only care about the least significant byte. If you shift without masking, bits from the higher-order bytes propagate into the left nibble of the result, so masking the left nibble is necessary in that scenario.

# Bits shift into the least significant byte without masking
[0x0AF5] 0000 1010 1111 0101
[0x00AF] 0000 0000 1010 1111 >> 4
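
The leak can be reproduced with a minimal C sketch, assuming the value is held in a uint16_t and only the low byte matters. The helper names swap_low_unmasked and swap_low_masked are hypothetical, chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: swaps the nibbles of the low byte WITHOUT masking
   the left nibble. The right shift drags bits of the high byte down. */
static uint16_t swap_low_unmasked(uint16_t word) {
    return (uint16_t)(word >> 4 | (word & 0x0F) << 4);
}

/* Hypothetical helper: masks both nibbles first, so only the low byte
   contributes to the result. */
static uint16_t swap_low_masked(uint16_t word) {
    return (uint16_t)((word & 0xF0) >> 4 | (word & 0x0F) << 4);
}
```

For the input 0x0AF5, the unmasked version returns 0x00FF (the 0xA from the high byte has landed in the left nibble), while the masked version returns 0x005F.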

On the other hand, if you're truly working with one byte, isn't masking the left nibble redundant as the left nibble bits will get zeroed out after right-shifting four bits?

[0xF5] 1111 0101
[0x0F] 0000 1111 >> 4
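
One way to check the redundancy claim for a true single byte is to compare both expressions over all 256 inputs. This is only a sketch, assuming the operand is a uint8_t; count_mismatches is a hypothetical name.

```c
#include <stdint.h>

/* Counts the inputs for which the masked and unmasked expressions differ
   when the operand is a uint8_t. Returns 0 if they always agree. */
static int count_mismatches(void) {
    int mismatches = 0;
    for (int i = 0; i <= 0xFF; i++) {
        uint8_t b = (uint8_t)i;
        uint8_t unmasked = (uint8_t)(b >> 4 | (b & 0x0F) << 4);
        uint8_t masked = (uint8_t)((b & 0xF0) >> 4 | (b & 0x0F) << 4);
        if (unmasked != masked) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Because a uint8_t is promoted to a non-negative int before the shift, the function should return 0, which supports the idea that the extra mask changes nothing for an unsigned 8-bit operand.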

Maybe there are other reasons to mask the left nibble bits that I'm unaware of. For example, other systems might behave differently where the second solution is necessary.

Do I have my head in the right place or is there anything I should consider?

Here is some example code for further clarification:

#include <stdint.h>

typedef uint8_t byte;

/* Swap the high and low nibbles of an 8-bit value. */
static inline byte swap_nibbles(byte bits) {
    return bits >> 4 | (bits & 0x0F) << 4;
}

Question 2

It depends on exactly what C type you mean when you say "byte". For instance, if you mean char, that is a signed type on some platforms, and promotion to int will sign-extend which would mess up some cases. If you mean unsigned char or uint8_t then everything you say should be correct.
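
If the data has to arrive as plain char, one possible way to sidestep the signedness question is to convert to unsigned char before shifting. The sketch below assumes an 8-bit char; both helper names are hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: reports whether plain char is signed on this
   implementation. */
static int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Hypothetical helper: converts to unsigned char before any shifting,
   so integer promotion never sees a negative value. */
static unsigned char swap_nibbles_char(char c) {
    unsigned char u = (unsigned char)c;
    return (unsigned char)(u >> 4 | (u & 0x0F) << 4);
}
```

With this conversion in place, the result should not depend on what plain_char_is_signed() reports.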

Question 3

@NateEldredge: You should promote that to an answer before I do and steal your reputation.

Question 4

@Legended Sorry, meant to ask more like: Which do you find easier to understand: (byte & 0xF0) >> 4 | (byte & 0x0F) << 4 or byte >> 4 | (byte & 0x0F) << 4? (IAC, same object name).

Question 5

byte >> 4 | byte << 4 is great if 1) byte is a uint8_t and 2) the result, which is an int, has only its uint8_t part used as in uint8_t swap_nibbles(uint8_t byte) { return byte >> 4 | byte << 4; }. Yet the goal is not code golf. (byte & 0xF0) >> 4 | (byte & 0x0F) << 4 is OK too.
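
A small sketch of why the second condition matters, assuming int is wider than 12 bits (true of any conforming C implementation). The function names are hypothetical.

```c
#include <stdint.h>

/* The operand is promoted to int, so byte << 4 can reach 0xFF0.
   Converting the result back to uint8_t discards everything above bit 7. */
static uint8_t swap_nibbles_short(uint8_t byte) {
    return (uint8_t)(byte >> 4 | byte << 4);
}

/* Same expression, but the int result is kept: the old left nibble
   survives in bits 8 to 11. */
static int swap_nibbles_as_int(uint8_t byte) {
    return byte >> 4 | byte << 4;
}
```

For 0xF5, the first function gives 0x5F, while the second gives 0xF5F, so the short form is only safe when the result is narrowed to 8 bits.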

Question 6

As far as optimization is concerned: in my tests, all (correct) versions of this code compiled into identical assembly.

Eric Postpischil · Accepted Answer · 2025-02-11 20:00:01Z

You do not show the definition of byte. If it is a signed eight-bit integer type, then this code:

signed char byte = -111; /* 0x91 */
printf("0x%hhX\n", byte);
byte = byte >> 4 | (byte & 0x0F) << 4;
printf("0x%hhX\n", byte);

prints, in many C implementations:

0x91
0xF9

Although byte initially contains the bits 91₁₆, in byte >> 4 it is promoted to int. Since those bits represent the value −111, this produces an int with that value, which is, with a four-byte int, FFFFFF91₁₆. Then >> 4 produces FFFFFFF9₁₆. (Right-shifting a negative value is implementation-defined.) ORing this with the 10₁₆ from the right side of the | produces FFFFFFF9₁₆, and then assigning that to byte converts it to signed char. That conversion is also implementation-defined, but the most common result is to wrap modulo 256, producing the bits F9₁₆, representing the value −7.

In contrast, using byte = (byte & 0xF0) >> 4 | (byte & 0x0F) << 4; prints:

0x91
0x19
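
The two expressions can be placed side by side as functions. This is a sketch, not a portable guarantee: it assumes a two's-complement implementation where >> on a negative int is an arithmetic shift and out-of-range conversion to signed char wraps, as GCC and Clang behave. The function names are hypothetical.

```c
/* Unmasked form: the promoted int is negative for inputs below zero,
   so the shift brings copies of the sign bit into the result. */
static signed char swap_unmasked(signed char byte) {
    return (signed char)(byte >> 4 | (byte & 0x0F) << 4);
}

/* Masked form: the & 0xF0 clears the sign-extended upper bits before
   the shift, so only the original eight bits take part. */
static signed char swap_masked(signed char byte) {
    return (signed char)((byte & 0xF0) >> 4 | (byte & 0x0F) << 4);
}
```

Under those assumptions, swap_unmasked(-111) yields −7 (bits F9₁₆) and swap_masked(-111) yields 25 (bits 19₁₆), matching the two outputs shown above.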

However, if byte has an unsigned eight-bit integer type, then the promotion to int is not a problem, as int has enough headroom to hold the values in these expressions without running into sign or overflow issues.

Modern compilers optimize bit-twiddling operations heavily, so either of these expressions is likely to be compiled well for the target architecture. Your first concern should be correctness.

Pedantically, it should be printf("0x%hhX\n", (unsigned char)byte);
@Lundin: This is not necessary. The C standard observes the argument will be promoted to int and explicitly specifies that its value "shall be converted to signed char or unsigned char before printing." Thus the conversion is already in printf and need not be performed by the caller.
But neglecting the cast means that negative values will get sign extended upon default argument promotion.
@Lundin: Yes, indeed it does, and the explicitly specified conversion will produce an unsigned char value.
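
The point under discussion can be observed with a short sketch that formats into a buffer instead of printing. It assumes a C99-or-later library that implements the hh length modifier as the standard describes; the helper names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a signed char with %hhX and no cast,
   then compares the text with what the caller expects. */
static int hh_output_is(signed char value, const char *expected) {
    char buf[16];
    snprintf(buf, sizeof buf, "0x%hhX", value);
    return strcmp(buf, expected) == 0;
}

/* Same check, with the explicit cast suggested in the comments. */
static int hh_cast_output_is(signed char value, const char *expected) {
    char buf[16];
    snprintf(buf, sizeof buf, "0x%hhX", (unsigned char)value);
    return strcmp(buf, expected) == 0;
}
```

On such a library both helpers should agree for −111, each producing the text 0x91, because the sign-extended int is narrowed back to eight bits inside the formatting routine.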
