I'm currently studying encryption and I was coming up with ways of reversing the nibbles of a byte (ex. 0xF5 => 0x5F). I came up with this solution:

byte >> 4 | (byte & 0x0F) << 4

Other solutions I found online were similar, but added one extra operand by masking the left nibble of a byte:

(byte & 0xF0) >> 4 | (byte & 0x0F) << 4

In binary, the second solution looks like this:

# Extract the right nibble of a byte and shift to the left
[0xF5] 1111 0101 # Original Value
[0x0F] 0000 1111 & # Mask Right Nibble
[0x05] 0000 0101 = # Extracted Right Nibble
[0x50] 0101 0000 << 4 # Shift Four Bits to the Left
# Extract the left nibble of a byte and shift to the right
[0xF5] 1111 0101 # Original Value
[0xF0] 1111 0000 & # Mask Left Nibble
[0xF0] 1111 0000 = # Extracted Left Nibble
[0x0F] 0000 1111 >> 4 # Shift Four Bits to the Right
# Combine the shifted nibbles together
[0x0F] 0000 1111 # Left Nibble Shifted to the Right
[0x50] 0101 0000 | # Right Nibble Shifted to the Left
[0x5F] 0101 1111 = # New Value

Correct me if I'm wrong, but the second solution is useful if you're dealing with a data type that is larger than one byte and only care about the least significant byte. If you were to shift without masking, bits from the higher-order bytes would shift into the left nibble of that byte. Thus, masking the left nibble is necessary in that scenario.

# Bits shift into the least significant byte without masking
[0x0AF5] 0000 1010 1111 0101
[0x00AF] 0000 0000 1010 1111 >> 4
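
A minimal sketch of that wider-type scenario, assuming a uint16_t input and a result truncated to uint8_t (both function names are hypothetical, chosen only for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: swaps the nibbles of the low byte WITHOUT masking
   the left nibble first. Bits 8..11 of `word` leak into the result. */
uint8_t swap_low_byte_unmasked(uint16_t word) {
    return (uint8_t)(word >> 4 | (word & 0x0F) << 4);
}

/* Hypothetical helper: the same swap, but the left nibble is masked
   first, so only bits 0..7 of `word` contribute to the result. */
uint8_t swap_low_byte_masked(uint16_t word) {
    return (uint8_t)((word & 0xF0) >> 4 | (word & 0x0F) << 4);
}
```

With word = 0x0AF5, the unmasked version returns 0xFF (the 0xAF left by the shift ORed with 0x50) instead of the intended 0x5F, which the masked version returns.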

On the other hand, if you're truly working with one byte, isn't masking the left nibble redundant as the left nibble bits will get zeroed out after right-shifting four bits?

[0xF5] 1111 0101
[0x0F] 0000 1111 >> 4
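
One way to test the redundancy claim for a true single byte is to compare both expressions over all 256 values. This is only a sketch, assuming the operand is a uint8_t and the result is stored back into a uint8_t; the function name is hypothetical:

```c
#include <stdint.h>

/* Hypothetical check: counts the byte values for which the unmasked and
   masked expressions disagree once the result is stored in a uint8_t. */
unsigned count_nibble_swap_mismatches(void) {
    unsigned mismatches = 0;
    for (unsigned value = 0; value < 256; ++value) {
        uint8_t b = (uint8_t)value;
        /* b is promoted to a non-negative int, so b >> 4 shifts in zeros */
        uint8_t unmasked = (uint8_t)(b >> 4 | (b & 0x0F) << 4);
        uint8_t masked = (uint8_t)((b & 0xF0) >> 4 | (b & 0x0F) << 4);
        if (unmasked != masked) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

For an unsigned eight-bit operand the function returns 0, meaning the two expressions agree on every input.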

Maybe there are other reasons to mask the left nibble bits that I'm unaware of. For example, other systems might behave differently where the second solution is necessary.

Do I have my head in the right place or is there anything I should consider?

Here is some example code for further clarification:

#include <stdint.h>

typedef uint8_t byte;

static inline byte swap_nibbles(byte bits) {
    /* bits is promoted to int; the cast keeps only the low eight bits */
    return (byte)(bits >> 4 | (bits & 0x0F) << 4);
}
asked Feb 11, 2025 at 17:49
  • It depends on exactly what C type you mean when you say "byte". For instance, if you mean char, that is a signed type on some platforms, and promotion to int will sign-extend which would mess up some cases. If you mean unsigned char or uint8_t then everything you say should be correct. Commented Feb 11, 2025 at 17:56
  • @NateEldredge: You should promote that to an answer before I do and steal your reputation. Commented Feb 11, 2025 at 18:00
  • @Legended Sorry, meant to ask more like: Which do you find easier to understand: (byte & 0xF0) >> 4 | (byte & 0x0F) << 4 or byte >> 4 | (byte & 0x0F) << 4? (IAC, same object name). Commented Feb 11, 2025 at 21:30
  • byte >> 4 | byte << 4 is great if 1) byte is a uint8_t and 2) the result, which is an int, has only its uint8_t part used as in uint8_t swap_nibbles(uint8_t byte) { return byte >> 4 | byte << 4; }. Yet the goal is not code golf. (byte & 0xF0) >> 4 | (byte & 0x0F) << 4 is OK too. Commented Feb 11, 2025 at 21:56
  • As far as optimization is concerned: in my tests, all (correct) versions of this code compiled into identical assembly. Commented Feb 12, 2025 at 4:53

1 Answer

You do not show the definition of byte. If it has a signed eight-bit integer type, then this code:

#include <stdio.h>

int main(void) {
    signed char byte = -111; /* bit pattern 0x91 */
    printf("0x%hhX\n", byte);
    byte = byte >> 4 | (byte & 0x0F) << 4;
    printf("0x%hhX\n", byte);
}

prints, in many C implementations:

0x91
0xF9

Although byte initially contains the bits 91₁₆, in byte >> 4, it is promoted to int. Since those bits represent the value −111, this produces an int with that value, which is, with a four-byte int, FFFFFF91₁₆. Then >> 4 produces FFFFFFF9₁₆. (This is implementation-defined.) ORing this with the 10₁₆ from the right side of the | produces FFFFFFF9₁₆, and then assigning that to byte converts it to signed char. That is implementation-defined, but the most common result is to wrap modulo 256, producing the bits F9₁₆, representing the value −7.

In contrast, using byte = (byte & 0xF0) >> 4 | (byte & 0x0F) << 4; prints:

0x91
0x19

However, if byte has an unsigned eight-bit integer type, then the promotion to int is not a problem, as int has enough headroom to hold the values in these expressions without running into sign or overflow issues.
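
If the input may arrive as a signed type, one option is to convert it to unsigned char before shifting, which avoids the sign extension entirely. The following is a sketch under the assumption of an eight-bit char; the function name is hypothetical:

```c
/* Hypothetical helper: accepts a signed char but converts it to
   unsigned char before shifting, so promotion to int cannot sign-extend.
   Assumes an eight-bit char (CHAR_BIT == 8). */
unsigned char swap_nibbles_of_signed(signed char c) {
    /* Conversion to an unsigned type is well-defined: it wraps modulo 256 */
    unsigned char u = (unsigned char)c;
    return (unsigned char)(u >> 4 | (u & 0x0F) << 4);
}
```

Called with -111 (bits 0x91), this returns 0x19, matching the masked expression, without relying on any implementation-defined behavior.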

Modern compilers optimize bit-twiddling operations heavily, so either of these expressions is likely to be well optimized for the target architecture. Your first concern should be correctness.

answered Feb 11, 2025 at 20:00

4 Comments

Pedantically, it should be printf("0x%hhX\n", (unsigned char)byte);
@Lundin: This is not necessary. The C standard observes the argument will be promoted to int and explicitly specifies that its value "shall be converted to signed char or unsigned char before printing." Thus the conversion is already in printf and need not be performed by the caller.
But neglecting the cast means that negative values will get sign extended upon default argument promotion.
@Lundin: Yes, indeed it does, and the explicitly specified conversion will produce an unsigned char value.
