Printing a number in binary in C

Question 1

I wrote a short function whose purpose is simple: Print the binary representation of number to the console. I also aim to be as portable as possible, hence my use of CHAR_BIT and + instead of | in case the character encoding for '0' happens to be odd. Is my code as good and portable as I intended?

#include <stdio.h>
#include <limits.h>
#define SIZE (sizeof(unsigned long long)*CHAR_BIT)
int putb(const unsigned long long n) {
 char b[SIZE+1];
 for (unsigned i = SIZE; i--; b[i^SIZE-1] = ((char)(n>>i)&1)+'0');
 b[SIZE] = 0;
 return puts(b);
}

Question 2

The obvious portability problem is that we have b[i^SIZE-1] where I'd expect b[SIZE-1-i]. That looks like an error of judgement: it produces the same results when SIZE is an exact power of 2, but not otherwise.

Instead of the SIZE preprocessor macro, I'd probably use a constant within the function:

 static const size_t length = sizeof n * CHAR_BIT;

If you do stick with the macro, consider #undef SIZE afterwards so it's available to other code.

As a style issue, I don't like the for loop with empty body on the same line. That looks more like code-golf than something that's intended to be readable - especially with the "work" of the loop stuffed into the control expression.

We really want the cast to char to happen to the sum, since char+char yields int. That said, gcc -pedantic -Wconversion doesn't complain without it.

I would write the character constant 0 as '0円' to better convey the intent. That's especially important as we have '0' close by.

You might find it easier, clearer and more efficient to work backwards from the least-significant bit:

int putb(unsigned long long n)
{
 char b[(sizeof n * CHAR_BIT) + 1];
 char *p = b + sizeof b;
 *--p = '0円';
 for (; p-- > b; n >>= 1) {
 *p = '0' + (char)(n & 1);
 }
 return puts(b);
}

This gives the option of a version that doesn't print leading zeros (and therefore modified to accept any size integer):

#include <stdint.h>
int putb(uintmax_t n)
{
 char b[(sizeof n * CHAR_BIT) + 1];
 char *p = b + sizeof b;
 *--p = '0円';
 do {
 *--p = '0' + (n & 1);
 } while (n >>= 1);
 return puts(p);
}

This one is trivially modified to add a 0b prefix, should that be desired.

Question 3

it happens to produce the same results when SIZE is an exact power of 2, but not otherwise - It's not just a coincidence; it's a known bithack which is mostly useful in assembly language, where 31 - reg can't be done in a single instruction on most ISAs (ARM has an rsb reverse-subtract instruction) but reg ^= 31 can. I've maybe only seen it in the context of how GCC implements __builtin_clz() in terms of x86's bsr instruction, which produces a bit-index instead of leading-zero count. The builtin (like the instruction) doesn't define the result for input=0, so GCC can use ^=31.

Question 4

100% agreed the XOR bithack is not a good idea in this code, though! If you want to write something easily readable that's more likely to compile to efficient asm given C's lack of a rotate builtin or capturing carry-out from a left shift, looping backwards like you're doing and taking the low bit is good. (@user16217248). (In x86 asm you'd want something like shl rcx, 1 / mov rax, '0' / adc rax, 0 to turn each bit into an ASCII digit via the carry flag, starting from the top bit. Or various other options like shl / setc al / add al, '0'. Or use SSE2 to do 16 bits in parallel.)

Question 5

char b[length+1]; Is this considered a variable-length array, even though in practice it's not?

Question 6

@user16217248: Yes in C it's a VLA, but compilers will in practice see the compile-time constant and not waste instructions, as long as you enable optimization. With -O0 (godbolt.org/z/jenhMx7W3), clang stores a pointer to the array into a local var, like it would do with an actual VLA. Fun fact: in C++ it's not a VLA. In C++, a const int initialized with a constant expression becomes constexpr.

Question 7

(Unfortunately with -O3, both GCC and clang compile this to amazingly bad code, especially clang which does 45 vpinsrb instructions + 3x vmovd to shuffle 1 byte at a time into SIMD vectors. As in How to perform the inverse of _mm256_movemask_epi8 (VPMOVMSKB)? / is there an inverse instruction to the movemask instruction in intel avx2? . And Convert 16 bits mask to 16 bytes mask has a version that does printing order, since that's what the question secretly wanted)

Question 8

Is my code as good and portable as I intended?

Not quite.

C23

Use specifier "%b".

#include <stdio.h>
#include <limits.h>
printf("%0*llb\n", ULLONG_WIDTH, n); // Print all bits
printf("%llb\n", n); // Print significant bits

For highly portability C99 onward

OP used a constant sized buffer - good. Variable length arrays since C11 are optionally supported.
Do not assume no padding bits as with sizeof(unsigned long long)*CHAR_BIT. The number of value bits in an unsigned long long depends on ULLONG_MAX.

#include <stdio.h>
#include <limits.h>
// https://stackoverflow.com/a/4589384/2410359
/* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 2040 */
#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
#define SIZE (IMAX_BITS(ULLONG_MAX))
...

Consider simply code such as below which looks at the least significant bit per iteration.

int putb(unsigned long long n) {
 char b[SIZE+1];
 b[SIZE] = '0円';
 for (unsigned i = SIZE; i-- > 0; ) {
 b[i] = (char) ((n&1) + '0');
 n >>= 1;
 }
 return puts(b);
}

Pre-C99

There is no unsigned long long.

Question 9

Presence of padding bits may cause the buffer to be slightly oversized, but is that particularly harmful? Compared to the readability of the alternative, and the fact we're doing I/O, it seems irrelevant.

Question 10

@TobySpeight "Presence of padding bits may cause the buffer to be slightly oversized, but is that particularly harmful?" --> No - an extra char or so in a small local buffer is insignificant. Yet code miscalculates with for (; p-- > b; n >>= 1) { the number of bits to print. It prints extra leading "0" characters when the type is padded.

Question 11

Ah yes, of course. An alternative is to use an all-ones constant of same type (i.e. ~0ULL) and shift that along with n as the count (for unsigned long long i = ~0ULL; i; i >>= 1)).

Question 12

@TobySpeight true, yet I find for (unsigned i = SIZE; i-- > 0; ) { more direct.

Toby Speight Toby Speight 87.3k14 gold badges104 silver badges322 bronze badges · Accepted Answer · 2023-06-10 08:29:18Z

The obvious portability problem is that we have b[i^SIZE-1] where I'd expect b[SIZE-1-i]. That looks like an error of judgement: it produces the same results when SIZE is an exact power of 2, but not otherwise.

Instead of the SIZE preprocessor macro, I'd probably use a constant within the function:

 static const size_t length = sizeof n * CHAR_BIT;

If you do stick with the macro, consider #undef SIZE afterwards so it's available to other code.

As a style issue, I don't like the for loop with empty body on the same line. That looks more like code-golf than something that's intended to be readable - especially with the "work" of the loop stuffed into the control expression.

We really want the cast to char to happen to the sum, since char+char yields int. That said, gcc -pedantic -Wconversion doesn't complain without it.

I would write the character constant 0 as '0円' to better convey the intent. That's especially important as we have '0' close by.

You might find it easier, clearer and more efficient to work backwards from the least-significant bit:

int putb(unsigned long long n)
{
 char b[(sizeof n * CHAR_BIT) + 1];
 char *p = b + sizeof b;
 *--p = '0円';
 for (; p-- > b; n >>= 1) {
 *p = '0' + (char)(n & 1);
 }
 return puts(b);
}

This gives the option of a version that doesn't print leading zeros (and therefore modified to accept any size integer):

#include <stdint.h>
int putb(uintmax_t n)
{
 char b[(sizeof n * CHAR_BIT) + 1];
 char *p = b + sizeof b;
 *--p = '0円';
 do {
 *--p = '0' + (n & 1);
 } while (n >>= 1);
 return puts(p);
}

This one is trivially modified to add a 0b prefix, should that be desired.

it happens to produce the same results when SIZE is an exact power of 2, but not otherwise - It's not just a coincidence; it's a known bithack which is mostly useful in assembly language, where 31 - reg can't be done in a single instruction on most ISAs (ARM has an rsb reverse-subtract instruction) but reg ^= 31 can. I've maybe only seen it in the context of how GCC implements __builtin_clz() in terms of x86's bsr instruction, which produces a bit-index instead of leading-zero count. The builtin (like the instruction) doesn't define the result for input=0, so GCC can use ^=31.
100% agreed the XOR bithack is not a good idea in this code, though! If you want to write something easily readable that's more likely to compile to efficient asm given C's lack of a rotate builtin or capturing carry-out from a left shift, looping backwards like you're doing and taking the low bit is good. (@user16217248). (In x86 asm you'd want something like shl rcx, 1 / mov rax, '0' / adc rax, 0 to turn each bit into an ASCII digit via the carry flag, starting from the top bit. Or various other options like shl / setc al / add al, '0'. Or use SSE2 to do 16 bits in parallel.)
char b[length+1]; Is this considered a variable-length array, even though in practice it's not?
@user16217248: Yes in C it's a VLA, but compilers will in practice see the compile-time constant and not waste instructions, as long as you enable optimization. With -O0 (godbolt.org/z/jenhMx7W3), clang stores a pointer to the array into a local var, like it would do with an actual VLA. Fun fact: in C++ it's not a VLA. In C++, a const int initialized with a constant expression becomes constexpr.
(Unfortunately with -O3, both GCC and clang compile this to amazingly bad code, especially clang which does 45 vpinsrb instructions + 3x vmovd to shuffle 1 byte at a time into SIMD vectors. As in How to perform the inverse of _mm256_movemask_epi8 (VPMOVMSKB)? / is there an inverse instruction to the movemask instruction in intel avx2? . And Convert 16 bits mask to 16 bytes mask has a version that does printing order, since that's what the question secretly wanted)

Stack Exchange Network

Printing a number in binary in C

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Printing a number in binary in C

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions