Improving an itoa function

Question 1

I wrote an itoa function yesterday, which based on feedback I am now working on improving. How does the below look? Are there any ways in which I can improve it?

#include <stdio.h>
#include <string.h>
#include <limits.h>
char* itoa(int number, size_t sz, char buffer[sz])
{
 // do error handling and edge cases
 if (buffer == NULL || sz == 0)
 return NULL;
 else if (number == 0) 
 return strcpy(buffer, "0");
 char tmp[sz];
 int idx = 0;
 bool is_negative = (number < 0);
 if (!is_negative) number = -number; // so we can handle INT_MIN
 for (; number != 0; number/=10, idx++) {
 if (sz-- == 0) return NULL;
 tmp[idx] = '0' - (number % 10); // negative version of number + '0'
 }
 if (is_negative) {
 if (sz-- == 0) return NULL;
 tmp[idx++] = '-';
 }
 // reverse
 for (int i=0; i<idx; i++)
 buffer[idx-i-1] = tmp[i];
 buffer[idx] = '0円';
 return buffer;
}
int main(void)
{
 char buffer[100];
 int test[] = {123, 234, 0, 17, -4, -7, -17, -27, INT_MAX, INT_MIN};
 size_t len = sizeof(test) / sizeof(*test);
 
 // give it a small buffer and see how it works
 for (int i=0, num; i < len; i++) {
 num = test[i];
 char* str = itoa(num, 2, buffer); 
 printf("%-10d <%s>\n", num, str);
 }
 
 // now give it the correctly-sized buffer
 for (int i=0, num; i < len; i++) {
 num = test[i];
 char* str = itoa(num, sizeof buffer / sizeof *buffer, buffer);
 printf("%-10d <%s>\n", num, str);
 }
}

Working code here: https://onlinegdb.com/H1xKCemNu. A few questions about this:

It seems like there are just so many edge cases to consider for the most trivial of functions to write in C. For example, the above function took me about two hours to write (and was after some very helpful feedback from @chux). Is that a common issue in writing C code? Or is this more a beginner issue that will go away with experience?
With the for loop I often find myself being bitten by an off-by-one error, for example, not realizing the idx is incremented after the previous loop but before the next loop is checked. Are there particular ways to look out for that?
Any other areas to improve here?

Question 2

To answer your first question (which is not a CR Answer, since it's not reviewing the code): yes, C demands a lot of care with edge cases. It's lower-level than many modern languages you might be used to, giving the programmer great control over (therefore great responsibility for) memory management, indexing and so on. With experience, you will learn the common hazards and anticipate them as you write, and probably develop a more defensive programming mindset.

Question 3

Toby is correct, however I'd like to add that C++ gives you the same level of control (most valid C code will compile straight up with a C++ compiler, albeit with warnings for casting etc) with more advanced concepts to take away some sharp edges (e.g. RAII, de/copy/-constructors etc) and a more well rounded standard library removing needs to write the nasty bird yourself and to make memory handling safer.

Question 4

C is often used as the lowest common denominator, just about everything can interface with C libraries due to its ubiquitous legacy. And it is (IMHO wrongly) the default choice for embedded/OS code due to low overhead and direct to metal, but C++ can just as well be used with possibly lower overhead (at least in the OS kernel I ported from C to C++) with the same properties as C, but you can also use classes and other helpful features to keep your sanity and hair growth intact. Really, outside of interoperability or legacy I see no reason to ever use C over C++.

Question 5

(I'll get it my soapbox now)

Question 6

Incorrect size check

itoa(num, 2, buffer); forms "17", which needs 3 char, even though only 2 provided.

Perhaps OP meant 2 to be the string length allowed and not the string size needed. In which case sizeof buffer / sizeof *buffer is not the length allowed.

Missing header

Add #include <stdbool.h>

Missing size check

Corner case: strcpy(buffer, "0"); occurs even if sz == 1.

A simple alternative to buffer overflow concerns is to form the initial characters in a local buffer that is right-sized for worst case (INT_MIN) like char local_buf[12]. Check idx against sz. Then reverse into the true destination.

// Re-write end-of-code.
// ....
// Prior code does not check `sz` nor change sz as characters are saved in ample sized local_buf
if (idx + 1 > sz) {
 if (sz > 0) buffer[0] = 0; // Set buffer[] to "" if possible
 return NULL;
}
for (int i=0; i<idx; i++)
 buffer[idx-i-1] = local_buf[i];
buffer[idx] = '0円';
return buffer;

Rather than hard-code 12 use code to scale to the size needed. E.g.

// Scale by log10(2)
#define INT_STR_SZ (sizeof(int)*CHAR_BIT*301030/1000000 + 3)
char local_buf[INT_STR_SZ];

Question 7

thanks again. Yes I was incorrectly doing sz as "string size" rather than buffer size so I have (yet another case of) an off-by-one error, thanks for pointing that out. Also, when someone defines a size for a string function, does that usually mean buffer-size or string-size? Or does it depend?

Question 8

Also, could you please explain the CHAR_BIT*30103/100000 ? How does that evaluate to a log10 ?

Question 9

The number of characters needed to convert the worst case int is log10(2) (0.3010299956...) for each value bit, 1 to round-up, 1 for sign, 1 for 0円. A few extra certainly does not hurt.

Question 10

Change 30103/100000 to 301030/1000000 to insure math done with wider than 16-bit math. Other good fractions exist.

Question 11

Yes. Recursion is not a good fit here. Good you did not give up on buffer protection. Yet another apporach.

chux chux 36.2k2 gold badges43 silver badges96 bronze badges · Accepted Answer · 2021-03-20 07:04:27Z

Incorrect size check

itoa(num, 2, buffer); forms "17", which needs 3 char, even though only 2 provided.

Perhaps OP meant 2 to be the string length allowed and not the string size needed. In which case sizeof buffer / sizeof *buffer is not the length allowed.

Missing header

Add #include <stdbool.h>

Missing size check

Corner case: strcpy(buffer, "0"); occurs even if sz == 1.

A simple alternative to buffer overflow concerns is to form the initial characters in a local buffer that is right-sized for worst case (INT_MIN) like char local_buf[12]. Check idx against sz. Then reverse into the true destination.

// Re-write end-of-code.
// ....
// Prior code does not check `sz` nor change sz as characters are saved in ample sized local_buf
if (idx + 1 > sz) {
 if (sz > 0) buffer[0] = 0; // Set buffer[] to "" if possible
 return NULL;
}
for (int i=0; i<idx; i++)
 buffer[idx-i-1] = local_buf[i];
buffer[idx] = '0円';
return buffer;

Rather than hard-code 12 use code to scale to the size needed. E.g.

// Scale by log10(2)
#define INT_STR_SZ (sizeof(int)*CHAR_BIT*301030/1000000 + 3)
char local_buf[INT_STR_SZ];

thanks again. Yes I was incorrectly doing sz as "string size" rather than buffer size so I have (yet another case of) an off-by-one error, thanks for pointing that out. Also, when someone defines a size for a string function, does that usually mean buffer-size or string-size? Or does it depend?
Also, could you please explain the CHAR_BIT*30103/100000 ? How does that evaluate to a log10 ?
The number of characters needed to convert the worst case int is log10(2) (0.3010299956...) for each value bit, 1 to round-up, 1 for sign, 1 for 0円. A few extra certainly does not hurt.
Change 30103/100000 to 301030/1000000 to insure math done with wider than 16-bit math. Other good fractions exist.
Yes. Recursion is not a good fit here. Good you did not give up on buffer protection. Yet another apporach.

Stack Exchange Network

Improving an itoa function

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Improving an itoa function

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions