2
\$\begingroup\$

I wrote an itoa function yesterday, which based on feedback I am now working on improving. How does the below look? Are there any ways in which I can improve it?

#include <stdio.h>
#include <string.h>
#include <limits.h>
char* itoa(int number, size_t sz, char buffer[sz])
{
 // do error handling and edge cases
 if (buffer == NULL || sz == 0)
 return NULL;
 else if (number == 0) 
 return strcpy(buffer, "0");
 char tmp[sz];
 int idx = 0;
 bool is_negative = (number < 0);
 if (!is_negative) number = -number; // so we can handle INT_MIN
 for (; number != 0; number/=10, idx++) {
 if (sz-- == 0) return NULL;
 tmp[idx] = '0' - (number % 10); // negative version of number + '0'
 }
 if (is_negative) {
 if (sz-- == 0) return NULL;
 tmp[idx++] = '-';
 }
 // reverse
 for (int i=0; i<idx; i++)
 buffer[idx-i-1] = tmp[i];
 buffer[idx] = '0円';
 return buffer;
}
int main(void)
{
 char buffer[100];
 int test[] = {123, 234, 0, 17, -4, -7, -17, -27, INT_MAX, INT_MIN};
 size_t len = sizeof(test) / sizeof(*test);
 
 // give it a small buffer and see how it works
 for (int i=0, num; i < len; i++) {
 num = test[i];
 char* str = itoa(num, 2, buffer); 
 printf("%-10d <%s>\n", num, str);
 }
 
 // now give it the correctly-sized buffer
 for (int i=0, num; i < len; i++) {
 num = test[i];
 char* str = itoa(num, sizeof buffer / sizeof *buffer, buffer);
 printf("%-10d <%s>\n", num, str);
 }
}

Working code here: https://onlinegdb.com/H1xKCemNu. A few questions about this:

  • It seems like there are just so many edge cases to consider for the most trivial of functions to write in C. For example, the above function took me about two hours to write (and was after some very helpful feedback from @chux). Is that a common issue in writing C code? Or is this more a beginner issue that will go away with experience?
  • With the for loop I often find myself being bitten by an off-by-one error, for example, not realizing the idx is incremented after the previous loop but before the next loop is checked. Are there particular ways to look out for that?
  • Any other areas to improve here?
Reinderien
71k5 gold badges76 silver badges256 bronze badges
asked Mar 20, 2021 at 4:50
\$\endgroup\$
4
  • \$\begingroup\$ To answer your first question (which is not a CR Answer, since it's not reviewing the code): yes, C demands a lot of care with edge cases. It's lower-level than many modern languages you might be used to, giving the programmer great control over (therefore great responsibility for) memory management, indexing and so on. With experience, you will learn the common hazards and anticipate them as you write, and probably develop a more defensive programming mindset. \$\endgroup\$ Commented Mar 20, 2021 at 8:09
  • \$\begingroup\$ Toby is correct, however I'd like to add that C++ gives you the same level of control (most valid C code will compile straight up with a C++ compiler, albeit with warnings for casting etc) with more advanced concepts to take away some sharp edges (e.g. RAII, de/copy/-constructors etc) and a more well rounded standard library removing needs to write the nasty bird yourself and to make memory handling safer. \$\endgroup\$ Commented Mar 20, 2021 at 17:33
  • \$\begingroup\$ C is often used as the lowest common denominator, just about everything can interface with C libraries due to its ubiquitous legacy. And it is (IMHO wrongly) the default choice for embedded/OS code due to low overhead and direct to metal, but C++ can just as well be used with possibly lower overhead (at least in the OS kernel I ported from C to C++) with the same properties as C, but you can also use classes and other helpful features to keep your sanity and hair growth intact. Really, outside of interoperability or legacy I see no reason to ever use C over C++. \$\endgroup\$ Commented Mar 20, 2021 at 17:42
  • \$\begingroup\$ (I'll get it my soapbox now) \$\endgroup\$ Commented Mar 20, 2021 at 17:42

1 Answer 1

4
\$\begingroup\$

Incorrect size check

itoa(num, 2, buffer); forms "17", which needs 3 char, even though only 2 provided.

Perhaps OP meant 2 to be the string length allowed and not the string size needed. In which case sizeof buffer / sizeof *buffer is not the length allowed.

Missing header

Add #include <stdbool.h>

Missing size check

Corner case: strcpy(buffer, "0"); occurs even if sz == 1.


A simple alternative to buffer overflow concerns is to form the initial characters in a local buffer that is right-sized for worst case (INT_MIN) like char local_buf[12]. Check idx against sz. Then reverse into the true destination.

// Re-write end-of-code.
// ....
// Prior code does not check `sz` nor change sz as characters are saved in ample sized local_buf
if (idx + 1 > sz) {
 if (sz > 0) buffer[0] = 0; // Set buffer[] to "" if possible
 return NULL;
}
for (int i=0; i<idx; i++)
 buffer[idx-i-1] = local_buf[i];
buffer[idx] = '0円';
return buffer;

Rather than hard-code 12 use code to scale to the size needed. E.g.

// Scale by log10(2)
#define INT_STR_SZ (sizeof(int)*CHAR_BIT*301030/1000000 + 3)
char local_buf[INT_STR_SZ];
answered Mar 20, 2021 at 7:04
\$\endgroup\$
9
  • \$\begingroup\$ thanks again. Yes I was incorrectly doing sz as "string size" rather than buffer size so I have (yet another case of) an off-by-one error, thanks for pointing that out. Also, when someone defines a size for a string function, does that usually mean buffer-size or string-size? Or does it depend? \$\endgroup\$ Commented Mar 20, 2021 at 7:08
  • \$\begingroup\$ Also, could you please explain the CHAR_BIT*30103/100000 ? How does that evaluate to a log10 ? \$\endgroup\$ Commented Mar 20, 2021 at 7:10
  • 1
    \$\begingroup\$ The number of characters needed to convert the worst case int is log10(2) (0.3010299956...) for each value bit, 1 to round-up, 1 for sign, 1 for 0円. A few extra certainly does not hurt. \$\endgroup\$ Commented Mar 20, 2021 at 7:14
  • 1
    \$\begingroup\$ Change 30103/100000 to 301030/1000000 to insure math done with wider than 16-bit math. Other good fractions exist. \$\endgroup\$ Commented Mar 20, 2021 at 7:33
  • 1
    \$\begingroup\$ Yes. Recursion is not a good fit here. Good you did not give up on buffer protection. Yet another apporach. \$\endgroup\$ Commented Mar 20, 2021 at 8:14

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.