I wrote an itoa function yesterday, which based on feedback I am now working on improving. How does the below look? Are there any ways in which I can improve it?
#include <stdio.h>
#include <string.h>
#include <limits.h>
char* itoa(int number, size_t sz, char buffer[sz])
{
// do error handling and edge cases
if (buffer == NULL || sz == 0)
return NULL;
else if (number == 0)
return strcpy(buffer, "0");
char tmp[sz];
int idx = 0;
bool is_negative = (number < 0);
if (!is_negative) number = -number; // so we can handle INT_MIN
for (; number != 0; number/=10, idx++) {
if (sz-- == 0) return NULL;
tmp[idx] = '0' - (number % 10); // negative version of number + '0'
}
if (is_negative) {
if (sz-- == 0) return NULL;
tmp[idx++] = '-';
}
// reverse
for (int i=0; i<idx; i++)
buffer[idx-i-1] = tmp[i];
buffer[idx] = '0円';
return buffer;
}
int main(void)
{
char buffer[100];
int test[] = {123, 234, 0, 17, -4, -7, -17, -27, INT_MAX, INT_MIN};
size_t len = sizeof(test) / sizeof(*test);
// give it a small buffer and see how it works
for (int i=0, num; i < len; i++) {
num = test[i];
char* str = itoa(num, 2, buffer);
printf("%-10d <%s>\n", num, str);
}
// now give it the correctly-sized buffer
for (int i=0, num; i < len; i++) {
num = test[i];
char* str = itoa(num, sizeof buffer / sizeof *buffer, buffer);
printf("%-10d <%s>\n", num, str);
}
}
Working code here: https://onlinegdb.com/H1xKCemNu. A few questions about this:
- It seems like there are just so many edge cases to consider for the most trivial of functions to write in C. For example, the above function took me about two hours to write (and was after some very helpful feedback from @chux). Is that a common issue in writing C code? Or is this more a beginner issue that will go away with experience?
- With the
for
loop I often find myself being bitten by an off-by-one error, for example, not realizing theidx
is incremented after the previous loop but before the next loop is checked. Are there particular ways to look out for that? - Any other areas to improve here?
-
\$\begingroup\$ To answer your first question (which is not a CR Answer, since it's not reviewing the code): yes, C demands a lot of care with edge cases. It's lower-level than many modern languages you might be used to, giving the programmer great control over (therefore great responsibility for) memory management, indexing and so on. With experience, you will learn the common hazards and anticipate them as you write, and probably develop a more defensive programming mindset. \$\endgroup\$Toby Speight– Toby Speight2021年03月20日 08:09:49 +00:00Commented Mar 20, 2021 at 8:09
-
\$\begingroup\$ Toby is correct, however I'd like to add that C++ gives you the same level of control (most valid C code will compile straight up with a C++ compiler, albeit with warnings for casting etc) with more advanced concepts to take away some sharp edges (e.g. RAII, de/copy/-constructors etc) and a more well rounded standard library removing needs to write the nasty bird yourself and to make memory handling safer. \$\endgroup\$Emily L.– Emily L.2021年03月20日 17:33:35 +00:00Commented Mar 20, 2021 at 17:33
-
\$\begingroup\$ C is often used as the lowest common denominator, just about everything can interface with C libraries due to its ubiquitous legacy. And it is (IMHO wrongly) the default choice for embedded/OS code due to low overhead and direct to metal, but C++ can just as well be used with possibly lower overhead (at least in the OS kernel I ported from C to C++) with the same properties as C, but you can also use classes and other helpful features to keep your sanity and hair growth intact. Really, outside of interoperability or legacy I see no reason to ever use C over C++. \$\endgroup\$Emily L.– Emily L.2021年03月20日 17:42:09 +00:00Commented Mar 20, 2021 at 17:42
-
\$\begingroup\$ (I'll get it my soapbox now) \$\endgroup\$Emily L.– Emily L.2021年03月20日 17:42:28 +00:00Commented Mar 20, 2021 at 17:42
1 Answer 1
Incorrect size check
itoa(num, 2, buffer);
forms "17"
, which needs 3 char
, even though only 2 provided.
Perhaps OP meant 2 to be the string length allowed and not the string size needed. In which case sizeof buffer / sizeof *buffer
is not the length allowed.
Missing header
Add #include <stdbool.h>
Missing size check
Corner case: strcpy(buffer, "0");
occurs even if sz == 1
.
A simple alternative to buffer overflow concerns is to form the initial characters in a local buffer that is right-sized for worst case (INT_MIN
) like char local_buf[12]
. Check idx
against sz
. Then reverse into the true destination.
// Re-write end-of-code.
// ....
// Prior code does not check `sz` nor change sz as characters are saved in ample sized local_buf
if (idx + 1 > sz) {
if (sz > 0) buffer[0] = 0; // Set buffer[] to "" if possible
return NULL;
}
for (int i=0; i<idx; i++)
buffer[idx-i-1] = local_buf[i];
buffer[idx] = '0円';
return buffer;
Rather than hard-code 12
use code to scale to the size needed. E.g.
// Scale by log10(2)
#define INT_STR_SZ (sizeof(int)*CHAR_BIT*301030/1000000 + 3)
char local_buf[INT_STR_SZ];
-
\$\begingroup\$ thanks again. Yes I was incorrectly doing
sz
as "string size" rather than buffer size so I have (yet another case of) an off-by-one error, thanks for pointing that out. Also, when someone defines a size for a string function, does that usually mean buffer-size or string-size? Or does it depend? \$\endgroup\$carl.hiass– carl.hiass2021年03月20日 07:08:34 +00:00Commented Mar 20, 2021 at 7:08 -
\$\begingroup\$ Also, could you please explain the
CHAR_BIT*30103/100000
? How does that evaluate to a log10 ? \$\endgroup\$carl.hiass– carl.hiass2021年03月20日 07:10:43 +00:00Commented Mar 20, 2021 at 7:10 -
1\$\begingroup\$ The number of characters needed to convert the worst case
int
is log10(2) (0.3010299956...) for each value bit, 1 to round-up, 1 for sign, 1 for 0円. A few extra certainly does not hurt. \$\endgroup\$chux– chux2021年03月20日 07:14:27 +00:00Commented Mar 20, 2021 at 7:14 -
1\$\begingroup\$ Change
30103/100000
to301030/1000000
to insure math done with wider than 16-bit math. Other good fractions exist. \$\endgroup\$chux– chux2021年03月20日 07:33:34 +00:00Commented Mar 20, 2021 at 7:33 -
1\$\begingroup\$ Yes. Recursion is not a good fit here. Good you did not give up on buffer protection. Yet another apporach. \$\endgroup\$chux– chux2021年03月20日 08:14:48 +00:00Commented Mar 20, 2021 at 8:14