I decided to make my own version of snprintf in C. I intentionally changed some things, though. My version guarantees that the buffer printed to will be null-terminated, and it returns the number of characters actually written to the buffer, not the number that would have been written if the buffer's size were not limited. I only implemented some of the main formatting features: %s, %c, %d, %h, and %H.

I would love to know what I could improve, and which aspects of it I did or did not implement well.
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>

int INT_TO_STR_DIGITS_L[16] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };
int INT_TO_STR_DIGITS_U[16] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };

int int_to_str(int x, char *buf, size_t size, int base, int uppercase) {
    int length = (int)ceil(log((double)x)/log((double)base));
    int r, i = 0;
    char c;

    if (size < length) {
        x /= (int)pow(base, (float)(length - size));
        length = size;
    }
    do {
        if (i >= size) break;
        r = x % base;
        if (uppercase) {
            c = INT_TO_STR_DIGITS_U[r];
        } else {
            c = INT_TO_STR_DIGITS_L[r];
        }
        buf[length-i-1] = c;
        x /= base;
        i++;
    } while (x != 0);
    return i;
}

int my_snprintf(char *str, size_t max_size, const char *fmt, ...) {
    va_list arg_list;
    va_start(arg_list, fmt);
    int chars_printed = 0;
    char *start_str = str;
    char c, *str_arg;
    int num, len;
    int uppercase = 0, base = 10;

    for (int i = 0; fmt[i] != 0; i++) {
        if (max_size - chars_printed <= 0) {
            break;
        } else if (fmt[i] == '%') {
            i++;
            switch (fmt[i]) {
                case 'c':
                    c = va_arg(arg_list, int);
                    str[chars_printed++] = c;
                    break;
                case '%':
                    str[chars_printed++] = '%';
                    break;
                case 's':
                    str_arg = va_arg(arg_list, char *);
                    len = strnlen(str_arg, max_size - chars_printed);
                    strncpy(str+chars_printed, str_arg, len);
                    chars_printed += len;
                    break;
                case 'H':
                    uppercase = 1;
                case 'h':
                    base = 16;
                case 'd':
                    num = va_arg(arg_list, int);
                    len = int_to_str(num, str+chars_printed, max_size - chars_printed, base, uppercase);
                    chars_printed += len;
                    break;
                default:
                    printf("Invalid format.\n");
                    va_end(arg_list);
                    return -1;
            }
        } else {
            str[chars_printed++] = fmt[i];
        }
    }
    if (chars_printed == max_size) chars_printed--;
    str[chars_printed] = 0;
    va_end(arg_list);
    return chars_printed;
}
2 Answers
Bug
Your int to string conversion isn't working correctly when the number being printed is an exact power of the base. Here is a program that demonstrates the bug:
int main(void)
{
    char buf[256];

    memset(buf, 'z', 256);
    my_snprintf(buf, 256, "abc%ddef", 1000);
    printf("%s\n", buf);
}
Expected output:
abc1000def
Actual output:
ab1000zdef
As you can see, the 1000 portion was written one position too far to the left. The problem is that your number-length computation is off by one for exact powers of the base: for 1000 in base 10 the formula gives 3, but the number has 4 digits.
Unnecessary and unsafe floating point operations
The Floating Point Police™ would like to point out that the use of floating point in int_to_str() is both unnecessary and dangerous. First of all, this line:

    int length = (int)ceil(log((double)x)/log((double)base));

could be rewritten to use a loop to count the number of digits. By using floating point, you open yourself up to rounding errors. For example, if x were 125 and base were 5, you would expect the formula to produce a length of exactly 3. However, when I ran the above code using 125 and 5 on my x86 machine, I got a length of 4 instead. This is because the division evaluated to something like 3.00000001 and ceil rounded it up to 4. (Of course, there is already an unrelated off-by-one bug mentioned in the previous section. This floating point use is a separate cause for concern.)
The same thing applies to this line:

    x /= (int)pow(base, (float)(length - size));

This could be rewritten as a loop that divides by base once per iteration. By using pow() and casting to int, you run the risk of the result of pow coming out slightly below the exact value and being truncated down to the previous integer.
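Both floating-point computations discussed above could be replaced with integer-only loops. What follows is a minimal sketch, assuming a non-negative x and a base of at least 2; the helper names count_digits and drop_low_digits are hypothetical and do not appear in the original code. Note that an integer count reports 4 digits for 1000 in base 10, and also 4 digits for 125 in base 5 (which is written "1000" in that base).

```c
/* Hypothetical helper: number of digits needed to print x in the given base.
 * Assumes x >= 0 and base >= 2. */
static int count_digits(int x, int base)
{
    int digits = 1;              /* even 0 needs one digit */

    while (x >= base) {
        x /= base;
        digits++;
    }
    return digits;
}

/* Hypothetical helper: discard the `drop` least significant digits of x,
 * dividing by base once per iteration instead of calling pow(). */
static int drop_low_digits(int x, int base, int drop)
{
    while (drop-- > 0)
        x /= base;
    return x;
}
```

With these, the length would come from count_digits(x, base), and the truncation step would become x = drop_low_digits(x, base, length - (int)size) whenever the buffer is shorter than the number.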
1. Bug: int_to_str() fails for negative int.

2. Undefined specifiers: %h and %H are not part of the standard library, so without a specification it is hard to know whether they are performing correctly. Did you mean %x and %X?

3. Mixing int and size_t math. This is a pedantic point. Since it is not specified which of those two types has the larger maximum, there is a worst-case chance that max_size - chars_printed <= 0 will never be true should max_size > INT_MAX. Suggest adding the line marked ** below and avoiding math that relies on being signed, as it is likely performed as unsigned math. Or use size_t chars_printed and cope with returning int at the end. (chars_printed should be of the type with the greater positive range.)

    int my_snprintf(char *str, size_t max_size, const char *fmt, ...) {
        if (max_size > INT_MAX) Handle_PathologicalCase_TBD();  // **
        int chars_printed = 0;
        ...
        // if (max_size - chars_printed <= 0) {
        if (max_size <= chars_printed) {

4. Bug: int length = (int)ceil(log((double)x)/log((double)base)); is not as reliable as hoped for. This is detailed well in another answer. The alternative is to convert to a string with an internal max-sized buffer, like char buf[34] for int32_t in base 2, and then copy the buffer result.

5. Style: fall through. Cases that drop through look like an error without a break. Add a comment to show intent.

    case 'H':
        uppercase = 1;
        // fall through
    case 'h':
        base = 16;
        // fall through
    case 'd':

6. printf("Invalid format.\n"); and such are better printed to stderr.

    fprintf(stderr, "Invalid format.\n");

7. Wrong type for len.

    // int len;
    size_t len;

8. Style: No need to declare num so soon. Same for c (suggest a type change: unsigned char c = va_arg(arg_list, int);). Same for str_arg.

    // int num;
    // ... ~30 lines
    // case 'd':
    //     num = va_arg(arg_list, int);
    case 'd': {  // braces open a block so that a declaration may follow the label
        int num = va_arg(arg_list, int);
        ...
    }

9. Minor: Code simplification.

    // if (uppercase) {
    //     c = INT_TO_STR_DIGITS_U[r];
    // ...
    if (uppercase) {
        c = "0123456789ABCDEF"[r];
    } else {
        c = "0123456789abcdef"[r];
    }
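
10. Corner bug: the call below fails (UB). Watch for size == 0. With max_size equal to 0, the final lines decrement chars_printed to -1 and then write to str[-1].

    my_snprintf(str, 0, fmt, ...)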

11. [Edit] Bug: int uppercase = 0, base = 10; is initialized outside the for() loop, so a "%d" after a "%h" will be treated as hex, and a "%h" after a "%H" will be printed with uppercase letters. Simple fix: move int uppercase = 0, base = 10; to after else if (fmt[i] == '%'). Better fix: pass base and upper/lower as parameters to a new int_to_str().

12. char *start_str is unused. Recommend deletion.

13. Variables/functions like INT_TO_STR_DIGITS_L[] and int_to_str() that are only intended for local use should be static. It is unclear why the names are all uppercase. Avoid long lines that exceed the presentation width.

    // int INT_TO_STR_DIGITS_L[16] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };
    static const char int_to_str_digits_l[] = "0123456789abcdef";
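
Several of the points in this answer (negative values, an internal max-sized buffer, static digit tables, base and case passed as parameters) can be combined in one conversion routine. The following is only a sketch under stated assumptions: the name int_to_str_sketch is hypothetical, only bases 2 through 16 are accepted, and when the destination is too small it keeps the leading characters, which is what the original truncation step appears to aim for. Like the original, it does not null-terminate; that is left to the caller.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for int_to_str(): converts x into buf (at most
 * `size` characters, no terminator) and returns the number of characters written. */
static size_t int_to_str_sketch(int x, char *buf, size_t size, int base, int uppercase)
{
    static const char lower[] = "0123456789abcdef";
    static const char upper[] = "0123456789ABCDEF";
    const char *digits = uppercase ? upper : lower;
    char tmp[sizeof(int) * CHAR_BIT + 2];    /* base-2 digits, sign, one spare */
    size_t pos = sizeof tmp;                 /* tmp is filled from the end */
    /* Work on the magnitude as unsigned so that INT_MIN is handled too. */
    unsigned int u = (x < 0) ? 0u - (unsigned int)x : (unsigned int)x;

    if (base < 2 || base > 16)
        return 0;                            /* assumption: unsupported base writes nothing */

    do {
        tmp[--pos] = digits[u % (unsigned int)base];
        u /= (unsigned int)base;
    } while (u != 0);
    if (x < 0)
        tmp[--pos] = '-';

    size_t len = sizeof tmp - pos;
    if (len > size)
        len = size;                          /* truncate, keeping the leading characters */
    memcpy(buf, tmp + pos, len);
    return len;
}
```

Because the digits are produced into a private buffer first, no digit count has to be known in advance, so neither log() nor pow() is needed.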

Comments:

Thanks a lot, I am very pedantic so I appreciate number 3! I have a question though: what's the advantage of using unsigned char over char for the variable c?
– addison, Jun 28, 2016 at 14:32

@addison With case 'c': c = va_arg(arg_list, int);, using unsigned char c = va_arg(arg_list, int); is consistent with the C spec concerning fprintf(... "%c" ...): "the int argument is converted to an unsigned char, and the resulting character is written." C11 §7.21.6.1 8. As this code sends data to a char *, it is not likely to make much difference whether char or unsigned char is used. In any case, assigning an int to an unsigned char c is well defined. Not so sure assigning an int to a signed char is so well defined. Note: the argument passed is an int, perhaps not in char range.
– chux, Jun 28, 2016 at 15:25