Correct usage of snprintf
involves very long and repetitive lines:
if (snprintf(buff, sizeof(buff), format, ...) >= (int)sizeof(buff))
goto err;
I first encapsulated this macro: #define SSIZEOF(x) ((ssize_t)sizeof(x))
. The result is I now have a slightly shorter line:
if (snprintf(buff, sizeof(buff), format, ...) >= SSIZEOF(buff))
goto err;
But I'm not yet happy. Writing sizeof
twice is still very long and annoying, and more if the buffer has a relatively long name.
I decided to do a macro that behaves this way (with the same exact safety) (The b
in the name stands for buffer):
if (sbprintf(buff, format, ...))
goto err;
It needs to be a macro to avoid the array decaying to a pointer, so that I can still use sizeof(buff)
inside the macro.
sbprintf.h
:
/******************************************************************************
******* include guard ********************************************************
******************************************************************************/
#ifndef ALX_STDIO_SBPRINTF_H
#define ALX_STDIO_SBPRINTF_H
/******************************************************************************
******* headers **************************************************************
******************************************************************************/
#include <stdio.h>
#include "libalx/base/assert/assert.h"
#include "libalx/base/compiler/size.h"
/******************************************************************************
******* macros ***************************************************************
******************************************************************************/
/* int sbprintf(char buff[restrict], const char *restrict format, ...); */
#define sbprintf(buff, format, ...) ( \
{ \
alx_static_assert_array(buff); \
\
snprintf(buff, sizeof(buff), format, ##__VA_ARGS__) >= SSIZEOF(buff) \
} \
)
/******************************************************************************
******* include guard ********************************************************
******************************************************************************/
#endif /* libalx/base/stdio/sbprintf.h */
Used macros (defined in other headers from my library (libalx)):
#include <assert.h>
#include <sys/types.h>
#define alx_same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define alx_static_assert_array(a) do \
{ \
static_assert(!alx_same_type((a), &(a)[0]), "Not an array!"); \
} while (0)
#define SSIZEOF(x) ((ssize_t)sizeof(x))
I use GCC8 and C17 (gcc -std=gnu17 -Wall -Wextra -Werror
), but if there is any easy and not very ugly fix that helps portability, it is welcome. The same about C++: This is about C (I'll ask the same question about C++ after this one has been answered), but if there is any fix that would improve compatibility with C++, it is also welcome.
What do you think about sbprintf
? Feel free to comment the other macros, of course!
EDIT:
The actual name of the macro in my library is alx_sbprintf
to avoid using a name that may end up being used by a future implementation.
Related links:
4 Answers 4
This is a good and useful idea. The name is intuitive and memorable.
My first thought was that if a pointer were passed, then we'd use the size of the pointer, but you've found a good way to ensure only arrays are passed.
Obviously, this means that we can't use this for those occasions where we build up the content in pieces (with several prints into a larger array), but that's less common than having a plain char[]
we're allowed to fill, so it's probably reasonable to require callers to use snprintf()
directly in those cases.
Although I'd probably provide SSIZEOF(x)
, I wouldn't depend on it in sbprintf()
- there's no great overhead to writing it in full here.
Although we normally enclose macro arguments in ()
to prevent higher-precedence operators tearing expressions apart, I think you're right that it's not needed for the first use of buff
, where it's a single argument to a function call. I don't think there's a valid use where sizeof
would break an expression of array type, but I'm willing to be defensive there.
One concern is that we now get a boolean result, but have lost the actual number of characters written, necessitating a call to strlen()
if we need the length (e.g. to compose a network protocol packet). It might be possible to write a version that also stores the actual length written:
/* untested */
#define sbprintf(buff, written, format, ...) \
( { \
alx_static_assert_array(buff); \
\
(*written = snprintf(buff,sizeof(buff),format, ##__VA_ARGS__)) \
>= (int)sizeof(buff) \
} )
Of course, a really usable version of this would allow written
to be a null pointer, to discard the result. That's not hard, but I'm feeling too lazy to write that myself.
-
1\$\begingroup\$ Thank you very much! I wrote it (see my answer) :) \$\endgroup\$alx - recommends codidact– alx - recommends codidact2019年06月28日 10:14:52 +00:00Commented Jun 28, 2019 at 10:14
-
-
1\$\begingroup\$ @CacahueteFrito Set aside type concerns: consider value: With
sbprintf(buff, written, "%c%c%c", 'a', 0, 'b')
Do you want the return value ofsnprintf()
or "the actual length of the string" \$\endgroup\$chux– chux2019年07月01日 13:28:20 +00:00Commented Jul 1, 2019 at 13:28 -
1\$\begingroup\$ @chux My thoughts: The theoretical value that would have been printed if there had been enough space isn't very useful (it's main use is reporting truncation, and we already have that). The length of the string can only be calculated by using
strlen
, and that would be heavy; also, the case that someone prints a NUL in a buffer and still wants to use it as a string is, at least, rare. The third option is the actual number of characters written, which is the return value ofsnprintf
truncated bysizeof(buff) - 1
. That one is simple to get, and I think the most useful. \$\endgroup\$alx - recommends codidact– alx - recommends codidact2019年07月01日 15:14:16 +00:00Commented Jul 1, 2019 at 15:14 -
2\$\begingroup\$ @CacahueteFrito Re: Option 3: If the value returned included the written append null character, then a return of 0 is error. Useful, yet a different-by-1 paradigm. IAC, good documentation that reflects these corner cases is important. \$\endgroup\$chux– chux2019年07月02日 00:38:20 +00:00Commented Jul 2, 2019 at 0:38
Pedantically, if (snprintf(buff, sizeof(buff), format, ...) >= SSIZEOF(buff)) goto err;
is an insufficient test. Test for < 0
is also needed.
The
snprintf
function returns the number of characters that would have been written hadn
been sufficiently large, not counting the terminating null character, or a negative value if an encoding error occurred.... C111 §7.21.6.5 3
Note that ssize_t
itself is not in standard C.
To cope with both issues, perform 2 sided test and drop using ssize_t
.
int i = snprintf(buff, sizeof(buff), format, ...);
if (i < 0 || (unsigned) i >= sizeof(buff)) goto err;
... or if INT_MAX <= UINT_MAX/2
(a very common implementation), code can cheat with a one sided test as i<0
coverts to a large unsigned
. Again no need for ssize_t
.
int i = snprintf(buff, sizeof(buff), format, ...);
if ((unsigned) i >= sizeof(buff)) goto err;
-
1\$\begingroup\$ True, I thought about it once and discarded adding that to my code because it would mean adding a variable and many lines, but now that it is inside a macro it is doable :) \$\endgroup\$alx - recommends codidact– alx - recommends codidact2019年06月29日 08:05:15 +00:00Commented Jun 29, 2019 at 8:05
No need to resort to non-portable GNU C extensions:
1. Replacement for GNU C's __builtin_types_compatible_p()
:
The intent I believe is to differentiate between arrays and pointers. ISO C has _Generic
, which makes this relatively easy to determine. Consider:
/**
* Compile-time check of whether T is an array.
*
* T - An expression. It is not evaluted.
*
* Note: IS_ARRAY() distinguishes between arrays and pointers, not between
* arrays and arbitrary other types.
*
* Returns 1 (true) only if T is an array; 0 (false) elsewise.
*
* See also: https://stackoverflow.com/a/77881417/99089 */
#define IS_ARRAY(T) \
_Generic( &(T), \
typeof(*T) (*)[] : 1, \
default : 0 \
)
See the linked StackOverflow post for an explanation of how this works. Now we can modify alx_static_assert_array()
to use this instead.
typeof()
is part of C23. Prior to that, GCC, Clang, and others have supported it as an extension for a long time, and your code already uses it. Code can conditionally define it to expand to typeof
in C23, or __typeof__
(or whatever GCC et cetera recognizes) elsewise. I show how to conditionally do so below for nodiscard
attribute.
Replacement for GNU C's warn_unused_result
:
C23 has added function specifier sequences, one of which is nodiscard
. Consider:
#define C23_PLACEHOLDER 202000L
#if define(__STDC_VERSION__) && __STDC_VERSION__ >= C23_PLACEHOLDER
#define ATTRIB_NODISCARD [[nodiscard]]
#elif defined(__GNUC__) || defined(__clang__) || defined(__INTEL_LLVM_COMPILER)
#define ATTRIB_NODISCARD __attribute((warn_unused_result))
#else
#define ATTRIB_NODISCARD /**/
#endif
The C23_PLACEHOLDER
is required because GCC 14.1 with -std=c23
still defines __STDC_VERSION__
to 202000L.
Replacement for GCC's -Werror=sizeof-pointer-div
: detect if ARRAY_SIZE()
is passed a pointer:
First, we will define another version of static_assert
that can be used in an expression:
/**
* Like C11's _Static_assert() except that it can be used in an expression.
*
* EXPR - The expression to check.
* MSG - The string literal of the error message to print only if EXPR evalutes
* to false.
*
* Always returns true. */
#define STATIC_ASSERT_EXPR(EXPR, MSG) \
(!!sizeof( struct { static_assert ( (EXPR), MSG ); char c; } ))
The trick here is that static_assert
can be used anywhere, even in a struct
definition. sizeof
return the size of the struct
, which !!
converts it to 1. char c
exists just so that the struct
is never empty. If the assertion fails, the code would not compile. If it succeeded, STATIC_ASSERT_EXPR
would return 1 (true
). This is better than the compiler flag because it does more than just warn, it causes compilation to fail.
Now we can modify ARRAY_SIZE()
to use this, which I prefer to call ARRAY_CARDINALITY()
:
/**
* Gets the number of elements of the given array. */
#define ARRAY_CARDINALITY(ARRAY) ( \
sizeof(ARRAY) / sizeof(0[ARRAY]) \
* STATIC_ASSERT_EXPR( IS_ARRAY(ARRAY), #ARRAY " must be an array" ))
0[ARRAY]
instead of ARRAY[0]
because in C++, trying to use ARRAY_CARDINALITY
on an object of a class
for which operator[]
has been overloaded would result in a compilation error, which is what one would want.
Now the only GNU C specific thing in your code is the statement expression, for which I know of no ISO C version. I believe the C committee is working on something similar.
This is the code that Toby Speight suggested in his answer:
/*
* int sbprintf(char buff[restrict], int *restrict written,
* const char *restrict format, ...);
*/
#define sbprintf(buff, written, format, ...) ( \
{ \
int len_; \
\
alx_static_assert_array(buff); \
\
len_ = snprintf((buff), sizeof(buff), format, ##__VA_ARGS__);\
if (written != NULL) \
*written = len_; \
len_ >= (int)sizeof(buff); \
} \
)
I have tested it and works as expected:
- If
written
isNULL
it doesn't write into it. - The return value is
true
when the string is truncated andfalse
otherwise. - It doesn't compile if
buff
is not an array. - It accepts a variable number of arguments after
format
, including no arguments.
The comment above the macro is the prototype that a user should see, to better understand the usage, although a real function with that prototype wouldn't work because of the array decaying to a pointer.
Comments about the style:
I tried to follow the Linux Kernel Coding Style, but there have been exceptions:
if (written != NULL)
is used instead of if (written)
to avoid the compiler complaining:
main.c:22:23: error: the address of ‘w1’ will always evaluate as ‘true’ [-Werror=address]
if (alx_sbprintf(b1, &w1, test))
^
.../libalx/base/stdio/sbprintf.h:36:6: note: in definition of macro ‘alx_sbprintf’
if (written) \
^~~~~~~
EDIT:
Given that snprintf
sets errno
(at least in POSIX), it would be good to set errno
to ENOMEM
on truncation.
EDIT:
This version includes the improvements suggested by Toby Speight and chux, as well as setting errno
on error. First I added code to the macro, but it proved to have some problems (at least it didn't compile, so no dangerous bugs). Now it's an extern
function enclosed in a simple macro.
Now the code relies less on GCC extensions.
Properties:
- If
written
isNULL
it doesn't write into it. - It doesn't compile if
buff
is not an array. - It accepts a variable number of arguments after
format
, including no arguments. - Sets
errno
on any error. - If there's a
snprintf
internal error, the error code is negative (-errno
), andwritten
is also negative. - If the string is truncated, the error code is positive (
ENOMEM
). - If the error code is negative, the string should not be trusted; if it's positive, it's been truncated, but it's a valid string.
Code:
sbprintf.h
:
#ifndef ALX_STDIO_PRINTF_SBPRINTF_H
#define ALX_STDIO_PRINTF_SBPRINTF_H
#include "libalx/base/assert/assert.h"
#include "libalx/base/stdio/printf/swnprintf.h"
/*
* int sbprintf(char buff[restrict], ptrdiff_t *restrict written,
* const char *restrict format, ...);
*/
#define sbprintf(buff, written, fmt, ...) ( \
{ \
\
alx_static_assert_array(buff); \
swnprintf(buff, written, sizeof(buff), fmt, ##__VA_ARGS__); \
} \
)
#endif /* libalx/base/stdio/printf/sbprintf.h */
swnprintf.h
:
#ifndef ALX_STDIO_PRINTF_SWNPRINTF_H
#define ALX_STDIO_PRINTF_SWNPRINTF_H
#include <stddef.h>
int swnprintf(char str[restrict], ptrdiff_t *restrict written, ptrdiff_t nmemb,
const char *restrict format, ...);
#endif /* libalx/base/stdio/printf/swnprintf.h */
swnprintf.c
:
#include "libalx/base/stdio/printf/swnprintf.h"
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
int swnprintf(char str[restrict], ptrdiff_t *restrict written, ptrdiff_t nmemb,
const char *restrict format, ...)
{
va_list ap;
int len;
if (nmemb < 0)
goto neg;
va_start(ap, format);
len = vsnprintf(str, nmemb, format, ap);
va_end(ap);
if (written != NULL)
*written = len;
if (len < 0)
goto err;
if (len >= nmemb)
goto trunc;
return 0;
trunc:
if (written)
*written = nmemb - 1;
errno = ENOMEM;
return ENOMEM;
neg:
errno = EOVERFLOW;
err:
if (written)
*written = 0;
return -errno;
}
Edit:
Modified to use ptrdiff_t
: it allows to detect invalid (negative) sizes, instead of using their unsigned value. Also removes a cast.
Modified to set written
to 0 on error when the string is unreliable.
The return value should always be used:
__attribute__((warn_unused_result))
Add checks to the format and varargs:
__attribute__((format(printf, 4, 5)))
EDIT:
When using __attribute__((warn_unused_result))
, the macro sbprintf
silences the warning because of the way it works; to warn the user, the following code can be used:
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define sbprintf(buff, written, fmt, ...) \
swnprintf(buff, written, ARRAY_SIZE(buff), fmt, ##__VA_ARGS__)
Note: GCC 8 warns (-Werror=sizeof-pointer-div
) if buff is not an array without needing a static_assert
, when using ARRAY_SIZE
defined this way. If using an old version of GCC, you will need to use some tricks to block compilation.
-
1\$\begingroup\$ 1: Does the macro "return" anything? Do you want
,
instead of;
? 2: Consider(unsigned) len_ >= sizeof(buff)
\$\endgroup\$chux– chux2019年06月29日 16:27:39 +00:00Commented Jun 29, 2019 at 16:27 -
\$\begingroup\$ @chux 1: Of course macros don't return, but compound statements simulate that return value. That's why I said "return". gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html I didn't want
,
, a;
works fine (I assume you meant after the last statement). \$\endgroup\$alx - recommends codidact– alx - recommends codidact2019年06月29日 16:38:21 +00:00Commented Jun 29, 2019 at 16:38 -
\$\begingroup\$ The "return" is a gcc thing, per your link, a C extension that I was unaware. I suspect a C compliant version is possible \$\endgroup\$chux– chux2019年06月29日 16:44:35 +00:00Commented Jun 29, 2019 at 16:44
-
\$\begingroup\$ @chux Oh, now I understand what you meant. In this case, I prefer to use the extension; I think it's more reliable than the comma, I think: I can use almost anything and will work (
__auto_type
,goto
, ...). The comma has too many different meanings, so I don't trust it very much. I'll add the code I have with errno and goto so that you can see what I mean. \$\endgroup\$alx - recommends codidact– alx - recommends codidact2019年06月29日 18:03:35 +00:00Commented Jun 29, 2019 at 18:03
char []
) (not vectors or weird things)? I was thinking of using macros, but that sounds interesting. \$\endgroup\$char
array, rather than any other kind, if we want). \$\endgroup\$