I wrote a function that basically works like sprintf
, but instead returns the formatted string (std::string
). I haven't worked on this long, and I'm aware there are tons of improvements that could be made (alloc
/realloc
perhaps?). I was just looking for some advice before taking the next step.
GLOBAL string format(const string& format, ...)
{
LOCAL const size_t initialSize = 64;
string returnVal;
char buffer[initialSize];
int length;
va_list args;
va_start(args, format);
{
length = vsnprintf(buffer, initialSize, format.c_str(), args);
}
va_end(args);
char bufferCorrectSize[length];
va_start(args, format);
{
vsnprintf(bufferCorrectSize, length + 1, format.c_str(), args);
}
va_end(args);
return bufferCorrectSize;
}
Ignore GLOBAL
and LOCAL
. They are just aliases for static
I defined for organization.
-
1\$\begingroup\$ Welcome to Code Review! Good job on your first question. \$\endgroup\$SirPython– SirPython2016年01月03日 23:07:50 +00:00Commented Jan 3, 2016 at 23:07
-
\$\begingroup\$ Thank you very much! I only discovered Code Review 10 minutes ago. \$\endgroup\$oliveryas01– oliveryas012016年01月03日 23:08:45 +00:00Commented Jan 3, 2016 at 23:08
3 Answers 3
Don't invent your own language through macros
GLOBAL
and LOCAL
might look cute but they make the code harder to read for everyone else. Not to mention the name-space pollution caused by such macros.
You say that they both expand to static
. That seems wrong. A function declared static
is not "very global" – it is private to its translation unit. A local variable declared static
is not "very local" – it is shared between all invocations of the function. This also seems wrong. If you want to use format
as a library function, it shouldn't be static
. And declaring the integer initialSize
as static
seems pointless at best.
#include
all required headers
Probably you have just left it out of your question but your code should always #include
all required header files. In your case, that would be
<cstdio>
forstd::vsnprintf
,<string>
forstd::string
,<cstddef>
forstd::size_t
and<cstdarg>
forva_start
,va_end
andstd::va_list
.
Since you're using string
without the std::
qualifier, I assume you also have a using namespace std;
somewhere in your code. Get rid of it and type the extra five characters instead. It makes it clearer where the types are coming from and doesn't pollute the global name-space.
Use namespace
s to organize your functions
Maybe you just didn't show it but you should put everything except main
in some namespace
. Instead of declaring stuff static
, consider using an anonymous namespace
. But since your function ought to be useful in multiple translation units, it probably shouldn't be static
anyway.
Properly allocate and deallocate dynamic memory
The line
char bufferCorrectSize[length];
is not correct. (Your compiler should have warned you about it.) In C++, an array must have a fixed size known at compile-time. Since the value of length
is only obtained at run-time from a previous function call, this is not possible. You could dynamically allocate a buffer of the wanted size using
char * bufferCorrectSize = new char[length];
and release it again once you're done with it.
delete[] bufferCorrectSize;
But this leaves you with all kinds of memory management issues. So instead of managing it yourself, use a std::vector<char>
.
Unfortunately, you cannot use the std::string
directly because its data
member function will only give you a const
buffer. If you know exactly what you're doing, you can take the address of the first character of the string and use it a s a pointer to a writable buffer. But I'd rather not play such tricks unless you're comfortable with what you're doing and the additional performance is critical.
Don't repeat yourself
You're calling std::vsnprintf
twice; basically with the same arguments. Also, there is no need to call it again if the buffer was large enough the first time. You can re-factor this into a loop.
Get rid of the useless nested scope
While a nested scope { ... }
can be useful from time to time, in your case, it serves no purpose. You're not even declaring any variables inside it.
Handle errors properly
If std::vsnprintf
encounters an error, it will return
a negative number. You should test for this and properly handle the error. throw
ing an exception is probably an adequate reaction. Don't forget to still call va_end
in this case.
Handle embedded NUL bytes correctly
You're return
ing bufferCorrectSize
which will decay to char *
. Unlike others have commented, this is not return
ing a pointer to a local object. Since format
is declared with a return
type of std::string
and the latter is implicitly constructible from a const char *
, you're safely return
ing the std::string
as intended. However, if there is a NUL byte somewhere in the buffer, you'll truncate. Therefore, you should use the constructor taking a pointer and a length and call it explicitly.
Eliminate unused variables
The variable returnVal
is never used. Your compiler might be able to warn you about this.
Make debugging easier with function attributes
The C-style printf
family of functions has a bad reputation of being notoriously not type-safe and responsible for numerous bugs that can lead to serious security holes. A good compiler like GCC can warn you if such a function is called with arguments that do not match the type specifiers in the format string. You can enable the same warnings for your function, too. Of course, GCC can only check the arguments if it can see the format string. Using an untrusted format string is a very bad idea anyway so you should consider making it a const char *
. If somebody really wants to call your function with a computed format string, they can always obtain a C-style string via the c_str
member function of std::string
. This will give them the compiler warning they deserve so they can think twice about their decision. Having format
accept a std::string
as format string also leads to the false assumption that one can embed NUL characters in it. But since you're going to call c_str
on it anyway, std::vsnprintf
will only ever see a NUL terminated string. Making the type of the format string const char *
avoids this confusion.
The following annotation informs GCC that the function accepts a printf
-style format string as parameter number 1 and parameters 2 and following are the format arguments.
std::string
format(const char *const format, ...)
__attribute__ ((format (printf, 1, 2)));
See the GCC manual for more information about the (arguably messy) syntax.
Document your interface
Add a comment to the declaration of each public function that describes precisely how to use it.
Putting it all together
Combining all of the above advice, we end up with the following two files.
my_format.hxx
:
#ifndef MY_FORMAT_HXX
#define MY_FORMAT_HXX
#include <string> // std::string
namespace my
{
/**
* @brief
* Like `std::sprintf` except that the formatted string is `return`ed
* as a `std::string`.
*
* This function internally forwards to `std::vsnprintf`. It accepts the
* exact same parameters and has the exact same gotchas as that function
* except that allocating a buffer of sufficient size is done automatically.
*
* @param format
* C-style format string
*
* @param ...
* any format arguments required by the given format string
*
* @returns
* the formatted string
*
* @throws std::runtime_error
* if `std::vsnprintf` fails
*
* @throws std::bad_alloc
* if insufficient memory is available
*
*/
std::string
format(const char *const format, ...)
__attribute__ ((format (printf, 1, 2)));
} // namespace my
#endif // #ifndef MY_FORMAT_HXX
my_format.cxx
:
#include "my_format.hxx"
#include <cstdarg> // va_start, va_end, std::va_list
#include <cstddef> // std::size_t
#include <stdexcept> // std::runtime_error
#include <vector> // std::vector
namespace my
{
std::string
format(const char *const format, ...)
{
auto temp = std::vector<char> {};
auto length = std::size_t {63};
std::va_list args;
while (temp.size() <= length)
{
temp.resize(length + 1);
va_start(args, format);
const auto status = std::vsnprintf(temp.data(), temp.size(), format, args);
va_end(args);
if (status < 0)
throw std::runtime_error {"string formatting error"};
length = static_cast<std::size_t>(status);
}
return std::string {temp.data(), length};
}
} // namespace my
-
\$\begingroup\$ Thank you very much. A lot of useful info. Half of it I actually added after I posted, such as proper allocation \$\endgroup\$oliveryas01– oliveryas012016年01月06日 01:42:12 +00:00Commented Jan 6, 2016 at 1:42
-
\$\begingroup\$ What exactly is 'attribute'? \$\endgroup\$oliveryas01– oliveryas012016年01月10日 09:03:46 +00:00Commented Jan 10, 2016 at 9:03
-
\$\begingroup\$ It's a compiler-specific extension to the C++ language to allow specifying additional information about a function. Please see the link in the answer. \$\endgroup\$5gon12eder– 5gon12eder2016年01月11日 00:26:39 +00:00Commented Jan 11, 2016 at 0:26
-
\$\begingroup\$ Very thorough answer! As to GCC-specific attribute, I would recommend wrapping it into a macro, so that it would not break compilation with other compilers. Until you only target GCC. \$\endgroup\$Alex Che– Alex Che2024年03月26日 21:06:16 +00:00Commented Mar 26, 2024 at 21:06
Do not return local array
char bufferCorrectSize[length]; .... return bufferCorrectSize;
The
bufferCorrectSize
is DOA: it goes out of scope when the function returns. The caller obtains some address, with nothing behind it. You must make it withbufferCorrectSize = new char[length]
.Initial array is superfluous
Nothing is ever done with
buffer
.nullptr
is as good.length + 1
The target area is
length
bytes long, but you promise tovsnprintf
that it has alength + 1
arena to play with.Nit pick
An unnecessary scoping like
{ length = ... }
looks really strange.
-
\$\begingroup\$ Note that the function's return type is
std::string
which has a constructor taking aconst char *
. Writingreturn bufferCorrectSize;
implicitly calls that constructor while the buffer is still good and then returns thestd::string
so this part is (maybe accidentally) safe. There are other problems with the code, however. \$\endgroup\$5gon12eder– 5gon12eder2016年01月04日 12:27:20 +00:00Commented Jan 4, 2016 at 12:27
If you want to return the string then resize it explicitly instead of relying in the implicit construction in the return:
returnVal.resize(length);
va_start(args, format);
{
vsnprintf(returnVal.data(), length + 1, format.c_str(), args);
}
va_end(args);
return returnVal;
-
\$\begingroup\$ Unfortunately,
c_str
returns aconst char *
sou you cannot pass it tovsnprintf
as buffer. \$\endgroup\$5gon12eder– 5gon12eder2016年01月04日 12:25:47 +00:00Commented Jan 4, 2016 at 12:25 -
\$\begingroup\$ @5gon12eder use
data()
then or&returnVal.front()
\$\endgroup\$ratchet freak– ratchet freak2016年01月04日 12:28:28 +00:00Commented Jan 4, 2016 at 12:28 -
\$\begingroup\$ Too bad,
data
is alsoconst
.&returnVal[0]
could work but it seems dangerous. It actually makes sense to forbid writing to astd::string
's bytes since doing so could silently invalidate the length of the string. \$\endgroup\$5gon12eder– 5gon12eder2016年01月04日 12:30:52 +00:00Commented Jan 4, 2016 at 12:30
Explore related questions
See similar questions with these tags.