One of the most exciting new functions in C++20 is std::format. I have never liked the ways that strings are formatted in C++ and have been guilty many times of using sprintf and such.
Since std::format is not implemented by any of the major compilers at this time, I thought I would try out my own very basic implementation and this is what I have come up with:
#include <string_view>
#include <iostream>
#include <string>
#include <sstream>
#include <stdexcept>

void format_helper(std::ostringstream& oss, std::string_view str) // base function
{
    oss << str;
}

template<typename T, typename... Targs>
void format_helper(std::ostringstream& oss, std::string_view str, T value, Targs... args)
{
    std::size_t openBracket = str.find_first_of('{');
    if (openBracket != std::string::npos)
    {
        std::size_t closeBracket = str.find_first_of('}', openBracket + 1);
        if (closeBracket == std::string::npos)
            throw std::runtime_error("missing closing bracket.");
        oss << str.substr(0, openBracket);
        oss << value;
        format_helper(oss, str.substr(closeBracket + 1), args...);
        return;
    }
    oss << str;
}

std::string format(std::string_view str)
{
    return std::string(str);
}

template<typename T, typename... Targs>
std::string format(std::string_view str, T value, Targs...args)
{
    std::ostringstream oss;
    format_helper(oss, str, value, args...);
    return oss.str();
}

int main()
{
    int a = 5;
    double b = 3.14;
    std::string c("hello");
    // correct number of arguments
    std::cout << format("a = {}, b = {}, c = {}", a, b, c) << "\n";
    // too few arguments
    std::cout << format("a = {}, b = {}, c = {}", a, b) << "\n";
    // too many arguments
    std::cout << format("a = {}, b = {}, c = {}", a, b, c, 12.4) << "\n";
}
2 Answers
Library includes are almost sorted, but <string_view> is out of place - perhaps an oversight?
We can use a fold-expression to avoid having to write specific recursion in format_helper() (yes, I know that's usually not a recursive call, just template expansion). See below.
format_helper() should be in an implementation namespace, where it's less visible to client code.
I wouldn't throw an exception for the unterminated {} in the format string. At first glance, this looks like it's a programming error, but consider that in a production program, the format string probably comes from a translation database. It's probably more useful to translators to see the unreplaced part of the string than to get a much more vague exception message (that doesn't even indicate which format string caused it).
There's no reason to use find_first_of(char) rather than the shorter find(char). In fact, I'm not sure why the former is provided.
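As a rough illustration of that point (the helper names below are made up for this sketch and are not part of the reviewed code): for a single char the two members return the same position, and they only differ once the argument has more than one character.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers, for illustration only.

// Position of the first '{', using the shorter spelling.
constexpr std::size_t first_open_brace(std::string_view s)
{
    return s.find('{');
}

// Same search spelled with find_first_of(char); the result is identical.
constexpr std::size_t first_open_brace_verbose(std::string_view s)
{
    return s.find_first_of('{');
}

// With a multi-character argument, find() looks for the substring "{}" ...
constexpr std::size_t first_empty_field(std::string_view s)
{
    return s.find("{}");
}

// ... whereas find_first_of() looks for any one of '{' or '}'.
constexpr std::size_t first_brace_of_either_kind(std::string_view s)
{
    return s.find_first_of("{}");
}
```

So for the single-character searches in the reviewed code, find() is simply the shorter way to write the same thing.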
Fold-expression version
From C++17 onwards, we have fold expressions, which make variadic templates much, much simpler to write and to reason about. For us, it means only one version of each function.
The key here is that we want to pass the string view by reference, so that printing each argument adjusts the format string seen by the remaining ones.
template<typename T>
void format_helper(std::ostringstream& oss,
                   std::string_view& str, const T& value)
{
    std::size_t openBracket = str.find('{');
    if (openBracket == std::string::npos) { return; }
    std::size_t closeBracket = str.find('}', openBracket + 1);
    if (closeBracket == std::string::npos) { return; }
    oss << str.substr(0, openBracket) << value;
    str = str.substr(closeBracket + 1);
}

template<typename... Targs>
std::string format(std::string_view str, Targs...args)
{
    std::ostringstream oss;
    (format_helper(oss, str, args),...);
    oss << str;
    return oss.str();
}
At this point, we could (arguably should) bring the helper inside format as a named lambda, so we define only a single function.
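A minimal sketch of what that might look like, assuming C++17. The lambda name print_next is invented for this example, and taking the arguments by const reference instead of by value is a small variation on the code above rather than something the review requires.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>

template<typename... Targs>
std::string format(std::string_view str, const Targs&... args)
{
    std::ostringstream oss;

    // Named lambda standing in for format_helper(). It captures the stream
    // and the remaining format string by reference, so each call consumes
    // the part of the format string it has already printed.
    auto print_next = [&oss, &str](const auto& value) {
        const std::size_t openBracket = str.find('{');
        if (openBracket == std::string_view::npos) { return; }
        const std::size_t closeBracket = str.find('}', openBracket + 1);
        if (closeBracket == std::string_view::npos) { return; }
        oss << str.substr(0, openBracket) << value;
        str = str.substr(closeBracket + 1);
    };

    (print_next(args), ...);  // comma fold: arguments are handled left to right
    oss << str;               // whatever remains of the format string
    return oss.str();
}
```

With this version, format("a = {}, b = {}", 5) leaves the second placeholder in the output, and surplus arguments are silently ignored, which matches the behaviour of the two-function version.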
- If you just use it to find a single char, then find_first_of() would indeed be redundant, but if you pass it a string then find_first_of() and find() do something different. – G. Sliepen, Oct 27, 2021 at 16:34
- @G.Sliepen: yes, that's why I was specific about the overload that takes a char. – Toby Speight, Oct 27, 2021 at 19:10
- Symmetry I guess? It would be weird if it accepts everything that find() does except a char. But find() indeed makes more sense here. – G. Sliepen, Oct 27, 2021 at 21:06
- Thanks for your review, especially re. fold expressions! Is there any benefit over using a named lambda over simply static void format_helper? – jdt, Oct 28, 2021 at 13:05
- One benefit is almost the same as we could get by writing namespace format_detail { template<⋯> void format_helper(⋯); }: it's not polluting the client code's namespace (remember that, being a template, the name must be visible in every TU that instantiates format(); we don't gain much from static linkage). The other benefit, and this one is more subjective, is that it's declared right by its only use, so everything is in one place in the source. That's only a slight benefit, especially when the source file comprises just those two functions. – Toby Speight, Oct 28, 2021 at 13:14
If you pass a format string with no arguments, format() returns it as-is: it does no checking for embedded control sequences, reports no error, and does not handle escaped sequences.
You should break out the brace finding into separate functions to make it easier to extend, such as adding "{{" to mean a literal open-brace character, and make the contents inside the braces available as formatting options.
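One possible shape for such a split, as a sketch only: the Field struct and next_field() are names invented here, and treating "}}" as a literal close brace is an assumption borrowed from the fmt convention. The text between the braces is exposed as spec but not interpreted, and unterminated or leftover fields are left visible in the output rather than reported.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical result type: describes the next replacement field, if any.
struct Field {
    bool found;             // false when no further field exists
    std::string_view spec;  // text between '{' and '}' (may be empty)
};

// Copies literal text from the front of fmt to out, turning "{{" into '{'
// and "}}" into '}', until a replacement field is found. On return, fmt
// starts just after that field, or is empty if there was none.
inline Field next_field(std::ostream& out, std::string_view& fmt)
{
    for (;;) {
        const std::size_t pos = fmt.find_first_of("{}");
        if (pos == std::string_view::npos) {
            out << fmt;                    // no braces left: all literal text
            fmt.remove_prefix(fmt.size());
            return {false, {}};
        }
        out << fmt.substr(0, pos);
        const char brace = fmt[pos];
        if (pos + 1 < fmt.size() && fmt[pos + 1] == brace) {
            out << brace;                  // doubled brace is an escape
            fmt.remove_prefix(pos + 2);
            continue;
        }
        const std::size_t close =
            (brace == '{') ? fmt.find('}', pos + 1) : std::string_view::npos;
        if (close != std::string_view::npos) {
            const Field field{true, fmt.substr(pos + 1, close - pos - 1)};
            fmt.remove_prefix(close + 1);
            return field;
        }
        out << brace;                      // lone '}' or unterminated '{'
        fmt.remove_prefix(pos + 1);
    }
}

template<typename... Targs>
std::string format(std::string_view fmt, const Targs&... args)
{
    std::ostringstream oss;
    auto print_next = [&](const auto& value) {
        if (next_field(oss, fmt).found) { oss << value; }  // spec is ignored here
    };
    (print_next(args), ...);
    // Too few arguments: keep the remaining fields visible, still unescaping.
    for (Field f = next_field(oss, fmt); f.found; f = next_field(oss, fmt)) {
        oss << '{' << f.spec << '}';
    }
    return oss.str();
}
```

Because the scanning lives in one function, adding real handling of spec (width, precision and so on) would only touch print_next, not the brace-finding logic.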
The error should mention "}" rather than just "closing bracket" since someone reading this error might not realize you're referring to the curly brace. For that matter, it should indicate the fact that the error comes from the formatting code, and ought to show the bad string.
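If the exception is kept, the message could be built along these lines. This is only a sketch: the helper names and the exact wording of the message are assumptions made for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: builds an exception that names the formatting code,
// the missing character, and the offending format string.
inline std::runtime_error missing_close_brace(std::string_view fmt)
{
    std::string msg = "format: missing closing '}' in format string \"";
    msg.append(fmt.data(), fmt.size());
    msg += '"';
    return std::runtime_error(msg);
}

// Hypothetical helper: returns the position of the '}' that closes the '{'
// at index open, or throws the exception above if there is none.
inline std::size_t find_close_brace(std::string_view fmt, std::size_t open)
{
    const std::size_t close = fmt.find('}', open + 1);
    if (close == std::string_view::npos) { throw missing_close_brace(fmt); }
    return close;
}
```

For the format string "a = {", what() would then read: format: missing closing '}' in format string "a = {". Note that the full format string has to be passed along, not the shortened view that the recursive helper sees.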
You should use a single character rather than a nul-terminated character string when you only have a single character. cout << '\n'; will be more efficient and generate less code than using "\n".
All in all, it's good and simple. Something like this was an early example of the power of variadic templates and how they could replace the ostream paradigm.
For your own stuff, you don't have to make it look somewhat similar to the fmt library, but could use your own (simple) conventions for indicating replacements and how they can themselves be escaped out.
\n in the format string :)