As a small exercise I have written some string formatting and printing functions in C++11. I would like a bit of code review, not so much on the merits of using this over something like std::stringstream
or boost::format
, but simply whether or not my implementation makes sense. I realize that simply performing string replacements isn't that robust and a more sophisticated version would actually parse the format string. I especially would like to know if my use of things like rvalue references, std::forward
, std::move
and variadic templates are correct, as these are the main features I was focusing on in this exercise.
#include <boost/algorithm/string/replace.hpp>
#include <boost/lexical_cast.hpp>
#include <iostream>
#include <string>
#include <utility>
#include <cstdio>
namespace detail
{
template <unsigned N, class T>
void format_impl(std::string& str, T&& item)
{
char find[8];
snprintf(find, 8, "{%d}", N);
auto replace = boost::lexical_cast<std::string>(std::forward<T>(item));
boost::replace_all(str, find, replace);
}
template <unsigned N, class Head, class... Tail>
void format_impl(std::string& str, Head&& head, Tail&&... tail)
{
detail::format_impl<N>(str, std::forward<Head>(head));
detail::format_impl<N+1>(str, std::forward<Tail>(tail)...);
}
}
template <class... Args>
std::string format(const std::string& fmt, Args&&... args)
{
std::string temp = fmt;
detail::format_impl<0>(temp, std::forward<Args>(args)...);
return temp;
}
template <class... Args>
std::string format(std::string&& fmt, Args&&... args)
{
std::string temp = std::move(fmt);
detail::format_impl<0>(temp, std::forward<Args>(args)...);
return temp;
}
template <class... Args>
void print(const std::string& fmt, Args&&... args)
{
std::cout << format(fmt, std::forward<Args>(args)...);
}
template <class... Args>
void print(std::string&& fmt, Args&&... args)
{
std::cout << format(std::forward<std::string>(fmt), std::forward<Args>(args)...);
}
int main()
{
print("Hello, {0}! The answer is {1}.\n", "World", 42);
}
1 Answer 1
As you mention, the simple replacing is susceptible to problems if a replacement string includes a format specifier. But setting that aside, there are still a few things I would rather do differently.
Document. You should be sure to document your formatting language in comments; without reading the function, or the example in main
, it's impossible to know that the formatting language uses curly braces around indices starting at zero, whether you can repeat or skip indices, or what happens if you provide an out-of-bounds index.
Avoid C functions? This may be personal style, but I would prefer to avoid snprintf
in format_impl
. I would instead probably use something like "{" + std::to_string(N) + "}"
. The resulting local could then be marked const
, or the expression could go directly in the call to boost::replace_all
.
Remove redundancy. Jeffrey's comment about std::string&
and std::string&&
overloads is half right. It should be relatively rare that you need to provide both overloads. In places that are doing template type deduction on parameter T
, T&&
can be deduced as either type. Here there is no deduction so that cannot happen. (That's the part of the comment that was half wrong - it didn't apply here.)
Instead we have to look at how fmt
is being used. In format(const std::string&, ...)
this immediately makes a mutable copy called temp
. In format(std::string&&, ...)
, this moves the local to another mutable copy called temp
. From there, both functions are identical. I assert that there should only be one function here.
What function? Well, you want to modify the parameter, so it needs to be mutable. If the parameter is not mutable, you'll have to make a copy. So that rules out a const std::string&
. And as your example shows, you can't always convert to a std::string&&
. So what's the right option? Plan vanilla std::string
. If you call it with a const
, the caller makes a copy; the function receives a copy that it is free to mutate. If you call it with an rvalue, the caller moves it and the function receives a copy that it is free to mutate. So the result?
template <class... Args>
std::string format(std::string fmt, Args&&... args)
{
detail::format_impl<0>(fmt, std::forward<Args>(args)...);
return fmt;
}
The same argument applies to print
very similarly. The only question is if you'll want to explicitly mark the string it passes to format with a std::move
, or trust your compiler to figure it out. (In this case it's simple enough I'd probably leave it out. But it does serve as implicit documentation to a future reader, so I'll show it here.)
template <class... Args>
void print(std::string fmt, Args&&... args)
{
std::cout << format(std::move(fmt), std::forward<Args>(args)...);
}
Brush up on r-value references: Note that your implementation of print(std::string&& fmt, ...)
was incorrect. In particular, one of the two uses of std::forward
is misleading. Per Scott Meyers, it's useful to think of std::forward
as a “conditional” std::move
. It's useful in conjunction with the deduced T&&
parameters, which he calls “universal references.” You use this correctly in the code phrase std::forward<Args>(args)...
, which moves each movable sub-args
, determined by the type of the corresponding Args
, as deduced from the call site.
However the similar looking phrase std::forward<std::string>(fmt)
is not conditional at all. It will never move fmt
, as it is always casting fmt
to an lvalue type. If you were to implement this function (which I recommended against in the previous point), you would want std::forward<std::string&&>(fmt)
which is really just a long and hard to read way of writing std::move(fmt)
. This is correct because print
's fmt
is an l-value that was moved in, and now you want to move it onward.
The upshot This was a good exercise that ended up focusing a lot on std::move
, std::forward
, and r-value references in general. There don't appear to have been any functionality problems in your implementation, although it's always good to make sure with more than one test case. If you expand on this function's robustness and capability set (say fixing the limitations from the replace_all
approach, or adding a way to escape braced indices so that they aren't substituted) that could make another interesting review. Thanks!
-
\$\begingroup\$ Thanks a lot for the review! It seems like in the case of
print
pass-by-value is not going to be the most optimal in all scenarios. In the case of passing astd::string
lvalue forfmt
you will wind up performing a copy and a move. If performance in this case was critical, would it not be best to have aprint(const string&, ...)
and aprint(string&&, ...)
? That way you will only perform a move in the case of rvalue and only a copy in the case of lvalue. \$\endgroup\$Chris_F– Chris_F2014年07月07日 04:45:08 +00:00Commented Jul 7, 2014 at 4:45 -
\$\begingroup\$ Also, in the case of
format
, I can see why pass-by-value is preferable. However, I wonder if you have any opinions on allowing implementation details like this to dictate the function's interface. \$\endgroup\$Chris_F– Chris_F2014年07月07日 04:57:36 +00:00Commented Jul 7, 2014 at 4:57 -
\$\begingroup\$ @Chris_F For caller to
print
toformat
, in my answer there's either a copy or a move, followed by a move. Your proposal could turn copy+move into ref+copy, and wouldn't change move+move. Saving a single (and likely inlined) move seems like tiny savings, but profile it to be certain. As far as interfaces go, despite the two prototypes there's only one meaningful interface to learn, so it's only the implementation savings that's important. (As a general guideline, though, prefer the correct and readable interface unless you're dealing with a true bottleneck.) \$\endgroup\$Michael Urman– Michael Urman2014年07月07日 12:27:21 +00:00Commented Jul 7, 2014 at 12:27
const T&
andT&&
all over the place?T&&
implicitly allowsT&
andconst T&
due to reference collapsing. \$\endgroup\$