This was part of a code challenge. I use them to help me learn c++/c# along with tutorials and guides. Basically I was taking a string...
The quick brown fox jumps over the lazy dog.
...and reversing the words in the string like this.
ehT kciuq nworb xof spmuj revo eht yzal .god
The following code works and compiles. However, I feel like this was a verbose way to achieve the objective.
split
This function was a copy paste from another Stack Overflow page. I'm not sure if this is idiomatic in c++, but it feels like an overkill. I just used it because it worked when I tested it on single word strings.
vector_to_string
I came up with this function because I had forgotten how to do this in c++. I recall a built in function somewhere in its library.
reverse_words
This was the provided function for the challenge. Everything inside of the {} is what I came up with.
Everything in the int main() I came up as well for testing. Test are provided, but I haven't quite learned how to implement this offline yet. Following is my code, with the test after that.
#include <string>
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
vector<string> split(string s, string delimiter)
{
size_t pos_start = 0, pos_end, delim_len = delimiter.length();
string token;
vector<string> res;
while ((pos_end = s.find(delimiter, pos_start)) != string::npos)
{
token = s.substr(pos_start, pos_end - pos_start);
pos_start = pos_end + delim_len;
res.push_back(token);
}
res.push_back(s.substr(pos_start));
return res;
}
string vector_to_string(vector<string> v)
{
string s;
for (auto &e : v)
{
reverse(e.begin(), e.end());
s += e + " ";
}
return s;
}
string reverse_words(string str)
{
vector<string> v = split(str, " ");
string s = vector_to_string(v);
return s.substr(0, s.length() - 1);
}
int main()
{
cout << reverse_words("The quick brown fox jumps over the lazy dog.") << endl;
cout << reverse_words("apple") << endl;
cout << reverse_words("a b c d") << endl;
cout << reverse_words("") << endl;
cout << reverse_words("ab ba cd") << endl;
}
I was provided with these tests:
Describe(Tests) { It(Sample_Test_Cases) { Assert::That(reverse_words("The quick brown fox jumps over the lazy dog."), Equals("ehT kciuq nworb xof spmuj revo eht yzal .god")); Assert::That(reverse_words("apple"), Equals("elppa")); Assert::That(reverse_words("a b c d"), Equals("a b c d")); Assert::That(reverse_words("double spaced words"), Equals("elbuod decaps sdrow")); Assert::That(reverse_words(""), Equals("")); Assert::That(reverse_words("ab ba cd"), Equals("ba ab dc")); } };
Anyhow, any advice on this is much appreciated and thank you for your help. Again, I feel like this is a verbose way to solve this problem and I suspect it can be greatly optimized as well.
-
2\$\begingroup\$ Which test framework is that? Making it clear could help those who want to reproduce the results. BTW, kudos for including the tests - that puts you a cut above a vast majority of Code Review questions! \$\endgroup\$Toby Speight– Toby Speight2020年02月27日 12:24:30 +00:00Commented Feb 27, 2020 at 12:24
-
\$\begingroup\$ @TobySpeight I don't know, which is part of the problem for myself in learning how to do that offline. It was what the website gives you. I assume its partially there because the rest of what is needed is hidden. \$\endgroup\$Milliorn– Milliorn2020年02月27日 21:03:26 +00:00Commented Feb 27, 2020 at 21:03
-
\$\begingroup\$ Ah, I didn't realise that you didn't write the tests yourself. I've made an edit that should clarify that. \$\endgroup\$Toby Speight– Toby Speight2020年03月02日 09:46:14 +00:00Commented Mar 2, 2020 at 9:46
3 Answers 3
The split
function
vector<string> split(string s, string delimiter) { size_t pos_start = 0, pos_end, delim_len = delimiter.length(); string token; vector<string> res; while ((pos_end = s.find(delimiter, pos_start)) != string::npos) { token = s.substr(pos_start, pos_end - pos_start); pos_start = pos_end + delim_len; res.push_back(token); } res.push_back(s.substr(pos_start)); return res; }
delim_len
should be declared as const
, because it doesn't change. Or simply drop it; it is only used in one place.
This function forces all substrings to be first copied into the token
variable and then copied into the vector — that's two copies. It can be reduced to one copy by using std::move
(defined in <utility>
) on the argument to push_back
, or by simply dropping the token
variable.
The function should have std::string_view
(defined in <string_view>
) parameters instead of std::string
parameters for efficiency.
Putting everything together:
auto split(std::string_view s, std::string_view delimiter)
{
std::vector<std::string> result;
std::size_t pos_start = 0, pos_end;
while ((pos_end = s.find(delimiter, pos_start)) != s.npos) {
res.push_back(s.substr(pos_start, pos_end - pos_start));
pos_start = pos_end + delimiter.size();
}
res.push_back(s.substr(pos_start));
return res;
}
vector_to_string
string vector_to_string(vector<string> v) { string s; for (auto &e : v) { reverse(e.begin(), e.end()); s += e + " "; } return s; }
vector_to_string
is a non-descriptive name — reverse_join
may be better. The delimiter should be an argument rather than hard coded.
This function should strip the last space instead of letting its caller do so. Note that this function performs two operations logically (reverse and join).
Copying the whole vector takes a lot of space; consider copying one string at a time.
And yes, there is a simpler way to do this using std::accumulate
(defined in <numeric>
):
auto result = std::accumulate(v.begin(), v.end(), std::string{},
[](std::string lhs, std::string_view rhs) {
lhs += rhs;
lhs += ' ';
return lhs;
});
result.pop_back();
reverse_words
string reverse_words(string str) { vector<string> v = split(str, " "); string s = vector_to_string(v); return s.substr(0, s.length() - 1); }
Again, std::string_view
parameter. Even if the stripping happens here, it should be done using pop_back
to avoid copying the string and enable named return value optimization.
Simplification
Here's a simplification, using streams to do the split job:
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <string_view>
std::streamsize read_spaces(std::istream& is)
{
std::streamsize count = 0;
while (is.peek() == ' ') {
is.get();
++count;
}
return count;
}
// const std::string& because std::istringstream is constructed from a std::string
std::string reverse_words(const std::string& string)
{
std::istringstream iss{string};
std::ostringstream oss{};
for (std::string word; iss >> word;) {
std::reverse(word.begin(), word.end());
oss << word << std::string(read_spaces(iss), ' ');
}
return oss.str();
}
int main()
{
assert(reverse_words("The quick brown fox jumps over the lazy dog.") == "ehT kciuq nworb xof spmuj revo eht yzal .god");
assert(reverse_words("apple") == "elppa");
assert(reverse_words("a b c d") == "a b c d");
assert(reverse_words("double spaced words") == "elbuod decaps sdrow");
assert(reverse_words("") == "");
assert(reverse_words("ab ba cd") == "ba ab dc");
}
Miscellaneous
#include <string> #include <iostream> #include <vector> #include <algorithm>
Sort the #include
s in alphabetical order to each navigation.
using namespace std;
Avoid global using namespace std;
. It brings in many common identifiers from the std
namespace and causes name clashes. Although it doesn't matter too much for a small program, get into the habit of explicitly qualifying names with std::
instead. An alternative is using declarations (using std::string, std::vector;
), which only pulls in the specified names. See Why is using namespace std;
considered bad practice? for more information.
-
\$\begingroup\$ Your
reverse_words()
could accept string by value, and thenstd::move()
it to the string stream's constructor. That avoids copying a temporary (e.g. in every test case here, where astd::string
is constructed from a string literal). \$\endgroup\$Toby Speight– Toby Speight2020年02月27日 13:28:56 +00:00Commented Feb 27, 2020 at 13:28 -
1\$\begingroup\$ @TobySpeight The
std::string&&
constructor ofstd::istringstream
is added in C++20 according to cppreference; the C++17 version only accepted aconst std::string&
and copied it, sostd::move
won't help. I don't think it's possible to construct directly into the stream ... Anyway, your approach is much better than mine. \$\endgroup\$L. F.– L. F.2020年02月28日 02:18:04 +00:00Commented Feb 28, 2020 at 2:18 -
\$\begingroup\$ "
delim_len
should be declared as const, because it doesn't change. Or simply drop it; it is only used in one place." <-- While, yes, it is used in 1 place, it is used in every iteration of thewhile
loop. Wouldn't it (slightly) hurt performance to just drop it? In Python, it doesn't matter, but on PHP and JavaScript it matters a lot. I think it does in C/C++ too, but I'm not sure. \$\endgroup\$Ismael Miguel– Ismael Miguel2020年02月28日 09:59:16 +00:00Commented Feb 28, 2020 at 9:59 -
\$\begingroup\$ @IsmaelMiguel libstdc++ stores the length of the string. libc++ does test
__is_long()
insize()
, so you're probably right, but I don't think the overhead will be noticeable (especially given that the implementation itself usessize()
all over the place). C is another story - different languages handle strings differently. \$\endgroup\$L. F.– L. F.2020年02月28日 10:09:56 +00:00Commented Feb 28, 2020 at 10:09 -
\$\begingroup\$ Well, if C++ stores the length, then probably it will be a micro-optimization to keep the length precalculated somewhere. \$\endgroup\$Ismael Miguel– Ismael Miguel2020年02月28日 10:19:40 +00:00Commented Feb 28, 2020 at 10:19
Test coverage
We don't have any test inputs that begin or end with a space, or that contain spaces but no letters. Also, in each test, all spaces are the same length; there are no tests with single and double spaces in the same input.
These should be tested, as they are all candidates for making an error in the algorithm.
A different approach
The code makes several copies of strings, and creates and copies a temporary vector. With judicious use of standard algorithms, we can work completely in-place, without any extra allocations.
We can use std::find_if()
and std::find_if_not()
with a suitable predicate to find start and end of each word as iterators:
--+---+---+---+---+---+---+--
. | | w | o | r | d | | .
--+---+---+---+---+---+---+--
▲さんかく ▲さんかく
start end
Then we can simply pass those iterators to std::reverse()
, which will reverse that substring.
In C++, that looks like this:
#include <algorithm>
#include <cctype>
#include <string>
std::string reverse_words(std::string s)
{
auto is_space = [](unsigned char c) { return std::isspace(c); };
auto word_start = s.begin();
while ((word_start = std::find_if_not(word_start, s.end(), is_space)) != s.end()) {
auto word_end = std::find_if(word_start, s.end(), is_space);
std::reverse(word_start, word_end);
word_start = word_end;
}
return s;
}
This way, there's no copying if the caller doesn't want to re-use the passed string (e.g. if it's passed using std::move()
), and only the single copy into the argument otherwise.
I provided the is_space()
adapter for two reasons:
- If we want to change it to recognise only spaces, or perhaps also count punctuation as word separators, then we need to change only one place in the code, rather than two.
std::isspace()
accepts a non-negative integer as its argument, so we need to convertchar
tounsigned char
before it's widened toint
.
For completeness, here's a slightly different main()
that I used when testing:
#include <cassert>
#include <iostream>
int main(int argc, char **argv)
{
if (argc <= 1) {
// no arguments: self test
assert(reverse_words("The quick brown fox jumps over the lazy dog.") == "ehT kciuq nworb xof spmuj revo eht yzal .god");
assert(reverse_words("apple") == "elppa");
assert(reverse_words("a b c d") == "a b c d");
assert(reverse_words("double spaced words") == "elbuod decaps sdrow");
assert(reverse_words("") == "");
assert(reverse_words("ab ba cd") == "ba ab dc");
assert(reverse_words(" ab ba ") == " ba ab ");
} else {
// transform and print each argument
for (int i = 1; i < argc; ++i) {
std::cout << '\'' << reverse_words(argv[i]) << "'\n";
}
}
}
The other two answers have done a good job.
I will present a slightly alternative solution.
One of the slowest things about strings is duplicating them. So where possible it is nice to re-use the same space (i.e. do it inplace). You can then use wrapper functions to do copying when you need to:
// Lets start with a generic way of reversing words in any container.
template<typename I>
void reverse_words(I begin, I end)
{
// Credit to Toby Speight
// Stolen basically verbatim.
auto is_space = [](unsigned char c) { return std::isspace(c); };
auto word_start = begin;
while ((word_start = std::find_if_not(word_start, end, is_space)) != end) {
auto word_end = std::find_if(word_start, end, is_space);
std::reverse(word_start, word_end);
word_start = word_end;
}
}
// Reverse in place for any container.
template<typename C>
void reverse_words_in_container(C& container)
{
reverse_words(std::begin(container), std::end(container));
}
// Reverse a container into another container (usually a string).
template<typename C, typename S = std::string>
S reverse_words_in_container(C const& container)
{
S result(std::begin(container), std::end(container))
reverse_words_in_container(result);
return result;
}
Then just for fun. As it is an interesting code interview question. Reverse the words in a string:
std::string reverseWordsInAString(std::string const& value)
{
// Notice we reverse the whole string here.
result(std::rbegin(container), std::rend(container))
// So now the whole string is reversed.
// We now go through and reverse each word so it is
// the correct way around.
reverse_words_in_container(result);
return result;
}