For the given strings (not containing numbers), print their shortened versions, where each adjacent sequence of the same characters longer than 2, change to an expression consisting of a sign and a number of repetitions.
Sample input:
4
AAA
ABCDEF
CCCCCCDDDDDDD
ZZAACCCDDDDEEEEEE
Sample output:
A3
ABCDEF
C6D7
ZZAAC3D4E6
My code:
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iostream>
#include <iterator>
#include <string>
std::string reduce(std::string const& word) {
std::string result;
for (auto it = word.cbegin(); true;) {
auto curr = std::adjacent_find(it, word.cend(), std::not_equal_to<int>());
auto dist = std::distance(it, curr) + (curr != word.cend());
if (dist < 3) {
result += std::string(dist, *it);
} else {
result += *it + std::to_string(dist);
}
if (curr == word.cend()) {
break;
}
it = 1 + curr;
}
return result;
}
int main() {
std::size_t tests;
std::cin >> tests;
while (tests--) {
std::string word;
std::cin >> word;
std::cout << reduce(word) << "\n";
}
}
How could I simplify or improve this code?
-
1\$\begingroup\$ Welcome to Code Review! Please see What to do when someone answers. I have rolled back Rev 4 → 3 \$\endgroup\$Sᴀᴍ Onᴇᴌᴀ– Sᴀᴍ Onᴇᴌᴀ ♦2019年07月10日 15:16:52 +00:00Commented Jul 10, 2019 at 15:16
2 Answers 2
First impressions: nicely presented code; good use of the appropriate standard library functions and classes.
A minor suggestion would be to change the name, given that reduce
is a well-known concept in functional programming (and is a function in <numeric>
). Perhaps call it compress
?
I'd suggest extracting the constant 3
to give it a meaningful name.
Can we eliminate the break
with some reorganisation of the loop? Perhaps by using std::mismatch(it, it+1)
instead of std::adjacent_find()
? (I haven't fully thought that through; it might not be better.)
We can avoid constructing a new string here:
result += std::string(dist, *it);
by using the overload of append()
that takes two iterators:
result.append(dist, *it);
A few things might be better, but you will need to measure if they actually help. (untested code)
The most costly thing in your program (except the I/O) is properly allocation for the strings. So to avoid continuous reallocation you could try
result.reserve(word.size());
and
constexpr int LargeBuffer { 4096 };
std::string word;
word.reserve(LargeBuffer); // reuse the buffer.
while (tests--) {
std::cin >> word;
std::cout << reduce(word) << "\n"; // this call might use NRVO
}
That might still trigger one allocation per word, so a more drastic rebuild could be
std::string& reduce(std::string const& word, std::string & result)
and
constexpr int LargeBuffer { 4096 };
std::string word, result;
word.reserve(LargeBuffer); // reuse the buffer.
result.reserve(LargeBuffer);
while (tests--) {
std::cin >> word;
result.clear(); // should not dealloc.
std::cout << reduce(word, result) << "\n";
}
The strings will grow and keep their new size if the actual word is larger than expected.
The next most expensive should be the std::to_string
if (dist < 3) {
result.append(dist, *it); // from Toby's answer
} else {
result.append(*it);
if (dist < 10) {
result.append('0'+dist);
} else {
result.append(std::to_string(dist)); // hopefully we are saved here by short string optimisation
}
}
The change should work nicely for your example data, less so if the repeats randomly change between <10 and>= 10.
Explore related questions
See similar questions with these tags.