String manipulation with std::adjacent_find

Question 1

For the given strings (not containing numbers), print their shortened versions, where each adjacent sequence of the same characters longer than 2, change to an expression consisting of a sign and a number of repetitions.

Sample input:

4
AAA
ABCDEF
CCCCCCDDDDDDD
ZZAACCCDDDDEEEEEE

Sample output:

A3
ABCDEF
C6D7
ZZAAC3D4E6

My code:

#include <algorithm>
#include <cstddef>
#include <functional>
#include <iostream>
#include <iterator>
#include <string>
std::string reduce(std::string const& word) {
 std::string result;
 for (auto it = word.cbegin(); true;) {
 auto curr = std::adjacent_find(it, word.cend(), std::not_equal_to<int>());
 auto dist = std::distance(it, curr) + (curr != word.cend());
 if (dist < 3) {
 result += std::string(dist, *it);
 } else {
 result += *it + std::to_string(dist);
 }
 if (curr == word.cend()) {
 break;
 }
 it = 1 + curr;
 }
 return result;
}
int main() {
 std::size_t tests;
 std::cin >> tests;
 while (tests--) {
 std::string word;
 std::cin >> word;
 std::cout << reduce(word) << "\n";
 }
}

How could I simplify or improve this code?

Question 2

Welcome to Code Review! Please see What to do when someone answers. I have rolled back Rev 4 → 3

Question 3

First impressions: nicely presented code; good use of the appropriate standard library functions and classes.

A minor suggestion would be to change the name, given that reduce is a well-known concept in functional programming (and is a function in <numeric>). Perhaps call it compress?

I'd suggest extracting the constant 3 to give it a meaningful name.

Can we eliminate the break with some reorganisation of the loop? Perhaps by using std::mismatch(it, it+1) instead of std::adjacent_find()? (I haven't fully thought that through; it might not be better.)

We can avoid constructing a new string here:

 result += std::string(dist, *it);

by using the overload of append() that takes two iterators:

 result.append(dist, *it);

Question 4

A few things might be better, but you will need to measure if they actually help. (untested code)

The most costly thing in your program (except the I/O) is properly allocation for the strings. So to avoid continuous reallocation you could try

result.reserve(word.size());

and

constexpr int LargeBuffer { 4096 };
std::string word;
word.reserve(LargeBuffer); // reuse the buffer.
while (tests--) {
 std::cin >> word;
 std::cout << reduce(word) << "\n"; // this call might use NRVO
}

That might still trigger one allocation per word, so a more drastic rebuild could be

std::string& reduce(std::string const& word, std::string & result)

and

constexpr int LargeBuffer { 4096 };
std::string word, result;
word.reserve(LargeBuffer); // reuse the buffer.
result.reserve(LargeBuffer);
while (tests--) {
 std::cin >> word;
 result.clear(); // should not dealloc.
 std::cout << reduce(word, result) << "\n";
}

The strings will grow and keep their new size if the actual word is larger than expected.

The next most expensive should be the std::to_string

 if (dist < 3) {
 result.append(dist, *it); // from Toby's answer
 } else {
 result.append(*it);
 if (dist < 10) {
 result.append('0'+dist);
 } else {
 result.append(std::to_string(dist)); // hopefully we are saved here by short string optimisation
 }
 }

The change should work nicely for your example data, less so if the repeats randomly change between <10 and>= 10.

Toby Speight Toby Speight 87.9k14 gold badges104 silver badges325 bronze badges · Accepted Answer · 2019-07-09 15:27:38Z

First impressions: nicely presented code; good use of the appropriate standard library functions and classes.

A minor suggestion would be to change the name, given that reduce is a well-known concept in functional programming (and is a function in <numeric>). Perhaps call it compress?

I'd suggest extracting the constant 3 to give it a meaningful name.

Can we eliminate the break with some reorganisation of the loop? Perhaps by using std::mismatch(it, it+1) instead of std::adjacent_find()? (I haven't fully thought that through; it might not be better.)

We can avoid constructing a new string here:

 result += std::string(dist, *it);

by using the overload of append() that takes two iterators:

 result.append(dist, *it);

Stack Exchange Network

String manipulation with std::adjacent_find

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

String manipulation with std::adjacent_find

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions