I haven't used C++ for quite a while and wanted to practice a bit, so I decided to start simple and work through the Advent of Code 2017.
This is my solution to Part 1 of the Day 1 exercise "Inverse Captcha":
The captcha requires you to review a sequence of digits (your puzzle input) and find the sum of all digits that match the next digit in the list. The list is circular, so the digit after the last digit is the first digit in the list.
For example:
1122
produces a sum of 3 (1 + 2) because the first digit (1) matches the second digit and the third digit (2) matches the fourth digit.1111
produces 4 because each digit (all 1) matches the next.1234
produces 0 because no digit matches the next.91212129
produces 9 because the only digit that matches the next one is the last digit, 9.
My program works as follows:
- Read one line from stdin
- Convert the line character-wise to a vector of decimal digits
- Calculate the sum of matching digits
- Write the result to stdout
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
// Convert the first line of the input stream to a vector of digits.
std::vector<int> read_digits(std::istream& input) {
std::string line;
std::getline(input, line);
std::vector<int> digits(line.size());
std::transform(
std::cbegin(line), std::cend(line), std::begin(digits),
[](char c) { return c - '0'; }
);
return digits;
}
// Return the sum of all digits in the list that match the next digit
// (the first digit is the "next" digit of the last digit).
int reverse_captcha(const std::vector<int>& digits) {
int sum = 0;
for (auto p = std::begin(digits); p != std::end(digits); p++) {
auto next_p =
(std::next(p) == std::end(digits)) ?
std::begin(digits) : std::next(p);
if (*p == *next_p) sum += *p;
}
return sum;
}
int main() {
std::cout << reverse_captcha(read_digits(std::cin)) << '\n';
}
I'm compiling it with clang++-4.0 -std=c++14
. I'm not sure if I actually use features of the newest standard, but I don't want to restrict myself to older versions.
Here are a few things I'm particularly curious about:
How can I separate iterating over the pairs of digits from calculating their sum if the pairs are equal? If I were using Python I would write one or two small generator functions, but I don't know yet if there is an equally simple way to do this in C++, or if this would become overkill.
Is there an elegant way to avoid the extra step of constructing a string
line
before creating the vector of digits?Using
- '0'
for convertingchar
→int
feels like a hack to me, is there really no purpose-built function for this? (I'm used toint('5')
→5
from Python)
4 Answers 4
I think std::rotate
can lead to a somewhat simpler approach:
- get the input
- create a copy
- rotate the copy by one element
- Accumulate the sum of elements that match between the two
Code might look something on this general order:
int sum_matching(std::vector<int> const &digits) {
std::vector<int> copy{digits};
std::rotate(copy.begin(), copy.begin()+1, copy.end());
int sum = 0;
for (int i=0; i<digits.size(); i++)
if (digits[i] == copy[i])
sum += digits[i];
return sum;
}
I'm pretty sure you could hijack std::inner_product
to do that final loop, but given its name, that would be misleading, so probably better avoided. On the other hand, if your compiler includes all the latest C++17 goodness, you could use std::transform_reduce
though:
int main() {
std::vector<int> inputs{1, 2, 3, 3, 1};
std::vector<int> copy{inputs};
std::rotate(copy.begin(), copy.begin()+1, copy.end());
int result = std::transform_reduce(inputs.begin(), inputs.end(),
copy.begin(),
0,
[](int a, int b) { return a + b; },
[](int a, int b) { return a==b ? a : 0; });
std::cout << result << "\n";
}
If you'd like to play with that using a (strictly prototype) implementation of transform_reduce
, it's not terribly complex--something on this order:
template<class InputIterator1,
class InputIterator2,
class T,
class BinaryOperation1,
class BinaryOperation2>
T transform_reduce(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2,
T init,
BinaryOperation1 binary_op1,
BinaryOperation2 binary_op2)
{
for ( ; first1 != last1; ++first1, ++first2) {
init = binary_op1(init, binary_op2(*first1, *first2));
}
return init;
}
Note, however, that this is a purely serial implementation--part of the intent of transform_reduce
is that it will support parallel implementation, which this doesn't (but that's mostly a restriction that binary_op1
and binary_op2
must be commutative and associative, not a requirement on the implementation). Note that the standard includes a number of overloads of std::transform_reduce
--I've included an implementation of the one used in the preceding code.
-
\$\begingroup\$ One can also hijack
std::transform
by supplying noop output iterator. It is unfortunate thatstd::accumulate
does not provide a version for two ranges, or N. \$\endgroup\$Incomputable– Incomputable2017年12月10日 18:07:05 +00:00Commented Dec 10, 2017 at 18:07 -
\$\begingroup\$ Can
rotate
also be used without making a copy, but using some sort of reference or "view" on the original container? \$\endgroup\$mkrieger1– mkrieger12017年12月10日 19:51:00 +00:00Commented Dec 10, 2017 at 19:51 -
1\$\begingroup\$ @mkrieger1: If you wanted to avoid the copy, you could use something a little like (but definitely not quite identical to) the
mod_iterator
I posted in a previous answer. With C++ 17, you can easily do without themake_mod_iterator
as well, making this approach more convenient. \$\endgroup\$Jerry Coffin– Jerry Coffin2017年12月10日 21:00:30 +00:00Commented Dec 10, 2017 at 21:00 -
\$\begingroup\$ @JerryCoffin, I believe it will violate most of the iterator category requirements, as range "should have an end". Though as output iterator into ring buffer would be nice. Btw, in my solution I went the other around, e.g. operate in place. \$\endgroup\$Incomputable– Incomputable2017年12月10日 21:06:16 +00:00Commented Dec 10, 2017 at 21:06
-
\$\begingroup\$ It’s universally known and accepted that
std::inner_product
’s name is misleading anyway. So nobody relies on its name for its semantics, and using it here wouldn’t be all that surprising. But yes, unfortunately its name is misleading. Maybe there should be a proposal to add an alias (in the<algorithm>
header). \$\endgroup\$Konrad Rudolph– Konrad Rudolph2017年12月11日 19:03:01 +00:00Commented Dec 11, 2017 at 19:03
Consider extending digits
with
digits.push_back(*digits.begin());
This way you wouldn't need to worry about a next() == end()
:
for (auto first = digits.begin(), second = next(first);
second != digits.end();
++first, ++second) {
sum += (*first == *second)? *first: 0;
}
-
\$\begingroup\$ I'm not sure if a single-digit input is supposed to be counted or not. This interpretation makes sense, though: It's its own next digit, so it matches itself. But if that's not the right interpretation, you'd have to explicitly check for the single-digit case and not count it. But anyway, +1 for this because it avoids making a second copy of the whole list, and avoids extra work inside the loop. Peeling the last iteration would probably be even more efficient, but DRY says not to do that. \$\endgroup\$Peter Cordes– Peter Cordes2017年12月11日 10:37:26 +00:00Commented Dec 11, 2017 at 10:37
CodeReview
The code does a lot of unnecessary copying. Otherwise it is pretty solid. May be actually having second iterator would be nicer. That way, the code check only the second iterator.
Alternative Solution
I actually started thinking about it, and found out that no copying around is needed. One can memorize the first and the previous char
and find the sum. The nature of the algorithm is that it performs only one pass, which is the same as pass over any sequence at all.
#include <iterator>
#include <iostream>
template <typename InputIterator>
int reverse_captcha(InputIterator first, InputIterator last)
{
if (first == last) {return {};}
char leading_digit = *first++;
if (first == last) {return {};}
char prev_digit{leading_digit};
int sum{0};
while (first != last)
{
if (*first == prev_digit)
{
sum += *first - '0';
}
prev_digit = *first++;
}
if (leading_digit == prev_digit)
{
sum += leading_digit - '0';
}
return sum;
}
int main()
{
auto res = reverse_captcha(std::istream_iterator<char>{std::cin}, {});
std::cout << res << '\n';
}
This has multiple drawbacks, but also multiple benefits.
Benefits
Relies only on performance of char retrieval. One can make it fast, or make it slow
Works on any range which support single pass
More in line with standard algorithms, thus has the potential to be well understood in terms of usage.
Drawbacks
Benefit #1 is also a drawback
More tedious implementation, which makes it easier to get wrong
Overgeneralization?
-
\$\begingroup\$ I haven't tested this, but I'm sure it won't fail in obvious cases. It does certainly compile though. \$\endgroup\$Incomputable– Incomputable2017年12月10日 20:41:00 +00:00Commented Dec 10, 2017 at 20:41
-
\$\begingroup\$ This is what I was thinking, too. Expanding each ASCII digit to an
int
, and then copying that, is just totally unnecessary. However, multiple single-bytecin>>
function calls are probably slow (and probably don't optimize away into larger reads), sogetline
is probably not a bad idea. Then looping over a buffer of characters will let the compiler auto-vectorize with SIMD packed-compare, though. (e.g. on x86, load and overlapping unaligned load, thenpcmpeqb
(packed compare) /psubb
(set1('0')
/pand
(mask off bytes that weren't equal) / horizontal sum + addpsadbw
/paddq
) \$\endgroup\$Peter Cordes– Peter Cordes2017年12月11日 09:26:06 +00:00Commented Dec 11, 2017 at 9:26 -
\$\begingroup\$ @PeterCordes, as I mentioned at the bottom of the post, this really wasn’t much about quality as it was about usage of standard library :) \$\endgroup\$Incomputable– Incomputable2017年12月11日 09:27:31 +00:00Commented Dec 11, 2017 at 9:27
-
\$\begingroup\$ But is there any way to use
istream_iterator<char>{std::cin}
that compiles efficiently? I had a look, and this doesn't: godbolt.org/g/pfggez. A function call for each character. (Which presumably has to lock/unlockcin
, not to mention all the function-call overhead in the calling code setting up args.) If not, this makes it hard to recommendstd::istream_iterator<char>
overgetline
if you know you want whole lines, and that lines aren't unreasonably long. \$\endgroup\$Peter Cordes– Peter Cordes2017年12月11日 10:13:14 +00:00Commented Dec 11, 2017 at 10:13 -
1\$\begingroup\$ @PeterCordes, I did edit it now. Hopefully I didn't miss anything from what you said. \$\endgroup\$Incomputable– Incomputable2017年12月11日 18:29:23 +00:00Commented Dec 11, 2017 at 18:29
Don't use std::transform
When in doubt, don't. This:
std::vector<int> digits(line.size());
std::transform(
std::cbegin(line), std::cend(line), std::begin(digits),
[](char c) { return c - '0'; }
Is just so much more complicated than this:
std::vector<int> digits;
digits.reserve(line.size());
for (char c : line) {
digits.push_back(c - '0');
}
Don't use non-member begin
/end
You're using standard containers. They have member begin
/end
. It's a natural thing to use and long predates the non-member functions. The non-members only make sense to be used in generic code, or on types that don't have the members. As-is, you're neither adding more functionality (w.r.t. generic code) nor adding to readability (it's longer to type, and the important part - the container - now comes last).
So this:
for (auto p = std::begin(digits); p != std::end(digits); p++) {
auto next_p =
(std::next(p) == std::end(digits)) ?
std::begin(digits) : std::next(p);
if (*p == *next_p) sum += *p;
}
Could become:
for (auto p = digits.begin(); p != digits.end(); p++) {
auto next_p =
(std::next(p) == digits.end()) ?
digits.begin() : std::next(p);
if (*p == *next_p) sum += *p;
}
Could become, since vector
's iterators are random-access:
for (auto p = digits.begin(); p != digits.end(); p++) {
auto next_p = (p + 1 == digits.end()) ? digits.begin() : p + 1;
if (*p == *next_p) sum += *p;
}
Could become, since using iterators doesn't really give you anything:
for (size_t i = 0; i < digits.size(); ++i) {
auto next_i = (i + 1 == digits.size()) ? 0 : i + 1;
if (digits[i] == digits[next_i]) sum += digits[i];
}
Could become, since we don't have to deal with modulus awkwardness:
for (size_t i = 0; i < digits.size(); ++i) {
if (digits[i] == digits[(i + 1) % digits.size()]) {
sum += digits[i];
}
}
-
1\$\begingroup\$ I somewhat agree with the point about non member begin/end. I believe their true power unleashes when ADL kicks in (given that it is written to use ADL). It seems like there is a lot of tension about this. About iterators, for me iterators always look easier to comprehend. I never got used to index based algorithms. \$\endgroup\$Incomputable– Incomputable2017年12月10日 21:56:44 +00:00Commented Dec 10, 2017 at 21:56
-
\$\begingroup\$ @Incomputable There's no tension. If you're writing generic code, you have to do
using std::begin; begin(x)
If you're not writing generic code, just do the thing that works? I don't see the point of writingstd::begin(x)
overx.begin()
. \$\endgroup\$Barry– Barry2017年12月10日 21:58:18 +00:00Commented Dec 10, 2017 at 21:58 -
\$\begingroup\$ my IDE has macros for non member version. I can read both versions with the same comprehension, so I guess it is about personal taste. \$\endgroup\$Incomputable– Incomputable2017年12月10日 22:00:11 +00:00Commented Dec 10, 2017 at 22:00
-
1\$\begingroup\$
(i+1) % digits.size()
is quite bad for performance. Compilers don't realize that it only wraps at most once, and use an actual 64-bit DIV instruction inside the loop! (See gcc+clang+ICC on godbolt.org/g/hrwzFX). This is about 30x slower thanauto next_i = (i + 1 == digits.size()) ? 0 : i + 1;
or the OP's code which compiles to a CMOV. Peeling the wrap-around condition out of the loop entirely is even better. @vnp's suggestion to push a 2nd copy of the first digit solves it cleanly without violating DRY. \$\endgroup\$Peter Cordes– Peter Cordes2017年12月11日 09:54:26 +00:00Commented Dec 11, 2017 at 9:54 -
1\$\begingroup\$ I disagree with a number of your items here. Do use the free functions, they are more than acceptable: uniform interfaces aren’t only beneficial for generic code (but don’t use
std::
in front, ADL will find them anyway). Do usestd::transform
— that’s what it’s there for (though in this particular case the added verbosity does make it less robust). Do use iterators, they are (virtually always) less error-prone than using indices. In fact, forget about indices on sequential container access altogether. The default idiom should always be to work on ranges. Indices are a low-level hack. \$\endgroup\$Konrad Rudolph– Konrad Rudolph2017年12月11日 19:34:51 +00:00Commented Dec 11, 2017 at 19:34
Explore related questions
See similar questions with these tags.
return {std::istream_iterator<char>{}, {}};
inread_digits
. If I remember correctly, constructor should not complain. \$\endgroup\$input
(orline
?) and where would I put the conversion fromchar('5')
toint(5)
? \$\endgroup\$char
never skips anything, and reads one character from input stream. \$\endgroup\$