Yet another minesweeper field calculator

Question 1

The task is to read in a data file containing a representation of a rectangular array with '*' signifying a mine and '.' signifying the absence of one. Each such array is preceded by a row and column count. The file is terminated with an entry that has zero for both the row and column counts. The expected output counts each frame (starting from 1) and emits a modified representation of the input that substitutes a digit for each '.' in the input. The digit is the number of mines next to that space.

As with C++ Mine Sweeper Attempt I did a review and decided to create a version as well. On my machine with a 25,000 line input file mostly containing \99ドル \times 100\$ fields, the original took an average of 0.140 seconds while this version took an average of 0.025 seconds (less than 20% of the time).

Internally, it keeps three pointers into a common buffer area. The pointers are for the currentRow and also for the aboveRow and belowRow. It reads data into the currentRow calling the placeMine function when a mine has been read. As each mine is read, all rows are appropriately updated using the increment function.

No attempt is made at error checking or recovery, so if the user sets the maximum width (given here as a template parameter) to say, 50, but the input has rows much larger, the program will likely segfault and crash. In the same vein, the input file is expected to be formatted perfectly.

mines.cpp

#include <algorithm>
#include <iostream>
#include <array>
template <std::size_t maxWidth = 100>
class MineField {
 static constexpr char mine{'*'};
 static constexpr int maxSize{maxWidth + 2};
 int rows = 0;
 int cols = 0;
 std::array<char, maxSize * 3> buffer;
 char *aboveRow = &buffer[0];
 char *currentRow = &buffer[maxSize];
 char *belowRow = &buffer[maxSize * 2];
 void increment(char& location) {
 if (location != mine)
 ++location;
 }
 void showrow(std::ostream& out) const {
 out.write(&aboveRow[1], cols);
 out << '\n';
 }
 // place the mine at the current row and adjust neighboring counts
 void placeMine(int i) {
 currentRow[i+1] = mine;
 increment(currentRow[i]);
 increment(currentRow[i+2]);
 increment(aboveRow[i]);
 increment(aboveRow[i+1]);
 increment(aboveRow[i+2]);
 increment(belowRow[i]);
 increment(belowRow[i+1]);
 increment(belowRow[i+2]);
 }
 // shift the rows up and fill the bottom row with zeroes
 void shift() {
 std::swap(aboveRow, currentRow);
 std::swap(belowRow, currentRow);
 std::fill_n(belowRow, cols, '0');
 }
public:
 void process(std::istream& in, std::ostream& out) {
 in >> rows >> cols;
 for (int fieldnum{0}; rows && cols; ) {
 buffer.fill('0');
 if (fieldnum) {
 out << '\n';
 }
 out << "Field #" << ++fieldnum << ":\n";
 for (int j{0}; j < rows; ++j) {
 for (int i{0}; i < cols; ++i) {
 char ch;
 in >> ch;
 if (ch == mine) {
 placeMine(i);
 }
 }
 if (j) {
 showrow(out);
 }
 shift();
 }
 showrow(out);
 in >> rows >> cols;
 }
 }
};
int main() {
 std::ios_base::sync_with_stdio(false);
 std::cin.tie(nullptr);
 MineField<100> field;
 field.process(std::cin, std::cout);
}

Example input

4 4
*.**
....
.*..
...*
3 6
***...
*.*..*
***...
0 0

Example output

Field #1:
*2**
2332
1*23
112*
Field #2:
***211
*8*31*
***211
```

Question 2

What is the processor, memory size and OS on your computer?

Question 3

This computer has an Intel i7 running at 3.4GHz, with 16 RAM running 64-bit Linux (Fedora 34). Compiler is gcc 11.2 with -O2 optimization.

Question 4

That should be 16GiB.

Question 5

@Edward you can go back and edit your comments, FYI.

Question 6

You can edit comments, but only within 5 minutes of posting.

Question 7

Consider using an array of rows

The idea to use pointers into buffer is a good one, as swapping those pointers around is quite fast. However, instead of using three distinct variables to hold those pointers, consider storing them in an array:

std::array<char *, 3> rows = {&buffer[0], &buffer[maxSize], &buffer[maxSize * 2]};

Now you can do more with loops, for example placeMine() could become:

void placeMine(int i) {
 for (int y: {0, 1, 2})
 for (int x: {0, 1, 2})
 increment(rows[y][i + x]);
 rows[1][i + 1] = mine;
}

And let the compiler worry about unrolling that loop. The two swaps in shift() could be replaced by a std::rotate():

std::rotate(rows.begin(), ++rows.begin(), rows.end());

Possible optimizations

In showrow(), I would set the last character of the row to '\n', so you can print the line in a single call.

Instead of checking if fieldnum is non-zero to print an empty line separating the fields, I would unconditionally print a newline at the end of the outer for-loop in process().

It might be possible to vectorize this. For example, if your CPU has 128 bit vectors, each one can hold 16 characters. So read in 16 characters into a temporary 16 x 8 bit vector register, then use some bit operations to ensure every * is converted to a 1, and every . to a 0. Then in placeMine() you basically do what you are already doing, but now with 16 positions at once.

Question 8

I particularly like the array idea. It didn't speed things up in my testing, but I think it makes the code nicer, so that's a win.

Question 9

Unless the width if the fields is very large, everything will fit into the L1 data cache, so apart from vectorization I don't see how to improve it. Even with vectorization, it might be that I/O is already dominating the performance.

Question 10

The style in C++ is to put the * or & with the type, not the identifier. This is called out specifically near the beginning of Stroustrup’s first book, and is an intentional difference from C style.

I don't like how
in >> rows >> cols;
needs to be repeated at the top and bottom of the loop. I think you should consider a mid-decision loop as the most straightforward solution:

for (;;) {
 read values
 if (value meets exit condition) break;
 body of loop
 body statements...
};

Or, you can make the reading of the bounds a separate function call, and cram it into the while condition:

while (read_bounds(rows,cols)) {

Question 11

I thought about writing for (in >> rows >> cols; rows && cols; in >> rows >> cols). What do you think of that variation?

Question 12

@Edward hard to follow when cramming everything into the for statement, and you still have to state it twice! Use a separate function call as I showed: it's readable and only describes the reading once (in the function). The for loop does not match the semantics thus the need to restate the iteration step as the initial step. The semantics are logically a while as there is only one kind of thing, not different "set up the first time" and "advance to the next".

Question 13

I would include in the condition: while (in >> rows >> cols && rows && cols) (even though the usual input error checking is omitted elsewhere).

G. Sliepen G. Sliepen 68.7k3 gold badges74 silver badges179 bronze badges · Accepted Answer · 2021-09-30 17:48:51Z

Consider using an array of rows

The idea to use pointers into buffer is a good one, as swapping those pointers around is quite fast. However, instead of using three distinct variables to hold those pointers, consider storing them in an array:

std::array<char *, 3> rows = {&buffer[0], &buffer[maxSize], &buffer[maxSize * 2]};

Now you can do more with loops, for example placeMine() could become:

void placeMine(int i) {
 for (int y: {0, 1, 2})
 for (int x: {0, 1, 2})
 increment(rows[y][i + x]);
 rows[1][i + 1] = mine;
}

And let the compiler worry about unrolling that loop. The two swaps in shift() could be replaced by a std::rotate():

std::rotate(rows.begin(), ++rows.begin(), rows.end());

Possible optimizations

In showrow(), I would set the last character of the row to '\n', so you can print the line in a single call.

Instead of checking if fieldnum is non-zero to print an empty line separating the fields, I would unconditionally print a newline at the end of the outer for-loop in process().

It might be possible to vectorize this. For example, if your CPU has 128 bit vectors, each one can hold 16 characters. So read in 16 characters into a temporary 16 x 8 bit vector register, then use some bit operations to ensure every * is converted to a 1, and every . to a 0. Then in placeMine() you basically do what you are already doing, but now with 16 positions at once.

I particularly like the array idea. It didn't speed things up in my testing, but I think it makes the code nicer, so that's a win.
Unless the width if the fields is very large, everything will fit into the L1 data cache, so apart from vectorization I don't see how to improve it. Even with vectorization, it might be that I/O is already dominating the performance.

Stack Exchange Network

Yet another minesweeper field calculator

mines.cpp

Example input

Example output

2 Answers 2

Consider using an array of rows

Possible optimizations

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Yet another minesweeper field calculator

mines.cpp

Example input

Example output

2 Answers 2

Consider using an array of rows

Possible optimizations

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions