Leetcode valid sudoku

Question 1

I'll include a solution in Python and C++ and you can review one. I'm mostly interested in reviewing the C++ code which is a thing I recently started learning; those who don't know C++ can review the Python code. Both solutions share similar logic, so the review will apply to any.

Problem statement

Determine if a 9 x 9 Sudoku board is valid. Only the filled cells need to be validated according to the following rules:

Each row must contain the digits 1-9 without repetition. Each column
must contain the digits 1-9 without repetition. Each of the nine 3 x
3 sub-boxes of the grid must contain the digits 1-9 without repetition.

Note:

A Sudoku board (partially filled) could be valid but is not necessarily solvable. Only the filled cells need to be validated according to the mentioned rules.

Example 1:

Input: board = 
[["5","3",".",".","7",".",".",".","."]
,["6",".",".","1","9","5",".",".","."]
,[".","9","8",".",".",".",".","6","."]
,["8",".",".",".","6",".",".",".","3"]
,["4",".",".","8",".","3",".",".","1"]
,["7",".",".",".","2",".",".",".","6"]
,[".","6",".",".",".",".","2","8","."]
,[".",".",".","4","1","9",".",".","5"]
,[".",".",".",".","8",".",".","7","9"]]
Output: true

Example 2:

Input: board = 
[["8","3",".",".","7",".",".",".","."]
,["6",".",".","1","9","5",".",".","."]
,[".","9","8",".",".",".",".","6","."]
,["8",".",".",".","6",".",".",".","3"]
,["4",".",".","8",".","3",".",".","1"]
,["7",".",".",".","2",".",".",".","6"]
,[".","6",".",".",".",".","2","8","."]
,[".",".",".","4","1","9",".",".","5"]
,[".",".",".",".","8",".",".","7","9"]]
Output: false
Explanation: Same as Example 1, except with the 5 in the top left corner being modified to 8. Since there are two 8's in the top left 3x3 sub-box, it is invalid.

valid_sudoku.py

def is_valid(board, empty_value='.', b_size=3):
 seen = set()
 size = b_size * b_size
 for row in range(size):
 for col in range(size):
 if (value := board[row][col]) == empty_value:
 continue
 r = f'0{row}{value}'
 c = f'1{col}{value}'
 b = f'2{row // b_size}{col // b_size}{value}'
 if r in seen or c in seen or b in seen:
 return False
 seen.update({r, c, b})
 return True
if __name__ == '__main__':
 g = [
 ["5", "3", ".", ".", "7", "5", ".", ".", "."],
 ["6", ".", ".", "1", "9", "5", ".", ".", "."],
 [".", "9", "8", ".", ".", ".", ".", "6", "."],
 ["8", ".", ".", ".", "6", ".", ".", ".", "3"],
 ["4", ".", ".", "8", ".", "3", ".", ".", "1"],
 ["7", ".", ".", ".", "2", ".", ".", ".", "6"],
 [".", "6", ".", ".", ".", ".", "2", "8", "."],
 [".", ".", ".", "4", "1", "9", ".", ".", "5"],
 [".", ".", ".", ".", "8", ".", ".", "7", "9"],
 ]
 print(is_valid(g))

Stats:

Runtime: 92 ms, faster than 81.70% of Python3 online submissions for Valid Sudoku.
Memory Usage: 14.1 MB, less than 73.95% of Python3 online submissions for Valid Sudoku.

Here's an alternative solution using numpy, it's shorter and more readable but slower:

import numpy as np
def is_valid(board, size=3, empty_value='.'):
 board = np.array(board)
 blocks = board.reshape(4 * [size]).transpose(0, 2, 1, 3).reshape(2 * [size * size])
 for grid in [board, board.T, blocks]:
 for line in grid:
 non_empty = line[line != empty_value]
 if not len(non_empty) == len(set(non_empty)):
 return False
 return True

Stats:

Runtime: 172 ms, faster than 5.19% of Python3 online submissions for Valid Sudoku.
Memory Usage: 30.2 MB, less than 11.10% of Python3 online submissions for Valid Sudoku.

valid_sudoku.h

#ifndef LEETCODE_VALID_SUDOKU_H
#define LEETCODE_VALID_SUDOKU_H
#include <string_view>
#include <unordered_set>
bool sudoku_check_update(const size_t &row, const size_t &col, const char &value,
 const int &block_size,
 std::unordered_set<std::string_view> &seen);
bool sudoku_check(const std::vector<std::vector<char>> &board,
 const char &empty_value = '.');
void test1();
#endif //LEETCODE_VALID_SUDOKU_H

valid_sudoku.cpp

#include <iostream>
#include <vector>
#include <string_view>
#include <cmath>
#include <unordered_set>
bool sudoku_check_update(const size_t &row, const size_t &col, const char &value,
 const int &block_size,
 std::unordered_set<std::string_view> &seen) {
 std::string_view r, c, b;
 r = "0-" + std::to_string(row) + value;
 c = "1-" + std::to_string(col) + value;
 b = "2-" + std::to_string(row / block_size) + std::to_string(col / block_size) +
 value;
 for (const auto &seen_id: {r, c, b}) {
 if (seen.find(seen_id) != seen.end())
 return false;
 seen.insert(seen_id);
 }
 return true;
}
bool sudoku_check(const std::vector<std::vector<char>> &board,
 const char &empty_value = '.') {
 std::unordered_set<std::string_view> seen;
 const auto row_size = board.size();
 const int block_size = std::sqrt(row_size);
 for (size_t row = 0; row < row_size; ++row) {
 for (size_t col = 0; col < row_size; ++col) {
 auto value = board[row][col];
 if (value == empty_value)
 continue;
 if (!sudoku_check_update(row, col, value, block_size, seen))
 return false;
 }
 }
 return true;
}
void test1() {
 std::vector<std::vector<char>> v = {
 {'5', '3', '.', '.', '7', '.', '.', '.', '.'},
 {'6', '.', '.', '1', '9', '5', '.', '.', '.'},
 {'.', '9', '8', '.', '.', '.', '.', '6', '.'},
 {'8', '.', '.', '.', '6', '.', '.', '.', '3'},
 {'4', '.', '.', '8', '.', '3', '.', '.', '1'},
 {'7', '.', '.', '.', '2', '.', '.', '.', '6'},
 {'.', '6', '.', '.', '.', '.', '2', '8', '.'},
 {'.', '.', '.', '4', '1', '9', '.', '.', '5'},
 {'.', '.', '.', '.', '8', '.', '.', '7', '9'}
 };
 std::cout << sudoku_check(v);
}

Stats:

Runtime: 48 ms, faster than 17.98% of C++ online submissions for Valid Sudoku.
Memory Usage: 20.4 MB, less than 22.55% of C++ online submissions for Valid Sudoku.

Question 2

Based on codereview.meta.stackexchange.com/questions/10574/… and others - though this is on-topic, for higher-quality answers we encourage separate languages to be posted in separate questions.

Question 3

@Reinderien I understand and get the point behind why you might encourage separating languages, in the dual-language posts, I tried as much as I could making both share similar logic in order for people reviewing my code to focus on the algorithm rather than language specific reviews except for c++ which I started learning recently and this is totally not the case for python. Therefore whenever 2 answers in future questions that you might consider as being not very similar in the logic / algorithm, please point them and I'll separate them if necessary.

Question 4

There are other reasons: I think it's more convenient for people that don't know one language to review the other and also these being programming exercises, they most probably won't contain more than a function or 2 max, if I separate python code, I might end up with a post with minimal to no critical points at all unless the algorithm is inefficient, which will apply for both languages. I'm avoiding to waste your time reading the same description, the same logic in 2 separate posts while most optimizations will likely apply to c++ and general performance. That's my opinion anyway.

Question 5

Consider however, two answerers who know only one of the two languages respectively. If they both give perfect answers, to whom do you award the checkmark?

Question 6

@Schism to the one with a better review overall

Question 7

Here are some suggestions for how you might be able to improve your code.

C++ version

Use all of the required `#include`s

The type std::vector<std::vector<char>> is used in the definition of sudoku_check() in the header file, but #include <vector> is missing from the list of includes there.

Minimize the interface

The .h file is a declaration of the interface to your software. The .cpp is the implementation of that interface. It is good design practice to minimize the interface to just that which is needed by outside programs. For that reason, I would remove the sudoku_check_update() and test1() functions and just use this:

#ifndef LEETCODE_VALID_SUDOKU_H
#define LEETCODE_VALID_SUDOKU_H
#include <vector>
bool sudoku_check(const std::vector<std::vector<char>> &board,
 const char &empty_value = '.');
#endif //LEETCODE_VALID_SUDOKU_H

The implementation should include the interface header

As the title of this sections states, the implementation should include the interface header. This assures that the interface and implementation match, and eliminates errors. If we do that in this case, we see that the default value for empty_value is declared twice. It should only be declared once in the header file.

Make local functions `static`

With the smaller interface as advocated above, the sudoku_check_update function becomes an implementation detail used only within the .cpp file. For that reason, it should be made static so the compiler knows that it's safe to inline the function.

The keyword static when used with a function declaration specifies that the linkage is internal. In other words it means that nothing outside of that file can access the function. This is useful for the compiler to know because, for instance, if a static function is used only once and/or is small, the compiler has the option of putting the code inline. That is, instead of the usual assembly language call...ret instructions to jump to a subroutine and return from it, the compiler can simply put the code for the function directly at that location, saving the computational cost of those instructions and helping to assure cache predictions are correct (because normally cache takes advantage of locality of reference.)

Also read about storage class specifiers to better understand what static does in other contexts and more generally declaration specifiers for explanations of constexpr and more.

Fix the bug!

The code currently uses string_view inappropriately. A std::string_view is essentially a pointer to a string that exists. But your strings are composed and deleted dynamically, so this is an invalid use of std::string_view. If you replace all instances of string_view with string, the program works.

Memory problems like this and concurrency errors are among the most difficult problems for programmers to detect and correct. As you gain more experience you will find that your ability to spot these problems and avoid them come more reflexively. There are many approaches to finding such errors. See Leak detection simple class for some of them.

Write better test functions

The bug mentioned above was easily discovered by calling the function several times with varying inputs. Perhaps you already had a more extensive array of test functions, but if not, I highly recommend creating and applying them.

Use efficient data structures

If the goal of this code is to be efficient in terms of both run time and memory, there are a lot of improvements that could be made. First, the data structure std::unordered_set<std::string_view> is not optimal. Whenever we are working on a performance optimization, it's useful to measure. So I wrote a very simple test program based on my stopwatch template. It's here:

#include "valid_sudoku.h"
#include "stopwatch.h"
#include <iostream>
#include <vector>
#include <string>
int main(int argc, char* argv[]) {
 std::vector<std::vector<char>> v = {
 {'5', '3', '.', '.', '7', '.', '.', '.', '.'},
 {'6', '.', '.', '1', '9', '5', '.', '.', '.'},
 {'.', '9', '8', '.', '.', '.', '.', '6', '.'},
 {'8', '.', '.', '.', '6', '.', '.', '.', '3'},
 {'4', '.', '.', '8', '.', '3', '.', '.', '1'},
 {'7', '.', '.', '.', '2', '.', '.', '.', '6'},
 {'.', '6', '.', '.', '.', '.', '2', '8', '.'},
 {'.', '.', '.', '4', '1', '9', '.', '.', '5'},
 {'.', '.', '.', '.', '8', '.', '.', '7', '9'}
 };
 if (argc != 2) {
 std::cout << "Usage: " << argv[0] << " num_trials\n";
 return 1;
 }
 auto iterations = std::stoul(argv[1]);
 Stopwatch<> timer{};
 bool valid{true};
 for (auto i{iterations}; i; --i) {
 valid &= sudoku_check(v);
 }
 auto elapsed{timer.stop()};
 if (!valid) {
 std::cout << "The program failed!\n";
 return 2;
 }
 std::cout << iterations << " trials took " << elapsed << " microseconds\n"
 " for an average of " << elapsed/iterations << " microseconds/trial\n";
}

When I run this on my machine with 1,000,000 trials, (with the bug noted above fixed as described) here's the result I get:

1000000 trials took 1.44351e+07 microseconds for an average of 14.4351 microseconds/trial

Now let's think of a more efficient data structure. Instead of an unordered_set, we might use a set of fixed arrays. There are nine rows, nine columns and nine subsquares. Each of those either contains a number or it doesn't. To me, that suggests we could use a object like this:

using SeenType = std::array<std::array<std::array<bool, 9>, 9>, 3>;

This contains the 3 types (rows, columns, subsquares) and within each, 9 collections of 9 bits; one bit for each number. Let's rewrite the function to use this:

static bool sudoku_check_update(std::size_t row, std::size_t col, 
 char value, SeenType &seen) {
 static constexpr std::size_t block_size{3};
 static_assert(block_size * block_size == row_size, "block_size must be the square root of row_size");
 const std::size_t block = col / block_size + block_size * (row / block_size);
 std::size_t dim{0};
 value -= '1'; // adjust from digits '1'-'9' to indices 0-8.
 for (const auto &seen_id: {row, col, block}) {
 if (seen[dim][seen_id][value])
 return false;
 seen[dim][seen_id][value] = true;
 ++dim;
 }
 return true;
}

Now re-run the program with a million trial as before:

1000000 trials took 562153 microseconds for an average of 0.562153 microseconds/trial

So that one change made things 25x faster. We could also use the fact that the dimensions are known to use a std::array<std::array<char, 9>, 9> instead of the vectors and use constexpr for those dimensions. Doing that change as well, we get this:

1000000 trials took 160808 microseconds for an average of 0.160808 microseconds/trial

So now it is 90x faster.

Prefer `{}` style initializations

You may notice that the code I write tends to use the {}-style of initialization. There are several reasons for this, including the fact that when you see it, it is always an initialization and can't be mistaken for a function call. See ES.23 for more details.

Pass values rather than references for short data types

Rather than passing const size_t &col or const char& value, it's generally better to pass those by value. This is often advantageous because the pointer is likely to be longer than the thing it's pointing to and because it allows the elimination of an indirection and memory lookup.

Move calculations from runtime to compile time where practical

It probably doesn't take up a lot of time, but this line is not as fast as it could be:

const int block_size = std::sqrt(row_size);

What this does is to convert row_size to a double, invokes the floating-point sqrt function and the converts the double back to an int. By contrast, we could just write this:

constexpr std::size_t block_size{3};

Now it takes no time at all at runtime because the value is known at compile time. It also eliminates having to pass the value and, as above, its definition can be put the only place it's actually needed, which is within the sudoku_check_update function.

Generally, we prefer to move things from runtime to compile time for three reasons:

programs are generally run more times than they are compiled, so we optimize for the more common occurrence
the sooner we detect bugs, the cheaper and easier they are to fix
it tends to make software smaller and, internally, simpler which improves loading speed, cache performance and simpler software tends to improve quality

Python version

Avoid `continue` by restructuring the loop

There's nothing intrinsically wrong with your use of the walrus operator, but there seems to be little reason not to invert the sense of the comparison and simply process the update rather than using continue. It does not affect performance, but aids the human reader of the code in understanding program flow. I tend to put early "bailout" clauses early in a function to quickly reject invalid conditions, but avoid continue in loops; ultimately it's a question of readability and style in either C++ or Python.

Use more efficient data structures

What was true in C++ also works in Python. We can use the same ideas and speed up the code by a factor of 6:

def is_valid(board, empty_value='.', b_size=3):
 size = b_size * b_size
 seen = [[(size * [False]) for _ in range(size)] for _ in range(3)]
 for row in range(size):
 for col in range(size):
 if (value := board[row][col]) != empty_value:
 block = col // b_size + b_size * (row // b_size)
 dim = 0
 value = int(value) - 1
 for seen_id in [row, col, block]:
 if seen[dim][seen_id][value]:
 return False
 seen[dim][seen_id][value] = True
 dim += 1
 return True

Question 8

Thanks, that's a very helpful review, I'll check and let you know if I have any questions. Also there is my other question here (if you have the time of course, if not, there are many points you mentioned here that would apply there as well)

Question 9

@Edward, In your python version should the formula for block use integer division? block = col // b_size + b_size * (row // b_size)

Question 10

@RootTwo: Yes, you're right. I've fixed my answer now. Thanks!

Question 11

@bullseye: Ha! You're absolutely correct. What was I saying about writing test code? :) I've fixed my answer.

Question 12

@bullseye: I've updated my answer to try to answer all of your questions.

Question 13

Minor (and Python), but I personally find this to be a little confusing:

if (value := board[row][col]) == empty_value:
 continue
r = f'0{row}{value}'
c = f'1{col}{value}'
b = f'2{row // b_size}{col // b_size}{value}'

You're using an assignment expression to assign a value, but then only use it in the false case. I think this would be much cleaner by using a plain-old assignment statement:

value = board[row][col]
if value == empty_value:
 continue
r = f'0{row}{value}'
c = f'1{col}{value}'
b = f'2{row // b_size}{col // b_size}{value}'

I don't think the line saved is worth burying the creation of a variable.

Question 14

Yeah, I might be abusing the walrus operator, it's cool I guess but maybe not necessary. How about the algorithm? Any improvements you can think of?

Question 15

I don't see anything glaring. Your use of strings for r, c and b seemed odd initially, but I guess they may actually be faster/more memory efficient than other container types. You may get a tiny boost by changing seen.update({r, c, b}) into three separate add lines to avoid the creation of the intermediate set. Whether or not that'll make a noticeable difference though, idk.

Question 16

@bullseye Actually, nvm that. On 100 million additions, it's only a difference of ~2 seconds between three adds and one update.

Question 17

they look italic to me

Question 18

I added some lang-python tags to help the markdown tool correctly format the code. See codereview.stackexchange.com/editing-help#syntax-highlighting

Question 19

C++

It's simpler, and possibly faster to pass small plain data types like size_t and char by value, not by reference. So we should have:

bool sudoku_check_update(size_t row, size_t col, char value, int block_size,
 std::unordered_set<std::string_view> &seen)
bool sudoku_check(const std::vector<std::vector<char>> &board,
 char empty_value = '.')

More importantly: std::string_view cannot be used for storing strings. It does not own the string, it is just a pointer and a size.

In doing something like this:

std::string_view r = "0-" + std::to_string(row) + value;

... we construct a temporary std::string and then assign it to a string_view. However, the temporary string goes out of scope at the end of this line!

It has passed on. This string is no more. It has ceased to be. It's expired and gone to meet its maker. This is a late string. It's a stiff. Bereft of life it rests in peace. If we hadn't nailed it to a std::string_view it would be pushing up the daisies. It has run-down the curtain and joined the choir invisible. This is an ex-string.

In other words, it's undefined behaviour to try and use that string_view. So r, c and b need to be std::strings themselves. And seen should be a std::unordered_set<std::string>.

Re. std::string_view:

std::string_view points to a range of characters in memory. These characters could be stored in a std::string, in a std::array, a std::vector or in a string-literal.

By using std::string_view we get the same interface (finding, comparison, substring creation) irrespective of what that underlying storage is. So it's useful as a common language between these types.

Since std::string_view doesn't own the characters, it does no memory allocation or copying itself. This makes it useful for things like parsing long text files - we can search and compare in substrings without doing the copying that std::string would do.

The trade-off is that we have to ensure that the lifetime of the actual string in memory is longer than that of the string_view.

Question 20

I'm not sure of the use-cases of string vs string_view I learned in previous posts / what I understood is that using string_view might be better for performance reasons but it's still unclear to me why use one and not the other and vice versa. Same goes to using pointers vs using references but of course this might be out of the scope of my question, I'm just mentioning it because you brought it up. As I'm at the start of my learning c++ curve so, I might be asking questions because I have not found actual use cases yet in which they are a requirement rather than option.

Question 21

Also same goes for value vs reference, what I understand is passing large objects like vectors / arrays(not optional in this case) and containers in general are better passed by reference to avoid making unnecessary copies or maybe I need to modify a non-local container.

Question 22

Added some details about std::string_view.

Question 23

Re. pass-by-value vs pass-by-reference: yep, that's correct. For small objects, it's simpler to pass by value. You can think of pass-by-reference as passing a pointer. If the object is the same size or smaller than a pointer, it's probably faster to just pass the object itself. In practice the speed depends on compiler settings / optimizations and machine architecture.

Question 24

Thanks, let me rephrase after your edit: string_view is better passed by a higher level function that has the actual string but the initial declaration should be a string

Edward Edward 67.2k4 gold badges120 silver badges284 bronze badges · Accepted Answer · 2020-11-18 19:14:26Z

Here are some suggestions for how you might be able to improve your code.

C++ version

Use all of the required `#include`s

The type std::vector<std::vector<char>> is used in the definition of sudoku_check() in the header file, but #include <vector> is missing from the list of includes there.

Minimize the interface

The .h file is a declaration of the interface to your software. The .cpp is the implementation of that interface. It is good design practice to minimize the interface to just that which is needed by outside programs. For that reason, I would remove the sudoku_check_update() and test1() functions and just use this:

#ifndef LEETCODE_VALID_SUDOKU_H
#define LEETCODE_VALID_SUDOKU_H
#include <vector>
bool sudoku_check(const std::vector<std::vector<char>> &board,
 const char &empty_value = '.');
#endif //LEETCODE_VALID_SUDOKU_H

The implementation should include the interface header

As the title of this sections states, the implementation should include the interface header. This assures that the interface and implementation match, and eliminates errors. If we do that in this case, we see that the default value for empty_value is declared twice. It should only be declared once in the header file.

Make local functions `static`

With the smaller interface as advocated above, the sudoku_check_update function becomes an implementation detail used only within the .cpp file. For that reason, it should be made static so the compiler knows that it's safe to inline the function.

The keyword static when used with a function declaration specifies that the linkage is internal. In other words it means that nothing outside of that file can access the function. This is useful for the compiler to know because, for instance, if a static function is used only once and/or is small, the compiler has the option of putting the code inline. That is, instead of the usual assembly language call...ret instructions to jump to a subroutine and return from it, the compiler can simply put the code for the function directly at that location, saving the computational cost of those instructions and helping to assure cache predictions are correct (because normally cache takes advantage of locality of reference.)

Also read about storage class specifiers to better understand what static does in other contexts and more generally declaration specifiers for explanations of constexpr and more.

Fix the bug!

The code currently uses string_view inappropriately. A std::string_view is essentially a pointer to a string that exists. But your strings are composed and deleted dynamically, so this is an invalid use of std::string_view. If you replace all instances of string_view with string, the program works.

Memory problems like this and concurrency errors are among the most difficult problems for programmers to detect and correct. As you gain more experience you will find that your ability to spot these problems and avoid them come more reflexively. There are many approaches to finding such errors. See Leak detection simple class for some of them.

Write better test functions

The bug mentioned above was easily discovered by calling the function several times with varying inputs. Perhaps you already had a more extensive array of test functions, but if not, I highly recommend creating and applying them.

Use efficient data structures

If the goal of this code is to be efficient in terms of both run time and memory, there are a lot of improvements that could be made. First, the data structure std::unordered_set<std::string_view> is not optimal. Whenever we are working on a performance optimization, it's useful to measure. So I wrote a very simple test program based on my stopwatch template. It's here:

#include "valid_sudoku.h"
#include "stopwatch.h"
#include <iostream>
#include <vector>
#include <string>
int main(int argc, char* argv[]) {
 std::vector<std::vector<char>> v = {
 {'5', '3', '.', '.', '7', '.', '.', '.', '.'},
 {'6', '.', '.', '1', '9', '5', '.', '.', '.'},
 {'.', '9', '8', '.', '.', '.', '.', '6', '.'},
 {'8', '.', '.', '.', '6', '.', '.', '.', '3'},
 {'4', '.', '.', '8', '.', '3', '.', '.', '1'},
 {'7', '.', '.', '.', '2', '.', '.', '.', '6'},
 {'.', '6', '.', '.', '.', '.', '2', '8', '.'},
 {'.', '.', '.', '4', '1', '9', '.', '.', '5'},
 {'.', '.', '.', '.', '8', '.', '.', '7', '9'}
 };
 if (argc != 2) {
 std::cout << "Usage: " << argv[0] << " num_trials\n";
 return 1;
 }
 auto iterations = std::stoul(argv[1]);
 Stopwatch<> timer{};
 bool valid{true};
 for (auto i{iterations}; i; --i) {
 valid &= sudoku_check(v);
 }
 auto elapsed{timer.stop()};
 if (!valid) {
 std::cout << "The program failed!\n";
 return 2;
 }
 std::cout << iterations << " trials took " << elapsed << " microseconds\n"
 " for an average of " << elapsed/iterations << " microseconds/trial\n";
}

When I run this on my machine with 1,000,000 trials, (with the bug noted above fixed as described) here's the result I get:

1000000 trials took 1.44351e+07 microseconds for an average of 14.4351 microseconds/trial

Now let's think of a more efficient data structure. Instead of an unordered_set, we might use a set of fixed arrays. There are nine rows, nine columns and nine subsquares. Each of those either contains a number or it doesn't. To me, that suggests we could use a object like this:

using SeenType = std::array<std::array<std::array<bool, 9>, 9>, 3>;

This contains the 3 types (rows, columns, subsquares) and within each, 9 collections of 9 bits; one bit for each number. Let's rewrite the function to use this:

static bool sudoku_check_update(std::size_t row, std::size_t col, 
 char value, SeenType &seen) {
 static constexpr std::size_t block_size{3};
 static_assert(block_size * block_size == row_size, "block_size must be the square root of row_size");
 const std::size_t block = col / block_size + block_size * (row / block_size);
 std::size_t dim{0};
 value -= '1'; // adjust from digits '1'-'9' to indices 0-8.
 for (const auto &seen_id: {row, col, block}) {
 if (seen[dim][seen_id][value])
 return false;
 seen[dim][seen_id][value] = true;
 ++dim;
 }
 return true;
}

Now re-run the program with a million trial as before:

1000000 trials took 562153 microseconds for an average of 0.562153 microseconds/trial

So that one change made things 25x faster. We could also use the fact that the dimensions are known to use a std::array<std::array<char, 9>, 9> instead of the vectors and use constexpr for those dimensions. Doing that change as well, we get this:

1000000 trials took 160808 microseconds for an average of 0.160808 microseconds/trial

So now it is 90x faster.

Prefer `{}` style initializations

You may notice that the code I write tends to use the {}-style of initialization. There are several reasons for this, including the fact that when you see it, it is always an initialization and can't be mistaken for a function call. See ES.23 for more details.

Pass values rather than references for short data types

Rather than passing const size_t &col or const char& value, it's generally better to pass those by value. This is often advantageous because the pointer is likely to be longer than the thing it's pointing to and because it allows the elimination of an indirection and memory lookup.

Move calculations from runtime to compile time where practical

It probably doesn't take up a lot of time, but this line is not as fast as it could be:

const int block_size = std::sqrt(row_size);

What this does is to convert row_size to a double, invokes the floating-point sqrt function and the converts the double back to an int. By contrast, we could just write this:

constexpr std::size_t block_size{3};

Now it takes no time at all at runtime because the value is known at compile time. It also eliminates having to pass the value and, as above, its definition can be put the only place it's actually needed, which is within the sudoku_check_update function.

Generally, we prefer to move things from runtime to compile time for three reasons:

programs are generally run more times than they are compiled, so we optimize for the more common occurrence
the sooner we detect bugs, the cheaper and easier they are to fix
it tends to make software smaller and, internally, simpler which improves loading speed, cache performance and simpler software tends to improve quality

Python version

Avoid `continue` by restructuring the loop

There's nothing intrinsically wrong with your use of the walrus operator, but there seems to be little reason not to invert the sense of the comparison and simply process the update rather than using continue. It does not affect performance, but aids the human reader of the code in understanding program flow. I tend to put early "bailout" clauses early in a function to quickly reject invalid conditions, but avoid continue in loops; ultimately it's a question of readability and style in either C++ or Python.

Use more efficient data structures

What was true in C++ also works in Python. We can use the same ideas and speed up the code by a factor of 6:

def is_valid(board, empty_value='.', b_size=3):
 size = b_size * b_size
 seen = [[(size * [False]) for _ in range(size)] for _ in range(3)]
 for row in range(size):
 for col in range(size):
 if (value := board[row][col]) != empty_value:
 block = col // b_size + b_size * (row // b_size)
 dim = 0
 value = int(value) - 1
 for seen_id in [row, col, block]:
 if seen[dim][seen_id][value]:
 return False
 seen[dim][seen_id][value] = True
 dim += 1
 return True

Thanks, that's a very helpful review, I'll check and let you know if I have any questions. Also there is my other question here (if you have the time of course, if not, there are many points you mentioned here that would apply there as well)
@Edward, In your python version should the formula for block use integer division? block = col // b_size + b_size * (row // b_size)
@RootTwo: Yes, you're right. I've fixed my answer now. Thanks!
@bullseye: Ha! You're absolutely correct. What was I saying about writing test code? :) I've fixed my answer.
@bullseye: I've updated my answer to try to answer all of your questions.

Stack Exchange Network

Leetcode valid sudoku

Problem statement

3 Answers 3

C++ version

Use all of the required `#include`s

Minimize the interface

The implementation should include the interface header

Make local functions `static`

Fix the bug!

Write better test functions

Use efficient data structures

Prefer `{}` style initializations

Pass values rather than references for short data types

Move calculations from runtime to compile time where practical

Python version

Avoid `continue` by restructuring the loop

Use more efficient data structures

C++

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Leetcode valid sudoku

Problem statement

3 Answers 3

C++ version

Use all of the required #includes

Minimize the interface

The implementation should include the interface header

Make local functions static

Fix the bug!

Write better test functions

Use efficient data structures

Prefer {} style initializations

Pass values rather than references for short data types

Move calculations from runtime to compile time where practical

Python version

Avoid continue by restructuring the loop

Use more efficient data structures

C++

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions

Use all of the required `#include`s

Make local functions `static`

Prefer `{}` style initializations

Avoid `continue` by restructuring the loop