Here is my code in Python 2.7 for a Sudoku resolver. Any advice on performance improvement, code bugs or general code style advice is appreciated.
My major idea is:
- Using method generate some random data, then using
check_invalid
to see if it is a valid Sudoku - If from 1, it is a valid Sudoku, then I will call
resolve_sudoku
to fill valid values
I find my code sometimes run a long time without completion. Are there any code bugs you can find?
import random
from collections import defaultdict
found = False
def check_invalid(matrix):
# check for each row
for row in range(len(matrix)):
cur_row = set()
for col in range(len(matrix[0])):
if matrix[row][col] == 0:
continue
elif 1 <= matrix[row][col] <= 9:
if matrix[row][col] in cur_row:
return False
else:
cur_row.add(matrix[row][col])
else:
return False # invalid number
# check each col
for col in range(len(matrix[0])):
cur_col = set()
for row in range(len(matrix)):
if matrix[row][col] == 0:
continue
elif 1 <= matrix[row][col] <= 9:
if matrix[row][col] in cur_col:
return False
else:
cur_col.add(matrix[row][col])
else:
return False # invalid number
# check each 3*3 square
for start_row in [0,3,6]:
for start_col in [0,3,6]:
cur_square = set()
for row in range(start_row, start_row+3):
for col in range(start_col, start_col + 3):
if matrix[row][col] == 0:
continue
elif 1 <= matrix[row][col] <= 9:
if matrix[row][col] not in cur_square:
cur_square.add(matrix[row][col])
else:
return False
else:
return False # invalid value
return True
def resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col):
global found
if found:
return
if cur_row == len(matrix):
found = True
for r in matrix:
print r
elif cur_col == len(matrix[0]):
resolve_sudoku(matrix, row_map, col_map, square_map, cur_row+1, 0)
elif matrix[cur_row][cur_col] != 0:
resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
else:
for val in range(1,10):
square_x = cur_row / 3
square_y = cur_col / 3
if val in row_map[cur_row] or val in col_map[cur_col] or val in square_map[(square_x, square_y)]:
continue
else:
row_map[cur_row].add(val)
col_map[cur_col].add(val)
square_map[(square_x, square_y)].add(val)
matrix[cur_row][cur_col] = val
resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
row_map[cur_row].remove(val)
col_map[cur_col].remove(val)
square_map[(square_x, square_y)].remove(val)
matrix[cur_row][cur_col] = 0
if found:
return
if __name__ == "__main__":
matrix = []
for row in range(9):
cur_row = []
for col in range(9):
if random.random() < 0.1:
cur_row.append(random.randint(1,9))
else:
cur_row.append(0)
matrix.append(cur_row)
for r in matrix:
print r
re = check_invalid(matrix)
print re
if re:
# init for row map and col map
row_map = defaultdict(set)
col_map = defaultdict(set)
square_map = defaultdict(set)
for row in range(len(matrix)):
for col in range(len(matrix[0])):
square_x = row / 3
square_y = row / 3
if matrix[row][col] != 0:
row_map[row].add(matrix[row][col])
col_map[col].add(matrix[row][col])
square_map[(row, col)].add(matrix[row][col])
resolve_sudoku(matrix, row_map, col_map, square_map, 0, 0)
-
1\$\begingroup\$ Right now, without a set debugger, I have noticed that after running your code literally 100 times, the only time I do not get any output is if it runs for 5 seconds. This must be a time out feature. As for bugs, relook at your loops to see if you can improve its speed. This 5-second timeout occurs in both my Online Console Compiler and in Sphere Engine. If I see anything that can be improved, I'll post it as an answer. \$\endgroup\$George McGinn– George McGinn2017年02月26日 05:01:02 +00:00Commented Feb 26, 2017 at 5:01
-
\$\begingroup\$ Thanks George, vote up for your advice. If you found any bugs, please let me know. BTW, I run in PyCharm with Python 2.7, it does not run to complete in some cases. \$\endgroup\$Lin Ma– Lin Ma2017年03月05日 01:34:47 +00:00Commented Mar 5, 2017 at 1:34
-
\$\begingroup\$ Noted. I have that too. Will use to work your sources. Thanks. \$\endgroup\$George McGinn– George McGinn2017年03月06日 05:29:37 +00:00Commented Mar 6, 2017 at 5:29
2 Answers 2
Let's break it down, piece by piece.
Generating a matrix
First, move the "generating a potential sudoku matrix" into a separate function. And, we can improve it by making use of list comprehensions:
def generate_sudoku_matrix():
"""Generates a 9x9 matrix with random numbers from 1 to 9, leaving 0s as unsolved."""
return [[0 if random.random() < 0.1 else random.randint(1, 9)
for _ in range(9)]
for _ in range(9)]
Note the use of _
for throw-away variables.
Validating a matrix
First of all, I would call the function check_valid()
instead of check_invalid()
since you are returning True
in case a matrix is valid. Though, a better name would probably be something like is_valid_sudoku()
.
The function itself is over-complicated and also violates the DRY principle because you basically repeat the same exact check for rows, columns and 3x3 squares.
What if instead we would define a reusable function that would validate a "block" (row, column or square):
def is_valid_block(block):
"""Returns true if a block (row, column or square) is a valid Sudoku block."""
return len(block) == 9 and sum(block) == sum(set(block))
And apply it to all the 3 cases:
from itertools import chain
def is_valid_sudoku(matrix):
"""Returns True if a Sudoku matrix is valid, False otherwise."""
# check each row
for row in matrix:
if not is_valid_block(row):
return False
# check each column
for col in zip(*matrix):
if not is_valid_block(col):
return False
# check each 3x3 square
for i in range(0, 9, 3):
for j in range(0, 9, 3):
square = list(chain(row[j:j + 3] for row in matrix[i:i + 3]))
if not is_valid_block(square):
return False
return True
We can also take it a step further and make use of generators and all()
:
def generate_blocks(matrix):
"""Generates rows, columns and 3x3 squares for a Sudoku matrix."""
for row in matrix:
yield row
for col in zip(*matrix):
yield col
for i in range(0, 9, 3):
for j in range(0, 9, 3):
yield list(chain(*(row[j:j + 3] for row in matrix[i:i + 3])))
def is_valid_sudoku(matrix):
"""Returns True if a Sudoku matrix is valid, False otherwise."""
return all(is_valid_block(block) for block in generate_blocks(matrix))
Solving Sudoku
First of all, move the initial map setup to a separate function to keep the "main" function clean and readable.
I don't particularly like the way you handle exiting the function, using the global found
variable. It makes difficult to follow the flow of the program and hurts modularity and separation of concerns.
The other place for improvements is the temporary map and matrix modifications and rollbacks. Though, I am not sure if copying the data structures to pass it to the next recursive call would be cleaner and faster.
I have never written a sudoku solver before, but here is how I would think about implementing the brute-force solution (I guess that is just a different approach and it's not related to the code review):
- define the
is_valid_solved_sudoku()
function that would validate if a matrix is a valid solved sudoku matrix. The function would be quite similar to theis_valid_sudoku()
except the way you would check each of the "blocks" - you would not allow for0
anymore - then, the
is_valid_solved_sudoku()
positive result will be our recursive base condition - then, on each recursive call, we find the position of the next unsolved cell (with
0
value) - for each row_index, col_index of the
0
cell, get the set of impossible values/excluded numbers from the corresponding row, column and square - make a recursive call for every number excluding the numbers from a set of excluded numbers we've constructed on the previous step
Code Style and other issues and suggestions:
- use pretty-printing via
pprint
instead of a regular print - unused
square_x
andsquare_y
variables in the "main" part of the script (that might be actually an error/bug, since, I was expecting you to use them to make a key for thesquare_map
dictionary) - organize imports per PEP8 (reference)
- two blank lines between the functions (reference)
-
\$\begingroup\$ Thanks for the answer alecxe, mark your reply as answer and grant bounty point. \$\endgroup\$Lin Ma– Lin Ma2017年03月12日 23:49:32 +00:00Commented Mar 12, 2017 at 23:49
What comes to performance, it is not due to your algorithm (which is very efficient), but rather a couple of bugs in it:
def resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col):
global found
if found:
return
if cur_row == len(matrix):
found = True
for r in matrix:
print r
elif cur_col == len(matrix[0]):
resolve_sudoku(matrix, row_map, col_map, square_map, cur_row+1, 0)
elif matrix[cur_row][cur_col] != 0:
resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
else:
for val in range(1,10):
# square_x = cur_row / 3 # <- bug
square_x = cur_col / 3
# square_y = cur_col / 3 # <- bug
square_y = cur_row / 3
if val in row_map[cur_row] or val in col_map[cur_col] or val in square_map[(square_x, square_y)]:
continue
else:
row_map[cur_row].add(val)
col_map[cur_col].add(val)
square_map[(square_x, square_y)].add(val)
matrix[cur_row][cur_col] = val
resolve_sudoku(matrix, row_map, col_map, square_map, cur_row, cur_col+1)
row_map[cur_row].remove(val)
col_map[cur_col].remove(val)
square_map[(square_x, square_y)].remove(val)
matrix[cur_row][cur_col] = 0
if found:
return
-
\$\begingroup\$ Nice catch coderodde, vote up! \$\endgroup\$Lin Ma– Lin Ma2017年03月12日 23:49:43 +00:00Commented Mar 12, 2017 at 23:49