My Sudoku solver is fast enough on small boards (4*4 and 9*9 Sudoku). But with a 16*16 board it takes too long, and it doesn't solve 25*25 Sudoku at all. How can I improve my program so that it solves giant Sudoku faster?
I use backtracking and recursion.
It should work with any size Sudoku by changing only the define of SIZE, so I can't make any specific bit fields or structs that only work for 9*9, for example.
#include <stdio.h>
#include <math.h>
#define SIZE 16
#define EMPTY 0
int SQRT = sqrt(SIZE);
int IsValid (int sudoku[SIZE][SIZE], int row, int col, int number);
int Solve(int sudoku[SIZE][SIZE], int row, int col);
int main() {
int sudoku[SIZE][SIZE] = {
{0,1,2,0,0,4,0,0,0,0,5,0,0,0,0,0},
{0,0,0,0,0,2,0,0,0,0,0,0,0,14,0,0},
{0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0},
{11,0,0,0,0,0,0,0,0,0,0,16,0,0,0,0},
{0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,16,0,0,0,0,0,0,2,0,0,0,0,0},
{0,0,0,0,0,0,0,0,11,0,0,0,0,0,0,0},
{0,0,14,0,0,0,0,0,0,4,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,1,16,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,14,0,0,13,0,0},
{0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,11,0,0,0,0,0,0,0,0,0,0,0,0,0},
{16,0,0,0,0,0,11,0,0,3,0,0,0,0,0,0},
};
/*
int sudoku[SIZE][SIZE] = {
{6,5,0,8,7,3,0,9,0},
{0,0,3,2,5,0,0,0,8},
{9,8,0,1,0,4,3,5,7},
{1,0,5,0,0,0,0,0,0},
{4,0,0,0,0,0,0,0,2},
{0,0,0,0,0,0,5,0,3},
{5,7,8,3,0,1,0,2,6},
{2,0,0,0,4,8,9,0,0},
{0,9,0,6,2,5,0,8,1}
};*/
    if (Solve (sudoku,0,0))
    {
        for (int i=0; i<SIZE; i++)
        {
            for (int j=0; j<SIZE; j++) {
                printf("%2d ", sudoku[i][j]);
            }
            printf ("\n");
        }
    }
    else
    {
        printf ("No solution \n");
    }
    return 0;
}
int IsValid (int sudoku[SIZE][SIZE], int row, int col, int number)
{
    int prRow = SQRT*(row/SQRT);
    int prCol = SQRT*(col/SQRT);
    for (int i=0;i<SIZE;i++){
        if (sudoku[i][col] == number) return 0;
        if (sudoku[row][i] == number) return 0;
        if (sudoku[prRow + i / SQRT][prCol + i % SQRT] == number) return 0;
    }
    return 1;
}
int Solve(int sudoku[SIZE][SIZE], int row, int col)
{
    if (SIZE == row) {
        return 1;
    }
    if (sudoku[row][col]) {
        if (col == SIZE-1) {
            if (Solve (sudoku, row+1, 0)) return 1;
        } else {
            if (Solve(sudoku, row, col+1)) return 1;
        }
        return 0;
    }
    for (int number = 1; number <= SIZE; number ++)
    {
        if(IsValid(sudoku,row,col,number))
        {
            sudoku [row][col] = number;
            if (col == SIZE-1) {
                if (Solve(sudoku, row+1, 0)) return 1;
            } else {
                if (Solve(sudoku, row, col+1)) return 1;
            }
            sudoku [row][col] = EMPTY;
        }
    }
    return 0;
}
2 Answers
The first thing that will help is to switch this from a recursive algorithm to an iterative one. This will prevent the stack overflow that prevents you from solving 25x25, and will be a bit faster to boot.
However, to speed this up more, you will probably need to use a smarter algorithm. If you track which numbers are possible in each square, you will find that much of the time there is only one possibility. In that case, you know what number goes there. You can then update all of the other squares in the same row, column, or box as the one you just filled in. To implement this efficiently, you would want to define a set (either a bitset or hashset) of what is available in each square, and use a heap to track which squares have the fewest remaining possibilities.
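A minimal sketch of that idea in C, under some loud assumptions: it keeps one bitset per row, column, and box (rather than per square), it finds the square with the fewest candidates by a plain linear scan instead of a heap, and it is still recursive. The names `Candidates`, `Toggle`, `SolveFewestFirst`, and `SolveWithBitsets` are made up for illustration and are not from the question. It also defines `SIZE` from `SQRT`, and a `uint32_t` bitset limits it to boards up to 25*25.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SQRT 3                      /* box side; 3 gives a 9x9 board */
#define SIZE (SQRT * SQRT)
#define EMPTY 0
#if SIZE > 31
#error "this sketch keeps one bit per value in a uint32_t"
#endif
#define ALL ((1u << SIZE) - 1u)     /* bits 0..SIZE-1 set */

/* Bit (v - 1) is set when value v is already used in that row/column/box. */
static uint32_t usedRow[SIZE], usedCol[SIZE], usedBox[SIZE];

int BoxOf(int row, int col) { return SQRT * (row / SQRT) + col / SQRT; }

/* Values still allowed in an empty cell, as a bitset. */
uint32_t Candidates(int row, int col)
{
    return ALL & ~(usedRow[row] | usedCol[col] | usedBox[BoxOf(row, col)]);
}

/* Mark value as used; calling it again undoes the mark (XOR). */
void Toggle(int row, int col, int value)
{
    assert(value >= 1 && value <= SIZE);
    uint32_t bit = 1u << (value - 1);
    usedRow[row] ^= bit;
    usedCol[col] ^= bit;
    usedBox[BoxOf(row, col)] ^= bit;
}

int CountBits(uint32_t x)
{
    int n = 0;
    for (; x; x &= x - 1) n++;      /* clears the lowest set bit each pass */
    return n;
}

/* Backtracking search that always fills the most constrained empty cell. */
int SolveFewestFirst(int sudoku[SIZE][SIZE])
{
    int bestRow = -1, bestCol = -1, bestCount = SIZE + 1;
    uint32_t bestMask = 0;
    for (int row = 0; row < SIZE; row++) {
        for (int col = 0; col < SIZE; col++) {
            if (sudoku[row][col] != EMPTY) continue;
            uint32_t mask = Candidates(row, col);
            int count = CountBits(mask);
            if (count == 0) return 0;           /* dead end: backtrack now */
            if (count < bestCount) {
                bestRow = row; bestCol = col;
                bestCount = count; bestMask = mask;
            }
        }
    }
    if (bestRow < 0) return 1;                  /* no empty cell left */
    for (int value = 1; value <= SIZE; value++) {
        if (!(bestMask & (1u << (value - 1)))) continue;
        sudoku[bestRow][bestCol] = value;
        Toggle(bestRow, bestCol, value);
        if (SolveFewestFirst(sudoku)) return 1;
        Toggle(bestRow, bestCol, value);        /* undo and try the next value */
        sudoku[bestRow][bestCol] = EMPTY;
    }
    return 0;
}

/* Loads the givens into the bitsets, then searches. Returns 1 if solved. */
int SolveWithBitsets(int sudoku[SIZE][SIZE])
{
    memset(usedRow, 0, sizeof usedRow);
    memset(usedCol, 0, sizeof usedCol);
    memset(usedBox, 0, sizeof usedBox);
    for (int row = 0; row < SIZE; row++) {
        for (int col = 0; col < SIZE; col++) {
            int value = sudoku[row][col];
            if (value == EMPTY) continue;
            if (!(Candidates(row, col) & (1u << (value - 1))))
                return 0;                       /* clashing givens */
            Toggle(row, col, value);
        }
    }
    return SolveFewestFirst(sudoku);
}
```

Calling `SolveWithBitsets(sudoku)` from `main` in place of `Solve(sudoku,0,0)` should be enough to try it. A cell with a single candidate is picked first automatically, so forced moves are made before any guessing, and a cell with no candidates ends that branch immediately.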
- Might I suggest Dancing Links as an entry point for your search into a smarter algorithm? (WorldSEnder, Mar 25, 2019 at 19:42)
- The max recursion depth of this algorithm = number of squares, I think. 25*25 is only 625. The recursion doesn't create a copy of the board in each stack frame, so it probably only uses about 32 bytes per frame on x86-64. (Solve doesn't have any locals other than its args to save across a recursive call: an 8-byte pointer and 2x 4-byte int. That plus a return address, and maintaining 16-byte stack alignment as per the ABI, probably adds up to a 32-byte stack frame on x86-64 Linux or OS X. Or maybe 48 bytes with Windows x64 where the shadow space alone takes 32 bytes.) (Peter Cordes, Mar 25, 2019 at 23:49)
- Anyway, that's only 25*25*48 = 30kB (not 30kiB) of stack memory max, which is trivial (stack limits of 1MiB to 8MiB are common). Even a factor of 10 error in my reasoning isn't a problem. So it's not stack overflow, it's simply the O(SIZE^SIZE) exponential time complexity that stops SIZE=25 from running in usable time. (Peter Cordes, Mar 25, 2019 at 23:54)
- Yeah, any idea why it wasn't returning for 25x25 before? Just speed? (Oscar Smith, Mar 25, 2019 at 23:57)
- @OscarSmith: I'd assume just speed, yeah, that's compatible with the OP's wording. n^n grows very fast! Or maybe an unsolvable board? Anyway, "Sudoku solutions finder using brute force and backtracking" goes into detail on your suggestion to try cells with fewer possibilities first. There are several other Q&As in the "related" sidebar that look useful. (Peter Cordes, Mar 26, 2019 at 0:00)
The strategy needs work: brute-force search is going to scale very badly. As an order-of-magnitude estimate, observe that the code calls IsValid() around SIZE times for each cell - that's O(n^3), where n is the SIZE.
Be more consistent with formatting. It's easier to read (and to search) code if there's a consistent convention. To take a simple example, we have:
int IsValid (int sudoku[SIZE][SIZE], int row, int col, int number)
int Solve(int sudoku[SIZE][SIZE], int row, int col)
if (Solve (sudoku,0,0))
if(IsValid(sudoku,row,col,number))
all with differing amounts of space around (. This kind of inconsistency gives an impression of code that's been written in a hurry, without consideration for the reader.
Instead of defining SIZE and deriving SQRT, it's simpler to start with SQRT and define SIZE to be (SQRT * SQRT). Then there's no need for <math.h> and no risk of floating-point approximation being unfortunately truncated when it's converted to integer.
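As a rough illustration of that suggestion, the two defines could look like the sketch below. The helper `BoxStart` is a hypothetical name, not from the question; it only repeats the arithmetic that IsValid uses for prRow and prCol, and the `static_assert` assumes a C11 compiler.

```c
#include <assert.h>

#define SQRT 4                  /* box side: the only number to edit */
#define SIZE (SQRT * SQRT)      /* 16 here; SQRT 5 would give 25 */

static_assert(SQRT >= 2, "a board needs at least 2x2 boxes");

/* First row (or column) of the box that contains the given index.
   SQRT is a compile-time constant here, so the compiler can replace
   the division with cheaper instructions when optimizing. */
int BoxStart(int index)
{
    return SQRT * (index / SQRT);
}
```

With this layout, switching to a 25*25 board means changing `SQRT` to 5 and nothing else, and the value of `SIZE` is exact by construction.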
The declaration/definition of main() should specify that it takes no arguments:
int main(void)
If we write int main(), that declares main as a function that takes an unspecified number of arguments (unlike C++, where () is equivalent to (void)).
You can see that C compilers treat void foo(){} differently from void foo(void){} on the Godbolt compiler explorer.
- Very good suggestion to make SQRT a compile-time constant. The code uses stuff like prRow + i / SQRT and i % SQRT, which will compile to a runtime integer division (like x86 idiv) because int SQRT is a non-const global! And with a non-constant initializer, so I don't think this is even valid C. But fun fact: gcc does accept it as C (doing constant-propagation through sqrt even with optimization disabled). But clang rejects it. godbolt.org/z/4jrJmL. Anyway yes, we get nasty idiv unless we use const int sqrt (or better unsigned) godbolt.org/z/NMB156 (Peter Cordes, Mar 25, 2019 at 20:22)
SIZE becomes large.