Simple recursive Sudoku solver

Question 1

My Sudoku solver is fast enough and good with small data (4*4 and 9*9 Sudoku). But with a 16*16 board it takes too long and doesn't solve 25*25 Sudoku at all. How can I improve my program in order to solve giant Sudoku faster?

I use backtracking and recursion.

It should work with any size Sudoku by changing only the define of SIZE, so I can't make any specific bit fields or structs that only work for 9*9, for example.

#include <stdio.h>
#include <math.h>
#define SIZE 16
#define EMPTY 0
int SQRT = sqrt(SIZE);
int IsValid (int sudoku[SIZE][SIZE], int row, int col, int number);
int Solve(int sudoku[SIZE][SIZE], int row, int col);
int main() {
    int sudoku[SIZE][SIZE] = {
        {0,1,2,0,0,4,0,0,0,0,5,0,0,0,0,0},
        {0,0,0,0,0,2,0,0,0,0,0,0,0,14,0,0},
        {0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0},
        {11,0,0,0,0,0,0,0,0,0,0,16,0,0,0,0},
        {0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0},
        {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
        {0,0,0,16,0,0,0,0,0,0,2,0,0,0,0,0},
        {0,0,0,0,0,0,0,0,11,0,0,0,0,0,0,0},
        {0,0,14,0,0,0,0,0,0,4,0,0,0,0,0,0},
        {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
        {0,0,0,0,0,1,16,0,0,0,0,0,0,0,0,0},
        {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
        {0,0,0,0,0,0,0,0,0,0,14,0,0,13,0,0},
        {0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0},
        {0,0,11,0,0,0,0,0,0,0,0,0,0,0,0,0},
        {16,0,0,0,0,0,11,0,0,3,0,0,0,0,0,0},
    };
    /*
    int sudoku[SIZE][SIZE] = {
        {6,5,0,8,7,3,0,9,0},
        {0,0,3,2,5,0,0,0,8},
        {9,8,0,1,0,4,3,5,7},
        {1,0,5,0,0,0,0,0,0},
        {4,0,0,0,0,0,0,0,2},
        {0,0,0,0,0,0,5,0,3},
        {5,7,8,3,0,1,0,2,6},
        {2,0,0,0,4,8,9,0,0},
        {0,9,0,6,2,5,0,8,1}
    };*/
    if (Solve (sudoku,0,0))
    {
        for (int i=0; i<SIZE; i++)
        {
            for (int j=0; j<SIZE; j++) {
                printf("%2d ", sudoku[i][j]);
            }
            printf ("\n");
        }
    }
    else
    {
        printf ("No solution \n");
    }
    return 0;
}

int IsValid (int sudoku[SIZE][SIZE], int row, int col, int number)
{
    int prRow = SQRT*(row/SQRT);
    int prCol = SQRT*(col/SQRT);
    for (int i=0;i<SIZE;i++){
        if (sudoku[i][col] == number) return 0;
        if (sudoku[row][i] == number) return 0;
        if (sudoku[prRow + i / SQRT][prCol + i % SQRT] == number) return 0;
    }
    return 1;
}

int Solve(int sudoku[SIZE][SIZE], int row, int col)
{
    if (SIZE == row) {
        return 1;
    }
    if (sudoku[row][col]) {
        if (col == SIZE-1) {
            if (Solve (sudoku, row+1, 0)) return 1;
        } else {
            if (Solve(sudoku, row, col+1)) return 1;
        }
        return 0;
    }
    for (int number = 1; number <= SIZE; number ++)
    {
        if(IsValid(sudoku,row,col,number))
        {
            sudoku [row][col] = number;
            if (col == SIZE-1) {
                if (Solve(sudoku, row+1, 0)) return 1;
            } else {
                if (Solve(sudoku, row, col+1)) return 1;
            }
            sudoku [row][col] = EMPTY;
        }
    }
    return 0;
}

Question 2

Can you add a 9x9 and 16x16 file? It will make answering easier.

Question 3

When you added the 16 X 16 grid you left the size at 9 rather than changing it to 16. This might lead to the wrong results.

Question 4

Sudoku is NP-complete. No matter what improvements you make to your code, it will become exceptionally slow as SIZE becomes large.

Question 5

@pacmaninbw oh sorry, I only forgot to change it while I was editing my post earlier. With that part there is no problem, but thank you!

Question 6

Have you tried using a heuristic? For example, if you solve for the number that occurs most often first, you will have a smaller problem set to solve.


score 15 · Answer 1 · Oscar Smith · 2019-03-25 15:56:44Z

The first thing that will help is to switch this from a recursive algorithm to an iterative one. This will prevent the stack overflow that prevents you from solving 25x25, and will be a bit faster to boot.

However, to speed this up more, you will probably need to use a smarter algorithm. If you track which numbers are possible in each square, you will find that much of the time there is only one possibility. In that case, you know what number goes there. You can then update all of the other squares in the same row, column, or box as the one you just filled in. To implement this efficiently, you would want to define a set (either a bitset or a hash set) of what is available in each square, and use a heap to track which squares have the fewest remaining possibilities.
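
The candidate-tracking idea in this answer could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the answer: the names init_masks, solve_mrv, toggle, box_of and count_bits are hypothetical, the board is assumed to hold only values 0..SIZE, a plain linear scan stands in for the suggested heap, and the search stays recursive rather than iterative. One bit per digit fits in a uint32_t for boards up to 25x25.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SQRT 3                     /* box edge; 4 gives 16x16, 5 gives 25x25 */
#define SIZE (SQRT * SQRT)
#define ALL  ((1u << SIZE) - 1u)   /* bit d-1 set for every digit d in 1..SIZE */

/* Digits already used in each row, column and box, one bit per digit. */
static uint32_t row_used[SIZE], col_used[SIZE], box_used[SIZE];

static int box_of(int r, int c) { return (r / SQRT) * SQRT + c / SQRT; }

/* Count set bits by clearing the lowest set bit on each pass. */
static int count_bits(uint32_t x)
{
    int n = 0;
    for (; x; x &= x - 1) n++;
    return n;
}

/* Write (put = 1) or erase (put = 0) digit d at (r, c) and flip its bit
 * in the three masks; flipping twice undoes the move when backtracking. */
static void toggle(int g[SIZE][SIZE], int r, int c, int d, int put)
{
    assert(d >= 1 && d <= SIZE);
    uint32_t bit = 1u << (d - 1);
    g[r][c] = put ? d : 0;
    row_used[r] ^= bit;
    col_used[c] ^= bit;
    box_used[box_of(r, c)] ^= bit;
}

/* Rebuild the masks from the givens. Returns 0 if two givens clash. */
int init_masks(int g[SIZE][SIZE])
{
    memset(row_used, 0, sizeof row_used);
    memset(col_used, 0, sizeof col_used);
    memset(box_used, 0, sizeof box_used);
    for (int r = 0; r < SIZE; r++) {
        for (int c = 0; c < SIZE; c++) {
            int d = g[r][c];
            if (d == 0) continue;
            uint32_t bit = 1u << (d - 1);
            if ((row_used[r] | col_used[c] | box_used[box_of(r, c)]) & bit)
                return 0;
            toggle(g, r, c, d, 1);
        }
    }
    return 1;
}

/* Always fill the empty cell with the fewest candidates. Returns 1 if solved. */
int solve_mrv(int g[SIZE][SIZE])
{
    int best_r = -1, best_c = -1, best_n = SIZE + 1;
    uint32_t best_cand = 0;

    for (int r = 0; r < SIZE; r++) {
        for (int c = 0; c < SIZE; c++) {
            if (g[r][c]) continue;
            uint32_t cand = ALL & ~(row_used[r] | col_used[c]
                                    | box_used[box_of(r, c)]);
            int n = count_bits(cand);
            if (n == 0) return 0;          /* dead end: backtrack at once */
            if (n < best_n) {
                best_n = n; best_r = r; best_c = c; best_cand = cand;
            }
        }
    }
    if (best_r < 0) return 1;              /* no empty cell left: solved */

    for (int d = 1; d <= SIZE; d++) {
        if (!(best_cand & (1u << (d - 1)))) continue;
        toggle(g, best_r, best_c, d, 1);
        if (solve_mrv(g)) return 1;
        toggle(g, best_r, best_c, d, 0);   /* undo and try the next digit */
    }
    return 0;
}
```

A caller would run init_masks(sudoku) once and, if it returns 1, call solve_mrv(sudoku). Cells with a single candidate are picked first automatically because they have the smallest count, which is where most of the expected saving over the original cell-by-cell order would come from.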

Might I suggest Dancing Links as an entry point for your search into a smarter algorithm?
– WorldSEnder, 2019-03-25 19:42:51Z

The max recursion depth of this algorithm is the number of squares, I think; 25*25 is only 625. The recursion doesn't create a copy of the board in each stack frame, so it probably only uses about 32 bytes per frame on x86-64. (Solve doesn't have any locals other than its args to save across a recursive call: an 8-byte pointer and 2x 4-byte int. That plus a return address, and maintaining 16-byte stack alignment as per the ABI, probably adds up to a 32-byte stack frame on x86-64 Linux or OS X. Or maybe 48 bytes with Windows x64, where the shadow space alone takes 32 bytes.)
– Peter Cordes, 2019-03-25 23:49:34Z

Anyway, that's only 25*25*48 = 30kB (not 30kiB) of stack memory max, which is trivial (stack limits of 1MiB to 8MiB are common). Even a factor of 10 error in my reasoning isn't a problem. So it's not stack overflow, it's simply the O(SIZE^SIZE) exponential time complexity that stops SIZE=25 from running in usable time.
– Peter Cordes, 2019-03-25 23:54:50Z

Yeah, any idea why it wasn't returning for 25x25 before? Just speed?
– Oscar Smith, 2019-03-25 23:57:53Z

@OscarSmith: I'd assume just speed, yeah, that's compatible with the OP's wording. n^n grows very fast! Or maybe an unsolvable board? Anyway, the Q&A "Sudoku solutions finder using brute force and backtracking" goes into detail on your suggestion to try cells with fewer possibilities first. There are several other Q&As in the "related" sidebar that look useful.
– Peter Cordes, 2019-03-26 00:00:21Z
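
The stack estimate in the comments above is easy to reproduce. The helper below is a hypothetical illustration (stack_bound is not from the thread), and the per-frame sizes of 32 and 48 bytes are the commenter's guesses, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/* Rough upper bound on stack use for the recursive Solve(): about one
 * live frame per cell. frame_bytes is an assumed per-frame cost. */
size_t stack_bound(size_t size, size_t frame_bytes)
{
    assert(size > 0 && frame_bytes > 0);
    return size * size * frame_bytes;
}
```

With size = 25 this gives 20000 bytes at 32 bytes per frame and 30000 bytes at 48 bytes per frame, both far below a 1 MiB (1048576 byte) stack limit.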


score 9 · Answer 2 · 2019-03-25 16:00:01Z

The strategy needs work: brute-force search is going to scale very badly. As an order-of-magnitude estimate, observe that the code calls IsValid() around SIZE times for each cell - that's O(n^3), where n is the SIZE.

Be more consistent with formatting. It's easier to read (and to search) code if there's a consistent convention. To take a simple example, we have:

int IsValid (int sudoku[SIZE][SIZE], int row, int col, int number)
int Solve(int sudoku[SIZE][SIZE], int row, int col)
 if (Solve (sudoku,0,0))
 if(IsValid(sudoku,row,col,number))

all with differing amounts of space around (. This kind of inconsistency gives an impression of code that's been written in a hurry, without consideration for the reader.

Instead of defining SIZE and deriving SQRT, it's simpler to start with SQRT and define SIZE to be (SQRT * SQRT). Then there's no need for <math.h> and no risk of floating-point approximation being unfortunately truncated when it's converted to integer.
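
A minimal sketch of that arrangement, assuming a 16x16 board. The helper box_origin is a hypothetical name used only to show the constant in use; it mirrors the prRow and prCol computations in the question.

```c
#include <assert.h>

#define SQRT 4                  /* box edge: the only line to change */
#define SIZE (SQRT * SQRT)      /* board edge, a perfect square by construction */

/* First row (or column) of the box containing index i. SQRT is a
 * compile-time constant here, so no floating-point sqrt() is involved
 * and the compiler is free to avoid a runtime divide. */
int box_origin(int i)
{
    assert(i >= 0 && i < SIZE);
    return SQRT * (i / SQRT);
}
```

Switching to 9x9 or 25x25 then means setting SQRT to 3 or 5, and SIZE follows automatically.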

The declaration/definition of main() should specify that it takes no arguments:

int main(void)

If we write int main(), that declares main as a function that takes an unspecified number of arguments (unlike C++, where () is equivalent to (void)).

You can see that C compilers treat void foo(){} differently from void foo(void){} on the Godbolt compiler explorer.

Very good suggestion to make SQRT a compile-time constant. The code uses expressions like prRow + i / SQRT and i % SQRT, which compile to a runtime integer division (like x86 idiv) because int SQRT is a non-const global. It also has a non-constant initializer, so I don't think this is even valid C. Fun fact: gcc does accept it as C (doing constant-propagation through sqrt even with optimization disabled), but clang rejects it: godbolt.org/z/4jrJmL. Anyway, yes, we get a nasty idiv unless we use const int sqrt (or, better, unsigned): godbolt.org/z/NMB156
