Using my Sudoku Solver to generate new puzzles (part 2)

Following on from Part 1, I’ve been looking at some alternative approaches to generating new Sudoku puzzles. It turns out that having a working and effective solver is essential in generating new puzzles, because you need to know if you’ve generated a valid puzzle, meaning a puzzle with only a single solution.

The generally accepted approach to creating new puzzles is to generate a completed grid and then remove values from cells, checking at each step that you still have a valid puzzle and then stop at a point where you’ve generated a puzzle with the required difficulty. Grading a puzzle ‘easy’ through ‘difficult’ is another topic unto itself, so I’ll come back to this later.

Here’s a summary of possible (but not exhaustive) approaches to generate a completed grid:

Brute force approach with backtracking

  • fill each cell with a random number
  • check puzzle is valid
  • if puzzle is not valid, go back to previous cell and try another random number

Although this approach would work, it’s obviously not the most efficient approach. Brute force algorithms usually take advantage of the speed of today’s CPUs to offset the inefficiency of trying every combination until you get what you’re looking for.

Running a solver against an empty grid

  • run solver against empty grid
  • ensure solver is picking a random candidate column from it’s exact cover matrix (see previous posts for how this is used by the solver)

This approach is interesting as it reuses the Solver, and instead of solving a puzzle with a limited number of givens, the puzzle has zero givens. My Solver uses the same approach with Knuth’s Algorithm X to ‘solve’ an empty grid, it doesn’t make much difference whether there’s a minimum of 17 givens for a valid puzzle or zero puzzle, it will still fill the grid.

The downside of my current Solver approach though is that it will always fill the grid with the same combination of numbers due to the way it always picks the next value based on the next unmet constraint with the least number of remaining possible candidates. While this approach sames significant time solving a valid puzzle, it does mean the approach has a predictable result and doesn’t produce random filled grids. This could be updated to add some randomness, but I think there’s an easier approach.

Run solver against a partially populated grid

  • Populate row 1 in the grid with a random combination of 1 through 9
  • Run the solver against this ‘seeded’ grid

Like the previous approach, this reuses the solver. However, by ‘seeding’ the first row with a random combination of 1 through 9, this does result in different grids being generated. Without any closer investigation however, due to the predictable results from my Solver, I think at best this probably means this approach will only generate 9! combinations (362880). While this is enough puzzles to keep you busy for a least a few weeks (!), there still has to be a better approach.

Conclusions

Since I already have a working solver, I’m going to use the ‘seeded’ partially populated grid approach, run my seeder against it, then remove a number of candidates, then run a ‘difficulty grader’ against it. I’ve been working on the ‘grader’ for a while now, I’ll post an update on the approach I’m taking with that in another post.