Grading the difficulty of a Sudoku puzzle

I’ve made a couple of previous attempts at developing Sudoku solvers, one naive approach which didn’t work too well, and a second, more informed attempt implementing Donald Knuth’s Algorithm X, which works incredibly well.

I thought I would turn this around and instead of building a solver, attempt to build a puzzle generator. As I already have a working solver (I’ve also build a React frontend for the Algorithm X solver that runs as an AWS Lambda), checking if I have a valid puzzle (one with only a single solution) is easy as I can reuse my existing solver. The hard part it turns out to be how you grade the difficulty of a puzzle, i.e. is a puzzle easy, medium or hard? This turns out to be a subjective point of view, based more personal opinion rather than any established levels of measuring difficulty. It’s interesting to note that apparently unlike a solver, there are no established mathematical approaches for grading a Sudoku puzzle.

Ok, so if it’s subjective, how to we write an app to automate the grading? First, the degree of how easy or complex a puzzle is determined using the same approaches to solve the puzzle as a human attempting to solve a puzzle does. The range and number of different approaches needed to solve a puzzle can be used to determine the relative difficulty. This is where the subjective nature comes in.

It seems to be commonly agreed that a puzzle rated as ‘simple’ can be solved solely by identifying ‘naked singles’ and/or ‘hidden singles.’ I’m not going to define all of the solving techniques as they’re described in many places online already, but for reference for the first couple see here:

Next for ‘medium’ difficulty, if you need to use naked pairs and hidden pairs, this puzzle is commonly considered medium difficulty.

Similarly for hard, if you need to used naked triples and/or hidden triples.

At this point more complex techniques such as x-wing and swordfish seem to be used to determine very hard or expert level, but there seems to be some variation on what level of difficulty require these techniques. There’s also many other techniques, some are listed here.

Grading Puzzle Examples

Given that there’s no standard approach to grade a puzzle, it’s probably going to be common that if you take an example puzzle from any source, the approaches used to grade that puzzle might be different from the rules you would apply, and therefore there’s likely to be some variation in difficulty of puzzles.

Let’s take a couple of example puzzles from sources online and run them through my grader so far so see if we’re on the right track.

Here’s an ‘easy’ puzzle example from https://www.websudoku.com :

Running my grader against this puzzle, I get:

Puzzle solved: Yes
Initial givens: 35
Passes through grid: 7
Naked singles found: 42
Hidden singles found: 0

This puzzle can be solved only by finding naked singles, therefore by my grading system, this is rated as ‘easy’ difficulty.

This next example needs additional approaches to solve, although it needs two approaches that would still be considered at the easy level, naked singles and also hidden singles:

This is a hard example from SukokuWiki:

Output from solver:

Puzzle solved: Yes
Initial givens: 44
Passes through grid: 35
Naked singles found: 6
Hidden singles found: 30
Naked pairs found: 17

Here’s an example where the source of this puzzle rates this one as difficult, but applying my ranking criteria my approach would classify this one as a medium (since it requires more than finding just singles, it requires finding pairs, but not any other techniques other than pairs).

Summary

Writing a solver following recognized algorithms like Dancing Links and Algorithm X turned out to be much simpler than developing a human ranker. For the ranker it took significant time to get the approaches implemented so they were working correctly and to the point where they could actually solve a puzzle. As this was a personal time project and I was only working on this a couple of hours a week, it took the majority of 2020 to get this complete and working. By comparison the Algorithm X solver I was able to complete in a couple of hours a week over about 2 weeks.

Completing this grader was one step towards implementing a puzzle generator. This is now also complete and running a couple of times a day, generating new puzzles which you can load via my React frontend here.