I’m setting up an approach to run text generation model training jobs on demand with aitextgen, and the first approach I’m looking at is to run the training in a Docker container. Later I may move this to an AWS service like ECS, but this is my first step.
I’ve built a Docker image with the following dockerfile:
RUN yum update -y
RUN yum install -y python3
RUN pip3 install aitextgen
ADD source-file-for-fine-tuning.txt .
ADD generate.py .
ADD train.py .
.. and then built my image with:
docker build -t aitextgen .
I then run a container passing in the cmd I want to run, in this case ‘python3 train.py’:
docker run --volume /data/trained_model:/trained_model:rw -d aitextgen sh -c "cd / && python3 train.py && mv aitextgen.tokenizer.json /trained_model"
I’m also attaching a bind point where the model output is being written to during the run, and -d to run the container in the background. The last step in the run command copies the token file to the mounted EBS volume so it can be reused by the generation.
To generate text from the model, run:
docker run --volume /data/trained_model:/trained_model:rw -d aitextgen sh -c "cd / && python3 generate.py"
I thought I would turn this around and instead of building a solver, attempt to build a puzzle generator. As I already have a working solver (I’ve also build a React frontend for the Algorithm X solver that runs as an AWS Lambda), checking if I have a valid puzzle (one with only a single solution) is easy as I can reuse my existing solver. The hard part it turns out to be how you grade the difficulty of a puzzle, i.e. is a puzzle easy, medium or hard? This turns out to be a subjective point of view, based more personal opinion rather than any established levels of measuring difficulty. It’s interesting to note that apparently unlike a solver, there are no established mathematical approaches for grading a Sudoku puzzle.
Ok, so if it’s subjective, how to we write an app to automate the grading? First, the degree of how easy or complex a puzzle is determined using the same approaches to solve the puzzle as a human attempting to solve a puzzle does. The range and number of different approaches needed to solve a puzzle can be used to determine the relative difficulty. This is where the subjective nature comes in.
It seems to be commonly agreed that a puzzle rated as ‘simple’ can be solved solely by identifying ‘naked singles’ and/or ‘hidden singles.’ I’m not going to define all of the solving techniques as they’re described in many places online already, but for reference for the first couple see here:
Next for ‘medium’ difficulty, if you need to use naked pairs and hidden pairs, this puzzle is commonly considered medium difficulty.
Similarly for hard, if you need to used naked triples and/or hidden triples.
At this point more complex techniques such as x-wing and swordfish seem to be used to determine very hard or expert level, but there seems to be some variation on what level of difficulty require these techniques. There’s also many other techniques, some are listed here.
Grading Puzzle Examples
Given that there’s no standard approach to grade a puzzle, it’s probably going to be common that if you take an example puzzle from any source, the approaches used to grade that puzzle might be different from the rules you would apply, and therefore there’s likely to be some variation in difficulty of puzzles.
Let’s take a couple of example puzzles from sources online and run them through my grader so far so see if we’re on the right track.
Here’s an example where the source of this puzzle rates this one as difficult, but applying my ranking criteria my approach would classify this one as a medium (since it requires more than finding just singles, it requires finding pairs, but not any other techniques other than pairs).
Writing a solver following recognized algorithms like Dancing Links and Algorithm X turned out to be much simpler than developing a human ranker. For the ranker it took significant time to get the approaches implemented so they were working correctly and to the point where they could actually solve a puzzle. As this was a personal time project and I was only working on this a couple of hours a week, it took the majority of 2020 to get this complete and working. By comparison the Algorithm X solver I was able to complete in a couple of hours a week over about 2 weeks.
Completing this grader was one step towards implementing a puzzle generator. This is now also complete and running a couple of times a day, generating new puzzles which you can load via my React frontend here.