Running aitextgen model training in a Docker container

I’m setting up a way to run text generation model training jobs on demand with aitextgen, and the first approach I’m looking at is running the training in a Docker container. Later I may move this to an AWS service like ECS, but this is my first step.

I’ve built a Docker image with the following Dockerfile:

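FROM amazonlinux
RUN yum update -y
# python3-pip is listed explicitly because some amazonlinux releases do not pull in pip with python3
RUN yum install -y python3 python3-pip
RUN pip3 install aitextgen
# COPY is preferred over ADD for plain local files
COPY source-file-for-fine-tuning.txt .
COPY generate.py .
COPY train.py .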

and then built the image with:

docker build -t aitextgen .

I then run a container, passing in the command I want to run, in this case ‘python3 train.py’:

docker run --volume /data/trained_model:/trained_model:rw -d aitextgen sh -c "cd / && python3 train.py && mv aitextgen.tokenizer.json /trained_model"

I’m also attaching a bind mount where the model output is written during the run, and passing -d to run the container in the background. The last step in the run command moves the tokenizer file to the mounted EBS volume so it can be reused for generation.
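
Because the container runs detached, it helps to keep hold of its ID so the training output can be followed. The sketch below wraps the same run command in a small shell function; the function name run_training and its argument are hypothetical, only the docker command itself comes from the step above.

```shell
# Hypothetical helper: start the training container detached.
# docker run -d prints the new container ID, which this function passes through.
run_training() {
  model_dir="$1"   # host directory bind-mounted at /trained_model
  docker run --volume "$model_dir":/trained_model:rw -d aitextgen \
    sh -c "cd / && python3 train.py && mv aitextgen.tokenizer.json /trained_model"
}

# Example usage (left commented out so that sourcing this file starts nothing):
# container_id=$(run_training /data/trained_model)
# docker logs -f "$container_id"   # follow the training output
# docker wait "$container_id"      # block until the container exits and print its exit code
```

A non-zero exit code from docker wait would suggest that either the training or the mv step failed.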

To generate text from the model, run:

docker run --volume /data/trained_model:/trained_model:rw -d aitextgen sh -c "cd / && python3 generate.py"

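Mount an EBS volume inside an EC2 instance

Additional EBS volumes that you provision and attach to an EC2 instance are not mounted automatically.

The boot volume is usually /dev/xvda, with the root partition at /dev/xvda1. Additional EBS volumes typically appear as /dev/xvdb, /dev/xvdc and so on. Run lsblk to confirm the device name, since it can differ by instance type.

First format the new volume (this erases any data already on it):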

sudo mkfs -t ext4 /dev/xvdb
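
Since mkfs wipes whatever is on the device, a guard around it can be useful when the same steps are re-run on an instance. This is a minimal sketch, assuming blkid is available on the instance; the function name format_if_blank is hypothetical.

```shell
# Hypothetical helper: format a device as ext4 only when blkid reports no existing filesystem.
format_if_blank() {
  device="$1"
  # blkid prints the filesystem type, or nothing (with a non-zero exit) if none is detected.
  fs_type=$(sudo blkid -o value -s TYPE "$device") || true
  if [ -n "$fs_type" ]; then
    echo "$device already contains $fs_type, skipping mkfs"
  else
    sudo mkfs -t ext4 "$device"
  fi
}

# Example usage:
# format_if_blank /dev/xvdb
```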

Create a mount directory such as /data (sudo mkdir /data), then mount the volume with:

sudo mount /dev/xvdb /data

Now you should see the new volume available:

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  432K  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/xvda1       20G  8.8G   12G  44% /
tmpfs           798M     0  798M   0% /run/user/1000
/dev/xvdb       7.8G   36M  7.3G   1% /data

Add a line to /etc/fstab to mount on startup:

/dev/xvdb /data ext4 defaults,nofail 0 2
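
If these steps end up in a setup script, the fstab line should only be appended once. The sketch below is one way to do that; the function name add_fstab_entry is hypothetical, and it takes the file path as an argument so it can be tried against a copy before touching /etc/fstab.

```shell
# Hypothetical helper: append an ext4 fstab entry unless the mount point is already listed.
add_fstab_entry() {
  fstab_file="$1"
  device="$2"
  mount_point="$3"
  # Look for a non-comment line whose second field is the mount point.
  if awk -v mp="$mount_point" '$1 !~ /^#/ && $2 == mp {found=1} END {exit !found}' "$fstab_file"; then
    echo "entry for $mount_point already present"
  else
    printf '%s %s ext4 defaults,nofail 0 2\n' "$device" "$mount_point" >> "$fstab_file"
  fi
}

# Example usage (run as root):
# add_fstab_entry /etc/fstab /dev/xvdb /data
# mount -a   # mounts everything in fstab, which surfaces mistakes before the next reboot
```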

These steps are from multiple places, mainly answers to this question.

AWS CloudFormation example for S3 bucket

A typical CloudFormation template for an S3 bucket with all public access blocked:

Resources:
  S3BucketExample:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: s3-bucket-name
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
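
To try the template, it can be validated and deployed with the AWS CLI. This is a sketch under a few assumptions: the template is saved to a local file, the CLI is configured with credentials, and the function name, file name and stack name below are all hypothetical.

```shell
# Hypothetical helper: validate a template file, then create or update the stack from it.
deploy_bucket_stack() {
  template_file="$1"
  stack_name="$2"
  aws cloudformation validate-template --template-body "file://$template_file" &&
    aws cloudformation deploy --template-file "$template_file" --stack-name "$stack_name"
}

# Example usage:
# deploy_bucket_stack bucket.yaml s3-bucket-example
```

S3 bucket names are globally unique, so the BucketName value needs to be changed to something not already taken before deploying.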