Using AWS SageMaker to train a model to generate text (part 3): What I learned about Python AWS Lambdas this week

This is a follow-on from my investigation into using AWS SageMaker as a replacement for my current approach to generating text from a Machine Learning model. Up until this point I’ve been running torch-rnn on a local server. You can follow part 1 and part 2 of my progress so far.

In summary, here’s what I learned this week:

  • Some Python modules are OS platform specific. That means if you install a module on MacOS, you can’t zip it up as a dependency in a .zip deployment package for a Lambda that runs on Linux, because it won’t be compatible with the Lambda’s OS
  • The maximum size for an AWS Lambda deployment package (the zip you upload directly) is 50MB. Zipping up what I’ve built so far (only a minimal script, but relying on a number of modules) gives a 500MB zip file. Clearly that’s too large to deploy as a Lambda
  • Following some suggestions here, there are Python frameworks (such as Zappa) to help build Python-based AWS Lambdas and address some of these module and deployment issues. Clearly I’ve got some learning to do here to get this to work 🙂
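
To make the end goal concrete, here’s a minimal sketch of the kind of Python Lambda handler I’m working towards. The generate_text function and the model.pt file name are placeholders I’ve made up for illustration; wiring in the real model inference is exactly what the packaging issues above are blocking.

    import json

    def generate_text(model_path, length):
        # placeholder: the real implementation would load the trained model
        # and sample characters from it
        return "example generated text"

    def lambda_handler(event, context):
        # AWS invokes this entry point; event and context are supplied by Lambda
        text = generate_text(model_path="model.pt", length=140)
        return {
            "statusCode": 200,
            "body": json.dumps({"generated": text})
        }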

Using AWS SageMaker to train a model to generate text (part 2)

This is part 2 following on from my previous post, investigating how to take advantage of AWS SageMaker to train a model and use it to generate text for my Twitter bot, @kevinhookebot.

The AWS SageMaker docs mention that, to get data into a supported format for training a model, “A script to convert data from tokenized text files to the protobuf format is included in the seq2seq example notebook”.

Ok, so let’s start up the SageMaker Notebook I created in part 1 via the AWS console:

Once it’s started, clicking the ‘Open’ link opens the Jupyter notebook, and from there we can open the seq2seq example, which is in the ‘SageMaker Examples’ section:

From looking at the steps in this example Notebook, it’s clear that this seq2seq algorithm is more focused on translating text from a source to a target (such as translating text in one language to another, as shown in this example notebook).

Ok, so this isn’t what I was looking for; let’s change gears. My main objective is to be able to train a new model using the AWS SageMaker service and generate text from it. From what I understand so far, you have two options for how you can use SageMaker: you can either use the AWS Console for SageMaker to create Training Jobs using the built-in algorithms, or you can use a Jupyter notebook and define the steps yourself in Python to retrieve your data source, prepare the data, and train a model.

At this point the easiest thing might be to look for another Recurrent Neural Net (RNN) that generates characters, to replace the Lua Torch char-rnn approach I was previously running locally on an Ubuntu server. Doing some searching, I found char-rnn.pytorch.

This is my first experience setting up a Jupyter notebook, so at this point I’ve no idea if what I’m doing is the right approach, but I’ve got something working.

On the right-hand side of the notebook I pressed the New button and selected a Python PyTorch notebook:

Next I added a step to clone the char-rnn.pytorch repo into my notebook:
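
The cell is just a shell command run from the notebook, something like this (the repo URL is what I believe is the correct GitHub location for char-rnn.pytorch, so double-check it against the project page):

    # clone the char-rnn.pytorch project into the notebook instance
    !git clone https://github.com/spro/char-rnn.pytorch.git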

Next I added a step to use the aws cli to copy my data file for training the model into my notebook:
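
Again this is a single cell calling the aws cli; the bucket and file names below are placeholders for my own:

    # copy the training data file from S3 into the notebook instance
    !aws s3 cp s3://my-training-data-bucket/training-data.txt ./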

Next, adding the config options to train a model using char-rnn.pytorch, I added a step to run the training, but it gave an error about some missing Python modules:

Adding an extra step to use pip to install the required modules:
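
For char-rnn.pytorch I believe the missing modules are the ones in the project’s requirements (unidecode and tqdm), so the cell is along these lines — adjust to whatever the error message actually names:

    # install the modules char-rnn.pytorch depends on
    !pip install unidecode tqdm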

The default number of epochs is 2,000, which takes a while to run, so decreasing this to something smaller with --n_epochs 100 we get a successful run, and calling the generate script, we have content!
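
For reference, the training and generation cells look roughly like this — the data file name is a placeholder, and the exact script arguments come from the char-rnn.pytorch README, so treat this as a sketch rather than the definitive commands:

    # train for a reduced number of epochs on the uploaded data file
    !python char-rnn.pytorch/train.py training-data.txt --n_epochs 100

    # training writes a .pt model file named after the input file; generate.py samples from it
    !python char-rnn.pytorch/generate.py training-data.pt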

I trained with an incredibly small file to get started (just 100 lines of text) for a very short time. So for next steps I’m going to look at:

  • training with the full WordPress export of all my posts for a longer training time
  • training with a cleaned up export (removing URL links and other HTML markup; see the sketch after this list)
  • automating the text generation from the model to feed my AWS Lambda-based bot
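
As a rough idea of what that cleanup step could look like, here’s a minimal sketch. It assumes the standard WordPress XML export, and that stripping tags and URLs with simple regexes is good enough for training data; the file names are placeholders:

    import re
    import xml.etree.ElementTree as ET

    # WordPress exports wrap each post's body in a <content:encoded> element
    CONTENT_TAG = "{http://purl.org/rss/1.0/modules/content/}encoded"

    def clean(text):
        text = re.sub(r"https?://\S+", "", text)   # remove URL links
        text = re.sub(r"<[^>]+>", "", text)        # remove HTML markup
        return text

    tree = ET.parse("wordpress-export.xml")
    posts = [elem.text or "" for elem in tree.iter(CONTENT_TAG)]

    with open("training-data.txt", "w") as out:
        for post in posts:
            out.write(clean(post) + "\n")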

I’ll share another update on these enhancements in an upcoming post.


Building a Card Playing Twitter Bot: gameplay dialog

I’ve built a couple of other Twitter bots. I have @kevinhookebot, which generates random tweets from a trained ML model:

and I have a product name generator bot, which creates humorous product names using the Tracery template library:

For my next project I’m thinking about what it would involve to build a multiplayer card-game-playing bot, starting with a simple card game like Blackjack. The Twitter REST APIs I’ve used so far will be reusable for this project, but the interesting parts are the interaction between a player and the bot, the game logic, and the persistence of game state (each of which I’ll discuss in future posts).

I’ve been thinking about the interaction for the game and think it will look something like this:

Player: @blackjackcard deal

Bot: @player bot deals you 4 Clubs and 7 Spades. Reply hit or stick

Player: @blackjackcard hit

Bot: @player bot deals you 4 Hearts. You now have 4 Clubs, 7 Spades, 4 Hearts. Reply hit or stick

Player: @blackjackcard stick

Bot: @player the bot currently has 3 Hearts, 9 Clubs, and takes a card

Bot: @player the bot takes 10 Diamonds and now has 3 Hearts, 9 Clubs, 10 Diamonds. Bust! You win!

The interesting part of the gameplay interaction is that there are only three commands:

  • deal: start a game (get dealt your initial two cards)
  • hit: get dealt another card
  • stick: keep your current hand
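
As a rough sketch of how the bot could dispatch these, mapping a reply’s text to one of the three commands might look something like this — the handler functions and how the tweet text arrives are assumptions for illustration, not the final design:

    # map each recognized command word to a handler
    def handle_deal(player):
        return f"@{player} bot deals you two cards. Reply hit or stick"

    def handle_hit(player):
        return f"@{player} bot deals you another card. Reply hit or stick"

    def handle_stick(player):
        return f"@{player} the bot takes its turn..."

    COMMANDS = {"deal": handle_deal, "hit": handle_hit, "stick": handle_stick}

    def handle_mention(player, tweet_text):
        # find the first recognized command word in the reply, ignore everything else
        for word in tweet_text.lower().split():
            if word in COMMANDS:
                return COMMANDS[word](player)
        return f"@{player} reply with deal, hit or stick to play"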

This makes the options that the bot needs to handle pretty simple. Next up, I’ll talk about persisting the game state to AWS DynamoDB.