Even if you’re not actively following recent trends in AI and Machine Learning, you may have come across articles by a researcher who experiments with training neural nets to generate interesting things such as:
- cooking recipes – including culinary wisdom such as:
Brown salmon in oil. Add creamed meat and another deep mixture
- recipe titles – including my favorites:
- Chocolate Pickle Sauce
- Completely Meat Chocolate Pie
- and even craft beer names
So what’s going on here? What’s being used is something called a Recurrent Neural Net (RNN) to generate text in a specific style. It’s trained on input data, which it analyzes to recognize patterns in the text, constructing a model of that data. It can then generate new text following the same patterns, sometimes with rather curious and amusing results.
A commonly referenced article on this topic is Andrej Karpathy’s “The Unreasonable Effectiveness of Recurrent Neural Networks” – it’s well worth a read to get an understanding of the theory and approach.
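To make the idea of learning patterns character by character a bit more concrete, here’s a rough sketch in PyTorch (my own illustration of the general technique – not code from torch-rnn or any of the other implementations below): the network reads a sequence of characters and learns to predict, at each position, which character comes next.
# Rough sketch of a character-level model (illustration only, not code from
# any of the libraries mentioned here): the network is trained to predict
# the next character at each position in the input text.
import torch
import torch.nn as nn
import torch.nn.functional as F

text = "Brown salmon in oil. Add creamed meat and another deep mixture"
chars = sorted(set(text))
char_to_ix = {c: i for i, c in enumerate(chars)}

class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)
    def forward(self, x, state=None):
        h, state = self.rnn(self.embed(x), state)
        return self.out(h), state

model = CharRNN(len(chars))
ixs = torch.tensor([[char_to_ix[c] for c in text]])
# inputs are characters 0..n-2, targets are the same characters shifted by one
logits, _ = model(ixs[:, :-1])
loss = F.cross_entropy(logits.reshape(-1, len(chars)), ixs[:, 1:].reshape(-1))
loss.backward()  # training repeats this over the whole corpus, updating the weights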
There are many RNN implementations you can download and start training with any input data you can imagine. Here are a few to take a look at:
- char-rnn by Andrej Karpathy
- torch-rnn – a re-implementation of char-rnn using Torch
- textgenrnn – a character-level RNN Python module
- … and many more
So it occurred to me: what would happen if you trained an RNN with all your past Twitter tweets, and then used it to generate new tweets? Let’s find out 🙂
Let’s try it out with torch-rnn – the following is a summary of install steps from https://github.com/jcjohnson/torch-rnn:
sudo apt-get -y install python2.7-dev
sudo apt-get install libhdf5-dev
Install torch, from http://torch.ch/docs/getting-started.html#_ :
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh
#source new PATH for first time usage in current shell
source ~/.bashrc
Now clone the torch-rnn repo:
git clone https://github.com/jcjohnson/torch-rnn.git
Install torch deps:
luarocks install torch
luarocks install nn
luarocks install optim
luarocks install lua-cjson
Install torch-hdf5:
git clone https://github.com/deepmind/torch-hdf5
cd torch-hdf5
luarocks make hdf5-0-0.rockspec
Install pip to install python deps:
sudo apt-get install python-pip
From inside torch-rnn dir:
pip install -r requirements.txt
Now, following the steps from the docs, preprocess your text input:
python scripts/preprocess.py \
--input_txt my_data.txt \
--output_h5 my_data.h5 \
--output_json my_data.json
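Before running the preprocess step I needed all of my tweet text in a single plain text file, one tweet per line. Something along these lines will pull that out of a downloaded Twitter archive (a rough sketch – the tweets.csv file name and the "text" column are assumptions about the archive format, so check what your download actually contains):
# Sketch: extract the text of each tweet from a Twitter archive's tweets.csv
# into one plain text file, one tweet per line ("text" column is an assumption)
import csv

with open("tweets.csv", newline="", encoding="utf-8") as src, \
        open("tweet-text.txt", "w", encoding="utf-8") as dst:
    for row in csv.DictReader(src):
        dst.write(row["text"].replace("\n", " ").strip() + "\n")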
For my input tweet text this looks like:
python scripts/preprocess.py \
--input_txt ~/tweet-text/tweet-text.txt \
--output_h5 ~/tweet-text/tweet-text.h5 \
--output_json ~/tweet-text/tweet-text.json
This gives me:
Total vocabulary size: 182
Total tokens in file: 313709
Training size: 250969
Val size: 31370
Test size: 31370
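If I’m reading the preprocess script’s defaults correctly, that’s a 10% validation split and a 10% test split, with the remaining 80% used for training – the numbers line up:
# quick check of the split (assuming the default 10% val / 10% test fractions)
total = 313709
val = test = int(0.1 * total)   # 31370
train = total - val - test      # 250969
print(train, val, test)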
Now to train the model:
th train.lua \
-input_h5 my_data.h5 \
-input_json my_data.json
For my input file containing my tweet text this looks like:
th train.lua \
-input_h5 ~/tweet-text/tweet-text.h5 \
-input_json ~/tweet-text/tweet-text.json
This gave me this error:
init.lua:389: module 'cutorch' not found:
No LuaRocks module found for cutorch
no field package.preload['cutorch']
Trying to manually install cutorch, I got errors about the CUDA toolkit:
CMake Error at /usr/share/cmake-3.5/Modules/FindCUDA.cmake:617 (message): Specify CUDA_TOOLKIT_ROOT_DIR
Checking the docs:
By default this will run in GPU mode using CUDA; to run in CPU-only mode, add the flag -gpu -1
… so adding -gpu -1 and trying again, now I’ve got this output as it runs:
Epoch 1.44 / 50, i = 44 / 5000, loss = 3.493316
… one line every few seconds.
After some time it completes a run, and you’ll find files like this in your cv dir beneath where you ran the previous script:
checkpoint_1000.json
checkpoint_1000.t7
checkpoint_2000.json
checkpoint_2000.t7
checkpoint_3000.json
checkpoint_3000.t7
checkpoint_4000.json
checkpoint_4000.t7
checkpoint_5000.json
checkpoint_5000.t7
Now to run and get some generated text:
th sample.lua -checkpoint cv/checkpoint_5000.t7 -length 500 -gpu -1 -temperature 0.4
Breaking this down:
-checkpoint : as the model training runs, it saves these point-in-time snapshots of the model. You can run the generation against any of these files, but it seems the last checkpoint written gives you the best results
-length : how many characters to generate from the model
-gpu -1 : turn off GPU usage and run on the CPU
-temperature : this ranges from 0.1 to 1; with values closest to zero the generation is more conservative and repetitive, while closer to 1 the output is, let’s say, more creative (there’s a small sketch below of what this flag is actually doing)
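To give a rough intuition for the temperature flag (this is the general idea of temperature in sampling, not torch-rnn’s exact code): the model’s raw scores for each candidate next character are divided by the temperature before being turned into probabilities, so low temperatures concentrate almost all the probability on the most likely character, while high temperatures flatten the distribution and let unlikely characters through.
# Rough illustration of temperature scaling (general technique, not torch-rnn's code):
# logits are divided by the temperature before the softmax
import numpy as np

def next_char_probs(logits, temperature):
    scaled = np.asarray(logits) / temperature
    e = np.exp(scaled - scaled.max())   # subtract max for numerical stability
    return e / e.sum()

logits = [2.0, 1.0, 0.5]              # made-up scores for three candidate characters
print(next_char_probs(logits, 0.1))   # ~[1.0, 0.0, 0.0] -> very predictable output
print(next_char_probs(logits, 1.0))   # much more even spread -> more "creative" output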
Let’s run a couple of examples. Let’s do 140 chars at -temperature 0.1:
The programming to softting the some the programming to something the computer the computer the computer to a computer the com
and now let’s crank it up to 1.0:
z&loDOps be sumpriting sor’s a porriquilefore AR2 vanerone as dathing 201lus: It’s buct. Z) https://t.co/gEDr9Er24N Amatere. PEs’me tha
Now we’ve got some pretty random stuff, including a randomly generated shortened URL too.
Using a value towards the middle, like 0.4 to 0.5, gets some reasonably interesting results that are not too random, but somewhat similar to my typical tweet style. What’s interesting is that my regular retweets of software development quotes from @CodeWisdom have heavily influenced the model, so based on my 3000+ tweets it generates text like:
RT @CodeWisdom followed by random generated stuff
Given that the generated text following “RT @CodeWisdom” is clearly not content from @CodeWisdom, it wouldn’t be appropriate to use it as-is and post it as a new tweet. Since I’m looking to take this generated text and use it as input for an automated Twitter bot, as interesting as this pattern is (it does look like the majority of my tweets), I’ve filtered out anything that starts with ‘RT @’ – a sketch of that filtering step is below.
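The filtering itself is simple; something like this (a sketch of the idea, assuming the sampled output has been split into one candidate tweet per line – not necessarily exactly what my bot does) drops anything that looks like a retweet:
# Sketch: drop generated lines that look like retweets ("RT @...") and keep
# the rest as candidate tweets (assumes one candidate per line)
def filter_generated(lines):
    return [line.strip() for line in lines
            if line.strip() and not line.strip().startswith("RT @")]

with open("generated.txt") as f:   # hypothetical file of sampled output
    candidates = filter_generated(f)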
I’ve already implemented a first attempt at a Twitter bot using this content with an AWS Lambda running on a timed schedule; you can check it out here:
to a programming to sure the code to like a do and the programmer
— Kevin Hooke Bot (@kevinhookebot) April 6, 2018
I’ll be following up with some additional posts on the implementation of my AWS Lambda soon.
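In the meantime, here’s a rough idea of the shape of such a bot (a hypothetical sketch, not my actual Lambda code – the tweepy library, the environment variables and the file name here are all assumptions): a scheduled trigger fires the function, which picks one pre-generated, filtered line and tweets it.
# Hypothetical sketch of a scheduled tweeting Lambda: pick a random
# pre-generated line and post it (assumes tweepy and credentials in env vars)
import os
import random
import tweepy

def handler(event, context):
    auth = tweepy.OAuthHandler(os.environ["CONSUMER_KEY"], os.environ["CONSUMER_SECRET"])
    auth.set_access_token(os.environ["ACCESS_TOKEN"], os.environ["ACCESS_TOKEN_SECRET"])
    api = tweepy.API(auth)
    with open("generated-tweets.txt") as f:   # filtered output packaged with the function
        candidates = [line.strip() for line in f if line.strip()]
    api.update_status(random.choice(candidates)[:280])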