No, AI models will not replace programmers any time soon

This month’s “Communications of the ACM” magazine (01/2023) published a rather alarmist article titled ‘The End of Programming’. While it is a well-written article, it bets heavily on the future usefulness of AI models like ChatGPT to generate working code, replacing the need for programmers to write code by hand. ChatGPT is getting a lot of attention in the media and online right now, with people finding out that not only can you ask questions on any topic and get a believable answer, you can also ask it a more practical question like “show me C code to read lines of a file”.

Finding out that ChatGPT can be used to ‘generate’ code is prompting new developers to ask questions online like ‘should I start a career in software development when programmers are likely going to be replaced by ChatGPT?’

The tl;dr answer: ChatGPT is not replacing anyone any time soon.

While development and improvement of these types of AI models is going to continue, it’s worth keeping in mind that these models are only as good as the material they are trained on, which also means they’re limited by the correctness or usefulness of that material. They are subject to the age-old problem of ‘garbage in, garbage out’. What’s not being discussed enough is that these current models do not understand the content they generate. They also have no understanding of whether any of the generated content is correct, either factually correct for text, or syntactically correct for code snippets. Unlike these ML-trained models, as humans we use our existing knowledge and experience to infer missing details from what we read or hear. We’re also good at using what we already know to be true to assess how correct or realistic new information is. AI models currently do not have this level of understanding, although research has been attempting to replicate ‘understanding’ and the ability to make decisions based on existing facts for years (Google ‘expert systems’ for more info).

I’ve seen developers recently attempting to answer questions on Stack Overflow, Reddit and other sites using ChatGPT, with varying success depending on whether the topic of the question was within the scope of the material the model was trained on.

The current problem with text generation from these models is that they lack context. The models don’t understand context, so they attempt to generate a response based on identifying key words in the input prompt, which doesn’t always result in the answer a human would give to the same question. Models also don’t understand intent. A question can be asked in a number of similar but different ways, and another human may be able to infer the intent or purpose of the question, but for current general-purpose ML models that’s not possible.

In its current form, ChatGPT is trained on material available online: websites with static articles and reference material, as well as question-and-answer discussion sites. If I ask a very specific question like ‘show me example code for building a REST API with Spring Boot’, there are plenty of examples online, and assuming the model was trained on at least some of these, the resulting answer could incorporate some of this material. The limitation with this approach is that the answer isn’t likely to be better than anything you could have found yourself if you just Googled the same question. There could be some benefit from having an answer as a conglomeration of text from various sources, but that can also mean that the combined text ends up being syntactic gibberish (the model doesn’t currently know if what it’s returning to you is syntactically correct).

It’s clear that there is promise in this area to aid and support developers, but as a complete replacement for all custom software development work in its current form, this seems highly unlikely, or at least not within the next 10 years, and possibly even longer.

Generating tweets using a Recurrent Neural Net (torch-rnn)

Even if you’re not actively following recent trends in AI and Machine Learning, you may have come across articles by a researcher who experiments with training neural nets to generate interesting things such as:

Brown salmon in oil. Add creamed meat and another deep mixture

  • Chocolate Pickle Sauce
  • Completely Meat Chocolate Pie

So what’s going on here? What’s being used is something called a Recurrent Neural Net to generate text in a specific style. It’s trained with input data, which it analyzes to recognize patterns in the text, constructing a model of that data. It can then generate new text following the same patterns, sometimes with rather curious and amusing results.
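To make the "learn patterns, then generate new text" idea concrete, here is a minimal sketch in Python. It is deliberately not an RNN: it is a much simpler character-level Markov chain that only remembers which character followed each short context in the training text. The function names and the `order` parameter are my own choices for illustration and have nothing to do with torch-rnn.

```python
import random
from collections import defaultdict

def train_char_model(text, order=3):
    """Map each `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context].append(text[i + order])
    return model

def generate(model, seed, length, rng=None):
    """Extend `seed` one character at a time by sampling from the model.

    `seed` must be exactly as long as the `order` used for training.
    """
    rng = rng or random.Random()
    order = len(seed)
    out = seed
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # context never seen in training: stop early
            break
        out += rng.choice(followers)
    return out

if __name__ == "__main__":
    corpus = "the computer the programming the computer to something"
    model = train_char_model(corpus, order=3)
    print(generate(model, "the", 60))
```

An RNN replaces the lookup table with a learned network that can generalize to contexts it has never seen, but the generate-one-character-at-a-time loop is the same basic shape.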

A commonly referenced article on this topic is “The Unreasonable Effectiveness of Recurrent Neural Networks” by Andrej Karpathy. It’s well worth a read to get an understanding of the theory and approach.

There are many RNN implementations you can download and start training with any input data you can imagine. Here are a few to take a look at:

So it occurred to me, what would happen if you trained an RNN with all your past Twitter tweets, and then used it to generate new tweets? Let’s find out 🙂

Let’s try it out with torch-rnn – the following is a summary of install steps from https://github.com/jcjohnson/torch-rnn:

sudo apt-get -y install python2.7-dev
sudo apt-get install libhdf5-dev

Install torch, from http://torch.ch/docs/getting-started.html#_ :

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh
#source new PATH for first time usage in current shell
source ~/.bashrc

Now clone the torch-rnn repo:

git clone https://github.com/jcjohnson/torch-rnn.git

Install torch deps:

luarocks install torch
luarocks install nn
luarocks install optim
luarocks install lua-cjson

Install torch-hdf5:

git clone https://github.com/deepmind/torch-hdf5
cd torch-hdf5
luarocks make hdf5-0-0.rockspec

Install pip to install python deps:

sudo apt-get install python-pip

From inside torch-rnn dir:

pip install -r requirements.txt

Now, following the steps from the docs, preprocess your text input:

python scripts/preprocess.py \
  --input_txt my_data.txt \
  --output_h5 my_data.h5 \
  --output_json my_data.json

For my input tweet text this looks like:

python scripts/preprocess.py \
  --input_txt ~/tweet-text/tweet-text.txt  \
  --output_h5 ~/tweet-text/tweet-text.h5 \
  --output_json ~/tweet-text/tweet-text.json

This gives me:

Total vocabulary size: 182
Total tokens in file: 313709
  Training size: 250969
  Val size: 31370
  Test size: 31370
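The numbers above can be reproduced with a little arithmetic. The sketch below is my own illustration in Python, not the code from preprocess.py: it assumes every distinct character becomes one vocabulary entry, and that 10% of the tokens go to validation and 10% to test, with the fractions truncated to whole numbers. The function names, the sorted indexing and the 0.1 defaults are assumptions.

```python
def build_vocab(text):
    """Assign each distinct character an integer index, starting at 1."""
    return {ch: idx for idx, ch in enumerate(sorted(set(text)), start=1)}

def split_sizes(total_tokens, val_frac=0.1, test_frac=0.1):
    """Return (train, val, test) sizes; fractions are truncated to integers."""
    val = int(total_tokens * val_frac)
    test = int(total_tokens * test_frac)
    return total_tokens - val - test, val, test

if __name__ == "__main__":
    vocab = build_vocab("hello")
    print(vocab)                             # 4 distinct characters
    print([vocab[ch] for ch in "hello"])     # text encoded as integers
    print(split_sizes(313709))               # (250969, 31370, 31370)
```

With 313709 tokens this gives the same training, validation and test sizes as the output above, which suggests an 80/10/10 split.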

Now to train the model:

th train.lua \
  -input_h5 my_data.h5 \
  -input_json my_data.json

For my input file containing my tweet text this looks like:

th train.lua \
  -input_h5 ~/tweet-text/tweet-text.h5 \
  -input_json ~/tweet-text/tweet-text.json

This gave me this error:

init.lua:389: module 'cutorch' not found:No LuaRocks module found for cutorch

 no field package.preload['cutorch']

Trying to install cutorch manually, I got errors about the CUDA toolkit:

CMake Error at /usr/share/cmake-3.5/Modules/FindCUDA.cmake:617 (message):

  Specify CUDA_TOOLKIT_ROOT_DIR

Checking the docs:

By default this will run in GPU mode using CUDA; to run in CPU-only mode, add the flag -gpu -1

… so adding -gpu -1 and trying again, now I’ve got this output as it runs:

Epoch 1.44 / 50, i = 44 / 5000, loss = 3.493316

… one line every few seconds.

After some time it completes a run, and you’ll find files like this in your cv dir beneath where you ran the previous script:

checkpoint_1000.json
checkpoint_1000.t7
checkpoint_2000.json
checkpoint_2000.t7
checkpoint_3000.json
checkpoint_3000.t7
checkpoint_4000.json
checkpoint_4000.t7
checkpoint_5000.json
checkpoint_5000.t7
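The 5000 iterations and the checkpoint file names line up with the training size from the preprocessing step. Here is a rough back-of-the-envelope sketch in Python, assuming a batch size of 50, a sequence length of 50, 50 epochs and a checkpoint every 1000 iterations. I believe these are the torch-rnn defaults, but treat them as assumptions and check the docs for your version; the function name is mine.

```python
def training_schedule(train_tokens, batch_size=50, seq_length=50,
                      max_epochs=50, checkpoint_every=1000):
    """Estimate iterations per epoch, total iterations and checkpoint points."""
    # Each iteration consumes batch_size sequences of seq_length characters.
    iters_per_epoch = train_tokens // (batch_size * seq_length)
    total_iters = iters_per_epoch * max_epochs
    checkpoints = list(range(checkpoint_every, total_iters + 1, checkpoint_every))
    return iters_per_epoch, total_iters, checkpoints

if __name__ == "__main__":
    print(training_schedule(250969))
    # (100, 5000, [1000, 2000, 3000, 4000, 5000])
```

Under those assumptions, 250969 training characters work out to 100 iterations per epoch, which matches the "i = 44 / 5000" line showing up part-way through epoch 1.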

Now to run and get some generated text:

th sample.lua -checkpoint cv/checkpoint_5000.t7 -length 500 -gpu -1 -temperature 0.4

Breaking this down:

-checkpoint : as the model training runs, it saves point-in-time snapshots of the model. You can run the generation against any of these files, but it seems the last file it generates gives you the best results.

-length : how many characters to generate from the model.

-gpu -1 : turn off GPU usage and run on the CPU only.

-temperature : controls how random the sampling is. The values I used range from 0.1 to 1: with values closest to zero the generation is less creative (more repetitive and predictable), while closer to 1 the generated output is, let’s say, more creative.
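To see why temperature has this effect, here is a small Python sketch of the usual way temperature is applied when sampling: divide the model's raw scores by the temperature before turning them into probabilities. This is the general technique, not code lifted from sample.lua, and the scores and function names are made up for illustration.

```python
import math
import random

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; low temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_index(scores, temperature, rng=None):
    """Pick the index of the next character according to the probabilities."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

if __name__ == "__main__":
    scores = [2.0, 1.0, 0.1]  # hypothetical scores for three characters
    for t in (0.1, 0.4, 1.0):
        print(t, [round(p, 3) for p in softmax_with_temperature(scores, t)])
```

At a low temperature almost all of the probability ends up on the highest-scoring character, so the output keeps repeating the most likely patterns. At 1.0 the less likely characters get picked far more often.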

Let’s run a couple of examples. Let’s do 140 chars at -temperature 0.1:

The programming to softting the some the programming to something the computer the computer the computer to a computer the com

and now let’s crank it up to 1.0:

z&loDOps be sumpriting sor’s a porriquilefore AR2 vanerone as dathing 201lus: It’s buct. Z) https://t.co/gEDr9Er24N Amatere. PEs’me tha

Now we’ve got some pretty random stuff, including a randomly generated shortened URL too.

Using a value towards the middle, like 0.4 to 0.5, gets some reasonably interesting results that are not too random but somewhat similar to my typical tweet style. What’s interesting is that my regular retweets of software development quotes from @CodeWisdom have heavily influenced the model, so based on my 3000+ tweets it generates text like:

RT @CodeWisdom followed by random generated stuff

Given that the text following the ‘RT @CodeWisdom’ prefix is clearly not content from @CodeWisdom, it wouldn’t be appropriate to post it as-is as a new tweet. I’m looking to use this generated text as input for an automated Twitter bot, so as interesting as this pattern is (it does look like the majority of my tweets), I’ve filtered out anything that starts with ‘RT @text’.
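The filtering itself is simple. Here is a minimal Python sketch of the idea, assuming the generated text has already been split into one candidate tweet per line. The regular expression and function names are illustrative and not the code running in my bot.

```python
import re

# Matches lines that begin with a retweet marker such as "RT @CodeWisdom".
RETWEET_PREFIX = re.compile(r"^\s*RT @\w+")

def is_retweet(text):
    """Return True if the text starts with 'RT @<name>'."""
    return RETWEET_PREFIX.match(text) is not None

def filter_generated(lines):
    """Drop blank lines and anything that looks like a retweet."""
    return [line for line in lines if line.strip() and not is_retweet(line)]

if __name__ == "__main__":
    candidates = [
        "RT @CodeWisdom followed by random generated stuff",
        "The programming to something the computer",
        "",
    ]
    print(filter_generated(candidates))
```

Only lines that start with the retweet marker are dropped, so a generated tweet that merely mentions someone further along in the text still gets through.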

I’ve already implemented a first attempt at a Twitter bot using this content with an AWS Lambda running on a timed schedule, you can check it out here:

 


I’ll be following up with some additional posts on the implementation of my AWS Lambda soon.