AWS Workspaces: Web access to Linux Workspaces (not supported)

If you attempt to access an AWS Workspace running Linux via a browser, you’ll get this spinner for what seems like a couple of minutes:

Eventually it times out with this error:

Looking in the docs here, there is this note:

Web access to Linux WorkSpaces is currently not supported. It would be nice if it gave you an error when you attempt to log on, telling you it’s not a supported option, instead of just failing.

Experimenting with the gpt-2 models and text generation

I’m experimenting with the gpt-2 774M model to generate text based on a prompt. Starting up with:

python3 src/interactive_conditional_samples.py --temperature=0.7 --model_name=774M --nsamples=1 --length=100

And then providing a relevant prompt, something like:

“The problem with training Machine Learning models today is that the model is only as good as the data it is trained with.”

and let’s see what we get:

This sounds almost believable, as if it was written by a person. It might not make complete sense, but the sentences are well formed, something that was an issue in my previous experiments with RNN text generation.

Here’s the text for reference:

“There is a desire to train the model on a large data set.

A very big data set is not always a good enough data set.

A good data set is more than just enough. A good data set is always growing.

To take advantage of this growing data set we need to train the model on the latest batch of data. However, there may be a huge amount of data that does not fit in the pre-defined batch size.”

I’m curious how the temperature setting affects what is generated, so let’s bump it up from 0.7 to 0.9 and try again with the same prompt text.
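
As far as I understand it, temperature in samplers like this one divides the model’s raw scores (logits) before they are turned into probabilities, so a lower value sharpens the distribution and a higher value flattens it. Here’s a minimal sketch of that idea in plain Python. The function name and the three scores are made up for illustration; this is not the gpt-2 code itself.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words (made-up numbers).
logits = [2.0, 1.0, 0.1]

for t in (0.7, 0.9):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.9 the top candidate gets a smaller share of the probability than at 0.7, so less likely words are picked more often, which would fit with the output getting more adventurous.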

This sounds like it comes straight out of a research paper:

“This is why there is now a trend toward training a workforce of highly trained humans in a search engine optimization function (SRE) to monitor changes in signal and value in the big data.

Whether it is a search engine algorithm like Chi-squared or a machine learning model like a neural network, there is still a lot of work ahead to make those apps more relevant to the data they collect.

The need for better data

Having data that makes the predictions bigger, on”

Not knowing exactly how this works, it almost reads like it’s regurgitating whole chunks of the data it was trained on, but who knows. Given that it was trained from materials and sites online, I guess the easy way to test for this would be to just drop one of these sentences in Google and see if something like the source article comes up. Trying this out:

Huh. So this sentence is not found anywhere in Google’s indexes. That’s quite remarkable if the model actually generates sentences that are this convincing.

What about starting with a line from a recipe (for example from here):

“Heat oil in large pot. Add beef and flour, stirring to coat beef; brown.”

Ok, now this is interesting but slightly weird. It’s clearly recognizable as steps in a recipe, but the order doesn’t make too much sense and some of the steps are repeated variations of themselves:

“Add onions and garlic and cook until onion starts to turn translucent, 3 minutes. Stir in bell pepper, glass tomatoes, and celery; cook 1 minute. Add potatoes, beef broth, and bay leaves; bring to a boil. Reduce heat to low. Reduce heat to medium; cook until beef is no longer pink. Taste and add salt and pepper. Add beef and stir. Add liquid; cover and simmer until meat is tender, 10 minutes.”

What about lines from films?

“I find your lack of faith disturbing”

Now we’ve gone off into Crazy Town. This is weird:

“Why do you persist in viewing my appearance with contempt?”

“I want to believe I am worthy of your worship.”

“Therefore, you have no right to demand I lift my gaze.”

“I am unworthy of your reverence; I can only pray that you may address me in a less lewd manner.”

“You call me a beast?”

“You cannot deny the very fact you see me.”

“I see nothing.”

“Why?”

Whoa. Ok, keeping with the Star Wars theme, what does it do if you start it with a Yoda quote:

“When 900 years old you reach, look as good you will not”

And I got:

“I listen to every guru and I say, good luck with your yoga and counting numbers.”

Does this phrase exist anywhere?

Nope. You have to wonder whether some Yoda quotes or lines from films went into the model. It’s interesting that ‘yoga’ is mentioned, and you wonder if this is the model generating a variation of the name ‘Yoda’. And ‘When 900 years old you reach’ = ‘counting numbers’? Very curious.

Ok, one more. How about a line from an HP monitor product description:

“Designed with purpose: multiple customization options allow you to adjust your monitor to best fit the way you work.”

Incredible:

“Customizable paired ergonomics and colors: customize every feature of your new monitor from its basic design and color customization options.

Intuitive control links: there are more than 100

cabling connections provided with each monitor.

Designed for comfort: more than twice as many inches of space between the screen and the desk top than standard monitors.

Dedicated power buttons for powering the monitor on and off.”

Ok, enough playing. I do have a project planned for this, so time to get busy 🙂

Issues running gpt-2 with provided Docker image

After following the instructions to build the gpt-2 Docker image here, I started up a bash shell into the container:

docker run -it gpt-2 bash

And then ran:

python3 src/generate_unconditional_samples.py | tee /tmp/samples

This failed with this error:

AttributeError: module 'tensorflow' has no attribute 'sort'

A post here says to upgrade to TensorFlow 1.14.0 in the container, so I ran:

pip install tensorflow==1.14.0

And then on retrying, text was generated! Now to start playing and see what the provided models will generate!
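
If you want to check the version before running the scripts, here’s a rough sketch that compares a version string against 1.14.0, the version that worked for me. The helper names are my own, and the "1.12.0" value is just a hypothetical example, not necessarily what the image ships with.

```python
def version_tuple(text):
    """Turn a string like '1.14.0' into (1, 14, 0), ignoring suffixes like '-rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad '1.14' to (1, 14, 0)
        parts.append(0)
    return tuple(parts)

def is_at_least(installed, required):
    """True if the installed version is the same as or newer than the required one."""
    return version_tuple(installed) >= version_tuple(required)

# Inside the container you could pass in the real value instead:
#   import tensorflow as tf
#   is_at_least(tf.__version__, "1.14.0")
for candidate in ("1.12.0", "1.14.0"):  # hypothetical installed versions
    print(candidate, is_at_least(candidate, "1.14.0"))
```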

Is the quality of new questions on StackOverflow declining and/or is StackOverflow becoming less useful?

Now and then I browse StackOverflow hoping to pick up a few easy questions and help some new developers. My activity on StackOverflow is sporadic: I browse a few times over a few days, then a few months go by before I take another look. Why the gaps of sometimes months? I haven’t thought about it before, but I think I get disheartened by the experience. You invest time coming up with an appropriate answer, and then even on a question where your answer is the only answer, the original poster doesn’t bother to vote up your answer, let alone select it as the best answer. More often than not, a new user with a rep of 1 presumably gets the answer they were looking for and then disappears.

My motivation is more to provide help than to earn rep, but still, the rep system is the only tangible reward you get for participation. Sometimes it just doesn’t feel worthwhile when you get nothing in return.

Here’s the other thing I’ve noticed recently: the majority of new questions either don’t meet the requirements for acceptable questions, or are asked in a way that doesn’t meet the criteria of a good question. As a result, most new questions are downvoted and closed. That’s sad. Here’s a quick look at the 10 most recent new questions tagged ‘Java’ right now and their current votes:

  • -1: no answers
  • -1: closed, no answers
  • -4: closed, no answers
  • -4: no answers
  • 0: 1 answer
  • -1: no answers
  • -3: closed, no answers
  • 0: 1 answer
  • 0: no answers
  • 0: no answers

Out of these 10:

  • 3 were already closed as not meeting the guidelines,
  • 3 have downvotes and will likely be closed unless they can be edited to meet guidelines,
  • 2 have no votes and 1 answer,
  • 2 no votes no answers.
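
For anyone who wants to repeat the count on a different day, the tally above can be sketched in a few lines of Python. The tuples are just the ten questions listed above typed in by hand, and the bucket names are my own labels.

```python
from collections import Counter

# (score, closed, answers) for the ten questions listed above
questions = [
    (-1, False, 0),
    (-1, True, 0),
    (-4, True, 0),
    (-4, False, 0),
    (0, False, 1),
    (-1, False, 0),
    (-3, True, 0),
    (0, False, 1),
    (0, False, 0),
    (0, False, 0),
]

def bucket(score, closed, answers):
    """Put a question into one of the four groups used in the summary."""
    if closed:
        return "closed"
    if score < 0:
        return "downvoted, still open"
    if answers > 0:
        return "no votes, answered"
    return "no votes, no answers"

tally = Counter(bucket(*q) for q in questions)
for name, count in tally.items():
    print(name, count)
```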

This is pretty typical most days that I take a quick browse. Since I have Review Queue privs on new posts, on maybe 3 out of 5 new questions I review I add comments referring to the ‘what can I ask‘ and ‘how to ask‘ FAQs, because most new questions are obviously not following the guidelines.

Which brings me to my other observation which is pretty surprising when you think about it:

Of over 15 million registered users, those who actually respond to questions, whether by posting answers or comments, voting on questions, or interacting with the site in any way that keeps it running usefully, are a tiny percentage of the total. Look at the current leaderboard stats for all time:

You get a rep of 1 just for creating an account. These figures don’t even include people who use the site to search for answers without logging on, or who never registered. Registered users at the lowest level of rep on the site, between 1 and 199, are about 88% of the total registered users. Users with rep of 200 and above are about 12%.

This means the users that are actually providing answers to questions, editing questions/answers and asking clarifying questions are only 12% of the community’s users. That’s surprisingly low when you think the whole purpose of the StackOverflow site is to ask a question and (hopefully) get a useful answer.

Don’t get me wrong. I love StackOverflow. It’s a frequently used tool in my daily workflow as a developer and I’ve used it for years, as have many if not all developers everywhere. It’s just curious when you look at the numbers that the success of the site relies on a volunteer community of so few users prepared to give back, while the largest percentage of users are those coming to the site with questions.