AWS EKS: Kubernetes clusters provisioned with CloudFormation templates

AWS was the last of the major cloud providers to offer a managed Kubernetes service (GA announced June this year). All the others have had offerings up and available for some time (Google Kubernetes Engine – GKE, Microsoft Azure Kubernetes Service – AKS, IBM Cloud Kubernetes Service, even Oracle Cloud has its Container Engine for Kubernetes). When AWS announced their Kubernetes service last year, many people declared that the container orchestration wars between Kubernetes, Mesos and Docker Swarm (and others?) were over. At this point Kubernetes has become a common runtime platform for running microservices on any of the major cloud platforms.

The great thing about the pay-as-you-go approach of the cloud is that it’s easy to spin up anything on demand and kick the tires. I’ve been experimenting with Kubernetes running on my HomeLab ESXi server for a while, and have been bouncing around the idea of moving some personal hobby projects, currently running in Docker containers on cheap VPSes, to my own Kubernetes cluster in the cloud.

I had a couple of attempts at walking through the rather extensive EKS setup instructions. On my first attempt I didn’t manage to get a working cluster running, but I learned enough about what I was supposed to do and where I’d gone wrong that on my next attempt I got my cluster up and running ok.

From my limited experience so far, there’s little in the way of being able to ‘one click provision’ a new EKS cluster on AWS. It takes about an hour to walk through the setup instructions, which, although well written, rely too heavily on manual steps and the provided CloudFormation scripts rather than automation. At the other end of the ease-of-provisioning spectrum, take a look at Google’s Kubernetes Engine offering. While on a road trip, I created a Google Cloud account on my phone as a passenger in a car, and created a GKE cluster with multiple nodes in less time than it took to create my Google Cloud billing account and enter my credit card details. Google have simplified provisioning via their web console to the point where it only takes a couple of button clicks and you’re up and running. AWS EKS is far from this point; it would be impossible to follow and run their setup scripts on your phone as a passenger in a car.

The other problem with the current approach on AWS is the extensive use of CloudFormation templates to create EKS clusters – there’s little connection between the bare bones EKS console web page and the resources you provision via the CloudFormation scripts. This lack of connection between the console page and the scripts resulted in this rather unpleasant monthly bill:

I created a test EKS cluster to do some testing, and when I’d finished, I deleted the cluster with the delete button on the EKS console page. I expected that this would delete all the resources created and associated with the cluster. Apparently though, if you delete your cluster from the EKS Console, only the master nodes are destroyed (which, at the current price of 20c/hour, is expensive compared to GKE and AKS, which run your cluster master nodes for free), but any other provisioned resources like Auto Scaling groups and your EC2 nodes are left active.

If you delete your cluster from the Console and then forget about it, the cost of leaving a couple of t2.medium EC2 instances up for a couple of weeks is around $50. Ouch.

What makes this issue worse is that the Auto Scaling Group created from the CloudFormation templates for the nodes will keep recreating your EC2 nodes if you try to manually terminate them. So if you attempt to shut them down and you’re not paying attention, they’ll automatically get recreated within a few minutes:
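The reliable way to tear everything down is to delete at the level you created it: remove the worker node CloudFormation Stack (which removes the Auto Scaling Group along with its EC2 instances) and then delete the cluster itself. A minimal sketch, assuming a worker Stack named eks-worker-nodes and a cluster named test1 (both names are placeholders):

# Deleting the Stack removes the Auto Scaling Group, so the EC2
# nodes are terminated and stay terminated
aws cloudformation delete-stack --stack-name eks-worker-nodes

# Then delete the EKS cluster itself (the managed master nodes)
aws eks delete-cluster --name test1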

Luckily, after creating a support ticket with AWS to explain that these nodes were left up and running even though I’d deleted the EKS cluster from the Console, they gave me a full refund for the unexpected charges. AWS, your customer support is awesome 🙂

So, lessons learned so far:

  • set up an AWS Budget with alarms so if your monthly costs unexpectedly increase beyond what you plan to spend, you’ll be alerted and can take corrective action (see the sketch after this list)
  • don’t take CloudFormation templates for granted – check the resources they create, and keep an eye on the resources as they’re running
  • it’s great that you now have the option of a common runtime platform on every major cloud provider, but some of the other providers offer a much better user experience in terms of provisioning and tooling (although I expect AWS will catch up soon)
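As a sketch of that first point, a budget can also be created from the CLI – the values below (account id, the $20 limit, the file names) are all placeholders, and notifications.json would define the alert threshold and email subscriber as described in the AWS Budgets docs:

# budget.json - a monthly cost budget of $20 (placeholder values)
cat > budget.json <<'EOF'
{
  "BudgetName": "monthly-hobby-budget",
  "BudgetLimit": { "Amount": "20", "Unit": "USD" },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST"
}
EOF

aws budgets create-budget \
    --account-id 123456789012 \
    --budget file://budget.json \
    --notifications-with-subscribers file://notifications.json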

Provisioning a Kubernetes cluster on AWS EKS (part 1, not up yet)

tldr; AWS introduced their version of managed Kubernetes as a Service earlier this year, EKS. A link to the setup guide and my experience following the instructions are below. After running into a couple of issues (using the IAM role in the wrong place, and running the CloudFormation stack creation and cluster creation steps as a root user instead of an IAM user), I spent at least a couple of hours trying to get an EKS cluster up, and then wanted to find out how easy or otherwise it is to provision a Kubernetes cluster on the other major cloud vendors. On Google Cloud, it turns out it’s incredibly easy – it took less than 5 minutes using their web console on my phone while in a car (as a passenger of course 🙂 ). From reading similar articles it sounds like the experience on Azure is similar. AWS clearly have some work to do in this area to get their provisioning closer to Google’s couple-of-button-clicks-and-you’re-done approach. If you’re still interested in the steps to provision EKS then continue reading, otherwise in the meantime I’m off to play with Google’s Kubernetes Engine 🙂

The full Getting Started guide for AWS EKS is here, but here are the condensed steps required to deploy a Kubernetes cluster on AWS:

Create a Role in IAM for EKS:
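From the CLI this step looks something like the sketch below (the role name eks_role matches the error message later in this post; the trust policy allows the EKS service to assume the role, and the two managed policies are the ones the Getting Started guide attaches):

# Trust policy allowing EKS to assume the role
cat > eks-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "eks.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF

aws iam create-role \
    --role-name eks_role \
    --assume-role-policy-document file://eks-trust-policy.json

# Attach the managed policies the EKS service role needs
aws iam attach-role-policy --role-name eks_role \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
aws iam attach-role-policy --role-name eks_role \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKSServicePolicy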

Create a Stack using the linked CloudFormation template in the docs – I kept all the defaults and used the role I created above.

At this point I attempted to create the Stack, but got this error:

Template validation error: Role arn:aws:iam::xxx:role/eks_role is invalid or cannot be assumed

I assumed that the Role created in the earlier step was to be used here, but it’s used later when creating your cluster, not for running the CloudFormation template – don’t enter it here, leave this field blank:
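For reference, the CLI equivalent of this step looks something like the sketch below – note there’s no role parameter, the Stack name eks-vpc is a placeholder, and the template URL is abbreviated (copy the current one linked from the Getting Started guide):

# Create the EKS VPC Stack with the sample template from the guide
aws cloudformation create-stack \
    --stack-name eks-vpc \
    --template-url https://amazon-eks.s3-us-west-2.amazonaws.com/.../amazon-eks-vpc-sample.yaml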

When the Stack creation completes you’ll see:

Back to the setup steps:

Install kubectl if you don’t already have it
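A sketch of one way to do this on Linux, pulling the latest stable release using the URL pattern from the Kubernetes docs (adjust the OS and architecture to suit):

curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
chmod +x ./kubectl
# Move it somewhere on your PATH
sudo mv ./kubectl /usr/local/bin/kubectl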

Download and install aws-iam-authenticator, and add it to your PATH
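Along these lines (the download URL is abbreviated here – copy the current one for your platform from the EKS docs):

curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/.../aws-iam-authenticator
chmod +x ./aws-iam-authenticator
# Move it somewhere on your PATH
sudo mv ./aws-iam-authenticator /usr/local/bin/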

Back to the AWS Console, head to EKS and create your cluster:

For the VPC selection, the preselected VPC was my default VPC and not the VPC created during the Stack creation, so I changed it from this default:

Since I had run and re-run the CloudFormation template a few times until I got it working, I ended up with a number of VPCs and SecurityGroups with the same name as the Stack. To work out which ones were currently in use, I went back to CloudFormation and checked the Outputs tab to get the list of SecurityGroupIds, VPCIds and SubnetIds in use by the current Stack. Using this info I then selected the matching values for the VPC and SecurityGroup (the one with ControlPlaneSecurityGroup in the name).
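The same values are available from the CLI, and can be fed straight into cluster creation if you’d rather skip the console – a sketch, assuming the Stack was named eks-vpc and reusing the role from earlier (all ids are placeholders):

# List the Stack outputs (SecurityGroupIds, VpcId, SubnetIds)
aws cloudformation describe-stacks --stack-name eks-vpc \
    --query "Stacks[0].Outputs"

# CLI equivalent of creating the cluster in the console
aws eks create-cluster --name test1 \
    --role-arn arn:aws:iam::123456789012:role/eks_role \
    --resources-vpc-config subnetIds=subnet-aaaa,subnet-bbbb,subnet-cccc,securityGroupIds=sg-dddd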

Cluster is up!

Initialize the aws cli and kubectl connection config to your cluster:

 aws eks update-kubeconfig --name cluster_name

At this point you have a running cluster with master nodes, but no worker EC2 nodes provisioned. That’s the next step in the Getting Started Guide.

Now check running services:

kubectl get svc

At this point, I was prompted for credentials and wondered what credentials it needed since my aws cli was already configured and logged in:

$ kubectl get svc
Please enter Username: my-iam-user
Please enter Password:

This post suggested that there’s a step in the guide that requires you to create the cluster with an IAM user and not your root user. I obviously missed this and used my root user. I’ll delete the cluster, log on as an IAM user, and try again.

Created a new cluster with an Admin IAM user, and now I can see the cluster starting with:

aws eks describe-cluster --name test1

{
    "cluster": {
        "name": "test1",
        ...
        "status": "CREATING",
        ...
    }
}

Once the Cluster is up, continue with the instructions to add a worker node using the CloudFormation template file.
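The worker node Stack can also be created from the CLI; a sketch with placeholder values – the parameter names are from the sample nodegroup template at the time of writing (check them against the current template), the URL is abbreviated, and CAPABILITY_IAM is needed because the template creates an instance role:

aws cloudformation create-stack \
    --stack-name eks-worker-nodes \
    --template-url https://amazon-eks.s3-us-west-2.amazonaws.com/.../amazon-eks-nodegroup.yaml \
    --capabilities CAPABILITY_IAM \
    --parameters \
        ParameterKey=ClusterName,ParameterValue=test1 \
        ParameterKey=NodeGroupName,ParameterValue=test1-nodes \
        ParameterKey=NodeInstanceType,ParameterValue=t2.medium
        # ...plus the remaining parameters from the guide (control plane
        # security group, VPC and subnet ids, node AMI id, ASG min/max sizes)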

At this point, more errors: ‘Unauthorized’.

Searching around, I found this post, which implies that not only should you not create the cluster with a root user, but the worker node Stack also needs to be created with the same IAM user.
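A quick way to check which identity your aws cli (and therefore aws-iam-authenticator) is actually using – this needs to match the IAM user that created the cluster:

aws sts get-caller-identity

# Example output (placeholder values):
# {
#     "UserId": "AIDAEXAMPLEID",
#     "Account": "123456789012",
#     "Arn": "arn:aws:iam::123456789012:user/my-iam-user"
# }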

Back to the start, trying one more time.

At this point I got distracted by the fact that it only takes 5 minutes and a couple of button clicks on Google Cloud to provision a Kubernetes cluster… so I’ll come back to getting this set up on AWS at a later point … in the meantime I’m off to kick the tires on Google Kubernetes Engine.