Kubernetes Rolling Updates: implementing a Readiness Probe with Spring Boot Actuator

During a Rolling Update on Kubernetes, if a service has a Readiness Probe defined, Kubernetes will use the result of calling this healthcheck to determine when updated pods are ready to start accepting traffic.

Kubernetes supports two probes to determine the health of a pod:

  • the Readiness Probe: used to determine if a pod is able to accept traffic
  • the Liveness Probe: used to determine if a pod is still responding appropriately; if not, it is killed and restarted

Spring Boot Actuator's default healthcheck, which indicates whether a service is up and ready for traffic, can be used as a Kubernetes Readiness Probe. To include it in an existing Spring Boot service, add the Actuator Maven dependency:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

This adds the default healthcheck, accessible at /actuator/health, which returns a 200 (and a JSON response of {"status":"UP"}) if the service is up and running.
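
To verify the endpoint before wiring it into Kubernetes, you can call it directly. A quick check, assuming the service is running locally on port 8080:

# call the Actuator health endpoint directly
curl -i http://localhost:8080/actuator/health
# expect an HTTP 200 response with a body of {"status":"UP"}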

To configure Kubernetes to call this Actuator healthcheck to determine the health of a pod, add a readinessProbe section to the container spec section for your deployment.yaml:

spec:
  containers:
  - name: exampleservice-b
    image: kevinhooke/examplespringboot-b:latest
    imagePullPolicy: Always
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /example-b/v1/actuator/health
        port: 8080
      initialDelaySeconds: 5
      timeoutSeconds: 5

Kubernetes will call this endpoint to check when the pod is deployed and ready for traffic. During a rolling update, as new pods are created with an updated image, you’ll see their status go from 0/1 available to 1/1 available as soon as the Spring Boot service has completed startup and the healthcheck is responding.
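
You can also watch this from the command line during a rollout. A couple of example commands, assuming the Deployment is named exampleservice-b (the same name used for the container above):

# watch pods move from 0/1 to 1/1 READY as new pods pass their readiness probe
kubectl get pods -w

# or follow the rollout of the Deployment until it completes
kubectl rollout status deployment/exampleservice-b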

The gif below shows deployment of an updated image. Notice how as new pods are created, they move from 0/1 to 1/1, and once they are ready, the old pods are destroyed:

Provisioning a Kubernetes cluster on AWS EKS (part 1, not up yet)

tldr; AWS introduced their version of a managed Kubernetes as a Service earlier this year, EKS. A link to the setup guide and my experience following these instructions is below. After running into a couple of issues (using the IAM role in the wrong place, and running the CloudFormation stack creation and cluster creation steps as a root user, instead of an IAM user), I spent at least a couple of hours trying to get an EKS cluster up, and then wanted to find out how easy or otherwise it is to provision a Kubernetes cluster on the other major cloud vendors. On Google Cloud, it turns out it’s incredibly easy – it took less than 5 minutes using their web console on my phone while in a car (as a passenger of course 🙂 ). From reading similar articles it sounds like the experience on Azure is similar. AWS have clearly got some work to do in this area to get their provisioning more like Google’s couple-of-button-clicks and you’re done approach. If you’re still interested in the steps to provision EKS then continue reading, otherwise in the meantime I’m off to play with Google’s Kubernetes Engine 🙂

The full Getting Started guide for AWS EKS is here, but here are the condensed steps required to deploy a Kubernetes cluster on AWS:

Create a Role in IAM for EKS:

Create a Stack using the linked CloudFormation template in the docs – I kept all the defaults and used the role I created above.

At this point, when I attempted to create the Stack, I got this error:

Template validation error: Role arn:aws:iam::xxx:role/eks_role is invalid or cannot be assumed

I assumed that the Role created in the earlier step was to be used here, but it's actually used later when creating your cluster, not for running the CloudFormation template – don't enter it here, leave this field blank.

When the Stack creation completes, you'll see its status change to CREATE_COMPLETE.

Back to the setup steps:

Install kubectl if you don’t already have it

Download and install aws-iam-authenticator, and add it to your PATH

Back to the AWS Console, head to EKS and create your cluster:

For the VPC selection, the VPC selected by default was my account's default VPC and not the VPC created during the Stack creation, so I changed it from this default to the Stack's VPC.

Since I had run and re-run the CloudFormation template a few times until I got it working, I ended up with a number of VPCs and SecurityGroups with the same name as the Stack. To work out which ones were currently in use, I went back to CloudFormation and checked the Outputs tab to get a list of the SecurityGroupIds, VPCIds and SubnetIds in use by the current Stack. Using this info I then selected the matching values for the VPC and SecurityGroup (the one with ControlPlaneSecurityGroup in the name).
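
The same Outputs can also be pulled from the CLI rather than the console. For example (the Stack name here is just a placeholder):

# list the SecurityGroupIds, VpcId and SubnetIds output by the Stack
aws cloudformation describe-stacks --stack-name eks-vpc --query "Stacks[0].Outputs"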

Cluster is up!

Initialize the aws cli and kubectl connection config to your cluster:

 aws eks update-kubeconfig --name cluster_name

At this point you have a running cluster with master nodes, but no worker EC2 nodes provisioned. That’s the next step in the Getting Started Guide.

Now check running services:

kubectl get svc

At this point, I was prompted for credentials and wondered what credentials it needed since my aws cli was already configured and logged in:

$ kubectl get svc
Please enter Username: my-iam-user
Please enter Password:

This post suggested that there's a step in the guide that requires you to create the cluster with an IAM user and not your root user. I obviously missed this and used my root user. I'll delete the cluster, log on as an IAM user, and try again.

Created a new cluster with an Admin IAM user, and now I can see the cluster starting with:

aws eks describe-cluster --name devel --query cluster.status

{
    "cluster": {
        "name": "test1",
        ...
        "status": "CREATING",
        ...
    }
}

Once the Cluster is up, continue with the instructions to add a worker node using the CloudFormation template file.
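
Condensed, the worker node step from the guide comes down to applying a ConfigMap so the nodes can join the cluster, then watching them register (the aws-auth-cm.yaml file comes from the guide and needs the NodeInstanceRole ARN from the worker node Stack's Outputs):

# allow the worker nodes to join the cluster
kubectl apply -f aws-auth-cm.yaml

# watch the nodes register and move to the Ready state
kubectl get nodes --watch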

At this point I hit more errors: ‘Unauthorized’.

Searching around I found this post, which implies that not only should you not create the cluster with a root user, but also that the stack needs to be created with the same IAM user.

Back to the start, trying one more time.

At this point I got distracted by the fact that it only takes 5 minutes and a couple of button clicks on Google Cloud to provision a Kubernetes cluster… so I’ll come back to getting this set up on AWS at a later point … in the meantime I’m off to kick the tires on Google Kubernetes Engine.

Migrating an existing WordPress + nginx + php5-fpm + mysql website to Docker containers: lessons learned

I've covered in previous posts why I wanted to Dockerize my site and move it to containers; you can read about it in my other posts shared here. Having played with Docker for personal projects for several months at this point, I thought it was going to be easy, but I ran into several issues and unexpected decisions that I needed to make. In this post I'll summarize a few of these issues and learning points.

Realizing the meaning of ‘containers are ephemeral’, or ‘where do I put my application data’?

Docker images are the blueprint for a container, while the container is a running instance of an image. It’s clear from the Docker docs and elsewhere that you should treat your containers as ‘ephemeral’, meaning they only exist while they’re up and running, their state is temporary, and once they are discarded their state is also lost.

This is an easy concept to grasp at a high level, but in practice it leads to important and valid questions, like ‘so where does my data go?’ This became very apparent to me when transferring my existing WordPress data. First, I have data in MySQL tables that needs to get imported into the new MySQL server running in a container. Second, where does wordpress/wp-content go, which in my case contains nearly 500MB of uploaded images from my 2,000+ posts?

The data for MySQL was easy to address, as the official MySQL docker image is already set up to use Docker’s data volume feature by default to externalize your MySQL data files outside of your running container.
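
As an example, running the official image with a named volume keeps the MySQL data files outside the container's writable layer, so they survive the container being removed and recreated (the volume name, password and image tag here are just placeholders):

# data files in /var/lib/mysql live in the named volume 'mysql_data', not in the container
docker run -d --name wordpress-mysql \
    -e MYSQL_ROOT_PASSWORD=changeme \
    -v mysql_data:/var/lib/mysql \
    mysql:5.7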

The issue of where to put my WordPress wp-content containing 500MB of upload files is what caused my aha moment with data volumes. Naively, you can create an image and use the COPY command to copy any number of files into it, even 500MB of images, but when you start to move this image around, like pushing it to a repository or a remote server, you quickly realize you've created something impractical. With an image containing this quantity of files, making incremental changes means you can no longer push it anywhere quickly.

To address this, I created an image with nginx and php5-fpm installed, but used Docker's bind mount feature to reference and load my static content from outside the container.
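
A sketch of how that container can be run with the bind mount (the image name and host path here are placeholders for my own):

# wp-content stays on the host and is mounted into the container at runtime,
# so the image itself stays small and quick to push
docker run -d --name wordpress-web -p 80:80 \
    -v /var/www/wordpress/wp-content:/var/www/html/wp-content \
    my-wordpress-nginx:latest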

Now I have my app in containers, how do I actually deploy to different servers?

Up until this point I'd built and run containers locally and set up a local Docker Repository to push images to for testing, but the main things I wanted this migration to enable were:

  • building and testing the containers locally
  • testing deployment to a VM server with an identical setup to my production KVM hosted server
  • pushing to my production server when I was ready to deploy to my live site

Before the native Windows and MacOS Docker installations, I thought docker-machine was just a way to deploy to a locally running Docker install in a VM. It never occurred to me that you can also use the docker-machine command to act on any remote Docker install too.

It turns out that even just setting an env var, DOCKER_HOST, to point to the IP of any remote Docker server will let you direct commands to that remote server. I believe part of the ‘docker-machine create’ setup helps automate setting up TLS certs for communicating with your remote server, but you can also do this manually following the steps here. I took this approach because I wanted to use the certs from my dev machine as well as my GitLab build machine.
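
A minimal sketch of pointing the local docker client at a remote Docker host over TLS (the IP and cert path are placeholders; the certs are the ones generated in the setup steps linked above):

# direct the local docker client at the remote Docker daemon over TLS
export DOCKER_HOST=tcp://192.168.1.50:2376
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=~/docker-certs/remote-server

# any docker command now runs against the remote server
docker ps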

I used this approach to build my images locally, and then set up a CI Pipeline so that on committing my Dockerfile and source changes to my GitLab repo, the same commands run and push automatically to a locally running test VM server, with a manual step to push to my production server.

I’ll cover my GitLab CI Pipeline setup in an upcoming post.

How do you monitor an application running in containers?

I’ve been looking at a number of approaches. Prometheus looks like a great option, and I’ve been setting this up on my test server to take a look. I’m still looking at a few related options, maybe even using Grafana to visualize metrics. I’ll cover this in a future post too.