Kubernetes node join command / token expired – generating a new token/hash for node join

After running ‘kubeadm init’ on the master node, it shows you the node join command, which includes a token and a hash. These values are only valid for 24 hours, so if you try to use them again after that the ‘kubeadm join’ command will fail with something like:

[discovery] Failed to connect to API Server “192.168.1.67:6443”: there is no JWS signed token in the cluster-info ConfigMap. This token id “78a69b” is invalid for this cluster, can’t connect

To create a new join string, from the master node run:

kubeadm token create --print-join-command
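
That prints a complete join command, with a fresh token and the CA cert hash, ready to run on the new node. The output looks something like this (the token and hash below are placeholders, and the address is the API server from the error above):

kubeadm join 192.168.1.67:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:<ca-cert-hash>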

Running the new join command string on your new nodes will now allow them to join the cluster.

This is described in the docs here.

Revisiting Docker and Kubernetes installation on CentOS7 (take 3)

I tried a while back to get a Kubernetes cluster up and running on CentOS 7, and captured my experience in this post here. At one point I did have it up and running, but after a reboot of my master and worker nodes I ran into an issue with some of the pods not starting, and decided to shelve this project for a while to work on something else.

Based on tips from a colleague who had recently worked through a similar setup, the main difference in his approach compared to my steps was that he didn’t do a vanilla install of Docker with ‘sudo yum install docker’, but instead installed Docker from Docker’s own CentOS repo.

Retracing my prior steps, the Kubernetes install instructions here tell you to do a ‘sudo yum install docker’, but the steps on the Docker site for CentOS here walk you through installing from Docker’s own repo. I followed those steps on a clean CentOS 7 install, and then continued with the Kubernetes setup.

Started the Docker service with:

sudo systemctl start docker.service

Next, instead of opening the required ports, since this is just a homelab setup I just disabled the firewall (following instructions from here):

sudo systemctl disable firewalld

And then stopped it from currently running:

sudo systemctl stop firewalld
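
If you’d rather leave firewalld running, the alternative is to open the ports kubeadm needs on the master. A sketch of that, using the port list from the kubeadm docs of this era (double-check against the current docs):

sudo firewall-cmd --permanent --add-port=6443/tcp        # Kubernetes API server
sudo firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd server client API
sudo firewall-cmd --permanent --add-port=10250/tcp       # kubelet API
sudo firewall-cmd --permanent --add-port=10251/tcp       # kube-scheduler
sudo firewall-cmd --permanent --add-port=10252/tcp       # kube-controller-manager
sudo firewall-cmd --reload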

Next, picking up with the Kubernetes instructions to install kubelet, kubeadm, etc.

Disable selinux:

sudo setenforce 0

From my previous steps, editing /etc/selinux/config and setting:

SELINUX=disabled
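
If you’d rather not hand-edit the file, a one-liner sed does the same thing (assuming the line currently reads SELINUX=enforcing):

sudo sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config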

CentOS 7-specific sysctl config for iptables (with the firewall disabled this might not be relevant, but adding it anyway):

cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
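
If sysctl reports that those net.bridge keys don’t exist, the br_netfilter kernel module probably isn’t loaded yet; loading it and re-checking looks like this:

sudo modprobe br_netfilter
sysctl net.bridge.bridge-nf-call-iptables   # should report = 1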

Disable swap:

swapoff -a

Also, edit /etc/fstab and remove the swap line and reboot.
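
A quick way to comment out the swap entry rather than deleting it (check /etc/fstab afterwards to be sure it caught the right line):

sudo sed -i '/ swap / s/^/#/' /etc/fstab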

Next, following the install instructions, add the repo file and then install with:

sudo yum install -y kubelet kubeadm kubectl
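
For reference, the repo file from the Kubernetes install docs at the time looked roughly like the below; the URLs and keys have likely changed since, so take the current version from the docs rather than copying this:

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF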

Enable and start the kubelet service:

sudo systemctl enable kubelet && sudo systemctl start kubelet

Starting the node init:

sudo kubeadm init

And realized we hadn’t addressed the cgroups config issue to get kubelet and docker using the same driver:

Dec 17 18:17:08 unknown000C2954104F kubelet[16450]: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: “systemd” is different from docker cgroup driver: “

I have a post on addressing this from when I was installing OpenShift Origin; follow the same steps here to reconfigure.

Re-run kubeadm init, then add the networking overlay (I installed Weave; the apply command is below), and I think we’re up.
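
For the record, installing Weave Net at the time was a single kubectl apply of the manifest URL from their docs, something like the below (the cloud.weave.works endpoint may well no longer exist, so check the current Weave/CNI docs rather than copying this):

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

With the overlay applied, check the nodes: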

[kev@unknown000C2954104F /]$ kubectl get nodes
NAME                  STATUS    ROLES     AGE       VERSION
unknown000c2954104f   Ready     master    31m       v1.9.0

Checking the pods though, the DNS pod was stuck restarting and not coming up clean. I found this ticket for exactly the issue I was seeing; the resolution was to switch back to cgroupfs for both Docker and Kubernetes.

I did this by backing out the addition previously made to /usr/lib/systemd/system/docker.service, then adding a new file, /etc/docker/daemon.json, containing:

{
"exec-opts": ["native.cgroupdriver=cgroupfs"]
}

Next, edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and replace:

Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"

with

Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"

Restart Docker:

sudo systemctl daemon-reload

sudo systemctl restart docker
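
Since the kubelet drop-in was edited too, kubelet most likely needs a restart as well to pick up the new cgroup driver flag:

sudo systemctl restart kubelet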

Check we’re back on cgroupfs:

sudo docker info |grep -i cgroup

Cgroup Driver: cgroupfs

And now check the nodes:

$ kubectl get nodes
NAME                  STATUS    ROLES     AGE       VERSION
unknown000c2954104f   Ready     master    1d        v1.9.0
unknown000c29c2b640   Ready     <none>    1d        v1.9.0

And the pods:

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                          READY     STATUS    RESTARTS   AGE
kube-system   etcd-unknown000c2954104f                      1/1       Running   3          1d
kube-system   kube-apiserver-unknown000c2954104f            1/1       Running   4          1d
kube-system   kube-controller-manager-unknown000c2954104f   1/1       Running   3          1d
kube-system   kube-dns-6f4fd4bdf-jtk82                      3/3       Running   123        1d
kube-system   kube-proxy-9b6tr                              1/1       Running   3          1d
kube-system   kube-proxy-n5tkx                              1/1       Running   1          1d
kube-system   kube-scheduler-unknown000c2954104f            1/1       Running   3          1d
kube-system   weave-net-f29k9                               2/2       Running   9          1d
kube-system   weave-net-lljgc

Now we’re looking good! Next up, let’s deploy something and check it works.

Kubernetes: useful kubectl commands

Working through the interactive tutorial here is a good reference for kubectl usage.

A few notes for reference:

Copy master node config to a remote machine (from here):

scp root@<master ip>:/etc/kubernetes/admin.conf .

When running on a remote machine, all of the kubectl commands can use the copied conf file by passing: --kubeconfig ./admin.conf
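
For example, either pass the flag each time or export KUBECONFIG for the session:

kubectl --kubeconfig ./admin.conf get nodes

# or set it once for the shell session
export KUBECONFIG=$(pwd)/admin.conf
kubectl get nodes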

Query nodes in the cluster:

kubectl get nodes


Show current cluster info:

./kubectl cluster-info

Kubernetes master is running at https://192.168.1.80:6443

KubeDNS is running at https://192.168.1.80:6443/api/v1/namespaces/kube-system/services/kube-dns/proxy

From the interactive tutorial:

Run kubernetes-bootcamp:

kubectl run kubernetes-bootcamp --image=docker.io/jocatalin/kubernetes-bootcamp:v1 --port=8080
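
The tutorial then exposes the deployment so it’s reachable from outside the cluster, along these lines:

kubectl expose deployment/kubernetes-bootcamp --type=NodePort --port=8080
kubectl get services   # note the assigned NodePort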

Pods:

kubectl get pods
kubectl describe pod podname
kubectl delete pod podname

Deployments:

kubectl get deployments
kubectl describe deployment deploymentname
kubectl delete deployment deploymentname

Get logs:

kubectl logs podname
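
A couple of other handy ones: follow logs as they are written, or get a shell inside a pod:

kubectl logs -f podname
kubectl exec -it podname -- /bin/bash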

HP iLO Integrated Remote Control Java Web Start App

I love that I can power on my HP DL380 server remotely using the iLO feature (it’s upstairs in my office). I hadn’t yet played with the ‘Integrated Remote Control’ though, which is available from a link on the iLO home page. The Java Web Start version runs ok from Firefox and lets you watch the start-up process remotely and interact with the server’s startup menus.

After ESXi has completed booting, apparently you need an additional license to continue using the remote console past that point. That’s not a big deal, since once ESXi is up there’s not much else you need the console for anyway (interacting with ESXi is all via its web app).