tl;dr: I tried installing Kubernetes from scratch on Fedora Atomic hosts, but couldn’t get it working. I captured the steps I went through up to the point where I got stuck. Thinking there had to be an easier way, I found kubeadm and successfully used that to get a cluster up and running on CentOS instead.
If you’re interested in the steps for my first failed attempt, they’re below. Otherwise, if you’re interested in the steps (and issues and fixes) to get kubeadm set up on CentOS 7, skip down to Attempt 2.
(Failed) Attempt 1: Setting up Kubernetes from scratch on Fedora Atomic
I followed the instructions here, but instead of 4 Atomic hosts I created 2 Fedora Atomic VMs on Proxmox. On the first of the VMs, I created a local Docker Registry container:
sudo docker create -p 5000:5000 \
  -v /var/lib/local-registry:/var/lib/registry \
  -e REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY=/var/lib/registry \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  --name=local-registry registry:2
Next, I followed the steps to change the SELinux context on the directory that Docker created for the persistent volume, and created a systemd unit file so that the registry starts at boot.
Next up, I edited the /etc/etcd/etcd.conf file to set the ports to listen on:
The first line replaced this original line:
and the second replaced:
After following the steps to generate the keys, I had to create the /etc/kubernetes/certs directory and then copy the generated files from below ./pki/ into it. Apart from ca.crt, the other two files were named server.crt and server.key, so I renamed them to match the instructions when copying them to the destination:
sudo cp pki/ca.crt /etc/kubernetes/certs
sudo cp pki/issued/server.crt /etc/kubernetes/certs/kubernetes-master.crt
sudo cp pki/private/server.key /etc/kubernetes/certs/kubernetes-master.key
I followed the steps up to the point of starting the Kubernetes services, and then got an error:
$ sudo systemctl start etcd kube-apiserver kube-controller-manager kube-scheduler
Job for kube-apiserver.service failed because the control process exited with error code. See "systemctl status kube-apiserver.service" and "journalctl -xe" for details.
Using journalctl -xe to take a look at the logs, I saw a *lot* of:
reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions//factory.go:70: Failed to list *v1.ReplicationController: Get http://192.168.1.76:8080/api/v1/replicationcontrollers?resourceVersion=0: dial tcp 192.168.1.76:8080: getsockopt: connection refused
This implies the API server is not running.
This is the point where I started reading around for help. Having walked through all the manual steps to get this far without getting it working, I also started to ask whether there’s a better or quicker way to set up Kubernetes. Turns out there is: kubeadm.
Successful attempt 2: Setting up a Kubernetes cluster using kubeadm on CentOS 7
Compared to the manual installation and configuration steps, kubeadm looks very attractive as it does a lot of the work for you.
Following the instructions here, the first issue I ran into was this step, which didn’t work for me on CentOS 7:
sudo sysctl net.bridge.bridge-nf-call-iptables=1
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
I skipped this (not knowing if there’s something comparable for CentOS) and continued on. Next, install Docker:
yum install -y docker
Enable at boot:
systemctl enable docker && systemctl start docker
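Going back to the sysctl error above: the net.bridge.* keys only appear under /proc/sys once the bridge netfilter kernel module is loaded, so a likely fix is to load that module first. The sketch below rests on assumptions that were not tested in this walkthrough: the module is assumed to be called br_netfilter on this kernel, and both function names are made up for illustration.

```shell
# Hypothetical helper: succeeds if the bridge netfilter sysctl key exists.
# The optional argument overrides the sysctl root, which is handy for dry runs.
bridge_sysctl_available() {
  [ -e "${1:-/proc/sys}/net/bridge/bridge-nf-call-iptables" ]
}

# Load the module only if the key is missing, then apply the setting.
# Assumption: the module is named br_netfilter on this kernel.
enable_bridge_iptables() {
  if ! bridge_sysctl_available; then
    sudo modprobe br_netfilter
  fi
  sudo sysctl net.bridge.bridge-nf-call-iptables=1
}

# Usage (as a user with sudo rights):
#   enable_bridge_iptables
```

Neither modprobe nor sysctl -w survives a reboot on its own, so the setting would have to be made persistent separately.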
Install kubelet, kubeadm and kubectl (this section straight from the docs):
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
setenforce 0
yum install -y kubelet kubeadm kubectl
systemctl enable kubelet && systemctl start kubelet
Next, use kubeadm to create a cluster. Reading ahead in the instructions, to initialize a network overlay such as flannel you also need to pass options to init, so:
kubeadm init --pod-network-cidr=10.244.0.0/16
This gave me some warnings, including one about kubelet not being enabled, so I must have missed the step above to enable and start kubelet:
[preflight] Running pre-flight checks
[preflight] WARNING: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] WARNING: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] WARNING: Running with swap on is not supported. Please disable swap or set kubelet's --fail-swap-on flag to false.
One more try:
systemctl enable kubelet && systemctl start kubelet
kubeadm init --pod-network-cidr=10.244.0.0/16
It seems you can’t run init more than once:
[preflight] Some fatal errors occurred:
    /etc/kubernetes/manifests is not empty
    /var/lib/kubelet is not empty
[preflight] If you know what you are doing, you can skip pre-flight checks with `--skip-preflight-checks`
Per the ‘tear down’ instructions, I ran kubeadm reset to clear out the state left behind by the previous attempt.
Then one more time:
kubeadm init --pod-network-cidr=10.244.0.0/16
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp [::1]:10255: getsockopt: connection refused.
Checking ‘systemctl status kubelet’ shows it failed on startup because swap was still enabled:
kubelet: error: failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap
To disable swap per answers to this question,
And then I edited /etc/fstab to remove the /dev/mapper/centos-swap line, and then rebooted.
kubeadm reset
kubeadm init --pod-network-cidr=10.244.0.0/16
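For reference, the swap changes described above could be scripted roughly as follows. This is a sketch under assumptions: swapoff -a is assumed as the way to turn swap off on the running system, and rather than deleting the /dev/mapper/centos-swap line it comments out every active swap entry so the change is easy to undo. The function name comment_out_swap is made up for illustration.

```shell
# Hypothetical helper: print the given fstab with every active swap entry
# commented out. The third field of an fstab line is the filesystem type.
comment_out_swap() {
  awk '!/^[[:space:]]*#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Usage (as root), keeping a backup of the original file:
#   swapoff -a
#   cp /etc/fstab /etc/fstab.bak
#   comment_out_swap /etc/fstab.bak > /etc/fstab
```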
To address the error about opening ports:
[preflight] WARNING: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
I opened up the ports with new rules:
sudo firewall-cmd --zone=public --add-port=6443/tcp --permanent
sudo firewall-cmd --zone=public --add-port=10250/tcp --permanent
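One detail worth knowing about the rules above: with --permanent, firewall-cmd only writes to the saved configuration, and the running firewall picks the rules up after a reload (or a reboot). A small sketch that prints the commands for a list of ports plus the reload; the helper name firewall_commands is hypothetical, and it only prints the commands so they can be reviewed before running.

```shell
# Hypothetical helper: print the firewall-cmd calls for a list of TCP ports,
# followed by the reload that applies permanent rules to the running firewall.
firewall_commands() {
  for port in "$@"; do
    printf 'firewall-cmd --zone=public --add-port=%s/tcp --permanent\n' "$port"
  done
  printf 'firewall-cmd --reload\n'
}

# Usage: review the output, then pipe it to a root shell.
#   firewall_commands 6443 10250 | sudo sh
```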
At this point the cluster starts initializing and I get this message for several minutes while it’s downloading/initializing:
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] This often takes around a minute; or longer if the control plane images have to be pulled.
I waited for several minutes, but it didn’t seem to do anything. Looking in ‘journalctl -xeu kubelet’, there are many ‘connection refused’ errors repeating over and over:
192.168.1.79:6443/api/v1/nodes: dial tcp 192.168.1.79:6443: getsockopt: connection refused
192.168.1.79:6443/api/v1/services?resourceVersion=0: dial tcp 192.168.1.79:6443: getsockopt: connection refused
192.168.1.79:6443/api/v1/nodes?fieldSelector=metadata.name%3Dunknown0a7dd195e7cb&resourceVersion=0: dial tcp 192.168.1.79:6443: getsockopt: connection refused
192.168.1.79:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dunknown0a7dd195e7cb&resourceVersion=0: dial tcp 192.168.1.79:6443: getsockopt: connection refused
n0a7dd195e7cb.14ebbc48a9463585: dial tcp 192.168.1.79:6443: getsockopt: connection refused' (may retry after sleeping)
Searching for anything similar I came up with these which have similar errors and symptoms:
The issue that helped me, though, was this one, where one of the steps to reproduce includes this comment:
# edit /etc/selinux/config and set SELINUX=disabled
The kubeadm install steps do state:
“Disabling SELinux by running setenforce 0 is required to allow containers to access the host filesystem”
but the steps only mention running setenforce 0, which I’m not sure is the same as editing /etc/selinux/config. I edited the file, set SELINUX=disabled, rebooted, fired up kubeadm init again, and this time we’re in luck!
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] This often takes around a minute; or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 43.002301 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node unknown0a7dd195e7cb as master by adding a label and a taint
[markmaster] Master unknown0a7dd195e7cb tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: xyz
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run (as a regular user):

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node as root:

  kubeadm join --token xyz --discovery-token-ca-cert-hash sha256:abc
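On the SELinux question above: setenforce 0 switches the running system to permissive mode and does not survive a reboot, while /etc/selinux/config sets the mode that is applied at boot, so the two are not equivalent. Below is a minimal sketch of the config edit, assuming GNU sed (as shipped on CentOS). The name set_selinux_mode is hypothetical, and whether SELINUX=permissive would have been enough here instead of disabled was not tested.

```shell
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file.
# $1 is the mode (enforcing, permissive or disabled); $2 is the file to edit,
# defaulting to /etc/selinux/config. The SELINUXTYPE= line is left untouched
# because the pattern requires "=" directly after "SELINUX".
set_selinux_mode() {
  sed -i "s/^SELINUX=.*/SELINUX=$1/" "${2:-/etc/selinux/config}"
}

# Usage (as root), followed by a reboot:
#   set_selinux_mode disabled
```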
Following the next steps in the instructions, I set up the config for my regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
And then continuing the steps to add the networking config, for flannel:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.8.0/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.8.0/Documentation/kube-flannel-rbac.yml
Checking the running pods to confirm we’re up and running:
[kev@unknown0A7DD195E7CB ~]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                          READY   STATUS             RESTARTS   AGE
kube-system   etcd-unknown0a7dd195e7cb                      1/1     Running            0          2h
kube-system   kube-apiserver-unknown0a7dd195e7cb            1/1     Running            0          2h
kube-system   kube-controller-manager-unknown0a7dd195e7cb   1/1     Running            0          2h
kube-system   kube-dns-545bc4bfd4-wfs5s                     3/3     Running            0          2h
kube-system   kube-flannel-ds-qtfwh                         1/2     CrashLoopBackOff   4          3m
kube-system   kube-proxy-ggvqx                              1/1     Running            0          2h
kube-system   kube-scheduler-unknown0a7dd195e7cb            1/1     Running            0          2h
Now we’re looking good, apart from the kube-flannel pod, which was still in CrashLoopBackOff when I captured this output.
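Since the listing above includes one pod (kube-flannel-ds-qtfwh) that is not in the Running state, a small filter makes such pods easier to spot on a busier cluster. This is only a sketch: pods_not_running is a hypothetical helper name, and it assumes the six-column layout shown above, where STATUS is the fourth column.

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on stdin
# and print "namespace/name status" for every pod whose STATUS (column 4)
# is anything other than Running. The header row is skipped.
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2, $4 }'
}

# Usage:
#   kubectl get pods --all-namespaces | pods_not_running
```

An empty result means every pod reports Running. Pods that have finished normally would also be listed, since the filter only compares the status text.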
The last step is adding an additional node and joining it to the cluster. Repeat all the steps up to kubeadm init on the new node, but instead of kubeadm init, run the kubeadm join command, passing the token and hash values.
Back on the master node, I ran:
[kev@unknown0A7DD195E7CB ~]$ kubectl get nodes
NAME                  STATUS     ROLES     AGE   VERSION
unknown0a7dd195e7cb   Ready      master    3h    v1.8.0
unknown121e8862fff9   NotReady   <none>    48s   v1.8.0
A few more seconds later:
[kev@unknown0A7DD195E7CB ~]$ kubectl get nodes
NAME                  STATUS     ROLES     AGE   VERSION
unknown0a7dd195e7cb   Ready      master    3h    v1.8.0
unknown121e8862fff9   Ready      <none>    51s   v1.8.0
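Rather than re-running kubectl get nodes by hand until the new node flips to Ready, the wait can be scripted. A rough sketch, assuming the column layout shown above with STATUS as the second column; nodes_not_ready is a made-up helper name.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS (column 2) is not exactly "Ready".
nodes_not_ready() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage: poll every 5 seconds until every node reports Ready.
#   while [ -n "$(kubectl get nodes | nodes_not_ready)" ]; do sleep 5; done
```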
A 2-node Kubernetes cluster, one master and one worker, ready for running some containers!