Troubleshooting User Data scripts when creating AWS EC2 instances

When an AWS EC2 User Data script fails, you’ll see something like this in /var/log/cloud-init.log on your instance:

2018-02-03 06:08:16,536 - util.py[DEBUG]: Failed running /var/lib/cloud/instance/scripts/part-001 [127]

Traceback (most recent call last):

  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 806, in runparts

    subp(prefix + [exe_path], capture=False)

  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1847, in subp

    cmd=args)

cloudinit.util.ProcessExecutionError: Unexpected error while running command.

Command: ['/var/lib/cloud/instance/scripts/part-001']

Exit code: 127

Reason: -

Stdout: -

Stderr: -

2018-02-03 06:08:16,541 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)

2018-02-03 06:08:16,541 - handlers.py[DEBUG]: finish: modules-final/config-scripts-user: FAIL: running config-scripts-user with frequency once-per-instance

It tells you something failed, but not what. The trouble is that the output from your User Data script does not go to cloud-init.log by default.
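If you’re not sure where the output went, it’s worth checking both cloud-init logs on the instance before anything else. On Ubuntu cloud images, any captured stdout/stderr tends to land in /var/log/cloud-init-output.log rather than cloud-init.log, and the generated script itself is saved to disk. A quick hunt (paths assumed from a stock Ubuntu AMI):

```shell
# Tail the cloud-init module log (errors and tracebacks land here)
sudo tail -n 50 /var/log/cloud-init.log

# On Ubuntu cloud images, stdout/stderr of boot stages is captured separately
sudo tail -n 50 /var/log/cloud-init-output.log

# The User Data script itself is saved here, so you can inspect or re-run it
sudo cat /var/lib/cloud/instance/scripts/part-001
```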

One of the answers in this post suggests piping your script commands and output through tee and logger into a separate log file, like this:

set -x
exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1
echo BEGIN
date '+%Y-%m-%d %H:%M:%S'

Now a run of my script, which includes an ‘apt-get update -y’, looks like:

+ echo BEGIN
BEGIN
+ date '+%Y-%m-%d %H:%M:%S'
2018-02-03 23:37:55
+ apt-get update -y
... output continues here

And further down, here’s the specific error I was looking for:

+ java -Xmx1024M -Xms1024M -jar minecraft_server.1.12.2.jar nogui

/var/lib/cloud/instance/scripts/part-001: line 11: java: command not found

My EC2 instance running the Ubuntu AMI does not have Java installed by default, so I need to install it by adding this to my User Data script:

apt-get install openjdk-8-jre-headless -y

… and now my script runs as expected.
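Putting the pieces above together, the full User Data script looks roughly like this (the Minecraft jar name and Java heap sizes are from my setup; adjust for yours):

```shell
#!/bin/bash
set -x
# Send all script output to /var/log/user-data.log and to syslog (tagged user-data)
exec > >(tee /var/log/user-data.log | logger -t user-data) 2>&1
echo BEGIN
date '+%Y-%m-%d %H:%M:%S'

apt-get update -y
# Java isn't on the Ubuntu AMI by default
apt-get install openjdk-8-jre-headless -y

java -Xmx1024M -Xms1024M -jar minecraft_server.1.12.2.jar nogui
```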

 

GitLab service control commands

After installing GitLab from the omnibus install, use the gitlab-ctl command to query status and to start/stop the GitLab services (see here):

$ sudo gitlab-ctl status

If GitLab’s main service has been disabled, all of the sub-services will report ‘runsv not running’:

fail: gitaly: runsv not running

You can reset the main service to run at startup with (see here):

sudo systemctl enable gitlab-runsvdir.service

To disable startup at boot:

sudo systemctl disable gitlab-runsvdir.service

If runsvdir is not enabled to start at boot, then start with:

sudo systemctl start gitlab-runsvdir.service
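To check which of the above you actually need, standard systemctl subcommands will tell you whether the runsvdir unit is enabled at boot and whether it is running right now (a quick sketch; output values are the usual systemd states):

```shell
# Is the GitLab supervisor unit set to start at boot? ("enabled" / "disabled")
systemctl is-enabled gitlab-runsvdir.service

# Is it running right now? ("active" / "inactive")
systemctl is-active gitlab-runsvdir.service
```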

To start/stop gitlab:

$ sudo gitlab-ctl start

$ sudo gitlab-ctl stop

Deploying kubernetes Dashboard to a kubeadm created cluster

Per the installation steps here, to deploy the dashboard:

kubectl --kubeconfig ./admin.conf apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

Then start the local proxy:

./kubectl --kubeconfig ./admin.conf proxy
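Before poking at dashboard URLs, the API server’s version endpoint makes an easy smoke test that the proxy itself is up (assuming the default proxy port of 8001):

```shell
# kubectl proxy listens on 127.0.0.1:8001 by default;
# this should return a JSON version.Info blob if the proxy and apiserver are reachable
curl -s http://localhost:8001/version
```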

Accessing it via http://localhost:8001/ui gives this error:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "no endpoints available for service \"kubernetes-dashboard\"",
  "reason": "ServiceUnavailable",
  "code": 503
}

Checking what’s currently running with:

./kubectl --kubeconfig ./admin.conf get pods --all-namespaces

Looks like the dashboard app is not happy:

kube-system   kubernetes-dashboard-747c4f7cf-p8blw          0/1       CrashLoopBackOff   22         1h

Checking the logs for the dashboard:

./kubectl --kubeconfig ./admin.conf logs kubernetes-dashboard-747c4f7cf-p8blw --namespace=kube-system

2017/10/19 03:35:51 Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: getsockopt: no route to host

OK.
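As an aside, the usual way to dig into a CrashLoopBackOff before rebuilding anything is to look at the pod’s events and at the logs of the previous (crashed) container instance (pod name taken from the listing above):

```shell
# The Events section at the bottom often shows probe failures or image pull issues
./kubectl --kubeconfig ./admin.conf describe pod kubernetes-dashboard-747c4f7cf-p8blw -n kube-system

# Logs from the previous container instance, i.e. the one that crashed
./kubectl --kubeconfig ./admin.conf logs kubernetes-dashboard-747c4f7cf-p8blw -n kube-system --previous
```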

I set up my master node using the flannel overlay. I don’t know whether this makes any difference, but I noticed this article using kubeadm used Weave Net instead. Not knowing how to move forward (and after browsing many posts and tickets on issues with kubeadm and the Dashboard), but knowing at least that kubeadm + Weave Net works for installing the dashboard, I tried this approach instead.
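For reference, the re-initialization went roughly like this (a sketch, not the exact transcript; the Weave Net apply URL is the one Weave documented for kubeadm clusters at the time, and may have changed since):

```shell
# On the master: tear down the old cluster state and re-initialize
sudo kubeadm reset
sudo kubeadm init

# Install the Weave Net pod network add-on, matched to the cluster's version
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
```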

After re-initializing and adding Weave Net, my pods have all started:

$ kubectl get pods --all-namespaces

NAMESPACE     NAME                                          READY     STATUS    RESTARTS   AGE

kube-system   etcd-unknown000c2960f639                      1/1       Running   0          11m

kube-system   kube-apiserver-unknown000c2960f639            1/1       Running   0          11m

kube-system   kube-controller-manager-unknown000c2960f639   1/1       Running   0          11m

kube-system   kube-dns-545bc4bfd4-nhrw7                     3/3       Running   0          12m

kube-system   kube-proxy-cgn45                              1/1       Running   0          4m

kube-system   kube-proxy-dh6jm                              1/1       Running   0          12m

kube-system   kube-proxy-spxm5                              1/1       Running   0          5m

kube-system   kube-scheduler-unknown000c2960f639            1/1       Running   0          11m

kube-system   weave-net-gs8nh                               2/2       Running   0          5m

kube-system   weave-net-jkkql                               2/2       Running   0          4m

kube-system   weave-net-xb4hx                               2/2       Running   0          10m

 

Trying to add the dashboard once more:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

… and, o. m. g. :

$ kubectl get pods --all-namespaces

NAMESPACE     NAME                                          READY     STATUS    RESTARTS   AGE

kube-system   etcd-unknown000c2960f639                      1/1       Running   0          37m

kube-system   kube-apiserver-unknown000c2960f639            1/1       Running   0          37m

kube-system   kube-controller-manager-unknown000c2960f639   1/1       Running   0          37m

kube-system   kube-dns-545bc4bfd4-nhrw7                     3/3       Running   0          38m

kube-system   kube-proxy-cgn45                              1/1       Running   0          30m

kube-system   kube-proxy-dh6jm                              1/1       Running   0          38m

kube-system   kube-proxy-spxm5                              1/1       Running   0          31m

kube-system   kube-scheduler-unknown000c2960f639            1/1       Running   0          37m

kube-system   kubernetes-dashboard-747c4f7cf-jgmgt          1/1       Running   0          4m

kube-system   weave-net-gs8nh                               2/2       Running   0          31m

kube-system   weave-net-jkkql                               2/2       Running   0          30m

kube-system   weave-net-xb4hx                               2/2       Running   0          36m

Starting kubectl proxy and hitting localhost:8001/ui now gives me:

Error: 'malformed HTTP response "\x15\x03\x01\x00\x02\x02"'
Trying to reach: 'http://10.32.0.3:8443/'

Reading here, I tried the master node directly:

https://192.168.1.80:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

gives a different error:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "services \"https:kubernetes-dashboard:\" is forbidden: User \"system:anonymous\" cannot get services/proxy in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {
    "name": "https:kubernetes-dashboard:",
    "kind": "services"
  },
  "code": 403
}

… but reading further ahead, it seems accessing via the /ui URL does not work correctly; you need to use the URL in the docs here, which say the correct URL is:

http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

and now I get an authentication page.

Time to read ahead on the authentication approaches.

List available tokens with:

kubectl -n kube-system get secret

Using the same token as in the docs (although at this point I honestly have no idea what the difference in permissions is between the different tokens):

./kubectl --kubeconfig admin.conf -n kube-system describe secret replicaset-controller-token-7tzd5

And then pasting the token value into the authentication dialog gets me logged on! There are some errors about this token not having access to some features, but at this point I’m just glad I’ve managed to get this deployed and working!
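If you’d rather not borrow a controller’s token, a cleaner approach is to create a dedicated service account with cluster-admin rights and use its token to log in (the account and binding names below are my own choice, not from the dashboard docs):

```shell
# Create a service account for dashboard logins
kubectl -n kube-system create serviceaccount dashboard-admin

# Grant it cluster-admin (fine for a lab cluster; far too broad for production)
kubectl create clusterrolebinding dashboard-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=kube-system:dashboard-admin

# Find the account's secret and print its token for the login dialog
kubectl -n kube-system describe secret \
  $(kubectl -n kube-system get secret | grep dashboard-admin | awk '{print $1}')
```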

If you’re interested in the specific versions I’m using, this is deployed to CentOS 7, with this kubernetes version:

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.1", GitCommit:"f38e43b221d08850172a9a4ea785a86a3ffa3b3a", GitTreeState:"clean", BuildDate:"2017-10-11T23:27:35Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.1", GitCommit:"f38e43b221d08850172a9a4ea785a86a3ffa3b3a", GitTreeState:"clean", BuildDate:"2017-10-11T23:16:41Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}