1

Background

I created a fresh Kubernetes cluster using kubeadm init --config /home/kube/kubeadmn-config.yaml --upload-certs and then joined the second control plane node by running the command below.

kubeadm join VIP:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <key> \
    --v=5

Question

Are etcdctl commands supposed to come back with a return value, either when run directly or through the docker exec method shown below? I have these packages installed: kubeadm, kubectl, kubelet, and docker.

Kubectl version: 1.20.1
OS: Ubuntu 18.04

Commands from the first node

Command

etcdctl cluster-health

Response

cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:4001: connect: connection refused
; error #1: EOF

error #0: dial tcp 127.0.0.1:4001: connect: connection refused
error #1: EOF

Command

docker container ls | grep k8s_POD_etcd

Response

k8s_POD_etcd-<nodename>_kube-system_<docker container id>

Command

docker exec -it k8s_POD_etcd-<nodename>_kube-system_<docker container id> etcdctl --endpoints=https://<node ip>:2379 --key=/etc/kubernetes/pki/etcd/peer.key --cert=/etc/kubernetes/pki/etcd/peer.crt --cacert=/etc/kubernetes/pki/etcd/ca.crt member list

Response

OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"etcdctl\": executable file not found in $PATH": unknown

EDIT

Upgraded to v3.2 etcdctl API

Command

etcdctl endpoint status

Response

Failed to get the status of endpoint 127.0.0.1:2379 (context deadline exceeded)
PieDev
  • Hi PieDev, welcome to S.F. That 4001 port is the legacy one, used by etcd2 which is almost certainly not supported by k8s; I would guess it's either an ancient binary or is missing `ETCDCTL_API=3` and the associated --endpoints (`ETCDCTL_ENDPOINTS`) values to point it to the modern :2379 port. I would further guess the etcd certs are volume mounted from `/etc/kubernetes/pki/etcd` on the host, and thus you don't need etcdctl to exist inside the docker image, you can use the system version – mdaniel Jan 24 '21 at 03:59
  • @PieDev Any progress? – Wytrzymały Wiktor Jan 27 '21 at 10:10
  • @mdaniel See the edit I made. I do see .crt and .key files in `ls -l /etc/kubernetes/pki/etcd/`. – PieDev Feb 01 '21 at 03:20
  • @WytrzymałyWiktor See the edit I made. – PieDev Feb 01 '21 at 03:20
  • Did @Matt's answer help you solve your problem? If yes, please consider accepting and upvoting it. [What should I do when someone answers my question](https://stackoverflow.com/help/someone-answers)? – Fariya Rahmat Apr 02 '22 at 07:07

3 Answers

1

The error mentioned by the OP is caused by the etcdctl executable not existing in the container.

Why? Because he used the wrong container. Look at the following command:

docker container ls | grep k8s_POD_etcd
be510c179ced   k8s.gcr.io/pause:3.2   "/pause"  2 days ago  Up 2 days   k8s_POD_etcd-minikube_kube-system_2315889f8b2b54f1b9d43feafe941d01_0

Notice the container is k8s.gcr.io/pause:3.2. It's not an etcd container.

But why? What is this pause container? I won't answer this question because somebody already answered it here: what-are-the-pause-containers.

I will try to answer a better question: Where is the actual etcd container?

Let's have a look at the output of the same command, but with a slightly modified grep; let's grep for etcd:

docker container ls | grep etcd
c989e7d1d25b   0369cf4303ff           "etcd --advertise-cl…"   2 days ago       Up 2 days k8s_etcd_etcd-minikube_kube-system_2315889f8b2b54f1b9d43feafe941d01_0
be510c179ced   k8s.gcr.io/pause:3.2   "/pause"                 2 days ago       Up 2 days k8s_POD_etcd-minikube_kube-system_2315889f8b2b54f1b9d43feafe941d01_0

Now we have two lines of output, one is the previously found pause container, and the second one is our etcd container with a name starting with k8s_etcd_etcd. Let's see if we can run docker exec on this container:

$ docker exec -it k8s_etcd_etcd-<nodename>_kube-system_<docker container id> etcdctl version
etcdctl version: 3.4.13
API version: 3.4

Yes, we can!


To summarize: it looks like you were looking at the wrong container from the very beginning.
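
As a rough sketch of the steps above, the following picks the container whose name starts with k8s_etcd_ (not k8s_POD_) and runs etcdctl inside it. The helper name pick_etcd_container is made up for illustration, and the endpoint https://127.0.0.1:2379 and the server.crt/server.key pair are assumptions based on a default kubeadm layout, so adjust them to your setup.

```shell
#!/bin/sh
# Hypothetical helper: read container names on stdin and print the first one
# that belongs to the real etcd container (k8s_etcd_...), skipping the
# pause container (k8s_POD_...).
pick_etcd_container() {
    grep '^k8s_etcd_' | head -n 1
}

# Only talk to Docker when it is installed, so the helper can be tried alone.
if command -v docker >/dev/null 2>&1; then
    name=$(docker container ls --format '{{.Names}}' 2>/dev/null | pick_etcd_container)
    if [ -n "$name" ]; then
        # Assumption: the etcd certs are mounted at the same path inside the container.
        docker exec -e ETCDCTL_API=3 "$name" etcdctl \
            --endpoints=https://127.0.0.1:2379 \
            --cacert=/etc/kubernetes/pki/etcd/ca.crt \
            --cert=/etc/kubernetes/pki/etcd/server.crt \
            --key=/etc/kubernetes/pki/etcd/server.key \
            member list
    else
        echo "no running k8s_etcd_ container found" >&2
    fi
fi
```

Matching on the k8s_etcd_ prefix instead of a plain "etcd" substring is what keeps the pause container out of the result.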

Matt
0

"context deadline exceeded" is an unclear error returned by the gRPC client when it can't establish a connection. If you want to see the exact error message, you should set ETCDCTL_API=2 (more details on that can be found here).
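
Another way to narrow the problem down, sketched here under assumptions, is to query etcd's /health endpoint with curl, which tends to report a refused connection or a certificate problem more directly than a timeout. The function name etcd_health is hypothetical, and the default endpoint and the healthcheck-client cert/key pair assume a standard kubeadm layout.

```shell
#!/bin/sh
# Hypothetical helper: ask etcd's /health endpoint over TLS.
# $1: client endpoint (assumed default https://127.0.0.1:2379)
# $2: directory holding the etcd certs (assumed default /etc/kubernetes/pki/etcd)
etcd_health() {
    endpoint=${1:-https://127.0.0.1:2379}
    pki=${2:-/etc/kubernetes/pki/etcd}
    curl --silent --show-error --max-time 5 \
        --cacert "$pki/ca.crt" \
        --cert "$pki/healthcheck-client.crt" \
        --key "$pki/healthcheck-client.key" \
        "$endpoint/health"
}
```

Run it as root on the control plane node, for example etcd_health https://127.0.0.1:2379, since the key files are readable by root only.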

The cert/key pairs in /etc/kubernetes/pki/etcd/ should look something like this:

# ls -l /etc/kubernetes/pki/etcd/
total 32
-rw-r--r--    1 root     root          1017 Nov 12 15:32 ca.crt
-rw-------    1 root     root          1679 Nov 12 15:32 ca.key
-rw-r--r--    1 root     root          1094 Nov 12 15:32 healthcheck-client.crt
-rw-------    1 root     root          1675 Nov 12 15:32 healthcheck-client.key
-rw-r--r--    1 root     root          1180 Nov 12 15:32 peer.crt
-rw-------    1 root     root          1675 Nov 12 15:32 peer.key
-rw-r--r--    1 root     root          1180 Nov 12 15:32 server.crt
-rw-------    1 root     root          1679 Nov 12 15:32 server.key

# etcdctl --version
etcdctl version: 3.3.1
API version: 2

# ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key
Snapshot saved at snapshot.db

# ETCDCTL_API=3 etcdctl --write-out=table snapshot status snapshot.db
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| b9d500f7 |    72966 |       1194 |     4.9 MB |
+----------+----------+------------+------------+

Make sure that you apply the right cert/key pair. Also, this guide can help you out.

Note that etcd takes several certificate related configuration options, either through command-line flags or environment variables. The basic setup for it can be found here.
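
For the client side, a minimal sketch of the environment-variable approach could look like this. With the v3 API, etcdctl reads its global flags from ETCDCTL_-prefixed variables; the endpoint and the server.crt/server.key pair below are assumptions taken from the listing above, not values confirmed for your cluster.

```shell
#!/bin/sh
# Each variable mirrors an etcdctl flag: --endpoints, --cacert, --cert, --key.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379          # assumed client URL
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt  # assumed cert/key pair
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# Only call etcdctl when it is installed on the host.
if command -v etcdctl >/dev/null 2>&1; then
    etcdctl endpoint status --write-out=table \
        || echo "etcdctl could not reach etcd with these settings" >&2
fi
```

With the variables exported, later commands such as etcdctl member list can be run without repeating the flags.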

  • @PieDev Does this [answer your question](https://stackoverflow.com/help/someone-answers)? – Wytrzymały Wiktor Feb 05 '21 at 09:19
  • @PieDev If the above does not solve your issue, then please edit your question and provide the output from `kubectl get pods -n kube-system` and `kubectl -n kube-system describe pod etcd` to verify whether etcd is running in your k8s cluster. Also, what was your `kubeadm init` config in terms of etcd? – Wytrzymały Wiktor Feb 09 '21 at 12:28
0

If you are using an Alpine-based image, try:

docker exec -it <container-id> sh

It can also happen due to an ordering mistake. You might need to use /bin/bash or /bin/sh, depending on the shell in your container.

The reason is documented in the ReleaseNotes file of Git and it is well explained here - Bash in Git for Windows: Weirdness

some more solution:

MD SHAYON