
I have a three-node multi-master Kubernetes (1.17.3) cluster (stacked control plane and etcd nodes):

11.11.11.1 - master1
11.11.11.2 - master2
11.11.11.3 - master3

Before going to production, I am testing possible failure scenarios, and I performed the steps below.

Graceful Removal of Master Nodes

  • Run kubectl drain 11.11.11.3 on master3
  • Run kubeadm reset on master3
  • kubectl delete node 11.11.11.3 on master3

After applying the above steps, all pods were running on master1 and master2, and the entries for master3 were removed from the kubeadm-config ConfigMap and from etcd. In fact, I then ran the same steps on master2: the one remaining master stayed up and I could still run kubectl.

Non-Graceful Removal of Master Nodes

  • I shut down master3 and did not face any issue; the two remaining masters were still accessible and I could run kubectl and do administration.
  • As soon as I shut down master2, I lost access to kubectl and it reported that the apiserver is not accessible. How can I recover master1 in this situation?

In production it can happen that two nodes have hardware issues at the same time. From my research this looks like an etcd issue, but how can I access etcd and remove master2 and master3? I thought I could run docker ps and docker exec <etcd container>, but docker ps is not showing an etcd container.

ImranRazaKhan
  • If your etcd cluster is on your masters and you shut down master2 and master3 at the same time, etcd will not be able to form a quorum, so nothing will work. Best to have a 5+ node etcd cluster. – Mark Wagner Jun 05 '20 at 18:38

2 Answers


A topic near and dear to my heart.

The short version is:

  1. create an etcd snapshot from the surviving etcd node (see the sketch after this list)
  2. create a new, 3-node "disposable" etcd cluster, restoring from that snapshot
  3. stop master1's etcd member
  4. reset its state (rm -rf /var/lib/etcd, and delete the server and peer certs unless you have used the same CA for the disposable cluster -- something I highly recommend but may not be possible for a variety of reasons)
  5. join it to the new, now healthy disposable etcd cluster
  6. now you can either bring the other masters back online, joining them to that new cluster until you have a new quorum and then tearing down the disposable cluster, or you can walk through the disposable etcd members, removing them from the cluster one at a time until only master1 remains in your "cluster of one"
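For step 1, a minimal sketch of taking that snapshot on a stacked control-plane node, assuming kubeadm's default etcd certificate paths and that the surviving member is still serving on 127.0.0.1:2379 (adjust endpoints and paths to your environment):

# run on the surviving master (master1); paths below are kubeadm defaults, not verified against your setup
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd-snapshot.db

If etcdctl isn't installed on the host, the same command can be run inside the etcd container itself (e.g. via docker exec into it), since the etcd image ships etcdctl.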

I have had great success using etcdadm to automate all of the steps I just described, with the bad news being that you have to build the etcdadm binary yourself, because they don't -- as of this message -- attach built artifacts to their releases.
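Building it is an ordinary Go build of the kubernetes-sigs/etcdadm repository; roughly along these lines (the exact build target may vary by release):

git clone https://github.com/kubernetes-sigs/etcdadm
cd etcdadm
make            # or: go build -o etcdadm .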

In the future, you'll want to include etcdctl member remove $my_own_member_id in any orderly master teardown process (sketched below), since if a member just disappears from an etcd cluster, that's damn near fatal to the cluster. There is an etcd issue speaking to the fact that etcd really is fragile, and you need a bigger team running it than you do kubernetes itself :-(
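A hedged sketch of that teardown step, again assuming kubeadm's default cert paths, run on the master being decommissioned before kubeadm reset:

ETCD_FLAGS="--endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key"
# find this node's member ID
ETCDCTL_API=3 etcdctl $ETCD_FLAGS member list
# then remove it from the cluster while quorum still exists
ETCDCTL_API=3 etcdctl $ETCD_FLAGS member remove <member-id>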

mdaniel
  • How can I create a snapshot from the surviving node? As I mentioned, it is "Stacked control plane and etcd nodes" and I am not able to access the etcd container on the surviving node. – ImranRazaKhan Jun 06 '20 at 21:36
  • Was the etcd container running _in cluster_ style "stacked," or merely running on the master nodes style stacked? Also, did you run `docker ps -a`, to show stopped containers, or just literally `docker ps`? It has been my experience that kubelet continues to run Pods even with the loss of the control plane – mdaniel Jun 06 '20 at 21:53
  • Please also [edit your question](https://serverfault.com/posts/1020224/edit) to include more details of what you have already tried and the results of those experiments. SF is not "ssh commands over email," so you'll need to write down what you've done rather than go back-and-forth forcing us to ask clarifying questions – mdaniel Jun 06 '20 at 21:54
  • I have updated my answer; I hope it clarifies my intention in the question. – ImranRazaKhan Jun 11 '20 at 10:39

Keeping in view my findings and testing: to restore the full functionality of your k8s cluster, make the remaining etcd node a standalone etcd cluster. Follow the steps below.

Since we lost two master nodes out of three, the etcd pod on the remaining node is not able to reach quorum, so this static pod keeps failing and exiting, and you can't run etcdctl member remove <node>.
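You can confirm this by asking the container runtime directly, since kubectl itself is down at this point; a sketch assuming Docker is the runtime, as in this cluster:

# -a also lists exited containers, which is where a crash-looping etcd will show up
docker ps -a | grep etcd
docker logs <etcd container id>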

Stop the etcd pod:

  • mv /etc/kubernetes/manifests/etcd.yaml /backup
  • mv /var/lib/etcd/member /backup/

Temporarily force a new cluster on the etcd host:

  • cp /backup/etcd.yaml etcd.yaml.bak
  • edit /backup/etcd.yaml and change the following values

etcd.yaml

Remove the lost nodes (master2 and master3) from --initial-cluster, leaving only master1:
- --initial-cluster=master1=https://10.11.158.114:2380

Remove the following line:
- --initial-cluster-state=existing

Add the following line:
- --force-new-cluster
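After these edits, the relevant part of the command list in etcd.yaml should look roughly like this (IP/name taken from the example above; every other kubeadm-generated flag stays untouched):

- etcd
- --initial-cluster=master1=https://10.11.158.114:2380
- --force-new-cluster
# --initial-cluster-state=existing removed; all other flags unchanged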

Restart the etcd pod:

 cp /backup/etcd.yaml /etc/kubernetes/manifests/

Once etcd is healthy again, remove the --force-new-cluster flag from /etc/kubernetes/manifests/etcd.yaml, as otherwise it will force a new cluster on every restart.

Restore to original state (if possible)

If master2 and master3 did not crash but were lost for some other reason (meaning the disk and data are still available), then as soon as these masters are available again you can do the following steps to go back to the original state:

cd /etc/kubernetes/manifests
rm -rf etcd.yaml
cd /var/lib/etcd
rm -rf member
cp -R /backup/member /var/lib/etcd/
cp /backup/etcd.yaml /etc/kubernetes/manifests/

This is how things worked for me. Please suggest if I can optimize these steps.

ImranRazaKhan
  • I also have the same issue with restoring etcd. I followed your steps and restoring the first master works, but master2 and master3 don't work. Their log shows "caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"192.168.50.41:55540","server-name":"","error":"tls: first record does not look like a TLS handshake" Do you know how to fix it? – letitbe Apr 06 '22 at 10:25