Recovering from losing quorum in an etcd cluster

If you are running a Kubernetes cluster in an HA setup with, say, 3 K8S Master Nodes, losing a majority of those nodes will stop the cluster from reaching Consensus and thus from functioning.

The main reason this breaks the HA cluster is that the etcd cluster loses the ability to establish quorum between its members. Even though 2 of the 3 members are dead, as long as they are not explicitly removed with etcdctl member remove, the remaining node still counts them as members and keeps trying to reach them.
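The arithmetic behind this can be sketched in a few lines. This is only an illustration of majority quorum as used by Raft-style consensus; the function names `quorum` and `has_quorum` are made up for this note and are not part of etcd.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def has_quorum(members: int, alive: int) -> bool:
    """True if the members still alive can form a majority."""
    return alive >= quorum(members)


# A 3-member cluster survives one failure, but not two.
for alive in (3, 2, 1):
    print(f"3 members, {alive} alive -> quorum: {has_quorum(3, alive)}")
```

The key point is that the member count stays at 3 until dead members are removed, so a single survivor can never reach the required majority of 2 on its own.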

This is a problem, but luckily all we need to do is remove the dead members, right? Right?

Well, if the etcd cluster can't elect a leader, you can't even run etcdctl member list against it, let alone remove a member. The cluster is stuck in leader election and doesn't care about your input.

The way out of this mess is thankfully pretty simple. You just need to add the --force-new-cluster flag to the etcd command (inside the Kubelet's K8S Static Pod manifest), which will initialize a new single-node etcd cluster and remove the dead members.
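As a rough sketch of that edit, the helper below adds the flag to the manifest text. It assumes a kubeadm-style manifest where the container command is a YAML list starting with `- etcd`; the function name `add_force_new_cluster` and the sample manifest are hypothetical, and on a kubeadm node the file usually lives at /etc/kubernetes/manifests/etcd.yaml (an assumption, check your own setup).

```python
import re

FLAG = "--force-new-cluster"


def add_force_new_cluster(manifest: str) -> str:
    """Return the manifest text with the flag added right after `- etcd`."""
    if FLAG in manifest:
        return manifest  # already patched, nothing to do
    # Match the `- etcd` entry of the command list and capture its
    # indentation so the new entry lines up with it.
    pattern = re.compile(r"^([ \t]*)- etcd[ \t]*$", re.MULTILINE)
    patched, count = pattern.subn(
        lambda m: f"{m.group(0)}\n{m.group(1)}- {FLAG}", manifest, count=1
    )
    if count == 0:
        raise ValueError("no `- etcd` command entry found in manifest")
    return patched


# Hypothetical, heavily shortened manifest for illustration only.
SAMPLE = """\
spec:
  containers:
  - command:
    - etcd
    - --data-dir=/var/lib/etcd
    name: etcd
"""

print(add_force_new_cluster(SAMPLE))
```

Back up the manifest and the etcd data directory before touching anything. It is probably also wise to remove the flag again once the node is healthy, since leaving it in would force a new cluster on every restart.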

After this is done, you just need to add the rest of the nodes to this new cluster, either with etcdctl member add or, if you are using kubeadm, with kubeadm join phase control-plane-join etcd --control-plane.
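For the manual route, the sketch below assembles an etcdctl member add invocation. The helper name `member_add_command`, the member name, and the addresses are placeholders, and the certificate paths assume the kubeadm default layout under /etc/kubernetes/pki/etcd; adjust all of them to your cluster.

```python
import shlex


def member_add_command(
    name: str,
    peer_url: str,
    endpoint: str,
    pki_dir: str = "/etc/kubernetes/pki/etcd",  # assumed kubeadm default
) -> list[str]:
    """Build the argv for adding one member to the recovered cluster."""
    return [
        "etcdctl",
        f"--endpoints={endpoint}",          # client URL of the surviving member
        f"--cacert={pki_dir}/ca.crt",
        f"--cert={pki_dir}/server.crt",
        f"--key={pki_dir}/server.key",
        "member", "add", name,
        f"--peer-urls={peer_url}",          # peer URL of the node being added
    ]


# Placeholder values; run the printed command on the surviving node.
cmd = member_add_command("master-2", "https://10.0.0.2:2380", "https://10.0.0.1:2379")
print(shlex.join(cmd))
```

Adding members one at a time and waiting for each to become healthy keeps the new cluster from losing quorum again while it grows.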

Status: #🌲