
As you can see in the picture below, when nodes are upgraded to a new version of Kubernetes, the pods on each old node are recreated on a node running the newer version.

However, it seems that the old pods are being destroyed (Terminating) without waiting for the new pods to be in a Ready state, leading to downtime as the new pods are still in a ContainerCreating state.

Is there an explanation for this, or am I doing something wrong?

[screenshot: pod status during a node upgrade, showing old pods Terminating while new pods are still ContainerCreating]

Nick

1 Answer


When a node is upgraded, GKE first drains the pods from the node before deleting the virtual machine and removing the node from the cluster. The process for terminating a pod is explained in detail here; what you should realize is that this process starts before the new pod is created, scheduled, or starts running.
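The drain behaviour interacts with whatever graceful-termination settings your pods declare. As a rough sketch (the deployment name, labels, image, and durations below are placeholders, not taken from your setup), a pod template can stretch the grace period and add a preStop hook so that in-flight traffic has a chance to drain before the container is killed:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service                        # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      terminationGracePeriodSeconds: 60   # how long the kubelet waits after SIGTERM before sending SIGKILL
      containers:
        - name: web
          image: registry.example.com/my-service:1.0   # placeholder image
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "10"]  # delay SIGTERM so the pod can be removed from Service endpoints first
```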

If you have a deployment and want to keep a specific number of replicas up during the node upgrade process, you should configure a pod disruption budget. It ensures that the upgrade proceeds slowly enough to keep enough pods running for your service to handle traffic without any downtime.
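A minimal sketch of such a budget, assuming your deployment's pods carry an app: my-service label (adjust the selector and minAvailable to your own deployment):

```yaml
apiVersion: policy/v1            # use policy/v1beta1 on clusters older than 1.21
kind: PodDisruptionBudget
metadata:
  name: my-service-pdb
spec:
  minAvailable: 2                # keep at least 2 pods up during voluntary disruptions such as node drains
  selector:
    matchLabels:
      app: my-service            # must match the pod labels of your deployment
```

With this in place, the node drain is blocked (and retried) whenever evicting a pod would drop the number of ready replicas below the budget.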

Robert Bailey
  • Thanks! I wonder why Kubernetes doesn't do this by default. – Nick Jul 11 '19 at 06:25
  • Which part? Graceful termination periods and disruption budgets are application specific settings and need to be customized for each application / system separately. – Robert Bailey Jul 12 '19 at 03:18