
We are running a Google Kubernetes Engine cluster where all the nodes are marked "preemptible". From the Google documentation:

Preemptible VMs are Compute Engine VM instances that last a maximum of 24 hours and provide no availability guarantees.

However, when I look at my pods running on one of these nodes I see this:

NAME              READY   STATUS    RESTARTS   AGE
mypod-dev-0       3/3     Running   0          20h
mypod-dev-1       3/3     Running   0          26h

Note that the age of mypod-dev-1 is more than 24 hours. Running kubectl describe pod on mypod-dev-1, I see that all of its containers were started at 07:08 AM this morning (about 3 hours ago).

When I look at the details of the node that mypod-dev-1 is running on, I see some things that are very confusing. First of all, the creation time is more than 24 hours ago (the current time is 10:00 AM on 20 Dec 2019):

CreationTimestamp:  Thu, 19 Dec 2019 06:55:26 -0800

Next, there are a bunch of "Conditions" messages that suggest that the node was re-created more recently:

Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                     Message
NetworkUnavailable   False   Thu, 19 Dec 2019 06:55:26 -0800   Thu, 19 Dec 2019 06:55:26 -0800   RouteCreated               NodeController create implicit route
KernelDeadlock       False   Fri, 20 Dec 2019 10:03:00 -0800   Fri, 20 Dec 2019 07:07:21 -0800   KernelHasNoDeadlock        kernel has no deadlock
ReadonlyFilesystem   False   Fri, 20 Dec 2019 10:03:00 -0800   Fri, 20 Dec 2019 07:07:21 -0800   FilesystemIsNotReadOnly    Filesystem is not read-only
... 

It appears that mypod-dev-1 had all of its containers restarted (at 07:08 AM) at about the same time that something happened to the node (the conditions above last transitioned at 07:07:21).
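To make the arithmetic explicit, here is a minimal Python sketch that works only from the timestamps quoted above. The variable names are illustrative, and the "current time" is the 10:00 AM figure mentioned earlier, assumed to be in the same -0800 offset as the node output.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Timestamps copied from the node description above (RFC 2822 style).
created = parsedate_to_datetime("Thu, 19 Dec 2019 06:55:26 -0800")
transition = parsedate_to_datetime("Fri, 20 Dec 2019 07:07:21 -0800")

# Assumption: "10:00 AM on 20 Dec 2019" is in the same -0800 offset.
now = datetime(2019, 12, 20, 10, 0, 0, tzinfo=timezone(timedelta(hours=-8)))

node_age = now - created            # age according to CreationTimestamp
since_transition = now - transition # time since the conditions last changed
exceeds_limit = node_age > timedelta(hours=24)

print("node age:", node_age)
print("since last transition:", since_transition)
print("older than 24h:", exceeds_limit)
```

So the reported node age is about 27 hours, while the condition transitions happened a little under 3 hours before the time of writing, which matches the container start time.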

  1. How do I determine when a node was created?

  2. Why does the pod time show an age longer than what is supposed to be allowed by the preemptible nature of the nodes?

  3. Is there some log that shows when a pod was migrated off of one node and on to another?

user35042
After more investigation, it appears that GKE shows a creation timestamp for a node that differs from the creation timestamp shown for the corresponding GCP Compute Engine instance. This looks like a GKE bug. – user35042 Dec 20 '19 at 20:48

1 Answer


You are right; I was able to reproduce this behavior. I also found an open issue in the Google Issue Tracker about it (https://issuetracker.google.com/146928126), so it seems that this is something that needs to be addressed by Google.
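If you want to check the discrepancy on your own cluster, something like the following Python sketch may help. It is not taken from the issue report. It assumes kubectl and gcloud are installed and authenticated, that the GKE node name matches the Compute Engine instance name, and that you supply the zone yourself. The helper names (kubectl_cmd, gcloud_cmd, fetch, timestamp_gap) are hypothetical.

```python
import subprocess
from datetime import datetime

def kubectl_cmd(node):
    # Creation timestamp as recorded on the Kubernetes Node object.
    return ["kubectl", "get", "node", node,
            "-o", "jsonpath={.metadata.creationTimestamp}"]

def gcloud_cmd(instance, zone):
    # Creation timestamp as recorded on the Compute Engine instance.
    return ["gcloud", "compute", "instances", "describe", instance,
            "--zone", zone, "--format=value(creationTimestamp)"]

def fetch(cmd):
    # Run a command and return its trimmed standard output.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()

def parse_ts(text):
    # Both tools print RFC 3339 timestamps; map a trailing "Z" to "+00:00"
    # so that older Python versions can parse it.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))

def timestamp_gap(k8s_ts, gce_ts):
    # Positive result: the instance is newer than the Node object claims.
    return parse_ts(gce_ts) - parse_ts(k8s_ts)
```

For example, timestamp_gap(fetch(kubectl_cmd(name)), fetch(gcloud_cmd(name, zone))) returns a timedelta. A large positive gap would be consistent with the VM having been preempted and re-created while the Node object kept its original creation timestamp.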

Armando Cuevas