I'm looking for a Stackdriver query that returns preemption events for GCP VMs.

Why? Because pods are disappearing from my nodes. The nodes are apparently terminated later because they no longer run any workloads and autoscaling is enabled. So it looks as if the pods die first and the autoscaler then correctly shuts down the empty nodes. However, this doesn't seem to happen when I avoid preemptible VMs/nodes.

Mofef

2 Answers

Shortly after asking this question, I found https://cloud.google.com/logging/docs/audit/#system_event

Filtering for logName="projects/<my-project-name>/logs/cloudaudit.googleapis.com%2Fsystem_event" showed a couple of preemptions. I didn't know that preempted resources are automatically recreated. This explains why it looked as if pods disappeared while the nodes were left behind empty. (See also "Why do pods on a node that was recreated after being preempted get stuck in ContainerCreating?")
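As a minimal sketch, the filter above can be built for any project with a small helper. The function name system_event_filter and the project ID "my-project" are made up for illustration; only the logName pattern comes from the filter I used.

```python
def system_event_filter(project_id: str) -> str:
    """Build the logName filter for a project's system event audit log."""
    # The log ID contains a slash, which appears URL-encoded as %2F in logName.
    return (
        f'logName="projects/{project_id}/logs/'
        'cloudaudit.googleapis.com%2Fsystem_event"'
    )


# "my-project" is a placeholder project ID, not a real project.
print(system_event_filter("my-project"))
```

The resulting string can be pasted into the logs viewer, or probably passed as the filter argument to gcloud logging read, though I only tried it in the viewer.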

Mofef

The preemption logs for a given instance can be found with the following Stackdriver advanced filter [1] in the advanced logs queries.

You can change the last line of this filter to [2] to check when the instance was last started. Alternatively, you can run a command such as "uptime" on the VM to see how long it has been up.

[1]

resource.type="gce_instance"
resource.labels.instance_id="[INSTANCE ID]"
jsonPayload.event_subtype="compute.instances.preempted"

[2] jsonPayload.event_subtype="compute.instances.start"
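If you need these filters for several instances, a small helper can generate both variants. This is only a sketch: the function name instance_event_filter and the instance ID "1234567890" are hypothetical, and the three clauses are exactly those from [1] and [2].

```python
def instance_event_filter(
    instance_id: str,
    event_subtype: str = "compute.instances.preempted",
) -> str:
    """Build the advanced filter [1]; pass another subtype to get [2]."""
    clauses = [
        'resource.type="gce_instance"',
        f'resource.labels.instance_id="{instance_id}"',
        f'jsonPayload.event_subtype="{event_subtype}"',
    ]
    # One clause per line, as in [1]; the clauses are combined with AND.
    return "\n".join(clauses)


# "1234567890" is a placeholder instance ID.
print(instance_event_filter("1234567890"))
print(instance_event_filter("1234567890", "compute.instances.start"))
```

The first call produces filter [1] for preemption events, and the second swaps the last line for [2] to show start events.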

Mustafiz