I'd like to set up instrumentation for OOMKilled events, which look like this when examining a pod:
Name:           pnovotnak-manhole-123456789-82l2h
Namespace:      test
Node:           test-cluster-cja8smaK-oQSR/10.x.x.x
Start Time:     Fri, 03 Feb 2017 14:34:57 -0800
Labels:         pod-template-hash=123456789
                run=pnovotnak-manhole
Status:         Running
IP:             10.x.x.x
Controllers:    ReplicaSet/pnovotnak-manhole-123456789
Containers:
  pnovotnak-manhole:
    Container ID:       docker://...
    Image:              pnovotnak/it
    Image ID:           docker://sha256:...
    Port:
    Limits:
      cpu:      2
      memory:   3Gi
    Requests:
      cpu:      200m
      memory:   256Mi
    State:              Running
      Started:          Fri, 03 Feb 2017 14:41:12 -0800
    Last State:         Terminated
      Reason:           OOMKilled
      Exit Code:        137
      Started:          Fri, 03 Feb 2017 14:35:08 -0800
      Finished:         Fri, 03 Feb 2017 14:41:11 -0800
    Ready:              True
    Restart Count:      1
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-tder (ro)
    Environment Variables:      <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         True
  PodScheduled  True
Volumes:
  default-token-46euo:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-tder
QoS Class:      Burstable
Tolerations:    <none>
Events:
  FirstSeen  LastSeen  Count  From                                  SubObjectPath                       Type    Reason     Message
  ---------  --------  -----  ----                                  -------------                       ------  ------     -------
  11m        11m       1      {default-scheduler }                                                      Normal  Scheduled  Successfully assigned pnovotnak-manhole-123456789-82l2h to test-cluster-cja8smaK-oQSR
  10m        10m       1      {kubelet test-cluster-cja8smaK-oQSR}  spec.containers{pnovotnak-manhole}  Normal  Created    Created container with docker id xxxxxxxxxxxx; Security:[seccomp=unconfined]
  10m        10m       1      {kubelet test-cluster-cja8smaK-oQSR}  spec.containers{pnovotnak-manhole}  Normal  Started    Started container with docker id xxxxxxxxxxxx
  11m        4m        2      {kubelet test-cluster-cja8smaK-oQSR}  spec.containers{pnovotnak-manhole}  Normal  Pulling    pulling image "pnovotnak/it"
  10m        4m        2      {kubelet test-cluster-cja8smaK-oQSR}  spec.containers{pnovotnak-manhole}  Normal  Pulled     Successfully pulled image "pnovotnak/it"
  4m         4m        1      {kubelet test-cluster-cja8smaK-oQSR}  spec.containers{pnovotnak-manhole}  Normal  Created    Created container with docker id yyyyyyyyyyyy; Security:[seccomp=unconfined]
  4m         4m        1      {kubelet test-cluster-cja8smaK-oQSR}  spec.containers{pnovotnak-manhole}  Normal  Started    Started container with docker id yyyyyyyyyyyy
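The Last State: Terminated / Reason: OOMKilled block above carries exactly the signal I want, so one option I've sketched is watching pod statuses through the API and flagging any container whose lastState.terminated.reason becomes OOMKilled. This is only a rough sketch, assuming the official Python client ("kubernetes" package); the namespace is just the one from the output above:

# Rough sketch, not production code: watch pod status changes and flag OOM kills.
from kubernetes import client, config, watch

config.load_kube_config()   # or config.load_incluster_config() when running in-cluster
v1 = client.CoreV1Api()

w = watch.Watch()
for event in w.stream(v1.list_namespaced_pod, namespace="test"):
    pod = event["object"]
    for cs in pod.status.container_statuses or []:
        term = cs.last_state.terminated if cs.last_state else None
        # Same field that kubectl describe renders as
        # "Last State: Terminated / Reason: OOMKilled / Exit Code: 137".
        if term is not None and term.reason == "OOMKilled":
            print("OOM kill:", pod.metadata.name, cs.name, term.exit_code, term.finished_at)

That catches the kill after the restart, but it feels indirect, and lastState only holds the most recent termination, so a later restart overwrites it. Which is why I also looked at what the logs give me.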
All I get from the pod logs is:
{
  textPayload: "shutting down, got signal: Terminated
  "
  insertId: "aaaaaaaaaaaaaaaa"
  resource: {
    type: "container"
    labels: {
      pod_id: "pnovotnak-manhole-123456789-82l2h"
      ...
    }
  }
  timestamp: "2017-02-03T22:34:48Z"
  severity: "ERROR"
  labels: {
    container.googleapis.com/container_name: "POD"
    ...
  }
  logName: "projects/myproj/logs/POD"
}
And the kubelet logs:
{
  insertId: "bbbbbbbbbbbbbb"
  jsonPayload: {
    _BOOT_ID: "ffffffffffffffffffffffffffffffff"
    MESSAGE: "I0203 22:41:11.925928 1843 kubelet.go:1816] SyncLoop (PLEG): "pnovotnak-manhole-123456789-82l2h_test(a-uuid)", event: &pleg.PodLifecycleEvent{ID:"another-uuid", Type:"ContainerDied", Data:"..."}"
    ...
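The ContainerDied PLEG event in that kubelet line tells me the container died, but not why. The other direction I've sketched is watching the Kubernetes events API for anything that mentions OOM, with the same Python-client assumption as above; the part I'm unsure about is whether a cgroup OOM kill actually shows up there rather than only in the kubelet's own log:

# Sketch only: scan the cluster event stream for anything OOM-related.
# I don't know whether the kubelet records OOM kills here, which is part of my question.
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

w = watch.Watch()
for ev in w.stream(v1.list_event_for_all_namespaces):
    e = ev["object"]
    text = "{} {}".format(e.reason or "", e.message or "")
    if "oom" in text.lower():
        print(e.last_timestamp, e.involved_object.kind, e.involved_object.name, e.reason, e.message)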
None of this seems like quite enough to reliably and uniquely identify an OOM event. Any other ideas?