Finding wasteful or over-provisioned pods on a "full" but underutilized Kubernetes cluster

Question

I work on a Kubernetes cluster where, right now, about 95% of the CPUs and 90% of the memory have been allocated to pods. However, according to the Kubernetes Dashboard, the overall instantaneous CPU load on the cluster is only about 5% of the total number of cores in the cluster, and the total used memory is only about 33% of the total memory in the cluster. So clearly some and possibly most pods running on the cluster are dramatically overprovisioned; most of the requested CPU and memory is not actually in use at any given time.

How do I find out which pods are most to blame for this? The Dashboard will show me how much of each node's resources are allocated, and what resources are actually in use in each running pod. But to see a pod's requests I have to kubectl describe it; I can't find the requests anywhere in the Dashboard. Moreover, when a pod finishes and gets cleaned up it disappears, and I don't know of any way to ask questions like "What portion of the requested memory did this completed pod use at its peak?", or "How many core-hours did this pod request but not use over its lifetime?".

What tools exist for finding and diagnosing wasted, requested-but-not-consumed resources in Kubernetes clusters? And what best practices should be employed for right-sizing pods to workloads? I think we got into this situation by letting all the users just double their resource requests until their pods stopped being evicted.

score 2 · Answer 1 · answered Apr 08 '20 at 13:35

This can be achieved with the Vertical Pod Autoscaler (VPA).

Even if you are running HPA, you can enable VPA's recommendation mode, which calculates recommended resource requirements for the pods without changing anything automatically.

After installation the system is ready to recommend and set resource requests for your pods. In order to use it you need to insert a Vertical Pod Autoscaler resource for each controller that you want to have automatically computed resource requirements. This will be most commonly a Deployment. There are three modes in which VPAs operate:

...

"Off": VPA does not automatically change resource requirements of the pods. The recommendations are calculated and can be inspected in the VPA object.
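As a rough illustration of the "Off" mode described above, here is a minimal sketch that builds a recommendation-only VPA object for one Deployment and prints it as JSON. It assumes the VPA custom resources are installed under the autoscaling.k8s.io/v1 API group (older installations may expose a beta version instead). The helper name vpa_off_manifest, the "-vpa" name suffix, and the Deployment name "my-app" are placeholders for illustration, not part of VPA itself.

```python
import json


def vpa_off_manifest(deployment: str, namespace: str = "default") -> dict:
    """Build a VerticalPodAutoscaler object that only computes recommendations.

    Assumption: the cluster serves the autoscaling.k8s.io/v1 VPA API.
    """
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        # The "-vpa" suffix is just a naming convention chosen for this sketch.
        "metadata": {"name": f"{deployment}-vpa", "namespace": namespace},
        "spec": {
            # The controller whose pods should be analysed.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # "Off": recommendations are calculated but never applied.
            "updatePolicy": {"updateMode": "Off"},
        },
    }


if __name__ == "__main__":
    # "my-app" is a placeholder Deployment name.
    print(json.dumps(vpa_off_manifest("my-app"), indent=2))
```

Since kubectl accepts JSON as well as YAML, the output can be piped to kubectl apply -f -. Once the recommender has seen some usage history, kubectl describe vpa my-app-vpa should list the recommended requests for each container.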

Alternatively, you can do this with Grafana.

On a GKE cluster on GCP, there is also Metrics Explorer.
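Whichever of these sources supplies the numbers, finding the pods most to blame comes down to the same arithmetic: convert each pod's requested and used amounts to plain numbers and sort by the difference. The sketch below shows this for one resource at a time. The function names and the sample pods are made up for illustration; in practice the request would come from the pod spec and the usage from your metrics source.

```python
from decimal import Decimal

# Multipliers for Kubernetes quantity suffixes (binary, decimal, and fractional).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20, "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40, "Pi": Decimal(2) ** 50, "Ei": Decimal(2) ** 60,
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6, "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12, "P": Decimal(10) ** 15, "E": Decimal(10) ** 18,
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
}


def parse_quantity(quantity: str) -> Decimal:
    """Convert a quantity such as '250m', '2', or '512Mi' to a plain number
    (cores for CPU, bytes for memory)."""
    quantity = quantity.strip()
    # Try two-letter suffixes first so that 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(quantity)


def rank_by_unused(pods):
    """Rank pods by requested-but-unused amount, largest first.

    pods: iterable of (name, requested, used) with quantity strings for a
    single resource type; do not mix CPU and memory in one call.
    Returns a list of (name, unused, utilization) tuples.
    """
    rows = []
    for name, requested, used in pods:
        req, use = parse_quantity(requested), parse_quantity(used)
        utilization = use / req if req else None
        rows.append((name, req - use, utilization))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical CPU figures: (pod name, CPU request, observed CPU usage).
    sample = [
        ("web", "500m", "420m"),
        ("batch-worker", "4", "150m"),
        ("cache", "2", "1200m"),
    ]
    for name, unused, utilization in rank_by_unused(sample):
        print(f"{name}: {unused} cores unused, {utilization:.0%} utilized")
```

Multiplying a pod's unused cores by the number of hours it ran gives an estimate of the core-hours it requested but did not use, provided the usage figure is an average over that period rather than an instantaneous reading.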

Hope it helps.
