I have a liveness probe which is configured to check an endpoint's availability:
livenessProbe:
  httpGet:
    path: /path_example/
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 60
The cluster has autoscaling enabled as per the instructions here - https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler - with a minimum of 1 node and a maximum of 3.
Even after ten-plus minutes, the cluster always displays "current total size - 3". Nothing is using the application other than the liveness probe.
Could the probe be causing the nodes to stay up and never scale down?
I cannot see any other reason why the node count never drops.
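In case it helps, here is how I've been trying to see what the autoscaler itself thinks (a debugging sketch; I'm assuming my GKE version exposes the status configmap that the cluster autoscaler writes):

```shell
# The autoscaler's own view of each node group, including
# which nodes it considers candidates for removal.
kubectl describe configmap cluster-autoscaler-status -n kube-system

# Recent scale-up/scale-down decisions surface as events.
kubectl get events --all-namespaces | grep -i autoscaler
```

Both commands only read state, so they should be safe to run against a live cluster.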
UPDATE: I've set CPU resource requests and enabled autoscaling on the deployments, so 'kubectl get hpa' now shows:
NAME               REFERENCE                     TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
one-deployment     Deployment/one-deployment     34%/80%   1         3         1          2m8s
two-deployment     Deployment/two-deployment     47%/80%   1         3         1          8m16s
three-deployment   Deployment/three-deployment   35%/80%   1         3         1          3m29s
four-deployment    Deployment/four-deployment    33%/80%   1         3         1          2m48s
five-deployment    Deployment/five-deployment    47%/80%   1         3         1          2m24s
But I still remain at the maximum of 3 nodes.
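For reference, this is the shape of the CPU request I added to each deployment's container spec (the values here are illustrative, not my real ones):

```yaml
# Excerpt from a container spec in the deployment. The request
# is the baseline the HPA's percentage target is measured against.
resources:
  requests:
    cpu: 100m   # the HPA's 80% target is relative to this request
  limits:
    cpu: 500m
```

Without a CPU request, the HPA has nothing to compute its percentage against, which is why I had to add these before 'kubectl get hpa' showed targets at all.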
Another update: I'd appreciate any feedback on what I believe is the summary of my learning. I'm quite new to Kubernetes and GKE, so please forgive me.
Firstly, I now better understand that there is autoscaling of nodes in a cluster, and then there is autoscaling of pods on those nodes.
The part I needed to get right first was autoscaling of nodes in the cluster. When autoscaling is enabled on a cluster with, for example, --enable-autoscaling --num-nodes 2 --min-nodes 1 --max-nodes 3, the deployment starts on 2 nodes. If so little resource is required that the pods can be consolidated onto fewer nodes, it may go down to 1. If I had specified --num-nodes 3, I'd deploy to three nodes, and this could result in unmovable pods being spread across all three, preventing any downscale to 2 or 1.
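Spelled out as a full command (the correct flag is --num-nodes; the cluster name and zone below are placeholders, not my real ones):

```shell
# Create a cluster that starts at 2 nodes and may
# autoscale between 1 and 3 nodes per node pool.
gcloud container clusters create example-cluster \
  --zone us-central1-a \
  --num-nodes 2 \
  --enable-autoscaling \
  --min-nodes 1 \
  --max-nodes 3
```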
As starting with 1 node caused my application to fail to fully deploy, I've set this to 2.
Now to scaling my deployments to potentially increase the number of pods: in the GCP GKE console I selected 'Workloads' and then one of my deployments from the list. From there I chose 'Actions' from the menu at the top, then 'Autoscale', left the default of 1 minimum and 3 maximum, and OK'd it. I repeated this for my other 4 deployments. This is the horizontal pod autoscaling that I was mixing up with cluster scaling when I first started. It is what 'kubectl get hpa' reports on, and as far as I can tell it doesn't relate to the cluster's node scaling at all.
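I believe the console action I described is equivalent to running this once per deployment (a sketch; the deployment name is taken from my 'kubectl get hpa' output above):

```shell
# Creates an HPA targeting 80% CPU utilisation,
# matching the console defaults of min 1 / max 3 replicas.
kubectl autoscale deployment one-deployment \
  --cpu-percent=80 --min=1 --max=3
```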
Now my application runs, and when there is sufficient load on my pods the HPA will kick in and create new pods. These pods will be scheduled onto my existing two nodes unless there is insufficient room, at which point the cluster (configured with a maximum of 3) will add a third node and schedule the new pod there.
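And back to my original scale-down question: from what I've read, the autoscaler won't remove a node if it holds pods it can't safely evict. If that's what is blocking me, I believe the pods can be marked as evictable with this annotation on the pod template (a sketch; I haven't yet confirmed this fixes my case):

```yaml
# Pod template metadata inside the deployment spec. The annotation
# tells the cluster autoscaler it may evict this pod when draining
# an underutilised node.
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
```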
So hopefully my final question - Have I put two and two together and got 5?