
I was looking into a demo of an application built on Amazon's Kubernetes service, EKS. However, I am struggling to understand what infrastructure is used underneath, as I don't have direct access to AWS.

My understanding:

  1. You define a cluster, and there is a cost regardless of whether you use it, so I suspect a master node is always up.
  2. While a job is running, you pay VM costs, so it clearly runs on VMs.

Now my questions:

What happens when you spin up and down?

First of all, do the VMs spin down to exactly what you need, or is some capacity always kept up to allow you to scale up quickly?

Secondly, if a VM spins down, does that mean the instance is terminated, or just stopped?

I noticed that scaling up happens in a few seconds, which makes me doubt that VMs are actually created each time you spin up.

2 Answers


Your understanding is roughly correct. With EKS there is a 'control plane' that is managed by Amazon (effectively the master nodes for the Kubernetes cluster). This is invisible to you as the AWS account holder, and you can't get to the underlying machines yourself. Amazon charges a flat rate for this, and you can't scale it back/down to lower costs.

You pay $0.20 per hour for each Amazon EKS cluster that you create.
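
(At that flat rate, the control plane alone works out to about $0.20 × 24 × 30 ≈ $144 per month per cluster, whether or not any worker nodes are running.)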

Your second point, about jobs running, is not quite right. You don't necessarily run jobs in Kubernetes - you run containers in pods (and you can also run 'Jobs', which are pods with a limited lifespan that ends when a process completes).
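
To illustrate the distinction, here is a minimal sketch of creating such a Job with the official Kubernetes Python client (the `kubernetes` package); the job name, image, and command are placeholder assumptions, not something from the question:

    # Minimal sketch: a Kubernetes Job, i.e. a pod with a limited lifespan.
    # Assumes `pip install kubernetes` and a kubeconfig already pointing at
    # the EKS cluster (e.g. set up via `aws eks update-kubeconfig`).
    from kubernetes import client, config

    config.load_kube_config()  # reads ~/.kube/config

    job = client.V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=client.V1ObjectMeta(name="one-off-task"),  # placeholder name
        spec=client.V1JobSpec(
            backoff_limit=1,
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",  # pod ends when the process completes
                    containers=[
                        client.V1Container(
                            name="task",
                            image="busybox:1.36",  # placeholder image
                            command=["sh", "-c", "echo done"],
                        )
                    ],
                )
            ),
        ),
    )

    client.BatchV1Api().create_namespaced_job(namespace="default", body=job)

Once the container's process exits successfully, the Job is complete and its pod is not restarted - that is the 'limited lifespan' mentioned above.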

By default, you need to create 'worker groups' for your EKS cluster. How you create these is up to you.

Generally, you create an autoscale group for each worker group, and you can define yourself how that autoscale group scales the worker nodes in your cluster out and in. These are classic EC2 VMs, as you guessed, and you can access them with SSH or SSM, for example; they are managed by you.

So, to scale the worker groups that run your container workloads, you can hand-scale them up and down, rely on autoscale group metrics to scale them in and out, or use a bespoke solution like cluster-autoscaler to scale them in and out more intelligently, based on what the containers in your cluster are doing.
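
As a concrete (hypothetical) example of the hand-scaling option, this boto3 sketch sets the desired capacity of a worker group's autoscale group; the group name and region are placeholder assumptions:

    # Minimal sketch: hand-scaling an EKS worker group by adjusting its EC2
    # Auto Scaling group. Assumes `pip install boto3` and AWS credentials
    # with autoscaling permissions; the group name/region are placeholders.
    import boto3

    autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

    # Scale out to 3 worker nodes; scaling back in later will terminate
    # (not stop) the surplus EC2 instances.
    autoscaling.set_desired_capacity(
        AutoScalingGroupName="my-eks-workers",  # placeholder name
        DesiredCapacity=3,
        HonorCooldown=False,
    )

cluster-autoscaler effectively automates calls like this, based on pods that can't be scheduled on the current set of nodes.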

So generally, when an autoscale group / worker group scales in, it will terminate the EC2 instance rather than just stop it. When a new one comes up, your launch configuration for the worker group should have everything it needs for the new instance to auto-join the EKS cluster and begin scheduling pods.
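
Here is a sketch of what that can look like with a launch template (the newer equivalent of a launch configuration) and boto3; the AMI ID, cluster name, and template name are placeholder assumptions. The /etc/eks/bootstrap.sh script ships with the Amazon EKS-optimized AMIs:

    # Minimal sketch: a launch template whose user data lets each new EC2
    # instance auto-join the EKS cluster on first boot. Assumes boto3 and
    # appropriate IAM permissions; all names/IDs below are placeholders.
    import base64

    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")

    # bootstrap.sh registers the node with the named cluster so the
    # scheduler can start placing pods on it.
    user_data = """#!/bin/bash
    /etc/eks/bootstrap.sh my-cluster
    """

    ec2.create_launch_template(
        LaunchTemplateName="my-eks-worker-template",
        LaunchTemplateData={
            "ImageId": "ami-0123456789abcdef0",  # an EKS-optimized AMI
            "InstanceType": "t3.medium",
            "UserData": base64.b64encode(user_data.encode()).decode(),
        },
    )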

So yes, VMs are indeed created/started/provisioned when the worker group scales out. If they're Linux-based EKS worker nodes, they normally start fairly quickly; Windows ones are generally a bit slower.

To answer your other question - VMs spin down to what you need only if you've configured your scaling mechanisms carefully and to your own requirements. Cluster-autoscaler helps a lot with this.

Hope that helps clear things up for you.

Shogan
  • So, when I see scaling up in less than 10 seconds, could that just be VMs spinning up really fast, or does that imply some VMs were already up? – Dennis Jaheruddin Oct 04 '19 at 12:50
  • Where are you seeing the 'scaling up' and what do you mean by scaling up that you're seeing? Is it the kubernetes worker nodes you see being added, or replicas (count) of your application pods/containers increasing? If you're seeing the latter, that means that you probably have something like a HorizontalPodAutoscaler (HPA) running, which allows pods to scale out / in on the same worker node (or across other worker nodes). This means that the worker node EC2 instances might not be scaling out, but rather just your application pods. – Shogan Oct 04 '19 at 12:56
  • I am looking at a (black box) application and am trying to understand what the infra costs will be. I cannot see the actual scaling implementation or what is happening underneath, but there is a dashboard showing how many resources are accessible. When adding a workload it jumps from 0 to N quite quickly; when increasing the workload further it jumps to 2N. So I think it is a proxy for the (pods/cores on the) nodes I can use. -- Just wondering if the quick jump must imply that the VM was already up, or if the VM can be ready in 5 seconds or so. – Dennis Jaheruddin Oct 04 '19 at 13:02
  • I've not seen a worker node in Kubernetes be provisioned and added to the cluster in as little as 10 seconds; usually it's at least 2 minutes. So that would imply that your pod replicas are being scaled up with a horizontal pod autoscaler. Containers generally take seconds to start up; VMs take more like minutes. Here is how the HPA works - my best guess is that this is what's in the black box you're looking at: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#how-does-the-horizontal-pod-autoscaler-work – Shogan Oct 04 '19 at 13:05 (a sketch of such an HPA follows this thread)
  • Thanks, this gave me a better understanding already, and a good foundation to ask the creator the right questions! – Dennis Jaheruddin Oct 04 '19 at 13:44
  • Glad it helps! All the best on it further. – Shogan Oct 04 '19 at 14:07
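
For reference - since the comment thread above settles on the HorizontalPodAutoscaler as the likely explanation - here is a minimal sketch of one, created with the official Kubernetes Python client; the deployment name, replica bounds, and CPU threshold are placeholder assumptions:

    # Minimal sketch: a HorizontalPodAutoscaler (autoscaling/v1) that scales
    # a Deployment's pod replicas on CPU usage, without touching the worker
    # node count. Assumes `pip install kubernetes` and a working kubeconfig;
    # all names and thresholds are placeholders.
    from kubernetes import client, config

    config.load_kube_config()

    hpa = client.V1HorizontalPodAutoscaler(
        api_version="autoscaling/v1",
        kind="HorizontalPodAutoscaler",
        metadata=client.V1ObjectMeta(name="my-app-hpa"),
        spec=client.V1HorizontalPodAutoscalerSpec(
            scale_target_ref=client.V1CrossVersionObjectReference(
                api_version="apps/v1", kind="Deployment", name="my-app"
            ),
            min_replicas=1,
            max_replicas=10,
            target_cpu_utilization_percentage=80,
        ),
    )

    client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
        namespace="default", body=hpa
    )

Because new replicas are just containers starting on nodes that already exist, this kind of scaling can complete in seconds - consistent with the 0-to-N jumps described in the comments.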

Though I was unable to confirm this in full detail, another source informed me that once you have a cluster set up, you can scale onto resources (that you were not yet paying for) comparatively fast - in 20 seconds, for instance.

Perhaps this requires you to leverage some standard resources that are kept comparatively 'hot' and available.