You're trying to understand the problem from the most difficult end: CPU resources. Approach it from memory instead, and it's a bit easier to understand.
The host has x amount of memory, and the host reserves y amount of memory for its own nefarious purposes (which, by the way, follows the formula: a MB initially, then b MB more for every VM that's powered on). If you don't allow over-subscription/allocation, you assign a total of x - y to your VMs, and every VM can always get the memory it's allocated. This is, however, very inefficient, so you generally over-allocate memory to your machines. If your policy says 3:1 is OK, then you can assign 3*(x - y) to your VMs (or practically 3x, as the amount ESXi needs for itself is tiny compared to total memory).
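To make that arithmetic concrete, here's a minimal sketch in plain Python. The numbers are made up; the two overhead parameters stand in for the a-MB-plus-b-MB-per-VM rule above:

```python
def allocatable_memory_gb(host_gb, base_overhead_gb, per_vm_overhead_gb, vm_count, ratio):
    """Total memory you can hand out to VMs.

    host_gb           -> x, physical memory in the host
    the two overheads -> y = a + b * vm_count, the hypervisor's own cut
    ratio             -> your over-allocation policy, e.g. 3 for 3:1
    """
    y = base_overhead_gb + per_vm_overhead_gb * vm_count
    return ratio * (host_gb - y)

# Hypothetical 256 GB host, ~2 GB base overhead plus ~0.1 GB per powered-on VM, 30 VMs:
print(allocatable_memory_gb(256, 2.0, 0.1, 30, ratio=1))  # 251.0 - no over-allocation
print(allocatable_memory_gb(256, 2.0, 0.1, 30, ratio=3))  # 753.0 - with a 3:1 policy
```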
If the host runs out of memory, it first tries ballooning (which starts at around 95% utilization): via VMware Tools it asks the guest OS for memory, then reclaims the pages it gets back (whatever the guest OS hands the balloon driver isn't currently in use, so it's safe to take away). If that doesn't help, ESXi does what every other operating system does when it runs out of memory: it starts swapping memory to disk. ESXi, though, does it per VM (as it has to take shares, reservations and limits into account).
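A toy decision function to illustrate that escalation order. The 95% figure is the one mentioned above; the second threshold is purely illustrative (real ESXi uses its own free-memory states, not these exact numbers):

```python
def reclamation_action(used_fraction):
    """Pick a memory-reclamation technique, loosely mirroring ESXi's escalation.
    Thresholds are illustrative only."""
    if used_fraction < 0.95:
        return "nothing - enough free memory"
    elif used_fraction < 0.99:
        return "balloon - ask guest OSes for memory via the balloon driver"
    else:
        return "swap - page VM memory to disk, per VM, honoring shares/reservations/limits"

for u in (0.80, 0.96, 0.995):
    print(f"{u:.1%}: {reclamation_action(u)}")
```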
vCPUs work a bit differently, as CPU isn't technically a resource the way memory and disk are (a 20-core 2 GHz CPU isn't the same as a 1-core 40 GHz CPU now, is it?). First of all, hyperthreading doesn't give you twice the number of cores; official VMware figures put the HT gain at roughly 10-30%. Secondly, vCPUs are multiplexed: the scheduler hands out time slots on the pCPUs to the different vCPUs. So if your host has an 8-core CPU, in a given time slot it can run either eight 1-vCPU VMs or one 8-vCPU VM. The upshot is that on a highly loaded host you might actually see performance drop when you assign more cores to a VM, as the scheduler has a hard time finding a slot with enough free cores to run the larger VM.
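A crude sketch of why wide VMs wait longer, assuming strict co-scheduling (modern ESXi uses relaxed co-scheduling, so treat this as the worst case; the names and numbers are mine):

```python
def can_run(vm_vcpus, free_pcpus):
    """Strict co-scheduling: the VM runs this time slot only if every one of
    its vCPUs can get a physical core at the same time."""
    return vm_vcpus <= free_pcpus

# 8-core host where 6 cores are already spoken for this time slot:
free = 8 - 6
print(can_run(1, free))  # True  - a 1-vCPU VM slips into a free core
print(can_run(8, free))  # False - an 8-vCPU VM waits until all 8 cores are free
```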
Shares, reservations and limits are a way around some of this.
Reservations are a guarantee, i.e. if you give a VM a reservation of 500 MHz, the hypervisor will give it 500 MHz regardless of what's happening on the host. If the reservation isn't being used at the moment, though, the resources are handed out to other VMs for the time being (so you're not wasting idle resources by setting reservations).
Limits, on the other hand, set a hard "roof" or cap on a resource. This is useful if you want a VM to believe it has more resources than it can actually get. It has no real practical use today (in earlier ESXi versions it was used as a way to get hot-add of memory by raising the limit) other than tricking software vendors, and it's not very good at that either. Limits are enforced ruthlessly: if you set a 1 GB memory limit on a machine configured with 4 GB, that machine will never get more than 1 GB of physical memory (the rest has to come from ballooning and swap).
Shares are a bit different: they let you weight the "importance" of a particular VM for a given resource. In ESXi every VM starts out equal, so two VMs with 1000 shares each competing for a resource will each get it 50% of the time. Change that to 3000 vs. 1000 and the split becomes 75%/25%. Note that shares only come into play while VMs are actually competing for a resource.
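Here's a simplified sketch of how the three settings could combine into a CPU entitlement under contention. The real scheduler is far more involved; the clamp-to-reservation-and-limit logic and the function name are my own illustration, not the actual ESXi algorithm:

```python
def entitlement_mhz(vms, capacity_mhz):
    """Split capacity proportionally to shares, then clamp each VM between its
    reservation (floor) and limit (ceiling). Illustrative only.

    vms: list of dicts with 'shares', 'reservation' and 'limit' in MHz.
    """
    total_shares = sum(vm["shares"] for vm in vms)
    result = []
    for vm in vms:
        fair_share = capacity_mhz * vm["shares"] / total_shares
        result.append(max(vm["reservation"], min(fair_share, vm["limit"])))
    return result

vms = [
    {"shares": 3000, "reservation": 500, "limit": 10_000},
    {"shares": 1000, "reservation": 0,   "limit": 10_000},
]
print(entitlement_mhz(vms, 4000))  # [3000.0, 1000.0] -> the 75%/25% split above
```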
So, to actually answer your question: over-allocation means that you assign, for example, 32 vCPUs (2 GHz each) to your VMs on a host that has a 16-core CPU without HT. Your CPU has 32 GHz "available" but you have assigned 64 GHz to your VMs; that's 2:1 over-allocation/subscription. But if you know anything about servers, it's that they're idle most of the time, so your host might never see more than 16 GHz of load. Theoretically, though, this setup could produce a demand for 64 GHz, which would bog down your host. HT isn't very useful for these calculations, but you could maybe add 25% on top of your non-HT GHz figure to get some idea of its benefit.
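The same numbers, worked as code. The 25% HT bonus is the rough fudge factor from above, not an official figure:

```python
def cpu_overallocation(pcores, pcore_ghz, vcpus, vcpu_ghz, ht=False, ht_bonus=0.25):
    """Return (host GHz, assigned GHz, over-allocation ratio)."""
    host_ghz = pcores * pcore_ghz * ((1 + ht_bonus) if ht else 1.0)
    assigned_ghz = vcpus * vcpu_ghz
    return host_ghz, assigned_ghz, assigned_ghz / host_ghz

print(cpu_overallocation(16, 2.0, 32, 2.0))           # (32.0, 64.0, 2.0) -> 2:1
print(cpu_overallocation(16, 2.0, 32, 2.0, ht=True))  # (40.0, 64.0, 1.6) with the HT fudge
```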