9

We've got a KVM host system on Ubuntu 9.10 with a newer quad-core Xeon CPU with hyperthreading. As detailed on Intel's product page, the processor has 4 cores but 8 threads. /proc/cpuinfo and htop both list 8 processors, though cpuinfo reports 4 cores for each. KVM/QEMU also reports 8 VCPUs available to assign to guests.
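For reference, a quick way to see how the host presents cores versus threads, using only standard Linux tools (field names can vary slightly between kernel and util-linux versions):

    # Count the logical processors (threads) the kernel sees
    grep -c '^processor' /proc/cpuinfo        # expect 8 here

    # Physical cores vs. sibling threads per package
    egrep 'cpu cores|siblings' /proc/cpuinfo | sort -u
    # "cpu cores : 4" and "siblings : 8" => 4 cores, 2 threads each

    # If lscpu is available, it summarizes the same thing
    lscpu | egrep 'CPU\(s\)|Thread|Core|Socket'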

My question is when I'm allocating VCPUs to VM guests, should I allocate per-core or per-thread? Since KVM/QEMU reports the server has 8 VCPUs to allocate, should I go ahead and set a guest to use 4 CPUs where I previously would have set it to use 2 (assuming 4 total VCPUs available)? I'd like to get the most possible out of the host hardware without over-allocating.
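For what it's worth, the per-guest count is just the <vcpu> element in the libvirt domain XML; whether it can be changed on a running guest depends on the libvirt version. A minimal sketch (the guest name "myguest" is a placeholder):

    # Edit the domain definition and set e.g. <vcpu>2</vcpu>
    virsh edit myguest

    # Or adjust the count from the command line
    virsh setvcpus myguest 2

    # Confirm what the guest is currently defined with
    virsh dominfo myguest | grep 'CPU(s)'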

Update: Chopper3's answer is undoubtedly the right approach. However, I'd still love to hear from any hardware experts out there who could elucidate the performance aspects of threads vs. cores... anyone?

nedm

4 Answers

9

Set the lowest number of vCPUs your servers need to perform their function; don't over-allocate, or you could easily slow down your VMs.
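One rough way to sanity-check whether you've gone too far (assuming the guest kernel reports steal time, which older kernels may not):

    # Inside a guest: the "st" column in vmstat ("%st" in top) is time
    # the hypervisor gave to someone else while this guest wanted to run
    vmstat 1 5

    # On the host: if the load average sits above the physical core
    # count for long stretches, vCPUs are queueing for real CPUs
    uptime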

Chopper3
  • This seems like a wise approach. Still, I'm curious how allocating VCPUs per thread rather than per core affects performance. But I have seen some of the very bad things that can happen from over-allocation, and using the same number of VCPUs as in a non-hyperthreaded host seems to handle the load adequately for the guests, so I'll leave well enough alone and plan to experiment on a non-production box sometime. – nedm Apr 15 '10 at 19:45
  • +1, the answer also depends on your workload. For VMs that are heavily CPU bound, count them as taking a whole core; for VMs that are idle or IO bound, count them as taking a thread. But in general, always allocate as little as you can get away with and you'll avoid big headaches (rough sketch below). – Chris S Apr 25 '10 at 03:36
  • While I agree with the minimalistic approach, KVM is not VMware in this sense. No gang scheduling means more vCPUs per VM can be used harmlessly. – dyasny Jul 16 '12 at 20:01
5

Typically, HT works well on workloads that are heavier on IO -- the CPU can schedule in more processing tasks from the queue of the other virtual CPU while the first virtual CPU waits on the IO. Really, all the HT subsystem gets you is hardware-accelerated context switching -- which is the workload pattern that's also used when switching between VMs. So HT will (usually) reduce the slowdown a bit when you have more VMs than cores, provided each VM gets one virtual core.

Assigning multiple vCPUs to a VM can improve performance if the apps in the VM are written for threading, but it also makes life harder for the hypervisor; it has to allocate time on 2 or 4 CPUs at once -- so if you have a quad-core CPU and a quad-vCPU VM, only one VM can get scheduled during that timeslice (whereas it can run 4 different single-vCPU VMs at once).
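You can watch this from the host with virsh (guest name is a placeholder): it shows which physical CPU each vCPU is currently running on, so you can see whether the vCPUs are being scheduled independently or stacking up:

    # One block per vCPU: the "CPU:" field is the physical CPU it last ran on
    virsh vcpuinfo myguest

    # Repeat a few times; unpinned vCPUs move between physical CPUs
    # like any other host thread
    watch -n1 virsh vcpuinfo myguest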

techieb0y
  • @Chris, @techieb0y: Thanks, this is exactly the kind of insight I was looking for. – nedm Apr 25 '10 at 04:50
  • This is not exactly true. When a VM with quad vCPUs needs to schedule a single v-core, this is what gets scheduled on the host, not all 4 cores. At least this is the case with KVM (I know VMware's approach is less effective, as they do gang scheduling). – dyasny Aug 03 '12 at 20:53
5

This is rather tricky. Depending on the load, HT can increase performance by ~30% or decrease it. Normally I advise not to allocate more vCPUs to a single VM than you have physical cores, but if the VM is rather idle (and of course, such a VM will not really require many CPUs), it can be given up to as many vCPUs as you have threads. What I'm getting at is that you don't really want to give a single VM more vCPUs than you have schedulable cores. And in any case, @Chopper3's advice is right - don't give the VM more vCPUs than it absolutely requires.

So, depending on how loaded and critical your VMs are, you either don't overallocate at all, stick to the physical core count, or go as high as the thread count per VM.

Now, getting into the question of HT, it is generally a good thing to have, especially when you commit more vCPUs to your VMs than you have physical cores or even threads, because it makes it easier for the Linux scheduler to schedule those vCPUs.

One last thing: with KVM, a vCPU assigned to a VM is just a process on the host, scheduled by the Linux scheduler, so all the normal optimizations you can do there apply. Moreover, the cores/sockets setting is just the way this process is presented to the VM's guest OS; on the host it's still just a process, regardless of how the VM sees it.
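You can see this on the host yourself: each guest is one qemu/kvm process whose vCPUs show up as ordinary threads (process names vary by distro and version, so the greps below are only a sketch):

    # Each guest is a single qemu/kvm process on the host
    ps -ef | grep '[q]emu'

    # Its vCPUs are just threads (LWPs) of that process, scheduled like
    # any other thread; renice, taskset, cgroups, etc. all apply
    ps -eLf | grep '[q]emu' | head

    # On newer libvirt, the cores/sockets the guest sees is just the
    # <cpu><topology .../> element in the domain XML; it doesn't change
    # how the host schedules the threads
    virsh dumpxml myguest | grep -A2 '<cpu'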

dyasny
2

To elaborate on Chopper3's answer: if the systems are mostly CPU-idle, don't assign a bunch of vCPUs; if they are CPU-intensive, be very careful not to over-allocate. You should be able to allocate a total of 8 vCPUs without contention. You can over-allocate, but if you do, make sure no single guest, especially a CPU-intensive guest, has 8 vCPUs, or you will have contention. I don't know the KVM scheduler mechanism well enough to be more specific than that.

The above is based on the following understanding of vCPU versus pinned CPU, plus the assumption that KVM will allow a single guest (or multiple guests) to hog all of the actual CPU from the others if you allocate them enough threads:
  • vCPU ~ host thread, presented as a guest CPU
  • pinned CPU = host core, presented as a guest CPU
(I haven't played with mixed vCPU and pinned CPU on the same guest, because I don't have hyperthreading.)
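For the pinned case, a minimal sketch with virsh (guest name hypothetical; the same effect can be had with taskset on the qemu process):

    # Pin vCPU 0 of the guest to physical CPUs 0-1 and vCPU 1 to 2-3
    virsh vcpupin myguest 0 0-1
    virsh vcpupin myguest 1 2-3

    # Verify: the "CPU Affinity" field reflects the pinning
    virsh vcpuinfo myguest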

Nakarti
  • A pinned vCPU is just a virtual CPU process that is assigned to run only on a specific core (or a subset of cores). If you are not over-allocating and want to make sure the VMs are not contending for the same core's CPU time, you can pin them to different cores. This is also a way to do NUMA pinning, though you can do that directly nowadays. – dyasny Oct 18 '13 at 03:35