67

Our IT department created a VM with 2 CPUs allocated rather than the 4 I requested. Their reasoning is that the VM performs better with 2 CPUs than with 4: the hypervisor (VMware in this case) waits for all of the VM's CPUs to be available before engaging any of them, so it takes longer to wait for 4 CPUs than for 2.

Does this statement make sense?

AngryHacker

3 Answers

62

This used to be true, but it is no longer always true.

What they are referring to is Strict Co-Scheduling.

Most important of all, while in the strict co-scheduling algorithm, the existence of a lagging vCPU causes the entire virtual machine to be co-stopped. In the relaxed co-scheduling algorithm, a leading vCPU decides whether it should co-stop itself based on the skew against the slowest sibling vCPU.
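
To make the difference described in that quote concrete, here is a toy Python sketch. It is not VMware's scheduler: the function names, the `progress` list (how far each vCPU has advanced, in arbitrary units), and the `threshold` value are all assumptions made up for illustration.

```python
from typing import List


def strict_costop(progress: List[int], threshold: int) -> List[bool]:
    """Toy model of strict co-scheduling (illustrative assumption).

    If the skew between the fastest and slowest vCPU exceeds the
    threshold, every vCPU of the VM is co-stopped.
    """
    skew = max(progress) - min(progress)
    return [skew > threshold] * len(progress)


def relaxed_costop(progress: List[int], threshold: int) -> List[bool]:
    """Toy model of relaxed co-scheduling (illustrative assumption).

    Each vCPU decides for itself: it co-stops only if it leads the
    slowest sibling by more than the threshold.
    """
    slowest = min(progress)
    return [p - slowest > threshold for p in progress]


# Hypothetical 4-vCPU guest where vCPU 0 has run far ahead of its siblings.
progress = [10, 6, 5, 4]
strict = strict_costop(progress, threshold=3)
relaxed = relaxed_costop(progress, threshold=3)
print("strict :", strict)
print("relaxed:", relaxed)
```

In this toy model the strict variant stops all four vCPUs, while the relaxed variant stops only the one that ran ahead, which is why a wider VM is penalized less under relaxed co-scheduling.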

Now, if the host only has 4 threads, then you'd be silly to allocate all of them. If it has two processors and 4 threads per processor, then you might not want to allocate all of the contents of a single processor, as your hypervisor should try to keep vCPUs on the same NUMA node to make memory access faster, and you're making this job more difficult by allocating a whole socket to a single VM (See page 12 of that PDF above).

So there are scenarios where fewer vCPUs can perform better than more, but it's not true 100% of the time.

All that said and done, I very rarely allocate more than 3 vCPUs per guest. Everyone gets 2 by default, 3 if it's a heavy workload, and 4 for things like SQL Servers or really heavy batch processing VMs, or a terminal server with a lot of users.

Mark Henderson
  • 3
    Just an aside - even apart from virtualization hurdles, it's generally tricky to write software that exploits parallelism. If your software guys aren't good enough, it might actually be better to have four virtual hosts running one instance of the software each than one host running four threads of execution. – Luaan Apr 30 '15 at 16:06
  • 4
    @Luaan The box has SQL Server on it, so I assume they got some good software guys there. – AngryHacker Apr 30 '15 at 21:33
  • 1
    @AngryHacker yes SQL server can use all 4 cores very efficiently assuming you have appropriate `MAXDOP` settings. However depending on the workload, an overly taxed SQL server is often a sign of bad database design - bad indexes, no clustered indexes, too many indexes, no optimisation, etc. (not always, but often). – Mark Henderson Apr 30 '15 at 23:12
  • 1
    @Luaan That's still parallelism... just with much higher latency if they need to talk to each other. :) – reirab May 01 '15 at 02:00
  • @MarkHenderson It's pretty well optimized, but some workloads are just too large to be handled easily by 2 CPUs plus a ton of other queries at the same time. – AngryHacker May 01 '15 at 05:23
  • @MarkHenderson Conversely, if all the VM needs is 4 vCPUs, worrying about database design and optimisation can be poor resource management. Hardware is typically cheap; dev time, especially good dev time, is typically not. – NPSF3000 May 01 '15 at 12:51
  • @reirab Yeah, of course. But it forces you to think a lot harder about avoiding shared state and solving the issues when shared state collides - and from personal experience, you might be surprised how even a horrible parallel architecture that's *distributed* will allow scaling with load. In one case, I managed to replace ten machines with just one handling parallelism well (the original solution including beauties like `while (true)` loops with thread sleeps), and it wasn't even a lot of work - but the original solution *did* work even though written with little understanding of concurrency. – Luaan May 04 '15 at 07:19
15

This largely depends on the underlying hypervisor and the admins who run it. Let me explain:

  1. It's bad practice to arbitrarily give you 4 CPUs just because you requested them. Generally speaking, you think you need 4, but resource monitoring says you only need 1.
  2. VMware ESXi, for example, requires locking all pCPUs when a vCPU makes a request for CPU resources, so on this hypervisor it is bad for performance. KVM doesn't lock the way ESXi does; it uses the underlying kernel scheduler, but in the long run it can still create CPU contention.
  3. If you're building systems from the start with 4 CPUs, you aren't really scaling out, but up (which is bad practice, especially on VMs). You may want to look at how you're architecting whatever you're working on so that it can scale on today's cloud infrastructures.

What can you learn from this? Always create VMs with minimal resources and increase as needed. Always scale out instead of up and you'll be able to run your app anywhere.

Because of the locking that can be experienced with the hypervisor, it is indeed true that 2 CPUs can be faster than 4 CPUs.

4

Yes, the statement makes sense in general. However, it's something you should test for your exact configuration and workload. Sometimes more CPUs are better if you can actually take advantage of them. However, if you don't have that much parallelism, a VM configured with fewer CPUs will often perform slightly better, as it avoids slowdowns due to CPU Ready State pauses.

I've reduced the vCPUs on a number of our VMs and have seen an improvement in throughput on the majority. A handful got worse and needed their vCPU count bumped back up.
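
If you want to compare CPU Ready before and after changing the vCPU count, the raw counter is usually reported as a summation in milliseconds per sampling interval, which is easier to compare as a percentage. A minimal Python sketch of that conversion follows. The function name is hypothetical, the 20-second default is an assumption matching the real-time chart interval (other chart ranges use longer intervals), and dividing by the vCPU count to get a per-vCPU average is a common convention rather than something stated in this thread.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation (ms) into a percentage of the interval.

    ready_ms   -- ready time accumulated during one sampling interval
    interval_s -- sampling interval in seconds (assumed 20 s real-time default)
    vcpus      -- divide by this to get an average per vCPU (assumed convention)
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


# Hypothetical sample: 1000 ms of ready time in one 20 s interval.
print(cpu_ready_percent(1000))            # whole-VM figure
print(cpu_ready_percent(1000, vcpus=4))   # averaged over 4 vCPUs
```

Sampling this for the same workload at 2 and at 4 vCPUs gives a like-for-like number to decide which size actually performs better.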

Brian Knoblauch