
This question has been asked before, but I believe that the world has changed enough for it to be asked again.

Does irqbalance have any use on today’s systems where we have NUMA-capable CPUs with memory sharing between their cores?

Running irqbalance --oneshot --debug shows that a virtual guest in a modern VMware ESXi environment shares a single NUMA node between all of its cores.

# irqbalance --oneshot --debug 3
Package 0:  numa_node is 0 cpu mask is 0000000f (load 0)
    Cache domain 0:  numa_node is 0 cpu mask is 0000000f  (load 0)
            CPU number 0  numa_node is 0 (load 0)           
            CPU number 1  numa_node is 0 (load 0)
            CPU number 2  numa_node is 0 (load 0)
            CPU number 3  numa_node is 0 (load 0)
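For reference, a quick way to cross-check the topology the guest actually sees (assuming the numactl package is installed) is something like:

    numactl --hardware      # lists the NUMA nodes the kernel sees, with their CPUs and memory
    lscpu | grep -i numa    # e.g. "NUMA node(s): 1" on a guest laid out like the one above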

irqbalance will in this case detect that it is being run on a NUMA system, and exit. This messes with our process monitoring.

Should we look into running numad instead of irqbalance on such systems?

This is mostly interesting for VMware virtualised servers.


1 Answer


Here is one answer from a technician at Red Hat, although I do believe that most enterprise hardware is NUMA-capable. As far as I know, VMware will also try to fit your VMs onto a single NUMA node as long as their CPU configuration fits.

Experiences (especially concerning VMware) would be greatly appreciated.

This is true "because" of modern servers. Keep in mind that multi-CPU/multi-core is not the same as NUMA. There are many multi-CPU/core systems that do not have NUMA.

Before reading my explanation below, please read the IRQ Affinity document above, as well as the following guides:

RHEL 6 Performance Tuning Guide

Low Latency Performance Tuning for RHEL 6

Got all that read? Great, you need to hear nothing more from me! ;-) But just in case you were impatient, here is why you want them...

IRQbalance keeps all of the IRQ requests from backing up on a single CPU. I have seen many systems with 4+ CPU cores perform slowly because all of the processes on the various CPUs are waiting on CPU 0 to process network or storage IRQ requests. CPU 0 looks very, very busy while all the other CPUs are idle, yet the apps are very slow. The apps are slow because they are waiting on their I/O requests from CPU 0.
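A rough way to spot this pattern (not from the original answer): look at how the interrupt counts are spread in /proc/interrupts and at per-CPU interrupt time in mpstat. "eth0" below is just a placeholder device name:

    # One column per CPU; if the NIC/HBA lines only grow under CPU0,
    # a single core is servicing all of that I/O
    grep -E 'CPU|eth0' /proc/interrupts

    # Per-CPU hard/soft interrupt time (mpstat is in the sysstat package)
    mpstat -P ALL 1 3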

IRQbalance tries to balance this out in an intelligent way across all the CPUs and, when possible, puts the IRQ processing as close to the process as possible. This might be the same core, a core on the same die sharing the same cache, or a core in the same NUMA zone.
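To see how "close" an IRQ currently is to its device, you can compare the device's NUMA node with the CPUs the IRQ is allowed to run on. The interface name and IRQ number below are placeholders:

    # NUMA node the NIC is attached to (-1 means no NUMA information exposed)
    cat /sys/class/net/eth0/device/numa_node

    # CPUs the kernel currently allows IRQ 24 to be delivered to
    cat /proc/irq/24/smp_affinity_list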

You should use irqbalance unless:

You are manually pinning your apps/IRQs to specific cores for a very good reason (low latency, realtime requirements, etc.); see the short pinning sketch after this list.

You are running virtual guests. It does not really make sense, because unless you are pinning the guest to specific CPUs and IRQs and to dedicated network/storage hardware, you will likely not see the benefits you would on bare metal. But your KVM/RHEV host SHOULD be using irqbalance, numad, and tuned.
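For the manual-pinning case in the first point, a minimal sketch looks like the following. The IRQ number, CPU choice, and PID are placeholders, and irqbalance has to be stopped first or it will rewrite the affinity you set:

    service irqbalance stop            # or: systemctl stop irqbalance

    # Pin IRQ 24 to CPU 2 (smp_affinity takes a hex CPU mask; 4 = binary 100)
    echo 4 > /proc/irq/24/smp_affinity

    # Pin the application that consumes that traffic to the same core
    taskset -cp 2 12345                # 12345 = the application's PID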

Other very important tuning tools are tuned profiles and numad. Read about them! Use them!
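On a RHEL-family system, getting both of them running is roughly the following (package and profile names can vary by release):

    yum install -y tuned numad
    tuned-adm list                     # show the profiles shipped with tuned
    tuned-adm profile virtual-guest    # e.g. the profile intended for VM guests
    chkconfig numad on && service numad start   # RHEL 6; use systemctl on RHEL 7+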

Numad is similar to irqbalance in that it tries to make sure that a process and its memory are in the same NUMA zone. With many cores, we see a significant reduction in latencies, resulting in much smoother and more reliable performance under load.
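You can get a feel for whether that is working from numastat; rising numa_miss / numa_foreign counters mean memory is being allocated off-node. The PID below is a placeholder:

    numastat             # per-node numa_hit / numa_miss / numa_foreign counters

    # Per-node memory footprint of a single process (newer numastat versions)
    numastat -p 12345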

If you are skilled and diligent, and you monitor regularly or have a very predictable workload, you may get better performance by manually pinning processes/IRQs to CPUs. Even in these situations, irqbalance and numad come very close to matching it. But if you are uncertain or your workload is unpredictable, you should use irqbalance and numad.

espenfjo
  • FWIW, some 10GbE manuals recommend disabling irqbalance to get better throughput... – rogerdpack Nov 12 '13 at 20:35
  • In order to get the absolute maximum to match their benchmark numbers, yes, you need to bolt things together in a certain way, but those benchmarks generally don't match real-life workloads. If you have ONE application running on the server with an extremely latency-sensitive requirement and a very predictable usage pattern, fine, go ahead and manually configure processor affinity. But if the application is more of a real-world use case where things can vary over a wide range of processes and loads, I agree with the Red Hat tech. Linux NUMA balancing is progressing nicely. – GeorgeB Jun 18 '15 at 22:59