
Quoting the Red Hat Performance Tuning Guide:

3.3.7. Setting interrupt affinity

Interrupt requests have an associated affinity property, smp_affinity, that defines the processors that will handle the interrupt request. To improve application performance, assign interrupt affinity and process affinity to the same processor, or processors on the same core. This allows the specified interrupt and application threads to share cache lines.
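In practice, both sides of that pairing are set from user space. A minimal sketch, assuming a hypothetical IRQ number (77), CPU (2), and application PID (1234), run as root:

```
# Show which CPUs may service IRQ 77 (hexadecimal bitmask)
cat /proc/irq/77/smp_affinity

# Restrict IRQ 77 to CPU 2 (mask 0x4)
echo 4 > /proc/irq/77/smp_affinity

# Pin the application to the same CPU
taskset -cp 2 1234
```

Note that irqbalance, if it is running, may later rewrite the IRQ mask.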

I have an application which receives and processes large amounts of UDP data. If I'm looking to cut the time between a UDP packet arriving and the application finishing its processing of that packet, should I assign the same affinity to the NIC receiving the packets and the application? Or should I assign them different affinities? I feel like the quote above suggests the former, but I would've thought the latter might be more beneficial.

Any help would be great.

Thanks

user3513346

1 Answer


While only a directed benchmark can really answer your question, the safest bet is to run the IRQs and the application on the same CPU/socket, but on different cores. In this manner, available CPU power is maximized and, at the same time, the L3 cache common to all recent servers enables fast data sharing between the IRQ handlers and the application.
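As a rough sketch of that layout (the interface name eth0, the IRQ number, and the core IDs below are assumptions; read the real values from /proc/interrupts and lscpu on your own host):

```
# Find the IRQ line(s) used by the NIC's receive queue(s)
grep eth0 /proc/interrupts

# Show the core/socket layout to pick two cores on the same socket
lscpu --extended

# Pin the NIC queue IRQ (say, 77) to core 0 (mask 0x1) ...
echo 1 > /proc/irq/77/smp_affinity

# ... and run the UDP application on core 2 of the same socket
taskset -c 2 ./udp_receiver
```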

If you are really interested in keeping down the latency between packet receipt and processing, you should tune your Ethernet adapter's packet buffer and IRQ coalescing settings.

You can use the very good ethtool utility to do that (see the example after this list):

  • ethtool -c gives you an overview of the current coalescing settings, while ethtool -C enables you to change them.
  • ethtool -g shows the ring buffer settings, while ethtool -G lets you alter them.
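For example (eth0 and the specific values are only illustrative, and the supported parameters vary by driver, so check the -c and -g output first):

```
# Show the current interrupt-coalescing settings
ethtool -c eth0

# Favor latency: disable adaptive coalescing, interrupt after every packet
ethtool -C eth0 adaptive-rx off rx-usecs 0 rx-frames 1

# Show current and maximum ring buffer sizes
ethtool -g eth0

# Enlarge the RX ring to absorb bursts without drops
ethtool -G eth0 rx 4096
```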
shodanshok
  • Could you please expand on buffer & IRQ coalesce settings? If I'm not dropping any packets, could I improve performance by reducing buffer size? And what exactly might I look to tune with coalesce settings? Thank you for the great answer :) – user3513346 May 29 '15 at 01:00
  • This is a vast topic, and it is difficult to explain in the space of a comment. The rule of thumb is that smaller coalesce values improve latency but have the downside of increased CPU load and somewhat lower throughput. Increasing the ring buffer values typically has no downside, but in a very latency-constrained environment they may need to be tuned. I suggest you read these excellent documents: [1](https://fasterdata.es.net/host-tuning/nic-tuning/) and [2](http://www.intel.it/content/dam/doc/application-note/82575-82576-82598-82599-ethernet-controllers-latency-appl-note.pdf) – shodanshok May 29 '15 at 09:08