3

I have an application which does packet capture from an Ethernet card. Once in a while we see packets dropped (we suspect the buffer in the network card or kernel is being overrun). I am trying to figure out whether turning on interrupt coalescing will help or worsen the situation. On the one hand, there should be less work for the CPU since there are fewer interrupts to process; on the other hand, if the IRQs are serviced less frequently, there seems to be a higher probability of a buffer being overrun. Does that mean I should turn it on and also increase the rmem_max setting?
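
For reference, this is the sort of change I am weighing up. The interface name (eth0) and the values are placeholders rather than anything we have tested:

```
# Check the NIC's current interrupt coalescing settings
ethtool -c eth0

# Turn on / raise RX coalescing (values are illustrative; not every driver
# exposes every parameter)
ethtool -C eth0 rx-usecs 50 rx-frames 32

# Raise the maximum and default socket receive buffer sizes
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.rmem_default=8388608
```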

UPDATED TO INCLUDE OS/HW Details:

Dell PowerEdge 1950, dual quad-core Xeon X5460 @ 3.16 GHz, Broadcom NetXtreme II BCM5708 NIC, Linux OS

/proc/sys/net/core
  dev_weight                 64
  message_burst              10
  message_cost               5
  netdev_budget              300
  netdev_max_backlog         65536
  optmem_max                 20480
  rmem_default               110592
  rmem_max                   16777216
  rps_sock_overflow_entries  0
  somaxconn                  128
  warnings                   1
  wmem_default               110592
  wmem_max                   16777216
  xfrm_acq_expires           30
  xfrm_aevent_etime          10
  xfrm_aevent_rseqth         2
  xfrm_larval_drop           1
Mike Pennington
Andy F
  • Please tell us what operating system, network card vendor, and network card model you are using (edit your question to include this information)... – voretaq7 Jul 28 '12 at 00:13
  • This depends on a few things. What type of server hardware and NIC are you using? What distribution and kernel version are you using? What are the `net.core.rmem_max` and `net.core.rmem_default` settings right now? – ewwhite Jul 28 '12 at 00:15
  • *Which* Linux operating system are you using? Specifically, which distribution... – ewwhite Jul 29 '12 at 03:44
  • @ewwhite - it's RedHat but I don't have specifics right now. Systems run at client data center and I don't have remote access. Will try to get full details on OS version tomorrow. Thanks – Andy F Jul 29 '12 at 23:25
  • @AndyF Red Hat is good enough. That gives some options. – ewwhite Jul 29 '12 at 23:39

2 Answers

1

Without knowing why you're dropping packets, it's impossible to know whether it'll help or not. Your analysis is fundamentally correct -- if interrupts arrive (are serviced) less often, there's a greater chance of buffers filling up, all things being equal. If you don't know why you're losing packets, though, you can't tell if making that change will improve the situation or not.
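
If it helps, something along these lines will usually show where the drops are actually being counted (eth0 is a placeholder, and the exact counter names vary from driver to driver):

```
# Driver/firmware level: look for discard / missed / no-buffer counters
ethtool -S eth0 | grep -i -E 'drop|discard|miss|error'

# Kernel interface counters (RX dropped / overruns)
ip -s link show eth0

# Per-CPU backlog statistics: the second column counts packets dropped
# because the netdev_max_backlog queue was full
cat /proc/net/softnet_stat
```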

Personally, I find throwing good-quality NICs with good drivers into a good quality server makes all my problems go away. Much cheaper than spending days grovelling through debug data.

womble
1

Okay, you haven't given some of the basic information (like the particular OS distribution or kernel version). That matters because the sysctl/kernel defaults differ across distributions, and certain tunables aren't exposed on some Linux systems. You're working with a server from 2008, so how do we know your OS and kernel aren't from the same era?

Looking at your network parameters, though, I'd increase the default buffer sizes. A recent high-frequency-trading setup I deployed had much higher wmem_default and rmem_default settings. Try "8388608" to start and see if that helps. It's a basic change, but it's usually the first step...
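
Something along these lines, applied at runtime and then made persistent; the 8388608 figure is just the starting point mentioned above, not a number tuned for your workload:

```
# Runtime change
sysctl -w net.core.rmem_default=8388608
sysctl -w net.core.wmem_default=8388608

# Persist across reboots
cat >> /etc/sysctl.conf <<'EOF'
net.core.rmem_default = 8388608
net.core.wmem_default = 8388608
EOF
sysctl -p
```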

I would also look at changing the realtime priorities of your (presumably custom) application. Are you using any form of CPU affinity (taskset, cgroups) in your app or wrapper script? How about the realtime priority of your app? Look into the chrt command and its options to see what would be appropriate for your situation. Is your application multithreaded?
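
For illustration only; the core number, the priority, and the path below are placeholders, not recommendations for your setup:

```
# Start the capture app pinned to one core with a SCHED_FIFO realtime priority
taskset -c 2 chrt -f 50 /path/to/capture-app

# Or adjust a process that is already running
taskset -pc 2 <pid>
chrt -f -p 50 <pid>

# Check the current scheduling policy and priority
chrt -p <pid>
```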

Luckily, the 5400-series CPU doesn't have hyperthreading to deal with, but how are your other BIOS settings? Did you disable power management and C-states? Are there any unnecessary daemons running on the system? Is irqbalance running?
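
Quick things to check; eth0 and the IRQ number are placeholders, and the service/chkconfig commands assume a RHEL-style init system:

```
# Is irqbalance running?
service irqbalance status

# If you decide to pin interrupts by hand, stop it and keep it off
service irqbalance stop
chkconfig irqbalance off

# Find the capture NIC's IRQ(s) and pin them to a single core
grep eth0 /proc/interrupts
echo 2 > /proc/irq/<irq_number>/smp_affinity    # mask 0x2 = CPU1
```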

Now, as to the hardware you're using: if this is for HFT use, you're behind, literally THREE jumps in CPU and architectural changes... The Nehalem (5500-series) brought a big jump in tech over the 5400-series you're using. Westmere (5600-series) was even better. Sandy Bridge was a big enough change over the 5500/5600 to spur another hardware refresh in my environments.

It also sounds like you're using the onboard NICs. There were some hoops we needed to jump through when dealing with Broadcom... But you're not at that point yet. How does CPU load look when you encounter dropped packets? What kind of data rate are you seeing during your captures? This may just be a case of your system not keeping up.
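
If you don't have those numbers yet, the sysstat tools (which you may need to install) will show per-core load and per-interface rates while a capture is running:

```
# Per-CPU utilization at 1-second intervals; watch for one core pegged in %irq / %soft
mpstat -P ALL 1

# Per-interface packet and byte rates
sar -n DEV 1
```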

There are a lot of knobs to tweak/tune here. A better understanding of what you're working with will help us narrow things down, though.

Edit: you mentioned Red Hat. The options for EL5 and EL6 differ, but the suggestions above do apply in theory.

Edit: It's good that you're on RHEL 6. There's a lot you can do. Try setting the priority of your app and test. Another useful guide is the RHEL MRG tuning guide. Not all of the features will be available to your kernel, but this will give you some ideas and explanations for some of the things you can modify for more deterministic performance.

ewwhite
  • sorry for the delay - still trying to get a handle on some of the data and also had some follow up questions on your comments. Will post them as separate comments below: – Andy F Jul 31 '12 at 01:59
  • why do you point out that we're using onboard NICs. Is the other option PCI NICs? Are they better for this type of application? Thanks! – Andy F Jul 31 '12 at 02:01
  • for the rmem_default parameter, this sets how big the socket receive buffer will be, meaning, how many bytes the socket will read from the NIC before blocking? Is that correct? I also found tcp_rmem params in /proc/sys/net/ipv4. What are they used for? I found some docs that say the tcp_rmem value overrides the rmem_default, but maybe I am misreading the docs. Also, it seems that the NIC driver has the ability to increase the size of its buffer. Is that worthwhile to do? – Andy F Jul 31 '12 at 02:12
  • Sometimes... Most of the industry has gone to 10GbE NIC interfaces in PCIe form factor. Solarflare, Myricom, Chelsio, Intel, etc... In terms of Gigabit, some people go with straight Intel adapters for predictability and ease of access to the drivers (for modification). It's not a must, though. – ewwhite Jul 31 '12 at 02:14
  • System setup: the server has 2 NICs. One is connected to the mirrored port where we're capturing the packets from, the other is connected to the lan. irqbalance is running, but I think that is a mistake. My plan is to turn off irqbalance and pin the NIC with the captured traffic to one core. The application which reads the socket has two threads, one which reads from the socket and writes to shared memory, and the second thread which reads from shared memory and writes to file. Would I want to pin the reader thread to a core which shares the L2 cache with the core which will handle IRQs? – Andy F Jul 31 '12 at 02:17
  • I'd kill `irqbalance`, increase the buffers, and worry about the application's realtime priority. Which version of RHEL and kernel is this? `cat /etc/issue` and `uname -a` – ewwhite Jul 31 '12 at 02:22
  • I don't have access to /etc/issue right now but here's the output of uname and lsb_release, FWIW: Linux xxx.com 2.6.32-202.el6.x86_64 #1 SMP Wed Sep 21 15:27:03 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux LSB Version: :core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch Distributor ID: RedHatEnterpriseServer Description: Red Hat Enterprise Linux Server release 6.0 (Santiago) Release: 6.0 Codename: Santiago – Andy F Jul 31 '12 at 02:29
  • should I set the priority to FF then? – Andy F Jul 31 '12 at 02:30
  • There are too many comments here. See the edit to my post. – ewwhite Jul 31 '12 at 02:55