CPU context switches per second increase as network traffic increases

Question

I have a server running Ubuntu Server, with several applications on it that use the network. As network traffic increases, CPU context switches and interrupts also increase, to 40-60k per second. What should I fix: kernel tuning, NIC tuning, or something else?

UPDATE

First of all, thank you for your answers. I have 8 CPUs. Here is the output of cat /proc/interrupts:

          CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
  0:    6938741    6966303    6934714    6881839    6895772    6883046    6952545    6909960   IO-APIC-edge      timer
  1:          0          0          1          0          0          1          1          1   IO-APIC-edge      i8042
  8:          0          1          0          0          0          0          0          0   IO-APIC-edge      rtc0
  9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          0          1          0          0          1          1          2          1   IO-APIC-edge      i8042
 16:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
 17:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 18:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 19:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb5
 21:         21         23         22         21         23         21         23         21   IO-APIC-fasteoi   ipmi_si
 22:          3          5          6          6          2          5          3          6   IO-APIC-fasteoi   uhci_hcd:usb6, hpilo
 23:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   radeon
 41:     275729        555        587        549     275294        563        583        600   PCI-MSI-edge      cciss0
 42:          2          0          2          1          1          1          2          1   PCI-MSI-edge      cciss1
 46:   31600723   31636789   31668261   31721092   31643480   31719981   31650284   31692948   PCI-MSI-edge      eth0
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC:   42250721   42318004   19164905   20751945   32012455   25335850   15889990   15935085   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
IWI:          0          0          0          0          0          0          0          0   IRQ work interrupts
RES:  104005816   96594384   40149041   34906154   77175689   55787936   28455228   25633969   Rescheduling interrupts
CAL:     204860     543304    1318717    1176681     431344     876239    1046465    1257472   Function call interrupts
TLB:     308034     229917     230598     299353     362103     478994     256456     212019   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:       1263       1263       1263       1263       1263       1263       1263       1263   Machine check polls
ERR:          0
MIS:          0

The rescheduling interrupt (RES) counts are very high:

 RES:  104150407   96747853   40291367   35052019   77327041   55940217   28595113   25775538   Rescheduling

What does this number depend on?

My network cards: bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.6 (Mar 7, 2011)

UPDATE 2

I ran: ethtool -k eth0

It shows: large-receive-offload: off

How can I turn it on?

Thank you.

What is that traffic? Have you many small packets coming in/going out? Big ones? Can you raise the MTU value without problems? — Nils, Jan 17 '12 at 21:40

score 3 · Answer 1 · answered Jan 17 '12 at 15:42

I assume these are legitimate interrupts caused by network load, and not the result of a hardware or driver problem. So:

If you deal with TCP traffic, you can invest in a network card with a TCP offload engine (TOE). It does some of the TCP/IP processing in the network card chip and raises fewer interrupts (and so causes fewer context switches). Check whether your kernel, OS, application and traffic support it.

Alternatively, take a look into Large receive offload (LRO), which is a light-weight approach.
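As a rough sketch of how LRO can be checked and switched on with ethtool: the helper names offload_state and enable_lro below are made up for illustration, and eth0 is taken from the question. Whether the setting can actually be changed depends on the driver; I have not verified that bnx2 supports LRO, so the command may simply be refused.

```shell
# Print the state of one offload feature, reading `ethtool -k` output on stdin.
# offload_state is a hypothetical helper name, not part of ethtool.
offload_state() {
    awk -F':' -v feat="$1" '{
        name = $1
        gsub(/^[ \t]+/, "", name)      # drop leading indentation
        if (name == feat) {
            val = $2
            gsub(/^[ \t]+/, "", val)   # drop the space after the colon
            print val
        }
    }'
}

# Try to enable LRO on an interface. Needs root, and the driver must allow it.
# enable_lro is a hypothetical wrapper around the real `ethtool -K` command.
enable_lro() {
    ethtool -K "$1" lro on
}

# Example usage (not run here):
#   ethtool -k eth0 | offload_state large-receive-offload
#   enable_lro eth0
```

Note the lowercase -k to show the settings and the uppercase -K to change them. A change made this way does not survive a reboot unless it is also added to the network configuration.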

score 1 · Answer 2 · answered Jan 17 '12 at 15:19

Hardware interrupts are a normal part of computer operation. Your NIC is going "hey! hey! hey! hey!" telling your CPU that it needs attention.

Excessive hardware interrupts are typically caused by bad drivers. So the first thing I would look at is your NIC driver.

score 1 · Answer 3 · answered Jan 17 '12 at 15:40

That is indeed a huge number of interrupts. Often this is an APIC problem, though.

cat /proc/interrupts

should tell. If you only see your interrupts hitting CPU0 try

echo "2" > /proc/irq/"somenumber"/smp_affinity

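This should steer the interrupts of IRQ "somenumber" to CPU1. The value is a hexadecimal bitmask of CPUs, so 2 (binary 10) selects CPU1; CPU2 would need the value 4.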
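Since smp_affinity takes a hexadecimal CPU bitmask rather than a CPU number, it is easy to write the wrong value. A minimal sketch for computing the mask follows; cpu_mask and pin_irq are hypothetical helper names, and IRQ 46 in the usage comment is simply the eth0 line from the question's /proc/interrupts output.

```shell
# Print the hexadecimal affinity mask that selects a single CPU (0-based index).
# Bit N of the mask stands for CPU N.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Pin an IRQ to one CPU by writing the mask to its smp_affinity file.
# Needs root; the write fails if the IRQ does not exist or cannot be moved.
pin_irq() {
    cpu_mask "$2" > "/proc/irq/$1/smp_affinity"
}

# Example usage (not run here):
#   cpu_mask 2       # mask for CPU2
#   pin_irq 46 2     # steer IRQ 46 (eth0 in the question) to CPU2
```

If the irqbalance daemon is running, it may rewrite the affinity later, so a manual setting might not stick.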
