I'm having trouble reaching the throughput I expect from an Intel dual-port 82599EB 10-Gigabit adapter. I've tried many things and want to know if there's anything I could try that I've missed.
My hardware configuration
Two servers running OpenSUSE, each with an Intel dual-port 82599EB 10GbE adapter. The ports are manually configured with static IPs, and each port on one machine is connected directly to a port on the other.
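For completeness, the addressing is roughly this (just a sketch; the interface names eth2/eth3 and the /24 masks are assumptions, the addresses match the iperf commands below):

# transmitter
ip addr add 192.168.1.20/24 dev eth2
ip addr add 192.168.1.21/24 dev eth3
ip link set eth2 up
ip link set eth3 up

# receiver
ip addr add 192.168.1.10/24 dev eth2
ip addr add 192.168.1.11/24 dev eth3
ip link set eth2 up
ip link set eth3 up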
lspci -vv
Throughput Test
I am using iperf to test. The cards are driven by the ixgbe driver.
On the receiver side, I run
iperf -s
On the transmitter side, one client per interface, each bound to the local interface address:
iperf -c 192.168.1.10 -t 20 -B 192.168.1.20
iperf -c 192.168.1.11 -t 20 -B 192.168.1.21
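For the dual-port test I start both clients at roughly the same time; a minimal sketch of how I launch them together (just backgrounding the two commands above):

# run both 20-second tests concurrently
iperf -c 192.168.1.10 -t 20 -B 192.168.1.20 &
iperf -c 192.168.1.11 -t 20 -B 192.168.1.21 &
wait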
With both interfaces running I get around 4.x Gbit/s per interface. If I run only one interface, I get 9.x Gbit/s.
Configuration Attempts
I have looked around the SE sites and many other articles. Here are three helpful ones I found:
- Network Connectivity — Tuning Intel® Ethernet Adapter throughput performance
- https://www.kernel.org/doc/Documentation/networking/ixgbe.txt
- http://www.redhat.com/promo/summit/2008/downloads/pdf/Thursday/Mark_Wagner.pdf (PDF)
The two things that really helped:
- Using jumbo frames by setting the MTU to 9000.
- Increasing the rmem settings in /etc/sysctl.conf (sketched below).
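Roughly what those two changes look like (a sketch only; eth2/eth3 and the buffer sizes are placeholders, the exact values I used came from the guides above):

# jumbo frames on both ports
ip link set eth2 mtu 9000
ip link set eth3 mtu 9000

# added to /etc/sysctl.conf, then applied with sysctl -p (example sizes)
net.core.rmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216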
However, I am still getting only about 9.5 Gbit/s combined across both channels, and I'm thinking I should get 9 Gbit/s or more per channel.
Things I've tried without much success:
- Used ethtool -C to vary the interrupt coalescing settings
- Used ethtool to disable/enable flow control (examples of both below)
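Roughly what those attempts looked like (example values only; I stepped through several settings):

# vary interrupt coalescing (rx-usecs is one of the knobs I changed)
ethtool -C eth2 rx-usecs 0
ethtool -C eth2 rx-usecs 125

# toggle flow control off and back on
ethtool -A eth2 rx off tx off
ethtool -A eth2 rx on tx on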
Edits as per comments
To check CPU utilization I am using mpstat -P ALL 5. On the transmitting server, the busiest core shows about 61% (system) utilization:
01:12:59 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
...
01:12:59 PM 4 0.00 0.00 61.33 0.00 0.00 9.38 0.00 0.00 29.29
That should be okay, right? On the receiver the maximum I see is about 30%.
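For what it's worth, I also glanced at which cores the card's interrupts are landing on, since a single hot core could be the limit (interface names assumed):

# which cores the ixgbe queue interrupts are hitting
grep eth2 /proc/interrupts
grep eth3 /proc/interrupts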
Using lspci -vv, I got the following. I can post the full output if needed, but I think this shows the relevant PCIe info:
Sender:
1: LnkCap: Port #16, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <2us, L1 <32us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
2: LnkCap: Port #16, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <2us, L1 <32us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
Receiver:
1: LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <1us, L1 <8us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
2: LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <1us, L1 <8us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
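For reference, a quick way I've been checking the negotiated link speed and width (the bus addresses are placeholders for my two ports, taken from lspci | grep 82599):

lspci -vv -s 03:00.0 | grep -E 'LnkCap|LnkSta'
lspci -vv -s 03:00.1 | grep -E 'LnkCap|LnkSta'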
5 GT/s at x8 should be plenty, right?
Are you sure there are no other bottlenecks and your problem is the network interfaces? Is your CPU maxed out or something? – Zoredache – 2014-07-31T00:26:05.217
Or your PCI Express bus? https://communities.intel.com/community/wired/blog/2009/06/08/understanding-pci-express-bandwidth – cpt_fink – 2014-07-31T06:00:45.100

The PCI bus is a common bottleneck for 10G cards. – MaQleod – 2014-07-31T19:34:02.933
Ah, good points! Thanks! I'm a little out of my element here, but I think I've provided the correct info. If not, let me know. It seems like 5 GT/s at x8 should be plenty, but I'll keep looking. – Nate – 2014-07-31T19:38:08.663