
I have two nodes connected through an IB switch, each with a dual-port Mellanox ConnectX-3 VPI HCA. The nodes are two-socket machines with Haswell CPUs and two 16 GB DIMMs per socket (64 GB total). Everything seems to work perfectly, except that the performance numbers don't seem right.

When I run the ib_read_bw benchmark:

server# ib_read_bw --report_gbits
client# ib_read_bw server --report_gbits

---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
 65536      1000             37.76              37.76          0.072016
---------------------------------------------------------------------------------------

But when I run it in dual-port mode:

server# ib_read_bw --report_gbits -O
client# ib_read_bw server --report_gbits -O
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
 65536      2000             52.47              52.47          0.100073
---------------------------------------------------------------------------------------

That is less than a 40% improvement. Am I wrong to expect roughly 2x the single-port bandwidth?

I don't know what the bottleneck could be here, or how to find it.

Other configurations that may be helpful:

  • Each socket has 8 cores; with hyper-threading, each machine has 32 hardware threads in total.
  • Each DIMM provides ~14 GB/s of bandwidth (per-socket memory bandwidth ~28 GB/s, ~56 GB/s overall).
  • I used Mellanox's Auto Tuning Utility to tune the interrupts.
  • The IB links are 4X FDR10 (reported as 4 × 10.0 Gbps), i.e. 40 Gb/s per port (see the sketch after this list).
  • I am using Mellanox OFED 4.3.
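
For reference, here is a minimal sketch of how the negotiated link rate could be double-checked, assuming the standard OFED diagnostics (ibstat from infiniband-diags and ibv_devinfo from the libibverbs utilities) are installed; device and port names will vary:

# show state, width and rate for every HCA port
ibstat

# per-device details, including active_width / active_speed per port
ibv_devinfo -v
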
Mohammad Hedayati

2 Answers

I think the bottleneck here is the PCIe link between the ConnectX-3 and the host. The ConnectX-3 has a Gen 3 x8 PCIe connection, which is limited to a theoretical maximum of 63.04 Gb/s (according to this answer), and that doesn't even account for protocol overhead (see here).
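
As a rough sanity check (my own back-of-the-envelope arithmetic, assuming the usual PCIe Gen 3 numbers of 8 GT/s per lane with 128b/130b encoding and ignoring packet/TLP overhead), and a sketch of how the negotiated link can be inspected with lspci (15b3 is the Mellanox PCI vendor ID; the bus address is a placeholder):

# per-direction raw throughput of a Gen 3 x8 link:
#   8 GT/s per lane * 128/130 * 8 lanes ~= 63 Gb/s   (consistent with the 63.04 Gb/s figure above)

# locate the ConnectX-3 and compare what the slot supports (LnkCap) with what was actually trained (LnkSta)
lspci -d 15b3:
sudo lspci -vv -s <bus:dev.fn> | grep -E 'LnkCap:|LnkSta:'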

haggai_e
  • I just wonder why they would put dual ports on an x8 PCIe card?! – Mohammad Hedayati Jul 20 '18 at 04:42
  • I'm not sure what the reason was in this case, but in general a dual-port NIC can be used for "active/passive" fault tolerance: only one port is active at any given time, but the two ports are connected to different switches, so if one switch fails the system switches to using the other port. In that case the PCIe bandwidth should be enough. – haggai_e Jul 22 '18 at 05:45

I have two systems, each with a Mellanox FDR MCX354A-FCBT (ConnectX-3 VPI) adapter. Since only these two machines use InfiniBand, I don't have a switch and simply have them directly connected. I'm running dual Xeons (Sandy Bridge).

I had a 40 Gb/s cable, which prevented an FDR connection, and was getting:

#bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
65536      1000             31.11              31.11              0.059329

I got an FDR (56 Gb/s) cable and started getting:

#bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
65536      1000             49.58              49.58              0.094569

I had always wondered how it would perform if I used both ports, so I tried that and got:

#bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
65536      2000             52.28              52.28              0.099717

Oh well. I probably won't bother for that gain.

I definitely think haggai_e is right, because my cards are also PCI Express 3.0 x8. To go faster, I think we'd need 3.0 x16 or 4.0 cards.
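
Extending the same back-of-the-envelope arithmetic as above (128b/130b encoding, protocol overhead ignored), roughly:

Gen 3 x8  :  8 GT/s * 8 lanes  * 128/130  ~=  63 Gb/s   (what these cards have)
Gen 3 x16 :  8 GT/s * 16 lanes * 128/130  ~= 126 Gb/s
Gen 4 x8  : 16 GT/s * 8 lanes  * 128/130  ~= 126 Gb/s

Either Gen 3 x16 or Gen 4 x8 would, on paper, leave enough headroom for both FDR ports running flat out.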

Another advantage of dual ports is that they can connect directly to different networks or machines, and each gets its full speed as long as they aren't both transmitting constantly.

user1902689