
I have installed two dual-port FDR Infiniband VPI HBAs, one in each of two servers running CentOS 6.9:

server1>lspci
03:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

server2>lspci
81:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

I want to use these for high-speed NFSv4 (probably via RDMA) connectivity between these two machines, directly attached to each other via Infiniband (2-meter 56 Gbps QSFP+ passive cable). I have done the following on both (substituting the correct PCI address below).

# install the RDMA stack and diagnostics, enable and start the rdma service
yum -y install rdma infiniband-diags
chkconfig rdma on
service rdma start

# put both ports of the ConnectX-3 into Ethernet mode
printf "0000:XX:00.0 eth eth\n" >> /etc/rdma/mlx4.conf
echo eth > /sys/bus/pci/devices/0000:XX:00.0/mlx4_port1
echo eth > /sys/bus/pci/devices/0000:XX:00.0/mlx4_port2

# reload the driver so the port-type change takes effect
modprobe -r mlx4_core
modprobe mlx4_core
modprobe ib_umad

# install the prepared interface configuration files and bring the interfaces up
cp -f ifcfg-eth4 /etc/sysconfig/network-scripts/ifcfg-eth4
cp -f ifcfg-eth5 /etc/sysconfig/network-scripts/ifcfg-eth5
chmod 644 /etc/sysconfig/network-scripts/ifcfg-*
chcon system_u:object_r:net_conf_t:s0 /etc/sysconfig/network-scripts/ifcfg-*
ifup eth4
ifup eth5
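
For reference, the port-type change can be double-checked after the driver reload (substituting the correct PCI address as above); ibstat comes from infiniband-diags, and ethtool simply reports the Ethernet-level link state:

# both sysfs files should now read "eth"
cat /sys/bus/pci/devices/0000:XX:00.0/mlx4_port1
cat /sys/bus/pci/devices/0000:XX:00.0/mlx4_port2

# ibstat should show "Link layer: Ethernet" for both ports
ibstat mlx4_0

# ethtool prints "Link detected: yes/no" for each interface
ethtool eth4
ethtool eth5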

An example network configuration file (e.g. ifcfg-eth4) looks like this, substituting the appropriate MAC and IP address for each port:

DEVICE=eth4
HWADDR=XX:XX:XX:XX:XX:XX
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=none
USERCTL=no
NETWORK=10.72.1.0
NETMASK=255.255.255.0
IPADDR=XXX.XXX.XXX.XXX

There are three other similar files (two interfaces on each machine, four files in total), and ifup and ifdown work for both interfaces on both machines. Additionally, the expected routes exist:

server1>ip route show
10.72.1.0/24 dev eth4  proto kernel  scope link  src 10.72.1.3
10.72.1.0/24 dev eth5  proto kernel  scope link  src 10.72.1.4
...

This is where things start going badly. ibstat reports:

CA 'mlx4_0'
        CA type: MT4099
        Number of ports: 2
        Firmware version: 2.11.500
        Hardware version: 0
        Node GUID: 0xf45...
        System image GUID: 0xf45...
        Port 1:
                State: Down
                Physical state: Disabled
                Rate: 10
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x04010000
                Port GUID: 0xf6...
                Link layer: Ethernet
        Port 2:
                State: Down
                Physical state: Disabled
                Rate: 40
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x04010000
                Port GUID: 0xf6...
                Link layer: Ethernet

Both machines show the same thing, "State: Down" and "Physical state: Disabled". Status lights on the HBAs themselves are dark. I have tried all combinations of connections between the two machines, including connecting each card to itself.

I have read about the need for opensm, and I tried installing it, but despite what seems like correct configuration, it fails:

May 09 20:18:14 888369 [A8697700] 0x01 -> osm_vendor_bind: ERR 5426: Unable to register class 129 version 1
May 09 20:18:14 888418 [A8697700] 0x01 -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind failed
May 09 20:18:14 888436 [A8697700] 0x01 -> osm_sm_bind: ERR 2E10: SM MAD Controller bind failed (IB_ERROR)

Further, I have read that opensm is not needed for this type of configuration.

At this point, I do not know whether this suggests that one or both cards are bad, the cable is bad, some aspect of my configuration is wrong, or something else. I have tried yum -y groupinstall "Infiniband Support", but this did not help, and I subsequently removed the extraneous packages.

What I have not done is reboot the machines, because that is not presently an option, but I thought the modprobe -r; modprobe sequence would be equivalent, and all aspects of the configuration related to module loading seem to be working correctly.

I will appreciate any thoughts!

rg6

1 Answer


First of all, opensm is only used for Infiniband (IB). You have configured your cards to be in Ethernet mode, so opensm is not required.

The basic configuration looks okay. I assume that when you reloaded mlx4_core, mlx4_en was inserted at the same time? Check with lsmod | grep mlx.
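
If mlx4_en turns out to be missing, something like this should bring the Ethernet netdevs back (a sketch; normally the rdma service and the port-type change load it automatically):

# mlx4_en provides the Ethernet netdevices (eth4/eth5) on top of mlx4_core;
# load it explicitly if it does not appear in the lsmod output
modprobe mlx4_en
ip link show eth4
ip link show eth5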

However, I suspect the problem is with the cables. Are they Mellanox-brand FDR or Ethernet cables? If not, they are probably being ignored by the card as unsupported in Ethernet mode. Look up the model number of each part to verify compatibility. Cables that don't work with VPI cards in Ethernet mode have been a thorn in my side more than once.
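
If you want to see what the card itself reports about the attached cable, dumping the transceiver EEPROM shows the vendor and part number (a sketch; ethtool -m support depends on the driver and firmware, and mstflint is a separate Mellanox package, so treat both as optional checks):

# dump the QSFP module EEPROM on one port; the output includes the
# cable vendor and part number to check against the supported-cable list
ethtool -m eth4

# query the adapter itself (PSID, firmware version) with Mellanox's
# mstflint tool, if it is installed
mstflint -d 0000:XX:00.0 query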

Another quick test would be to remove the modules, back out your "eth" mode settings, plug the two nodes together back-to-back with only an IB cable, then reinsert the modules. IB usually does very well at linking in non-optimal conditions: ibstat will show a physical state other than Disabled, and the link will either come up partially (with no opensm) or fully; if the cable is not an FDR cable, it will still link at QDR or DDR. If you can at least get IB working, you know the cards are good.

You can also use IPoIB (interfaces ib0 and ib1 - use connected mode), albeit at a performance hit compared to Ethernet mode. If you're only doing NFS, you might as well go ahead and use IB mode: enable NFS over RDMA (don't forget to change your client mounts to use it as well) and enjoy near-wire-speed NFS on a 56 Gbps link.
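
A rough sketch of that sequence, assuming the same 0000:XX:00.0 placeholder as above, made-up 10.72.2.x addresses for IPoIB, and a hypothetical /export share (adjust all of these to your setup):

# 1. back out the Ethernet port-type settings and reload the driver stack
ifdown eth4 ; ifdown eth5
sed -i '/0000:XX:00.0/d' /etc/rdma/mlx4.conf
modprobe -r mlx4_en mlx4_ib mlx4_core
modprobe mlx4_core

# 2. with the IB cable connected back-to-back, the physical state should
#    now be Polling or LinkUp rather than Disabled, even with no opensm
ibstat mlx4_0

# 3. run a subnet manager on ONE of the two nodes to bring the link Active
service opensm start

# 4. optional: IPoIB in connected mode for ordinary IP traffic
modprobe ib_ipoib
echo connected > /sys/class/net/ib0/mode
ifconfig ib0 10.72.2.1 netmask 255.255.255.0 up

# 5. NFS over RDMA on the standard port 20049
#    server side:
modprobe svcrdma
echo rdma 20049 > /proc/fs/nfsd/portlist
#    client side (hypothetical export path):
modprobe xprtrdma
mount -o rdma,port=20049 10.72.2.1:/export /mnt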

Paul