
I have a few HP Gen7 blades equipped with QLogic IBA7322 InfiniBand cards, which I would like to use with CentOS 8. The problem is that I cannot find the right drivers for them. All the information I find is either outdated or behind links that no longer work (e.g. the Marvell download links).

I can see the card with `lspci`, but beyond that I have had no luck bringing it up.

Can anyone point me in some sort of direction with this problem?

Cheers.

Edit:

Output of `lspci -vv`:

01:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
        Subsystem: Hewlett-Packard Company Device 178a
        Physical Slot: 0
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at fd400000 (64-bit, non-prefetchable) [size=4M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <4us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [b0] MSI-X: Enable- Count=32 Masked-
                Vector table: BAR=0 offset=00008000
                PBA: BAR=0 offset=00009000
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP- SDES+ TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
lucian
  • Did you check to see if the drivers were dropped due to the hardware's age? RHEL often removes drivers for ancient hardware from its latest kernels. – Michael Hampton Aug 18 '20 at 18:48
  • I didn't see that card on the list... If it is still supported, how would I know? – lucian Aug 18 '20 at 21:05

1 Answer


The first thing to do is to get the PCI vendor and device IDs for the hardware in question. Your card appears to be 1077:7322. A quick look at the Linux Driver Database tells me this card uses the ib_qib driver.
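As a minimal sketch of that first step, the snippet below pulls the `vendor:device` pair out of a line of `lspci -nn` output. The sample line is the one quoted in the comments on this answer; `pci_id` is a hypothetical helper name, not a standard tool, and on a real system you would pipe `lspci -nn` into it instead of the sample.

```shell
# Sample line copied from the `lspci -nn` output quoted in the comments.
# On a real machine, replace the printf below with: lspci -nn | pci_id
sample='01:00.0 InfiniBand [0c06]: QLogic Corp. IBA7322 QDR InfiniBand HCA [1077:7322] (rev 02)'

# pci_id (hypothetical helper): print the [vendor:device] pair found on
# each input line. The class code such as [0c06] contains no colon, so it
# does not match the pattern and is skipped.
pci_id() {
    sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p'
}

id=$(printf '%s\n' "$sample" | pci_id)
echo "$id"
```

The resulting ID is what you look up in a driver database, or compare against the aliases a module declares.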

I took a look through the RHEL documentation of removed drivers in RHEL 8, but did not see this driver. However, I fired up a RHEL 8 VM and the driver is no longer present and is not enabled in the corresponding kernel configuration. It is present in RHEL 7 though.

What people usually do in such a circumstance, in order to use the old hardware, is to use the elrepo repository, which for RHEL 8 contains the missing ib_qib driver (and several other drivers that Red Hat dropped). For example:

dnf install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm
dnf install kmod-ib_qib
Michael Hampton
  • Thank you for your help! I installed the driver, but I still cannot see the InfiniBand card as active. How do I activate it after the driver installation? – lucian Aug 19 '20 at 09:03
  • This is what I get from `ibstatus`: `Fatal error: No devices` – lucian Aug 19 '20 at 09:16
  • Is the module loaded? Does `lspci -vv` show it being used? – Michael Hampton Aug 19 '20 at 14:06
  • I am editing my original post to show you what `lspci -vv` says. – lucian Aug 19 '20 at 15:06
  • Oh, you have an HP-branded card, not an OEM QLogic card. What's the actual vendor and device ID for your card? – Michael Hampton Aug 19 '20 at 15:36
  • This is what I get from `lspci -nn`: `01:00.0 InfiniBand [0c06]: QLogic Corp. IBA7322 QDR InfiniBand HCA [1077:7322] (rev 02)` – lucian Aug 19 '20 at 15:46
  • Well, that's the right vendor and device ID. So, again, did you load the module? – Michael Hampton Aug 19 '20 at 15:47
  • I have tried `modprobe ib_qib`, but that leads to these error messages: `modprobe: ERROR: could not find module by name='ib_qib'`, `modprobe: ERROR: could not insert 'ib_qib': Unknown symbol in module, or unknown parameter (see dmesg)`, `modprobe: ERROR: Error running install command for ib_qib`, `modprobe: ERROR: could not insert 'ib_qib': Operation not permitted`. – lucian Aug 19 '20 at 16:25
  • I tried this: `# insmod ib_qib.ko` but got this error: `insmod: ERROR: could not insert module ib_qib.ko: Unknown symbol in module`. The file is located here: `/usr/lib/modules/4.18.0-193.6.3.el8_2.x86_64/extra/ib_qib/ib_qib.ko` – lucian Aug 19 '20 at 16:44
  • Well, that's interesting. I'd have a chat with the elrepo people about that. – Michael Hampton Aug 19 '20 at 16:50
  • I have found this similar problem but I have no clue how to use the [info](https://elrepo.org/bugs/bug_view_page.php?bug_id=1015&history=1). How do I get in touch with them? – lucian Aug 19 '20 at 16:51
  • Probably the same way they did. – Michael Hampton Aug 19 '20 at 16:52
  • Ok I will do that! Thank you for your help! – lucian Aug 19 '20 at 16:53
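
While waiting on the ELRepo side, one check that might narrow down the `Unknown symbol in module` error is whether the `.ko` file sits under the directory of the kernel that is actually running. This is only a sketch: a kernel mismatch is an assumed possible cause, not something confirmed in this thread, and `module_kernel` is a hypothetical helper name. The path is the one reported in the comments above.

```shell
# Path of the installed module, as reported in the comments above.
ko='/usr/lib/modules/4.18.0-193.6.3.el8_2.x86_64/extra/ib_qib/ib_qib.ko'

# module_kernel (hypothetical helper): print the kernel release directory
# a module path lives under, i.e. the path component after ".../modules/".
module_kernel() {
    printf '%s\n' "$1" | sed -n 's|.*/modules/\([^/]*\)/.*|\1|p'
}

built_for=$(module_kernel "$ko")
running=$(uname -r)

echo "module directory kernel: $built_for"
echo "running kernel:          $running"

# A difference here is only a hint, not proof of the cause.
if [ "$built_for" != "$running" ]; then
    echo "mismatch: the module sits under a different kernel's directory"
fi
```

Whatever this prints, the kernel log should name the symbols that could not be resolved, as the `modprobe` error itself suggests with "see dmesg"; that list is the useful detail to include in a report to ELRepo.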