I have a server running CentOS 6.5 with two 1 Gbps Ethernet cards. I have added a new interface (Intel(R) 10 Gigabit PCI Express Network), but the system is not detecting it: the card does not show up in (ifconfig -a), and there is no (ifcfg-ethX) file for it.

I have tried the following:

  • ifconfig -a: does not show the new NIC or its MAC address.
  • Removed /etc/udev/rules.d/70-persistent-net.rules and rebooted.
  • Created a new ifcfg-eth2 file with the real HWaddr, but bringing it up still shows:

     Bringing up interface eth2:  
     Device eth2 does not seem to be present, delaying initialization. [FAILED]
    
  • rmmod ixgbe; modprobe ixgbe
  • Output of dmesg:

    ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 3.15.1-k
    ixgbe: Copyright (c) 1999-2013 Intel Corporation.
    ixgbe 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
    ixgbe 0000:03:00.0: setting latency timer to 64
    ixgbe 0000:03:00.0: The EEPROM Checksum Is Not Valid
    ixgbe 0000:03:00.0: PCI INT A disabled
    ixgbe: probe of 0000:03:00.0 failed with error -5
    ixgbe 0000:03:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
    ixgbe 0000:03:00.1: setting latency timer to 64
    ixgbe 0000:03:00.1: The EEPROM Checksum Is Not Valid
    ixgbe 0000:03:00.1: PCI INT B disabled
    ixgbe: probe of 0000:03:00.1 failed with error -5
    

None of these worked. I am fairly sure that if I format the disk and reinstall the OS it will work, because I had the same problem on a previous server and a reinstall fixed it. How can I fix this without reinstalling the OS?

NOTE: The same NIC model works on another, freshly installed CentOS 6.5 server. Its dmesg output:

ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 3.15.1-k
ixgbe: Copyright (c) 1999-2013 Intel Corporation.
ixgbe 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
ixgbe 0000:03:00.0: setting latency timer to 64
  alloc irq_desc for 39 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 39 for MSI/MSI-X
  alloc irq_desc for 40 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 40 for MSI/MSI-X
  alloc irq_desc for 41 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 41 for MSI/MSI-X
  alloc irq_desc for 42 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 42 for MSI/MSI-X
  alloc irq_desc for 43 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 43 for MSI/MSI-X
  alloc irq_desc for 44 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 44 for MSI/MSI-X
  alloc irq_desc for 45 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 45 for MSI/MSI-X
  alloc irq_desc for 46 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 46 for MSI/MSI-X
  alloc irq_desc for 47 on node -1
  alloc kstat_irqs on node -1
ixgbe 0000:03:00.0: irq 47 for MSI/MSI-X
ixgbe 0000:03:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8
ixgbe 0000:03:00.0: (PCI Express:2.5GT/s:Width x8) 00:1b:21:69:89:61
ixgbe 0000:03:00.0: MAC: 1, PHY: 5, PBA No: E18269-001
ixgbe 0000:03:00.0: Intel(R) 10 Gigabit Network Connection

Here are the parameters of e1000. There is no parameter for allowing a bad checksum:

[root@tv ~]# modinfo e1000 | grep parm

parm:           TxDescriptors:Number of transmit descriptors (array of int)
parm:           RxDescriptors:Number of receive descriptors (array of int)
parm:           Speed:Speed setting (array of int)
parm:           Duplex:Duplex setting (array of int)
parm:           AutoNeg:Advertised auto-negotiation setting (array of int)
parm:           FlowControl:Flow Control setting (array of int)
parm:           XsumRX:Disable or enable Receive Checksum offload (array of int)
parm:           TxIntDelay:Transmit Interrupt Delay (array of int)
parm:           TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm:           RxIntDelay:Receive Interrupt Delay (array of int)
parm:           RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm:           InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm:           SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm:           KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm:           copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm:           debug:Debug level (0=none,...,16=all) (int)
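
The same check can be scripted so that it gives a yes/no answer for any parameter name. This is only a sketch: `has_parm` is a hypothetical helper name, and it assumes the parameter list has the `name:description` shape shown above (with or without the leading `parm:` label).

```shell
#!/bin/sh
# has_parm NAME
# Reads a module's parameter list on stdin and succeeds if NAME is listed.
# Accepts both "parm:   name:desc" (plain modinfo output) and "name:desc"
# (as printed by `modinfo -F parm`).
has_parm() {
    grep -Eq "^(parm:[[:space:]]*)?$1:"
}

# Usage:
#   modinfo e1000 | has_parm eeprom_bad_csum_allow && echo "supported"
#   modinfo ixgbe | has_parm allow_unsupported_sfp && echo "supported"
```

If the helper reports that a parameter is missing, passing it to modprobe is expected to fail with an "unknown parameter" error rather than be ignored.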
Areeb111
  • Does it appear in the `dmesg` output? – dawud Jun 29 '14 at 15:16
  • Hi, I have added the output of dmesg above. – Areeb111 Jun 29 '14 at 15:21
  • Might be kernel version? Centos runs old kernels so often isn't quite up with the latest hardware. – hookenz Jun 29 '14 at 21:17
  • It was running `2.6.32-431.5.1.el6.x86_64`. I have updated to `2.6.32-431.20.3.el6`, but the NIC is still not detected! – Areeb111 Jun 29 '14 at 21:53
  • Check what parameters this module has (`modinfo ixgbe | grep parms`), `e1000` for instance has a `eeprom_bad_csum_allow` parameter you can activate. – dawud Jun 29 '14 at 21:59
  • Hi @dawud, Here are the parms : parm: `IntMode:Change Interrupt Mode (0=Legacy, 1=MSI, 2=MSI-X), default 2 (array of int) parm: FdirMode:Flow Director filtering modes (0=Off, 1=On) default 1 (array of int) parm: max_vfs:Maximum number of virtual functions to allocate per physical function - default is zero and maximum value is 63 (uint) parm: allow_unsupported_sfp:Allow unsupported and untested SFP+ modules on 82599-based adapters (uint) parm: debug:Debug level (0=none,...,16=all) (int)` – Areeb111 Jun 29 '14 at 22:15
  • `[root@store ~]# modprobe -r e1000 [root@store ~]# modprobe e1000 eeprom_bad_csum_allow=1 FATAL: Error inserting e1000 (/lib/modules/2.6.32-431.20.3.el6.x86_64/kernel/drivers/net/e1000/e1000.ko): Unknown symbol in module, or unknown parameter (see dmesg)` – Areeb111 Jun 29 '14 at 22:37
  • @Areeb111 What type of server hardware is this? – ewwhite Jun 29 '14 at 22:57
  • Supermicro motherboard with Intel Core i7 CPU – Areeb111 Jun 29 '14 at 23:10
  • Oh, *Supermicro* :( – ewwhite Jun 30 '14 at 00:04

3 Answers


This actually looks like a problem with your server's motherboard.

We can see from your dmesg output that it is failing to communicate correctly with the PCIe card in the failing server, but works correctly in a different server.

So you most likely have a bad PCIe slot, or bad motherboard components.

You can try using a different PCIe slot, if you have another one available, checking that your NIC and riser card (if any) are firmly seated, or replacing the riser card or motherboard.

It could also be, if you haven't actually tried this specific NIC in a different server and had it work, that the NIC itself is bad.
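
As a rough way to check the slot and card from the running system, the failing PCI addresses can be pulled out of the kernel log and then inspected with `lspci`. This is a sketch, not a definitive diagnostic: `failed_probes` is a hypothetical helper name, and it assumes the log lines look like the ones in the question.

```shell
#!/bin/sh
# failed_probes
# Reads kernel log text on stdin and prints the PCI address of every
# device whose ixgbe probe failed, one address per line.
failed_probes() {
    sed -n 's/^.*ixgbe: probe of \([0-9a-fA-F:.]*\) failed with error.*$/\1/p'
}

# Usage (run as root so lspci can read the full capability list):
#   dmesg | failed_probes | while read -r addr; do
#       lspci -vv -s "$addr" | grep -E 'Ethernet|LnkCap|LnkSta'
#   done
```

If `lspci` lists the device and the `LnkSta` speed and width match `LnkCap`, the slot is at least negotiating a link, which would point more toward the card or its EEPROM than toward the motherboard.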

Michael Hampton
  • No sir, I believe this is a bug in the Linux kernel, as discussed here: http://www.thinkwiki.org/wiki/Problem_with_e1000:_EEPROM_Checksum_Is_Not_Valid – Areeb111 Jun 29 '14 at 23:01
  • Also, I had the same problem on the other server; I reinstalled CentOS 6.5 and it worked fine without any tuning. – Areeb111 Jun 29 '14 at 23:02
  • That patch went in years before EL6. So I suspect you are having a possibly related issue. Does it actually work if you completely disconnect power from the server, connect an Ethernet cable from the NIC to a switch, and then power it on? – Michael Hampton Jun 29 '14 at 23:04
  • OK, let me try this tomorrow, because I am accessing the server remotely right now. I will also try a different PCI-E slot, of course. – Areeb111 Jun 29 '14 at 23:08
  • Hi Michael, I have changed the PCI-E slot and the same problem exists. – Areeb111 Jul 07 '14 at 22:50

Try ifconfig eth2 up.

I seem to recall having to do that to get the interface seen.

dmourati
  • [root@store ~]# ifconfig eth2
    eth2: error fetching interface information: Device not found
    It's not detecting the NIC as a network device.
    – Areeb111 Jun 29 '14 at 21:19

You can try Intel BootUtil (available for both Windows and Linux) to re-flash or upgrade the EEPROM of Intel 10GbE interface cards: https://downloadcenter.intel.com/download/19186

I recently purchased several Intel X550-T1 cards, and some of them gave me "The EEPROM Checksum Is Not Valid" errors during boot on RHEL 6.9 systems. After I ran BootUtil to update the firmware, these cards worked like a charm.
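
After a firmware update, one way to confirm that the driver now binds to the card is to reload the module and list the network interfaces that sysfs reports as owned by it. A minimal sketch, assuming the standard `/sys/class/net/<iface>/device/driver` layout; `ifaces_for_driver` is a hypothetical helper name, and `readlink -f` is the GNU coreutils form.

```shell
#!/bin/sh
# ifaces_for_driver DRIVER [NETROOT]
# Prints every interface under NETROOT (default /sys/class/net) whose
# device/driver link resolves to a directory named DRIVER.
ifaces_for_driver() {
    drv=$1
    root=${2:-/sys/class/net}
    for dev in "$root"/*; do
        # Virtual interfaces such as lo have no device/driver link.
        [ -e "$dev/device/driver" ] || continue
        name=$(basename "$(readlink -f "$dev/device/driver")")
        if [ "$name" = "$drv" ]; then
            basename "$dev"
        fi
    done
    return 0
}

# Usage (as root):
#   modprobe -r ixgbe && modprobe ixgbe
#   ifaces_for_driver ixgbe
```

An empty result after reloading the module suggests the probe is still failing, in which case dmesg should be checked again for the checksum message.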