We build special hardware configurations that rely heavily on LLDP. We have a few new racks of servers that all use the Intel X710 10Gb network card, and on those racks LLDP suddenly stopped working. Our implementation of LLDP is simple: enable LLDP on the TOR (top of rack) switch using default TLVs, enable LLDP on the Linux image using lldpad (CentOS 6.5), and use lldptool to extract neighbor information. This has worked for thousands of machines in the past. Only for these machines with these NICs, the whole thing just stopped working.
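A minimal sketch of the lldpad/lldptool workflow described above, assuming lldpad is already running. The interface name eth2, the LLDP_RUN switch and the helper names are placeholders for this example; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Placeholder interface; eth2 is the port shown in the ethtool output.
IFACE="${IFACE:-eth2}"

# Dry run by default: print the command; execute only when LLDP_RUN=1.
run() {
    if [ "${LLDP_RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Enable LLDP rx/tx on one port, then dump the neighbor TLVs seen on it.
lldp_neighbors() {
    run lldptool set-lldp -i "$1" adminStatus=rxtx
    run lldptool get-tlv -n -i "$1"
}

lldp_neighbors "$IFACE"
```

Running it as `LLDP_RUN=1 IFACE=eth2 sh script.sh` would execute the two commands. On a working port the second one lists the switch's chassis ID, port ID and other TLVs.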
Packet dumps from the switches and the servers showed that the servers were sending LLDP frames, and that the switches were receiving them and sending their own TLV frames back. The servers were not receiving the switches' frames, though, leaving us scratching our heads. We placed other machines using different NICs on the TOR and they get LLDP data as expected.
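For anyone wanting to repeat the server-side capture, something along these lines should do it. This is a sketch, not the exact command we used: eth2, the frame count and the LLDP_RUN switch are assumptions of the example. LLDP frames carry EtherType 0x88cc, so the filter matches on that.

```shell
#!/bin/sh
# Placeholder interface; adjust to the port under test.
IFACE="${IFACE:-eth2}"

# Dry run by default: print the command; execute only when LLDP_RUN=1.
run() {
    if [ "${LLDP_RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Capture a few LLDP frames (EtherType 0x88cc) on one interface.
# -e prints the link-level header, so the source MAC shows whether a
# frame came from the server itself or from the switch.
capture_lldp() {
    run tcpdump -i "$1" -nn -e -vv -c "${2:-4}" ether proto 0x88cc
}

capture_lldp "$IFACE"
```

With the default transmit interval of 30 seconds, a capture of four frames can take a couple of minutes to complete.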
I asked the Googles...
According to this link, it seems that these X710s are probably running an internal LLDP agent, which intercepts the LLDP frames from the switch before they reach the OS. The driver and firmware versions on the affected machines are:
# ethtool -i eth2
driver: i40e
version: 1.3.47
firmware-version: 4.53 0x80001e5d 17.0.10
bus-info: 0000:01:00.2
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
The method described at that link for disabling the internal LLDP agent on the NIC does not work for me. I'm still digging around, but I figure I have a few options:
- Find the correct way to disable the internal LLDP agent on the NIC and keep using the existing method of extracting LLDP data on these machines (preferred).
- Use the NIC LLDP agent and find a way to extract the neighbor TLVs from the NIC.
Has anyone else experienced the same or similar issues with these cards and if so, how did you get around the problem?
I figure that if I wanted to use the internal agent's data, it would be exposed via ethtool or snmp, but so far I have been unable to find a way to surface the information.
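One thing that can be checked from ethtool is the driver's private flags, since the output above reports supports-priv-flags: yes. Later i40e driver releases document a private flag named disable-fw-lldp; whether version 1.3.47 exposes it is an assumption that the sketch below tests rather than relies on. The helper name and the LLDP_RUN switch are hypothetical.

```shell
#!/bin/sh
# Placeholder interface; adjust to the port under test.
IFACE="${IFACE:-eth2}"

# Succeeds if `ethtool --show-priv-flags` output on stdin lists the
# disable-fw-lldp flag (present in later i40e releases; unverified for 1.3.47).
has_fw_lldp_flag() {
    grep -q '^disable-fw-lldp[[:space:]]*:'
}

# Nothing is changed unless LLDP_RUN=1 is set.
if [ "${LLDP_RUN:-0}" = 1 ]; then
    if ethtool --show-priv-flags "$IFACE" | has_fw_lldp_flag; then
        # Turn the firmware LLDP agent off so frames reach lldpad.
        ethtool --set-priv-flags "$IFACE" disable-fw-lldp on
    else
        echo "$IFACE: driver does not expose disable-fw-lldp" >&2
    fi
else
    echo "would inspect: ethtool --show-priv-flags $IFACE"
fi
```

If the flag is not listed, this route is closed for the installed driver and only a driver update or the debugfs interface remains.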
TIA
EDIT: For the record, when I attempt the steps outlined in the Intel forums, I get the following output:
root@host (~)# find /sys/kernel/debug/
/sys/kernel/debug/
root@host (~)# mkdir /sys/kernel/debug/i40e
mkdir: cannot create directory `/sys/kernel/debug/i40e': No such file or directory
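The empty find output above suggests that debugfs may simply not be mounted, in which case the i40e directory cannot exist yet and mkdir cannot create it. A hedged sketch of what mounting it and then sending the "lldp stop" command from the i40e driver README might look like follows. It assumes the driver was built with debugfs support and that the directory is named after the bus-info value from ethtool -i. The function names and the LLDP_RUN switch are hypothetical.

```shell
#!/bin/sh
# Sketch only: bus address taken from the bus-info line of `ethtool -i eth2`.
BUS_INFO="${BUS_INFO:-0000:01:00.2}"
DEBUGFS="${DEBUGFS:-/sys/kernel/debug}"

# Path of the per-port command file the i40e driver is expected to create.
i40e_command_file() {
    printf '%s/i40e/%s/command\n' "$DEBUGFS" "$1"
}

# Mount debugfs if it is not mounted yet, then ask the firmware agent to stop.
i40e_lldp_stop() {
    grep -qs " $DEBUGFS debugfs " /proc/mounts ||
        mount -t debugfs none "$DEBUGFS" || return 1
    cmd_file=$(i40e_command_file "$1")
    [ -w "$cmd_file" ] || { echo "missing $cmd_file" >&2; return 1; }
    echo "lldp stop" > "$cmd_file"
}

# Nothing is changed unless LLDP_RUN=1 is set (root is required).
if [ "${LLDP_RUN:-0}" = 1 ]; then
    i40e_lldp_stop "$BUS_INFO"
else
    i40e_command_file "$BUS_INFO"
fi
```

If the command file is still missing after debugfs is mounted, the installed driver build probably lacks debugfs support, and this approach will not work without a different driver.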