
After upgrading our machines from RHEL 6.6 to RHEL 6.7, we observed a problem where 4 of our 30 machines only receive multicast traffic on one of their two slave interfaces. It is unclear whether the upgrade itself is related or whether the reboot it involved triggered the behavior; restarts of these machines are rare.

We expect to receive a large volume of multicast packets to the group 239.0.10.200 on 4 different ports. If we check the statistics with ethtool on one of the problematic machines, we see the following output:

Healthy interface:

 # ethtool -S eth0 |grep mcast
 [0]: rx_mcast_packets: 294
 [0]: tx_mcast_packets: 0
 [1]: rx_mcast_packets: 68
 [1]: tx_mcast_packets: 0
 [2]: rx_mcast_packets: 2612869
 [2]: tx_mcast_packets: 305
 [3]: rx_mcast_packets: 0
 [3]: tx_mcast_packets: 0
 [4]: rx_mcast_packets: 2585571
 [4]: tx_mcast_packets: 0
 [5]: rx_mcast_packets: 2571341
 [5]: tx_mcast_packets: 0
 [6]: rx_mcast_packets: 0
 [6]: tx_mcast_packets: 8
 [7]: rx_mcast_packets: 9
 [7]: tx_mcast_packets: 0
 rx_mcast_packets: 7770152
 tx_mcast_packets: 313

Broken interface:

 # ethtool -S eth1 |grep mcast
 [0]: rx_mcast_packets: 451
 [0]: tx_mcast_packets: 0
 [1]: rx_mcast_packets: 0
 [1]: tx_mcast_packets: 0
 [2]: rx_mcast_packets: 5
 [2]: tx_mcast_packets: 304
 [3]: rx_mcast_packets: 0
 [3]: tx_mcast_packets: 0
 [4]: rx_mcast_packets: 5
 [4]: tx_mcast_packets: 145
 [5]: rx_mcast_packets: 0
 [5]: tx_mcast_packets: 0
 [6]: rx_mcast_packets: 5
 [6]: tx_mcast_packets: 10
 [7]: rx_mcast_packets: 0
 [7]: tx_mcast_packets: 0
 rx_mcast_packets: 466
 tx_mcast_packets: 459

Multicast is expected from 10 other machines. If we check (using tcpdump) which hosts a broken machine receives multicast from, it only receives from a subset (3-6) of the expected hosts.
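
For reference, the same check can be done with a small Python receiver instead of tcpdump. This is only a minimal sketch; the port 5000 below is a placeholder, since the actual four ports are not listed here:

    #!/usr/bin/env python
    # Sketch: join the group, then print each new source host we hear from.
    import socket
    import struct

    GROUP = "239.0.10.200"
    PORT = 5000  # placeholder; substitute one of the four real ports

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((GROUP, PORT))

    # Plain any-source join; struct ip_mreq = (group address, local interface).
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    senders = set()
    while True:
        _, (src, _) = sock.recvfrom(65535)
        if src not in senders:
            senders.add(src)
            print("receiving from %s (%d distinct senders)" % (src, len(senders)))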

Configuration

Linux version:

# uname -a
Linux ab31 2.6.32-573.3.1.el6.x86_64 #1 SMP Mon Aug 10 09:44:54 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

Ifconfig:

# ifconfig -a
bond0     Link encap:Ethernet  HWaddr 4C:76:25:97:B1:75
          inet addr:10.91.20.231  Bcast:10.91.255.255  Mask:255.255.0.0
          inet6 addr: fe80::4e76:25ff:fe97:b175/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:18005156 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11407592 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:10221086569 (9.5 GiB)  TX bytes:2574472468 (2.3 GiB)

eth0      Link encap:Ethernet  HWaddr 4C:76:25:97:B1:75
          inet6 addr: fe80::4e76:25ff:fe97:b175/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:13200915 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3514446 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9386669124 (8.7 GiB)  TX bytes:339950822 (324.2 MiB)
          Interrupt:34 Memory:d9000000-d97fffff

eth1      Link encap:Ethernet  HWaddr 4C:76:25:97:B1:75
          inet6 addr: fe80::4e76:25ff:fe97:b175/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:4804241 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7893146 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:834417445 (795.7 MiB)  TX bytes:2234521646 (2.0 GiB)
          Interrupt:36 Memory:da000000-da7fffff

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:139908 errors:0 dropped:0 overruns:0 frame:0
          TX packets:139908 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:210503939 (200.7 MiB)  TX bytes:210503939 (200.7 MiB)

Network configuration:

# cat /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=10.91.20.231
NETMASK=255.255.0.0
GATEWAY=10.91.1.25
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_OPTS="miimon=100 mode=802.3ad"

# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
HWADDR="4C:76:25:97:B1:75"
BOOTPROTO=none
ONBOOT="yes"
USERCTL=no
MASTER=bond0
SLAVE=yes

# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE="eth1"
HWADDR="4C:76:25:97:B1:78"
BOOTPROTO=none
ONBOOT="yes"
USERCTL=no
MASTER=bond0
SLAVE=yes

Driver info (same for eth1):

# ethtool -i eth0
driver: bnx2x
version: 1.710.51-0
firmware-version: FFV7.10.17 bc 7.10.11
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

Adapter:

# lspci|grep Ether
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

/proc/net/bonding/bond0:

$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 33
        Partner Key: 5
        Partner Mac Address: 00:01:09:06:09:07

Slave Interface: eth0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 4c:76:25:97:b1:75
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 4c:76:25:97:b1:78
Aggregator ID: 1
Slave queue ID: 0

Other information

  • Restarting the broken interface (ifconfig eth1 down followed by ifconfig eth1 up) fixes the problem

  • Occasionally during bootup we see the following message in our syslog (we do not use IPv6); however, the problem occurs even when this message is not logged:

    Oct  2 11:27:51 ab30 kernel: bond0: IPv6 duplicate address fe80::4e76:25ff:fe87:9d75 detected!
    
  • Output from syslog during configuration:

    Oct  5 07:44:31 ab31 kernel: bonding: bond0 is being created...
    Oct  5 07:44:31 ab31 kernel: bonding: bond0 already exists
    Oct  5 07:44:31 ab31 kernel: bond0: Setting MII monitoring interval to 100
    Oct  5 07:44:31 ab31 kernel: bond0: Setting MII monitoring interval to 100
    Oct  5 07:44:31 ab31 kernel: ADDRCONF(NETDEV_UP): bond0: link is not ready
    Oct  5 07:44:31 ab31 kernel: bond0: Setting MII monitoring interval to 100
    Oct  5 07:44:31 ab31 kernel: bond0: Adding slave eth0
    Oct  5 07:44:31 ab31 kernel: bnx2x 0000:01:00.0: firmware: requesting bnx2x/bnx2x-e2-7.10.51.0.fw
    Oct  5 07:44:31 ab31 kernel: bnx2x 0000:01:00.0: eth0: using MSI-X  IRQs: sp 120  fp[0] 122 ... fp[7] 129
    Oct  5 07:44:31 ab31 kernel: bnx2x 0000:01:00.0: eth0: NIC Link is Up, 10000 Mbps full duplex, Flow control: none
    Oct  5 07:44:31 ab31 kernel: bond0: Enslaving eth0 as a backup interface with an up link
    Oct  5 07:44:31 ab31 kernel: bond0: Adding slave eth1
    Oct  5 07:44:31 ab31 kernel: bnx2x 0000:01:00.1: firmware: requesting bnx2x/bnx2x-e2-7.10.51.0.fw
    Oct  5 07:44:31 ab31 kernel: bnx2x 0000:01:00.1: eth1: using MSI-X  IRQs: sp 130  fp[0] 132 ... fp[7] 139
    Oct  5 07:44:31 ab31 kernel: bnx2x 0000:01:00.1: eth1: NIC Link is Up, 10000 Mbps full duplex, Flow control: none
    Oct  5 07:44:31 ab31 kernel: bond0: Enslaving eth1 as a backup interface with an up link
    Oct  5 07:44:31 ab31 kernel: ADDRCONF(NETDEV_UP): bond0: link is not ready
    Oct  5 07:44:31 ab31 kernel: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
    
  • The bond0 interface is joined to the multicast group, as seen by ip maddr:

    ...
    4:      bond0
    inet  239.0.10.200 users 16
    ...
    
  • Everything works on other machines on the same network. However, it seems (not 100% confirmed) that the working machines have a different network adapter:

    01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
    
  • When checking our switch statistics we can see data being sent to both interfaces.

What we have tried so far

So far, restarting the broken interface (as described above) is the only workaround we have found. Any hints on how to troubleshoot this further are appreciated. If more information is needed, please let me know.

K Erlandsson
  • Are the aggregation IDs still the same? `cat /proc/net/bonding/bond0` should show the same `Aggregator ID` for each bonded interface. – Cameron Kerr Oct 05 '15 at 13:19
  • If an interface restart helps resolve it, then perhaps the NIC needed to be powercycled (different firmware being downloaded?). A cold boot would have been useful perhaps. – Cameron Kerr Oct 05 '15 at 13:23
  • @CameronKerr `Aggregator ID`s are the same. I added the contents of `/proc/net/bonding/bond0` to the question. I do not think we have tried a cold boot, will do! – K Erlandsson Oct 05 '15 at 13:26
  • @CameronKerr Thank you for your suggestions. Unfortunately no luck with a cold boot. – K Erlandsson Oct 05 '15 at 13:51
  • Presumably you have a Red Hat subscription; their technical support is very good. Sounds like a potential firmware issue, and the logs say that it is requesting a particular firmware version. – Cameron Kerr Oct 05 '15 at 13:57
  • Also, it's possible that a linecard in your switch could do with a restart? – Cameron Kerr Oct 05 '15 at 14:01
  • @CameronKerr Indeed, the next step will be to contact Red Hat. I was just hoping we were missing something obvious that the community could detect easily, but I fear that is not the case. Our switch seems to be functioning; we can see data being transmitted to both interfaces, so it seems to be discarded when it reaches the machine. I will investigate whether it is possible to restart switch components. – K Erlandsson Oct 05 '15 at 14:17
  • @KErlandsson, do you have any update from Red Hat? We think we're seeing the exact same problem. – wolfcastle Oct 22 '15 at 22:10
  • @wolfcastle Some of our people are working with Dell support on the matter right now (we are using Dell servers, so they decided to involve them first). From what I have heard they have made some firmware upgrades, but that has not helped, so the issue is still unresolved. I will come back here and update when I know more. – K Erlandsson Oct 23 '15 at 06:32
  • We're seeing the same issue with any new server or upgraded server running Ubuntu 12. Dell hardware also... – tc0nn Nov 20 '15 at 18:43
  • @wolfcastle See my answer for the latest information about this problem, in case you still suffer from the same issue. – K Erlandsson May 25 '16 at 11:19

1 Answer


We were using Dell blade servers when this problem appeared. After working with Dell support, it turned out that we were joining the multicast group with IGMPv3 EXCLUDE filtering. Apparently EXCLUDE mode is not supported by the switch in the blade enclosure, and the recommendation was to switch to IGMPv3 INCLUDE filter mode instead.

However, we have since stopped using multicast on our platform, so we will probably never get around to trying these changes. Hence, I cannot say for sure that this was the root cause.
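
For anyone who wants to try the suggested change: with a normal any-source join (IP_ADD_MEMBERSHIP, as in the receiver sketch in the question) the Linux kernel announces the group in IGMPv3 EXCLUDE mode, while source-specific joins are announced in INCLUDE mode. Below is a minimal sketch of an INCLUDE-mode join on Linux, assuming the sender addresses are known up front; the port and source addresses are placeholders, and as noted above we never verified this ourselves:

    #!/usr/bin/env python
    # Sketch of an IGMPv3 INCLUDE-mode join via source-specific memberships.
    import socket
    import struct

    GROUP = "239.0.10.200"
    PORT = 5000                                 # placeholder port
    SOURCES = ["10.91.20.101", "10.91.20.102"]  # placeholder sender addresses

    # Linux value of IP_ADD_SOURCE_MEMBERSHIP; older Pythons do not expose it.
    IP_ADD_SOURCE_MEMBERSHIP = getattr(socket, "IP_ADD_SOURCE_MEMBERSHIP", 39)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((GROUP, PORT))

    for src in SOURCES:
        # struct ip_mreq_source, Linux field order:
        # (group address, local interface, source address)
        mreq = struct.pack("4s4s4s",
                           socket.inet_aton(GROUP),
                           socket.inet_aton("0.0.0.0"),
                           socket.inet_aton(src))
        sock.setsockopt(socket.IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, mreq)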

K Erlandsson