Team interface drops packets


After upgrading from kernel 3.10.104 to 4.14, I see constant TX drops (around 10 packets/sec) on a team interface in active-backup mode. Increasing txqueuelen on the physical port helps, but drops still occur from time to time.

Here is the ifconfig output for the team interface, with txqueuelen 0 on eth6:

team0      Link encap:Ethernet  HWaddr 00:1E:67:B5:7F:76
      inet addr:192.168.221.203  Bcast:0.0.0.0  Mask:255.255.255.0
      inet6 addr: fe80::21e:67ff:feb5:7f76/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
      RX packets:837 errors:0 dropped:671 overruns:0 frame:0
      TX packets:282219490 errors:0 dropped:169438 overruns:0 carrier:0
      collisions:0 txqueuelen:10000
      RX bytes:27531 (26.8 KiB)  TX bytes:113995992408 (106.1 GiB)

Physical interface:

eth6      Link encap:Ethernet  HWaddr 00:1E:67:B5:7F:76  
      UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
      RX packets:837 errors:0 dropped:0 overruns:0 frame:0
      TX packets:282218759 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:0 
      RX bytes:74718 (72.9 KiB)  TX bytes:383252978322 (356.9 GiB)
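To put a number on the drop rate rather than eyeballing ifconfig, the counters can be sampled directly. The following is only a sketch: it assumes the kernel exports per-interface counters under /sys/class/net/<iface>/statistics/, and the helper names (read_tx_dropped, drop_rate, watch) are made up for illustration.

```python
import time
from pathlib import Path


def read_tx_dropped(iface, sysfs_root="/sys/class/net"):
    """Return the cumulative tx_dropped counter exported for iface."""
    path = Path(sysfs_root) / iface / "statistics" / "tx_dropped"
    return int(path.read_text().strip())


def drop_rate(before, after, seconds):
    """Drops per second between two samples of a cumulative counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (after - before) / seconds


def watch(ifaces, interval=1.0, samples=5, sysfs_root="/sys/class/net"):
    """Print the TX drop rate of each interface once per interval."""
    prev = {name: read_tx_dropped(name, sysfs_root) for name in ifaces}
    for _ in range(samples):
        time.sleep(interval)
        for name in ifaces:
            cur = read_tx_dropped(name, sysfs_root)
            rate = drop_rate(prev[name], cur, interval)
            print(f"{name}: {rate:.1f} tx drops/s (total {cur})")
            prev[name] = cur
```

Calling watch(["team0", "eth6"]) while changing txqueuelen would show whether the rate on team0 moves while eth6 stays at zero, as in the output above.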

To debug this, I added a printk in the team driver wherever dev_kfree_skb_any is called. However, although the dropped counter keeps increasing, no messages appear in dmesg until I reset the physical interface with ethtool. The packets must therefore be dropped somewhere else, and so far I have no clue how to find the exact cause.

I also tried the dropwatch utility. It reports numerous drops, but it produces similar output on kernel 3.10, where ifconfig shows no drops. I assume it also logs other things, perhaps input packets dropped by the kernel, so the useful information gets lost in the noise. Here is the dropwatch output on 4.14, reduced to unique symbols:

__udp4_lib_rcv+6b0 (0xffffffff8155ce77)
ip_forward+98 (0xffffffff81534f88)
kfree_skb_list+13 (0xffffffff814f0d22)
sk_stream_kill_queues+4a (0xffffffff814f68a9)
tcp_v4_do_rcv+154 (0xffffffff815508e9)
tcp_v4_rcv+1de (0xffffffff81552490)
unix_stream_connect+3b4 (0xffffffff81586890)
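One way to cut through the noise would be to count drops per symbol on both kernels and compare the two lists. This is a rough sketch, not a tested tool: it assumes each captured line looks like "<count> drops at <symbol>+<offset> (<address>)" or like the bare "<symbol>+<offset> (<address>)" form listed above, and count_drops and new_sites are hypothetical helper names.

```python
import re
from collections import Counter

# Assumed line shape: optional "<count> drops at " prefix, then
# "<symbol>+<hex offset> (<0x address>)".
LINE_RE = re.compile(
    r"^\s*(?:(\d+)\s+drops?\s+at\s+)?"
    r"([A-Za-z_][\w.]*)\+([0-9a-fA-F]+)\s+\((0x[0-9a-fA-F]+)\)"
)


def count_drops(lines):
    """Sum drop counts per 'symbol+offset'; unparsable lines are skipped."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        # A line without an explicit count is treated as a single drop.
        count = int(match.group(1)) if match.group(1) else 1
        totals[f"{match.group(2)}+{match.group(3)}"] += count
    return totals


def new_sites(old_totals, new_totals):
    """Return drop sites that appear in new_totals but not in old_totals."""
    return sorted(set(new_totals) - set(old_totals))
```

Offsets differ between kernel builds, so comparing 3.10 against 4.14 would probably need the offset stripped first. Comparing two captures on 4.14, one with drops occurring and one on an idle link, avoids that problem.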

Do you have any ideas/suggestions on how to debug this? Thanks!

yy7k

Posted 2019-09-22T21:55:39.493

Reputation: 9

No answers