I'm investigating an intermittent network failure. Looking at /proc/net/dev, I see that the WireGuard interface reports a varying number of transmit (TX) errors on every VM in the cluster; the count scales more or less with the volume of traffic moving through the interface.

However, it reports no errors on the receiving side, and the underlying interface reports no errors whatsoever.

How should I interpret this situation? Is this expected? Is this a bug in WireGuard? Can this possibly be the reason for intermittent loss of connectivity?

ip -s -s link show wg0
4: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/none 
    RX: bytes  packets  errors  dropped overrun mcast   
    18229925236 135673958 0       0       0       0       
    RX errors: length   crc     frame   fifo    missed
               0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    951387346088 775255612 26278   0       0       0       
    TX errors: aborted  fifo   window heartbeat transns
               0        0       0       0       0       

This is here just to give you some illustration of what I'm seeing.
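
To track how the TX error count scales with traffic, the counters can be read from /proc/net/dev directly. The following is a minimal sketch, not a definitive tool: the function names `parse_proc_net_dev` and `tx_error_ratio` are made up for illustration, and the `SAMPLE` text is reconstructed from the `ip -s -s link` output above, with the fields that output does not show (such as `compressed`) assumed to be 0.

```python
# Sketch: parse /proc/net/dev-style text and compute the TX error ratio.
# Column order follows the two header lines of /proc/net/dev:
# 8 receive counters followed by 8 transmit counters per interface.

RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")

# Reconstructed from the question's numbers; fields not shown by
# `ip -s -s link` are assumed to be 0.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
   wg0: 18229925236 135673958 0 0 0 0 0 0 951387346088 775255612 26278 0 0 0 0 0
"""


def parse_proc_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        values = [int(v) for v in rest.split()]
        if len(values) != 16:
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:])),
        }
    return stats


def tx_error_ratio(stats, iface):
    """TX errors divided by TX packets for one interface (0.0 if no packets)."""
    tx = stats[iface]["tx"]
    return tx["errs"] / tx["packets"] if tx["packets"] else 0.0


if __name__ == "__main__":
    stats = parse_proc_net_dev(SAMPLE)
    print(f"wg0 TX error ratio: {tx_error_ratio(stats, 'wg0'):.6%}")
```

On a live host, the same function could be fed the contents of /proc/net/dev (for example `open("/proc/net/dev").read()`) and sampled periodically to compare the error rate against traffic volume.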

wvxvw

1 Answer

WireGuard uses UDP, so the sending side will almost never report any errors, and neither will the receiving side. If packets get lost (dropped) on their way, the receiving WireGuard interface will report that; your physical interfaces will not.

And yes, dropped packets can be (and almost certainly are) the culprit here.

bjoster
  • But it has a specific column for dropped packets, why not report it there? Also, the errors appear in TX, not RX (so the reporter is the transmitter, not the receiver). The receiving interface reports no errors. Sorry I wasn't clear: "transmission errors" here are used literally to mean errors in TX column, not "errors generally happening when transmitting packets". – wvxvw Mar 04 '20 at 10:52
  • "But it has a specific column for dropped packets, why not report it there?" - Sounds like they are not dropped, but changed (which makes them unusable in the VPN). – bjoster Mar 04 '20 at 11:18
  • Most probably a WireGuard interface would only report dropped packets when a receiver rejected incoming packets with ICMP messages. Because of the lower MTU of a VPN interface, it is important that all clients using the tunnel have PMTUD enabled. Or you could set the VM interfaces to an MTU of 1420 as well, to match. – Gerrit Mar 04 '20 at 11:27
  • This doesn't seem to explain the situation... if the MTU was the problem, then I'd expect to see *a lot more* errors. Also, my impression so far was that errors reported in TX section are of a kind "I needed to write Ethernet frame, but ARP request didn't know how to map my MAC to IP" or something like that. I.e. errors that happen *before* the packet is sent, not after it is sent but then reported as erroneous (in that case, I'd expect the receiver to also have record of the problem, which it hasn't). – wvxvw Mar 04 '20 at 11:56
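
On the MTU figure mentioned in the comments: the value 1420 shown for wg0 is consistent with a 1500-byte underlying MTU minus WireGuard's per-packet overhead. The sketch below works through that arithmetic. The function name `wireguard_inner_mtu` is hypothetical, and the 1500-byte outer MTU is an assumption, since the question does not show the underlying interface's MTU.

```python
# Sketch: per-packet overhead of a WireGuard data packet.
IPV4_HEADER = 20       # outer IPv4 header without options
IPV6_HEADER = 40       # outer IPv6 header
UDP_HEADER = 8
WG_DATA_HEADER = 16    # type + reserved (4), receiver index (4), counter (8)
WG_AUTH_TAG = 16       # Poly1305 authentication tag


def wireguard_inner_mtu(outer_mtu, ipv6=True):
    """Largest inner packet that fits in one outer packet without fragmentation."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return outer_mtu - ip_header - UDP_HEADER - WG_DATA_HEADER - WG_AUTH_TAG


if __name__ == "__main__":
    # Assumed outer MTU of 1500 (typical Ethernet); not stated in the question.
    print(wireguard_inner_mtu(1500))              # IPv6 transport
    print(wireguard_inner_mtu(1500, ipv6=False))  # IPv4 transport
```

With an IPv6 outer header the result matches the 1420 seen on wg0, which is why that value is safe over either IP version when the path MTU is 1500.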