5

The Bacula backups of one of my servers (“A”), which is connected via IPSec (Strongswan on Debian testing) to a storage daemon (“B”), fail to finish about 95% of the time. What apparently happens is:

  1. Bacula opens a TCP connection to the storage daemon's VPN IP. (A → B)
  2. Because net.ipv4.ip_no_pmtu_disc=0 is the kernel default, the IP Don't Fragment (DF) bit is set in the plaintext packet.
  3. When routing the packet into the IPSec tunnel, the DF bit of the payload is copied to the IP header of the ESP packet.
  4. After some time (often around 20 minutes) and up to several gigabytes of data sent, an ESP packet slightly larger than the previous ones is sent. (A → B)
  5. As the storage daemon's interface has a lower MTU than that of the sending host, a router along the way sends an ICMP type 3, code 4 (Fragmentation Needed and Don't Fragment was Set) error to the host. (some router → A)
  6. The connection stalls. For some reason, host A floods B with ~100 empty duplicate ACKs (within ~20 ms).
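
To illustrate how a slightly larger payload (step 4) can push the ESP packet over the path MTU, here is a rough sketch of the size arithmetic. The cipher parameters are assumptions, not taken from the setup above: tunnel mode, AES-CBC (16-byte IV and block size), HMAC-SHA1-96 (12-byte ICV), and UDP encapsulation. The function names are made up for this example.

```python
# Assumed ESP parameters (hypothetical, adjust to the negotiated SA):
OUTER_IP = 20    # outer IPv4 header without options
UDP_ENCAP = 8    # UDP header used for ESP-in-UDP encapsulation
ESP_HEADER = 8   # SPI (4 bytes) + sequence number (4 bytes)
IV_LEN = 16      # AES-CBC initialization vector
ICV_LEN = 12     # HMAC-SHA1-96 integrity check value
BLOCK = 16       # AES block size; the encrypted part is padded to this


def esp_packet_size(inner_len: int) -> int:
    """Size of the outer packet carrying an inner IP packet of inner_len bytes."""
    # Encrypted part = inner packet + padding + 2 trailer bytes
    # (pad length, next header), rounded up to a whole block.
    encrypted = -(-(inner_len + 2) // BLOCK) * BLOCK
    return OUTER_IP + UDP_ENCAP + ESP_HEADER + IV_LEN + encrypted + ICV_LEN


def exceeds_path_mtu(inner_len: int, path_mtu: int) -> bool:
    """True if the resulting ESP packet would not fit through path_mtu."""
    return esp_packet_size(inner_len) > path_mtu
```

Under these assumptions, a 1422-byte inner packet yields a 1488-byte ESP packet, while a 1423-byte one jumps to 1504 bytes because of block padding, so a single extra byte is enough to exceed a 1500-byte hop.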

(The ICMP packets are reaching host A and there are no iptables rules in place that block ICMP.)

Possible reasons I can think of:

  • Kernel bug (Debian 3.13.7-1)
  • Linux' IPSec implementation intentionally ignores the PMTU message as a security measure since it is unprotected and would affect an existing SA. (seems to be valid behavior according to RFC 4301 8.2.1)
  • Something to do with PMTU aging (RFC 4301 8.2.2)

What is the best way to fix this, without disabling PMTU discovery globally or lowering the interface MTU? Maybe clear the DF bit somehow like FreeBSD does with ipsec.dfbit=0?

al.
    I was able to confirm that disabling PMTU discovery by setting `net.ipv4.ip_no_pmtu_disc=1` indeed fixes this particular problem. It's not an ideal solution in my opinion, though. – al. Jul 11 '14 at 08:32

2 Answers

2

You could try creating a rule in iptables to set the TCP MSS for the VPN-destined traffic to a lower value. But without a packet capture it's difficult to guess what's going on.

  • As ESP is sent via UDP I can't tweak the MSS. Of course I could apply MSS restrictions to the unencrypted TCP connection, but that'd be no less a workaround than lowering interface MTU or disabling PMTU discovery altogether. – al. Jul 14 '14 at 00:08
  • Most probably PMTU discovery does not work because the ICMP replies (Fragmentation Needed) from the IPsec server are missing. The advice above is about limiting the MSS for connections over IPsec, not for the underlying ESP tunnel. – stimur Jul 19 '14 at 22:32
  • The ICMP errors are reaching the IPSec host just fine. As the smallest MTU of all hops isn't known, a certain MSS limit would have to be guessed. – al. Jul 20 '14 at 11:55
  • You could run `mturoute` to determine the smallest MTU. –  Jul 20 '14 at 22:44
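
Once the smallest MTU on the path is known, the MSS limit no longer has to be guessed. A minimal sketch of the calculation follows. The default cipher parameters (AES-CBC with HMAC-SHA1-96, tunnel mode) are assumptions, and `max_tcp_mss` is a hypothetical helper name, not part of any tool mentioned here.

```python
def max_tcp_mss(path_mtu: int, iv_len: int = 16, icv_len: int = 12,
                block: int = 16, udp_encap: bool = True) -> int:
    """Largest TCP MSS whose ESP-wrapped packet still fits path_mtu.

    Defaults assume AES-CBC + HMAC-SHA1-96 in tunnel mode (an assumption).
    """
    # Room left for the encrypted part after the outer IPv4 header (20),
    # optional UDP encapsulation (8), ESP header (8), IV and ICV.
    room = path_mtu - 20 - (8 if udp_encap else 0) - 8 - iv_len - icv_len
    # The encrypted part must be a whole number of blocks and includes
    # 2 trailer bytes (pad length, next header).
    inner_mtu = (room // block) * block - 2
    # Subtract the inner IPv4 and TCP headers (20 bytes each, no options).
    return inner_mtu - 20 - 20
```

With these defaults, `max_tcp_mss(1500)` gives 1382 and `max_tcp_mss(1400)` gives 1286. TCP options such as timestamps reduce the usable payload per segment further.
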
0

If PMTU discovery fails in a VPN scenario, this is typically caused by a problem with the public IP addresses of the gateways or of the routers in between, or by filtered ICMP messages. MSS clamping is only an ugly workaround.

eckes