
I have an NFS mount over a Strongswan IPSec tunnel, which is encapsulated in a 6to4 tunnel. I use IPSec because I need encryption for NFS traffic, and the 6to4 tunnel because the VPS provider won't assign a native IPv6 prefix to my server. Because I had MTU problems with the 6to4 tunnel, I had to lower the MTU on the tunnel interface to the minimum of 1280 (if I try to set anything lower, I get an "Error: mtu less than device minimum." message).

NFS still wants to send packets over MTU. I know this, because I have an nftables rule to log ESP packets:

    chain output {
        type filter hook output priority filter; policy accept;
        ip6 nexthdr esp counter packets 303367 bytes 323173696 log accept
    }

Thus I see these packets logged in syslog/journal:

Jan 29 21:41:18 nfsclient kernel: IN= OUT=he-ipv6 SRC=fd48:2b50:6a95:a6db:0000:0000:0000:0004 DST=fdc8:d5f9:cbbf:b206:0000:0000:0000:2001 LEN=1316 TC=0 HOPLIMIT=64 FLOWLBL=155038

(IPs are changed to private for privacy reasons.)

I can't see the logged packets with tcpdump, presumably because the kernel drops them for being over the MTU. I assume NFS tries to adhere to the MTU setting, but it doesn't know that its packets will be encapsulated in IPSec. So even if NFS generates a packet of at most 1280 bytes, the ESP header added to it pushes it over the set MTU. I also suspect that NFS sets the DF flag on its packets, because otherwise fragmentation would work. (I tested it with ping6 -M want and fragmented packets went through.) So I can't lower the MTU any further, NFS insists on sending packets that will be over the MTU once encrypted, and it even sets the DF flag.
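As a rough check of these numbers, here is a minimal sketch that compares the LEN= value from the log line with the tunnel MTU. It assumes that LEN= is the size of the whole IPv6 packet including its header, and that the unencrypted packet was exactly MTU-sized; the function names are made up for illustration.

```shell
#!/bin/sh
# Figures from the question: MTU of he-ipv6 and LEN= of the logged ESP packet.
# Assumption: LEN= counts the full IPv6 packet, header included.
TUNNEL_MTU=1280
LOGGED_LEN=1316

# Bytes by which the encrypted packet exceeds the tunnel MTU.
esp_excess() {
    echo $(( $1 - $2 ))
}

# Largest pre-encryption packet that would still fit, assuming the original
# packet was exactly MTU-sized and the ESP overhead stays the same
# (in practice padding can make it vary by a few bytes).
max_plain_packet() {
    overhead=$(( $1 - $2 ))
    echo $(( $2 - overhead ))
}

echo "over MTU by: $(esp_excess "$LOGGED_LEN" "$TUNNEL_MTU") bytes"
echo "plaintext packet budget: $(max_plain_packet "$LOGGED_LEN" "$TUNNEL_MTU") bytes"
```

Under those assumptions the unencrypted packets would have to stay a few dozen bytes below the interface MTU for the encrypted result to fit.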

What can I do now? The following things I thought about, but don't know how to implement:

  1. Set a maximum packet size for NFS, e.g. with a mount option. I don't think there is such a setting; I already searched for one.
  2. Configure Strongswan to deal with the situation, but I didn't find such an option either.
  3. Set an nftables rule that somehow notifies NFS that it should generate smaller packets, e.g. by reporting an even lower MTU to NFS when it looks for it. I don't know if that's even possible.
  4. Remove the DF flag from the packets to force fragmentation. I don't know how to do that either, or whether it's possible.
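One possible shape for idea 3, untested on this setup, is TCP MSS clamping: rewriting the maximum segment size in the TCP handshake so both ends send smaller segments. The sketch below only prints an nftables table and applies nothing. It assumes NFS runs over TCP on port 2049, that the output and input hooks see the packets in unencrypted form, and that an MSS of 1180 leaves enough room for ESP; the table name `nfs_mss` and the helper `render_mss_clamp` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints an nftables table that clamps the TCP MSS on
# NFS connections (port 2049 assumed). Nothing is applied to the system here.
render_mss_clamp() {
    mss=$1
    cat <<EOF
table inet nfs_mss {
    chain output {
        type filter hook output priority mangle; policy accept;
        tcp dport 2049 tcp flags syn tcp option maxseg size set $mss
    }
    chain input {
        type filter hook input priority mangle; policy accept;
        tcp sport 2049 tcp flags syn tcp option maxseg size set $mss
    }
}
EOF
}

# Assumed budget: 1280 (tunnel MTU) - 36 (ESP overhead guessed from the log)
# - 40 (IPv6 header) - 20 (TCP header) = 1184; 1180 leaves a small margin.
render_mss_clamp 1180
```

If the printed ruleset looks right, it could be loaded as root with `render_mss_clamp 1180 | nft -f -`. The output chain rewrites the MSS this host announces, and the input chain rewrites the one announced by the server, so both directions are covered from the client side.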

I think nftables is the way to go, but if it could be solved at the NFS level, that would be even better. I'd also appreciate solutions using iptables; I can look up the nftables equivalent.

Because it was asked in the comments, here is information about my interfaces.

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether aa:00:11:4d:f7:01 brd ff:ff:ff:ff:ff:ff
4: he-ipv6@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1280 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/sit 192.168.32.84 peer 216.66.84.42

And here are the tunnels (ip tunnel):

sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16
he-ipv6: ipv6/ip remote 216.66.84.42 local 192.168.32.84 ttl 255 6rd-prefix 2002::/16

(I changed my public IPv4 to a private address, but in reality I have a globally routable address for local; 216.66.84.42 is the HE 6to4 tunnel gateway, which is well known, so I left it as is.)

And here is the default route that applies for the traffic:

default via fd48:2b50:6a95:a6db::1 dev he-ipv6 metric 1024 onlink pref medium

So applications believe their packets will go out on he-ipv6, which has an MTU of 1280. But their packets first get encapsulated in IPSec ESP, and then sent through the he-ipv6 tunnel. The result is an IPSec-encrypted NFS data packet encapsulated in a 6to4 packet which itself goes out on the eth0 interface towards 216.66.84.42 (HE gateway).
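To put sizes on that layering, here is a small sketch of what the outermost packet on eth0 would be, assuming the sit encapsulation adds a plain 20-byte IPv4 header without options. The helper name is made up for illustration.

```shell
#!/bin/sh
# Assumption: the IPv6-in-IPv4 (sit) encapsulation adds one IPv4 header
# without options, i.e. 20 bytes.
IPV4_HEADER=20

# Size on eth0 of an IPv6 packet after the sit encapsulation.
outer_size() {
    echo $(( $1 + IPV4_HEADER ))
}

echo "MTU-sized tunnel packet on eth0: $(outer_size 1280) bytes"
echo "logged ESP packet on eth0:       $(outer_size 1316) bytes"
```

Both results are below the 1500-byte MTU of eth0, which suggests that the limit being hit locally is the 1280-byte MTU of he-ipv6 rather than the physical interface. It says nothing about smaller MTUs further along the path.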

MegaBrutal
  • The DF bit doesn't apply directly to the re-encapsulating packets: the kernel network stack may copy it, or it may leave it off the newly created packets that encapsulate the DF'ed payload, so the latter can be fragmented even if the payload forbids it. So your diagnostics are at least incomplete. Don't play with the MTU, play with DF bit copying. – drookie Jan 30 '22 at 05:56
  • Please show the output of `ip link` and tell what interfaces are used for this traffic. – Tero Kilkanen Jan 30 '22 at 08:49
  • @TeroKilkanen, drookie, thanks for your comments! I amended my post with my interface and tunnel details. Please ask if you need to know anything else. – MegaBrutal Jan 30 '22 at 12:15
