The diagram below depicts a scenario in which the throughput of three slow channels is aggregated over a WAN.
A fast host on the WAN (@ 54.239.98.8) is communicating with a host on a LAN (@ 192.168.0.100), which is connected to the WAN via three slow channels through three routers running Linux v4.14.151 and netfilter/iptables firewalls:
The IP traffic from the fast host arrives fragmented, with the fragments distributed randomly across the three routers (but always with source address 54.239.98.8). I have no control over this fragmentation (corporate politics, go figure); I suspect it is done on purpose by the fast host.
THE PROBLEM: Each router attempts to reassemble the fragmented IP packets. This leads to data loss, because the fragments take random paths through the three routers and a single router often cannot collect all of the fragments of a packet, so its reassembly fails.
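The failure condition can be stated as a small userspace sketch. It is only a model of the situation described above, not router code: the function name `router_can_reassemble` and the array `router_of` (the index of the router that carried each fragment) are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Model: router_of[i] is the index (0..2) of the router that carried
 * fragment i of one IP packet. A router can reassemble the packet only
 * if every fragment passed through that same router.
 */
static bool router_can_reassemble(const int *router_of, size_t nfrags)
{
    assert(router_of != NULL && nfrags > 0);

    for (size_t i = 1; i < nfrags; i++) {
        if (router_of[i] != router_of[0])
            return false;   /* fragments are split across routers */
    }
    return true;            /* one router saw every fragment */
}
```

For example, if the three fragments of a packet travel through routers 0, 2 and 0, no router holds the complete set and the function returns false, which corresponds to the loss described above.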
Upon analyzing the iptables/netfilter diagram below, I can see that the offending reassembly occurs in the PREROUTING netfilter hook, before the rule chains in the raw table are processed.
THE ATTEMPTED SOLUTION: I have modified the kernel module nf_defrag_ipv4 to disable the offending defragmentation in the PREROUTING hook as follows:
static const struct nf_hook_ops ipv4_defrag_ops[] = {
    {
        .hook     = ipv4_conntrack_defrag, /* I changed this to point to: return NF_ACCEPT; */
        .pf       = NFPROTO_IPV4,
        .hooknum  = NF_INET_PRE_ROUTING,
        .priority = NF_IP_PRI_CONNTRACK_DEFRAG,
    },
    {
        .hook     = ipv4_conntrack_defrag,
        .pf       = NFPROTO_IPV4,
        .hooknum  = NF_INET_LOCAL_OUT,
        .priority = NF_IP_PRI_CONNTRACK_DEFRAG,
    },
};
The complete source code of this module can be viewed here.
This code alteration disables the reassembly of all incoming packets and lets the unaltered IP fragments pass through to the destination host on the LAN (@ 192.168.0.100), which performs its own reassembly using the fragments arriving from all three routers. This solution works, but it is ugly: it modifies kernel code and disables defragmentation for ALL forwarded packets, regardless of their source.
THE QUESTION: Is there a better solution than making this code change in the kernel?
In particular, is there a way to selectively disable IP defragmentation only for packets coming from the fast host on the WAN (ip.src == 54.239.98.8)?
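The selective behaviour asked for can be sketched in userspace as a per-packet decision: leave a packet unassembled only if it is a fragment and its source address is 54.239.98.8. This is an illustration of the intended logic, not kernel code, and it says nothing about where in netfilter such a check could be placed. The struct `frag_key`, the macro names, and the functions `is_fragment` and `should_skip_defrag` are hypothetical; only the flag and mask values (MF = 0x2000, offset mask = 0x1FFF) come from the IPv4 header layout.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

#define FAST_HOST    "54.239.98.8"  /* source address from the question */
#define FRAG_MF      0x2000u        /* IPv4 "more fragments" flag */
#define FRAG_OFFMASK 0x1FFFu        /* IPv4 fragment offset mask */

/* Hypothetical view of the two IPv4 header fields the decision needs. */
struct frag_key {
    uint32_t saddr;     /* source address, network byte order */
    uint16_t frag_off;  /* flags + fragment offset, network byte order */
};

/* A packet is a fragment if MF is set or its fragment offset is nonzero. */
static bool is_fragment(const struct frag_key *k)
{
    return (ntohs(k->frag_off) & (FRAG_MF | FRAG_OFFMASK)) != 0;
}

/* True only for fragments whose source is the fast host. */
static bool should_skip_defrag(const struct frag_key *k)
{
    struct in_addr fast;
    int rc = inet_pton(AF_INET, FAST_HOST, &fast);

    assert(rc == 1);
    (void)rc;
    return is_fragment(k) && k->saddr == fast.s_addr;
}
```

Under this rule, fragments from any other source and unfragmented packets from 54.239.98.8 would still go through the normal defragmentation path.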