Transparent firewall with nftables and VLANs

Question

I'm looking for best-practice advice on building a transparent firewall.

I have two network segments and a CentOS server with two 10G interfaces. I want to filter, monitor, rate-limit and drop traffic between the segments. The traffic is tagged. Should I untag the traffic for filtering and tag it again afterwards, or can nftables handle it tagged?

The current layout looks like this:

PCs--|                                         |--PCs
PCs--|--untag--[Switch]--tag--[Switch]--untag--|--PCs
PCs--|                                         |--PCs

I want:

PCs--|                                                              |--PCs
PCs--|--untag--[Switch]--tag--**[Firewall]**--tag--[Switch]--untag--|--PCs
PCs--|                                                              |--PCs

When you say "traffic is tagged", do you mean 802.1Q VLAN IDs? Also, is the CentOS host, placed between the two switches, the intended firewall in the second diagram? — Pablo, Jun 29 '17 at 14:29

A.B · Answer 1 · 2018-04-06T20:57:40.360

TL;DR: nftables, at the bridge level, can handle both tagged and untagged packets, using slightly different rules. All the tagging work can be done on the Linux side with a VLAN-aware bridge, so the switch configuration does not need to change, whichever approach is chosen for nftables on the firewall.

A lot of interesting documentation about testing VLANs can be found in this blog series (especially part IV, even if some of the information might not be fully accurate):

Fun with veth-devices, Linux bridges and VLANs in unnamed Linux network namespaces I II III IV V VI VII VIII

Let's build two minimalistic models of the firewall (in a network namespace). trunk100 and trunk200 are linked to the two switches, which send VLAN 100 tagged packets from the left computers and VLAN 200 tagged packets from the right computers. Note that VLAN tags are explicitly allowed to appear on the other side here, either by creating a sub-interface with the other side's VLAN ID or by adding the other side's VLAN ID directly to the trunk interface.

vlan sub-interfaces putting untagged packets in the bridge

ip link add fw0 type bridge vlan_filtering 1
ip link set fw0 up
for trunk in 100 200; do
    for vlan in 100 200; do
        ip link add link trunk$trunk name trunk$trunk.$vlan type vlan id $vlan
        ip link set trunk$trunk.$vlan master fw0
        bridge vlan add vid $vlan pvid untagged dev trunk$trunk.$vlan
        bridge vlan del vid 1 dev trunk$trunk.$vlan
        ip link set trunk$trunk.$vlan up
    done
done
bridge vlan del vid 1 dev fw0 self

In this case the tagged packets arriving through trunk100 and trunk200 are split across per-VLAN sub-interfaces, which strip the tag. The bridge is still internally aware of the VLANs in use and applies VLAN filtering on sources and destinations. nft will add its own restrictions. Outgoing packets are retagged when they reach the parent trunk interface.

tagged packets directly into the bridge

ip link add fw0 type bridge vlan_filtering 1
ip link set fw0 up
for trunk in 100 200; do
    ip link set trunk$trunk master fw0
    for vlan in 100 200; do
        bridge vlan add vid $vlan tagged dev trunk$trunk
    done
    bridge vlan del vid 1 dev trunk$trunk
    ip link set trunk$trunk up
done
bridge vlan del vid 1 dev fw0 self

For this simpler case, the tagged packets traverse the bridge while retaining their vlan tag.
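
Whichever variant is used, the resulting VLAN membership can be inspected before any nftables rule is added. A minimal sketch, assuming iproute2's bridge tool is installed and the commands are run as root inside the firewall's network namespace; show_fw_vlans is a hypothetical wrapper name, not an existing command:

```shell
# show_fw_vlans is a hypothetical wrapper; it only reads state.
show_fw_vlans() {
    # Per-port VLAN table of the bridge ports (PVID / Egress Untagged flags).
    bridge vlan show
    # Details of the bridge itself; -d includes the vlan_filtering setting.
    ip -d link show dev fw0
}
```

Calling show_fw_vlans after the first setup should list one VLAN per sub-interface, flagged as PVID and untagged on egress; after the second setup each trunk should list VLANs 100 and 200 without those flags.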

Here is a single nftables ruleset showing how both cases are handled. iifname was chosen instead of iif so that the same set of rules can be loaded in both cases: iif is resolved to an interface index at load time and fails if the interface is missing, whereas iifname compares the name as a string. Normally iif should be preferred. The additional counter entries are there only to check what exactly did or didn't match (with nft list ruleset -a):

#!/usr/sbin/nft -f

flush ruleset

table bridge filter {
    chain input {
        type filter hook input priority -200; policy drop;
    }

    chain forward {
        type filter hook forward priority -200; policy drop;
        counter
        arp operation request counter
        arp operation reply counter
        vlan type arp arp operation request counter
        vlan type arp arp operation reply counter
        arp operation request counter accept
        arp operation reply counter accept
        vlan type arp arp operation request counter accept
        vlan type arp arp operation reply counter accept
        ip protocol icmp icmp type echo-request counter
        ip protocol icmp icmp type echo-reply counter
        vlan type ip icmp type echo-request counter
        vlan type ip icmp type echo-reply counter
        iifname trunk100.100 ip protocol icmp icmp type echo-request counter accept
        oifname trunk100.200 ip protocol icmp icmp type echo-reply counter accept
        vlan id 100 vlan type ip icmp type echo-request counter accept
        vlan id 200 vlan type ip icmp type echo-reply counter accept
    }

    chain output {
        type filter hook output priority 200; policy drop;
    }
}

Note that these rules could have been written even more verbosely. Example:

iifname "trunk100.100" ether type ip ip protocol icmp icmp type echo-request

or

ether type vlan vlan id 200 vlan type ip ip protocol icmp icmp type echo-reply

When the first setup is in use (untagged packets through sub-interfaces), only the classical rules will match. When the second setup is in use, only the rules explicitly using vlan will match. So this dual set of rules, which allows basic ARP resolution and lets VLAN 100 ping VLAN 200 but not the other way around, works in both cases.

This set of rules should work with CentOS' nftables v0.6 (not tested on CentOS' kernel) or the current nftables v0.8.3.

Current known limitations:

nftables as of v0.8.3 cannot use conntrack at the bridge level the way it was possible through ebtables/iptables interactions. There appear to be plans for it, see this PDF: bridge filtering with nftables. This makes stateful rules very difficult to implement.

Note also that nftables still has display issues (as of 0.8.3): nft list ruleset -a drops vlan from the "decompiled" rules if none of its options are used. For example, take these two rules:

nft add rule bridge filter forward ip protocol icmp counter
nft add rule bridge filter forward vlan type ip ip protocol icmp counter

When displayed back with nft list ruleset -a (v0.8.3):

        ip protocol icmp counter packets 0 bytes 0 # handle 23
        ip protocol icmp counter packets 0 bytes 0 # handle 24

Only nft --debug=netlink list ruleset -a, which dumps the bytecode, makes it clear that these are indeed two different rules (the data is shown in little-endian order):

bridge filter forward 23 22 
  [ payload load 2b @ link header + 12 => reg 1 ]
  [ cmp eq reg 1 0x00000008 ]
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ cmp eq reg 1 0x00000001 ]
  [ counter pkts 0 bytes 0 ]

bridge filter forward 24 23 
  [ payload load 2b @ link header + 12 => reg 1 ]
  [ cmp eq reg 1 0x00000081 ]
  [ payload load 2b @ link header + 16 => reg 1 ]
  [ cmp eq reg 1 0x00000008 ]
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ cmp eq reg 1 0x00000001 ]
  [ counter pkts 0 bytes 0 ]
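
To read the comparison values in this dump: the registers are printed as a little-endian host sees them, so the two bytes of each 16-bit field appear swapped. A minimal sketch of the conversion, using a hypothetical helper named swap16 (it is not part of nftables):

```shell
# swap16 is a hypothetical helper, not an nft feature: it swaps the two
# low-order bytes of a value copied from the netlink debug output.
swap16() {
    printf '0x%04x\n' $(( (($1 & 0xff) << 8) | (($1 >> 8) & 0xff) ))
}

swap16 0x00000008   # value compared at link header + 12 by rule 23
swap16 0x00000081   # value compared at link header + 12 by rule 24
```

This prints 0x0800 (the IPv4 EtherType) and 0x8100 (the 802.1Q EtherType): handle 23 matches untagged IPv4 frames, while handle 24 first matches the 802.1Q EtherType and then the encapsulated IPv4 type at offset 16.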

CentOS' v0.6 (tested on kernel 4.15) also has its own, different "decompile" display problem:

ip protocol icmp icmp type echo-request

is displayed as:

icmp type echo-request counter

which is a syntax error if fed back as is to v0.6 (but is fine in v0.8.3).

with Linux kernel 5.3 (and nftables 0.9.2) it's at last possible to have stateful firewalling at bridge level: it's handled by newer modules `nf_conntrack_bridge`, `nft_meta_bridge` & co. — A.B, Sep 16 '19 at 13:58
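
Building on the comment above: on such a kernel, the pair of one-way ICMP rules could in principle be replaced by connection tracking. A minimal, untested sketch for the sub-interface setup, assuming kernel 5.3 or later and nftables 0.9.2 or later; the table name stateful_sketch and the file path are made up for illustration:

```shell
# Write a hypothetical stateful variant of the forward chain to a file.
# Assumption: kernel >= 5.3 and nftables >= 0.9.2; not tested here.
ruleset=${TMPDIR:-/tmp}/bridge-stateful.nft
cat > "$ruleset" <<'EOF'
table bridge stateful_sketch {
    chain forward {
        type filter hook forward priority 0; policy drop;
        # replies and related traffic of flows that were already accepted
        ct state established,related accept
        # ARP is not tracked by conntrack, so it is still allowed statelessly
        ether type arp accept
        # only the VLAN 100 side may start a ping
        iifname "trunk100.100" icmp type echo-request ct state new accept
    }
}
EOF
# As root: check the syntax first, then load it.
#   nft -c -f "$ruleset" && nft -f "$ruleset"
```

With this approach the echo-reply no longer needs its own rule: it is accepted as part of the established flow, and a ping started from the other side is still dropped by the chain policy.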
