I have a Proxmox host with kernel 5.15.19-2-pve.

It has a bond0 interface made from eth2 and eth3, which receives vlan tagged traffic.

I created a vmbr666 bridge that looks like this:

# /etc/network/interfaces:
auto vmbr666
iface vmbr666 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 9220

# brctl show
vmbr666         8000.5a0a13a9dd29       no              bond0
                                                        tap151034i1
# ip -d link sh dev vmbr666
66: vmbr666: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9220 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 5a:0a:13:a9:dd:29 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 
    bridge forward_delay 0 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 1 vlan_protocol 802.1Q bridge_id 8000.5a:a:13:a9:dd:29 designated_root 8000.5a:a:13:a9:dd:29 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer  251.81 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3124 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 

Note that vlan_filtering is 1.

If I tcpdump -enlvvv on bond0, I see traffic for VLAN42. If I tcpdump on vmbr666 or tap151034i1, I don't see traffic for VLAN42 (not even broadcasts or multicasts, even though I do see broadcast traffic of some other VLANs). Question: why not?

Relevant output from bridge -c vlan show:

bond0             1 PVID Egress Untagged
                  2-99
tap151034i1       1 PVID Egress Untagged
                  2-99
vmbr666           1 PVID Egress Untagged
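
To avoid misreading the compressed ranges, the membership check can be scripted. This is only a rough sketch: `ports_with_vid` is a hypothetical helper name (not part of iproute2), and it assumes the text layout of `bridge -c vlan show` shown above, where a port name starts a line and further ranges follow on indented lines.

```shell
# ports_with_vid VID
# Reads `bridge -c vlan show` text on stdin and prints every port whose
# VLAN membership includes VID. Hypothetical helper; assumes the layout
# shown above (port name at line start, extra ranges on indented lines).
ports_with_vid() {
    awk -v vid="$1" '
        NF == 0 { next }                              # skip blank lines
        $1 == "port" && $2 == "vlan-id" { next }      # skip the header line
        {
            first = substr($0, 1, 1)
            if (first == " " || first == "\t") {
                range = $1                            # continuation line of current port
            } else {
                port = $1; range = $2                 # a new port starts here
            }
            if (range == "") next
            n  = split(range, b, "-")                 # "2-99" -> lo=2, hi=99; "1" -> lo=hi=1
            lo = b[1] + 0
            hi = (n > 1 ? b[2] : b[1]) + 0
            if (vid + 0 >= lo && vid + 0 <= hi && !(port in seen)) {
                seen[port] = 1
                print port
            }
        }
    '
}

# Intended use on a live host:
#   bridge -c vlan show | ports_with_vid 42
```

Fed the output above, it lists bond0 and tap151034i1 for VLAN 42, but not vmbr666.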

Like I said, I do see traffic for other VLANs on all of these interfaces, including tags, e.g.

15:03:35.293420 00:50:56:b1:24:0c > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 64: vlan 49, p 0, ethertype ARP (0x0806), Ethernet (len 6), IPv4 (len 4), Request who-has 10.76.155.200 tell 10.76.155.51, length 46

Now let's add vlan 42 to the vmbr666 interface to see if it makes any difference:

# bridge vlan add vid 42 dev vmbr666 self
# bridge -c vlan show dev vmbr666        
port              vlan-id  
vmbr666           1 PVID Egress Untagged
                  42

In tcpdump -enlvvv -i vmbr666 I still don't see anything related to vlan42, just other VLANs (e.g. 49 and 50).

Let's create a subinterface for vlan42 on tap151034i1 like this:

ip link add link tap151034i1 name test type vlan protocol 802.1q id 42 reorder_hdr on gvrp on mvrp on loose_binding off; ip link set up dev test

Running tcpdump -enlvvv -i test I see no traffic at all.

There is a vmbr42, which may interfere (but if so, why does it interfere?):

vmbr42          8000.9a0f54fe1040       no              bond0.42
                                                        fwpr103p0
                                                        fwpr104p0
                                                        fwpr105p0
                                                        fwpr151034p0
                                                        tap102i0

In ip -d link sh:

31: vmbr42: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 9a:0f:54:fe:10:40 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 
    bridge forward_delay 0 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.9a:f:54:fe:10:40 designated_root 8000.9a:f:54:fe:10:40 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer   53.08 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3124 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 

Note that vlan_filtering is 0.

Running tcpdump -enlvvv on vmbr42 or tap102i0, which is one of its members, shows VLAN42 traffic, without tags -- no surprises there.

There are no ebtables or arptables rules.

I guess I don't understand the interplay between VLAN memberships and bridge interfaces in Linux.

Some theoretical questions:

  1. What is the effect of adding a VLAN to a bridge master interface with the self keyword in bridge vlan add?
  2. What is the effect of creating a VLAN subinterface of a bridge member interface?
  3. If a physical interface has a VLAN subinterface, and that's added to a bridge, are any frames for that VLAN supposed to be visible on other bridges the same physical interface is a member of? If not, why not?
  4. What is the difference, from a theoretical as well as practical perspective, between, on the one hand, creating VLAN subinterfaces of physical interfaces and bridging those, and on the other hand, enabling vlan_filtering on a bridge and using bridge vlan pvid untagged to place some member interfaces in specific VLANs?
  5. Can you mix these two approaches?

EDIT: removed stuff that was shown in comments to be irrelevant, and added theoretical questions to hopefully help better structure the answer.

András Korn
  • Are you using vlan-aware bridges or not vlan-aware in Proxmox? Please show e.g. its configuration from `/etc/network/interfaces`. Also, please note that for vlan-aware bridges `brctl` from `bridge-utils` is an *inappropriate* tool; use the `ip` and `bridge` utils from the `iproute2` package (and, by the way, modern Debian uses these to set up bridges nowadays). To inspect VLAN settings use something like `bridge vlan show`, to enslave an interface use `ip link ... set master ...`, and so on. // To see VLAN tags in tcpdump use the `tcpdump -e` option. – Nikita Kipriyanov May 20 '22 at 13:41
  • The bridges were created using the Proxmox GUI and don't show up in `/etc/network/interfaces`. `brctl show` was the only thing I used from `bridge-utils`, and that works fine whether `vlan_filtering` is enabled or not. `vmbr666` has it enabled (so it's "vlan aware"); the others don't. I did check `bridge vlan show` -- maybe you didn't get that far in the question? – András Korn May 20 '22 at 14:17
  • I updated the question so it includes the value of `vlan_filtering` for both bridges I examined. – András Korn May 20 '22 at 14:27
  • Proxmox networking is put into `/etc/network/interfaces` or, probably, some file in the drop directory `/etc/network/interfaces.d/`. That file has the same syntax. – Nikita Kipriyanov May 20 '22 at 16:12
  • Ah, you're right, it dumps it into `interfaces` itself, not even `interfaces.d` (where I looked, expecting all modern software to use the `.d` mechanism). – András Korn May 20 '22 at 16:34
  • Also, use `bridge -c vlan show`. It "compresses" VLAN ranges into a few lines. Also I don't see an entry for `vmbr666` or `vmbr42` in your `bridge vlan show`. Which vlans are enabled on that port? By default Proxmox doesn't enable all vlans on the "host" bridge port. – Nikita Kipriyanov May 20 '22 at 16:35
  • Am I right that you have bond0.42 as the vlan 42 subinterface of bond0, enslaved to vmbr42, while bond0 itself is at the same time a slave of vmbr666? This setup is screwed. What path should 42-tagged packets take: to vmbr42 via the bond0.42 subinterface, or to vmbr666 via bond0? I bet it's the first one. – Nikita Kipriyanov May 20 '22 at 16:43
  • Yes, correct on all counts. I'll grant you the setup doesn't work as expected, but I don't think it's necessarily "screwed". :) I would expect 42-tagged packets to show up in both places -- on bond0.42 without tag, as well as on vmbr666 with tag. If this were a physical switch, I could definitely have as many interfaces in vlan42 as I want, both with and without tagging, simultaneously. I also suspect that vmbr42 "eats" the frames I expect to see on vmbr666, but haven't verified this yet. – András Korn May 20 '22 at 16:50
  • Before the introduction of VLAN-aware bridges, Linux directed everything to the "main" interface if it was in a bridge (eth0.10 should receive frames tagged with vlan 10 on eth0, but if you put eth0 into a bridge, eth0.10 will not see any traffic, because it all goes to the bridge). After the introduction of VLAN-aware bridges, things basically changed to the opposite. – Nikita Kipriyanov May 20 '22 at 16:54

1 Answer

But why should it appear on the bridge interface?

Think of a Linux bridge as a "virtual managed L2 switch", where the bridge interface is the means by which the host itself is connected to the switch. So the bridge interface is considered to be a "switch port" to which the host computer is "connected".

Now, let's swap the "virtual" switch for a real one for a minute. Some port is communicating with some other port. The nature of a switch is that this traffic should not be visible on other ports: it floods traffic only when it has no clue on which port the destination MAC address lives, or when the address is broadcast.

Turning back to our "virtual" setting: the virtual machine's tap port talks to the "physical" port, which is bond0, so why should this traffic be seen on a third, unrelated port (the "host" port, named after the bridge itself)? ARP queries are the only broadcast packets that appear on the network, and those are properly broadcast, so you see them; the rest you don't.

STP BPDUs are a different beast. A bridge with STP processing enabled generates them itself and sends them to each (STP-enabled) port. If you see them on the server, it likely means you misconfigured something. Better to disable STP on the bridge and also configure the port on the other side (the bonded interface, e.g. the Port-Channel if it's Cisco and so on) to be passive for STP (don't send any BPDUs to the port, block the port if a BPDU is received).


UPD:

PVE doesn't enable random vlans on the host-port. This is how my bridge -c vlan show looks:

root@vh2:~# bridge -c vlan show
port              vlan-id  
enp5s0f0          1 PVID Egress Untagged
                  2-4094
vmbr0             1 PVID Egress Untagged
veth105i0         111 PVID Egress Untagged
veth110i0         1 PVID Egress Untagged
                  2-4094
veth107i0         1 PVID Egress Untagged
                  2-4094

(That's the complete output.) The config of this bridge in /etc/network/interfaces is basically like yours. As you can see, vmbr0 (which is the only bridge on this host) doesn't have any VLANs besides 1 (which is actually untagged 108 in this network). So even if I create vmbr0.111 (a VLAN ID 111 subinterface of the bridge), it won't see any traffic until I add that VLAN to the vmbr0 interface, despite the fact that VLAN 111 is very loud there.
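
A minimal sketch of that sequence, using my vmbr0 and VLAN 111 as the example. `host_port_join_vlan` and the `APPLY` switch are hypothetical names made up for illustration; by default the function only prints the commands, so nothing is changed unless you opt in (and run it as root).

```shell
# host_port_join_vlan BRIDGE VID
# Hypothetical helper: prints the commands that make the bridge's own
# host-facing port a member of VID and create a VLAN subinterface on top
# of the bridge. With APPLY=1 in the environment it runs them instead.
host_port_join_vlan() {
    br=$1
    vid=$2
    # print the command, or execute it when APPLY=1
    run() {
        if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
    }
    # "self" targets the bridge device itself rather than one of its slave ports
    run bridge vlan add vid "$vid" dev "$br" self
    # tagged subinterface of the bridge, e.g. vmbr0.111
    run ip link add link "$br" name "$br.$vid" type vlan id "$vid"
    run ip link set dev "$br.$vid" up
}

# Dry run (prints three commands, changes nothing):
#   host_port_join_vlan vmbr0 111
```

Afterwards, `bridge -c vlan show dev vmbr0` should list the new VLAN next to the PVID entry.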


Why are you arguing with me? I have been doing this stuff for at least 14 years:

root@vh2:~# ip link add testbr type bridge vlan_filtering 1 vlan_protocol 802.1Q
root@vh2:~# ip tuntap add tap0 mode tap
root@vh2:~# ip link set tap0 master testbr
root@vh2:~# bridge -c vlan show dev testbr
port              vlan-id  
testbr            1 PVID Egress Untagged
root@vh2:~# bridge -c vlan show dev tap0
port              vlan-id  
tap0              1 PVID Egress Untagged
root@vh2:~# bridge vlan add vid 100 dev testbr self pvid untagged
root@vh2:~# bridge vlan add vid 100 dev tap0 pvid untagged
root@vh2:~# bridge vlan del vid 1 dev testbr self
root@vh2:~# bridge vlan del vid 1 dev tap0
root@vh2:~# bridge vlan add vid 200 dev testbr self
root@vh2:~# bridge -c vlan show dev testbr
port              vlan-id  
testbr            100 PVID Egress Untagged
                  200
root@vh2:~# bridge -c vlan show dev tap0
port              vlan-id  
tap0              100 PVID Egress Untagged
root@vh2:~# ip tuntap del tap0 mode tap
root@vh2:~# ip link del testbr
Nikita Kipriyanov
  • I also don't see ARP queries related to VLAN42 on `vmbr666`, and when I create an explicit vlan42 interface attached to the bridge, it still doesn't get any traffic, not even replies to ARP queries it sends. (I still need to check whether ARP queries even leave the host.) – András Korn May 20 '22 at 16:26
  • I appreciate that you're trying to help, but the setup you pasted is not analogous to my situation. In my case, all relevant interfaces have all VLAN memberships. You also can't `bridge vlan add` a VLAN to a bridge interface such as `vmbr0`, unless that interface is also itself a bridge member. Go ahead and try it (you'll get `RTNETLINK answers: Operation not supported`). I also didn't create a vmbr666.42 interface, but a .42 interface of a bridge member (`tapXXXXX.42`). – András Korn May 20 '22 at 19:18
  • No, you're absolutely wrong. **The bridge interface like `vmbr0` is always a bridge member**, it's a member of the bridge which is itself, and it denotes the port which connects the host with the bridge. I did that thing many times, I did that using Debian interfaces files (this one is actually a part of a cluster). I have a system built with systemd network configuration where several OpenVPN taps had different PVIDs and bridge interface itself has another PVID and the physical NIC is the all-tagged trunk. Your error message likely comes from the fact you just spelled the command incorrectly. – Nikita Kipriyanov May 20 '22 at 19:36
  • I updated my question to show that this is not the case. – András Korn May 22 '22 at 09:31
  • You are extremely stubborn, but it is better to be extremely attentive instead. To configure vlans on the master bridge interface you add the "self" keyword (it is described in `man bridge`). Also, again, it is wrong to do anything with an interface that's a bridge slave. It shouldn't have any vlan subinterfaces, IP configuration and so on. – Nikita Kipriyanov May 22 '22 at 13:06
  • I have indeed missed the `self` keyword in the man page. Thanks for pointing it out. With `self` I can indeed add vlan42 to the vmbr666 master bridge interface. However, it still doesn't show any broadcast traffic in vlan42 when I `tcpdump -enlvvv` on it, only broadcast traffic for other VLANs, so this was probably a red herring. I'm not sure I understand what adding the VLAN to the bridge interface actually does, since I only have `1` and `42` on it and it still sees broadcast traffic on e.g. vlan49, but not 42. – András Korn May 22 '22 at 13:22
  • And for the record, I'm not "arguing with you"; I'm trying to understand what is happening and pointing out where what I see contradicts what you say. It's entirely possible that you're right; I hope that in the end we can arrive at an answer I can accept. – András Korn May 22 '22 at 13:25
  • Adding a VLAN to a bridge port does the same as adding a VLAN to a switch port would if there were a physical switch in place of the bridge. Until you add a VLAN to a port, that port will not participate in the VLAN and should not see any traffic for that VLAN, nor should the bridge accept any traffic for that VLAN coming from a non-participating port. Again, this includes the "host-facing" bridge port: if you want to see traffic on the port, add that VLAN to the "master" port. – Nikita Kipriyanov May 22 '22 at 13:50
  • But, again, this contradicts what I see. I didn't add any VLANs to `vmbr666` and yet I could see broadcasts for vlan49 and 50 on it (but not vlan42); using `bridge vlan add ... self` to add vlan42 made no observable difference. – András Korn May 22 '22 at 14:04