Proxmox Multi-Bonding over VLANs


I've gone through a fail-over setup in a small Proxmox cluster (= Debian with add-on packages) several times. As there was no good documentation for it, I'm posting this question together with my own answer :-)

The idea: a separate storage network and a separate service network should be established, each with the ability to fail over if one of the switches fails or is in maintenance. In the service network we want to segregate traffic further with VLANs.

The solution to the problem is:

  • use bonding in active-backup mode for each network (bond0, bond1)
  • each bond has a primary network interface over which the traffic should go in regular operation (iface A, iface B); a quick sysfs check for the active leg is sketched right after this list
  • in the failover scenario, traffic goes over the other network; since the storage and service switches are connected by a failover link, the ARP packets will still find the desired endpoint
  |---------------[                      storage switch                         ]
  |                   x              x                  x              x
  |                   |              |                  |              |
failover              |              |                  |              |
link                  x              x                  x              x
  |                 iface A       iface A            iface A        iface A
  |
  |              [  Node 1  ]    [  Node 2  ]     [  Node 3 ]     [  Node X ]
  |
  |                 iface B       iface B             iface B       iface B
  |                   x              x                  x              x
  |                   |              |                  |              |
  |                   |              |                  |              |
  |                   x              x                  x              x
  |
  |---------------[                      services switch                         ]
  • the fun part is now: how to run two bonds in parallel over the same pair of interfaces? Solutions:
    • go with VLANs on top of iface A and iface B, and bond the VLANs together
    • use traffic shaping (tc) with queue IDs
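
As a small side note, not strictly part of the setup: with active-backup bonding you can check, and if needed force, which leg is currently active through sysfs once the bonds described below exist (names are the ones used in this post):

cat /sys/class/net/bond0/bonding/mode           # fault-tolerance (active-backup)
cat /sys/class/net/bond0/bonding/active_slave   # which leg carries the traffic right now
echo ifaceB.100 > /sys/class/net/bond0/bonding/active_slave   # manually switch the active leg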

I tried to get both solutions running, but I was only successful with the first:

Create VLANs for both interfaces:

  • iface A.100
  • iface A.101
  • iface B.100
  • iface B.101
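
Just for reference, and assuming the same interface names as in the rest of this post: the VLAN sub-interfaces can also be created by hand with iproute2 to try things out before making them persistent:

ip link add link ifaceA name ifaceA.100 type vlan id 100
ip link add link ifaceA name ifaceA.101 type vlan id 101
ip link add link ifaceB name ifaceB.100 type vlan id 100
ip link add link ifaceB name ifaceB.101 type vlan id 101
ip link set ifaceA.100 up    # and the same for the other three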

Create bonds on top of the VLANs:

  • bond0

    • slave iface A.100
    • slave iface B.100
  • bond1
    • slave iface A.101
    • slave iface B.101
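
Again only as a manual sketch (the persistent version is in the /etc/network/interfaces file further down): creating a bond on top of the VLAN sub-interfaces with iproute2 looks roughly like this for bond0; bond1 is the same game with the .101 interfaces:

ip link add bond0 type bond mode active-backup miimon 100 updelay 200 downdelay 200
ip link set ifaceA.100 down                              # slaves must be down before enslaving
ip link set ifaceB.100 down
ip link set ifaceA.100 master bond0
ip link set ifaceB.100 master bond0
echo ifaceA.100 > /sys/class/net/bond0/bonding/primary   # prefer the storage-switch leg
ip link set bond0 up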

Create VLANs on top of the bonds; you now have Q-in-Q:

  • bond1.5000
  • bond1.XXX
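
This can again be tested by hand; as far as I can tell, both tags end up with the standard 802.1q ethertype (the slave VLAN 101 becomes the outer tag, 5000 the inner one), so the switch ports have to accept such stacked tags:

ip link add link bond1 name bond1.5000 type vlan id 5000
ip link set bond1.5000 mtu 9000 up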

My main challenge was understanding where to put the bond-* options (bond-mode, bond-miimon, bond-updelay, bond-downdelay): they have to go on the first interface that is part of the bond (in my case ifaceA.100), which is where miimon, updelay and downdelay are declared. Now check with cat /proc/net/bonding/bond0:

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ifaceA.100 (primary_reselect always)
Currently Active Slave: ifaceA.100
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

Slave Interface: ifaceA.100
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: XX:XX:XX:XX:XX:XX
Slave queue ID: 0

Slave Interface: ifaceB.100
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: YY:YY:YY:YY:YY:YY
Slave queue ID: 0
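
A simple way to verify the failover behaviour (not captured above, just a sketch): take the primary physical interface down, watch bond0 switch to its backup leg, then bring it back up:

ip link set ifaceA down
watch -n1 cat /proc/net/bonding/bond0    # Currently Active Slave should change to ifaceB.100
ip link set ifaceA up
# once the link is stable for the 200 ms updelay, primary_reselect=always
# should move the traffic back to ifaceA.100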


Here is my /etc/network/interfaces file:

iface lo inet loopback

auto vmbr0
iface vmbr0 inet static
        # your usual proxmox mgmt interface
        address A.B.C.D 
        netmask 255.255.255.0
        gateway A.B.C.1
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
# Proxmox Mgmt bridge

auto ifaceA
iface ifaceA inet manual
        mtu 9100
#Storage net

auto ifaceB 
iface ifaceB inet manual
        mtu 9100
#Service net

auto ifaceA.100
iface ifaceA.100 inet manual
    bond-master bond0
    bond-primary ifaceA.100
    bond-miimon 100
    bond-updelay 200
    bond-downdelay 200
    bond-mode active-backup
    mtu 9048
#Primary leg of storage bond0

auto ifaceA.101
iface ifaceA.101 inet manual
    bond-master bond1
    bond-miimon 100
    bond-updelay 200
    bond-downdelay 200
    bond-mode active-backup
    mtu 9048
#Secondary leg of services

auto ifaceB.100        
iface ifaceB.100 inet manual
    bond-miimon 100
    bond-updelay 200
    bond-downdelay 200
    bond-master bond0
    bond-mode active-backup
    mtu 9048
#Secondary leg of storage bond0

auto ifaceB.101
iface ifaceB.101 inet manual
    bond-master bond1
    bond-primary ifaceB.101
    bond-miimon 100
    bond-updelay 200
    bond-downdelay 200
    bond-mode active-backup
    mtu 9048
#Primary leg of services

auto bond0
iface bond0 inet static
    address W.X.Y.Z
    netmask 255.255.255.0
    bond-mode active-backup
    bond-primary ifaceA.100
    mtu 9048
#Storage for Ceph (pveceph init --network W.X.Y.0/24)

auto bond1
iface bond1 inet static
    address Q.P.O.R
    netmask 255.255.255.0
    bond-mode active-backup
    bond-primary ifaceB.101
    mtu 9048
#Services/Corosync bond (pvecm create MYCLUSTER --bindnet0_addr Q.P.O.R --ring0_addr static-hostname-for-this-node)

auto bond1.5000
iface bond1.5000 inet manual
    mtu 9000
# bond1 services on VLAN 5000, has no IP bound to it

auto vmbr5000
iface vmbr5000 inet manual
    bridge-ports bond1.5000
    bridge-stp off
    bridge-fd 0 
    mtu 9000
# bond1.5000 services, which can be consumed within a VM

# AND ... more of the same 

auto bond1.XXX
iface bond1.XXX inet manual
    mtu 9000

auto vmbrXXX
iface vmbrXXX inet manual
    bridge-ports bond1.XXX
    bridge-stp off
    bridge-fd 0 
    mtu 9000
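
To activate the configuration, a reboot of the node works; with ifupdown2 installed it can also be reloaded in place. Afterwards the VM-facing bridges can be attached to guests as usual (VMID 100 below is just an example):

ifreload -a                     # only with ifupdown2, otherwise reboot or ifup the interfaces
cat /proc/net/bonding/bond0     # verify both bonds picked up their primary slaves
cat /proc/net/bonding/bond1
qm set 100 --net1 virtio,bridge=vmbr5000    # attach a NIC on the VLAN-5000 bridge to VM 100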


hubbaplus

Posted 2019-04-17T08:52:31.040


Answers


To complete my journey, though: I wasn't able to achieve the failover bonding with traffic shaping, following the idea given in the kernel's bonding.txt. The approach was:

  • Enable traffic shaping
  • Give each leg of the bond a Queue-ID
  • Try to match VLAN tags with tc filters to steer traffic to a specific queue, and therefore a specific leg

I hope this post helps somebody else figure it out. Here is my failed configuration, simplified and adapted:

iface lo inet loopback

auto ifaceA
iface ifaceA inet manual
        bond-mode active-backup
        bond-master bond0
        bond-primary ifaceB
        bond-miimon 100
        bond-updelay 200
        bond-downdelay 200
        mtu 9100


auto ifaceB
iface ifaceB inet manual 
        bond-mode active-backup
        bond-master bond0
        bond-primary ifaceB
        bond-miimon 100
        bond-updelay 200
        bond-downdelay 200
        mtu 9100
# Choose the second interface as the default primary

auto bond0
iface bond0 inet static
    address Q.O.P.R
    netmask 255.255.255.0
    bond-mode active-backup 
    bond-primary ifaceB
    bond-miimon 100
    bond-updelay 200
    bond-downdelay 200
    mtu 9100
    post-up echo "ifaceA:2" > /sys/class/net/bond0/bonding/queue_id
    post-up echo "ifaceB:3" > /sys/class/net/bond0/bonding/queue_id
    post-up tc qdisc add dev bond0 handle 1 root multiq
#Bond over both

auto bond0.5000
iface bond0.5000 inet static
        address H.I.K.L
        netmask 255.255.255.0
        mtu 9000
        post-up tc filter add dev bond0 basic match 'meta(vlan mask 0xfffd eq 0x1388)' action skbedit queue_mapping 2
# Should go over iface A

auto bond0.XXX
iface bond0.XXX inet static
        address K.L.M.N
        netmask 255.255.255.0
        mtu 9000
        post-up tc filter add dev bond0 basic match 'meta(vlan mask 0xffff eq 0xVLAN-ID-in-hex)' action skbedit queue_mapping 3
# Should go over iface B
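
If somebody wants to pick this up where I gave up: these are the places I would check first to see whether the queue mapping and the tc filters actually take effect (debugging sketch only, same device names as above):

cat /sys/class/net/bond0/bonding/queue_id       # should list ifaceA:2 and ifaceB:3
tc qdisc show dev bond0                         # is the multiq root qdisc really there?
tc -s filter show dev bond0                     # do the skbedit filters ever match a packet?
cat /sys/class/net/bond0/bonding/active_slave   # which leg is the bond actually using?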



hubbaplus

Posted 2019-04-17T08:52:31.040
