
I'm running into problems with getting a LACP trunk to operate properly on Ubuntu 12.04.2 LTS.

My setup is a single host connected with two 10 GbE interfaces to two separate Nexus 5548 switches, with vPC configured to enable multi-chassis LACP. The Nexus config is as per Cisco guidelines, and the Ubuntu config is as per https://help.ubuntu.com/community/UbuntuBonding

The server is connected to port Ethernet1/7 on each Nexus switch; both ports are configured identically and placed in Port-channel 15. Port-channel 15 is configured as vPC 15, and the vPC output looks good. These are simple access ports, i.e. no 802.1Q trunking involved.

Diagram:

    +----------+      +----------+      +----------+      +----------+
    | client 1 |------| nexus 1  |------| nexus 2  |------| client 2 |
    +----------+      +----------+      +----------+      +----------+
                           |                  |
                           |    +--------+    |
                           +----| server |----+
                           eth4 +--------+ eth5

When either link is down, both client 1 and client 2 are able to reach the server. However, when I bring the second link up, the client connected to the switch with the newly enabled link is unable to reach the server. See the following table for state transitions and results:

    port states (down by means of "shutdown")
      nexus 1 eth1/7        up     up    down   up
      nexus 2 eth1/7       down    up     up    up

    connectivity
      client 1 - server      OK     OK     OK   FAIL
      client 2 - server      OK    FAIL    OK    OK

Now, I believe I've isolated the issue to the Linux side. In the up-up state, each Nexus uses its local link to the server to deliver the packets, as verified by looking at the MAC address table. What I see on the server is that packets from each client arrive on the corresponding ethX interface (packets from client 1 on eth4, packets from client 2 on eth5), as shown by tcpdump -i ethX, but when I run tcpdump -i bond0 I only see traffic from one of the clients (in accordance with the table above).

I observe the same behaviour for ARP and ICMP (IP) traffic: ARP from a client fails when both links are up, and works (along with ping) when one link is down; ping fails again when I re-enable the link (packets are still received on the eth interface, but not on bond0).
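
For reference, this is roughly how I'm comparing a slave against the bond (filtering on ARP and ICMP since that is the traffic in question; interface names as in my setup):

    # frames from the client do show up on the physical slave...
    tcpdump -e -n -i eth5 'arp or icmp'

    # ...but the same filter on bond0 only ever shows one client's traffic
    tcpdump -e -n -i bond0 'arp or icmp'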

To clarify, I'm setting up multiple servers in this configuration, and all show the same symptoms, so it doesn't appear to be hardware related.

So, figuring out how to fix that is what I'm stuck on; my Googling has not brought me any luck so far.

Any pointers are highly appreciated.

/etc/network/interfaces

    auto eth4
    iface eth4 inet manual
    bond-master bond0

    auto eth5
    iface eth5 inet manual
    bond-master bond0

    auto bond0
    iface bond0 inet static
    address 10.0.11.5
    netmask 255.255.0.0
    gateway 10.0.0.3
    mtu 9216
    dns-nameservers 8.8.8.8 8.8.4.4
    bond-mode 4
    bond-miimon 100
    bond-lacp-rate 1
    #bond-slaves eth4
    bond-slaves eth4 eth5
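
For reference, this is roughly how I reload the bond after changing parameters, and how I verify the result (the sysfs paths below come from the stock bonding driver):

    # reload the bond after editing /etc/network/interfaces
    ifdown eth4; ifdown eth5; ifdown bond0
    rmmod bonding
    ifup eth4; ifup eth5; ifup bond0

    # check that the mode and slaves took effect
    cat /sys/class/net/bond0/bonding/mode     # expect: 802.3ad 4
    cat /sys/class/net/bond0/bonding/slaves   # expect: eth4 eth5
    cat /proc/net/bonding/bond0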

/proc/net/bonding/bond0

    A little further information:
    Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

    Bonding Mode: IEEE 802.3ad Dynamic link aggregation
    Transmit Hash Policy: layer2 (0)
    MII Status: up
    MII Polling Interval (ms): 100
    Up Delay (ms): 0
    Down Delay (ms): 0

    802.3ad info
    LACP rate: fast
    Min links: 0
    Aggregator selection policy (ad_select): stable
    Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 1
    Actor Key: 33
    Partner Key: 1
    Partner Mac Address: 00:00:00:00:00:00

    Slave Interface: eth4
    MII Status: up
    Speed: 10000 Mbps
    Duplex: full
    Link Failure Count: 8
    Permanent HW addr: 90:e2:ba:3f:d1:8c
    Aggregator ID: 1
    Slave queue ID: 0

    Slave Interface: eth5
    MII Status: up
    Speed: 10000 Mbps
    Duplex: full
    Link Failure Count: 13
    Permanent HW addr: 90:e2:ba:3f:d1:8d
    Aggregator ID: 2
    Slave queue ID: 0
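
Note that the two slaves report different Aggregator IDs (1 and 2) and the partner MAC is all zeros, which I read as LACP negotiation not completing. One way to check whether LACPDUs are actually arriving on the slaves (a sketch; 0x8809 is the slow-protocols ethertype used by LACP):

    # with lacp-rate fast, LACPDUs from the switch should show up roughly once a second
    tcpdump -e -n -i eth4 ether proto 0x8809
    tcpdump -e -n -i eth5 ether proto 0x8809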

EDIT: Added config from Nexus

    vpc domain 100
      role priority 4000
      system-priority 4000
      peer-keepalive destination 10.141.10.17 source 10.141.10.12
      peer-gateway
      auto-recovery
    interface port-channel15
      description server5
      switchport access vlan 11
      spanning-tree port type edge
      speed 10000
      vpc 15
    interface Ethernet1/7
      description server5 internal eth4
      no cdp enable
      switchport access vlan 11
      channel-group 15

EDIT: Added results from a non-vPC port-channel on nexus1 for the same server, before and after an IP change (I changed the IP to influence the load-balancing algorithm). This still uses the same settings on the server.

      port states (down by means of "shutdown")
        nexus 1 eth1/7        up     up    down   up
        nexus 1 eth1/14      down    up     up    up   <= port moved from nexus 2 eth1/7

      connectivity (server at 10.0.11.5, hashing uses Eth1/14)
        client 1 - server      OK     OK     OK   FAIL
        client 2 - server      OK     OK     OK   FAIL

The results after changing the IP are as predicted: bringing up the unused interface causes the failures.

      connectivity (server at 10.0.11.15, hashing uses Eth1/7)
        client 1 - server      OK    FAIL    OK    OK
        client 2 - server      OK    FAIL    OK    OK
Tolli
    Do you have any other hosts using Virtual Port Channel that are working? It might be useful if you post your VPC config from the switches. Your Linux config looks valid. – Zoredache Sep 26 '13 at 19:37
  • Not on this switch, no, and all my LAG config so far has been between Cisco and/or Arista devices - never touched this on Linux before. Will add VPC config to original question. What I'll try tomorrow is to isolate further by making this a standard port-channel on only one switch, i.e. no VPC. Then it's just a matter of finding out the proper way to trick the LB algorithm to test properly. – Tolli Sep 26 '13 at 21:16
  • You can, and should edit additional details into your question instead of trying to put them in a comment. – Zoredache Sep 26 '13 at 21:18
  • Just figured - hitting Enter was playing tricks on me. :) – Tolli Sep 26 '13 at 21:21
  • As seen in the latest edit, I reproduced this with a normal port-channel. While connected to a single switch, with the host in mode 4, doing ifdown for one of the interfaces leaves the Nexus still seeing the link as up and including it in the port-channel. I also changed from mode 4 to mode 2 (balance-xor); after that change I'm not experiencing the failures described in the OP. The ifdown issue described here is still the same, though. For the application in question, mode 2 will probably work just as well. However, I would obviously love to get it working properly with mode 4. – Tolli Sep 27 '13 at 14:45

2 Answers


The only LACP config I managed to get working in Ubuntu is this:

    auto bond0
    iface bond0 inet dhcp
      bond-mode 4
      bond-slaves none
      bond-miimon 100
      bond-lacp-rate 1
      bond-updelay 200
      bond-downdelay 200

    auto eth0
    iface eth0 inet manual
      bond-master bond0

    auto eth1
    iface eth1 inet manual
      bond-master bond0

i.e. I set bond-slaves none and point each slave at the bond with bond-master instead. I'm not sure what the difference is, but I found this config worked for me.

I don't have any issues with LACP in my setup, although this is with 1 GbE networking.

In addition, if you're still having problems, try plugging both cables into the same switch and configuring those ports for LACP, just to eliminate the possibility of issues with multi-chassis LACP.
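
For that single-switch test, something along these lines on one of the Nexus switches should be enough (a sketch only; the port-channel number and second interface are examples, adjust to your setup):

    interface port-channel20
      switchport access vlan 11
    interface Ethernet1/7
      switchport access vlan 11
      channel-group 20 mode active
    interface Ethernet1/8
      switchport access vlan 11
      channel-group 20 mode active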

hookenz
  • Thanks for the tip - as my OOB is not playing nicely right now I'll try this out tomorrow morning on-site. – Tolli Sep 26 '13 at 21:36
  • Using bond-slaves none changes nothing. bond-primary I believe is not relevant when using mode 4/802.3ad [AFAICT](https://www.kernel.org/doc/Documentation/networking/bonding.txt), only active-backup. What version are you running? [This guide](http://backdrift.org/howtonetworkbonding) shows two variants for Ubuntu, and [this thread](http://ubuntuforums.org/showthread.php?t=1595177) refers to config changes as well. Also, the guide at backdrift.org shows other syntax for modprobe.d/bonding. I did try mixing all options (primary and not, different bonding syntax, slaves none and not) - no luck. – Tolli Sep 27 '13 at 13:23
  • Tolli - I'm actually using this setup in mode 4 and it works perfectly. When I first looked at a bonding setup under Ubuntu I was using 10.04. There seemed to be bugs in the networking scripts to support it. This setup was the only one that worked for me at the time and that's why I stuck with it. Have you tried bonding on the same switch rather than across switch? – hookenz Sep 28 '13 at 02:09
  • Yes, I also tried hooking two ports to the same switch with the same results. See the latest addition to the OP. I was also having problems with the scripts in 12.04; the most reliable approach for me when changing parameters in this setup is "ifdown eth4; ifdown eth5; ifdown bond0; rmmod bonding; ifup eth4; ifup eth5; ifup bond0" - fun. As much as I hate not solving things, leaving this in balance-xor mode might be the end result. I'd like to try this on another server in a separate environment as well, though; I have another similar project coming up. – Tolli Sep 29 '13 at 20:57
  • Does your server happen to have dual 1 GbE NICs as well? Try it with those, or with any server you might have lying around that has 1 GbE NICs. This will isolate the issue to either the server or the switch. I'm inclined to think it might be a setup issue on your switch. Also, does your switch have the latest firmware installed? – hookenz Sep 29 '13 at 21:24

The problem is not on the Linux side but on the Nexus side, in how it behaves in a vPC configuration.

To configure vPC on the Nexus switches, you first connect the two switches and configure that link as the "peer-link".

In the normal situation, when both links from the switches to the server are up, traffic in VLAN 11 that crosses the peer-link is dropped rather than forwarded out the vPC member ports (the vPC loop-prevention rule).

Only when one of the interfaces that is part of the vPC is down is traffic in VLAN 11 arriving over the peer-link allowed out to the server.

This is how vPC works on Nexus switches.

To solve this problem you can run FabricPath and make another connection between the nexus-1 and nexus-2 switches.
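
For reference, the peer-link itself is just a port-channel between the two Nexus switches that is marked as the vPC peer-link, roughly like this (a sketch; the port-channel number is an example):

    interface port-channel1
      switchport mode trunk
      vpc peer-link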