
I have two Linux boxes running CentOS 6.5, each with two interfaces bonded together and connected to a Cisco 2960-S switch whose ports are configured for LACP.

The configuration on the switch:

port-channel load-balance src-dst-mac
!
interface Port-channel1
 switchport access vlan 100
 switchport mode access
!
interface Port-channel2
 switchport access vlan 100
 switchport mode access
!
interface FastEthernet0
 no ip address
!
interface GigabitEthernet0/1
 switchport access vlan 100
 switchport mode access
 speed 1000
 duplex full
 spanning-tree portfast
 channel-protocol lacp
 channel-group 1 mode active
!
interface GigabitEthernet0/2
 switchport access vlan 100
 switchport mode access
 speed 1000
 duplex full
 spanning-tree portfast
 channel-protocol lacp
 channel-group 1 mode active
!
interface GigabitEthernet0/3
 switchport access vlan 100
 switchport mode access
 speed 1000
 duplex full
 spanning-tree portfast
 channel-protocol lacp
 channel-group 2 mode active
!
interface GigabitEthernet0/4
 switchport access vlan 100
 switchport mode access
 speed 1000
 duplex full
 spanning-tree portfast
 channel-protocol lacp
 channel-group 2 mode active
!

On both Linux servers I've loaded the kernel bonding module with the following configuration:

alias bond0 bonding
options bond0 miimon=100 mode=4 lacp_rate=1

The problem is that when I transfer many files from one server to the other, the traffic graphs show that throughput on the bonding interface bond0 never exceeds 1 Gb/s.

Is there a problem with the configuration? Shouldn't the speed double to 2 Gb/s?

Ammar Lakis

4 Answers


LACP will not split the packets of a single stream/thread across multiple interfaces. For example, a single TCP stream will always send and receive its packets on the same NIC.

See the following post for reference:

Link aggregation (LACP/802.3ad) max throughput

Hope this helps.

Mike Naylor
  • Well, I didn't transfer only one file; many scp instances were copying files from one server to the other. Also, considering the hashing algorithm, I ran a copy from a third server to one with a bonded interface. This didn't work either, and the traffic still goes through only one of the bonded interfaces. – Ammar Lakis Jan 24 '14 at 20:22
  • Are you actually using vLANs as well? I noticed that you had both port groups with access to the same vLAN. Are the other servers configured for this vLAN as well? Also what kind of switch are the devices connected to? Is it a managed switch that can handle LACP port groups and are the ports for the server with LAGs configured in a Link Aggregate on the switch? – Mike Naylor Jan 24 '14 at 20:27
  • Server A has eth0->Gi0/1 and eth1->Gi0/2 on the switch; Server B has eth0->Gi0/3 and eth1->Gi0/4. You can consider them to be in the same LAN, as the four ports are configured on the same VLAN. The switch, as I mentioned, is a Cisco Catalyst 2960-S configured with LACP for each pair of ports. You can see that in the configuration above. – Ammar Lakis Jan 24 '14 at 20:36
  • Sorry, long day and I totally missed the switch model. What is the speed if you transfer a single file? Is the single file transfer at 1Gb/s as well? – Mike Naylor Jan 24 '14 at 20:47
  • also have you tried changing the balancing algorithm? Maybe try 'port-channel load-balance src-dst-port' and see if this is any different. Depending on the type of traffic you may see a different result. – Mike Naylor Jan 24 '14 at 20:52
  • Well yes, didn't exceed the limit of 1Gb/s and that's really surprising. Is there any good way to troubleshoot this? – Ammar Lakis Jan 24 '14 at 20:54
  • A single file transfer using LACP won't exceed 1Gb/s as it will stay on one interface. If you start a second file transfer it should load-balance to the other interface though. The load-balancing is handled by the algorithm, i.e. src-dst-mac, and you may want to see whether you can configure the same on the server side for outbound traffic. So far what you're describing is expected behavior. As @Gadgeteering mentioned, if you have multiple connections the TOTAL transfer speed for both connections could get up to 2Gb/s, but a single transfer will be 1Gb/s or less. – Mike Naylor Jan 24 '14 at 21:00
  • I've mentioned it's not a single connection. 3 * 10 GB files are transferred from server A to server B and a third 10 GB file from a third server to server B AT THE SAME TIME. – Ammar Lakis Jan 24 '14 at 21:05
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/12655/discussion-between-mike-naylor-and-ammar-lakis) – Mike Naylor Jan 24 '14 at 21:08

The way Link Aggregation works is by using a hashing algorithm to decide which packets should go out which port.

Packets from the same source MAC address, to the same destination MAC address, will always go out the same port.

Some Link Aggregation implementations support using layer 3 (IP addresses) and even layer 4 (TCP/UDP Port number) as part of the hash, but this is not that common.

This is why you only get 1 Gbit/s when transferring files from one server to another.

If the OS and the switch both support layer 3 hashing, you can get more speed by using multiple IP addresses. However, because of the way the hashing algorithm works, there is a 50/50 chance that both streams will end up going out the same link.
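
To see where the 50/50 figure comes from, here is a minimal sketch in Python. It assumes the hash behaves like a uniform, independent pick of a link for each stream, which is an idealisation and not the behaviour of any particular switch or bonding driver; the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def probability_all_on_one_link(flows: int, links: int) -> Fraction:
    """Chance that every flow hashes to the same member link.

    Assumption: each flow lands on each link with equal probability,
    independently of the other flows.
    """
    # Enumerate every possible assignment of flows to links.
    outcomes = list(product(range(links), repeat=flows))
    # Keep the assignments where all flows share a single link.
    shared = sum(1 for outcome in outcomes if len(set(outcome)) == 1)
    return Fraction(shared, len(outcomes))


# Two streams over a two-link bond: 2 of the 4 assignments share a link.
print(probability_all_on_one_link(2, 2))  # 1/2
```

With more streams the odds of all of them piling onto one link drop, but under MAC-only hashing between the same two hosts the "pick" is identical for every stream, so this model only applies when the hash inputs actually differ between streams.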

Allan Jude

My understanding of network bonding is that you cannot exceed the link speed of the member interfaces in one connection. A connection will stick to one interface in the bond after it is established.

However, connections are now split between the two interfaces. If you were to have two connections running from server A to server B, then the connections shouldn't start bottle-necking each other as far as bandwidth goes because they will be traveling across different interfaces. Your total bandwidth using multiple connections should be 2Gb/s, but each connection will be limited to a maximum of 1Gb/s.
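
As a rough illustration of that arithmetic, the sketch below computes the best-case total for a set of connections, given which member link each one was hashed to. The function name and the assumption that every connection can saturate its link are mine, purely for illustration.

```python
def aggregate_ceiling_gbps(flow_to_link, link_speed_gbps=1.0):
    """Upper bound on total throughput for a set of connections.

    flow_to_link: one entry per connection, giving the index of the
    member link it was hashed to. Connections sharing a link also share
    that link's bandwidth, so only the distinct links in use count.
    """
    return len(set(flow_to_link)) * link_speed_gbps


# Three connections that all hash to link 0 still top out at 1 Gb/s.
print(aggregate_ceiling_gbps([0, 0, 0]))  # 1.0
# Two connections on different links can reach 2 Gb/s in total.
print(aggregate_ceiling_gbps([0, 1]))     # 2.0
```

The first case matches what the question describes: several transfers running at once, but all on the same member link.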

Gadgeteering

It's also worth considering the hash algorithm used by Linux. Some versions of the bonding driver use very simplistic hash algorithms, e.g. the layer 2 hash in Linux 3.6.5 is just the XOR of the last byte of the source and destination MAC addresses, which leads to unbalanced traffic in a lot of circumstances. Changing the hash policy to layer2+3 (the xmit_hash_policy bonding option) will help a lot.
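
The difference between a MAC-only hash and one that also mixes in IP addresses can be sketched as follows. This is a simplified Python model, not the kernel code: the MAC-only function follows the description above (XOR of the last MAC bytes), the combined one follows the general shape of a layer 2 plus layer 3 hash, and all addresses and function names are made up for illustration.

```python
import ipaddress


def mac_last_byte(mac: str) -> int:
    """Return the last octet of a colon-separated MAC address."""
    return int(mac.split(":")[-1], 16)


def layer2_slave(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """MAC-only hash: XOR of the last MAC bytes, modulo the link count."""
    return (mac_last_byte(src_mac) ^ mac_last_byte(dst_mac)) % slave_count


def layer23_slave(src_mac, dst_mac, src_ip, dst_ip, slave_count):
    """Simplified MAC+IP hash: also XOR in the low 16 bits of the IPs."""
    ip_xor = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    mac_xor = mac_last_byte(src_mac) ^ mac_last_byte(dst_mac)
    return ((ip_xor & 0xFFFF) ^ mac_xor) % slave_count


# Hypothetical addresses for two hosts on a two-link bond.
SRC_MAC, DST_MAC = "00:11:22:33:44:01", "00:11:22:33:44:02"

# MAC-only: every flow between these two hosts picks the same link.
print(layer2_slave(SRC_MAC, DST_MAC, 2))                            # 1
# MAC+IP: a different source IP can move the flow to the other link.
print(layer23_slave(SRC_MAC, DST_MAC, "10.0.0.1", "10.0.0.2", 2))   # 0
print(layer23_slave(SRC_MAC, DST_MAC, "10.0.0.4", "10.0.0.2", 2))   # 1
```

Note that with one IP address per host, the MAC+IP inputs are still identical for every connection between the same pair of servers, so several scp sessions from A to B would keep sharing one link; only a policy that also hashes on ports (layer3+4 on Linux) can separate them.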

  • OK, so the Cisco switch would receive on two interfaces, fine. How would the switch know to transmit using two interfaces? – kubanczyk Sep 20 '16 at 15:51