Based on an earlier question from over a year ago (Multiplexed 1 Gbps Ethernet?), I went off and set up a new rack with a new ISP, with LACP links all over the place. We need this because we have individual servers (one application, one IP) serving thousands of client computers all over the Internet at more than 1Gbps cumulative.
This LACP idea is supposed to let us break the 1Gbps barrier without spending a fortune on 10GbE switches and NICs. Unfortunately, I've run into some problems with outbound traffic distribution. (This despite Kevin Kuphal's warning in the above linked question.)
The ISP's router is a Cisco of some sort. (I deduced that from the MAC address.) My switch is an HP ProCurve 2510G-24. And the servers are HP DL 380 G5s running Debian Lenny. One server is a hot standby. Our application cannot be clustered. Here is a simplified network diagram that includes all relevant network nodes with IPs, MACs and interfaces.
While it has all the detail, it is a bit hard to work with when describing my problem. So, for simplicity's sake, here is a network diagram reduced to the nodes and physical links.
So I went off and installed my kit at the new rack and connected my ISP's cabling from their router. Both servers have an LACP link to my switch, and the switch has an LACP link to the ISP router. Right from the start I realized that my LACP configuration was not correct: testing showed that all traffic to and from each server was going over one physical GbE link exclusively, on both the server-to-switch and switch-to-router hops.
With some Google searches and lots of RTFM time regarding Linux NIC bonding, I discovered that I could control the NIC bonding by modifying /etc/modules:
# /etc/modules: kernel modules to load at boot time.
# mode=4 is for lacp
# xmit_hash_policy=1 means to use layer3+4(TCP/IP src/dst) & not default layer2
bonding mode=4 miimon=100 max_bonds=2 xmit_hash_policy=1
loop
This got the traffic leaving my server over both NICs as expected. But the traffic was moving from the switch to router over only one physical link, still.
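As a rough illustration of why the layer3+4 policy spreads traffic where the layer2 default did not, here is a minimal Python sketch. It assumes the formula given in the Linux bonding documentation of that era for unfragmented TCP/UDP packets: ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff)) modulo slave count. The function name and the addresses (taken from the documentation ranges) are made up for the example and are not part of the bonding driver.

```python
import ipaddress

def layer34_slave(src_ip, dst_ip, src_port, dst_port, slave_count=2):
    """Pick a slave index following the layer3+4 formula quoted above.

    Illustrative only: this mirrors the documented formula, not the
    driver's actual source code.
    """
    ip_xor = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    return ((src_port ^ dst_port) ^ (ip_xor & 0xFFFF)) % slave_count

# One server IP and port, one client IP, two different client ports.
server_ip, server_port = "192.0.2.10", 80
client_ip = "198.51.100.7"

link_a = layer34_slave(server_ip, client_ip, server_port, 50000)
link_b = layer34_slave(server_ip, client_ip, server_port, 50001)
print(link_a, link_b)
```

Under that assumption, two connections from the same client can leave over different NICs because the port numbers take part in the hash, which matches what was observed on the server-to-switch hop.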
We need that traffic going over both physical links. After reading and rereading the 2510G-24's Management and Configuration Guide, I find:
[LACP uses] source-destination address pairs (SA/DA) for distributing outbound traffic over trunked links. SA/DA (source address/destination address) causes the switch to distribute outbound traffic to the links within the trunk group on the basis of source/destination address pairs. That is, the switch sends traffic from the same source address to the same destination address through the same trunked link, and sends traffic from the same source address to a different destination address through a different link, depending on the rotation of path assignments among the links in the trunk.
It seems that a bonded link presents only one MAC address, and therefore my server-to-router path is always going to be over one path from switch-to-router because the switch sees but one MAC (and not two--one from each port) for both LACP'd links.
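A minimal Python sketch of that effect, assuming the switch reduces the SA/DA pair to a link with a simple XOR-and-modulo. The guide quoted above does not spell out the actual calculation, so the function, the MAC addresses and the link count here are all hypothetical.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)

def l2_link(src_mac, dst_mac, link_count=2):
    """Hypothetical SA/DA hash: XOR the two MACs, then modulo the link count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % link_count

# Made-up, locally administered MACs standing in for the real ones.
server_bond_mac = "02:00:00:00:00:10"   # the single MAC the bonded server presents
router_mac = "02:00:00:00:00:01"        # the single MAC of the next hop

# Every frame towards the Internet carries the same MAC pair, no matter
# which client it is ultimately for, so the hash never changes.
links_used = {l2_link(server_bond_mac, router_mac) for _ in range(1000)}
print(links_used)
```

Whatever the real hash is, a function of only those two MAC addresses has a single input pair here, so it can only ever select one switch-to-router link.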
Got it. But this is what I want:
A more expensive HP ProCurve switch, the 2910al, uses layer 3 source and destination addresses in its hash. From the "Outbound Traffic Distribution Across Trunked Links" section of the ProCurve 2910al's Management and Configuration Guide:
The actual distribution of the traffic through a trunk depends on a calculation using bits from the Source Address and Destination address. When an IP address is available, the calculation includes the last five bits of the IP source address and IP destination address, otherwise the MAC addresses are used.
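To picture what such a calculation could do with a fixed source IP, here is a hedged Python sketch. The guide only says that the last five bits of each IP address are included; combining them with XOR and taking the result modulo the number of links is an assumption made for illustration, as are the function name and the documentation-range addresses.

```python
import ipaddress
from collections import Counter

def five_bit_link(src_ip, dst_ip, link_count=2):
    """Hypothetical hash on the last five bits of each IP address."""
    src_bits = int(ipaddress.IPv4Address(src_ip)) & 0x1F
    dst_bits = int(ipaddress.IPv4Address(dst_ip)) & 0x1F
    return (src_bits ^ dst_bits) % link_count

server_ip = "192.0.2.10"                              # fixed source
clients = [f"198.51.100.{n}" for n in range(1, 101)]  # varying destinations

# Count how many destinations land on each link.
spread = Counter(five_bit_link(server_ip, c) for c in clients)
print(dict(spread))

# A single, unchanging destination address always maps to the same link.
single = {five_bit_link(server_ip, "203.0.113.1") for _ in range(100)}
print(single)
```

With this assumed hash, many distinct destination IPs spread across both links, while one unchanging destination IP pins everything to a single link. Which of those two cases applies depends entirely on which destination address the switch feeds into the calculation.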
OK. So, for this to work the way I want it to, the destination address is the key since my source address is fixed. This leads on to my question:
How exactly & specifically does layer 3 LACP hashing work?
I need to know which destination address is used:
- The client's IP, the end destination?
- Or the router's IP, the destination of the next physical link transmission?
We've not gone off and bought a replacement switch yet. Please help me understand exactly if the layer 3 LACP destination address hashing is or is not what I need. Buying another useless switch is not an option.