
I have an interesting problem: our load balancer (a FiberLogic OptiQroute 2140) consistently favors one WAN connection over the other, and not the one I want it to.

We have two WAN connections from different service providers. One is an SDSL connection at 3Mbps (WAN1) and the other is a cable modem (WAN2) at 4Mbps/1Mbps. Our publicly available services all sit on a /24 segment on the WAN1 connection. This includes our mail gateways, DNS, access to Exchange's OWA, various sundry websites and a few well-used VPN services. We get quite a bit of inbound traffic on this side of the load balancer.

The other side (WAN2) is largely unused for inbound traffic, so the load balancer is configured to weight outbound traffic toward it; we want to push as much outgoing traffic out the WAN2 connection as possible. In practice this does not happen. What happens instead is that a combination of traffic tied to our public services (the SMTP gateway sending email, contractors using the VPN, remote users accessing OWA, etc.) consumes about a third to a half of the available bandwidth on WAN1. Then 300 or so users all hop on Facebook. Some of that traffic heads out WAN2, but enough of it is assigned to WAN1 to saturate the link, and then everything runs slow for everybody. Really slow. Meanwhile there's still about 2Mbps of unused bandwidth on the other connection. This happens frequently enough to be more than a nuisance.

WAN1: [RRDtool graph of WAN1 traffic]

WAN2: [RRDtool graph of WAN2 traffic]

So as you can see from the RRDtool graphs, while WAN1 hovers around 2.5-3Mbps during peak usage, WAN2 always seems to stay in the 1-2Mbps range. The Priority and Weighted values have been set to the maximum value (65535) for WAN2 and to 10 for WAN1. The manual has this to say about these configuration options:

Priority: Enter a priority value for the WAN port. The range is between 0 and 65535.

Weighted: Enter a weight value for the WAN port. The range is between 0 and 65535.

The higher the weight placed on an interface, the more opportunity for this interface to transmit data. 
For example, suppose the device makes use of 2 WAN ports, where WAN 1 has a weighted value of 1 and WAN 2 has a weighted value of 3. 
The transmission will be like the following:

    At t=0, WAN 1 transmits
    At t=1, WAN 2 transmits
    At t=2, WAN 2 transmits
    At t=3, WAN 2 transmits
    At t=4, WAN 1 transmits
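
To make the manual's example concrete, here's a minimal sketch of that style of weighted round robin in Python (my own illustration of the general technique, not the OptiQroute's actual scheduler), which reproduces the schedule above:

    # Minimal weighted round robin sketch: each WAN gets 'weight' slots per cycle.
    from itertools import cycle

    def wrr_schedule(weights):
        slots = [wan for wan, weight in weights.items() for _ in range(weight)]
        return cycle(slots)

    # The manual's example: WAN 1 weighted 1, WAN 2 weighted 3
    schedule = wrr_schedule({"WAN1": 1, "WAN2": 3})
    for t in range(5):
        print(f"At t={t}, {next(schedule)} transmits")
    # Prints: WAN1 at t=0, WAN2 at t=1 through t=3, then WAN1 again at t=4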

So what am I doing wrong here? In theory, with the Priority and Weighted values for WAN2 set so high, any packet that is not already part of an established flow started by inbound traffic on WAN1 should be sent out WAN2 nine times out of ten. In fact, this is exactly the behavior we want out of the OptiQroute: if a packet is not already part of an established inbound flow, send it out WAN2. WAN1 should essentially be used just for serving our /24 public network segment.
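
Incidentally, if the Weighted value really were a straight ratio between the two ports, the split should be far more lopsided than nine in ten; a quick back-of-the-envelope check (my arithmetic, not anything from the manual):

    # Expected split assuming Weighted behaves as a simple ratio between ports
    w_wan1, w_wan2 = 10, 65535
    share_wan2 = w_wan2 / (w_wan1 + w_wan2)
    print(f"Expected share of new outbound flows on WAN2: {share_wan2:.2%}")
    # ~99.98% -- which makes the observed 1-2Mbps on WAN2 look like the
    # weights are not being honored at all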

How can I configure the OptiQroute to get this behavior?

1 Answer


I dislike answering my own question but I did some more digging in the OptiQroute's documentation and discovered the magic settings.

The OptiQroute does something called 'HyperNAT'. It is not completely clear whether this is actually Network Address Translation; as far as I can tell it is just forwarding and redirection, but the actual implementation is pretty opaque. Regardless, the OptiQroute supports multiple means of "redirection" when more than one interface is involved. These settings are under Configuration -> NAT -> Scheduling-method for HyperNAT.

The default is 'auto-learning', which completely ignores the Priority and Weighted settings. Auto-learning "chooses a WAN port with the highest unused downstream bandwidth and a feedback threshold under 66% (default)." There are separate scheduling algorithms that use the Priority and Weighted settings, the latter being Weighted Round Robin. I decided to use Weighted Round Robin, as it seemed closest to my goal of spreading traffic across both WAN ports while "preferring" the WAN2 connection.
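
For the curious, this is roughly how I read the auto-learning description; the actual algorithm is undocumented, so treat this as my interpretation sketched out in Python, with every field name invented:

    # Rough guess at the 'auto-learning' pick as I read the docs; illustration only.
    def auto_learning_pick(wans, feedback_threshold=0.66):
        # wans: {name: {"capacity": downstream bps, "used": downstream bps, "feedback": 0..1}}
        candidates = [
            (name, w["capacity"] - w["used"])
            for name, w in wans.items()
            if w["feedback"] < feedback_threshold   # "feedback threshold under 66%"
        ]
        if not candidates:
            return None
        # "highest unused downstream bandwidth" -- Priority and Weighted never enter into it
        return max(candidates, key=lambda c: c[1])[0]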

With the Weights set at WAN1/10 and WAN2/100, almost all of our traffic leaves through the WAN2 side. Using sFlow to look at the protocols, the WAN1 connection essentially sits around 1Mbps handling all of our HTTPS and DNS traffic, with occasional bursts up to 2.5Mbps when email is sent or received; our VPN access is also mixed in there. I'm still playing around with less aggressive weight settings, but even with fairly limited HTTP traffic (which makes up about 80% of our outgoing traffic) going over the WAN1 link, it looks like we can still saturate that connection when email kicks off. Moving to a less aggressive weighting of 10/50 pushes a fairly constant 1Mbps of traffic (mostly HTTP) over the WAN1 link.

There's no specific documentation on the nature of the Weight settings. I can't tell whether the values are absolute or relative; that is, is the important value the ratio between the WAN interfaces' Weights (e.g., 1:10), or is each Weight evaluated independently as an absolute value? The documentation is not helpful here, but my experimentation leads me to believe the Weights are evaluated proportionally to each other as ratios. I eventually settled on WAN1/10 and WAN2/60 for the Weight settings.
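
Assuming the ratio interpretation is right, the 10/60 setting should split new outbound flows roughly like this (again, my arithmetic rather than anything from the documentation):

    # If the Weights are proportional, WAN1/10 and WAN2/60 split new flows 1:6
    weights = {"WAN1": 10, "WAN2": 60}
    total = sum(weights.values())
    for wan, w in weights.items():
        print(f"{wan}: {w}/{total} = {w / total:.1%} of new outbound flows")
    # WAN1: 10/70 = 14.3%
    # WAN2: 60/70 = 85.7%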

The other magic setting is under Configuration -> Interface -> WAN -> 'Smart Outgoing'. If a WAN connection has reached its specified maximum bandwidth, this pushes traffic to another WAN interface. Without it enabled (and with aggressive weight settings), the OptiQroute will just completely saturate the 4Mbps available on the WAN2 side. Instantly.
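
For completeness, my mental model of how Weighted Round Robin and Smart Outgoing interact looks something like the sketch below; this is how I understand the overflow behavior, not FiberLogic's actual code:

    # Weighted selection with a bandwidth-cap spillover: follow the weighted
    # schedule, but overflow once a WAN hits its configured maximum bandwidth.
    def pick_wan(schedule, usage_bps, max_bps):
        # schedule: an itertools.cycle of WAN names (like the earlier WRR sketch)
        first_choice = next(schedule)              # WAN the weighting would pick
        if usage_bps[first_choice] < max_bps[first_choice]:
            return first_choice
        for wan in max_bps:                        # at its cap: use any WAN with headroom
            if usage_bps[wan] < max_bps[wan]:
                return wan
        return first_choice                        # everything saturated; no better option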

Tubes unclogged.


WAN1: [RRDtool graph of WAN1 traffic, fixed]

WAN2: [RRDtool graph of WAN2 traffic, fixed]