
As I understand it (based on the kernel docs), setting "arp_filter=1" necessitates the use of a source-based routing policy to allow multiple interfaces to route traffic, potentially (but not necessarily) between disconnected network segments.
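
If it helps to make the setup concrete, arp_filter is toggled through the standard sysctl paths (the interface name here is only an example):

sysctl -w net.ipv4.conf.all.arp_filter=1
# or per interface:
sysctl -w net.ipv4.conf.eth0.arp_filter=1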

QUESTION 1) Is "arp_filter=1" a requirement for utilizing the bandwidth of multiple links?

Anyway, for load balancing what I did was to use a default route with multiple nexthops. If I then run "ip route get [addr]" for an address outside the local subnet, the output suggests that load balancing across multiple nexthops works:

[root@localhost ~]# ip route get 10.20.44.100
10.20.44.100 via 10.20.30.1 dev eth2  src 10.20.30.42
    cache
[root@localhost ~]# ip route get 10.20.44.100
10.20.44.100 via 10.20.30.1 dev eth3  src 10.20.30.43
    cache
[root@localhost ~]# ip route get 10.20.44.100
10.20.44.100 via 10.20.30.1 dev eth2  src 10.20.30.42
    cache
[root@localhost ~]# ip route get 10.20.44.100
10.20.44.100 via 10.20.30.1 dev eth0  src 10.20.30.40
    cache
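
For reference, the multipath default route behind the output above can be installed with something along these lines (gateway and interface names taken from the transcripts; equal weights assumed):

ip route replace default \
        nexthop via 10.20.30.1 dev eth0 weight 1 \
        nexthop via 10.20.30.1 dev eth1 weight 1 \
        nexthop via 10.20.30.1 dev eth2 weight 1 \
        nexthop via 10.20.30.1 dev eth3 weight 1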

If, however, I perform the same experiment with an address on the local subnet, then only the default interface is used for outgoing traffic:

[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 dev eth0  src 10.20.30.40
    cache
[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 dev eth0  src 10.20.30.40
    cache
[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 dev eth0  src 10.20.30.40
    cache
[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 dev eth0  src 10.20.30.40
    cache
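
For context, the main table's entries covering this subnet can be listed as follows (the kernel adds one connected, scope-link route per interface; the output is sketched from the addresses above):

ip route show 10.20.30.0/24
# roughly one entry per interface, e.g.:
# 10.20.30.0/24 dev eth0 proto kernel scope link src 10.20.30.40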

QUESTION 2) Is there any way to get more information about which routing table this route comes from, and which rule was used to get there?
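
(For what it's worth: newer iproute2 versions, 4.13 or so if I recall correctly, support a fibmatch flag that prints the FIB entry that matched rather than the fully resolved route, which gets part of the way there:

ip route get fibmatch 10.20.30.100

It still doesn't name the rule that was used, though.)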

Now, my guess was that the failure to load-balance was caused by the link-scope routes in the main routing table matching before the default route could be considered. I tested this theory by inserting the same multiple-nexthop default route into a table that matches before the main routing table:
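
Reconstructed from the state shown below, the setup commands were roughly:

ip rule add to 10.20.30.0/24 table 100 priority 32761
ip route add default table 100 \
        nexthop via 10.20.30.1 dev eth0 weight 1 \
        nexthop via 10.20.30.1 dev eth1 weight 1 \
        nexthop via 10.20.30.1 dev eth2 weight 1 \
        nexthop via 10.20.30.1 dev eth3 weight 1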

[root@localhost ~]# ip rule
0:      from all lookup local
32761:  from all to 10.20.30.0/24 lookup 100
...
32766:  from all lookup main
32767:  from all lookup default
[root@localhost ~]# ip route show table 100
default
        nexthop via 10.20.30.1  dev eth0 weight 1
        nexthop via 10.20.30.1  dev eth1 weight 1
        nexthop via 10.20.30.1  dev eth2 weight 1
        nexthop via 10.20.30.1  dev eth3 weight 1

And right or wrong, it produced the correct results: my traffic was now "load balanced" (not perfectly, but still) across all interfaces:

[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 via 10.20.30.1 dev eth0  src 10.20.30.40
    cache
[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 via 10.20.30.1 dev eth3  src 10.20.30.43
    cache
[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 via 10.20.30.1 dev eth2  src 10.20.30.42
    cache
[root@localhost ~]# ip route get 10.20.30.100
10.20.30.100 via 10.20.30.1 dev eth0  src 10.20.30.40
    cache

The biggest problem I see here is that, as far as I can tell, I have now made it so that traffic that would otherwise never have gone past my local switch is always routed via the gateway (possibly multiple hops toward my default route).

QUESTION 3) Is there a simpler/better way to manage inter- and intra-subnet load-balancing, other than using the extra default-route table?

1 Answer


I can only comment on question 3. You should look at Ethernet bonding (802.3ad). That should be much simpler than IP-based load balancing.
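
A minimal sketch with iproute2, assuming the interface names from the question, reusing eth0's address for the bond, and a switch configured for LACP on those ports:

# Create an 802.3ad (LACP) bond and enslave the physical NICs.
ip link add bond0 type bond mode 802.3ad
for dev in eth0 eth1 eth2 eth3; do
    ip link set "$dev" down        # slaves must be down before enslaving
    ip link set "$dev" master bond0
done
ip addr add 10.20.30.40/24 dev bond0
ip link set bond0 up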

Tero Kilkanen