BGP preferred outbound peer for given prefix when none of my peers directly announce the prefix (it's handled by the Default route)

Question

I'm connected to 2 BGP peers; they both give me a filtered table that includes a default + about 30K prefixes each (not a full table). For prefixes that I do receive, I simply let BGP use its prefix-length algorithm to select the best route. For prefixes that I don't receive (i.e., the "default" applies) I favor BGP peer 1. The problem is, for one particular prefix that I don't receive from either BGP peer (i.e., the default would apply), I'd like to favor BGP peer 2, not the usual peer 1. I know I could easily do this using a static route, but that doesn't sound quite right, because if I add a static route towards peer 2's router and the route drops, the static route would "stick" and I would not be able to push any traffic to that prefix. If I can find a BGP mechanism to do the same, I can prefer a route to BGP peer 2, but if that route is not available, the route to BGP peer 1 would be used.

To manipulate my outbound routes I'm using route-maps that set an equal local-preference for outbound traffic towards received prefixes, so BGP's default prefix-length algorithm is used, and I'm setting a higher preference for outbound traffic using the "default" route for BGP peer 1.

Unfortunately I have no idea how to use a route-map to do this for my special prefix, because no object with that prefix exists in the system: as far as both BGP peers are concerned, it is handled by the "default" route.

I'm using Quagga for my routing, so BGP is not the only available protocol. A Cisco-style solution would be fine as well, since I suspect I'm lacking some basic knowledge and any nudge in the right direction would help me find my way.

Here's my bgpd.conf file, edited to remove personal information; hopefully I didn't overdo it:

router bgp 12345
 bgp router-id 10.0.0.1
 network 10.0.0.0/24
 redistribute connected
 !
 neighbor 10.0.1.1 remote-as 22222
 neighbor 10.0.1.1 ebgp-multihop 3
 neighbor 10.0.1.1 next-hop-self
 neighbor 10.0.1.1 distribute-list distrib-out out
 neighbor 10.0.1.1 route-map INBGP1 in
 !
 neighbor 10.0.2.1 remote-as 11111
 neighbor 10.0.2.1 ebgp-multihop 10
 neighbor 10.0.2.1 next-hop-self
 neighbor 10.0.2.1 distribute-list distrib-out out
 neighbor 10.0.2.1 route-map INBGP2 in
!
access-list distrib-out permit 10.0.0.0/24
!
access-list is-default permit 0.0.0.0/0 exact-match
!
route-map INBGP2 permit 10
 set metric 2
 set ip next-hop 89.121.231.73
 on-match next
!
route-map INBGP2 permit 20
 match ip address is-default
 set local-preference 101
 on-match goto 1000
!
route-map INBGP2 permit 30
 set local-preference 110
 set metric 1
!
route-map INBGP1 permit 10
 set metric 1
 on-match next
!
route-map INBGP1 permit 20
 match ip address is-default
 set local-preference 200
 on-match goto 1000
!
route-map INBGP1 permit 30
 set local-preference 110
 set metric 1

David Schwartz · Accepted Answer · 2012-01-12T07:19:03.130

3

Add a static route with a next hop that goes to peer 2 only because of a route to peer 2. That way, so long as you are receiving that route, the static route will point to peer 2. But the second you lose that route, your existing failover will flip it.

Pick any route that goes to peer 2 but that will go away if you cannot reach peer 2. Then add a static route for the traffic you want to cover with a next hop covered by the route you picked. That will cause traffic that matches the second route to track the first route.

For example, say the prefix you want to control traffic to is 216.152.32.0/24. You craft a static route to 216.152.32.0/24 and you choose its next hop. Since it's a static route, traffic to 216.152.32.0/24 will be routed towards that next hop (assuming there's no more-specific route, which there won't be). So that reduces the problem to choosing an appropriate next hop.

You want the traffic to go one way when the link to peer 2 is up and the other way when that link isn't working. So you need to pick a next hop that has that property. In principle, any IP inside a route you dynamically receive from peer 2 will work. That traffic will go to peer 2 when you have that route and will follow your default to peer 1 when you don't. (Assuming your default is properly set up to failover.)

Ideally, pick a route that's "core" to peer 2, but not too close to your point of connection. You want it to be "core" to peer 2 because you don't want it to flip over to peer 1 while peer 2 is healthy. You don't want it too close to you because if the node you connect to gets isolated from the rest of peer 2's network, you want traffic to fail over away from peer 2. If you do a traceroute to a few random sites, you may be able to find a core router in a nearby city, on their backbone. That will do.
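
To make the "static route tracks a dynamic route" idea concrete, here is a minimal Python sketch of longest-prefix matching with recursive next-hop resolution. It is only a model of the routing decision, not Quagga or kernel code. The peer addresses come from the question's config and `1.2.3.4` from the comments below; the covering prefix `1.2.0.0/16` is a hypothetical route assumed to be learned only from peer 2.

```python
import ipaddress

# Directly connected BGP neighbors (addresses taken from the question's config).
PEERS = {"10.0.1.1": "peer 1", "10.0.2.1": "peer 2"}


def build_table():
    """Toy routing table mapping prefix -> next hop."""
    return {
        # Default route, preferred via peer 1.
        ipaddress.ip_network("0.0.0.0/0"): "10.0.1.1",
        # Hypothetical prefix assumed to be received only from peer 2.
        ipaddress.ip_network("1.2.0.0/16"): "10.0.2.1",
        # Static route for the special prefix; its next hop sits inside
        # the prefix learned from peer 2.
        ipaddress.ip_network("216.152.32.0/24"): "1.2.3.4",
    }


def lookup(table, dest):
    """Longest-prefix match: return the next hop of the most specific route."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def resolve_exit(table, dest, peers, max_depth=8):
    """Follow next hops recursively until one is a directly connected peer."""
    hop = dest
    for _ in range(max_depth):
        hop = lookup(table, hop)
        if hop is None:
            return None  # no route at all
        if hop in peers:
            return peers[hop]
    return None  # gave up: resolution did not reach a peer
```

With the table from `build_table()`, `resolve_exit(table, "216.152.32.10", PEERS)` resolves through `1.2.3.4` to peer 2. Deleting the `1.2.0.0/16` entry, as would happen when peer 2's announcement is withdrawn, makes the same call fall back to the default and return peer 1. Whether a given router actually resolves a static next hop recursively like this depends on the platform, as the comments below discuss.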

edited Jan 12 '12 at 07:19

answered Jan 11 '12 at 07:12

David Schwartz


How do I add a static route "only because of a route to peer 2"? Sounds good in concept, but don't know how to do that. Can you provide an example, or some other pointer to get me going? – Cosmin Prund Jan 11 '12 at 07:27
@CosminPrund In the Cisco world you'd use SLA tracking to have the static route depend on the link's state - not sure how to get this done with Quagga, though. – Shane Madden Jan 11 '12 at 21:17
@CosminPrund Say peer 2 has a core router with IP address `1.2.3.4`. If you make the next hop on the static route `1.2.3.4`, traffic that matches that route will track traffic to `1.2.3.4`. That will be to peer 2 if peer 2 is connected and will take the default if peer 2 isn't. Just find the route you want it to track and set the next hop to an IP covered by that route. – David Schwartz Jan 11 '12 at 21:26
@ShaneMadden, adding a route that depends on the link state is easy, but the physical link and the availability of the BGP peer are two different things. For me one peer is "multihop", the other one installed a managed switch on site. In both cases, the BGP may die yet the physical link state would stay up. – Cosmin Prund Jan 12 '12 at 06:58
That's why I recommend tracking a route rather than a link. It's still not perfect, but it will never be perfect if you don't take full BGP tables. (And, at least arguably, not even then.) – David Schwartz Jan 12 '12 at 07:01
DavidSchwartz, the Linux router will reject a static route with a next-hop it doesn't understand, so the route to the next-hop needs to be static as well. If that's the case, we're down to @Shane's solution of linking the static route to the physical link state, and that's, in my opinion, sub-optimal and a tad dangerous. – Cosmin Prund Jan 12 '12 at 07:02
@CosminPrund You want it to reject the route if it doesn't understand the next-hop. That's the whole point. The OP wants that behavior: "If I can find a BGP mechanism to do the same, I can prefer a route to BGP peer 2, but if that route is not available, the route to BGP peer 1 would be used." – David Schwartz Jan 12 '12 at 07:07
David, could you maybe provide an actual example of "adding a static route that's covered by a BGP received route?". It doesn't matter if it's Cisco/IOS syntax or anything else, I just can't quite grasp the concept. I'm stuck with the notion of static routes having a known and accessible next hop, but at the time the static routes are inserted there's no BGP. Are you suggesting identifying a server that's known to be on Peer 2's network and artificially adding a route with that IP as the next hop? – Cosmin Prund Jan 12 '12 at 07:15
@Cosmin Let me clarify; I didn't mean that link state should be used. SLA tracking in the Cisco umbrella verifies IP reachability of a specified host (say, your direct BGP peer or other router of that ISP). Take a look [here](http://www.cisco.com/en/US/docs/ios/12_3/12_3x/12_3xe/feature/guide/dbackupx.html). – Shane Madden Jan 12 '12 at 07:17
See my updates to the answer. (Also, since you have a default route, how can the next hop ever not be understood?! You have a defined route to *every* possible IP address.) – David Schwartz Jan 12 '12 at 07:19
I'm always thinking of what happens on a restart. The default route comes from BGP, and on restart I will have no default until BGP is up. Besides the Linux kernel rejects multi-hop next-hop routes by default (I assume there's a way to change that, but no reason to do so). This means that by default I can only have a route with a next-hop on a network I'm directly connected to. I'm going to implement something along the lines of Cisco's "IP SLA" and I'm going to use that to insert and remove my routes as appropriate; I need something like that for other forms of "magic" any way. Thank you. – Cosmin Prund Jan 13 '12 at 08:32
Does it really do that?! That's **seriously** broken. Say I know `10.1.1.0/24` is handled by the router with loopback IP address `10.1.2.1/32`, but my route to that router `10.1.2.1` is through 2 routers which may or may not always be both up. Rejecting the route just because its next hop is dynamically learned and indirect makes *no* sense. It's not unusual at all to know that a particular router handles a particular route by its loopback address not on any physical network. You want traffic to go to that router regardless of how this machine reaches that router. – David Schwartz Jan 13 '12 at 08:41
I'm not sure it's all that broken, and I'm sure the behavior can be changed. The thing is, the IP packet's header itself only has Source and Destination IP addresses: At the IP layer there's no way to say "send this packet to that IP so it will re-send it to that IP". If you're not directly connected to the gateway router your packet needs to do Source -> Router1 -> Router2, where Router1 is guessed because it's the router that handles the route for Router2. Might as well re-write the rule to specify Router1 as the gateway. – Cosmin Prund Jan 13 '12 at 09:19
@CosminPrund That still won't work. Even if you do specify Router1 as the gateway, it won't be *directly* reachable because routers typically have /32 addresses. You don't just have to know the router's address, you have to know the path to it, which won't work if the path is dynamic. Say you have a router with an interface in net1 and an interface in net2, either of which may be up or down. You don't want to use either net interface as the next hop, you want to use the networkless loopback address. It's really broken. – David Schwartz Jan 13 '12 at 15:49
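
The "IP SLA"-style approach mentioned in the comments above (probe a host inside peer 2's network, then insert or remove the static route accordingly) could be sketched roughly as follows. This is an assumption-laden illustration, not a tested tool: the probe target `1.2.3.4`, the gateway `192.0.2.1` (standing in for a directly connected next hop towards peer 2) and the thresholds are all hypothetical, and it relies on the Linux `ping` and `ip route` commands being available.

```python
import subprocess
import time


def next_state(installed, recent_probes, up_after=3, down_after=3):
    """Decide whether the route should be installed, with simple hysteresis.

    recent_probes is a list of booleans, oldest first.
    """
    if (not installed and len(recent_probes) >= up_after
            and all(recent_probes[-up_after:])):
        return True   # enough consecutive successes: install the route
    if (installed and len(recent_probes) >= down_after
            and not any(recent_probes[-down_after:])):
        return False  # enough consecutive failures: withdraw the route
    return installed  # otherwise keep the current state


def route_command(install, prefix, gateway):
    """Build the iproute2 command that installs or removes the route."""
    if install:
        return ["ip", "route", "replace", prefix, "via", gateway]
    return ["ip", "route", "del", prefix]


def probe(target, timeout_s=1):
    """Send one ICMP echo with the Linux ping utility; True on a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), target],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def watch(prefix="216.152.32.0/24", gateway="192.0.2.1",
          target="1.2.3.4", interval_s=5):
    """Probe forever and keep the static route in sync (needs root)."""
    installed, history = False, []
    while True:
        history = (history + [probe(target)])[-10:]
        wanted = next_state(installed, history)
        if wanted != installed:
            subprocess.run(route_command(wanted, prefix, gateway), check=False)
            installed = wanted
        time.sleep(interval_s)
```

Nothing runs until `watch()` is called. Note that the probe itself must be forced out through peer 2 (for example with its own host route), otherwise it would succeed via the default through peer 1 and the check would prove nothing.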

Jeff Loughridge · Answer 2 · 2012-01-12T00:12:27.630

There is a straightforward solution to your problem that doesn't involve static routes. Before delving into it, I want to touch on one semantic aspect in your description. The longest prefix match rule has nothing to do with BGP. You should be thinking about the BGP path selection algorithm. Every vendor has its own tweaks on this algorithm; however, most are very similar to Cisco's algorithm.

You have complete control over traffic in the direction from your site to the ISP. You can use local preference or Multi-Exit-Discriminators (MED) to influence routing in this direction. I recommend using one or the other for simplicity's sake. Implement this policy on prefixes the ISP sends to you (i.e., your inbound policy).

1) Local Preference - The highest local preference is preferred (default local preference is 100). Local preference is evaluated early in BGP best path selection, before AS_PATH length. Your provider cannot send you prefixes with a local preference; local preference does not cross AS boundaries.

2) MED - The lowest MED is preferred. Your provider may send MEDs on its prefixes. On Cisco routers, a missing MED is treated as MED = 0 unless you enable `bgp bestpath med missing-as-worst`. MEDs are evaluated after AS_PATH length in the BGP path selection algorithm.
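
As a rough illustration of the ordering described in the two points above, the following Python sketch compares candidate paths for a single prefix using only three attributes. It is a deliberate simplification: real implementations apply more tie-breakers, and MED is normally compared only between paths from the same neighboring AS. The `Path` class and its field names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One candidate path for a prefix (illustrative fields only)."""
    peer: str
    local_pref: int = 100   # default local preference
    as_path_len: int = 1
    med: int = 0            # a missing MED is treated as 0 here


def best_path(paths):
    """Highest local preference, then shortest AS_PATH, then lowest MED."""
    return min(paths, key=lambda p: (-p.local_pref, p.as_path_len, p.med))
```

For example, `best_path([Path("peer 1", local_pref=200), Path("peer 2", local_pref=101)])` selects the peer 1 path, which mirrors how the question's route-maps make the default via peer 1 win.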

You may want to read Cisco's BGP case studies. I also highly recommend the book Internet Routing Architectures by Sam Halabi for detailed info on BGP and its usage.

That would only work if he has two BGP routes for the same prefix and wants to control which one is preferred. That's not his situation -- he has no BGP route for the prefix. — David Schwartz, Jan 12 '12 at 02:56
@DavidSchwartz Thanks for correcting. I didn't interpret his question correctly. I'll leave my answer in case folks are seeking a more general solution for influencing traffic toward the ISP. — Jeff Loughridge, Jan 12 '12 at 13:56
