Sunday, December 20, 2020

BGP Nuts - The Crazy Metric Story



I recently ran into an interesting BGP behavior. To convey the issue, I have oversimplified the design.

R2 (AS 100) has an attached host network, 22.22.22.22/32, which is trying to reach 77.77.77.77/32, a host network attached to R7 (AS 200).

As you can see, between AS 100 & AS 200 we have multiple exit points for the purpose of redundancy. However the interesting twist here is that those BGP next hops are learned from different IGPs.

54.0.0.5 <- Learned from EIGRP
63.0.0.6 <- Learned from OSPF

Given the scenario, which BGP route and next hop do you think would make it into the routing table?

Well... most likely the answer would be the BGP next hop learned from EIGRP, assuming that (1) "no auto-summary" is now the default for EIGRP, so both next-hop routes have the same prefix length, and (2) EIGRP has an AD of 90 compared to an OSPF intra-area route, which has an AD of 110. So, based on the theory of how a router processes this information, the EIGRP-learned next hop (54.0.0.5) should be the one the BGP Best Path Selection Algorithm picks as the best route over the OSPF-learned next hop (63.0.0.6).

All good? ... Let's look at the BGP RIB to find out, and to see how the IGP next hops are processed under the standard BGP next-hop processing that we know from BGP theory. Let's also run a quick traceroute towards 77.77.77.77, sourced from 22.22.22.22 on R2.



Not sure if you expected that... right? :)

From the BGP RIB standpoint, we can clearly see that the BGP route making it into the routing table and FIB is the one with the OSPF-learned BGP next hop.

Here is what our OSPF database looks like



Let's do a basic failover test by shutting down interface g6/0 on R3 




So things work fine and as expected here. Let's "un-shut" the interface in order to dig deeper into the (slightly unexpected) behavior.




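So why this behavior? Well, this is where BGP plays a little dumb. Administrative distance never comes into play here, because AD only decides between routing sources that offer the same prefix, and the two next hops are different prefixes, each installed by its own IGP. Both BGP paths therefore have a reachable next hop, and the best-path algorithm falls through to its "lowest IGP metric to the BGP next hop" step. There BGP ends up comparing raw metrics across the two IGPs, since it has no view of the EIGRP topology table or the OSPF link-state database. If you go back and check the IGP metrics for the next hops, the metric of the EIGRP route is numerically much higher, 3072, versus 2 for the OSPF route (refer to the first CLI screenshot in the post).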
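The comparison described above can be sketched in a few lines of Python. This is only a toy model, not router code: the `BgpPath` class and the `prefer_lowest_igp_metric` function are hypothetical names, and the sketch assumes every earlier best-path attribute (weight, local preference, AS path length, origin, MED and so on) is already tied between the two paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpPath:
    next_hop: str      # BGP next hop address
    igp_protocol: str  # IGP that resolved the next hop (informational only)
    igp_metric: int    # raw metric of the route used to reach the next hop


def prefer_lowest_igp_metric(paths):
    """Return the path whose next hop has the lowest raw IGP metric.

    The protocol name (and therefore its administrative distance) is
    deliberately ignored, mirroring the behavior described in the post.
    """
    return min(paths, key=lambda p: p.igp_metric)


# Metrics taken from the post: EIGRP 3072, OSPF 2.
paths = [
    BgpPath("54.0.0.5", "EIGRP", 3072),
    BgpPath("63.0.0.6", "OSPF", 2),
]
best = prefer_lowest_igp_metric(paths)
print(best.next_hop, best.igp_protocol)  # 63.0.0.6 OSPF
```

Note that `igp_protocol` never enters the decision. Only the number does, which is why a metric of 2 beats a metric of 3072 even though the two values come from unrelated metric scales.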

Now, what if we make the OSPF metric bigger than the EIGRP one by manually increasing the OSPF cost, to see whether that lets R2 install the path with the EIGRP-learned next hop as the best route?





And yes, it works as expected this time too :)

Now, for fun, let's make the metrics look the same by again changing the OSPF cost manually.


Well, that works as expected too. Now the interesting question is: since the IGP metrics learned from OSPF and EIGRP look the same, can we turn on "iBGP Multipath" and achieve load balancing?

Let's try that 


Well, that works too, and as expected.
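Here is a rough Python sketch of why equal metrics make multipath possible. It is a simplification under stated assumptions: `select_multipaths` is a hypothetical helper, all other BGP attributes are assumed to match already, and the equalized value of 3072 is only illustrative, since the post does not state the exact cost that was configured.

```python
def select_multipaths(paths, maximum_paths=2):
    """Pick next hops eligible for installation under a simplified rule.

    paths: list of (next_hop, igp_metric) tuples.
    A path qualifies only if its IGP metric equals the best (lowest) one.
    """
    if not paths:
        return []
    best_metric = min(metric for _, metric in paths)
    eligible = [hop for hop, metric in paths if metric == best_metric]
    return eligible[:maximum_paths]


# Original metrics from the post: only the OSPF-resolved next hop qualifies.
unequal = [("54.0.0.5", 3072), ("63.0.0.6", 2)]
# Hypothetical equalized metrics after tuning the OSPF cost.
equal = [("54.0.0.5", 3072), ("63.0.0.6", 3072)]

print(select_multipaths(unequal))  # ['63.0.0.6']
print(select_multipaths(equal))    # ['54.0.0.5', '63.0.0.6']
```

As in the single-path case, the check looks only at the numbers, so two next hops resolved by different IGPs can still be treated as equal cost.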

One problem with this design, in case you haven't noticed yet: assume the path through OSPF has many more hops before you reach the OSPF-learned exit point of the AS. Since OSPF costs are much lower values than EIGRP metrics, you will end up taking a sub-optimal path from R2's standpoint, which may not be the desired behavior. Alternatively, you can replace EIGRP with IS-IS and, by default, you will still end up following the OSPF-learned BGP next hop, since in most IOS versions 10 is the default IS-IS metric for each interface hop. Obviously, in practice you should try to run a single IGP across AS 100 to avoid such issues, though M&A scenarios are always interesting and challenging. :)
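To get a feel for how far apart these metric scales are, here is a small Python sketch. It uses the classic EIGRP composite metric with default K values, the default OSPF reference bandwidth of 100 Mbps, and the default IS-IS metric of 10 per hop. The link types are an assumption: the post does not list them, but two Gigabit Ethernet hops with default delay happen to reproduce the 3072 and 2 seen earlier. All function names are hypothetical.

```python
GIGE_KBPS = 1_000_000   # Gigabit Ethernet bandwidth in kbps
GIGE_DELAY_USEC = 10    # default Gigabit Ethernet interface delay


def eigrp_classic_metric(min_bw_kbps, total_delay_usec):
    """Classic EIGRP metric with default K values (K1 = K3 = 1)."""
    return 256 * (10**7 // min_bw_kbps + total_delay_usec // 10)


def ospf_cost(link_bw_kbps, ref_bw_kbps=100_000):
    """Sum of per-link OSPF costs; each link costs at least 1."""
    return sum(max(1, ref_bw_kbps // bw) for bw in link_bw_kbps)


def isis_metric(hops, per_hop=10):
    """IS-IS path metric when every interface keeps the default metric."""
    return hops * per_hop


eigrp = eigrp_classic_metric(GIGE_KBPS, 2 * GIGE_DELAY_USEC)
ospf = ospf_cost([GIGE_KBPS] * 2)
isis = isis_metric(2)
print(eigrp, ospf, isis)  # 3072 2 20

# Number of GigE hops the OSPF path needs before its cost exceeds EIGRP's.
hops_needed = next(n for n in range(1, 10_000)
                   if ospf_cost([GIGE_KBPS] * n) > eigrp)
print(hops_needed)  # 3073
```

Under these assumptions, the OSPF-resolved next hop would keep winning until its path was thousands of hops long, and a two-hop IS-IS path at 20 would still lose to OSPF at 2.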

Try this with "DMZ Link-BW" for Unequal Cost Multi-Path and I am pretty sure it will be fun. The BGP AIGP attribute is another interesting bit, if you care, depending upon the design.

And of course, the above-mentioned logic doesn't apply to the following design, where a single exit point is reachable through both IGPs.




To learn more about BGP anomalies and somewhat unpredictable behavior:



And if you want to master BGP from a design standpoint, I would highly recommend the "BGP Zero to Hero Design Masterclass" from my friend Orhan Ergun.



HTH...
A Network Artist

1 comment:

William said...

Thanks

I recently had to start learning more about BGP and the way it acts. However, I do not use Cisco, though I was Cisco certified. I use Mikrotik and Juniper. Interesting read. Well explained!