In the first part of this series, we started with a rather simple decomposition model to get a bit more insight into routing protocol internals.
Let's continue the series by expanding on the very first layer in our model - Peering Management.
In any routing protocol, before we get fancy about which features, functions and knobs to use, the very basic requirement is to peer with other devices in the network, since a routing protocol is ultimately nothing but a distributed database. Routing protocols are designed to convey different sets of information to their peer devices, such as:
- Topology Information
- Reachability Information
- Policy Information
The type of information that a given routing protocol exchanges with its peers largely depends on the protocol itself (OSPF, IS-IS, BGP) and on where that protocol is used in the network (Campus, WAN, Metro-E, DC), as implementation specifics do change.
As you may notice, there are a lot of things working behind the scenes when it comes to peering in a routing protocol context. But don't get carried away by the complexity. Once you look closely, all of these pieces make sense.
So let's start with the top row, reading it from left to right.
Self Identity - Before the routing protocol determines whom it needs to communicate with and what information needs to be exchanged, it must find its own identity first. The most common way to give an identity to the routing protocol instance itself is assigning it a router id (RID). The RID can be configured manually or it can be derived automatically depending upon the platform and NOS.
In real life, with network virtualization being much more common today, you can configure a unique RID for each routing protocol as well as a unique RID for each instance/process under the same protocol.
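To make the automatic derivation a bit more concrete, here is a minimal sketch of how an implementation might pick a RID. The precedence used (manual value first, then the highest loopback address, then the highest address on any other interface) follows a convention seen on several platforms, but treat it as an assumption for illustration. The `Interface` type and `select_router_id` function are hypothetical names, not any vendor's API.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Optional


@dataclass
class Interface:
    name: str
    address: IPv4Address
    is_loopback: bool = False
    is_up: bool = True


def select_router_id(interfaces: list[Interface],
                     configured: Optional[str] = None) -> IPv4Address:
    """Pick a RID: manual value first, then loopbacks, then any other interface."""
    if configured is not None:
        return IPv4Address(configured)
    up = [i for i in interfaces if i.is_up]
    loopbacks = [i.address for i in up if i.is_loopback]
    if loopbacks:
        return max(loopbacks)
    others = [i.address for i in up]
    if not others:
        raise ValueError("no usable address; configure a RID manually")
    return max(others)


interfaces = [
    Interface("eth0", IPv4Address("192.0.2.1")),
    Interface("lo0", IPv4Address("10.255.0.1"), is_loopback=True),
]
rid = select_router_id(interfaces)  # the loopback address wins over eth0
```

Preferring a loopback keeps the identity stable even when a physical interface flaps, which is one reason a manually configured RID is often recommended.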
Protocol Addressing - Once the routing protocol is able to define its identity with a RID, the next step is to understand its addressing. The addressing serves many purposes, but the most basic one is to provide location services from the overall network view standpoint. Protocols such as LISP attempted to separate device identity from device location, but due to limited use cases (such as mobility) and other problems, they never really took off.
Every routing protocol has its own addressing scheme, which may further impact its scaling and expected behavior if not done correctly.
Participation - The next step for the routing protocol is to determine its participating interfaces on a given device and in certain cases the entire device itself. Depending upon the design you may run into some interesting challenges though.
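One common way to decide participation is to match interface addresses against a list of enabled prefixes, similar in spirit to a "network" statement. The sketch below assumes that model; the function name and the sample interface names and addresses are hypothetical.

```python
from ipaddress import ip_interface, ip_network


def participating_interfaces(interfaces: dict[str, str],
                             enabled_prefixes: list[str]) -> list[str]:
    """Return names of interfaces whose address falls inside an enabled prefix."""
    prefixes = [ip_network(p) for p in enabled_prefixes]
    selected = []
    for name, addr in interfaces.items():
        ip = ip_interface(addr).ip
        # Compare versions first so IPv4 and IPv6 prefixes can be mixed safely.
        if any(ip.version == p.version and ip in p for p in prefixes):
            selected.append(name)
    return selected


interfaces = {
    "eth0": "10.0.12.1/30",
    "eth1": "10.0.13.1/30",
    "mgmt0": "192.168.100.5/24",
}
active = participating_interfaces(interfaces, ["10.0.0.0/16"])
```

Here the management interface stays out of the protocol because its address matches no enabled prefix. Other implementations enable the protocol per interface instead, which avoids accidental matches.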
Reliable Transport - Every routing protocol needs a reliable transport to communicate effectively with its peer devices. Reliability itself is an important aspect: while some routing protocols such as BGP rely on the existing TCP stack, others such as OSPF run directly over IP (protocol number 89) and handle acknowledgments and retransmissions themselves, and EIGRP uses its own transport protocol, namely RTP (Reliable Transport Protocol).
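When a protocol does not lean on TCP, it has to track what has been sent and what has been acknowledged on its own. The following is a minimal sketch of that idea, a per-neighbor retransmission list. It is not the actual OSPF or EIGRP mechanism, and the `ReliableSender` class and its 5-second interval are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReliableSender:
    """Tracks unacknowledged messages for one neighbor (hypothetical helper)."""
    retransmit_interval: float = 5.0
    next_seq: int = 1
    unacked: dict[int, tuple[bytes, float]] = field(default_factory=dict)

    def send(self, payload: bytes, now: float) -> int:
        """Assign a sequence number and remember the message until it is acked."""
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = (payload, now)
        return seq

    def ack(self, seq: int) -> None:
        """The neighbor confirmed receipt; stop tracking the message."""
        self.unacked.pop(seq, None)

    def due_for_retransmit(self, now: float) -> list[int]:
        """Sequence numbers whose ack has not arrived within the interval."""
        due = [s for s, (_, sent) in self.unacked.items()
               if now - sent >= self.retransmit_interval]
        for s in due:
            payload, _ = self.unacked[s]
            self.unacked[s] = (payload, now)  # restart the timer
        return due


sender = ReliableSender()
first = sender.send(b"update-1", now=0.0)
second = sender.send(b"update-2", now=0.0)
sender.ack(first)
pending = sender.due_for_retransmit(now=5.0)  # only the unacked message
```

Time is passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to test.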
Neighbor Discovery - In the next step, the protocol must discover its peer/neighbor/adjacent device (the term depends on which routing protocol you are following). Ideally the protocol should keep track of its neighbors and their relationship state, though this may or may not be implemented.
A neighbor can be configured manually or discovered dynamically, and the discovery phase itself may use unicast or multicast to reach the next hop device. Reachability to the neighbor can also be over a layer 2 or a layer 3 transport, depending on the protocol. IS-IS and many IoT/industrial protocols operate directly over layer 2, for example. In the case of eBGP, the neighbor may in fact be multiple physical hops away when multihop peering is enabled.
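Keeping track of dynamically discovered neighbors usually comes down to a table that is refreshed by hellos and aged out by a timer. Below is a minimal sketch of such a table. The `NeighborTable` class is hypothetical and the 40-second dead interval is only an illustrative default.

```python
from dataclasses import dataclass, field


@dataclass
class NeighborTable:
    dead_interval: float = 40.0
    last_seen: dict[str, float] = field(default_factory=dict)

    def on_hello(self, neighbor_id: str, now: float) -> bool:
        """Record a hello; return True if this neighbor is newly discovered."""
        is_new = neighbor_id not in self.last_seen
        self.last_seen[neighbor_id] = now
        return is_new

    def expire(self, now: float) -> list[str]:
        """Remove and return neighbors that stayed silent past the dead interval."""
        dead = [n for n, seen in self.last_seen.items()
                if now - seen > self.dead_interval]
        for n in dead:
            del self.last_seen[n]
        return dead


table = NeighborTable()
table.on_hello("10.255.0.2", now=0.0)
table.on_hello("10.255.0.3", now=0.0)
table.on_hello("10.255.0.2", now=30.0)   # refreshes the first neighbor
lost = table.expire(now=45.0)            # the second neighbor went silent
```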
Neighbor Identity - Having discovered a neighbor doesn't mean it is always an intended or legitimate neighbor. After all, someone might be spoofing, or we may end up discovering somebody completely unintentionally, and sharing any information with an unexpected neighbor makes no sense. To prevent this we can put several measures in place, such as authentication, validating the neighbor's identity (remember they also have a RID), and validating the neighbor based on the IP packet's TTL value, as in OSPF and BGP. Modern solutions such as SD-WAN usually use PKI (certificates) over a TLS/DTLS channel.
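The checks described above can be combined into a single accept/reject decision. The sketch below uses Python's standard `hmac` module for authentication and a GTSM-style rule where a directly connected sender must arrive with TTL 255. The function names, the shared key, and the allow-list of neighbor IDs are assumptions for illustration, not a real protocol's packet format.

```python
import hashlib
import hmac

EXPECTED_TTL = 255  # GTSM-style check: a directly connected sender uses TTL 255


def sign(payload: bytes, key: bytes) -> bytes:
    """Compute the authentication digest carried alongside the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def accept_packet(payload: bytes, digest: bytes, ttl: int, sender_id: str,
                  key: bytes, allowed_ids: set[str]) -> bool:
    """Accept only packets that pass the TTL, identity and authentication checks."""
    if ttl != EXPECTED_TTL:
        return False  # came from more than one hop away, or was spoofed
    if sender_id not in allowed_ids:
        return False  # not a neighbor we expect on this link
    # compare_digest avoids leaking timing information during the comparison
    return hmac.compare_digest(sign(payload, key), digest)


key = b"shared-secret"
hello = b"hello from 10.255.0.2"
ok = accept_packet(hello, sign(hello, key), 255, "10.255.0.2",
                   key, {"10.255.0.2"})
```

The cheap checks (TTL, identity) run before the cryptographic one so that obviously bogus packets are dropped with minimal work.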
Establish Session - Once the neighbor is discovered and validated, we finally establish a session with it. Depending on the protocol, we might have a single session or multiple sessions. A simple example is a network running IPv4 and IPv6 at the same time under a single routing protocol instance: some implementations exchange both IPv4 and IPv6 information over a single session, while others use a separate dedicated session for each. A long time ago there was an attempt to run multi-session BGP for MTR (Multi Topology Routing) for network virtualization use cases.
Capabilities Exchange - This is another interesting step, where the routing protocol instances running on separate devices exchange capabilities with each other to find the lowest common denominator. For example, BGP is more like an application that runs on top of TCP than a pure layer 3 routing protocol that only carries routes. Although BGP can carry layer 3 routing information, and with most vendors that has been the default behavior, it can also carry many other kinds of information depending on the use case, in the form of AFI/SAFI (address family identifiers), which essentially tell the peer how the information is encoded. For example, BGP can carry MAC address information under the Layer 2 VPN EVPN address family when enabled. With BGP, though, you have to be cautious about enabling a new capability/address family in a production network, as highlighted here.
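Finding the lowest common denominator is, at its core, a set intersection. Here is a minimal sketch using (AFI, SAFI) pairs for IPv4 unicast, IPv6 unicast and L2VPN EVPN as registered with IANA. The `negotiate` function is a hypothetical name, and real BGP implementations negotiate more than just address families.

```python
# (AFI, SAFI) pairs as registered with IANA
IPV4_UNICAST = (1, 1)
IPV6_UNICAST = (2, 1)
L2VPN_EVPN = (25, 70)


def negotiate(local: set[tuple[int, int]],
              remote: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Only address families advertised by both peers are usable on the session."""
    return local & remote


local = {IPV4_UNICAST, IPV6_UNICAST, L2VPN_EVPN}
remote = {IPV4_UNICAST, L2VPN_EVPN}
common = negotiate(local, remote)  # IPv6 unicast is dropped: the peer lacks it
```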
Establish Adjacency - Once the protocol gets this far, the peers are finally ready to exchange the information needed to populate the RIB and other details such as the topology graph. In case you are wondering in which scenario two devices would be neighbors but not adjacent, OSPF is an interesting example: on a broadcast segment, two routers that are neither DR nor BDR remain neighbors in the 2-Way state without becoming fully adjacent.
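The OSPF case can be expressed as a small decision function. This sketch follows the adjacency rule from the OSPF specification (RFC 2328): always adjacent on point-to-point style networks, and on multi-access networks only when one side is the DR or BDR. The function name and the string labels for network types and roles are assumptions for illustration.

```python
def should_become_adjacent(network_type: str, my_role: str,
                           neighbor_role: str) -> bool:
    """Decide whether a 2-Way neighbor should proceed to a full adjacency."""
    if network_type in ("point-to-point", "point-to-multipoint", "virtual-link"):
        return True
    # On broadcast/NBMA segments, adjacencies form only with the DR and BDR.
    elected = {"DR", "BDR"}
    return my_role in elected or neighbor_role in elected


# Two routers that are neither DR nor BDR stay neighbors without being adjacent.
drother_pair = should_become_adjacent("broadcast", "DROTHER", "DROTHER")
with_dr = should_become_adjacent("broadcast", "DROTHER", "DR")
```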
Messages Exchange - Finally we [assuming by now you are thinking like a routing protocol :) ] reach the stage where we start exchanging information through messages. The messages need to be sent reliably, tracked so we know which copy to prefer when the same information arrives from multiple sources, acknowledged, and so forth. The protocol must also decide how to queue and dequeue them, and at what intervals they should be sent out versus held back for a while so that multiple events can be packed together and only the latest information is sent.
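Two of these ideas, preferring the newest copy and holding updates back briefly to batch them, can be sketched together. The `UpdateBatcher` class below is hypothetical: it keeps only the highest sequence number per key and releases everything once a hold time has passed since the first queued update. Real protocols use their own freshness rules and pacing timers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UpdateBatcher:
    hold_time: float = 1.0
    pending: dict[str, tuple[int, str]] = field(default_factory=dict)
    first_queued_at: Optional[float] = None

    def enqueue(self, key: str, seq: int, value: str, now: float) -> bool:
        """Queue an update; keep only the highest sequence number per key."""
        current = self.pending.get(key)
        if current is not None and current[0] >= seq:
            return False  # stale or duplicate copy, e.g. heard via another neighbor
        if not self.pending:
            self.first_queued_at = now  # the hold timer starts with the first update
        self.pending[key] = (seq, value)
        return True

    def flush(self, now: float) -> dict[str, tuple[int, str]]:
        """Release the batch once the hold time has elapsed, else return nothing."""
        if self.first_queued_at is None or now - self.first_queued_at < self.hold_time:
            return {}
        batch, self.pending = self.pending, {}
        self.first_queued_at = None
        return batch


batcher = UpdateBatcher(hold_time=1.0)
batcher.enqueue("10.1.0.0/24", seq=1, value="up", now=0.0)
batcher.enqueue("10.1.0.0/24", seq=2, value="down", now=0.2)
batcher.enqueue("10.1.0.0/24", seq=1, value="up", now=0.3)  # stale, ignored
early = batcher.flush(now=0.5)   # still inside the hold time
batch = batcher.flush(now=1.0)   # one message carrying only the latest state
```

The trade-off is the usual one: a longer hold time packs more events into fewer messages but delays convergence.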
Further Reading:
Network Routing: Algorithms, Protocols, and Architectures
Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices
Inside Cisco IOS Software Architecture
HTH...
A Network Artist 🎨