OMP Route Advertisements

In the Cisco SD-WAN solution, WAN Edge routers advertise their local networks to the Cisco vSmart controllers using the Overlay Management Protocol (OMP). In a typical production deployment, most local networks are attached to two or more vEdge devices for redundancy and most networks get advertised from multiple devices. Additionally, each subnet is advertised as reachable via each Transport Locator (TLOC) that the WAN edge router has.

Figure 1. OMP Routing Advertisements

In the example shown in Figure 1, three vEdge routers are connected to subnet 1.2.3.0/24. The first one advertises to the vSmart controller that 1.2.3.0/24 is reachable via TLOC T1 and also via TLOC T2. In the same manner, the other two WAN edge routers advertise that subnet 1.2.3.0/24 is reachable via TLOCs T3, T4, T5, and T6. In the end, the vSmart controller has six OMP routes for prefix 1.2.3.0/24. By default, vSmart is configured to advertise only four paths for a given prefix. Therefore, it must compare all available routes for this prefix and select the best four to send to all WAN edge routers. This is done using the Overlay Management Protocol Best-Path Algorithm.

Note that the number of routes per prefix that vSmart advertises can be changed using the following configuration:

send-path-limit (1-16)

It is also possible to configure the controller to send backup paths using the following configuration:

send-backup-paths

OMP Best-Path Algorithm

vSmart controllers and vEdge routers perform the Best-Path Selection when they have multiple routes for the same prefix. Figure 2 shows the Best-Path algorithm that Cisco SD-WAN devices go through.

Figure 2. OMP Best-Path Selection Algorithm

Let's look at each step of the OMP Best-Path Selection in more detail:

  1. Prefer ACTIVE routes over STALE routes. A route is ACTIVE when there is an OMP session in UP state with the peer that sent the route. A route is STALE when the OMP session with the peer that sent the route is in GRACEFUL RESTART mode.
  2. Select routes that are Valid. Ignore invalid routes. A route must have a next-hop TLOC that is known and reachable.
  3. Prefer routes with a lower administrative distance (AD) (on vEdge only). AD is a locally significant value on each router and depends on the OS. Different platforms may have different AD values for different protocols. For example, OMP has an AD of 250 on vEdges and 251 on cEdges. Additionally, network admins can define floating static routes with various ADs for the same prefix. AD is only compared when the same WAN edge router receives the same site-local prefix from multiple routing protocols. AD is not a parameter in OMP, is not advertised, and does not influence vSmart.
  4. Prefer routes with a higher route preference value. By default, all OMP routes have a preference of 0. This is typically the attribute used most often for traffic engineering.
  5. Prefer routes with a higher TLOC preference value (on vEdge only). TLOC preference is a parameter of TLOC routes, and TLOC routes are not bound to a VPN ID. Therefore, changing the TLOC preference affects a vEdge's path selection for all VPNs.
  6. Compare the origin type, and select the first match in the following order:
    1. Connected 
    2. Static
    3. EIGRP summary
    4. EBGP 
    5. OSPF intra-area
    6. OSPF inter-area 
    7. IS-IS level 1
    8. EIGRP external
    9. OSPF external 
    10. IS-IS level 2
    11. IBGP 
    12. Unknown
  7. Compare the origin metric - If the origin type of the routes is the same, select the routes that have the lower origin metric.
  8. Tiebreaker - Prefer vEdge sourced routes over vSmart sourced. (on vSmart only)
  9. Tiebreaker - If the origin types are equal, select the routes that have the lowest router-id (System-IP).
  10. Tiebreaker - If the router IDs are the same, prefer the routes with the lowest private TLOC IP address.

ECMP - To be considered equal, OMP routes must be valid and equal-cost on every step before the tiebreakers (the green and blue steps in Figure 2). When there are more equal-cost routes than the send-path-limit value, the controller orders them from best to worst using the tiebreakers and advertises only as many as the send-path-limit allows. This is visualized in Figure 1.
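The controller-side selection described above can be sketched in a few lines of Python. This is only an illustrative model, not Cisco's implementation: the `OmpRoute` class, its field names, and the addresses are made up for the example, and the vEdge-only steps (AD and TLOC preference) are left out. It assumes that steps 1 to 7 decide whether routes are equal-cost and that steps 8 to 10 only order the equal-cost set.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

# Origin types in the order listed in step 6 (earlier = preferred).
ORIGIN_ORDER = [
    "connected", "static", "eigrp-summary", "ebgp", "ospf-intra-area",
    "ospf-inter-area", "isis-level1", "eigrp-external", "ospf-external",
    "isis-level2", "ibgp", "unknown",
]


@dataclass(frozen=True)
class OmpRoute:
    """Hypothetical, simplified view of one OMP route for a single prefix."""
    system_ip: str            # originator System-IP (step 9)
    tloc_private_ip: str      # private TLOC IP (step 10)
    origin: str = "connected"
    origin_metric: int = 0
    preference: int = 0
    active: bool = True       # False models a STALE route (step 1)
    valid: bool = True        # False models an unknown/unreachable TLOC (step 2)
    from_vedge: bool = True   # False models a vSmart-sourced route (step 8)


def cost_key(route):
    """Steps 1, 4, 6 and 7; a smaller tuple is a better route."""
    return (
        not route.active,
        -route.preference,
        ORIGIN_ORDER.index(route.origin),
        route.origin_metric,
    )


def tiebreak_key(route):
    """Steps 8 to 10, used only to order equal-cost routes."""
    return (
        not route.from_vedge,
        IPv4Address(route.system_ip),
        IPv4Address(route.tloc_private_ip),
    )


def select_paths(routes, send_path_limit=4):
    """Return the routes a controller would advertise for one prefix."""
    valid = [r for r in routes if r.valid]
    if not valid:
        return []
    best = min(cost_key(r) for r in valid)
    equal_cost = [r for r in valid if cost_key(r) == best]
    return sorted(equal_cost, key=tiebreak_key)[:send_path_limit]


# Three edges with two TLOCs each, as in Figure 1 (addresses are made up).
routes = [
    OmpRoute("10.0.0.3", "192.168.3.1"), OmpRoute("10.0.0.3", "192.168.3.2"),
    OmpRoute("10.0.0.1", "192.168.1.1"), OmpRoute("10.0.0.1", "192.168.1.2"),
    OmpRoute("10.0.0.2", "192.168.2.1"), OmpRoute("10.0.0.2", "192.168.2.2"),
]
advertised = select_paths(routes)
print([(r.system_ip, r.tloc_private_ip) for r in advertised])
```

With six equal-cost routes and the default limit of four, the sketch keeps both routes from the two edges with the lowest System-IPs and drops the two from 10.0.0.3. Addresses are compared numerically with `ipaddress.IPv4Address` rather than as strings.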

Note that Cisco vEdge routers install a route in their forwarding table (FIB) only if the TLOC to which it points is active. Active TLOCs are ones that have a BFD session in UP state associated with them. When a BFD session goes down (becomes inactive) for any reason, the Cisco vSmart/vEdge devices remove all routes that point to that TLOC from their forwarding table.
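As a rough illustration of this rule, the filter below keeps only the routes whose next-hop TLOC currently has a BFD session in UP state. The function name and the data shapes (tuples of prefix and TLOC label) are assumptions made for the example.

```python
def fib_candidates(omp_routes, bfd_up_tlocs):
    """Keep only routes whose next-hop TLOC has a BFD session in UP state."""
    return [(prefix, tloc) for prefix, tloc in omp_routes if tloc in bfd_up_tlocs]


# Three OMP routes for the same prefix; the BFD session towards T2 is down.
omp_routes = [("1.2.3.0/24", "T1"), ("1.2.3.0/24", "T2"), ("1.2.3.0/24", "T3")]
installed = fib_candidates(omp_routes, bfd_up_tlocs={"T1", "T3"})
print(installed)
```

When T2's BFD session comes back up, running the same filter with T2 in the set would make its route eligible again.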

Best-Path Selection Examples

Let's look at some basic but important examples:

  • A vSmart controller receives a route to 1.2.3.0/24 from a Cisco WAN Edge router with an origin code of eBGP.  The controller also receives the same route from another vSmart Controller, also with an origin code of eBGP. Assuming all other properties are equal, the best-path algorithm would choose the route that came from the Cisco vEdge device.

  • A Cisco vSmart Controller learns the same route, 10.10.10.0/24, from two Cisco vEdge devices on the same site. If all other parameters are the same, both routes are chosen and advertised to other peers. By default, up to four equal-cost routes are selected and advertised.

  • A Cisco vSmart controller receives eight OMP routes for prefix 172.16.1.0/24. The send-path-limit is at its default value of 4. Six of the routes are chosen as best by the OMP best-path algorithm. They have the C flag set, as you can see in the output below. The reason the other two were not chosen as best can be seen in the LOSS REASON, LOST TO PEER, and LOST TO PATH ID columns.

vSmart# show omp routes 172.16.1.0/24 detail | t
Code:
C   -> chosen
I   -> installed
Red -> redistributed
Rej -> rejected
L   -> looped
R   -> resolved
S   -> stale
Ext -> extranet
Inv -> invalid
Stg -> staged
IA  -> On-demand inactive
U   -> TLOC unresolved
                                                                                             LOST                                                                                                                    
                                                                                             TO                                                                                                                      
                                PATH                                                         PATH   ATTRIBUTE                                                                                                        
VPN  PREFIX          FROM PEER  ID     LABEL    STATUS    LOSS REASON       LOST TO PEER     ID     TYPE       TLOC IP     COLOR            ENCAP  PROTOCOL         METRIC           DOMAIN ID   SITE ID  ORIGINATOR 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1    172.16.1.0/24   1.1.1.3    66     1004     C,R       -                 -                -      installed  1.1.1.3     mpls             ipsec  connected        0                -           2        1.1.1.3  -  
                     1.1.1.3    69     1004     C,R       -                 -                -      installed  1.1.1.3     public-internet  ipsec  connected        0                -           2        1.1.1.3  -  
                     1.1.1.4    66     1004     R         origin-protocol   14.1.1.1         69     installed  1.1.1.4     mpls             ipsec  OSPF-intra-area  11               -           3        1.1.1.4  -  
                     1.1.1.4    69     1004     R         tloc-id           1.1.1.4          66     installed  1.1.1.4     public-internet  ipsec  OSPF-intra-area  11               -           3        1.1.1.4  -  
                     11.1.1.1   34     1004     C,R       -                 -                -      installed  11.1.1.1    mpls             gre    connected        0                -           11       11.1.1.1  - 
                     11.1.1.1   66     1004     C,R       -                 -                -      installed  11.1.1.1    mpls             ipsec  connected        0                -           11       11.1.1.1  - 
                     11.1.1.1   69     1004     C,R       -                 -                -      installed  11.1.1.1    public-internet  ipsec  connected        0                -           11       11.1.1.1  - 
                     14.1.1.1   69     1004     C,R       -                 -                -      installed  14.1.1.1    public-internet  ipsec  connected        0                -           14       14.1.1.1   -

So there are six equal-cost routes (flagged with C) to 172.16.1.0/24, but the send-path-limit value is set to 4. Therefore, the controller sorts them based on the lowest originator System-IP and if there is a tie, based on the lowest TLOC private IP address, and advertises out the best four. 
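The chosen set in this output can be reproduced with a small Python sketch. The tuples are copied from the table above; the ranking dictionary and variable names are illustrative, and only the origin-type comparison is modelled, so the sketch does not reproduce the per-route loss reasons.

```python
from ipaddress import IPv4Address

# (originator System-IP, path-id, color, encap, origin protocol) from the output above.
routes = [
    ("1.1.1.3", 66, "mpls", "ipsec", "connected"),
    ("1.1.1.3", 69, "public-internet", "ipsec", "connected"),
    ("1.1.1.4", 66, "mpls", "ipsec", "OSPF-intra-area"),
    ("1.1.1.4", 69, "public-internet", "ipsec", "OSPF-intra-area"),
    ("11.1.1.1", 34, "mpls", "gre", "connected"),
    ("11.1.1.1", 66, "mpls", "ipsec", "connected"),
    ("11.1.1.1", 69, "public-internet", "ipsec", "connected"),
    ("14.1.1.1", 69, "public-internet", "ipsec", "connected"),
]

# Only the two origin types that appear here; "connected" is preferred.
ORIGIN_RANK = {"connected": 0, "OSPF-intra-area": 1}

best_rank = min(ORIGIN_RANK[r[4]] for r in routes)
chosen = [r for r in routes if ORIGIN_RANK[r[4]] == best_rank]

# Order the equal-cost routes by lowest originator System-IP (numeric compare).
chosen.sort(key=lambda r: IPv4Address(r[0]))
print(len(chosen))
print([r[0] for r in chosen])
```

The six connected routes remain, ordered 1.1.1.3, 11.1.1.1, 14.1.1.1. With a limit of four, both routes from 1.1.1.3 would be advertised along with two of the three from 11.1.1.1. Which two depends on the private TLOC IP addresses, which this output does not show.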

OMP Graceful Restart

While studying how Cisco SD-WAN works, have you ever wondered what happens with the overlay fabric when the SD-WAN Control Plane becomes unavailable? Well, there is a feature called OMP Graceful Restart that allows the data plane to continue functioning and forwarding traffic even if the control plane suddenly goes down or becomes unavailable. WAN edge devices do this by using the last known routing information that they received from the vSmart controllers. At the same time, vEdges actively try to re-establish a control-plane connection to the vSmart controllers. When the controllers are back up and reachable, DTLS control connections are re-established, and the vEdge routers then receive updated and refreshed network information from the vSmart controllers.

Figure 3. OMP Graceful Restart

Cisco vEdge and vSmart devices cache the OMP information that they learn from peers. The cached information includes OMP, TLOC, and SERVICE routes, IPsec SA parameters, and the centralized data policies in place. When a WAN edge device loses its OMP peering to the vSmart controller, the device continues forwarding data traffic using the cached OMP information. The edge device also periodically checks whether the vSmart controller has come back up. When it does and the control-plane peering is re-established, the device flushes its local cache and refreshes the control-plane information from the vSmart controller. The same technique applies in the opposite scenario, when a vSmart controller no longer detects the presence of Cisco vEdge devices. It then uses its local cache until the WAN edge device becomes reachable again.
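The caching behaviour can be pictured with a toy model. The class and method names below are invented for illustration and do not correspond to any Cisco API. The sketch only shows the two transitions described above: keep using the cached routes when the peering is lost, then flush and refresh once it is re-established.

```python
class EdgeOmpCache:
    """Toy model of a WAN edge's cached OMP state (names are illustrative)."""

    def __init__(self):
        self.routes = {}      # prefix -> list of TLOC labels
        self.peer_up = False
        self.stale = False

    def peer_established(self, fresh_routes):
        # Flush the old cache and take a fresh copy from the controller.
        self.routes = {prefix: list(tlocs) for prefix, tlocs in fresh_routes.items()}
        self.peer_up = True
        self.stale = False

    def peer_lost(self):
        # Keep forwarding on the last known routes, but mark them stale.
        self.peer_up = False
        self.stale = True

    def lookup(self, prefix):
        return self.routes.get(prefix, [])


cache = EdgeOmpCache()
cache.peer_established({"1.2.3.0/24": ["T1", "T2"]})

cache.peer_lost()                       # control plane becomes unavailable
during_outage = cache.lookup("1.2.3.0/24")

cache.peer_established({"1.2.3.0/24": ["T3"]})   # controller is back
after_refresh = cache.lookup("1.2.3.0/24")
print(during_outage, after_refresh)
```

During the outage the lookup still returns the cached TLOCs. After the peering returns, only the refreshed information is used.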

Manipulating the Best-Path Selection Process

Similar to the well-known process of manipulating the BGP best-path selection with route-maps, in Cisco SD-WAN we can manipulate the route selection locally on WAN edge devices. This is typically done using a local device template, and it is something we have done many times in traditional networking. The more interesting ability that SD-WAN gives us is to manipulate the routing information that goes in and out of the controller's routing table. This allows for per-prefix, network-wide routing manipulation, similar to modifying route properties on a BGP Route-Reflector.

Figure 4. Manipulating the Cisco SD-WAN best-path selection process

There are two configuration options that allow for that:

Using Inbound Centralized Policy - With an inbound centralized policy, we can change the origin code or the TLOC preference of a particular prefix before it goes into the routing table of the vSmart controller. This will then influence the best-path selection and lead to a different output from the best-path algorithm. Remember that the best routes are then advertised downstream to all WAN edge routers. Therefore, any manipulation of the controller's routing table will change the control-plane information across the whole overlay fabric.

Using Outbound Centralized Policy - With an outbound centralized policy, we typically modify the routing information that is sent to a particular set of WAN edge devices in order to influence their own best-path selection algorithm.
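The difference between the two directions can be sketched as follows. This is a deliberately simplified model: it uses a single generic `preference` attribute and made-up helper names, whereas a real centralized policy is configured on the controller and can match and set other attributes as well.

```python
def set_preference(routes, match_tloc, preference):
    """Return a copy of routes with preference rewritten for one TLOC."""
    return [
        {**r, "preference": preference} if r["tloc"] == match_tloc else dict(r)
        for r in routes
    ]


def best_paths(routes):
    """Keep only the routes with the highest preference."""
    top = max(r["preference"] for r in routes)
    return [r for r in routes if r["preference"] == top]


received = [
    {"prefix": "1.2.3.0/24", "tloc": "T1", "preference": 0},
    {"prefix": "1.2.3.0/24", "tloc": "T2", "preference": 0},
]

# Inbound: rewrite before the controller runs best-path; affects every site.
inbound_result = best_paths(set_preference(received, "T1", 100))

# Outbound: the controller's table is untouched; only site 20's copy changes.
controller_best = best_paths(received)
sent_to_site_10 = controller_best
sent_to_site_20 = set_preference(controller_best, "T2", 200)
site_20_best = best_paths(sent_to_site_20)

print([r["tloc"] for r in inbound_result])
print([r["tloc"] for r in sent_to_site_10], [r["tloc"] for r in site_20_best])
```

In the inbound case only T1 survives the controller's selection, so every WAN edge receives T1 alone. In the outbound case the controller still holds both routes: the hypothetical site 10 receives them unchanged, while site 20 receives a modified copy and prefers T2 in its own selection.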

Comments

saudesh

Fri, 05/28/2021 - 15:55

This is really a very informative tutorial.
I have a few doubts about OMP that I am not able to figure out. Would you please help me with the points below:
1) About point 9 (best-path selection): all Cisco docs say to compare the Router ID; however, OMP doesn't have a router ID and instead uses the System IP as a peer ID. Why don't we call it System IP instead of Router ID?

2) In which scenario do the tiebreakers (points 9 and 10) come into the picture? If they are always considered, then only one route would be selected.
3) For OMP to consider two routes "equal cost", which attributes must be the same?

In reply to by saudesh

Ivan.Ivanov

Sat, 05/29/2021 - 09:16

Hi Saudesh,
That's an excellent question! My understanding of these is the following:
1) Indeed in the official documentation, it says "Router-ID" but in reality, they mean System-IP. In the official Ciscopress book, it is written System-IP for this step.
2) Tiebreakers are considered when there are more equal-cost routes than the send-path-limit value, so the controller must advertise only a limited number of routes.
3) To be considered equal-cost, the routes must be equal up to the tiebreakers. The steps highlighted with green and blue in figure 2.
I updated the lesson with some additional explanations and one more example.
Thank you for your feedback!
Ivan

rahmanshirazur

Fri, 07/09/2021 - 15:39

What is the difference between Route Preference and TLOC preference? Are we referring to OMP preference as Route Preference?

In reply to by rahmanshirazur

Ivan.Ivanov

Sat, 07/10/2021 - 17:01

Hi, rahmanshirazur,
In the context of Cisco SD-WAN, the terms "Route Preference", "vRoute Preference," and "OMP Preference" are used interchangeably and refer to the preference value in an OMP route (vRoute). Simply put, the preference value found in the output of "show omp route detail".
On the other hand, the TLOC preference is the preference value found in a TLOC route. Basically, the one found in the output of "show omp tlocs detail".
Hope it helps,
Ivan

supriyo1977

Wed, 08/04/2021 - 15:15

great tutorial

HNG

Tue, 09/21/2021 - 04:51

Just checking if the comparison of the Origin Type needs to be updated?

Connected
Static
EBGP
EIGRP Internal
OSPF intra-area
OSPF inter-area
OSPF external
EIGRP external
IBGP
Unknown

Thanks

In reply to by HNG

Ivan.Ivanov

Tue, 09/21/2021 - 12:08

Hi HNG,
Thank you for your feedback! Much appreciated!
I have updated the list according to the latest Cisco SD-WAN configuration guide for release 20.6.1 (from 19.09.2021)
Regards,
Ivan

abhisinghvaz

Sun, 10/24/2021 - 11:37

Your content is very clearly defined and I am understanding it much better here than anywhere else. Thank you!!!

msizi.mthembu

Sun, 03/06/2022 - 08:22

Very well explained. Many thanks.

tungnx

Mon, 03/21/2022 - 07:50

Great explanation!!!
Can you share the official Cisco SD-WAN documentation?

zerodha00@gmail.com

Tue, 04/19/2022 - 05:46

Hi Ivan, awesome tutorial, enjoyed every single chapter, though a few things are still unclear to me.

1: Choose the lower AD value. I configured two vEdges (one with an internet connection, one with MPLS), connected them to a switch, and configured a loopback on the switch. It worked fine when choosing between OSPF and static, but when I configured two static routes (one from each vEdge, with different AD values), I noticed both routes were chosen and traffic was load-balanced. So it seems that within the same protocol it did not work as expected.

Also, when I ran route preference against TLOC preference, route preference won.

Once again, one of the best tutorials.

In reply to by zerodha00@gmail.com

Ivan.Ivanov

Tue, 04/19/2022 - 07:29

Hi zerodha00,
In networking in general, the AD of a route is a locally-significant value based on the router OS. Different protocols might have different AD values on different platforms. For example, the OMP has an AD of 250 on vEdges and 251 on cEdges (because NHRP has AD 250 on cEdges).
Therefore, the AD is only compared for service-side routes on THE SAME WAN edge router. For example, if a WAN edge router receives two routes for the same prefix, one via OSPF and one via Static, it will choose the static route.
AD is not a parameter in OMP, is not advertised, and does not influence the vSmart best-path algorithm. What you probably saw in your tests is that vSmart compared the origin-protocol (vedge1 static > vedge2 ospf) and chose static. Then both OMP routes became equal (vedge1 static = vedge2 static).
Hope it helps!

In reply to by zerodha00@gmail.com

Ivan.Ivanov

Tue, 04/19/2022 - 07:49

I have updated the lesson based on your question. Thank you for the feedback!

murtasilmo

Thu, 05/19/2022 - 16:05

I don't know why I found your content so late. This is much more than excellent to start with SD-WAN.