In this lesson, we are going to explore a common misunderstanding of the OMP best-path selection process in scenarios with multiple vSmart controllers. Along the way, we are going to talk about the order of operations between the OMP send-path-limit parameter and the OMP best-path algorithm.

The initial state

We are going to use a topology consisting of six vEdge cloud routers (vEdges 1 through 6) and two vSmart controllers (vSmart-1 and vSmart-2). By default, when multiple vSmart controllers oversee the overlay fabric, each vEdge establishes an OMP peering session with two of them. This is controlled by a system parameter called max-omp-sessions, which is set to two by default. For the purpose of this demonstration, the max-omp-sessions parameter on each vEdge is set to 1, which means that each vEdge forms an OMP peering session with only one vSmart controller.

Figure 1. OMP Peering with two vSmart controllers

As illustrated in Figure 1 above, vEdges 1, 2, and 5 have established an OMP session only with vSmart-1 (1.1.1.30), and vEdges 3, 4, and 6 only with vSmart-2 (1.1.1.40). We can verify this directly on the vSmart controllers. You can see that vSmart-1 has OMP peering sessions with vSmart-2 and with vEdges 1, 2, and 5.

vSmart-1# show omp peers
R -> routes received
I -> routes installed
S -> routes sent

                         DOMAIN    OVERLAY   SITE                                
PEER             TYPE    ID        ID        ID        STATE    UPTIME           R/I/S  
------------------------------------------------------------------------------------------
1.1.1.40         vsmart  1         1         100       up       0:02:39:30       8/0/8
1.1.1.1          vedge   1         1         1         up       0:01:59:04       4/0/8
2.2.2.2          vedge   1         1         1         up       0:01:51:41       2/0/4
5.5.5.5          vedge   1         1         5         up       0:01:59:07       2/0/8

And vSmart-2 has OMP peering sessions with vSmart-1 and with vEdges 3, 4, and 6.

vSmart-2# sh omp peer
R -> routes received
I -> routes installed
S -> routes sent

                         DOMAIN    OVERLAY   SITE                                
PEER             TYPE    ID        ID        ID        STATE    UPTIME           R/I/S  
------------------------------------------------------------------------------------------
1.1.1.30         vsmart  1         1         100       up       0:02:39:51       8/0/8
3.3.3.3          vedge   1         1         2         up       0:01:59:28       4/0/8
4.4.4.4          vedge   1         1         2         up       0:01:59:29       4/0/8
6.6.6.6          vedge   1         1         6         up       0:01:59:28       0/0/4

There are two WAN transports that are not illustrated in the topology - an Internet cloud and an MPLS cloud. Each vEdge has two transport attachments - one TLOC marked with the biz-internet color and an IP address from the range 39.3.0.0/16 (highlighted in orange) and one TLOC marked with the mpls color and an IP address from the range 10.10.0.0/16 (highlighted in green).

vEdges 1, 2, 3, and 4 are directly connected to subnet 10.1.1.0/24 in VPN 100 and advertise the subnet to their respective controllers. You can see that vEdges 1 and 2 are advertising the 10.1.1.0/24 subnet to vSmart-1 (1.1.1.30):

vEdge-1# show omp routes vpn 100 advertised detail | nomore | b ADVERTISED | i peer\|\ tloc |
peer    1.1.1.30
     tloc             1.1.1.1, mpls, ipsec
     tloc             1.1.1.1, biz-internet, ipsec

vEdge-2# show omp routes vpn 100 advertised detail | nomore | b ADVERTISED | i peer\|\ tloc |
peer    1.1.1.30
     tloc             2.2.2.2, mpls, ipsec
     tloc             2.2.2.2, biz-internet, ipsec

And vEdges 3 and 4 are advertising the 10.1.1.0/24 subnet to vSmart-2 (1.1.1.40):

vEdge-3# show omp routes vpn 100 advertised detail | nomore | b ADVERTISED | i peer\|\ tloc |
peer    1.1.1.40
     tloc             3.3.3.3, mpls, ipsec
     tloc             3.3.3.3, biz-internet, ipsec

vEdge-4# show omp routes vpn 100 advertised detail | nomore | b ADVERTISED | i peer\|\ tloc |
peer    1.1.1.40
     tloc             4.4.4.4, mpls, ipsec
     tloc             4.4.4.4, biz-internet, ipsec

The question

Assume that all routes to 10.1.1.0/24 in VPN 100 have identical values for AD (250), OMP Preference (0), TLOC Preference (0), Origin-Type (Connected), and Origin-Metric (1), that no policy is applied on the vSmart controllers, and that everything else is left at its default. Can you guess:

  • Which OMP routes to subnet 10.1.1.0/24 will vSmart-1 choose as best? What about vSmart-2?
  • Which OMP routes to subnet 10.1.1.0/24 will vEdge-5 install in its VPN 100 routing table?
  • Which OMP routes to subnet 10.1.1.0/24 will vEdge-6 install in its VPN 100 routing table?

The common misunderstanding

In Cisco SD-WAN, vSmart controllers establish a full mesh of OMP peering sessions between themselves and distribute all OMP routes they learn to all other controllers. Therefore, in our topology, the two controllers exchange all OMP routes, and both end up knowing all eight paths to 10.1.1.0/24 in VPN 100. The key question, however, is in what order each vSmart controller sorts those routes. We have already said that all OMP routes to 10.1.1.0/24 have identical values for AD (250), OMP Preference (0), TLOC Preference (0), Origin-Type (Connected), and Origin-Metric (1), so all eight routes are considered equal-cost on both vSmarts. Applying the tiebreakers on each controller, however, yields a different result.

From the perspective of vSmart-1, the controller receives four OMP routes to 10.1.1.0/24 over its direct OMP peering sessions with vEdges 1 and 2. At the same time, it receives another four OMP routes to the same destination from vSmart-2 (the routes that vEdges 3 and 4 advertise to vSmart-2). According to the first tiebreaker of the OMP best-path algorithm, vSmart-1 prefers the routes coming directly from vEdges 1 and 2 over the ones coming from vSmart-2.

From the perspective of vSmart-2, the controller receives four OMP routes to 10.1.1.0/24 over its direct OMP peering sessions with vEdges 3 and 4. At the same time, it receives another four OMP routes to the same destination from vSmart-1 (the routes that vEdges 1 and 2 advertise to vSmart-1). According to the same tiebreaker, vSmart-2 places the routes via vEdges 3 and 4 at the top, followed by the routes via vEdges 1 and 2 that it received from vSmart-1. In the end, both controllers know about all eight paths to 10.1.1.0/24, but each vSmart controller inserts the routes in its VPN 100 route table in a different order, as illustrated in Figure 2 below.

Figure 2. Each vSmart controller sorts the routes in a different order
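The ordering described above can be sketched in a few lines of Python. This is a simplified model and not the actual OMP implementation: the Route tuple, the rank and paths helpers, and the assumption that routes with the same rank keep their arrival order are all hypothetical and exist only for illustration. The sketch models only the tiebreaker discussed in this lesson, where a route learned directly from a vEdge is preferred over one learned from another vSmart.

```python
from typing import NamedTuple

class Route(NamedTuple):
    tloc_ip: str       # system IP of the originating vEdge
    color: str         # TLOC color
    learned_from: str  # type of the OMP peer that sent the route: "vedge" or "vsmart"

def rank(routes):
    """Order equal-cost routes: learned from a vEdge first, then from a vSmart.

    sorted() is stable, so routes with the same rank keep their input order.
    That ordering is an assumption of this sketch, not documented OMP behaviour.
    """
    return sorted(routes, key=lambda r: 0 if r.learned_from == "vedge" else 1)

def paths(system_ips, learned_from):
    """Build one route per transport color for every originating vEdge."""
    return [Route(ip, color, learned_from)
            for ip in system_ips for color in ("mpls", "biz-internet")]

behind_vsmart1 = ["1.1.1.1", "2.2.2.2"]  # vEdges peering with vSmart-1
behind_vsmart2 = ["3.3.3.3", "4.4.4.4"]  # vEdges peering with vSmart-2

# Each controller hears its own vEdges directly and the others via its vSmart peer.
vsmart1_table = rank(paths(behind_vsmart2, "vsmart") + paths(behind_vsmart1, "vedge"))
vsmart2_table = rank(paths(behind_vsmart1, "vsmart") + paths(behind_vsmart2, "vedge"))

for name, table in (("vSmart-1", vsmart1_table), ("vSmart-2", vsmart2_table)):
    print(name, [f"{r.tloc_ip}/{r.color}" for r in table])
```

Both tables contain the same eight routes. Only the order differs, and that order is what matters once a limit is applied to the list.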

Let's now see how this difference in sort order affects vEdge-5 and vEdge-6. Because the OMP send-path-limit parameter is set to four by default, each vSmart controller advertises only the first four equal-cost best routes to its OMP peers. In our example, this means that vSmart-1 advertises only the routes via vEdges 1 and 2 to vEdge-5 (5.5.5.5), as you can see in the output below:

vSmart-1# show omp routes vpn 100 advertised detail | nomore | b ADVERTISED | i peer\|\ tloc |
...
peer    5.5.5.5
     tloc             1.1.1.1, mpls, ipsec
     tloc             1.1.1.1, biz-internet, ipsec
     tloc             2.2.2.2, mpls, ipsec
     tloc             2.2.2.2, biz-internet, ipsec

And vSmart-2 advertises only the routes via vEdges 3 and 4 to vEdge-6 (6.6.6.6) as you can see in the output below:

vSmart-2# show omp routes vpn 100 advertised detail | nomore | b ADVERTISED | i peer\|\ tloc |
...
peer    6.6.6.6
     tloc             4.4.4.4, mpls, ipsec
     tloc             4.4.4.4, biz-internet, ipsec
     tloc             3.3.3.3, mpls, ipsec
     tloc             3.3.3.3, biz-internet, ipsec

In the end, vEdge-5 won't know that subnet 10.1.1.0/24 is reachable via vEdges 3 and 4 at all, and vEdge-6 won't know that the subnet is reachable via vEdges 1 and 2. Let's verify this by checking the routing tables of both routers:

vEdge-5# show ip routes vpn 100 | t

     ADDRESS               PATH                                         NEXTHOP          
VPN  FAMILY   PREFIX       ID    PROTOCOL TLOC IP  COLOR         ENCAP  VPN      STATUS  
-----------------------------------------------------------------------------------------
100  ipv4     10.1.1.0/24  0     omp      1.1.1.1  mpls          ipsec  -        F,S     
100  ipv4     10.1.1.0/24  1     omp      1.1.1.1  biz-internet  ipsec  -        F,S     
100  ipv4     10.1.1.0/24  2     omp      2.2.2.2  mpls          ipsec  -        F,S     
100  ipv4     10.1.1.0/24  3     omp      2.2.2.2  biz-internet  ipsec  -        F,S   

You can see that vEdge-5 has installed the routes via vEdges 1 and 2.

vEdge-6# sh ip route vpn 100 | t

     ADDRESS               PATH                                         NEXTHOP          
VPN  FAMILY   PREFIX       ID    PROTOCOL TLOC IP  COLOR         ENCAP  VPN      STATUS  
-----------------------------------------------------------------------------------------
100  ipv4     10.1.1.0/24  0     omp      3.3.3.3  mpls          ipsec  -        F,S     
100  ipv4     10.1.1.0/24  1     omp      3.3.3.3  biz-internet  ipsec  -        F,S     
100  ipv4     10.1.1.0/24  2     omp      4.4.4.4  mpls          ipsec  -        F,S     
100  ipv4     10.1.1.0/24  3     omp      4.4.4.4  biz-internet  ipsec  -        F,S     

And vEdge-6 has installed the routes via vEdges 3 and 4. Of course, we must keep in mind that the maximum number of equal-cost routes that a WAN edge router installs in its routing table is controlled by the OMP ecmp-limit parameter.
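The order of operations can be modeled with a short Python sketch. It is a deliberately simplified illustration under stated assumptions: the tlocs, advertise, and install helpers are hypothetical names, the sorted list is typed in by hand to match the order vSmart-1 uses in this lesson, and the value 8 is only an example of a raised limit. The point it shows is that the controller first sorts the routes, then truncates the list at send-path-limit, and the WAN edge then truncates what it received at ecmp-limit.

```python
COLORS = ("mpls", "biz-internet")

def tlocs(*system_ips):
    """Expand each system IP into one (ip, color) path per transport."""
    return [(ip, color) for ip in system_ips for color in COLORS]

def advertise(sorted_routes, send_path_limit=4):
    """Controller side: send only the first N routes of the already sorted list."""
    return sorted_routes[:send_path_limit]

def install(received_routes, ecmp_limit=4):
    """WAN edge side: install at most N of the received equal-cost routes."""
    return received_routes[:ecmp_limit]

# vSmart-1's view: routes from its own vEdges first, then those learned from vSmart-2.
vsmart1_sorted = tlocs("1.1.1.1", "2.2.2.2") + tlocs("3.3.3.3", "4.4.4.4")

# Defaults from this lesson: send-path-limit 4 and ecmp-limit 4.
default_rib = install(advertise(vsmart1_sorted))

# Hypothetical tuning: both limits raised to cover all eight equal-cost paths.
raised_rib = install(advertise(vsmart1_sorted, send_path_limit=8), ecmp_limit=8)

print(len(default_rib), sorted({ip for ip, _ in default_rib}))
print(len(raised_rib), sorted({ip for ip, _ in raised_rib}))
```

With the defaults, the model leaves vEdge-5 with four routes, all via 1.1.1.1 and 2.2.2.2. With both limits raised to eight, all four originating vEdges appear. Raising only one of the two limits is not enough in this model, because the smaller value still caps the result.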

The key takeaways

In this lesson, we have seen how the first tiebreaker in the OMP best-path algorithm works and how it interacts with the OMP send-path-limit parameter. The question is: what does all that mean for a real-world deployment?

The default values that Cisco has chosen for these parameters are not random. They reflect the expectation that the vast majority of network environments have two vSmart controllers per region and four equal-cost routes per destination, because the typical branch network is dual-homed and dual-transport. However, in a network region with more than two vSmart controllers, vEdges may use suboptimal paths to destinations if either one of the following design recommendations is not met:

  • Each vEdge has an OMP peering session with every vSmart controller in the region. The number of OMP sessions that a vEdge router can have is controlled by a global system parameter called max-omp-sessions, set to 2 by default.
  • The OMP send-path-limit and the OMP ecmp-limit parameters are set according to the maximum number of equal-cost paths that exist in the region (both values are 4 by default).
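The two recommendations can be turned into a quick sanity check. The sketch below is only an illustration: the function name, its parameters, and the wording of the messages are assumptions made for this example, and the default arguments are the default values quoted in this lesson. It reports which recommendation a given design does not meet and makes no claim about how the controllers behave internally.

```python
def unmet_recommendations(num_vsmarts, equal_cost_paths,
                          max_omp_sessions=2, send_path_limit=4, ecmp_limit=4):
    """Return the design recommendations from this lesson that are not met."""
    issues = []
    if max_omp_sessions < num_vsmarts:
        issues.append("vEdge does not peer with every vSmart in the region")
    if send_path_limit < equal_cost_paths:
        issues.append("send-path-limit is below the number of equal-cost paths")
    if ecmp_limit < equal_cost_paths:
        issues.append("ecmp-limit is below the number of equal-cost paths")
    return issues

# The lab in this lesson: two vSmarts, max-omp-sessions 1, eight equal-cost paths.
lab = unmet_recommendations(num_vsmarts=2, equal_cost_paths=8, max_omp_sessions=1)

# The typical case described above: two vSmarts, four equal-cost paths, all defaults.
typical = unmet_recommendations(num_vsmarts=2, equal_cost_paths=4)

for issue in lab:
    print("lab:", issue)
print("typical:", typical)
```

For the lab topology, the check flags all three conditions. For the typical dual-homed, dual-transport case with default values, it flags none.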