What is OMP?

The Cisco Overlay Management Protocol (OMP) is an all-in-one TCP-based protocol, similar to BGP, that establishes and maintains the SD-WAN control plane. OMP runs between the vEdge routers and the vSmart controllers and between the controllers themselves. The protocol is responsible for:

  • Distribution of Transport Locators (TLOCs) among network sites in the SD-WAN domain.
  • Distribution of service-side reachability information.
  • Distribution of service-chaining information.
  • Distribution of data plane security parameters, VPN labels, and crypto keys.
  • Distribution of data and application-aware routing (AAR) policies.

Upon joining the overlay fabric, vEdges automatically initiate OMP peering with the vSmart controllers. A key point here is that the two endpoints of this OMP peering are the System-IPs of the vEdge router and the controller, similar to a BGP peering between a route reflector and an RR client established between loopback interfaces. Therefore, the OMP peering is not strictly bound to any single DTLS control connection to vSmart. Just as a BGP peering between loopbacks can use any available IP path between them, the OMP peering can use any of the available control connections.

Figure 1. Cisco SD-WAN Control plane exchange via OMP

OMP addresses the challenges that traditional IGP protocols face at scale. Since OMP operates in the overlay, the notion of routing peers is different from a traditional WAN environment. From a logical point of view, the overlay fabric consists of a few vSmart controllers and many vEdge routers. The routers peer only with the vSmart controllers; vEdges don’t form any control-plane relationship among themselves over the Cisco SD-WAN fabric. This is a significant difference from the peering model in traditional IGPs: vEdge routers don’t need to form routing adjacencies with each other and do not have to respond to an excessive number of routing updates. All of this makes OMP more manageable, more secure, and more efficient than traditional routing, especially in large-scale deployments with hundreds of spokes.

OMP is the key ingredient that allows the WAN to scale horizontally. The vSmart controller makes all routing computations for the overlay and OMP propagates this information across the fabric. Therefore, adding new vEdge routers into the SD-WAN domain does not affect the performance of the existing WAN edge routers because individual routers do not make routing calculations for the entire overlay fabric.

OMP Peering

OMP is enabled by default on all Cisco SD-WAN edge devices. When vEdges go through the Zero-Touch Provisioning process, they learn about the addresses of all available vSmart controllers and automatically initiate secure connections to them. By default, these connections are authenticated and encrypted via the Datagram Transport Layer Security (DTLS) protocol. Depending on the number of available transports, each vEdge router will try to establish a secure control connection via every TLOC. However, as shown in figure 2, the OMP peering uses the System-IPs, and only one peering session is established between one WAN Edge device and one vSmart controller even if there are multiple DTLS connections to the same controller. 

Figure 2. Cisco SD-WAN OMP Peering

You can see in the following output that the WAN edge device has initiated two DTLS control connections to the vSmart controller with system IP 1.1.0.3: one through the MPLS transport and another through the Internet.

vEdge-1# show control connections 
                                            PEER                       
PEER    PEER PEER        SITE   PEER        PUB                        
TYPE    PROT SYSTEM IP   ID     PUBLIC IP   PORT  LOCAL COLOR     STATE
-----------------------------------------------------------------------
vsmart  dtls 1.1.0.3     1      1.1.1.70    12346 mpls            up    
vsmart  dtls 1.1.0.3     1      1.1.1.70    12346 public-internet up    
vbond   dtls -           0      1.1.1.60    12346 mpls            up    
vbond   dtls -           0      1.1.1.60    12346 public-internet up    
vmanage dtls 1.1.0.1     1      1.1.1.50    12346 mpls            up    

However, if we check how many OMP peering sessions there are to the controller, we can see that there is only one.

vEdge-1# show omp peers
R -> routes received
I -> routes installed
S -> routes sent

                DOMAIN OVERLAY SITE                     
PEER    TYPE    ID     ID      ID    STATE  UPTIME      R/I/S
-------------------------------------------------------------
1.1.0.3 vsmart  1      1       1     up     0:14:25:18  4/1/8
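
To make the relationship between the two outputs concrete, here is a minimal Python sketch. It is only an illustration, not a Cisco tool: the rows are copied by hand from the `show control connections` output above, and the function names are hypothetical. It counts DTLS connections per vSmart and then derives the OMP peers by collapsing them on the System-IP.

```python
from collections import Counter

# Rows copied from the "show control connections" output above:
# (peer type, protocol, peer system IP, local color)
control_connections = [
    ("vsmart", "dtls", "1.1.0.3", "mpls"),
    ("vsmart", "dtls", "1.1.0.3", "public-internet"),
    ("vbond", "dtls", "-", "mpls"),
    ("vbond", "dtls", "-", "public-internet"),
    ("vmanage", "dtls", "1.1.0.1", "mpls"),
]


def dtls_connections_per_vsmart(rows):
    """Count DTLS control connections per vSmart system IP."""
    return Counter(ip for peer_type, _, ip, _ in rows if peer_type == "vsmart")


def omp_peers(rows):
    """One OMP session per vSmart system IP, however many transports lead to it."""
    return sorted({ip for peer_type, _, ip, _ in rows if peer_type == "vsmart"})


print(dtls_connections_per_vsmart(control_connections))  # Counter({'1.1.0.3': 2})
print(omp_peers(control_connections))                    # ['1.1.0.3']
```

Two control connections collapse into a single peer, which matches the single row in `show omp peers`.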

Another important thing to know is that these DTLS control plane tunnels are used by other protocols as well. For example, besides OMP, NETCONF and SNMP will also be transported via these secure connections. By utilizing these encrypted DTLS tunnels, we no longer need to be concerned about the native security of protocols like SNMP, NTP, etc.

Figure 3. Cisco SD-WAN DTLS Connection

In a typical production deployment, there are at least two or three controllers for redundancy purposes. When we have multiple vSmarts, they also establish OMP peering between them in a full-mesh manner as shown in figure 4.

Figure 4. OMP Peering with multiple controllers
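
The scaling benefit of this peering model can be illustrated with some simple arithmetic. The sketch below is only a back-of-the-envelope comparison under stated assumptions: every vEdge peers with every vSmart, the vSmarts are fully meshed among themselves, and the baseline is a hypothetical design in which every router peers with every other router. The function names are illustrative.

```python
def omp_sessions(edges: int, controllers: int) -> int:
    """OMP sessions when each vEdge peers with each vSmart (assumption)
    and the vSmarts form a full mesh among themselves."""
    edge_sessions = edges * controllers
    controller_mesh = controllers * (controllers - 1) // 2
    return edge_sessions + controller_mesh


def full_mesh_sessions(routers: int) -> int:
    """Hypothetical baseline: every router peers with every other router."""
    return routers * (routers - 1) // 2


print(omp_sessions(edges=500, controllers=2))  # 1001
print(full_mesh_sessions(routers=500))         # 124750
```

With the controller count fixed, the number of OMP sessions grows linearly with the number of vEdges, whereas the full-mesh baseline grows quadratically.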

OMP Route Advertisements

Cisco vEdge routers collect the routes they learn from directly connected networks, static routes, and dynamic routing protocols that run in the site-local environment. These routes are then advertised to all OMP peers (the controllers) along with the corresponding TLOC next-hops. The routes that represent reachability information are referred to as OMP routes or just vRoutes (to distinguish them from traditional IP routes). In addition, vEdges advertise to vSmart all locally attached services that are running in the site-local network. Services include load balancers, firewalls, IDS (Intrusion Detection Systems), and customer-defined ones.

The vSmart controllers learn the topology of the overlay fabric and all available network services through these OMP route advertisements coming from vEdges. 

Figure 5. Cisco SD-WAN Overlay Management Protocol

As shown in figure 5, vEdge routers advertise three types of routes via the Overlay Management Protocol (OMP) to the vSmart controllers:

  • OMP routes, also referred to as vRoutes, are prefixes learned at the local site via connected interfaces, static routes, and dynamic routing protocols (such as OSPF, EIGRP, and BGP) running on the service side of the vEdge. These prefixes are redistributed into OMP and advertised to the vSmart controller so that they can be carried across the overlay fabric to all other WAN edge nodes. OMP routes resolve their next-hop to a TLOC. An OMP route is installed in the forwarding table only if the next-hop TLOC is known and there is a BFD session in UP state associated with that TLOC.
  • TLOC routes advertise Transport Locators of the connected WAN transports, along with additional attributes such as public and private IP addresses, color, TLOC preference, site ID, weight, tags, and encryption keys.
  • Service routes advertise embedded network services such as firewalls and IPS that are connected to the vEdge local-site network.

OMP Routes

Every Cisco vEdge router will advertise prefixes learned at the local site as OMP routes (vRoutes) to the vSmart controllers. OMP can import reachability information from traditional protocols such as EIGRP, OSPF, and BGP and can also advertise connected and static routes. The OMP route updates are very similar to traditional routing updates. However, the main difference is that the next-hop of a vRoute points to a TLOC route. In BGP, a route is invalid if the next-hop IP address is not resolved in the routing table. In OMP, if a vRoute has a next-hop TLOC that cannot be resolved in the TLOC route table, it is marked as Invalid, similarly to BGP.
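
The two checks described so far (the next-hop TLOC must be resolvable, and a BFD session to it must be up before the route is installed) can be sketched in a few lines of Python. This is a simplified model rather than the actual OMP implementation: the class names, the state labels, and the system IP `10.0.0.1` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tloc:
    system_ip: str
    color: str
    encap: str


@dataclass(frozen=True)
class VRoute:
    vpn: int
    prefix: str
    next_hop: Tloc


def route_state(route, tloc_table, bfd_up):
    """Classify a vRoute using the two checks described in the text."""
    if route.next_hop not in tloc_table:
        return "invalid"               # next-hop TLOC cannot be resolved
    if route.next_hop not in bfd_up:
        return "valid, not installed"  # TLOC known, but no BFD session in UP state
    return "installed"


t1 = Tloc("10.0.0.1", "biz-internet", "ipsec")  # hypothetical system IP
route = VRoute(vpn=1, prefix="1.1.1.0/24", next_hop=t1)

print(route_state(route, tloc_table=set(), bfd_up=set()))  # invalid
print(route_state(route, tloc_table={t1}, bfd_up=set()))   # valid, not installed
print(route_state(route, tloc_table={t1}, bfd_up={t1}))    # installed
```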

Figure 6. vRoutes

In figure 6, for example, vEdge-1 is directly connected to subnet 1.1.1.0/24 in VPN1 and has two transport interfaces marked with the colors biz-internet and mpls (TLOCs T1 and T2). The router will advertise to the controller that 1.1.1.0/24 is reachable in VPN1 via TLOCs T1 and T2. This advertisement is carried in an OMP UPDATE message as a vRoute. On the other side, vEdge-2 imports the reachability information learned via OSPF in VPN1 and advertises that subnet 2.2.2.0/24 is reachable in VPN1 via TLOCs T3 and T4.

A vRoute carries several attributes in addition to the reachability information. Let’s list each attribute with a short description:

  • VPN: Every OMP route is associated with a VPN, and every Cisco SD-WAN device keeps a separate routing table for each VPN. This allows for the use of overlapping subnet ranges, provided they are in different VPNs. In the example above, both subnets 1.1.1.0/24 and 2.2.2.0/24 are associated with VPN1 and will only be reachable in that network segment;
  • Originator: This is the System-IP of the router from which the route was originally learned. In our example for prefix 1.1.1.0/24, this value would be the System-IP of vEdge-1;
  • TLOC: This is the next-hop identifier of the OMP route. Note in the example that vEdge-1 advertises two vRoutes for prefix 1.1.1.0/24, one via TLOC T1 and one via TLOC T2. This tells the vSmart controller and, subsequently, all remote WAN edge routers that to reach subnet 1.1.1.0/24, they must have an active overlay tunnel to either TLOC T1 or T2. Active means that the BFD session associated with that tunnel must be in UP state;
  • Site ID: The site-id plays a similar role to a BGP AS number. It is primarily used for loop prevention. All sites should have a unique site ID, and all devices at the same location should have the same site-id. In the example above, the site-id for prefix 1.1.1.0/24 would be 1;
  • Origin-Protocol: This is the original protocol from which the vEdge router has learned the routing information. It may be a connected interface, a static route, or a dynamic routing protocol such as OSPF, EIGRP, or BGP. In the example, the origin-proto will be Connected for prefix 1.1.1.0/24 via T1 and OSPF-IA for 2.2.2.0/24.
  • Origin-Metric: OMP includes the original metric value alongside the origin protocol. These values are then used in the best-path algorithm when OMP calculates the most optimal routes toward destinations. In the example for prefix 1.1.1.0/24 via T1, the origin-metric will be 0.
  • Preference: This attribute is also referred to as OMP preference or vRoute preference, so that it is not confused with the TLOC preference attribute in the TLOC routes. The OMP preference is used to influence the OMP best-path selection for a given vRoute. Higher is better. In the example above, if we’d like to manipulate the overlay routing so that the traffic towards 1.1.1.0/24 always comes through the MPLS cloud, we can just set a higher OMP preference value on vRoute 1.1.1.0/24 via T2. The OMP preference operates very similarly to local_preference in BGP;
  • Tag: This is similar to the route tags in traditional routing. Once a value is set, it is a transitive attribute that can be acted upon via policy.
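
The VPN attribute is what allows overlapping subnets to coexist. The following Python sketch models one routing table per VPN and a longest-prefix lookup that is scoped to a single VPN. It is an illustration only: VPN 2, the TLOC label T9, and the helper names `install` and `lookup` are hypothetical and do not come from the figure.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# One routing table per VPN: vpn id -> {prefix -> next-hop TLOC label}
tables = defaultdict(dict)


def install(vpn, prefix, tloc):
    """Install a prefix into the routing table of a single VPN."""
    tables[vpn][ip_network(prefix)] = tloc


def lookup(vpn, destination):
    """Longest-prefix match restricted to the routing table of one VPN."""
    addr = ip_address(destination)
    table = tables.get(vpn, {})
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


install(1, "1.1.1.0/24", "T1")
install(2, "1.1.1.0/24", "T9")  # same subnet, different (hypothetical) VPN

print(lookup(1, "1.1.1.10"))  # T1
print(lookup(2, "1.1.1.10"))  # T9
print(lookup(1, "2.2.2.2"))   # None
```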

A real example of a single vRoute can be seen in the output below:

vEdge-1# show omp route 172.16.1.0/24
---------------------------------------------------
omp route entries for vpn 1 route 172.16.1.0/24
---------------------------------------------------
            RECEIVED FROM:                   
peer            1.1.1.30
path-id         5
label           1004
status          C,I,R
loss-reason     not set
lost-to-peer    not set
lost-to-path-id not set
    Attributes:
     originator       3.3.3.3
     type             installed
     tloc             3.3.3.3, mpls, ipsec
     ultimate-tloc    not set
     domain-id        not set
     overlay-id        1
     site-id          2
     preference       not set
     tag              not set
     origin-proto     connected
     origin-metric    0
     as-path          not set
     community        not set
     unknown-attr-len not set

OMP, like most traditional routing protocols, advertises only the best vRoute, or several vRoutes if there are multiple equal-cost ones. Logically, invalid vRoutes (those with an unresolved TLOC next-hop) are not advertised, even if they have the best attributes.
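
The selection behavior described above can be approximated with a small Python sketch. It is deliberately simplified: the real OMP best-path algorithm compares more attributes than this, and here only validity and the OMP preference are considered. Treating an unset preference as 0 is an assumption of the sketch, and the names `VRoute` and `best_routes` are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VRoute:
    prefix: str
    tloc: str
    preference: Optional[int] = None  # None models "not set"; treated as 0 (assumption)
    tloc_resolved: bool = True


def best_routes(candidates):
    """Drop invalid routes, then keep every route that ties on the highest preference."""
    valid = [r for r in candidates if r.tloc_resolved]
    if not valid:
        return []
    top = max((r.preference or 0) for r in valid)
    return [r for r in valid if (r.preference or 0) == top]


via_t1 = VRoute("1.1.1.0/24", "T1")
via_t2 = VRoute("1.1.1.0/24", "T2")
print([r.tloc for r in best_routes([via_t1, via_t2])])  # ['T1', 'T2']

prefer_mpls = VRoute("1.1.1.0/24", "T2", preference=100)
print([r.tloc for r in best_routes([via_t1, prefer_mpls])])  # ['T2']

unresolved = VRoute("1.1.1.0/24", "T7", preference=200, tloc_resolved=False)
print([r.tloc for r in best_routes([via_t1, unresolved])])  # ['T1']
```

The last call shows that a route with an unresolved TLOC is excluded even though it carries the highest preference.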

TLOC Routes

A TLOC route represents a WAN link that serves as a tunnel endpoint and is uniquely identified by {System-IP, Color, Encapsulation}. Note that the System IP address is used instead of the interface IP address as an identifier for a TLOC route. That’s because the interface IP can change at any given moment. Using the fixed System-IP ensures that the TLOC can be uniquely identified at all times irrespective of any interface IP changes. This is very important because an OMP route (vRoute) has a next-hop pointing to a TLOC. This separation of information allows TLOC routes to be updated with new parameters without having to invalidate the dependent vRoutes. If a vEdge router has multiple transport interfaces connected to different WAN providers, as shown in figure 7, a TLOC route is created and advertised for each WAN interface.

Figure 7. TLOC Routes
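
The benefit of identifying a TLOC by {System-IP, Color, Encapsulation} rather than by interface address can be shown with a short Python sketch. It is a simplified model with hypothetical class names; the new private address 192.168.1.77 is invented for illustration. The vRoute keeps pointing at the same key while the attributes behind that key change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TlocKey:
    """The identity of a TLOC: it never contains an interface address."""
    system_ip: str
    color: str
    encap: str


@dataclass(frozen=True)
class TlocAttrs:
    """Mutable-over-time properties that ride along with the TLOC route."""
    public_ip: str
    private_ip: str


tloc_table = {}
key = TlocKey("1.2.3.4", "public-internet", "ipsec")
tloc_table[key] = TlocAttrs(public_ip="24.5.4.1", private_ip="192.168.1.2")

# A vRoute stores only the key as its next-hop.
vroute_next_hop = key

# The interface is renumbered (for example via DHCP): only the attributes change.
tloc_table[key] = replace(tloc_table[key], private_ip="192.168.1.77")

print(vroute_next_hop in tloc_table)           # True
print(tloc_table[vroute_next_hop].private_ip)  # 192.168.1.77
```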

Controlling the TLOC route advertisements is essential when customizing the overlay topology. vEdge routers attempt to form IPsec tunnels to all remote TLOCs they receive from vSmart via all local TLOCs. If no policy is applied on vSmart, all vEdges know about all TLOCs, resulting in a full mesh overlay (assuming that there is full IP reachability between all TLOCs). Let’s look at figure 7. If we want to make sure that vEdge-1 would never form an overlay tunnel to vEdge-3, we apply a policy on vSmart that filters TLOCs T5 and T6 in the TLOC route advertisements towards vEdge-1. In the end, vEdge-1 wouldn’t know about TLOCs T5 and T6 and would never establish a data plane connection with vEdge-3.
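
The effect of such a policy can be sketched in Python as follows. This is only a model of the outcome, not of the policy syntax: the TLOC labels follow the figures (with T3 and T4 assumed to belong to vEdge-2), and the function names are hypothetical. A router attempts a tunnel from every local TLOC to every remote TLOC it receives, so filtering the advertisement removes the corresponding tunnels.

```python
from itertools import product

local_tlocs = ["T1", "T2"]  # vEdge-1
remote_tlocs = {"vEdge-2": ["T3", "T4"], "vEdge-3": ["T5", "T6"]}


def advertised_tlocs(all_remote, filtered=frozenset()):
    """TLOCs the controller passes on to vEdge-1 after applying the filter."""
    return [t for tlocs in all_remote.values() for t in tlocs if t not in filtered]


def tunnels(local, received):
    """Every local TLOC attempts a tunnel to every received remote TLOC."""
    return list(product(local, received))


no_policy = tunnels(local_tlocs, advertised_tlocs(remote_tlocs))
with_policy = tunnels(local_tlocs, advertised_tlocs(remote_tlocs, {"T5", "T6"}))

print(len(no_policy))    # 8
print(len(with_policy))  # 4
print(with_policy)       # [('T1', 'T3'), ('T1', 'T4'), ('T2', 'T3'), ('T2', 'T4')]
```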

A TLOC route advertisement contains the following attributes:

  • Private IPv4/IPv6 addresses and ports: These are the IP addresses configured or assigned via DHCP on the vEdge’s WAN interface.
  • Public IPv4/IPv6 address/port: If the vEdge sits behind a NAT device, the outside NATed IP addresses and ports are included in the TLOC route advertisements. If the router does not sit behind NAT, the public and private addresses and ports are the same;
  • Color: The color is a logical abstraction used to identify a specific WAN interface on a WAN edge router. If no color is explicitly configured under an interface, it is marked with the default color - “default”;
  • Encapsulation type: The encapsulation could be either GRE or IPsec. To successfully form a tunnel, the encapsulation type of a TLOC must match with the remote TLOC’s encapsulation. In a typical production deployment, the encap will always be IPsec for security reasons. However, when one is studying or testing features, it is a good practice to use GRE encapsulation so that everything going through the overlay tunnels is cleartext and can be inspected with Wireshark.
  • Preference: This attribute is also referred to as a TLOC Preference, so it does not get confused with OMP Preference. It is used in the OMP best-path algorithm when comparing multiple vroutes for the same destination. Higher is better, and the default is 0;
  • Site ID: This attribute identifies the originating site for this TLOC route. WAN edge routers will never attempt to form an overlay tunnel to a remote TLOC that has the same site-id.
  • Tag: A user-defined value that can be acted upon in a control policy;
  • Weight: Weight is a parameter used to achieve unequal traffic distribution across multiple local TLOCs with equal preferences. A higher weight value sends more traffic to the local TLOC. For example, if TLOC-A is configured with a weight of 20 and TLOC-B with a weight of 1, then approximately 20 flows are sent out via TLOC-A for every flow sent out via TLOC-B.
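
The weight example can be worked out numerically. The sketch below assumes that flows are distributed in direct proportion to the configured weights, which is what the 20:1 example implies; the function name is illustrative.

```python
def traffic_share(weights):
    """Fraction of flows per TLOC, assuming distribution proportional to weight."""
    total = sum(weights.values())
    return {tloc: weight / total for tloc, weight in weights.items()}


shares = traffic_share({"TLOC-A": 20, "TLOC-B": 1})
print(round(shares["TLOC-A"], 3))  # 0.952
print(round(shares["TLOC-B"], 3))  # 0.048
```

In other words, roughly 95% of the flows leave via TLOC-A and about 5% via TLOC-B.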

You can see a real example of a TLOC route in the output below:

vEdge-1# show omp tlocs
---------------------------------------------------
tloc entries for 1.2.3.4
                 public-internet
                 ipsec
---------------------------------------------------
            RECEIVED FROM:                   
peer            1.1.1.30
status          C,I,R
loss-reason     not set
lost-to-peer    not set
lost-to-path-id not set
    Attributes:
     attribute-type    installed
     encap-key         not set
     encap-proto       0
     encap-spi         261
     encap-auth        sha1-hmac,ah-sha1-hmac
     encap-encrypt     aes256
     public-ip         24.5.4.1
     public-port       12366
     private-ip        192.168.1.2
     private-port      12366
     public-ip         ::
     public-port       0
     private-ip        ::
     private-port      0
     bfd-status        up
     domain-id         not set
     site-id           14
     overlay-id        not set
     preference        0
     tag               not set
     stale             not set
     weight            1
     version           3
    gen-id             0x80000007
     carrier           default
     restrict          0
     on-demand          0
     groups            [ 0 ]
     bandwidth         0
     qos-group         default-group
     border             not set
     unknown-attr-len  not set

Service routes

A service route represents a network service such as a firewall or a load balancer connected to a WAN Edge router. Network services are often deployed in one or several centralized locations, for example, in a data center or regional hub site. The network must be able to reroute traffic from any remote location in the overlay through these services and then route the traffic back to its original destination. This is called service chaining and is done using service routes.

Figure 8. Service Routes

Let’s look at figure 8, for example. vEdge-3 connects to a firewall (5.5.5.5) that provides an FW service to the overlay network. A key point here is that the device providing the network service must be layer-2 adjacent to the vEdge router (there must be no layer-3 nodes in between). Additionally, a network service is always associated with a VPN. 

Configuring a network service on a vEdge router is as straightforward as one command line. Under the VPN configuration hierarchy, we define the service type and the service IP address as shown in the output below: 

vEdge-3
vpn 50
  service FW address 5.5.5.5
!

Once we commit the configuration, vEdge-3 will advertise the FW service to the vSmart controller using a service route via the overlay management protocol. 

A service route contains the following attributes:

  • VPN ID: The VPN that this service applies to, in our example, would be 50;
  • Service ID: The service-id defines the type of service that is being advertised. There are seven predefined values:
    • FW maps to svc-id 1;
    • IDS maps to svc-id 2;
    • IDP maps to svc-id 3;
    • Custom Services: The last four values are used for customer-defined services:
      • netsvc1 maps to svc-id 4;
      • netsvc2 maps to svc-id 5;
      • netsvc3 maps to svc-id 6;
      • netsvc4 maps to svc-id 7;
  • Originator ID: The System-IP address of the vEdge that originates the service route;
  • TLOC: The TLOC (Transport Locator) where the service is located.
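
The attributes above can be summarized as a small data structure. The Python sketch below uses the service-id mapping from the list; the class name `ServiceRoute`, its field names, and the TLOC label T5 are hypothetical, while the originator 3.3.3.3 is taken from the `show omp services` output at the end of this section.

```python
from dataclasses import dataclass

# Service name -> svc-id, as listed above.
SERVICE_IDS = {
    "FW": 1,
    "IDS": 2,
    "IDP": 3,
    "netsvc1": 4,
    "netsvc2": 5,
    "netsvc3": 6,
    "netsvc4": 7,
}


@dataclass(frozen=True)
class ServiceRoute:
    vpn_id: int
    service: str
    originator: str  # System-IP of the advertising vEdge
    tloc: str        # label of the TLOC where the service is located

    @property
    def svc_id(self) -> int:
        return SERVICE_IDS[self.service]


fw_route = ServiceRoute(vpn_id=50, service="FW", originator="3.3.3.3", tloc="T5")
print(fw_route.svc_id)                                        # 1
print(ServiceRoute(50, "netsvc2", "3.3.3.3", "T5").svc_id)    # 5
```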

The key point here, and the major difference between Cisco SD-WAN service chaining and the traditional WAN, is that no configuration is required on any remote WAN edge routers. If we look at the example shown in figure 8, this means that no configuration needs to be applied on vEdge-1 and vEdge-2. The service chaining is done entirely at the vSmart controller using a policy that is then propagated to all remote sites that must redirect traffic through the service. In contrast, in the traditional WAN, each node along the service chaining path must be manually provisioned to redirect traffic through the network service.

Figure 9. Service Routes Example

At a high level, the steps to enable service chaining are as follows:

  • One or multiple WAN edge routers advertise a network service to the vSmart controller using an OMP service route. In figure 9, vEdge-3 advertises the firewall service via a service route;
  • A policy that redirects the traffic from remote sites through the FW service is then defined on vSmart. Once processed by the FW, the traffic is forwarded to its final destination.

We can check the network services on a vEdge router using the show omp services command, as shown in the output below:

vEdge-3# show omp services
ADDRESS                                                   PATH
FAMILY   VPN    SERVICE   ORIGINATOR       FROM PEER      ID    LABEL    STATUS
-------------------------------------------------------------------------------
ipv4     50     FW        3.3.3.3          10.1.1.30      67    1004     C,I,R