LAB 10 - Dynamic On-demand Tunnels

This lesson will explore a Cisco SD-WAN functionality called Dynamic On-demand Tunnels. We will see when this feature could come in handy when designing the overlay topology and how we configure on-demand tunnels.

The Business Need

It is very common for organizations to deploy IPsec tunnels in a full-mesh SD-WAN overlay to provide a low-latency path to voice and video services. In most cases, this leads to many idle tunnels between remote sites that rarely or never communicate with each other. Additionally, when an organization uses less-powerful WAN edge platforms such as Cisco ISR 1100 series on spoke sites, the full-mesh topology becomes a scaling limitation because, at some point, the routers can't handle the thousands of IPsec tunnels required.

There are three solutions to this scaling problem:

Changing the topology to hub-and-spoke: This is the easiest way to scale up the overlay fabric. However, spokes aren't able to communicate with other spokes via low-latency direct tunnels (all traffic goes through the hub).
Manually adjusting the topology to custom partial mesh: This option would require a complex control-plane policy and won't scale efficiently. Additionally, it is hard to predict which spokes communicate with each other in many scenarios, so it is very hard to determine which sites to set up direct tunnels between.
Using dynamic on-demand tunnels: On-demand tunnels are the most efficient way to scale up the overlay while keeping a direct low-latency path between remote sites. The topology is hub-and-spoke, but spoke-to-spoke traffic triggers the WAN edge routers to set up a temporary direct IPsec tunnel between the two sites (similar to DMVPN Phase 3).
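
To put rough numbers on the scaling problem, here is a minimal Python sketch of how many tunnels a single router terminates in each topology. It assumes one router per site and that tunnels form only between TLOCs of the same color, which matches the lab output later in this lesson; the function names are hypothetical and real deployments may differ.

```python
def full_mesh_tunnels_per_router(routers: int, colors: int) -> int:
    """Tunnels one router terminates when it peers with every other router."""
    return (routers - 1) * colors


def spoke_tunnels_per_router(hubs: int, colors: int) -> int:
    """Tunnels a spoke terminates when it only peers with the hubs."""
    return hubs * colors


if __name__ == "__main__":
    for routers in (6, 100, 1000):
        mesh = full_mesh_tunnels_per_router(routers, colors=2)
        spoke = spoke_tunnels_per_router(hubs=2, colors=2)
        print(f"{routers} routers: full-mesh={mesh}, hub-and-spoke spoke={spoke}")
```

With six routers and two colors, the full-mesh count is ten, the same as the ten BFD sessions vEdge-4 shows in the initial lab state. The spoke count stays constant as the overlay grows, which is why the hub-and-spoke baseline scales on small platforms.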

Figure 1 illustrates a typical use-case of the Cisco SD-WAN Dynamic On-demand Tunnels.

Figure 1. Dynamic On-demand Tunnels - A typical use-case

What are On-demand Tunnels?

On-demand Tunnels are a Cisco SD-WAN functionality that triggers the establishment of an IPsec overlay tunnel between two sites only when there is traffic between the WAN edge routers. After the flow of direct traffic stops and the pre-defined idle timer expires, the tunnel is placed in an inactive state. In this state, the tunnel doesn't consume bandwidth or CPU and therefore does not affect the router's performance. However, the on-demand tunnel can quickly be brought back up when direct traffic resumes.

In summary, dynamic spoke-to-spoke tunnels allow organizations to use less expensive, less powerful WAN edge routers and, at the same time, be able to scale up the overlay and use direct low-latency paths between remote sites.

How do On-demand Tunnels work?

We can break down how this functionality works within an SD-WAN overlay into four distinct parts:

The control-plane

The first step in setting up on-demand tunnels is to configure the control plane properly. There are two essential pre-requisites when enabling the functionality in Cisco SD-WAN:

All spoke sites must receive all TLOCs and vRoutes of all other spokes. In practice, this means that the initial topology (before enabling on-demand functionality) must be a full-mesh and not hub-and-spoke.
All spoke vRoutes must have a backup path set up through tloc-action in a centralized control policy, as shown in the output below. The ultimate tloc represents the potential direct path to the prefix.

Figure 2 illustrates both pre-requisites in a simple example. Notice that the topology is full-mesh (there are tunnels between vEdge-4 and vEdge-5) and also that spokes have a backup path for the prefixes of other spokes (ultimate tloc with tloc-action backup).

Figure 2. Control Plane Requirements

The output below shows an example of a vRoute that has a backup next-hop enforced via a centralized control policy. The next-hop tloc points to the HUB. The ultimate tloc points directly to the remote spoke and represents the direct data plane path to the destination prefix.

---------------------------------------------------
omp route entries for vpn 4 route 10.5.4.0/24
---------------------------------------------------
            RECEIVED FROM:                   
peer            1.1.1.30
path-id         53
label           1010
status          C,I,R
loss-reason     not set
lost-to-peer    not set
lost-to-path-id not set
    Attributes:
     originator       5.5.5.5
     type             installed
     tloc             1.1.1.1, mpls, ipsec
     ultimate-tloc    5.5.5.5, mpls, ipsec -- backup
     domain-id        not set
     overlay-id       1
     site-id          5
     preference       not set
     tag              not set
     origin-proto     connected
     origin-metric    0
     as-path          not set
     community        not set
     unknown-attr-len not set
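
As a mental model of how the two TLOCs in this vRoute relate, the following Python sketch picks the direct path when a tunnel to the ultimate tloc is up and otherwise falls back to the next-hop tloc at the hub. This is a simplified illustration based on the behavior described in this lesson, not the router's actual forwarding logic; the VRoute class and select_tloc function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

Tloc = Tuple[str, str, str]  # (system-ip, color, encapsulation)


@dataclass(frozen=True)
class VRoute:
    prefix: str
    tloc: Tloc                     # next-hop TLOC (the hub in this lesson)
    ultimate_tloc: Optional[Tloc]  # direct TLOC of the remote spoke, if set


def select_tloc(route: VRoute, active_tunnels: Set[Tloc]) -> Tloc:
    """Use the direct tunnel when it is up; otherwise send traffic via the hub."""
    if route.ultimate_tloc is not None and route.ultimate_tloc in active_tunnels:
        return route.ultimate_tloc
    return route.tloc


# Values taken from the OMP route output above.
route = VRoute(
    prefix="10.5.4.0/24",
    tloc=("1.1.1.1", "mpls", "ipsec"),
    ultimate_tloc=("5.5.5.5", "mpls", "ipsec"),
)

print(select_tloc(route, active_tunnels=set()))
print(select_tloc(route, active_tunnels={("5.5.5.5", "mpls", "ipsec")}))
```

The first call models the moment before any on-demand tunnel exists, so the hub TLOC is chosen; the second models the state after the direct tunnel to 5.5.5.5 has come up.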

We will see how to configure the control plane in the configuration portion of this lesson.

The data-plane

Okay, we said that the initial topology before implementing the on-demand functionality must be full-mesh (all spokes knowing all TLOCs). But we want to have a hub-and-spoke topology with dynamic spoke-to-spoke tunnels, right? Well, how's that going to happen?

The magic happens in the data plane. Once enabled for on-demand, a WAN edge router stops forming permanent tunnels to other routers that are also enabled for on-demand, as shown in Figure 3. Hence, if all spokes are enabled for on-demand and the hubs are not, the topology becomes hub-and-spoke: spokes do not form tunnels to other spokes (all are configured for on-demand), but they still form tunnels to the hubs (which are not enabled for on-demand).
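
The rule in the paragraph above can be expressed in a few lines of Python. This is only a sketch of the rule as described here: the router named "hub" is a hypothetical label, and the function ignores colors and same-site restrictions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set


def permanent_tunnels(on_demand: Dict[str, bool]) -> Set[FrozenSet[str]]:
    """Return router pairs that keep a permanent tunnel.

    A pair is skipped only when BOTH routers are enabled for on-demand.
    """
    return {
        frozenset((a, b))
        for a, b in combinations(on_demand, 2)
        if not (on_demand[a] and on_demand[b])
    }


# True means the router is enabled for on-demand tunnels.
routers = {"hub": False, "vEdge-4": True, "vEdge-5": True}

for pair in sorted(sorted(p) for p in permanent_tunnels(routers)):
    print(" <-> ".join(pair))
```

Only the two hub-to-spoke pairs remain; the vEdge-4 to vEdge-5 pair is left out because both ends are on-demand.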

Triggering the on-demand tunnel

The initial traffic between two spoke sites is routed through the static next-hop tloc in the vRoute, which points to the hub. This forces the traffic to go through the hub.

When a vEdge router is enabled for on-demand tunnels and receives traffic from a prefix that has an ultimate-tloc-backup, it replies via the hub and simultaneously triggers the provisioning of direct on-demand tunnels to the site sourcing the traffic (not to the individual vEdge!). This is very important: in dual-homed spoke sites, when traffic triggers the on-demand functionality, direct tunnels are formed to all WAN edge routers at the sourcing spoke site.

Figure 4. First traffic goes through the HUB

For spoke sites with multiple WAN edge routers, it is very important to configure the on-demand functionality on all routers; otherwise, the on-demand feature may behave unpredictably.
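
To illustrate the site-level (rather than router-level) nature of the trigger, here is a small Python sketch. The dual-homed site 5 with routers "vEdge-5a" and "vEdge-5b" is a hypothetical example, as is the function name; the sketch only models which direct tunnels get requested, not how they are signaled.

```python
from typing import Dict, List, Tuple


def tunnels_to_trigger(
    local_router: str,
    source_site: int,
    site_members: Dict[int, List[str]],
) -> List[Tuple[str, str]]:
    """Direct tunnels requested when traffic arrives from source_site.

    The trigger targets every WAN edge router at the sourcing site,
    not just the router that forwarded the packet.
    """
    return [(local_router, remote) for remote in site_members[source_site]]


# Hypothetical inventory: site 5 is dual-homed.
site_members = {4: ["vEdge-4"], 5: ["vEdge-5a", "vEdge-5b"]}

print(tunnels_to_trigger("vEdge-4", source_site=5, site_members=site_members))
```

Even if only one of the two site 5 routers carried the first packet, tunnels to both are requested, which is why every router at a multi-router site needs the on-demand configuration.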

Tunnel States

Once traffic triggers the setup of on-demand tunnels on both spoke sites, the vEdge routers start to monitor the state of the tunnels.

Figure 5. Direct spoke-to-spoke tunnel

An on-demand tunnel has two states:

Active - Once an on-demand tunnel is established, it is placed in the Active state: BFD is up and traffic is traversing the direct tunnel. In this state, tunnel activity is constantly monitored. When traffic stops flowing, an idle timer starts (ten minutes by default). When the timer expires, the tunnel is placed in the Inactive state.
Inactive - In the Inactive state, the tunnel is removed and no BFD probes are sent. Inactive tunnels do not use any bandwidth or CPU resources on the router.
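
The two states and the idle timer can be modeled as a tiny state machine. The Python sketch below is an illustration of the lifecycle described above, with timestamps passed in explicitly so it is easy to follow; the class and method names are hypothetical and do not correspond to anything on the router.

```python
from dataclasses import dataclass


@dataclass
class OnDemandTunnel:
    idle_timeout_s: float = 600.0  # ten minutes, the default mentioned above
    state: str = "inactive"
    last_traffic_at: float = 0.0

    def on_traffic(self, now: float) -> None:
        """Traffic (re)activates the tunnel and restarts the idle timer."""
        self.state = "active"
        self.last_traffic_at = now

    def tick(self, now: float) -> None:
        """Move to inactive once the idle timer has expired."""
        idle_for = now - self.last_traffic_at
        if self.state == "active" and idle_for >= self.idle_timeout_s:
            self.state = "inactive"


tunnel = OnDemandTunnel()
tunnel.on_traffic(now=0)
tunnel.tick(now=599)
print(tunnel.state)  # still within the idle timeout
tunnel.tick(now=600)
print(tunnel.state)  # idle timer expired
tunnel.on_traffic(now=700)
print(tunnel.state)  # new traffic brings the tunnel back
```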

On-demand Tunnels Benefits

Establishing dynamic spoke-to-spoke tunnels offers several advantages over both a plain hub-and-spoke topology and a static full mesh:

Significantly improved performance in scenarios with less-powerful routers working in a full-mesh overlay.
Significantly improved latency between spokes.
Reduced bandwidth usage.
Reduced CPU and memory usage.

Configuring On-demand Tunnels

Let's now jump into the configuration portion of this lesson and see how we can enable spoke-to-spoke tunnels in our lab topology.

The initial state

For this lab example, we will use the topology shown in figure 6 below. For the purpose of this example, I have reverted the topology back to full-mesh and deleted all centralized control policies applied in previous lessons. The overlay topology is just a default full-mesh SD-WAN fabric.

To verify this, let's check that each router has overlay tunnels to all other sites (full-mesh).

vEdge-4# show bfd sessions | t

SYSTEM   SITE                                    DETECT     TX                 
IP       ID   LOCAL COLOR   COLOR         STATE  MULTIPLIER INTERVAL UPTIME    
-------------------------------------------------------------------------------
1.1.1.1  1    mpls          mpls          up     7          1000     0:00:00:14
1.1.1.1  1    biz-internet  biz-internet  up     7          1000     0:00:00:19
1.1.1.2  1    mpls          mpls          up     7          1000     0:00:00:14
1.1.1.2  1    biz-internet  biz-internet  up     7          1000     0:00:00:19
3.3.3.3  3    mpls          mpls          up     7          1000     0:00:00:14
3.3.3.3  3    biz-internet  biz-internet  up     7          1000     0:00:00:19
5.5.5.5  5    mpls          mpls          up     7          1000     0:00:00:14
5.5.5.5  5    biz-internet  biz-internet  up     7          1000     0:00:00:18
6.6.6.6  6    mpls          mpls          up     7          2000     0:00:00:14
6.6.6.6  6    biz-internet  biz-internet  up     7          1000     0:00:00:19
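
If you want to automate this full-mesh check, the table above is easy to parse. The Python sketch below assumes the column layout is exactly as shown (eight whitespace-separated fields per data row) and uses four rows copied from the output; the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Four rows copied from the vEdge-4 output above; a real check would use all rows.
SAMPLE = """\
SYSTEM   SITE                                    DETECT     TX
IP       ID   LOCAL COLOR   COLOR         STATE  MULTIPLIER INTERVAL UPTIME
-------------------------------------------------------------------------------
1.1.1.1  1    mpls          mpls          up     7          1000     0:00:00:14
1.1.1.1  1    biz-internet  biz-internet  up     7          1000     0:00:00:19
5.5.5.5  5    mpls          mpls          up     7          1000     0:00:00:14
5.5.5.5  5    biz-internet  biz-internet  up     7          1000     0:00:00:18
"""


def parse_bfd_sessions(text: str) -> List[Dict[str, str]]:
    """Extract data rows; header and separator lines are skipped."""
    sessions = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 8 columns with the state in the 5th one.
        if len(fields) == 8 and fields[4] in ("up", "down"):
            sessions.append({
                "system_ip": fields[0],
                "site_id": fields[1],
                "local_color": fields[2],
                "remote_color": fields[3],
                "state": fields[4],
            })
    return sessions


def missing_sessions(
    sessions: List[Dict[str, str]],
    expected_peers: Set[str],
    colors: Set[str],
) -> List[Tuple[str, str]]:
    """Return (peer, color) pairs that have no BFD session in the up state."""
    up = {(s["system_ip"], s["local_color"]) for s in sessions if s["state"] == "up"}
    return sorted(
        (peer, color)
        for peer in expected_peers
        for color in colors
        if (peer, color) not in up
    )


sessions = parse_bfd_sessions(SAMPLE)
print(missing_sessions(sessions, {"1.1.1.1", "5.5.5.5"}, {"mpls", "biz-internet"}))
```

An empty list means every expected peer has an up session on every color. Once on-demand tunnels are enabled, the same check should start reporting the other spokes as missing while traffic is idle.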

Additionally, each prefix is reachable via the direct overlay path to the respective remote site. Therefore, the overlay routing is also "as it is by default".

vEdge-4# show ip routes vpn 4 | t

     ADDRESS               PATH             NEXTHOP                             
VPN  FAMILY   PREFIX       ID    PROTOCOL   IFNAME  TLOC IP  COLOR         ENCAP
--------------------------------------------------------------------------------
4    ipv4     10.1.4.0/24  0     omp        -       1.1.1.1  mpls          ipsec
4    ipv4     10.1.4.0/24  1     omp        -       1.1.1.1  biz-internet  ipsec
4    ipv4     10.1.4.0/24  2     omp        -       1.1.1.2  mpls          ipsec
4    ipv4     10.1.4.0/24  3     omp        -       1.1.1.2  biz-internet  ipsec
4    ipv4     10.3.4.0/24  0     omp        -       3.3.3.3  mpls          ipsec
4    ipv4     10.3.4.0/24  1     omp        -       3.3.3.3  biz-internet  ipsec
4    ipv4     10.4.4.0/24  0     connected  ge0/4   -        -             -    
4    ipv4     10.5.4.0/24  0     omp        -       5.5.5.5  mpls          ipsec
4    ipv4     10.5.4.0/24  1     omp        -       5.5.5.5  biz-internet  ipsec
4    ipv4     10.6.4.0/24  0     omp        -       6.6.6.6  mpls          ipsec
4    ipv4     10.6.4.0/24  1     omp        -       6.6.6.6  biz-internet  ipsec
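
The same approach works for the routing table. The following Python sketch groups the next-hop TLOC IPs per OMP prefix, again assuming the nine-column layout shown above and using a handful of rows copied from the output; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set

# Rows copied from the vEdge-4 output above.
SAMPLE_ROUTES = """\
4    ipv4     10.1.4.0/24  0     omp        -       1.1.1.1  mpls          ipsec
4    ipv4     10.1.4.0/24  2     omp        -       1.1.1.2  mpls          ipsec
4    ipv4     10.4.4.0/24  0     connected  ge0/4   -        -             -
4    ipv4     10.5.4.0/24  0     omp        -       5.5.5.5  mpls          ipsec
4    ipv4     10.5.4.0/24  1     omp        -       5.5.5.5  biz-internet  ipsec
"""


def omp_next_hops(text: str) -> Dict[str, Set[str]]:
    """Map each OMP-learned prefix to the set of TLOC IPs it resolves to."""
    next_hops: Dict[str, Set[str]] = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        # Columns: VPN, family, prefix, path-id, protocol, ifname, TLOC IP, color, encap
        if len(fields) == 9 and fields[4] == "omp":
            next_hops[fields[2]].add(fields[6])
    return dict(next_hops)


for prefix, tlocs in sorted(omp_next_hops(SAMPLE_ROUTES).items()):
    print(prefix, sorted(tlocs))
```

In this initial state, 10.5.4.0/24 resolves only to 5.5.5.5, the TLOC of the site that owns the prefix, which confirms the default direct-path routing.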

There is nothing fancy about the initial topology, so let's go ahead and demonstrate the Cisco SD-WAN on-demand functionality.

Step 1: Configuring the control plane

The first step in enabling the feature is to configure the control plane. We will need to provision a new centralized control policy that includes the tloc-action backup. This will, later on, ensure that spokes know that a direct path to the destination prefix exists so they can trigger the establishment of on-demand tunnels.

Comments

rh1116

Fri, 02/10/2023 - 13:55

vManage has supported this feature since v20.3

r.casagrande

Wed, 12/06/2023 - 13:56

Hi,
excuse me, but I didn't understand how to configure a sequence in vManage that only matches TLOCs and accepts them. The system requires a site or other match conditions.

sequence 1
match tloc
!
action accept

johanpsmith

Thu, 03/07/2024 - 07:59

How does per-tunnel QoS fit into a topology with on-demand tunnels enabled, seeing as per-tunnel QoS is only supported on a hub-and-spoke topology?

boris.kudr

Fri, 04/26/2024 - 03:58

Should the Primary and Backup labels in Figure 2 be switched around?

Asif1597

Sun, 06/09/2024 - 08:58

I think not. TLOC action backup means going directly to the spoke device instead of the hub if the destination prefix resides at a spoke site. But because we are enabling on-demand, when there is no spoke-to-spoke connection, we don't need to bring up the BFD session. That's why traffic goes to the hub by default. The only difference is that if we don't configure on-demand, all BFD sessions will be up and the spoke-to-spoke path will be used when needed, as always. The main point here is turning off the BFD sessions.

ext-joallyson.castro

Fri, 08/09/2024 - 18:12

Hi Ivan! I think there is an error. Figure 2 says the backup path is 5.5.5.5 (spoke 5), and the output of the command below it shows the "ultimate-tloc" as 5.5.5.5 (backup). The problem is in step 1: you set up the HUB's TLOC in seq 21 and 31 as a backup. Shouldn't it be the SPOKES? I'd appreciate it if you could explain. Thanks!

gab.agl

Tue, 09/10/2024 - 00:14

Hi Ivan.

Is it possible to have an on-demand tunnel topology on some VPNs and a full mesh on others?

rohanahmed

Sun, 12/01/2024 - 20:26

A question for anyone going through this doc who needs to understand on-demand tunnels in detail, and for Ivan as well: is there really a need to have the same colors on cEdge? This document seems to be specific to vEdge, and the Cisco docs don't mention matching on color in the control policy.

wanke2024

Thu, 01/30/2025 - 14:45

If I use the on-demand function on cEdge, none of the edges have a BFD connection, so I can't continue my lab. Why?