This lesson will show a scenario with several tasks that we will solve using different SD-WAN techniques. Along the way, we will learn how to approach different topology and traffic engineering requirements.

Scenario Overview

A client has an SD-WAN network consisting of two data centers and multiple branches. Each site connects to two transport colors - biz-internet and mpls (restricted). The client has the following requirements, illustrated in figure 1 below:

Figure 1. Scenario Overview
  • Topology Requirements
    • DC1 must be configured as a primary hub.
    • DC2 must be configured as a backup hub.
    • Branches must connect to the two data centers in a hub-and-spoke fashion.
  • Traffic Engineering Requirements
    • In normal circumstances:
      • Traffic between spokes must go through the primary hub for all VPNs.
      • Traffic to DC2 must also go through the primary hub for all VPNs.
    • In case the primary hub is down:
      • Branches should automatically fail over to the backup hub when the primary hub goes down.
    • In case DC1's biz-internet and DC2's mpls colors are down at the same time:
      • Branches must be able to communicate with both data centers.

Lab topology and initial configs

Figure 2 below illustrates the initial topology we will use for this lab scenario. All system-IPs, site-ids, and IP addresses are according to the diagram.

Figure 2. Initial Topology

All devices have joined the overlay fabric and have a VPN 0 configuration as per figure 2. There is no policy applied on vSmart, and everything else is left at its defaults. There are four service-side VPNs - VPN 3 through VPN 6. Each WAN edge router uses the following service-side IP addressing scheme:

10.[site-id].[vpn-id].0/24

For example, vEdge-4's VPN 5 subnet is 10.4.5.0/24, and so on.
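Applying this scheme, the service-side configuration of each vEdge follows mechanically. Here is a hedged sketch for vEdge-4's VPN 5 (the interface name and the .1 host address are assumptions, not taken from the lab configs):

// on vEdge-4 (site-id 4), VPN 5 service-side interface (illustrative)
vpn 5
 interface ge0/2
  ip address 10.4.5.1/24
  no shutdown
 !
!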

The initial configuration of each SD-WAN device can be found on the section page here.

Scenario requirements

Okay, let's now readjust the scenario requirements according to our lab topology:

  • vEdges 1 and 2 will be the primary hub.
  • vEdge-6 will be the backup hub.
  • vEdges 3, 4, and 5 will be spokes.
  • The overlay topology must be hub-and-spoke with branches connecting to both data centers.
  • Traffic between spokes must go through the primary hubs (vEdges 1 and 2) for all VPNs.
  • If the primary hubs go down or lose all transport colors, spokes must automatically fall back to the backup hub (vEdge-6).
  • If the tunnels between DC1 and DC2 go down, spokes must still be able to communicate with DC2.

Figure 3 shows the required overlay topology in the context of our lab.

Figure 3. Lab Overlay Topology

Okay, stop here and try to solve the requirements by yourself. Think of how you would approach each requirement and what technologies you would use. Try to imagine all failure scenarios and how your solution fits in. Then return to the lesson and check our solution.

Solution

Let's now see how we can approach each requirement and what technologies we will use.

Step 1: Creating the necessary site-lists

Typically, when we plan a new Cisco SD-WAN deployment, the first step is always to group sites with the same role into site lists. The site-id is arguably the most important parameter in the entire Cisco SD-WAN solution.

A site-id is a unique location identifier in the overlay network with a value from 1 through 4294967295 (2^32 - 1). All vEdge routers that reside at the same site must be configured with the same site-id. By default, overlay tunnels are not formed between vEdge routers that share the same site-id.
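For reference, the site-id is configured under the system section of each device. A minimal sketch for vEdge-1, following our lab diagram (the organization-name and vBond address are placeholders, not part of the lab configs):

// on vEdge-1 (illustrative values where noted)
system
 host-name         vEdge-1
 system-ip         1.1.1.1
 site-id           1
 organization-name EXAMPLE-ORG
 vbond             vbond.example.com
!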

Each organization develops its own site-id convention depending on its specific design. Figure 4 below shows a simple example of a site-id convention using 7 digits (out of a maximum of 10). The first digit denotes the site region, the second digit denotes the site role, the third digit denotes the site type, and digits 4 through 7 denote the unique site number.

Figure 4. Cisco SD-WAN site-id convention

Using this site-id convention, network administrators can easily match, for example, all branches located in Europe (site-ids 2000000-2999999), all hubs located in North America (site-ids 1100000-1199999), or all branches located in Europe that have a single color (site-ids 2210000-2219999), and so on.
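Such ranges translate directly into site-lists in a centralized policy. A hedged sketch (the list names are hypothetical):

policy
 lists
  site-list EU-BRANCHES
   site-id 2000000-2999999
  !
  site-list NA-HUBS
   site-id 1100000-1199999
  !
 !
!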

Remember - the better the site-id design, the easier it is to design and implement policies!

In our particular lesson, we use the simplest possible site-id values, from 1 through 6. However, let's still group the sites by their role into different site-lists:

policy
 lists
  site-list PRIMARY-HUB
   site-id 1
  !
  site-list BACKUP-HUB
   site-id 6
  !
  site-list SPOKES
   site-id 3-5
!

Step 2: Adjusting the overlay topology

According to the requirements, we must achieve the hub-and-spoke topology, as illustrated in figure 3. In Cisco SD-WAN, we control the overlay topology by controlling which tloc routes are advertised to which particular sites. The logic is the following: if a site doesn't receive the tlocs of another site, overlay tunnels between the two sites will never be established (because the routers don't know the remote routers' tlocs).

In our lab scenario, we will need to ensure that spokes do not receive tlocs of other spokes. This will prevent the establishment of overlay tunnels between branches; therefore, the topology will become hub-and-spoke.

To implement this logic, we will create a control policy and apply it in an outbound direction to the SPOKES site list. A general rule of thumb is that we use an outbound control policy if we want to influence the OMP advertisements to a particular site list. If we're going to influence the OMP routing of the entire overlay fabric, we use an inbound control policy. If you don't feel confident about policy directions, check out our lesson that explains the difference in great detail.
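As a quick illustration of the two directions (the policy and list names here are hypothetical, not part of our lab):

apply-policy
 site-list SPOKES
  // out: filters what vSmart advertises TO the matched sites
  control-policy FILTER-TO-SPOKES out
 !
 site-list ALL-SITES
  // in: filters what vSmart accepts FROM the matched sites,
  // thereby influencing the routing of the entire fabric
  control-policy FILTER-FROM-SITES in
 !
!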

In our case, we want to influence the tloc advertisement to spokes; therefore, we will configure a new control policy and apply it to the SPOKES site list in an outbound direction.

policy
 control-policy VSMART-TO-SPOKES
  // allowing spokes to receive primary-hub's tlocs
  sequence 11
   match tloc
    site-list PRIMARY-HUB
   !
   action accept
  !
  // allowing spokes to receive backup-hub's tlocs
  sequence 21
   match tloc
    site-list BACKUP-HUB
   !
   action accept
  !
  // all spokes' tlocs will hit the default reject
  default-action reject
!

Now let's apply the policy in the outbound direction (from the perspective of vSmart) and see the results.

apply-policy
 site-list SPOKES
  control-policy VSMART-TO-SPOKES out
 !

Let's check the topology on any of the spokes (vEdge-3, for example). We can see that each spoke has overlay tunnels to the primary hub (highlighted in yellow) and to the backup hub (in orange). There are no spoke-to-spoke tunnels.

vEdge-3# show bfd sessions | t

SYSTEM   SITE                                     DETECT      TX                  
IP       ID    LOCAL COLOR   COLOR         STATE  MULTIPLIER  INTERVAL  UPTIME   
---------------------------------------------------------------------------------
1.1.1.1  1     mpls          mpls          up     7           1000     0:00:09:54
1.1.1.1  1     biz-internet  biz-internet  up     7           1000     0:00:09:54
2.2.2.2  1     mpls          mpls          up     7           1000     0:00:09:54
2.2.2.2  1     biz-internet  biz-internet  up     7           1000     0:00:09:54
6.6.6.6  6     mpls          mpls          up     7           2000     0:00:09:25
6.6.6.6  6     biz-internet  biz-internet  up     7           1000     0:00:09:25

If we check the primary hub (vEdge-1), we will see that it has overlay tunnels to the spokes (in yellow) and to the backup hub (in green).

vEdge-1# show bfd sessions | t

SYSTEM   SITE                                     DETECT      TX                  
IP       ID    LOCAL COLOR   COLOR         STATE  MULTIPLIER  INTERVAL  UPTIME    
----------------------------------------------------------------------------------
3.3.3.3  3     mpls          mpls          up     7           1000      0:01:25:59
3.3.3.3  3     biz-internet  biz-internet  up     7           1000      0:01:25:59
4.4.4.4  4     mpls          mpls          up     7           1000      0:01:25:59
4.4.4.4  4     biz-internet  biz-internet  up     7           1000      0:01:25:59
5.5.5.5  5     mpls          mpls          up     7           1000      0:01:26:00
5.5.5.5  5     biz-internet  biz-internet  up     7           1000      0:01:26:00
6.6.6.6  6     mpls          mpls          up     7           2000      0:01:25:53
6.6.6.6  6     biz-internet  biz-internet  up     7           1000      0:01:25:53

Therefore, the topology is as intended. However, there is no OMP routing at the moment because all vRoutes hit the default reject clause at the end of the control policy applied to spokes. So let's move on and approach the traffic engineering requirements.

Step 3. Traffic Engineering

The topology is hub-and-spoke, with both data centers being hubs. Now, we want to ensure that spokes communicate with each other via the primary hubs and that spokes communicate with DC2 via the primary hubs as well.

In normal circumstances, the traffic between spokes and the traffic from spokes to DC2 must go through the primary hub, as illustrated in figure 5 below.

Figure 5. TE in normal circumstances

However, in case DC1 goes down, the traffic between spokes must go through the backup hub, as illustrated in figure 6 below.

Figure 6. TE when the primary hub is down

Okay, when it comes to OMP routing, there is something important to remember about the Cisco SD-WAN solution (and basically about any SD-WAN solution):

  • There is no auto-discovery of routing peers and next-hops as in traditional routing protocols such as EIGRP and OSPF. All nodes peer with vSmart and not between each other.
  • Subsequently, the routing topology never changes (because the topology is an overlay).
  • What actually changes is only the reachability to the next hops.

Okay, what does that mean? 

Practically speaking, this means that if we want to have a primary next-hop to a prefix and a backup next-hop (in case the primary fails), we must have overlay tunnels to both next-hops. Additionally, we must have OMP routes for that prefix via both next-hops.

  • There is no way to conditionally change the topology, for example, to establish tunnels to the backup hub only when the primary hub is down.
  • There is no way to conditionally change routes' next-hops, for example, changing the next-hop of spokes routes to point to the backup hub only when the primary hub is down.

This means we must have overlay tunnels to both data centers (primary hub and backup hub) and OMP routes pointing to both data centers' tlocs. Then in normal circumstances, spokes must prefer the primary hub's tlocs. In case these tlocs become unreachable, spokes will install the less-preferred routes to the backup hub's tlocs.

Okay, having this logic in mind, let's create a tloc-list with the tlocs of both data centers.

policy
 lists
  tloc-list PRIMARY-AND-BACKUP-TLOCS
   tloc 1.1.1.1 color mpls encap ipsec
   tloc 1.1.1.1 color biz-internet encap ipsec
   tloc 2.2.2.2 color mpls encap ipsec
   tloc 2.2.2.2 color biz-internet encap ipsec
   tloc 6.6.6.6 color mpls encap ipsec
   tloc 6.6.6.6 color biz-internet encap ipsec
  !

We will apply this tloc-list as the next-hop for each spoke prefix when advertising OMP routes to spokes. Then each prefix will point to both data centers' tlocs. Notice that each spoke's prefix will be reachable via six tlocs; therefore, there will be six OMP routes for each prefix. But recall that, by default, vSmart advertises only the best four OMP routes (send-path-limit 4) and that routers install only the best four routes into their VPN routing tables (ecmp-limit 4).

Therefore, we need to adjust the send-path-limit value on vSmart. Let's make it 16.

// on vSmart
omp
 send-path-limit  16
!

And we need to adjust the ecmp-limit value of vEdges accordingly.

// on all vEdges
omp
 ecmp-limit  16
!

Now let's configure the OMP routing in the control-policy applied to spokes.

policy
 control-policy VSMART-TO-SPOKES
  sequence 11
   match tloc
    site-list PRIMARY-HUB
   !
   action accept
  !
  sequence 21
   match tloc
    site-list BACKUP-HUB
   !
   action accept
  !
// matches all spokes' prefixes
  sequence 31
   match route
    site-list SPOKES
   !
// changes the next-hop to point to both data centers
   action accept
    set
     tloc-list PRIMARY-AND-BACKUP-TLOCS
  !
// matches all dc2's prefixes
  sequence 41
   match route
    site-list BACKUP-HUB
   !
// change the next-hop to point to both data centers
   action accept
    set
     tloc-list PRIMARY-AND-BACKUP-TLOCS
  !
// matches all dc1's prefixes
  sequence 51
   match route
    site-list PRIMARY-HUB
   !
// accepts them without changing the next-hop
   action accept
  !
  default-action reject
!

Now, if we check the routing on one of the spokes (vEdge-5, for example) for any other spoke's prefix, we will see that the routing points to both data centers. The routes pointing to the primary hub are highlighted in yellow, and the ones pointing to the backup hub in orange.

vEdge-5# show omp routes 10.3.5.0/24 | t 

                              PATH                 ATTRIBUTE                                             
VPN  PREFIX       FROM PEER   ID    LABEL  STATUS  TYPE       TLOC IP   COLOR         ENCAP  PREFERENCE  
---------------------------------------------------------------------------------------------------------
5    10.3.5.0/24  1.1.1.30    26    1006   C,I,R   installed  1.1.1.1   mpls          ipsec  -           
                  1.1.1.30    27    1006   C,I,R   installed  1.1.1.1   biz-internet  ipsec  -           
                  1.1.1.30    28    1016   C,I,R   installed  2.2.2.2   mpls          ipsec  -           
                  1.1.1.30    29    1016   C,I,R   installed  2.2.2.2   biz-internet  ipsec  -
                  1.1.1.30    30    1006   C,I,R   installed  6.6.6.6   mpls          ipsec  -           
                  1.1.1.30    31    1006   C,I,R   installed  6.6.6.6   biz-internet  ipsec  -           

If we check the routing to DC2, we will see that it also points to both data centers.

vEdge-5# sh omp route 10.6.5.0/24 | t

                                PATH              ATTRIBUTE                                            
VPN    PREFIX        FROM PEER  ID   LABEL STATUS TYPE       TLOC IP   COLOR          ENCAP  PREFERENCE
-------------------------------------------------------------------------------------------------------
5      10.6.5.0/24   1.1.1.30   10   1006  C,I,R  installed  1.1.1.1   mpls           ipsec  -         
                     1.1.1.30   11   1006  C,I,R  installed  1.1.1.1   biz-internet   ipsec  -         
                     1.1.1.30   12   1016  C,I,R  installed  2.2.2.2   mpls           ipsec  -         
                     1.1.1.30   13   1016  C,I,R  installed  2.2.2.2   biz-internet   ipsec  -
                     1.1.1.30   14   1006  C,I,R  installed  6.6.6.6   mpls           ipsec  -         
                     1.1.1.30   15   1006  C,I,R  installed  6.6.6.6   biz-internet   ipsec  -         

At this point, spokes can reach other spokes using both data centers. However, by requirement, we must ensure that DC1 is used as the primary hub and that DC2 is used only in case the primary fails.

Making DC1 primary hub

Now that the overlay topology is as intended and there is hub-and-spoke routing between the branches and both data centers, we want to make DC1 a preferred next-hop in normal circumstances.

Typically, there are a few different ways to make spokes prefer one hub over another. The key point here is whether the chosen solution must apply to all VPNs or to a particular VPN-id only.

  • If we want to influence the routing for a particular VPN-id, we manipulate the OMP preference of that VPN's routes.
  • If we want to influence the routing decision for all VPNs, we manipulate the TLOC preference of the next hops.
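For contrast, a per-VPN variant would match the routes of a specific VPN and set the OMP route preference instead. A hedged sketch (the policy name, VPN-id, and preference value are illustrative only):

policy
 control-policy PER-VPN-EXAMPLE
  // prefer the primary hub's routes, but only within VPN 3
  sequence 11
   match route
    vpn 3
    site-list PRIMARY-HUB
   !
   action accept
    set
     preference 200
    !
   !
  !
!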

In our example, we want to make the primary hub the preferred next-hop for all VPNs; therefore, we will set a higher TLOC preference on DC1's tlocs.

vSmart# conf t
Entering configuration mode terminal
vSmart(config)# 
vSmart(config)# policy control-policy VSMART-TO-SPOKES sequence 11 action accept set preference 110
vSmart(config-set)# commit and-quit
Commit complete.
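As an aside, a similar effect can be achieved without a centralized policy by configuring the preference directly under the hubs' tunnel interfaces. Note that this hedged sketch (not applied in our lab) raises the tloc preference fabric-wide, not just toward the SPOKES site-list:

// on vEdge-1 and vEdge-2, repeated for each WAN interface (illustrative)
vpn 0
 interface ge0/0
  tunnel-interface
   preference 110
  !
 !
!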

Now, if we check the routing, we will see that only the routes pointing to the primary hub are chosen as best (C flag) and installed in the routing table (I flag), because they have a higher TLOC preference than those pointing to DC2.

vEdge-5# sh omp route 10.6.5.0/24 | t --> routing from a spoke to DC2
                                    PATH               ATTRIBUTE                                           
VPN    PREFIX        FROM PEER      ID   LABEL  STATUS TYPE       TLOC IP   COLOR         ENCAP  PREFERENCE
-----------------------------------------------------------------------------------------------------------
5      10.6.5.0/24   1.1.1.30       10   1006   C,I,R  installed  1.1.1.1   mpls          ipsec  -         
                     1.1.1.30       11   1006   C,I,R  installed  1.1.1.1   biz-internet  ipsec  -         
                     1.1.1.30       12   1016   C,I,R  installed  2.2.2.2   mpls          ipsec  -         
                     1.1.1.30       13   1016   C,I,R  installed  2.2.2.2   biz-internet  ipsec  -         
                     1.1.1.30       14   1006   R      installed  6.6.6.6   mpls          ipsec  -         
                     1.1.1.30       15   1006   R      installed  6.6.6.6   biz-internet  ipsec  -         

vEdge-5# show omp routes 10.3.5.0/24 | t --> routing from a spoke to a spoke

                                    PATH               ATTRIBUTE                                           
VPN    PREFIX        FROM PEER      ID   LABEL  STATUS TYPE       TLOC IP   COLOR         ENCAP  PREFERENCE
-----------------------------------------------------------------------------------------------------------
5      10.3.5.0/24   1.1.1.30       26   1006   C,I,R  installed  1.1.1.1   mpls          ipsec  -         
                     1.1.1.30       27   1006   C,I,R  installed  1.1.1.1   biz-internet  ipsec  -         
                     1.1.1.30       28   1016   C,I,R  installed  2.2.2.2   mpls          ipsec  -         
                     1.1.1.30       29   1016   C,I,R  installed  2.2.2.2   biz-internet  ipsec  -         
                     1.1.1.30       30   1006   R      installed  6.6.6.6   mpls          ipsec  -         
                     1.1.1.30       31   1006   R      installed  6.6.6.6   biz-internet  ipsec  -         

The ultimate test would be to check the data path from a spoke (vEdge-5) to DC2. We can see that the traffic goes through the primary hub.

vEdge-5(SPOKE)# traceroute vpn 5 10.6.5.1
Traceroute  10.6.5.1 in VPN 5
traceroute to 10.6.5.1 (10.6.5.1), 30 hops max, 60 byte packets
 1  10.1.5.2 (10.1.5.2)  31.802 ms  31.410 ms -->> PRIMARY-DC (vEdge-1)
 2  10.6.5.1 (10.6.5.1)  51.854 ms  72.212 ms -->> DR-DC (vEdge-6)

Lastly, we need to verify that the solution works for all VPNs. If we check the routing in VPN 4, for example, we can see that the routes pointing to the primary hubs are best.

vEdge-5# show omp routes vpn 4 10.3.4.0/24 | t

                 PATH                      ATTRIBUTE                                                       
FROM PEER        ID     LABEL    STATUS    TYPE       TLOC IP          COLOR            ENCAP  PREFERENCE  
-----------------------------------------------------------------------------------------------------------
1.1.1.30         34     1005     C,I,R     installed  1.1.1.1          mpls             ipsec  -           
1.1.1.30         35     1005     C,I,R     installed  1.1.1.1          biz-internet     ipsec  -           
1.1.1.30         36     1015     C,I,R     installed  2.2.2.2          mpls             ipsec  -           
1.1.1.30         37     1015     C,I,R     installed  2.2.2.2          biz-internet     ipsec  -           
1.1.1.30         38     1009     R         installed  6.6.6.6          mpls             ipsec  -           
1.1.1.30         39     1009     R         installed  6.6.6.6          biz-internet     ipsec  -           

So far, so good. Let's now see how the network behaves in different failure scenarios.

Failure Scenario: No overlay tunnels between DC1 and DC2

At the moment, DC2's prefixes have a next-hop pointing to the primary hubs. What do you think will happen if we shut down DC1's biz-internet color and DC2's mpls color?

Figure 7. Tunnels between hubs are down

Let's find out.

//shutting down the ge0/0 interfaces of vEdges 1 and 2
vEdge-1/2# conf t
Entering configuration mode terminal
vEdge-1/2(config)# vpn 0 interface ge0/0 shutdown 
vEdge-1/2(config-interface-ge0/0)# commit and-quit 
Commit complete.
!
//shutting down the ge0/1 interface of vEdge-6
vEdge-6# conf t
Entering configuration mode terminal
vEdge-6(config)# vpn 0 interface ge0/1 shutdown 
vEdge-6(config-interface-ge0/1)# commit and-quit 
Commit complete.

You can see that the routing for DC2's prefix still points to the primary hub; the only difference is that the biz-internet routes are gone.

vEdge-5# show omp routes vpn 5 10.6.5.0/24 | t -->checking the routing from a spoke to DC2

                 PATH                      ATTRIBUTE                                                       
FROM PEER        ID     LABEL    STATUS    TYPE       TLOC IP          COLOR            ENCAP  PREFERENCE  
-----------------------------------------------------------------------------------------------------------
1.1.1.30         185    1006     C,I,R     installed  1.1.1.1          mpls             ipsec  -           
1.1.1.30         186    1016     C,I,R     installed  2.2.2.2          mpls             ipsec  -           
1.1.1.30         187    1006     R         installed  6.6.6.6          biz-internet     ipsec  -           

So the traffic from spokes to DC2 still goes to the primary hub. But the primary hub no longer has overlay tunnels to DC2! And if we try to ping DC2's prefix, we will see that spokes cannot reach it.

vEdge-5# ping 10.6.5.1 vpn 5
Ping in VPN 5
PING 10.6.5.1 (10.6.5.1) 56(84) bytes of data.
^C
--- 10.6.5.1 ping statistics ---
52 packets transmitted, 0 received, 100% packet loss, time 52237ms

You can see that, even though spokes have direct overlay tunnels to DC2, they cannot reach it, because their OMP routing points to the primary hub, and the primary hub doesn't have active tunnels to DC2.

Okay, let's bring back (no shutdown) DC1's biz-internet interfaces and DC2's mpls interface, and check out how we can solve this problem.

Failure Scenario: The solution

The solution to this failure scenario is to enable the vSmart controller to perform end-to-end reachability tracking over the complete overlay path from the spokes through DC1 to DC2.
