Profile 14 (Partitioned MDT)
Load basic.startup.config.with.cpim.and.bgp.cfg
The basic IP addresses, L3VPN, and C-PIM between the PEs and CEs are pre-configured.
Configure multicast VPN using mLDP with P2MP.
Use a partitioned MDT instead of a default MDT.
Use BGP for auto-discovery of participating PEs.
CE1 is configured as the BSR for the C-PIM.
Use BGP as the overlay - PEs are not allowed to form C-PIM adjacencies with each other.
See answer below (scroll down).
Answer
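As a reference point, here is a minimal IOS-XE configuration sketch of what this profile looks like on a PE. Only the mVPN-relevant lines are shown; the VRF name (CUST), RD/RT values, AS number, and neighbor address are assumptions for illustration, so adjust them to the lab's actual values.

```
vrf definition CUST
 rd 100:1
 address-family ipv4
  route-target export 100:1
  route-target import 100:1
  mdt auto-discovery mldp     ! BGP auto-discovery of participating PEs
  mdt partitioned mldp p2mp   ! partitioned MDT instead of a default MDT
  mdt overlay use-bgp         ! BGP overlay; no C-PIM adjacencies between PEs
  mdt strict-rpf interface    ! per-PE RPF check (discussed further below)
!
router bgp 100
 address-family ipv4 mvpn
  neighbor 10.0.0.100 activate   ! activate toward each PE (or a route reflector)
!
ip multicast-routing vrf CUST    ! some platforms require the distributed keyword
```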
Before we get into the verification, let’s explore the theory behind partitioned MDT.
Theory/Verification
The idea behind partitioned MDT is to reduce state in the core. With “regular” default MDT using mLDP P2MP, we would have a P2MP tree rooted at every PE participating in the multicast VPN. This full mesh of P2MP trees is set up by default by:
First discovering all PEs participating in the multicast VPN using the BGP ipv4 mvpn type 1 routes
Each PE then joins a P2MP mLDP tree rooted at every other PE learned via BGP, using the FEC (root PE lo0, opaque value) found in the BGP route
In contrast, when using partitioned MDTs, MDTs are only created in the core “on demand.” When you bring up all the PEs, none of them will join any P2MP trees unless they specifically want to receive traffic from a certain PE.
A PE typically knows it wants to receive traffic from another PE based on PIM Joins from the customer site. The RPF interface points to a nexthop PE, and the local PE then joins a P2MP tree rooted at that nexthop PE.
Interestingly, with partitioned MDT we still have data MDT switchover, or S-PMSI tunnels. But we no longer have any I-PMSI tunnels. The I-PMSI is replaced by a “partitioned” PMSI.
So how does a PE know what FEC to use to receive traffic on a “partitioned” PMSI for a given PE? The FEC is learned via the type 3 route.
Examine the current BGP ipv4 mvpn table on PE1.
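The command behind this check (output not reproduced here) is:

```
PE1# show bgp ipv4 mvpn all
```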
First, notice that each PE originates a type 1 and a type 3 route. We have seen the auto-generated type 1 routes before, but they are not particularly useful in this profile: because we are using partitioned MDTs instead of a default MDT, there is no I-PMSI, so the type 1 routes are not really necessary.
The type 3 routes are of interest though. The partitioned MDT is essentially created by a wildcard S-PMSI tree rooted at each PE. We have seen the type 3 route before, which is used to indicate that “I have customer S, G traffic, so if you want to receive it on my new S-PMSI, use this FEC.”
With a wildcard (*) for both the Customer S and Customer G, it essentially means “I have an S-PMSI for any traffic (default traffic). If you wish to receive traffic on the partitioned MDT rooted at me, use this FEC.” Inspecting the details of any single type 3 route shows us that a FEC is provided as we would expect:
Notice that there is one additional entry, (*, 224.0.0.13), advertised by PE1. This happens because CE1 is the BSR. PE1 advertises the BSR using PIM messages, which are sent to 224.0.0.13. So PE1 is essentially advertising “if you want to receive any PIM traffic, I have some, so use this FEC.” Let’s inspect the details to see the opaque value.
As you can guess, all PEs are interested in receiving PIM traffic, so they all join the P2MP tree rooted at PE1, using that given opaque value. We can see this on P2. The only P2MP tree so far is this one tree.
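The P-router check here is the mLDP database, which lists each P2MP FEC with its root and downstream clients (output omitted in this write-up):

```
P2# show mpls mldp database
```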
Using this tree, all CEs will learn the BSR. Verify that CE3 has the correct RP mapping.
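The verification on CE3 (output omitted) is the standard pair:

```
CE3# show ip pim rp mapping
CE3# show ip pim bsr-router
```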
Next, let’s join a (S, G) on a customer receiver using SSM. First we enable SSM in the C-PIM (not shown) and then join (10.1.1.10, 232.1.2.3) on C2 (not shown). SSM will be easier to verify before we move on to ASM.
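Since the SSM enablement and the join were not shown, here is a sketch of what they might look like; the interface name on C2 is an assumption:

```
! Map 232/8 to SSM in the C-PIM
ip pim vrf CUST ssm default    ! on the PEs
ip pim ssm default             ! on the CEs
!
! On C2, a static IGMPv3 join for (10.1.1.10, 232.1.2.3)
interface GigabitEthernet0/1
 ip igmp version 3
 ip igmp join-group 232.1.2.3 source 10.1.1.10
```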
As we would expect, this (S, G) Join prompts a type 7 route from PE2, which is “sent to” PE1 using PE1’s RT.
However, what you might not have expected is that PE2 has now joined PE1’s partitioned MDT, which PE2 learned from PE1’s (*, *) type 3 route. We can verify this on P2:
We now see two P2MP trees on P2:
The P2MP tree for (*, 224.0.0.13) rooted at PE1 as we saw before
The partitioned P2MP tree for (*, *) rooted at PE1 with only PE2 as the downstream client
Now we can appreciate how the partitioned MDT saves state in the core. So far only PE2 is interested in receiving data traffic from PE1, so that is the only P2MP tree in the core right now (besides the tree for the BSR messages). If we used a default MDT, we would have three P2MP trees right now, with all PEs as downstream clients of all other PEs.
Note that PE2 joined PE1’s partitioned MDT before any traffic was even sent.
If we send traffic from C1 to 232.1.2.3, it flows down the partitioned P2MP tree rooted at PE1. Because we have not configured data MDTs, the traffic will not switch over to its own S-PMSI tunnel.
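As an aside, enabling data MDTs on top of the partitioned MDT would only take the usual pair of commands under the VRF; the values here are assumptions:

```
vrf definition CUST
 address-family ipv4
  mdt data mldp 64       ! allow up to 64 data MDTs for this VRF
  mdt data threshold 1   ! switch a flow over once it exceeds 1 kbps
```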
Let’s now look at ASM traffic. On C3 we join (*, 239.1.2.3) (not shown). PE3 creates a type 6 route (*, G Join) which only PE1 imports (as PE1 is the PE connected to the RP).
As we can guess this time, PE3 joins the partitioned P2MP tree rooted at PE1, which it learned via PE1’s (*, *) type 3 route. P2 still has state for only two trees, but PE3 has been added as a downstream client of the partitioned MDT rooted at PE1.
When we ping 239.1.2.3 from C1, we see the same process as we have seen before with SPT switchover. The PE in front of the source (PE1 here) originates a type 5 source-active route. The LHR joins the (S, G) tree rooted at this source.
IOS-XE Strict-RPF interface
The summary is that PE3 joins (S1, G) rooted at PE1 on its partitioned MDT and (S2, G) rooted at PE2 on its partitioned MDT. PE4 joins (S1, G) but rooted at PE2 on its partitioned MDT.
PE3 will receive the stream (S1, G) on both the PE1 and PE2 partitioned MDTs. PE3 will accept both streams because they both arrive via the RPF interface, Lspvif0. PE3 is now forwarding duplicate traffic onto the customer site.
To prevent this, the command mdt strict-rpf interface can be used. This creates an Lspvif interface per PE for each VRF, allowing the RPF check to be done per RPF neighbor (the neighbor being the upstream PE) instead of per VRF.
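The command sits under the VRF address-family on the PEs; the VRF name is an assumption:

```
vrf definition CUST
 address-family ipv4
  mdt strict-rpf interface   ! one Lspvif per upstream PE, so RPF is checked per PE
```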
Back to our lab, PE2 has joined three P2MP trees:
The (*, 224.0.0.13) tree rooted at PE1
The (*, *) tree rooted at PE1
The (*, *) tree rooted at PE3
PE2 has created three Lspvif interfaces, one to receive traffic for each tree:
We can see which interface is used for which (S, G) entry in the mroute table:
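For reference, the two checks above (outputs omitted) map to these IOS-XE commands; the VRF name is an assumption:

```
PE2# show ip multicast mpls vif   ! lists the MPLS virtual interfaces (Lspvif)
PE2# show ip mroute vrf CUST      ! shows which Lspvif each (S, G) entry uses
```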
Because we want PE2 to accept traffic on each Lspvif individually, per (S, G) entry, we use the command mdt strict-rpf interface. I don’t exactly understand why IOS-XE can’t implement this behavior by default, as appears to be the case for IOS-XR. But just remember this command when implementing partitioned MDT.
Note: upon looking at this again, I’m not clear whether IOS-XR implements this or not. I find that only a single Lmdt interface appears to be used for the VRF, not one per PE.
Summary
Profile 14 is now the preferred mVPN profile. You achieve maximal optimization of state in the core by using only P2MP trees, and only on demand. The MP2MP tree is inefficient (all traffic must go through an arbitrary root node) and prone to failure, requiring root node redundancy techniques. A default MDT built from P2MP trees requires a full mesh, as in profile 12. This full mesh of P2MP trees uses up state in the core and is maintained even when no mVPN traffic is being forwarded!
Profile 14’s use of on-demand (partitioned) P2MP trees is much more efficient. Only PEs that are forwarding traffic will have P2MP trees rooted at themselves built in the core.
Note that the following is supported with profile 14:
Extranet (sources and receivers in different VRFs)
Inter-AS
CSC (only with IOS-XR)
mLDP runs on the VRF interface
Profile 14 in global context (on IOS-XR)
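Since IOS-XR keeps coming up, here is a rough sketch of the equivalent profile 14 VRF configuration on IOS-XR; the VRF and route-policy names are assumptions:

```
multicast-routing
 vrf CUST
  address-family ipv4
   mdt partitioned mldp ipv4 p2mp   ! partitioned MDT using mLDP P2MP
   bgp auto-discovery mldp          ! BGP auto-discovery of participating PEs
   interface all enable
!
router pim
 vrf CUST
  address-family ipv4
   mdt c-multicast-routing bgp      ! BGP overlay for C-multicast signaling
   rpf topology route-policy PROFILE14
!
route-policy PROFILE14
  set core-tree mldp-partitioned-p2mp
end-policy
```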
As another note, LFA and TI-LFA will automatically provide protection for mLDP trees. You can also use RSVP-TE for FRR.
Because many operators are now using SR-only in the core, they have two options for mVPN:
Use mLDP for only multicast
You can disable IPv4 label bindings by using mpls ldp capabilities sac mldp-only (see the sketch after this list)
sac stands for State Advertisement Control
Use Tree-SID
Requires SR-PCE
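A sketch of the mLDP-only knob on IOS-XR, as referenced in the first option above:

```
mpls ldp
 capabilities sac mldp-only   ! request only mLDP state from LDP peers (no unicast label bindings)
```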