Profile 14 (Partitioned MDT)
Load basic.startup.config.with.cpim.and.bgp.cfg
The basic IP addresses, L3VPN, and C-PIM between the PEs and CEs are pre-configured.
Configure multicast VPN using mLDP with P2MP.
Use a partitioned MDT instead of a default MDT.
Use BGP for auto-discovery of participating PEs.
CE1 is configured as the BSR for the C-PIM.
Use BGP as the overlay - PEs are not allowed to form C-PIM adjacencies with each other.
See answer below (scroll down).
Answer
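As a reference point, here is a minimal IOS-XE configuration sketch of what this profile looks like on a PE. Only the mVPN-relevant lines are shown; the VRF name (CUST), RD/RT values, AS number, and neighbor address are assumptions for illustration, so adjust them to the lab's actual values.

```
vrf definition CUST
 rd 100:1
 address-family ipv4
  route-target export 100:1
  route-target import 100:1
  mdt auto-discovery mldp     ! BGP auto-discovery of participating PEs
  mdt partitioned mldp p2mp   ! partitioned MDT instead of a default MDT
  mdt overlay use-bgp         ! BGP overlay; no C-PIM adjacencies between PEs
  mdt strict-rpf interface    ! per-PE RPF check (discussed further below)
!
router bgp 100
 address-family ipv4 mvpn
  neighbor 10.0.0.100 activate   ! activate toward each PE (or a route reflector)
!
ip multicast-routing vrf CUST    ! some platforms require the distributed keyword
```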
Before we get into the verification, let’s explore the theory behind partitioned MDT.
Theory/Verification
The idea behind partitioned MDT is to reduce state in the core. With “regular” default MDT using mLDP P2MP, we would have a P2MP tree rooted at every PE participating in the multicast VPN. This full mesh of P2MP trees is set up by default by:
First discovering all PEs participating in the multicast VPN using the BGP ipv4 mvpn type 1 routes
Each PE then joins a P2MP mLDP tree rooted at every other PE learned via BGP, using the FEC (root PE lo0, opaque value) found in the BGP route
In contrast, when using partitioned MDTs, MDTs are only created in the core “on demand.” When you bring up all the PEs, none of them will join any P2MP trees unless they specifically want to receive traffic from a certain PE.
A PE typically knows it wants to receive traffic from another PE based on PIM Joins from the customer site. The RPF interface points to a nexthop PE, and the local PE then joins a P2MP tree rooted at that nexthop PE.
Interestingly, with partitioned MDT we still have data MDT switchover, or S-PMSI tunnels. But we no longer have any I-PMSI tunnels. The I-PMSI is replaced by a “partitioned” PMSI.
So how does a PE know what FEC to use to receive traffic on a “partitioned” PMSI for a given PE? The FEC is learned via the type 3 route.
Examine the current BGP ipv4 mvpn table on PE1.
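The command behind this check (output not reproduced here) is:

```
PE1# show bgp ipv4 mvpn all
```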
First, notice that each PE originates a type 1 and a type 3 route. We have seen the auto-generated type 1 routes before, but they are not particularly useful in this profile: because we are using partitioned MDTs instead of a default MDT, there is no I-PMSI, so the type 1 routes are not really necessary.
The type 3 routes are of interest though. The partitioned MDT is essentially created by a wildcard S-PMSI tree rooted at each PE. We have seen the type 3 route before, which is used to indicate that “I have customer S, G traffic, so if you want to receive it on my new S-PMSI, use this FEC.”
With a wildcard (*) for both the Customer S and Customer G, it essentially means “I have an S-PMSI for any traffic (default traffic). If you wish to receive traffic on the partitioned MDT rooted at me, use this FEC.” Inspecting the details of any single type 3 route shows us that a FEC is provided as we would expect:
Notice that there is one additional entry, (*, 224.0.0.13), advertised by PE1. This happens because CE1 is the BSR. PE1 advertises the BSR using PIM messages, which are sent to 224.0.0.13. So PE1 is essentially advertising “if you want to receive any PIM traffic, I have some, so use this FEC.” Let’s inspect the details to see the opaque value.
As you can guess, all PEs are interested in receiving PIM traffic, so they all join the P2MP tree rooted at PE1, using that given opaque value. We can see this on P2. The only P2MP tree so far is this one tree.
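The P-router check here is the mLDP database, which lists each P2MP FEC with its root and downstream clients (output omitted in this write-up):

```
P2# show mpls mldp database
```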
Using this tree, all CEs will learn the BSR. Verify that CE3 has the correct RP mapping.
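The verification on CE3 (output omitted) is the standard pair:

```
CE3# show ip pim rp mapping
CE3# show ip pim bsr-router
```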
Next, let’s join a (S, G) on a customer receiver using SSM. First we enable SSM in the C-PIM (not shown) and then join (10.1.1.10, 232.1.2.3) on C2 (not shown). SSM will be easier to verify before we move on to ASM.
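Since the SSM enablement and the join were not shown, here is a sketch of what they might look like; the interface name on C2 is an assumption:

```
! Map 232/8 to SSM in the C-PIM
ip pim vrf CUST ssm default    ! on the PEs
ip pim ssm default             ! on the CEs
!
! On C2, a static IGMPv3 join for (10.1.1.10, 232.1.2.3)
interface GigabitEthernet0/1
 ip igmp version 3
 ip igmp join-group 232.1.2.3 source 10.1.1.10
```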
As we would expect, this (S, G) Join prompts a type 7 route from PE2, which is “sent to” PE1 using PE1’s RT.
However, what you might not have expected is that PE2 has now joined PE1’s partitioned MDT, which PE2 learned from PE1’s (*, *) type 3 route. We can verify this on P2:
We now see two P2MP trees on P2:
The P2MP tree for (*, 224.0.0.13) rooted at PE1 as we saw before
The partitioned P2MP tree for (*, *) rooted at PE1 with only PE2 as the downstream client
Now we can appreciate how the partitioned MDT saves state in the core. So far only PE2 is interested in receiving data traffic from PE1, so that is the only P2MP tree in the core right now (besides the tree for the BSR messages). If we used a default MDT, we would have three P2MP trees right now, with all PEs as downstream clients of all other PEs.
Note that PE2 joined PE1’s partitioned MDT before any traffic was even sent.
If we send traffic from C1 to 232.1.2.3, it flows down the partitioned P2MP tree rooted at PE1. Because we have not configured data MDTs, the traffic will not switch over to its own S-PMSI tunnel.
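As an aside, enabling data MDTs on top of the partitioned MDT would only take the usual pair of commands under the VRF; the values here are assumptions:

```
vrf definition CUST
 address-family ipv4
  mdt data mldp 64       ! allow up to 64 data MDTs for this VRF
  mdt data threshold 1   ! switch a flow over once it exceeds 1 kbps
```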
Let’s now look at ASM traffic. On C3 we join (*, 239.1.2.3) (not shown). PE3 creates a type 6 route (*, G Join) which only PE1 imports (as PE1 is the PE connected to the RP).
As we can guess this time, PE3 joins the partitioned P2MP tree rooted at PE1, which it learned via PE1’s (*, *) type 3 route. P2 still has state for only two trees, but PE3 has been added as a downstream client of the partitioned MDT rooted at PE1.
When we ping 239.1.2.3 from C1, we see the same process as we have seen before with SPT switchover. The PE in front of the source (PE1 here) originates a type 5 source-active route. The LHR joins the (S, G) tree rooted at this source.
IOS-XE Strict-RPF interface
The summary is that PE3 joins (S1, G) rooted at PE1 on its partitioned MDT and (S2, G) rooted at PE2 on its partitioned MDT. PE4 joins (S1, G) but rooted at PE2 on its partitioned MDT.
PE3 will receive the stream (S1, G) on both the PE1 and PE2 partitioned MDTs. PE3 will accept both streams because they both arrive via the RPF interface, Lspvif0. PE3 is now forwarding duplicate traffic onto the customer site.
To prevent this, the command mdt strict-rpf interface can be used. This creates an Lspvif interface per PE for each VRF, allowing the RPF check to be done per RPF neighbor (the neighbor being the upstream PE) instead of per VRF.
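The command sits under the VRF address-family on the PEs; the VRF name is an assumption:

```
vrf definition CUST
 address-family ipv4
  mdt strict-rpf interface   ! one Lspvif per upstream PE, so RPF is checked per PE
```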
Back to our lab, PE2 has joined three P2MP trees:
The (*, 224.0.0.13) tree rooted at PE1
The (*, *) tree rooted at PE1
The (*, *) tree rooted at PE3
PE2 has created three Lspvif interfaces, one to receive traffic for each tree:
We can see which interface is used for which (S, G) entry in the mroute table:
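For reference, the two checks above (outputs omitted) map to these IOS-XE commands; the VRF name is an assumption:

```
PE2# show ip multicast mpls vif   ! lists the MPLS virtual interfaces (Lspvif)
PE2# show ip mroute vrf CUST      ! shows which Lspvif each (S, G) entry uses
```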
Because we want PE2 to accept traffic on each Lspvif individually, per (S, G) entry, we use the command mdt strict-rpf interface. I don’t exactly understand why IOS-XE can’t implement this behavior by default, as appears to be the case for IOS-XR. But just remember this command when implementing partitioned MDT.
Note: upon looking at this again, I’m not clear whether IOS-XR implements this or not. I find that only a single Lmdt interface appears to be used for the VRF, not one per PE.
Summary
Profile 14 is now the preferred mVPN profile. You achieve maximal optimization of state in the core by using only P2MP trees, and only on demand. The MP2MP tree is inefficient (all traffic must go through an arbitrary root node) and prone to failure, requiring root node redundancy techniques. A default MDT built from P2MP trees requires a full mesh, as in profile 12. This full mesh of P2MP trees uses up state in the core and is maintained even when no mVPN traffic is being forwarded!
Profile 14’s use of on-demand (partitioned) P2MP trees is much more efficient. Only PEs that are forwarding traffic will have P2MP trees rooted at themselves built in the core.
Note that the following is supported with profile 14:
Extranet (sources and receivers in different VRFs)
Inter-AS
CSC (only with IOS-XR)
mLDP runs on the VRF interface
Profile 14 in global context (on IOS-XR)
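Since IOS-XR keeps coming up, here is a rough sketch of the equivalent profile 14 VRF configuration on IOS-XR; the VRF and route-policy names are assumptions:

```
multicast-routing
 vrf CUST
  address-family ipv4
   mdt partitioned mldp ipv4 p2mp   ! partitioned MDT using mLDP P2MP
   bgp auto-discovery mldp          ! BGP auto-discovery of participating PEs
   interface all enable
!
router pim
 vrf CUST
  address-family ipv4
   mdt c-multicast-routing bgp      ! BGP overlay for C-multicast signaling
   rpf topology route-policy PROFILE14
!
route-policy PROFILE14
  set core-tree mldp-partitioned-p2mp
end-policy
```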
As another note, LFA and TI-LFA will automatically provide protection for mLDP trees. You can also use RSVP-TE for FRR.
Because many operators are now using SR-only in the core, they have two options for mVPN:
Use mLDP for only multicast
You can disable IPv4 label bindings by using mpls ldp capabilities sac mldp-only (see the sketch after this list)
sac stands for State Advertisement Control
Use Tree-SID
Requires SR-PCE
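A sketch of the mLDP-only knob on IOS-XR, as referenced in the first option above:

```
mpls ldp
 capabilities sac mldp-only   ! request only mLDP state from LDP peers (no unicast label bindings)
```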