MST-AG

A customer of an L2 service may want redundant connections to two PEs in a single site. This creates the possibility of a loop. Traditionally, STP allows redundant L2 connections by blocking some of the ports, producing a loop-free topology in which BUM traffic cannot cause a flooding storm.

Let’s take a simple VPLS example. Below is a single site that connects to two PEs:

We cannot simply run STP, because the two PEs are not directly connected. Additionally, BPDUs are not transported across pseudowires. So this allows for the possibility of looping in the VPLS network. Imagine that a CE at another site sends a BUM frame. PE1 on the left and PE2 will receive it and send it to the access devices. These will flood the traffic back to the PEs (BUM frame from PE1 floods back to PE2), which will again flood the traffic out the pseudowires to other PEs. We need one link in this access network to be blocked to break this loop.
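The loop described above can be sketched as a small flooding simulation. This is only an illustrative model, not how any router implements forwarding: the node names, the single dual-homed CE, the full mesh of pseudowires, and the `max_hops` cutoff are all assumptions made for the example. PEs apply VPLS split horizon (a frame received on a pseudowire is never sent to another pseudowire), and the CE floods out every unblocked port except the one the frame arrived on.

```python
from collections import deque

# Hypothetical topology: PE3 is the remote PE, CE is dual-homed to PE1 and PE2.
LINKS = [
    ("PE1", "PE2", "pw"), ("PE1", "PE3", "pw"), ("PE2", "PE3", "pw"),
    ("PE1", "CE", "ac"), ("PE2", "CE", "ac"),
]

def neighbors(node, blocked):
    """Yield (neighbor, link_kind) for every unblocked link attached to node."""
    for a, b, kind in LINKS:
        if frozenset((a, b)) in blocked:
            continue
        if a == node:
            yield b, kind
        elif b == node:
            yield a, kind

def flood(blocked=frozenset(), max_hops=12):
    """Flood one BUM frame from PE3; return (copies_forwarded, still_looping)."""
    # PE3 has already sent one copy over each pseudowire.
    queue = deque([("PE1", "PE3", "pw", 1), ("PE2", "PE3", "pw", 1)])
    forwarded, still_looping = 0, False
    while queue:
        node, came_from, in_kind, hops = queue.popleft()
        if hops >= max_hops:
            still_looping = True  # copy is still alive at the cutoff
            continue
        for nxt, kind in neighbors(node, blocked):
            if nxt == came_from:
                continue  # never send back out the ingress link
            if node.startswith("PE") and in_kind == "pw" and kind == "pw":
                continue  # VPLS split horizon
            forwarded += 1
            queue.append((nxt, node, kind, hops + 1))
    return forwarded, still_looping

print(flood())                                                # loop: copies never die out
print(flood(blocked=frozenset({frozenset(("PE2", "CE"))})))   # one uplink blocked
```

With no link blocked, copies are still circulating when the hop cutoff is reached. Blocking a single CE uplink lets the frame reach the CE once and then die out.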

Possible solutions

One solution could be for the PEs to run STP and have a direct connection between them, or have a virtual connection over a pseudowire which transports BPDUs. This would break the loop, but it requires the extra pseudowire just for BPDU propagation, and it also requires a lot of state on the PEs if they have a lot of access domains they connect to. The PEs would have to maintain STP states for every single VPLS domain. This solution isn’t very scalable.

Another solution could be for the PEs to tunnel the BPDUs without participating in STP. They could tunnel the access network’s BPDUs between each other over a special pseudowire. This would result in a loop-free topology, but if a PE-CE link goes down, failover takes a full 6 seconds (for RSTP/MST with the default 2-second hello) because 3 hellos need to be missed. This is because the PEs aren’t participating in the protocol, just transporting the BPDUs transparently. Additionally, after a topology change, traffic can be blackholed for up to 5 minutes (the typical default MAC aging time), since the PEs do not flush the MAC table in response to a TCN.
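The timing figures above are simple arithmetic. The sketch below assumes the default RSTP/MST hello time of 2 seconds and a MAC aging time of 300 seconds (a common default, but platform-dependent); the function and constant names are made up for the example.

```python
def bpdu_timeout_s(hello_time_s, missed_hellos=3):
    """Seconds before a neighbor is declared lost when BPDUs stop arriving."""
    return hello_time_s * missed_hellos

DEFAULT_HELLO_S = 2   # assumed default hello time
MAC_AGING_S = 300     # assumed default MAC aging time

print(bpdu_timeout_s(DEFAULT_HELLO_S))  # 6
print(bpdu_timeout_s(1))                # 3
print(MAC_AGING_S / 60)                 # 5.0
```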

MST-AG

The best solution is called MST Access Gateway. The PEs simply send statically configured BPDUs every hello period into the access network. Both PEs are configured with an identical virtual root bridge. This is scalable because the PEs are not actually participating in MST. They do not need to run an MST state machine. The only additional functionality they need is to respond to received TCNs by flushing the MAC table and withdrawing MACs. The PEs send MAC withdrawals within the VPLS domain to the remote PEs in response to a received TCN.

Above, each PE statically sends a BPDU reporting connectivity of 0 cost to a root bridge with priority 0 and a bridge ID of 0. This can never be beaten. Each PE has a different priority, which can be configured per-instance, so that a different topology is used for each instance in the access network.

This setup forces the access network to block a link. The PEs will never block a link. Their PE-CE links will always be DP, since the PEs are closest to the virtual root bridge.
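To see why the PE ports always end up designated, here is a simplified sketch of the spanning tree priority vector comparison (lower is better). It ignores MST's split between internal and external cost and the regional root, and the CE bridge ID and link cost are invented for the example.

```python
from typing import NamedTuple

class BridgeId(NamedTuple):
    priority: int
    mac: str  # fixed-width lowercase hex, so string order matches numeric order

class PriorityVector(NamedTuple):
    root: BridgeId
    root_cost: int
    sender: BridgeId
    sender_port: int

VIRTUAL_ROOT = BridgeId(0, "0000.0000.0000")

def pe_bpdu(priority, mac):
    """The static BPDU a PE sends: zero cost to the virtual root."""
    return PriorityVector(VIRTUAL_ROOT, 0, BridgeId(priority, mac), 1)

def best_ce_offer(heard, ce_id, link_cost):
    """Best vector a CE could send back on the link where it heard `heard`."""
    via_pe = PriorityVector(heard.root, heard.root_cost + link_cost, ce_id, 1)
    as_root = PriorityVector(ce_id, 0, ce_id, 1)  # CE claiming to be root itself
    return min(via_pe, as_root)

def pe_port_is_designated(pe_vector, ce_vector):
    """The port sending the better (lower) vector on a segment is designated."""
    return pe_vector < ce_vector

pe1 = pe_bpdu(0, "0000.0000.0001")
ce = BridgeId(8192, "00aa.bbcc.dd01")  # hypothetical access switch
print(pe_port_is_designated(pe1, best_ce_offer(pe1, ce, link_cost=1)))  # True
```

No bridge ID can sort below priority 0 with an all-zero MAC, and any path through the access network adds cost. So even an access switch configured with priority 0 cannot make the PE port give up the designated role.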

PEs can only enable MST-AG on physical interfaces or untagged subinterfaces. The MST protocol only allows for sending untagged BPDUs.

Failover time with MST-AG is not sub-50 ms. A link failure results in an outage of about 100 ms, and a node failure can cause an outage of 2-3 seconds (I believe with Hello set to 1 second). MST-AG protects against failure of a link in the access network, or a PE-CE link, and failure of a PE or access node. It can also protect against a PE losing connectivity to the core (which is mentioned later and requires extra configuration).

In this setup, there is no special PE-PE pseudowire or link. However, if a topology change happens in the access network and the access network is partitioned, the PE that receives the TCN may have to propagate the TCN to the other PE which is now partitioned from the access network. This can be done by creating a VPWS and adding the untagged interface participating in MST-AG to the VPWS (shown later).

MST Review

Before we take a look at the configuration for MST-AG, let’s review MST. MST is a standards-based variation of STP that is based on RSTP. A group of switches in the same MST region share the following configuration:

  • The same region name

  • The same revision number (administratively configured)

  • The same VLAN-to-instance mapping (verified by an MD5-based digest of the mapping that is sent in the BPDU)

MST BPDUs are sent untagged and contain the root bridge/priorities for each instance. VLANs are manually mapped to instances and this mapping is statically set, and must be the same on all bridges in the region. By having multiple instances, this allows you to use a different root for each instance, so that all links are used in the topology. This is a form of load balancing.
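The region check can be sketched as follows. This is a simplified stand-in: the real configuration digest is an HMAC-MD5 computed with a key fixed by the standard, whereas this example uses plain MD5 over a 4096-entry table purely to show the idea. The `MstConfig` type and function names are hypothetical.

```python
import hashlib
from typing import Dict, NamedTuple

class MstConfig(NamedTuple):
    name: str
    revision: int
    vlan_to_instance: Dict[int, int]  # unmapped VLANs default to instance 0

def mapping_digest(vlan_to_instance):
    """Digest of the full VLAN-to-instance table (plain MD5 as a stand-in)."""
    table = bytearray()
    for vid in range(4096):
        table += vlan_to_instance.get(vid, 0).to_bytes(2, "big")
    return hashlib.md5(bytes(table)).hexdigest()

def same_region(a, b):
    """Two bridges are in the same region only if all three items match."""
    return (a.name, a.revision, mapping_digest(a.vlan_to_instance)) == \
           (b.name, b.revision, mapping_digest(b.vlan_to_instance))

pe = MstConfig("REGION1", 1, {v: 1 for v in range(10, 101)})
ce_ok = MstConfig("REGION1", 1, {v: 1 for v in range(10, 101)})
ce_bad = MstConfig("REGION1", 1, {**{v: 1 for v in range(10, 101)}, 200: 2})

print(same_region(pe, ce_ok))   # True
print(same_region(pe, ce_bad))  # False
```

Sending a digest instead of the whole table keeps the BPDU small while still detecting any single VLAN that is mapped differently.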

The internal MST instance is instance 0. This is used to speak STP with bridges outside of the MST region. The entire MST region is presented as if it’s a single switch to the outside world. This can get complex, but I don’t think this would come up in the CCIE-SP exam, since it seems outside of the scope of MST-AG.

Configuring MST-AG

#PE1
int Gi0/0/0/0.1 l2transport
 encap untagged
!
spanning-tree mstag DOMAIN1
 int Gi0/0/0/0.1
  name REGION1
  revision 1
  bridge-id 0000.0000.0001
  port-id 1
  external-cost 10000 ! The external cost of the PE-CE link;
                      ! the cost used when running STP with a bridge outside the region
  hello-time 1        ! Available choices are 1 or 2 seconds
  !
  instance 1
   vlan-ids 10-100
   priority 0         ! The priority of this bridge
   cost 1             ! The internal cost of the PE-CE link for this instance
   root-id 0000.0000.0000
   root-priority 0
   
#PE2
int Gi0/0/0/0.1 l2transport
 encap untagged
!
spanning-tree mstag DOMAIN1
 int Gi0/0/0/0.1
  name REGION1
  revision 1
  bridge-id 0000.0000.0002
  port-id 1
  external-cost 10000 
  hello-time 1        
  !
  instance 1
   vlan-ids 10-100
   priority 4096
   cost 1
   root-id 0000.0000.0000
   root-priority 0

Guidelines for configuring MST-AG:

  • Both PE devices should have a port path cost of 0

    • This is what the doc says, but I don’t see how to configure this - lowest internal cost option is 1

  • One PE should have a higher bridge priority and ID than the other. This allows you to control which redundant link is blocked (when there is only a single CE).

    • One PE should have bridge priority 0

    • One PE should have bridge priority 4096

    • All access devices should have priority greater than or equal to 8192
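The effect of the priority guideline can be sketched for a single dual-homed CE with equal-cost uplinks. Both PEs advertise the same root at the same cost, so the CE falls through to the sender bridge ID (priority, then MAC) to pick its root port. Instance 2 and its reversed priorities are hypothetical, added only to show how a per-instance priority moves the blocked link.

```python
# Bridge priority each PE advertises per MST instance.
# Instance 2 is a hypothetical addition for illustration.
PE_PRIORITY = {
    1: {"PE1": 0, "PE2": 4096},
    2: {"PE1": 4096, "PE2": 0},
}
PE_BRIDGE_MAC = {"PE1": "0000.0000.0001", "PE2": "0000.0000.0002"}

def ce_port_roles(instance):
    """Return {pe_name: role} for the CE uplinks in one instance."""
    prio = PE_PRIORITY[instance]
    # Same root and same cost on both uplinks, so the lowest sender
    # bridge ID (priority, then MAC) wins the root port.
    ranked = sorted(prio, key=lambda pe: (prio[pe], PE_BRIDGE_MAC[pe]))
    roles = {ranked[0]: "root (forwarding)"}
    for pe in ranked[1:]:
        roles[pe] = "alternate (blocking)"
    return roles

for inst in sorted(PE_PRIORITY):
    print(inst, ce_port_roles(inst))
```

In this model the CE blocks its uplink to PE2 for instance 1 and its uplink to PE1 for instance 2, so both uplinks carry traffic for some VLANs.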

To configure MST-AG TCN propagation, you simply put the AC in a VPWS with the other PE. TCNs will be flooded out the VPWS.

#PE1
l2vpn xconnect group VPWS p2p MSTAG
 int Gi0/0/0/0.1
 neighbor 2.2.2.2 pw-id 100

#PE2
l2vpn xconnect group VPWS p2p MSTAG
 int Gi0/0/0/0.1
 neighbor 1.1.1.1 pw-id 100

MST-AG Uplink Tracking

If a PE loses connectivity to the core, it should stop sending BPDUs indicating it is directly connected to the virtual root. You can configure this as follows. All core interfaces that are defined must go down for the router to consider the tracked object down.

spanning-tree mstag DOMAIN1
 preempt delay for 10 seconds
 int Gi0/0/0/0
 track
  int Gi0/0/0/1
  int Gi0/0/0/2

Above, if both Gi0/0/0/1 and Gi0/0/0/2 go down, the router will start sending “startup BPDUs”. These indicate a worse priority than under normal circumstances. You can set the startup root priority/bridge ID as follows:

spanning-tree mstag DOMAIN1
 int Gi0/0/0/0
  instance 1 root-id 0.0.0 startup-value 0.0.1111
  instance 1 root-priority 0 startup-value 8192

When the links come back up, the router will continue sending these startup BPDUs for the preempt delay period (10 seconds above). The same delay applies when the access circuit itself first comes up: startup BPDUs are sent for the preempt delay period before the normal BPDUs begin.
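The tracking and preempt delay behavior can be modeled roughly as below. This is a sketch of the logic as described in this section, not of the router's actual implementation: the class and method names are invented, time is a plain number of seconds, and the access circuit startup case is left out.

```python
class UplinkTracker:
    """Decides whether normal or startup BPDUs should be sent."""

    def __init__(self, interfaces, preempt_delay_s=10):
        self.state = {name: True for name in interfaces}
        self.preempt_delay_s = preempt_delay_s
        self.recovered_at = None  # when the tracked object last came back up

    def tracked_up(self):
        # The tracked object is down only when ALL listed interfaces are down.
        return any(self.state.values())

    def set_link(self, name, up, now):
        was_up = self.tracked_up()
        self.state[name] = up
        if not was_up and self.tracked_up():
            self.recovered_at = now  # start the preempt delay timer

    def bpdu_mode(self, now):
        if not self.tracked_up():
            return "startup"
        if (self.recovered_at is not None
                and now - self.recovered_at < self.preempt_delay_s):
            return "startup"  # still inside the preempt delay
        return "normal"

t = UplinkTracker(["Gi0/0/0/1", "Gi0/0/0/2"])
t.set_link("Gi0/0/0/1", False, now=1)
print(t.bpdu_mode(1))    # normal: one core link is still up
t.set_link("Gi0/0/0/2", False, now=2)
print(t.bpdu_mode(2))    # startup: all tracked links are down
t.set_link("Gi0/0/0/1", True, now=5)
print(t.bpdu_mode(6))    # startup: inside the 10-second preempt delay
print(t.bpdu_mode(15))   # normal: preempt delay has expired
```

The delay gives the PE time to rebuild core connectivity before it attracts traffic from the access network again.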

MST-AG Edge Mode

If the PEs have L3 VLANs mixed with L2 VLANs, the L3 VLANs must be in their own instance, for which the PEs are configured in “edge-mode.”

spanning-tree mstag DOMAIN1
 interface GigabitEthernet0/0/0/0
  instance 2
   vlan-ids 200
   edge-mode

When the PEs terminate the L2 domain as an L3 router, the layer 2 loop is broken. (The PE will not continue to flood an L2 BUM frame when it has an L3 IP address on that VLAN). The PEs typically participate in a gateway redundancy mechanism, such as VRRP. The problem is that if you leave the L3 VLAN in the same instance as other L2 VLANs, one link in the access network will be blocking, and the PEs will not see each other over VRRP.

Edge-mode configures the PE to stop listening for TCNs, and also advertise the worst possible path to the worst possible root. (It’s not clear whether this happens automatically or you need to manually configure high priorities for this instance). Some device in the access network ends up becoming the root, and no links in the access network will be blocking. (Unless there is a loop within the access network itself).

Below, the L2 VLANs are in their own MSTI, in which the PEs report connectivity to a virtual root. The L3 VLANs use a different MSTI, and the access device at the bottom becomes the root, and no links are blocked.

MST-AG Provider Bridge

spanning-tree mstag DOMAIN1
 interface GigabitEthernet0/0/0/0
  provider-bridge

This runs MSTP in 802.1ad mode, in which a different MAC address is used, and BPDUs with 802.1Q MACs are forwarded transparently.

PVST-AG

PVST-AG applies the access gateway concept to per-VLAN spanning tree. It is very similar to MST-AG: the router sends static BPDUs on a per-VLAN basis which report connectivity to a virtual root.

One difference with PVST-AG compared to MST-AG is that topology change propagation is not supported in PVST-AG. Also, TCNs received on a single VLAN will affect all VLANs and BDs on that physical interface.

Additionally, only a single access device (CE) can be attached to the PEs in PVST-AG. This is because TCN propagation is not possible. TCNs can arrive with any vlan tag in PVST, so you can’t simply put all subinterfaces into a VPWS to the remote PE. For this reason, you cannot have more than one CE, as that creates the possibility of having a partitioned access network with no TCN propagation capability.

Configuration can be quite intensive, as each VLAN needs to be manually defined. Also, Q-in-Q subinterfaces are not supported. Only single-tagged dot1q interfaces are allowed. Physical interfaces and L2 interfaces with encap default are not allowed.

int Gi0/0/0/0.10 l2transport
 encap dot1q 10
!
spanning-tree pvstag NAME
 int Gi0/0/0/0.10
  vlan 10
   root-priority 0
   root-id 0.0.0
   root-cost 0
   priority 0
   bridge-id 0.0.1
   port-priority 0
   port-id 1
   max age 20
   hello-time 1

Further Reading

https://www.cisco.com/c/en/us/td/docs/iosxr/ncs5xx/l2vpn/75x/b-l2vpn-cg-75x-ncs540/m-configure-multiple-spanning-tree-protocol-ncs5xx.html

https://community.cisco.com/t5/service-providers-knowledge-base/asr9000-xr-using-mst-ag-mst-access-gateway-mst-and-vpls/ta-p/3110290

https://community.cisco.com/t5/service-providers-knowledge-base/asr9000-xr-using-pvstag-with-cluster-and-satellite-example/ta-p/3148378

https://www.cisco.com/c/en/us/support/docs/routers/asr-9000-series-aggregation-services-routers/116453-technote-ios-xr-l2vpn-00.html
