Troubleshoot #2 (OSPF)
Last updated
Load vpnv4.intra-as.tshoot2.init.cfg
R1 and XR2 are running OSPF as the PE-CE protocol for both IPv4 and IPv6. They have a direct link between them but it is low bandwidth. They’d like to use this as a backup link in case their WAN circuits go down.
They set the OSPF metric high for this backup link, but traffic still prefers this link instead of the WAN. Explain the problem and find a solution.
Sham links are similar to OSPF virtual links, in that they allow you to extend an area over another OSPF area. Virtual links allow you to extend area 0 over a transit area. But sham links are used to extend any OSPF area over the MPLS OSPF “super backbone” area.
To configure a sham link, we first need to set up dedicated loopbacks on the two PEs.
These must be advertised into BGP, but prevented from being redistributed into OSPF. The only purpose of these is to form the sham link. The PEs should never be allowed to learn a route to the remote PE’s sham link loopback via OSPF. The remote loopback must resolve via iBGP so that the MPLS label can be gleaned from the associated VPNv4/v6 route for each OSPF route.
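As a rough sketch of the loopback step, the fragment below assumes a VRF named CUSTOMER, BGP AS 100, OSPF process 1, and the addresses 10.255.255.2 (R2) and 10.255.255.1 (XR1). All of these values, along with the prefix-list, route-map, and route-policy names, are hypothetical and are not taken from the lab's initial config. It assumes the VRF and its RD already exist under BGP, and it shows IPv4 only.

```
! R2 (IOS-XE PE) - VRF name, AS number, and addresses are hypothetical
interface Loopback100
 vrf forwarding CUSTOMER
 ip address 10.255.255.2 255.255.255.255
!
! Advertise the loopback into BGP only; no OSPF is enabled on Loopback100
router bgp 100
 address-family ipv4 vrf CUSTOMER
  network 10.255.255.2 mask 255.255.255.255
!
! Keep the sham link endpoints out of the BGP-to-OSPF redistribution
ip prefix-list SHAM-LOOPBACKS seq 5 permit 10.255.255.0/24 ge 32
!
route-map BGP-TO-OSPF deny 10
 match ip address prefix-list SHAM-LOOPBACKS
route-map BGP-TO-OSPF permit 20
!
router ospf 1 vrf CUSTOMER
 redistribute bgp 100 subnets route-map BGP-TO-OSPF

! XR1 (IOS-XR PE) - same hypothetical values
interface Loopback100
 vrf CUSTOMER
 ipv4 address 10.255.255.1 255.255.255.255
!
prefix-set SHAM-LOOPBACKS
  10.255.255.1/32,
  10.255.255.2/32
end-set
!
route-policy BGP-TO-OSPF
  if destination in SHAM-LOOPBACKS then
    drop
  else
    pass
  endif
end-policy
!
router bgp 100
 vrf CUSTOMER
  address-family ipv4 unicast
   network 10.255.255.1/32
!
router ospf 1
 vrf CUSTOMER
  redistribute bgp 100 route-policy BGP-TO-OSPF
```

The filter on redistribution is one way to meet the "never learned via OSPF" requirement; leaving the loopbacks out of any OSPF network or interface statement is the other half.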
Finally, once the PEs have reachability to each other’s loopback, we can simply form the sham link.
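A minimal sketch of the sham link itself, assuming the CE-facing area is area 0, the VRF is named CUSTOMER, OSPF process 1 is in use, and the sham link loopbacks are 10.255.255.2 on R2 and 10.255.255.1 on XR1. These values are hypothetical; substitute the ones from your own topology. The source address comes first and the destination second on both platforms.

```
! R2 (IOS-XE PE) - area, VRF name, addresses, and cost are hypothetical
router ospf 1 vrf CUSTOMER
 area 0 sham-link 10.255.255.2 10.255.255.1 cost 1

! XR1 (IOS-XR PE) - same hypothetical values
router ospf 1
 vrf CUSTOMER
  area 0
   sham-link 10.255.255.1 10.255.255.2
    cost 1
```

The sham link cost is added to the SPF path across the WAN, so keeping it low helps the WAN path win over the high-cost backup link.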
On an IOS-XE PE, the routes to the remote CE will appear via the sham link from the perspective of OSPF. (SPF on the PE resolves the route using the sham link in the graph.)
However, when installing into the RIB, the router knows to use the BGP next hop for the sham link endpoint, instead of literally trying to forward traffic over the sham link. The MPLS service label needs to be derived from the BGP VPNv4 update sent by XR1.
The output states that it is redistributing this OSPF route into BGP, but this is not actually the case. Below, we can see that we only have the route via XR1 in the VPNv4 table. Interestingly, this route counts as a RIB failure because we are using the “OSPF” route via the sham link in the RIB. It gets a little confusing, but the end result is that traffic is properly labeled using the VPNv4 label.
A similar process happens on IOS-XR. The route to CE1 is known via OSPF through the sham link.
However, on IOS-XR, the router more logically installs the BGP route into the RIB instead of the OSPF route.
The same process happens for OSPFv3 on IOS-XE. The route to the remote CE is via the sham link.
But strangely, something is broken. This route is via Null0, directly connected.
I found that I could fix this by removing the include-connected keyword when redistributing OSPFv3 into BGP.
The issue appears to be that R2 learns the route via R1 first. Then, for some reason, when the route is also learned through the sham link, it seems to be treated as connected. If the R1-XR2 link is shut down, the problem on R2 also goes away.
IOS-XR’s behavior for OSPFv3 is the same as OSPFv2 and is not worth showing.
Overall, the goal of the sham link is to produce an adjacency between the PEs inside the customer’s OSPF area. The result is that routes from remote CEs can be learned as intra-area, because the graphs at the two sites are joined by the sham link. We verify this in more depth in the following section.
You should see two new adjacencies on the PEs:
This causes LSAs originated by each CE to be flooded end-to-end. Because the LSAs are flooded over the sham link, the LSAs will appear in each CE’s LSDB.
Also notice above that the sham link runs as a demand-circuit. This is the reason for the DNA LSAs originated by XR1.
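For reference, the checks described in this section can be gathered with commands along the following lines. The VRF name CUSTOMER and the loopback addresses are hypothetical placeholders, and no output is reproduced here.

```
! R2 (IOS-XE PE) - VRF name and address are hypothetical
show ip ospf sham-links
show ip ospf neighbor
show ip route vrf CUSTOMER 10.255.255.1
show ip bgp vpnv4 vrf CUSTOMER

! XR1 (IOS-XR PE) - VRF name and address are hypothetical
show ospf vrf CUSTOMER sham-links
show ospf vrf CUSTOMER neighbor
show route vrf CUSTOMER 10.255.255.2
show bgp vrf CUSTOMER
```

The sham link state and its demand-circuit behavior show up in the sham-links output, while the route lookup for the remote loopback should resolve via BGP rather than OSPF.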
Traffic from R1 to XR2 will now prefer the lower-cost intra-area path over the MPLS WAN:
If one of the WAN circuits has a fault, the OSPF PE-CE adjacency will break, and traffic will fail over to the backup link.
The direct R1-XR2 link is only used as a backup path due to its higher cost. And the higher cost only makes a difference because the MPLS WAN path is now intra-area as well, so the two paths are compared on cost and the WAN path can win. (Without the sham link, the WAN path is inter-area, and OSPF always prefers an intra-area route over an inter-area route regardless of cost.)
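To illustrate the cost side of this, the sketch below raises the OSPF cost on the direct CE-to-CE link. The interface names, process ID, area, and the cost of 10000 are hypothetical, and only the IPv4 side is shown. With the sham link up, SPF compares this value against the total WAN path cost (CE-PE link, plus sham link cost, plus PE-CE link), so the backup cost simply needs to exceed that sum.

```
! R1 (IOS-XE CE) - interface name and cost value are hypothetical
interface GigabitEthernet3
 ip ospf cost 10000

! XR2 (IOS-XR CE) - process ID, area, and interface name are hypothetical
router ospf 1
 area 0
  interface GigabitEthernet0/0/0/1
   cost 10000
```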