TI-LFA
Load ti-lfa.init.cfg
The cost of the R4-R10 link has been increased to 100. Remote LFA on R3 no longer provides protection of the 4.4.4.1/32 prefix. Use TI-LFA to achieve 100% protection.
Segment routing’s use of Adj-SIDs allows a node to create a stateless LSP along any arbitrary path by programming a SID list at the source node. LDP does not have this concept of a label representing a forwarding action out a specific link. LDP is always tied to the IGP shortest path. SR’s idea of an Adj-SID is very powerful and allows SR to always find an LFA path, even when no PQ node exists, by using TI-LFA.
First we’ll briefly examine RSVP-TE FRR so that we can compare it with IPFRR (LFA and TI-LFA). With RSVP-TE, we use facility-based FRR. A single backup path is created on the PLR which excludes the facility (link or node). The problem with this is that all traffic must be stitched back to the NHOP, even if that means hairpinning traffic because the final destination is actually within the repair path LSP. This is required because RSVP-TE is circuit based, so the repair LSP must stitch back onto the primary LSP.
In comparison, IPFRR is prefix-based instead of facility-based. Backup paths are computed on a per-destination basis, instead of a per-LSP basis. These do not need to be signaled as with RSVP-TE FRR. The backup paths are also steered along an optimum backup path without the need to stitch back to the nexthop node. This is because the IGP prefixes are protected, not individual circuit-based LSPs.
TI-LFA (Topology Independent LFA) is named as such because it can provide 100% coverage of repair paths no matter what the topology looks like. TI-LFA additionally always provides the post-convergence path (assuming that the link fails and not the node). To do this, TI-LFA simply removes the protected facility (link, node, or interfaces sharing the same SRLG) and then runs CSPF over this topology. This finds the post-convergence path. Then the router intelligently generates a SID list that steers traffic over this path with the minimum number of labels in the stack. This process is repeated for every IGP destination, so that an optimum post-convergence repair path is calculated on a per-prefix basis.
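To make the "remove the facility, then run SPF" step concrete, here is a minimal Python sketch. It is not how the router implements it. The four-node ring, the node name R5, and every metric other than the 100 on R4-R10 are assumptions made up for illustration. The function simply runs Dijkstra while skipping any excluded link.

```python
import heapq

def shortest_path(links, source, target, excluded=frozenset()):
    """Dijkstra over undirected links, skipping any link listed in `excluded`.

    links:    iterable of (node_a, node_b, metric)
    excluded: set of frozenset({node_a, node_b}) pairs to prune (the protected facility)
    Returns (path, cost), or (None, None) if the target is unreachable.
    """
    graph = {}
    for a, b, cost in links:
        if frozenset((a, b)) in excluded:
            continue  # prune the protected link from the topology
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost

    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neigh, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                prev[neigh] = node
                heapq.heappush(heap, (nd, neigh))

    if target not in dist:
        return None, None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return list(reversed(path)), dist[target]

# Hypothetical ring: R3 - R4 - R10 - R5 - R3 (R5 and the metrics of 10 are assumed).
LINKS = [("R3", "R4", 10), ("R4", "R10", 100), ("R10", "R5", 10), ("R5", "R3", 10)]

primary = shortest_path(LINKS, "R3", "R4")
repair = shortest_path(LINKS, "R3", "R4", excluded={frozenset(("R3", "R4"))})
print(primary)  # (['R3', 'R4'], 10)
print(repair)   # (['R3', 'R5', 'R10', 'R4'], 120)
```

The second result is the post-convergence path for a failure of the R3-R4 link. A real implementation would repeat this for every IGP destination and then compress each path into a SID list.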
These repair paths also naturally support ECMP, because often a prefix SID is used in the repair path, which itself has the instruction to forward to the given node along all ECMP paths. Additionally, if the router has multiple ECMP paths that have different directly-connected nexthops, or has ECMP paths with multiple different P/Q nodes, the router will do load sharing by assigning each repair path to alternating prefixes.
Another thing to note about TI-LFA is that there are rare cases where many labels must be pushed. When more than 3 labels need to be pushed, TI-LFA internally generates an SR-TE policy and steers the repair path out this policy. This is a workaround to be able to support such a large label stack. The FIB cannot push 5 labels on a repair path, for instance, but the FIB can steer out an SR-TE policy. However, 99% of the time no more than two labels need to be used.
TI-LFA can also protect plain IP and LDP traffic. You can implement SR, but leave “sr-prefer” turned off. Then if you implement TI-LFA, SR will simply be used for fast reroute. When you use SR in this manner for LDP, the destination prefix must have a Prefix SID. (The PLR cannot run tLDP with the Q node in order to learn its label for the destination, so the PLR relies on the destination having a global prefix SID instead.) Additionally, intermediate nodes that are used for steering traffic will need to have SR enabled, possibly for a prefix SID or Adj-SIDs. If you simply have SR enabled everywhere then you can fully protect LDP without any issues.
TI-LFA requires that intermediate nodes have Prefix SIDs in order to properly steer backup paths. By default, the ISIS router ID in the TE Router ID TLV (134) is used to determine which prefix SID to use for a node. If TLV 134 doesn’t exist, then TLV 132 (IP interface address) is used, and that prefix must have an associated prefix SID. OSPF instead looks at the RID and sees if it is a reachable prefix with a prefix SID. If not, it next looks for the highest reachable host address that has the N-flag set. For these reasons, using an Anycast SID but not clearing the N-flag can break TI-LFA. In general, if you assign a prefix SID to each loopback and use the loopback as the RID, you shouldn’t have any issues. But if the PLR cannot find a suitable prefix SID for a given node, that node cannot be used for TI-LFA and this may break certain repair paths.
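The ISIS lookup order described above can be sketched in a few lines of Python. This is only an illustration of the decision order, not router code. The dictionary layout, the function name, and all addresses and SID values are hypothetical. Checking several TLV 132 addresses in turn is also my assumption, since the text only describes a single address.

```python
def isis_node_prefix_sid(lsp, prefix_sids):
    """Pick the prefix SID that represents a node, following the order in the text.

    lsp:         dict with optional 'te_router_id' (TLV 134) and
                 'interface_addrs' (TLV 132) entries -- hypothetical layout
    prefix_sids: dict mapping a /32 prefix string to its prefix SID label
    Returns the SID, or None if the node cannot be used for steering.
    """
    te_rid = lsp.get("te_router_id")
    if te_rid is not None:
        # TLV 134 exists: the router ID must itself carry a prefix SID.
        return prefix_sids.get(te_rid + "/32")
    # No TLV 134: fall back to TLV 132 (assumption: first address with a SID wins).
    for addr in lsp.get("interface_addrs", []):
        sid = prefix_sids.get(addr + "/32")
        if sid is not None:
            return sid
    return None

# Hypothetical prefixes and SID values.
SIDS = {"10.10.10.1/32": 16010, "5.5.5.1/32": 16005}

print(isis_node_prefix_sid({"te_router_id": "10.10.10.1"}, SIDS))   # 16010
print(isis_node_prefix_sid({"interface_addrs": ["5.5.5.1"]}, SIDS)) # 16005
print(isis_node_prefix_sid({"te_router_id": "6.6.6.1"}, SIDS))      # None
```

The last case is the failure mode mentioned above: the router ID has no prefix SID, so the node is unusable as a steering point.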
In our topology, there is no PQ node for the 4.4.4.1/32 prefix from R3’s perspective. R10 routes via the “long way” around the ring. The P space is all nodes besides R4. And the Q space is only R4. (R4 is the only node that can reach its own prefix, 4.4.4.1/32, without traversing the protected link).
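We can reproduce the P and Q space reasoning with a small Python sketch. The ring below is a hypothetical stand-in for the lab (R5 and the metrics of 10 are assumed, only the R4-R10 cost of 100 comes from the lab), and the inequalities are the usual "no shortest path crosses the protected link" checks, assuming symmetric metrics.

```python
import heapq

def build_graph(links):
    """Build an undirected adjacency map from (node_a, node_b, metric) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph

def spf(graph, source):
    """Plain Dijkstra returning the shortest distance to every node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neigh, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                heapq.heappush(heap, (nd, neigh))
    return dist

def p_space(graph, plr, nhop):
    """Nodes the PLR reaches with no shortest path over the plr-nhop link."""
    d_plr, d_nhop = spf(graph, plr), spf(graph, nhop)
    link = graph[plr][nhop]
    return {n for n in graph if n != plr and d_plr[n] < link + d_nhop[n]}

def q_space(graph, plr, nhop):
    """Nodes that reach nhop with no shortest path over the plr-nhop link.

    Assumes symmetric metrics, so dist(n, x) == dist(x, n).
    """
    d_plr, d_nhop = spf(graph, plr), spf(graph, nhop)
    link = graph[plr][nhop]
    return {n for n in graph if n != plr and d_nhop[n] < d_plr[n] + link}

# Hypothetical ring: R3 - R4 - R10 - R5 - R3.
GRAPH = build_graph([("R3", "R4", 10), ("R4", "R10", 100),
                     ("R10", "R5", 10), ("R5", "R3", 10)])

P = p_space(GRAPH, "R3", "R4")
Q = q_space(GRAPH, "R3", "R4")
print(sorted(P))      # ['R10', 'R5']
print(sorted(Q))      # ['R4']
print(sorted(P & Q))  # []  -> no PQ node, so remote LFA cannot help
```

The empty intersection is the situation described above: every node except R4 is in the P space, only R4 is in the Q space, and nothing is in both.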
To achieve an LFA, we need segment routing so that an LSP can use R10’s Adj-SID for its link to R4, forcing R10 to forward traffic out that interface even though the IGP metric is so high.
First we enable segment-routing on all nodes and configure the prefix SID on each router’s loopback to give each node a node SID.
Next, we must enable TI-LFA on R3. This is simply done by adding the fast-reroute per-prefix ti-lfa command to the link under ISIS. Note that both this command and fast-reroute per-prefix are required.
We can see that R3 has successfully protected 4.4.4.1/32. The output shows us that the P node is R10 and the Q node is R4. So R10’s Adj-SID value of 24005 is used.
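As a rough illustration of how a repair SID list like this can be derived, here is a simplified Python sketch. It is not the router's actual algorithm. It walks the post-convergence path, pushes the node SID of the last P node, and then adds one Adj-SID per hop until a Q node is reached. The node R5, the node SID 16010, and the function name are hypothetical. Only the Adj-SID 24005 comes from the output above.

```python
def repair_sid_list(path, p_nodes, q_nodes, node_sid, adj_sid):
    """Derive a repair SID list from a post-convergence path (simplified).

    path:     post-convergence path, PLR first, destination last
    p_nodes:  P space of the PLR for the protected link
    q_nodes:  Q space of the destination for the protected link
    node_sid: dict node -> prefix (node) SID label
    adj_sid:  dict (from_node, to_node) -> Adj-SID label
    """
    # Farthest node along the path that is still in the P space (index 0 is the PLR).
    p_idx = max([i for i, n in enumerate(path) if i > 0 and n in p_nodes], default=0)
    # First node at or after the P node that is in the Q space.
    # The destination always counts as a Q node.
    q_idx = next(i for i in range(p_idx, len(path))
                 if path[i] in q_nodes or i == len(path) - 1)

    sids = []
    if p_idx > 0:
        sids.append(node_sid[path[p_idx]])  # shortest path to the P node
    for i in range(p_idx, q_idx):
        sids.append(adj_sid[(path[i], path[i + 1])])  # force each hop toward Q
    return sids

# Hypothetical values except for Adj-SID 24005, which appears in the lab output.
PATH = ["R3", "R5", "R10", "R4"]
labels = repair_sid_list(PATH, {"R5", "R10"}, {"R4"},
                         node_sid={"R10": 16010},
                         adj_sid={("R10", "R4"): 24005})
print(labels)  # [16010, 24005]
```

When a PQ node does exist, the loop adds no Adj-SIDs and the list collapses to a single node SID, which is the remote LFA case.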
This is programmed into the FIB as a repair route:
We can verify that R10’s Adj-SID of 24005 represents its adjacency to R4 in the ISIS LSPDB.
Notice above that R10 has advertised only a single Adj-SID. Let’s enable TI-LFA on R10’s Gi0/0/0/4 link so that it advertises both the protected and non-protected Adj-SIDs.
We now see a FRR Adj-SID and a non-FRR Adj-SID. Which will R3 use for its LFA?
I’ve removed and re-added TI-LFA on R3. R3 still uses the unprotected Adj-SID. Nick Russo states in his blog series that the idea is that TI-LFA is providing the protection, so R3 does not use R10’s protected Adj-SID. If R3 did use R10’s protected Adj-SID, and the R4-R10 link went down at the same time, it seems that we would enter a temporary looping condition (microloop) while the IGP reconverges. If R3 instead uses the unprotected Adj-SID, it seems that we avoid this risk of microloops.