SR basic inter-AS using BGP
Load sr.inter.as.init.cfg
ISIS and OSPF are pre-configured as shown in the diagram below:
Using BGP, achieve an end-to-end LSP between R2 and R8. This is essentially inter-AS option C but without the VPN services. We are only concerned with the labeled transport between the two PEs.
First we’ll review classic BGP-LU, since SR for BGP is simply BGP-LU with one additional path attribute (the BGP Prefix-SID).
I have configured BGP-LU (without SR) in the lab. We’ll follow the advertisement of R2’s prefix to R8. All routers are running LDP only. First R2 injects its prefix with a label into BGP. It allocates a local label value of 3 (imp-null) because it is the ultimate hop for the prefix.
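For reference, the BGP-LU origination on R2 might look roughly like the IOS-XR sketch below. The AS number 100, the neighbor address 3.3.3.1 for R3, the iBGP relationship, and the use of Loopback0 are assumptions for illustration; they are not taken from the lab file.

```
router bgp 100
 address-family ipv4 unicast
  ! Originate the loopback and allocate labels for labeled-unicast routes
  network 2.2.2.1/32
  allocate-label all
 !
 ! Hypothetical iBGP session to R3
 neighbor 3.3.3.1
  remote-as 100
  update-source Loopback0
  ! Exchange IPv4 prefixes together with their labels (BGP-LU)
  address-family ipv4 labeled-unicast
  !
 !
!
```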
R2 advertises this to R3. R3 receives label 3, and attempts to allocate a dynamic label. R3 already has label 24003 from LDP, so this is used for the BGP ipv4/LU route:
R3 advertises this to R5 with the BGP LU label as 24003. We can see the BGP update in the pcap below. The label is included as part of the NLRI in the MP_REACH_NLRI path attribute:
R5 receives this prefix with R3’s local label. R5 does not have this prefix in its RIB, so BGP allocates a dynamic label, 24003. (This just happens to be the same as R3’s local label).
R5 advertises this to R8 with next-hop-self. R8 also allocates a dynamic local label for this prefix and installs it into the RIB/LFIB.
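The next-hop-self step on R5 could be configured along these lines. This is only a sketch: the AS number 200 and the Loopback0 update source are assumed, and 8.8.8.1 is taken to be R8’s loopback address.

```
router bgp 200
 address-family ipv4 unicast
  allocate-label all
 !
 ! Assumed iBGP session from R5 to R8
 neighbor 8.8.8.1
  remote-as 200
  update-source Loopback0
  address-family ipv4 labeled-unicast
   ! Rewrite the BGP next hop to R5 so that R8 resolves it inside its own AS
   next-hop-self
  !
 !
!
```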
R8 programs a CEF entry for 2.2.2.1/32 with two labels: a top label that provides transport to the BGP next hop (R5), and beneath it the label that R5 advertised for the prefix.
The same process happens for R8’s prefix as well, resulting in an end-to-end LSP between R2 and R8:
BGP-LU with SR is very similar to classic BGP-LU. The difference is that at each hop, the router uses the Prefix-SID attribute as a “hint” to allocate a local label from the SRGB instead of a dynamic label. It is only a hint: a router that doesn’t understand SR is still free to allocate a dynamic label. Classic BGP-LU and BGP-LU with SR can therefore interwork, because an SR node always honors the BGP-LU label received from a peer. Simply put, whatever label is received in the BGP-LU update is what is programmed as the outgoing label in the LFIB. This is also how a router controls PHP behavior. (In the IGP, a PHP-Off flag is used, but with BGP, advertising a label value of 3 is what requests PHP.)
Configuring BGP-LU for SR is quite easy. First, each router that will allocate a label in response to BGP-LU prefixes must have the SRGB explicitly defined under the global config.
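A minimal sketch of the global SRGB definition on IOS-XR follows. The range 16000 to 23999 is an assumption (it matches the platform default and is consistent with the 16002 label seen in this lab); use whatever block your IGP SR configuration already relies on.

```
! Global SRGB, shared by the IGP and BGP (range is assumed)
segment-routing
 global-block 16000 23999
!
```

With this block, a received label index of 2 maps to local label 16000 + 2 = 16002.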
Second, routers that inject their local prefix must use a route-policy that sets the prefix SID attribute to the correct index value. Without this, the BGP Prefix-SID attribute will not be carried in the Update.
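As a rough illustration, R2 could attach the label index when originating its loopback like this. The policy name SET_INDEX_2 and the AS number 100 are hypothetical; the index value 2 and the prefix come from this lab.

```
! Attach the BGP Prefix-SID attribute with label index 2
route-policy SET_INDEX_2
  set label-index 2
end-policy
!
router bgp 100
 address-family ipv4 unicast
  ! Apply the policy at origination so the index travels with the prefix
  network 2.2.2.1/32 route-policy SET_INDEX_2
  allocate-label all
 !
!
```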
That is all there is to it. Let’s walk through the advertisement of the 2.2.2.1/32 prefix again when SR is used.
R2 injects 2.2.2.1/32 with a prefix SID label index of 2. It still uses a BGP-LU label of 3 to signal PHP behavior:
R3 receives this route and sees the label index attribute. This attribute can be seen in the pcap below:
R3 honors this attribute and allocates label 16002 (although this label is already programmed in the LFIB by the IGP).
R3 advertises this to R5 with the BGP-LU label set to 16002. Whether or not R5 uses SR does not matter, as R5 will program 16002 as the outgoing label either way. Since R5 does have an SRGB globally defined, it honors the label index and allocates its own local label of 16002.
R5 advertises this to R8 with next-hop-self. R8 also honors the label index and allocates label 16002. (This local label does not really matter to us because R8 is the headend for an LSP to R2, not a midpoint. However, R8 does need a local label for R2 in order to be able to recurse VPN routes with a nexthop of R2 to a /32 labeled FIB entry).
R8 programs a CEF entry for 2.2.2.1/32 with two labels: the top label is R5’s prefix SID label, and the second label is the one received in the BGP-LU update.
The same process happens for R8’s loopback. We now have an end-to-end LSP that uses a global label:
In summary:
SR for BGP is essentially just regular BGP-LU but with a “hint” to allocate a label from the SRGB using the received prefix SID index value instead of a random dynamic label.
SR for BGP interworks with regular BGP-LU because the received BGP-LU label value is always used for the LFIB entry.
SR for BGP requires two configuration settings:
The SRGB must be globally defined for routers receiving BGP-LU prefixes. This is used to allocate a label from the SRGB instead of dynamically.
Prefixes must be injected with the prefix index attribute using a route-policy.
If you try to use regular BGP-LU (without SR) while running SR in the IGP, you will run into a problem when BGP allocates labels. BGP will allocate a dynamic label, but it cannot be installed in the LFIB because an SR label is already installed for that same prefix.
To demonstrate this, I’ve removed the SRGB on R3 and reset BGP:
I’ve also removed the index value RPL on R2:
We see the following syslog on R3:
R3’s BGP process allocated a dynamic label for 2.2.2.1/32. However, this cannot be installed in the LFIB because an entry already exists for 2.2.2.1/32 with label 16002.
The end-to-end LSP is now broken.
One way to solve this is to remove SR in the IGP and run LDP. In this case, BGP will find a dynamic label already exists and re-use the LDP label. Of course, a better solution is to just run SR for BGP!
After going through this again, I found that either the PE or the ASBR could have a label allocation problem. This might be a bug, or it might be expected behavior and a quirk of using BGP-SR. Even a PE, for example R2, might show that the BGP IPv4/LU prefix for 8.8.8.1/32 has a local and remote label of 16008, yet pings wouldn’t work. I found that the router had a locally allocated dynamic label for 8.8.8.1 as well. Removing BGP completely and re-adding it seemed to fix the problem on the PEs and ASBRs.
Additionally, if the PE is not advertising its own loopback (the ASBR is instead), the PE still needs to configure allocate-label all so that it installs the remote PE in the LFIB. Otherwise, the recursive lookup for the VPN routes doesn’t work.
Going through this one more time, I found another issue. I was trying to peer R2 and R8 directly to test that VPN routes were working. But I found that when R2 and R8 form a direct eBGP session, the router always allocates a local label for the neighbor, even though an SR label already exists via BGP-to-IGP redistribution. This also occurs if the label is learned via BGP-LU. When doing eBGP peering with a labeled address family (VPNv4 or LU), the router allocates a local label with a “Pop” outgoing label for the peer’s /32 address, perhaps to support inter-AS Option B/C. So, without a workaround, you need to peer with the local RR instead of directly with the remote PE.
There is actually another solution to this problem. Configuring ebgp-multihop mpls on the neighbor disables the local label allocation for the /32 peer address.
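A sketch of what this could look like on R2 for a direct VPNv4 session to R8 is shown below. The AS numbers 100 and 200, the TTL value 255, the policy name PASS, and the Loopback0 update source are all assumptions; only the mpls keyword on ebgp-multihop is the point of the example.

```
route-policy PASS
  pass
end-policy
!
router bgp 100
 address-family vpnv4 unicast
 !
 neighbor 8.8.8.1
  remote-as 200
  ! "mpls" keyword: do not allocate a local label for the peer /32
  ebgp-multihop 255 mpls
  update-source Loopback0
  address-family vpnv4 unicast
   ! IOS-XR eBGP sessions need explicit policies to exchange routes
   route-policy PASS in
   route-policy PASS out
  !
 !
!
```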