QoS on IOS-XE
Topology: ine-spv4
Load basic.l3vpn.fully.setup.cfg
Shape traffic outbound on the CE, R1, as follows:
Traffic should be shaped to a rate of 50M with a Tc of 10ms and an excess burst (Be) equal to the committed burst
EF traffic should be treated as priority and limited to 10M of bandwidth during congestion
Signaling traffic is marked as CS3 and should be given 5M of guaranteed bandwidth
All other traffic should use WRED based on DSCP to prevent TCP global sync
On the PE router, R2, police and shape traffic as follows:
Traffic inbound should be rate limited to 50M. Anything between 50M and 75M should be re-marked as CS1. Anything above 75M should be dropped.
Traffic outbound should be shaped to 50M. Traffic should not incur more than 25ms of delay if the queue fills up. If it does, the customer would rather the traffic is dropped instead. Ensure EF traffic is always prioritized up to 10M.
QoS is the process of classifying traffic into different groups and scheduling packets out of strict FIFO order during times of congestion, giving preferential treatment to one group of traffic over another. QoS is needed anywhere that the input rate can exceed the output rate of an interface, a condition called oversubscription.
For example, in this lab, R1 might have a 1G LAN interface. It can receive traffic from the LAN at 1Gbps, but its WAN interface has a CIR of 50Mbps. This is oversubscription, so QoS is needed to prioritize certain traffic over less-important traffic during times of congestion. Note that congestion can occur even at the micro-burst level. If a very short burst of line-rate traffic comes in at 1Gbps, some of it might need to be dropped because the queue can overflow while traffic drains out at 50Mbps. Yet overall, you might still see very low utilization on the interface (for example, only 10Mbps of utilization averaged over 30 seconds).
QoS involves the following processes:
Traffic is classified based on attributes of the traffic (e.g., source/destination IP, existing QoS marking, or application inspection).
Traffic is marked at the edge so that subsequent devices don’t need to re-classify the traffic using complex methods (ACLs or app inspection). Subsequent devices can simply classify based on DSCP marking, EXP marking, etc.
Every device along the path performs an independent QoS action, which is based on the class of the traffic (the DSCP marking).
Traffic can be given a minimum bandwidth guarantee (bandwidth statement)
Traffic can be given priority, meaning its queue is always serviced first during times of congestion. Typically real-time traffic, such as VOIP or video traffic, should be given this treatment.
This traffic should be policed so that it cannot starve out other traffic
Traffic can be given a maximum bandwidth (police statement or shape statement)
Traffic can be randomly dropped from the end of the queue to avoid TCP global synchronization (WRED).
In this lab, we are asked to configure R1 so that EF is always given priority, but policed up to 10M, and CS3 should be given a minimum bandwidth guarantee of 5M. To do this, we first configure class-maps to match this traffic:
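(A minimal sketch; the class names VOICE and SIGNALING are placeholders of my own, not necessarily the names used in the original lab.)

class-map match-all VOICE
 match dscp ef
!
class-map match-all SIGNALING
 match dscp cs3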
A class-map can have multiple match statements. By default, a class-map is match-all, which means all match statements must match. You can alternatively use class-map match-any <name> to consider a packet a match if only one match statement matches. Also note that you can nest class-maps within each other using match class-map <name>. This can be used to create more complex matching logic (e.g., “match all three of these items OR match this single item”).
Next, we configure our policies in a child policy, which will later be applied to a parent policy. This is needed in order to apply a 50M shaper overall to the entire policy. This technique is known as hierarchical QoS, because we can nest policy-maps within each other. Below we give EF traffic priority but limit it to 10Mbps (in times of congestion), and we give CS3 a minimum bandwidth of 5M. If there is no congestion, CS3 traffic can use more than 5Mbps. It is just a minimum reservation so that CS3 traffic always has that much bandwidth available. Additionally, EF traffic can use more than 10Mbps if there is no congestion.
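(Sketch of the child policy under the same assumed class names; priority and bandwidth take kbps values, so 10000 = 10M and 5000 = 5M.)

policy-map CHILD
 ! EF is serviced first; the 10M rate is only enforced during congestion
 class VOICE
  priority 10000
 ! CS3 gets a 5M minimum guarantee, but may use more when bandwidth is free
 class SIGNALING
  bandwidth 5000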
Next we are asked to use WRED for all other traffic. WRED drops packets as the queue fills up to avoid TCP global synchronization, in which all TCP streams synchronously scale their sending rates up and down as they experience and recover from packet loss. WRED drops packets proportionally based on their DSCP values. For example, AFx2 traffic is dropped more often than AFx1.
As a reminder, DSCP uses the following values:
EF (to indicate expedited forwarding)
DF (to indicate default forwarding, value = 0)
CS1-CS7, which are backwards compatible with IP Precedence (IPP)
AFxy, where x=1-4 (precedence) and y=1-3 (drop priority). A higher y value means “more likely to drop.”
We can activate DSCP-based WRED as follows. Since all traffic that doesn’t hit the EF or CS3 classes will hit the class-default class, we can simply activate DSCP-based WRED on this class.
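(Continuing the same sketch, WRED is enabled on the default class:)

policy-map CHILD
 class class-default
  random-detect dscp-based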
Finally we associate this CHILD policy with the shaper policy. We are asked to use a Tc of 10ms. This means that the shaper runs every 10ms. A lower value reduces delay for packets in the queue, while a higher value allows more bursting at a given moment. This is because once all traffic is “used up” in a given time interval, the router must wait until the next time interval to send packets again. So the higher the Tc value, the more traffic can be burst at once, but the more delay a packet can potentially incur in the queue.
We cannot directly set the Tc value. Instead we must set this indirectly using the Bc value. The shaper rate is equal to the Bc divided by Tc. So if we take our shaper rate (50M) multiplied by 10ms (.010) we get 500K, which is our Bc value.
Additionally we are asked to use an excess burst bucket that is the same as our Bc bucket size. During periods where no traffic is sent, excess tokens from the Bc bucket can spill into the Be bucket. Then during subsequent periods of traffic being sent, if the Bc bucket is emptied, the Be bucket can be used. Overall the CIR does not change, as the Be bucket is only populated with tokens that weren’t spent in previous time periods.
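(Sketch of the parent shaper; shape average takes the CIR in bps followed by Bc and Be in bits.)

policy-map PARENT
 class class-default
  ! 50M CIR; Bc = 500,000 bits gives Tc = Bc/CIR = 10 ms; Be = Bc
  shape average 50000000 500000 500000
  service-policy CHILD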
Lastly we apply this policy to the interface in the outbound direction:
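(The interface name below is assumed; substitute R1’s actual WAN interface.)

interface GigabitEthernet1
 service-policy output PARENT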
We can check hits on our classes using show policy-map interface
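(Again assuming GigabitEthernet1 as the WAN interface; the output itself is not reproduced here.)

show policy-map interface GigabitEthernet1
show policy-map interface GigabitEthernet1 output class class-default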
Notice above that class-default has a “chart” of the DSCP values that have been seen, the min/max thresholds at which packets start being dropped, and the drop probability. The default mark probability is 1-in-10. Once the average queue depth passes the minimum threshold, packets begin getting randomly dropped. As the queue depth approaches the max threshold, packets are dropped at a rate approaching 1-in-10. Once the max threshold is exceeded, any subsequent packets are tail dropped.
Notice that each DSCP value has a different min threshold. This is how weighted RED works to drop traffic proportionally based on its DSCP value. Also note that DSCP values only show up here for traffic that was seen by the policy-map. I did this in the lab by sending extended pings with specific DSCP values to get those values to populate.
Above, each class has a (queue depth/total drops/no-buffer drops) line. Total drops means packets were dropped because this particular class’s queue was full. No-buffer drops means the shared packet buffer pool was exhausted even though this class’s queue still had room. This is quite rare, so you typically should not see “no-buffer drops.”
You may need to adjust queue limits per class depending on whether you see total drops or no-buffer drops. If you see no-buffer drops, you need to make queues shorter because they are overrunning the shared packet buffer pool. If you see tail drops, you can consider making the queue longer. However, there is a trade-off: the longer a queue is, the less tail dropping you will see, but the more latency packets at the end of the queue will incur. If packets are incurring too much latency, you can reduce the queue size so that the router drops a packet rather than subjecting it to excessive queuing delay.
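(As an illustrative sketch only; the default per-queue limit is platform-dependent, commonly 64 packets.)

policy-map CHILD
 class SIGNALING
  ! lengthen this class's queue to reduce tail drops, at the cost of more queuing delay
  queue-limit 128 packets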
Next we’ll examine the PE QoS configuration. The PE’s job is mostly to enforce the CIR that the customer is paying for. In this special circumstance, the PE is also prioritizing voice traffic for the customer. Perhaps this is an extra service the customer is paying for.
To shape outbound at 50M but also prioritize voice traffic, we once again need to use hierarchical QoS. We configure a class-map that matches EF and a child policy that rate limits this to 10M. (Remember that when the police rate is applied on the same line as the priority command, the police rate only takes effect in times of congestion. If you instead applied a separate police statement at 10M under that class, voice traffic would never be allowed above 10M, even when there is available bandwidth.)
Additionally we are asked to ensure that packets don’t wait in a queue for more than 25 msec. (This does not apply to voice traffic, because the priority queue is always emptied first, so voice traffic does not sit in the queue anyway.) Traffic shaping delays traffic in order to adhere to the CIR. Any traffic that exceeds the CIR is queued and delayed in order to smooth out the rate. (In comparison, a policer drops or re-marks violating traffic and never queues it.) In some situations the shaper may introduce more delay than desired. This is why we limit the queue length to a specific time value, so that we guarantee traffic will not be delayed by more than 25 msec.
We apply the 25msec queue limit to the default class under the child policy. Note that you cannot configure this on the parent policy because the child policy has a class that is using a priority queue.
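(Sketch of the PE child policy; VOICE and PE_CHILD are assumed names.)

class-map match-all VOICE
 match dscp ef
!
policy-map PE_CHILD
 ! EF priority, conditionally policed to 10M during congestion
 class VOICE
  priority 10000
 ! cap queuing delay for all other traffic at 25 msec
 class class-default
  queue-limit 25 ms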
The “queue-limit” command can take three kinds of values: a packet limit, a byte limit, or a time (msec) limit. The time limit is the most useful, because it gives you a deterministic maximum delay that any packet will incur. It works as follows: the policy-map determines the available bandwidth from the parent policy (or from the bandwidth of the parent interface), converts the millisecond value into a byte count, and uses that byte count as the queue limit.
We finally apply the child policy to a parent policy that overall shapes to 50M, and apply that to the interface in the outbound direction.
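(Sketch; PE_PARENT is an assumed name and GigabitEthernet1 stands in for R2’s CE-facing interface.)

policy-map PE_PARENT
 class class-default
  shape average 50000000
  service-policy PE_CHILD
!
interface GigabitEthernet1
 service-policy output PE_PARENT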
If we check the show policy-map interface output, we can see that the 25msec queue-limit has been converted to 156250 bytes. This is 50M times .025 and divided by 8 (to convert bits to bytes).
The benefit of this style of queue-limit is that this child policy can be applied to any parent policy and the queue limit will dynamically be set based on the parent policy’s shaper rate, ensuring that traffic will never incur more than 25msec of queuing delay.
Next we configure the policer. We are asked to use a two-rate three-color policer. This means that there are two separate rates (CIR and PIR) and three levels that traffic can be “colored” with (conform, exceed, and violate). We are asked to mark down traffic exceeding the CIR, but within the PIR, to CS1. The idea is that a device more downstream will have a QoS policy to drop CS1 traffic in times of congestion.
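(Sketch of the two-rate three-color policer, applied inbound on the same assumed CE-facing interface.)

policy-map PE_POLICE_IN
 class class-default
  police cir 50000000 pir 75000000
   conform-action transmit
   exceed-action set-dscp-transmit cs1
   violate-action drop
!
interface GigabitEthernet1
 service-policy input PE_POLICE_IN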
Notice above that the time interval by default is 250msec. This can be adjusted to allow more or less bursting by directly setting the Bc and Be values in bytes. For example, if we manually set these values to be half, it means the policer evaluates the rate based on 125msec. This would allow less bursting but the policer would run more often.
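(As a worked sketch of that math, assuming the default burst sizes are rate × 250 ms ÷ 8 bytes, halving them gives Bc = 50M × 0.125 s ÷ 8 = 781,250 bytes and Be = 75M × 0.125 s ÷ 8 = 1,171,875 bytes, which can be set explicitly:)

policy-map PE_POLICE_IN
 class class-default
  ! bc and be in bytes; smaller buckets = a shorter 125 ms measurement interval and less bursting
  police cir 50000000 bc 781250 pir 75000000 be 1171875
   conform-action transmit
   exceed-action set-dscp-transmit cs1
   violate-action drop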