Congestion Avoidance
Inner Mongolia University

Objectives
Upon completing this module, you will be able to:
- Describe random early detection (RED)
- Describe and configure weighted random early detection (WRED)

TCP Review

Transmission Control Protocol (TCP)

IP Best-Effort Design Philosophy
- Best-effort delivery
  - Let everybody send
  - Try to deliver what you can
  - … and just drop the rest
[Figure: a source and a destination connected through an IP network]

Congestion is Unavoidable
- Two packets arrive at the same time
  - The node can only transmit one
  - … and either buffer or drop the other
- If many packets arrive in a short period of time
  - The node cannot keep up with the arriving traffic
  - … and the buffer may eventually overflow

The Problem of Congestion
- What is congestion?
  - Load is higher than capacity
- What do IP routers do?
  - Drop the excess packets
- Why is this bad?
  - Wasted bandwidth for retransmissions
- “Congestion collapse”: an increase in load that results in a decrease in useful work done
[Figure: goodput versus load, with the “congestion collapse” region marked]

Ways to Deal With Congestion
- Ignore the problem
  - Many dropped (and retransmitted) packets
  - Can cause congestion collapse
- Reservations, like in circuit switching
  - Pre-arrange bandwidth allocations
  - Requires negotiation before sending packets
- Pricing
  - Don’t drop packets for the high-bidders
  - Requires a payment model
- Dynamic adjustment (TCP)
  - Every sender infers the level of congestion
  - And adapts its sending rate, for the greater good

Many Important Questions
- How does the sender know there is congestion?
  - Explicit feedback from the network?
  - Inference based on network performance?
- How should the sender adapt?
  - Explicit sending rate computed by the network?
  - End host coordinates with other hosts?
  - End host thinks globally but acts locally?
- What is the performance objective?
  - Maximizing goodput, even if some users suffer more?
  - Fairness? (Whatever the heck that means!)
- How fast should new TCP senders send?

Inferring From Implicit Feedback
- What does the end host see?
- What can the end host change?

Where Congestion Happens: Links
- Simple resource allocation: FIFO queue & drop-tail
- Link bandwidth: first-in first-out queue
  - Packets transmitted in the order they arrive
- Buffer space: drop-tail queuing
  - If the queue is full, drop the incoming packet
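
The FIFO and drop-tail behavior described on this slide can be sketched in a few lines of Python. This is only an illustration: the class name `DropTailQueue` and the choice to measure capacity in packets are assumptions made for the example, not part of the course material.

```python
from collections import deque


class DropTailQueue:
    """FIFO queue with a fixed buffer; arrivals to a full queue are dropped."""

    def __init__(self, capacity):
        self.capacity = capacity  # buffer size in packets (illustrative unit)
        self.buffer = deque()
        self.drops = 0

    def enqueue(self, packet):
        """Return True if the packet was buffered, False if it was tail-dropped."""
        if len(self.buffer) >= self.capacity:
            self.drops += 1
            return False
        self.buffer.append(packet)
        return True

    def dequeue(self):
        """Transmit the oldest packet (first in, first out), or None if empty."""
        return self.buffer.popleft() if self.buffer else None


q = DropTailQueue(capacity=2)
results = [q.enqueue(p) for p in ("p1", "p2", "p3")]
first_out = q.dequeue()
print(results, first_out, q.drops)
```

The third arrival finds the buffer full and is dropped regardless of which flow it belongs to, which is the property the later tail-drop slides criticize.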

How it Looks to the End Host
- Packet delay
  - Packet experiences high delay
- Packet loss
  - Packet gets dropped along the way
- How does TCP sender learn this?
  - Delay
    - Round-trip time estimate
  - Loss
    - Timeout
    - Triple-duplicate acknowledgment

What Can the End Host Do?
- Upon detecting congestion
  - Decrease the sending rate (e.g., divide in half)
  - End host does its part to alleviate the congestion
- But, what if conditions change?
  - Suppose there is more bandwidth available
  - Would be a shame to stay at a low sending rate
- Upon not detecting congestion
  - Increase the sending rate, a little at a time
  - And see if the packets are successfully delivered

TCP Congestion Window
- Each TCP sender maintains a congestion window
  - Maximum number of bytes to have in transit
  - I.e., number of bytes still awaiting acknowledgments
- Adapting the congestion window
  - Decrease upon losing a packet: backing off
  - Increase upon success: optimistically exploring
  - Always struggling to find the right transfer rate
- Both good and bad
  - Pro: avoids having explicit feedback from network
  - Con: under-shooting and over-shooting the rate

Additive Increase, Multiplicative Decrease
- How much to increase and decrease?
  - Increase linearly, decrease multiplicatively
  - A necessary condition for stability of TCP
  - Consequences of over-sized window are much worse than having an under-sized window
    - Over-sized window: packets dropped and retransmitted
    - Under-sized window: somewhat lower throughput
- Multiplicative decrease
  - On loss of packet, divide congestion window in half
- Additive increase
  - On success for last window of data, increase linearly

Leads to the TCP “Sawtooth”
[Figure: congestion window versus time t; the window is halved at each loss]

Practical Details
- Congestion window
  - Represented in bytes, not in packets (Why?)
  - Packets have MSS (Maximum Segment Size) bytes
- Increasing the congestion window
  - Increase by MSS on success for last window of data
- Decreasing the congestion window
  - Never drop congestion window below 1 MSS
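
The additive-increase, multiplicative-decrease rule and the practical details above can be combined into one small update function. The sketch below is illustrative only: the function name `aimd_update` and the MSS value of 1460 bytes are assumptions chosen for the example.

```python
MSS = 1460  # bytes; an assumed segment size, used only for illustration


def aimd_update(cwnd, loss):
    """Return the new congestion window (bytes) after one window of data.

    loss=True  -> multiplicative decrease: halve, but never go below 1 MSS
    loss=False -> additive increase: grow by one MSS
    """
    if loss:
        return max(cwnd // 2, MSS)
    return cwnd + MSS


cwnd = 8 * MSS
trace = []
for loss in (False, False, True, False):
    cwnd = aimd_update(cwnd, loss)
    trace.append(cwnd // MSS)  # record the window in units of MSS
print(trace)  # [9, 10, 5, 6]
```

Repeating this loop with occasional losses produces the sawtooth pattern: slow linear climbs separated by sharp halvings.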

Receiver Window vs. Congestion Window
- Flow control
  - Keep a fast sender from overwhelming a slow receiver
- Congestion control
  - Keep a set of senders from overloading the network
- Different concepts, but similar mechanisms
  - TCP flow control: receiver window
  - TCP congestion control: congestion window
  - TCP window: min{congestion window, receiver window}
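
As a small worked example of the min{congestion window, receiver window} rule, the following sketch computes how many more bytes a sender may put in transit. The function name `usable_window` and its parameter names are hypothetical, chosen for this illustration.

```python
def usable_window(cwnd, rwnd, bytes_in_flight):
    """Bytes the sender may still send, given both windows.

    The effective TCP window is the smaller of the congestion window
    (network limit) and the receiver window (receiver limit); bytes
    already awaiting acknowledgment count against it.
    """
    window = min(cwnd, rwnd)
    return max(window - bytes_in_flight, 0)


print(usable_window(cwnd=10_000, rwnd=4_000, bytes_in_flight=1_000))  # 3000
print(usable_window(cwnd=2_000, rwnd=4_000, bytes_in_flight=3_000))   # 0
```

In the first call the receiver is the bottleneck; in the second the network is, and the sender has to wait for acknowledgments before sending more.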

How Should a New Flow Start?
- Need to start with a small CWND to avoid overloading the network.
- But, could take a long time to get started!
[Figure: congestion window versus time t for a new flow]

“Slow Start” Phase
- Start with a small congestion window
  - Initially, CWND is 1 Max Segment Size (MSS)
  - So, initial sending rate is MSS/RTT
- That could be pretty wasteful
  - Might be much less than the actual bandwidth
  - Linear increase takes a long time to accelerate
- Slow-start phase (really “fast start”)
  - Sender starts at a slow rate (hence the name)
  - … but increases the rate exponentially
  - … until the first loss event

Slow Start in Action
- Double CWND per round-trip time
[Figure: data (D) and acknowledgment (A) exchanges between Src and Dest, with more data sent in each successive round trip]
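
To see why doubling per round trip matters, the sketch below counts how many round-trip times a sender needs to reach a target window under slow start versus a purely linear increase. Both helper names are hypothetical, and the window is counted in segments for simplicity.

```python
def slow_start_rounds(target_segments, initial=1):
    """Round trips needed to reach the target when CWND doubles every RTT."""
    cwnd, rounds = initial, 0
    while cwnd < target_segments:
        cwnd *= 2
        rounds += 1
    return rounds


def linear_rounds(target_segments, initial=1):
    """Round trips needed when CWND grows by one segment every RTT."""
    return max(target_segments - initial, 0)


print(slow_start_rounds(64), linear_rounds(64))  # 6 63
```

Reaching a 64-segment window takes 6 round trips with doubling, against 63 with a one-segment-per-RTT increase.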

Slow Start and the TCP Sawtooth
[Figure: congestion window versus time t, with an exponential “slow start” phase up to the first loss, followed by the sawtooth]
- Why is it called slow-start? Because TCP originally had no congestion control mechanism. The source would just start by sending a whole receiver window’s worth of data.

Two Kinds of Loss in TCP
- Timeout
  - Packet n is lost and detected via a timeout
  - E.g., because all packets in flight were lost
  - After the timeout, blasting away for the entire CWND
  - … would trigger a very large burst in traffic
  - So, better to start over with a low CWND
- Triple duplicate ACK
  - Packet n is lost, but packets n+1, n+2, etc. arrive
  - Receiver sends duplicate acknowledgments
  - … and the sender retransmits packet n quickly
  - Do a multiplicative decrease and keep going
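
Detecting a triple duplicate ACK amounts to counting repeated acknowledgment numbers. The sketch below is a simplified illustration, assuming cumulative ACKs in which each ACK number names the next packet the receiver expects; the function name `detect_triple_dup_ack` is hypothetical.

```python
def detect_triple_dup_ack(acks):
    """Return the ACK number that is seen as a third duplicate, or None.

    A duplicate is an ACK that repeats the previous ACK number. The third
    duplicate signals that the packet with that number is probably lost
    while later packets are still arriving.
    """
    last, duplicates = None, 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == 3:
                return ack
        else:
            last, duplicates = ack, 0
    return None


print(detect_triple_dup_ack([1, 2, 2, 2, 2, 3]))  # 2
print(detect_triple_dup_ack([1, 2, 2, 3]))        # None
```

Because the duplicates prove that packets are still getting through, the sender can retransmit right away instead of waiting for a timeout.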

Repeating Slow Start After Timeout
- Slow-start restart: go back to CWND of 1, but take advantage of knowing the previous value of CWND.
- Slow start in operation until it reaches half of previous CWND.
[Figure: congestion window versus time t, showing a timeout followed by slow start up to half of the previous window]
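
The two loss reactions and the slow-start restart can be put together in one small state machine. This is a sketch under stated assumptions: the window is counted in segments, the class name `Sender` is hypothetical, and the field name `ssthresh` (the point where slow start hands over to additive increase) is borrowed from common TCP terminology rather than taken from these slides.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    cwnd: int = 1       # congestion window, in segments
    ssthresh: int = 64  # assumed initial slow-start threshold

    def on_rtt_without_loss(self):
        """Grow the window after one loss-free round trip."""
        if self.cwnd < self.ssthresh:
            # Slow start: double, but stop at the threshold.
            self.cwnd = min(self.cwnd * 2, self.ssthresh)
        else:
            # Additive increase: one segment per round trip.
            self.cwnd += 1

    def on_triple_dup_ack(self):
        """Multiplicative decrease, then keep going."""
        self.cwnd = max(self.cwnd // 2, 1)
        self.ssthresh = self.cwnd

    def on_timeout(self):
        """Restart from 1 segment; slow start up to half the previous window."""
        self.ssthresh = max(self.cwnd // 2, 1)
        self.cwnd = 1


sender = Sender(cwnd=16, ssthresh=16)
sender.on_timeout()
trace = []
for _ in range(4):
    sender.on_rtt_without_loss()
    trace.append(sender.cwnd)
print(trace)  # [2, 4, 8, 9]
```

After the timeout at a window of 16, the sender doubles back up to 8 (half of the previous window) and then returns to linear growth.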

Repeating Slow Start After Idle Period
- Suppose a TCP connection goes idle for a while
  - E.g., Telnet session where you don’t type for an hour
- Eventually, the network conditions change
  - Maybe many more flows are traversing the link
  - E.g., maybe everybody has come back from lunch!
- Dangerous to start transmitting at the old rate
  - Previously-idle TCP sender might blast the network
  - … causing excessive congestion and packet loss
- So, some TCP implementations repeat slow start
  - Slow-start restart after an idle period

TCP Achieves Some Notion of Fairness
- Effective utilization is not the only goal
  - We also want to be fair to the various flows
  - … but what the heck does that mean?
- Simple definition: equal shares of the bandwidth
  - N flows that each get 1/N of the bandwidth?
  - But, what if the flows traverse different paths?
  - E.g., with TCP a flow’s share depends on its RTT (flows with a shorter RTT ramp up faster and get more bandwidth)

What About Cheating?
- Some folks are more fair than others
  - Running multiple TCP connections in parallel
  - Modifying the TCP implementation in the OS
  - Using the User Datagram Protocol
- What is the impact?
  - Good guys slow down to make room for you
  - You get an unfair share of the bandwidth
- Possible solutions?
  - Routers detect cheating and drop excess packets?
  - Peer pressure?
  - ???
© 2001, Cisco Systems, Inc. Random Early Detection QOS v1.0—5-28

Objectives
Upon completing this lesson, you will be able to:
- Explain the need for congestion avoidance mechanisms
- Explain how RED works and how it can prevent congestion
- Describe the benefits and drawbacks of RED

Router Interface Congestion
- Router interfaces congest when the output queue is full:
  - Additional incoming packets are dropped.
  - Dropped packets may cause significant application performance degradation.
- By default, routers perform tail dropping.
- Tail dropping has significant drawbacks.
- WFQ, if configured, has a more intelligent dropping scheme.

Tail-Drop Flaws
Simple tail dropping has significant flaws:
- TCP synchronization
- TCP starvation
- High delay and jitter
- No differentiated drop
- Poor feedback to TCP

TCP Synchronization
- Multiple TCP sessions start at different times.
- TCP window sizes are increased.
- Tail drops cause many packets of many sessions to be dropped at the same time.
- TCP sessions restart at the same time (synchronization).
[Figure: throughput of Flow A, Flow B, and Flow C over time, with the average link use]

TCP Starvation, Delay, and Jitter
- Constant high buffer use (long queue) causes delay.
- More aggressive flows can cause other flows to starve.
- Variable buffer use causes jitter.
- There is no differentiated dropping.
[Figure: a queue holding Prec. 0 packets of aggressive flows alongside Prec. 3 packets of starving flows, with the queuing delay marked]
- Packets experience long delay if the interface is constantly congested.
- TCP does not react well if multiple packets are dropped.
- Tail dropping does not look at IP Precedence.

Conclusion
- Tail dropping should be avoided.
- Tail dropping can be avoided if congestion is prevented.
- Congestion can be prevented if TCP sessions (which still make up more than 80% of average Internet traffic) can be slowed down.
- TCP sessions can be slowed down if some packets are occasionally dropped.
- Therefore, packets should be dropped when an interface is nearing congestion.

Random Early Detection
- Random early detection (RED) is a mechanism that randomly drops packets even before a queue is full.
- RED drops packets with increasing probability.
- RED result:
  - TCP sessions slow down to the approximate rate of output-link bandwidth.
  - Average queue size is small (much less than the maximum queue size).
- IP Precedence can be used to drop lower-Precedence packets more aggressively than higher-Precedence packets.

RED Profile
[Figure: drop probability versus average queue size. No drop below the minimum threshold; random drop between the minimum and maximum thresholds, rising to the maximum drop probability (10% in the figure); full drop (100%) from the maximum threshold upward]

RED Modes
RED has three modes:
- No drop: when the average queue size is between 0 and the minimum threshold
- Random drop: when the average queue size is between the minimum and the maximum threshold
- Full drop (tail drop): when the average queue size is at maximum threshold or above
Random drops should prevent congestion (prevent tail drops).
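
The three modes map directly onto a drop-probability function of the average queue size. The sketch below assumes the linear ramp drawn in the RED profile figure (rising from 0 at the minimum threshold to the maximum drop probability at the maximum threshold); the function name and the sample thresholds are illustrative.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability for a given average queue size.

    avg_queue <  min_th           -> no drop (0.0)
    min_th <= avg_queue < max_th  -> random drop, rising linearly to max_p
    avg_queue >= max_th           -> full drop (1.0)
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Illustrative profile: thresholds 20 and 40 packets, 10% maximum drop probability.
for avg in (10, 30, 40):
    print(avg, red_drop_probability(avg, min_th=20, max_th=40, max_p=0.10))
```

Halfway between the thresholds the probability is half of the maximum drop probability; at the maximum threshold it jumps to 1.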

Before RED
- TCP synchronization prevents average link utilization close to the link bandwidth.
- Tail drops cause TCP sessions to go into slow-start.
[Figure: throughput of Flow A, Flow B, and Flow C over time, with the average link use]

After RED
- Average link use is much closer to link bandwidth.
- Random drops cause TCP sessions to reduce window sizes.
[Figure: throughput of Flow A, Flow B, and Flow C over time, with the average link use]

Summary
Upon completing this lesson, you should be able to:
- Explain the need for congestion avoidance mechanisms
- Explain how RED works and how it can prevent congestion
- Describe the benefits and drawbacks of RED

Lesson Review
1. What are the main drawbacks of using tail dropping as a means of congestion control?
2. What does RED do to prevent TCP synchronization?
3. What are the three modes of RED?

Weighted Random Early Detection

Objectives
Upon completing this lesson, you will be able to:
- Describe the weighted random early detection (WRED) mechanism
- Configure WRED on Cisco routers
- Monitor and troubleshoot WRED on Cisco routers

Weighted Random Early Detection
- WRED uses a different RED profile for each weight.
- Each profile is identified by:
  - Minimum threshold
  - Maximum threshold
  - Maximum drop probability
- Weight can be:
  - IP Precedence (8 profiles)
  - DSCP (64 profiles)
- WRED drops less important packets more aggressively than more important packets.

WRED Profiles
- WRED profiles can be manually set.
- WRED has 8 default value sets for IP Precedence–based WRED.
- WRED has 64 default value sets for DSCP–based WRED.
[Figure: drop probability (10%, 100%) versus average queue size for several WRED profiles]

IP Precedence and Class Selector Profiles
[Figure: drop probability (10%, 100%) versus average queue size (ticks at 20 and 40), with one profile per IP Precedence value and one for RSVP]

DSCP-Based WRED (Expedited Forwarding)
[Figure: drop probability (10%, 100%) versus average queue size (ticks at 20, 36, and 40), showing the EF profile]

DSCP-Based WRED (Assured Forwarding)
[Figure: drop probability (10%, 100%) versus average queue size (ticks at 20 and 40), showing the Assured Forwarding high-drop, medium-drop, and low-drop profiles]

WRED Building Blocks
[Figure: processing path of an IP packet through WRED]
- Calculate average queue size: uses the current queue size of the FIFO queue.
- Select WRED profile: uses the IP Precedence or DSCP of the packet and yields the minimum threshold, maximum threshold, and mark probability denominator.
- WRED: either passes the packet on or performs a random drop.
- Queue full? If yes, tail drop; if no, the packet enters the FIFO queue.
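
The building blocks can be read as a short decision procedure. The following sketch is one possible reading of the diagram, not Cisco's implementation: the names `Profile`, `PROFILES`, and `wred_decision` are hypothetical, the random number is passed in so the outcome is reproducible, and the two sample profiles reuse the Precedence 0 and Precedence 5 values from the DWRED output shown later in this module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    min_threshold: int
    max_threshold: int
    mark_prob_denominator: int


# Sample values taken from the DWRED output in this module (Precedence 0 and 5).
PROFILES = {
    0: Profile(109, 218, 10),
    5: Profile(174, 218, 10),
}


def wred_decision(avg_queue, current_queue, queue_limit, precedence, random_value):
    """Return 'enqueue', 'random drop', 'full drop', or 'tail drop'.

    random_value is a number in [0, 1) supplied by the caller.
    """
    profile = PROFILES[precedence]  # select the WRED profile
    if avg_queue >= profile.max_threshold:
        return "full drop"
    if avg_queue >= profile.min_threshold:
        span = profile.max_threshold - profile.min_threshold
        p = (avg_queue - profile.min_threshold) / span / profile.mark_prob_denominator
        if random_value < p:
            return "random drop"
    if current_queue >= queue_limit:  # the "Queue full?" check
        return "tail drop"
    return "enqueue"


print(wred_decision(150, 10, 300, precedence=0, random_value=0.01))  # random drop
print(wred_decision(150, 10, 300, precedence=5, random_value=0.01))  # enqueue
```

At the same average queue size, the Precedence 0 packet is already in its random-drop region while the Precedence 5 packet is still below its minimum threshold, which is the differentiated dropping WRED is meant to provide.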

Configuring WRED and DWRED
Router(config-if)# random-detect
- Enables IP Precedence–based WRED
- Default service profile is used
- Nondistributed WRED cannot be combined with fancy queuing; FIFO queuing has to be used
- WRED can run distributed on VIP-based interfaces (DWRED)
- DWRED can be combined with DWFQ

Changing the WRED Profile
Router(config-if)# random-detect precedence precedence min-threshold max-threshold mark-prob-denominator
- Changes RED profile for specified IP Precedence value
- Packet drop probability at maximum threshold is 1 / mark-prob-denominator
- Nonweighted RED is achieved by using the same RED profile for all precedence values
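
As a worked example of how the three parameters interact, the relation below assumes the linear ramp of the RED profile, with $d$ standing for mark-prob-denominator. The symbols and the sample values (thresholds 20 and 40, $d = 10$) are illustrative and not taken from a specific default profile.

```latex
% Drop probability in the random-drop region (min <= avg < max):
p(\text{avg}) = \frac{1}{d} \cdot \frac{\text{avg} - \text{min}}{\text{max} - \text{min}}

% Example with min = 20, max = 40, d = 10, avg = 30:
p(30) = \frac{1}{10} \cdot \frac{30 - 20}{40 - 20} = 0.1 \cdot 0.5 = 0.05

% As avg approaches max, p approaches 1/d = 0.1 (10%).
```

A larger denominator therefore makes the profile gentler across the whole random-drop region, while the thresholds decide where that region starts and ends.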

Changing WRED Sensitivity to Bursts
Router(config-if)# random-detect exponential-weighting-constant n
- WRED takes the average queue size to determine the current WRED mode (no drop, random drop, full drop).
- High values of n allow short bursts.
- Low values of n make WRED more burst-sensitive.
- Default value (9) should be used in most scenarios.
- The new average queue size is computed from the previous average queue size and the current queue size:
  $\text{average}_{t+1} = \text{average}_t \cdot (1 - 2^{-n}) + \text{queue\_size}_t \cdot 2^{-n}$
- With n = 9 the weight is $2^{-9} = 1/512$:
  $\text{average}_{t+1} = \text{average}_t \cdot \tfrac{511}{512} + \text{queue\_size}_t \cdot \tfrac{1}{512}$
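
The effect of the exponential weighting constant is easiest to see numerically. The sketch below applies the weighted average with weight 2^-n to a short burst; the function names, the burst size of 100 packets, and the 10 samples are assumptions chosen for illustration.

```python
def update_average(old_average, current_queue_size, n=9):
    """Exponentially weighted average of the queue size with weight 2**-n."""
    weight = 2.0 ** -n
    return old_average * (1 - weight) + current_queue_size * weight


def average_after_burst(n, burst_size=100, samples=10):
    """Average queue size after `samples` updates with the queue held at burst_size."""
    avg = 0.0
    for _ in range(samples):
        avg = update_average(avg, burst_size, n)
    return avg


print(round(average_after_burst(n=9), 2))  # stays small: the burst is mostly ignored
print(round(average_after_burst(n=1), 2))  # close to 100: the burst is tracked at once
```

With n = 9 the average barely moves during a short burst, so WRED stays in no-drop mode; with n = 1 the average follows the instantaneous queue almost immediately and drops start early.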

Configuring DSCP-Based WRED
Router(config-if)# random-detect {prec-based | dscp-based}
- Selects WRED mode
- Precedence-based WRED is the default mode
- DSCP-based mode uses 64 profiles

Changing the WRED Profile
Router(config-if)# random-detect dscp dscp min-threshold max-threshold mark-prob-denominator
- Changes RED profile for specified DSCP value
- Packet drop probability at maximum threshold is 1 / mark-prob-denominator

WRED Case Study
WRED is applied to a core link in a network with these IP Precedence definitions:

IP Prec.  Meaning
0         High-drop, best-effort traffic
1         Low-drop, best-effort traffic
2         Premium traffic outside of the contract
3         Premium traffic in the contract
4         Unused
5         Voice over IP
6         Routing protocol traffic
7

WRED Case Study Guidelines
- Best-effort traffic should be dropped before premium traffic.
- Out-of-contract or high-drop, best-effort traffic should be dropped very aggressively.
- Voice traffic should be dropped only under extreme congestion.
- Routing protocol traffic should be less drop resistant than VoIP (depends on the routing protocol and control over amount of VoIP traffic).
- Configure WRED with default values on an interface first and tune the per-precedence parameters based on default values.

Sample WRED Profile
[Figure: packet discard probability (tick at 0.1) versus average queue size, with profiles for Precedence 2, Precedence 0, Precedence 3, Precedence 1, VoIP, Routing, and RSVP]

WRED Configuration
interface Serial 0/1/0
 ip address
 random-detect
 random-detect precedence
 random-detect precedence
 random-detect precedence
 random-detect precedence
 random-detect precedence
 random-detect precedence
 random-detect precedence
 random-detect precedence

Monitoring WRED
- show interface
  - Displays the queuing/dropping mechanism in use
  - Displays WRED parameters (VIP only)
- show queueing
  - Displays the RED profile for each interface
- show queue
  - Displays the interface's output queue
- show interface random-detect
  - Displays RED statistics (VIP only)

Interface Parameters
Router# show interface intf
- Displays interface parameters

Router#show interface serial 1/0
Serial1/0 is up, line protocol is up
  Hardware is CD2430 in sync mode
  Internet address is /30
  MTU 1500 bytes, BW 128 Kbit, DLY 200 usec, rely 255/
  Encapsulation HDLC, loopback not set, keepalive set (10 sec)
  Last input 00:00:07, output 00:00:07, output hang never
  Last clearing of "show interface" counters never
  Input queue: 2/75/0 (size/max/drops); Total output drops: 0
  Queueing strategy: random early detection (WRED)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
  packets input, bytes, 0 no buffer
  Received broadcasts, 0 runts, 0 giants, 0 throttles
  0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
  ... rest deleted ...

WRED Parameters and Statistics
Router# show queueing random-detect
- Displays per-interface WRED parameters and statistics

Router#show queueing random-detect
Current random-detect configuration:
  Serial1/0
    Queueing strategy: random early detection (WRED)
    Exp-weight-constant: 9 (1/512)
    Mean queue depth: 38
    Class   Random  Tail   Minimum    Maximum    Mark
            drop    drop   threshold  threshold  probability
    / / / / / / / /10
    rsvp /10

DWRED Parameters and Statistics
Router#show interfaces random-detect
FastEthernet1/0/0 queue size 0
      packets output 29692, drops 0
 WRED: queue average 0
      weight 1/512
   Precedence 0: 109 min threshold, 218 max threshold, 1/10 mark weight
      1 packets output, drops: 0 random, 0 threshold
   Precedence 1: 122 min threshold, 218 max threshold, 1/10 mark weight
      (no traffic)
   Precedence 2: 135 min threshold, 218 max threshold, 1/10 mark weight
      packets output, drops: 0 random, 0 threshold
   Precedence 3: 148 min threshold, 218 max threshold, 1/10 mark weight
      (no traffic)
   Precedence 4: 161 min threshold, 218 max threshold, 1/10 mark weight
      (no traffic)
   Precedence 5: 174 min threshold, 218 max threshold, 1/10 mark weight
      (no traffic)
   Precedence 6: 187 min threshold, 218 max threshold, 1/10 mark weight
      packets output, drops: 0 random, 0 threshold
   Precedence 7: 200 min threshold, 218 max threshold, 1/10 mark weight
      (no traffic)

Queue Details
Router# show queue intf
- Displays queue contents

Router#show queue serial 1/0
Output queue for Serial1/0 is 65/0
Packet 1, linktype: ip, length: 1504, flags: 0x48
  source: , destination: , id: 0x001A, ttl: 255, prot: 1
  data: 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD
        0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD
Packet 2, linktype: ip, length: 1504, flags: 0x48
  source: , destination: , id: 0x001A, ttl: 255, prot: 1
  data: 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD
        0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD
Packet 3, linktype: ip, length: 1504, flags: 0x48
  source: , destination: , id: 0x001A, ttl: 255, prot: 1
  data: 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD
        0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD 0xABCD
... rest deleted ...

WRED Caveats and Restrictions
- Because the same policy is applied to all flows, a single nonadaptive flow can monopolize the buffer resources at an interface:
  - WRED is suitable when TCP represents at least 80% of the traffic.
  - Non-TCP traffic should be rate limited.
- The nondistributed WRED implementation is mutually exclusive with PQ, CQ, and WFQ.

Summary
Upon completing this lesson, you should be able to:
- Describe the weighted random early detection (WRED) mechanism
- Configure WRED on Cisco routers
- Monitor and troubleshoot WRED on Cisco routers

Module Summary
Upon completing this module, you should be able to:
- Describe random early detection (RED)
- Describe and configure weighted random early detection (WRED)