Published by Cameron Dean. Modified over 9 years ago.
1
Advanced Computer Networking Internet Congestion Control
2
Principles of Congestion Control
Informally: “too many sources sending too much data too fast for the network to handle”. Manifestations: lost packets (buffer overflow at routers); long delays (queuing in router buffers). A highly important problem! [Figure: hosts H1, H2, H3 and router R1; A1(t) 10 Mb/s, A2(t) 100 Mb/s, D(t) 1.5 Mb/s.] behnam shafagaty
3
Causes/costs of congestion: scenario 1
Two senders, two receivers; one router, infinite buffers; no retransmission.
4
Causes/costs of congestion: scenario 1
Throughput increases with load. Maximum total load is C (each session C/2). Large delays when congested. The load is stochastic.
5
Causes/costs of congestion: scenario 2
One router, finite buffers; sender retransmission of lost packets.
6
Causes/costs of congestion: scenario 2
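Notation: λ_in is the original sending rate, λ'_in is the offered load including retransmissions, and λ_out is the delivered rate. Always: λ_in = λ_out (goodput). We would like to maximize goodput! “Perfect” retransmission (retransmit only when loss): λ'_in > λ_out. Actual retransmission of delayed (not lost) packets makes λ'_in larger (than in the perfect case) for the same λ_out.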
7
Causes/costs of congestion: scenario 2
[Figure: λ_out plotted against λ'_in.] “Costs” of congestion: more work (retransmissions) for a given “goodput”; unneeded retransmissions: the link carries (and delivers) multiple copies of a packet.
8
Packet delay and throughput as functions of load
9
Congestion Control
Congestion control involves two tasks: -Detect congestion -Limit sending rate
10
TCP & AQM
[Figure: TCP sources (Reno, Vegas) with rates xi(t); AQM at links (DropTail, RED, REM, PI, AVQ) with congestion measure pl(t).] Example congestion measures pl(t): loss (Reno); queuing delay (Vegas).
11
TCP Congestion Control
End-to-end control (no network assistance). Assumes that long delays (packet loss) are due to congestion.
12
Congestion Control II
TCP uses slow start and additive increase/multiplicative decrease (AIMD) to deal with congestion. Van Jacobson outlined these ideas in 1988. Slow start, roughly: whenever starting traffic or recovering from congestion, start cwnd at the size of a single segment and increase it (up to a point) as ACKs show up.
13
AIMD (Additive Increase / Multiplicative Decrease)
CongestionWindow (cwnd) is a variable held by the TCP source for each connection. cwnd is set based on the perceived level of congestion. The host receives implicit (packet drop) or explicit (packet mark) indications of internal congestion. MaxWindow = min (CongestionWindow, AdvertisedWindow); EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked).
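The two window formulas can be tried out with a small Python sketch. The function name and the byte values in the checks are illustrative assumptions, and clamping the result at zero is an addition that the slide formula does not state.

```python
def effective_window(cwnd, advertised_window, last_byte_sent, last_byte_acked):
    """Return how many more bytes the sender may transmit right now."""
    # MaxWindow = min(CongestionWindow, AdvertisedWindow)
    max_window = min(cwnd, advertised_window)
    # Bytes sent but not yet acknowledged.
    bytes_in_flight = last_byte_sent - last_byte_acked
    # Clamping at zero is an assumption added here; the slide formula has no clamp.
    return max(0, max_window - bytes_in_flight)
```

With cwnd = 8000, an advertised window of 16000, and 4000 bytes in flight, the congestion window is the limit and 4000 more bytes may be sent.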
14
Additive Increase
Additive increase is a reaction to perceived available capacity. Linear increase, basic idea: for each “cwnd’s worth” of packets sent, increase cwnd by 1 packet. In practice, cwnd is incremented fractionally for each arriving ACK. With cwnd measured in bytes: increment = MSS x (MSS / cwnd); cwnd = cwnd + increment.
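A minimal Python sketch of the per-ACK fractional increment follows. It assumes cwnd is kept in bytes, so the increment is scaled to MSS x (MSS / cwnd); the MSS value of 1460 bytes and both function names are assumptions for illustration only.

```python
MSS = 1460  # assumed maximum segment size in bytes (not given on the slide)

def on_ack(cwnd_bytes, mss=MSS):
    """Grow cwnd by a fraction of one MSS for a single arriving ACK."""
    increment = mss * (mss / cwnd_bytes)
    return cwnd_bytes + increment

def after_one_rtt(cwnd_bytes, mss=MSS):
    """Apply on_ack once per segment of the window present at the start of the RTT."""
    acks = int(cwnd_bytes // mss)
    for _ in range(acks):
        cwnd_bytes = on_ack(cwnd_bytes, mss)
    return cwnd_bytes
```

Starting from a window of 10 segments, one RTT of ACKs grows cwnd by slightly less than one MSS, because each increment is computed against a window that has already grown a little.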
15
Additive Increase
[Figure: source/destination timeline; add one packet each RTT.]
16
Multiplicative Decrease
The key assumption is that a dropped packet and the resultant timeout are due to congestion at a router or a switch. Multiplicative decrease: TCP reacts to a timeout by halving cwnd. cwnd is not allowed to fall below the size of a single packet.
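The halving rule with its one-packet floor can be written as a one-line Python sketch. Counting cwnd in packets rather than bytes is an assumption made to keep the example short, and the function name is hypothetical.

```python
def multiplicative_decrease(cwnd_packets):
    """Halve the window on a timeout, never dropping below one packet."""
    return max(cwnd_packets / 2.0, 1.0)
```

Repeated timeouts therefore shrink a 16-packet window to 8, 4, 2, and 1 packets, and it stays at 1 after that.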
17
AIMD: Some Notes
It has been shown that AIMD is a necessary condition for TCP congestion control to be stable. Because the simple CC mechanism involves timeouts that cause retransmissions, it is important that hosts have an accurate timeout mechanism. Timeouts are set as a function of the average RTT and the standard deviation of RTT.
18
Typical TCP Congestion window Evolution
19
AIMD: Two users, One link
[Figure: rate of User 2 plotted against rate of User 1, with the fairness line and the BW limit.]
20
Slow Start
Linear additive increase takes too long to ramp up a new TCP connection from a cold start. Beginning with TCP Tahoe, the slow start mechanism was added to provide an initial exponential increase in the size of cwnd.
21
Slow Start
1- The source starts with cwnd = 1.
2- Every time an ACK arrives, cwnd is incremented. cwnd is effectively doubled per RTT “epoch”. Two slow start situations: at the very beginning of a connection {cold start}; when the connection goes dead waiting for a timeout to occur (i.e., the advertised window goes to zero!).
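The doubling per RTT follows directly from the per-ACK increment, as this Python sketch illustrates. It assumes cwnd is counted in packets, that every packet in a round is acknowledged individually, and that nothing is lost; the function name is hypothetical.

```python
def slow_start_rounds(rounds, initial_cwnd=1):
    """Return cwnd (in packets) at the start of each RTT during slow start."""
    cwnd = initial_cwnd
    history = [cwnd]
    for _ in range(rounds):
        # One ACK comes back for every packet sent in this RTT
        # (assumes no loss and no delayed ACKs).
        acks_this_rtt = cwnd
        for _ in range(acks_this_rtt):
            cwnd += 1  # each ACK adds one packet to the window
        history.append(cwnd)
    return history
```

Because a window of cwnd packets produces cwnd ACKs, adding one packet per ACK doubles the window each round.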
22
Slow Start
[Figure: source/destination timeline; add one packet per ACK.]
23
Fast Retransmit
Basic idea: use duplicate ACKs to signal a lost packet. Upon receipt of three duplicate ACKs, the TCP sender retransmits the lost packet.
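A sketch of the duplicate-ACK counter, in Python, may help make the trigger concrete. It assumes cumulative ACKs where each ACK carries the number of the next packet the receiver expects; the function name and the list-of-integers input are illustrative assumptions.

```python
def fast_retransmit_triggers(acks, threshold=3):
    """Return the packet numbers a sender would fast-retransmit.

    acks is the sequence of cumulative ACK numbers seen by the sender.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, on the third duplicate of the same ACK.
            if dup_count == threshold:
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits
```

The first ACK carrying a given number is not a duplicate, so four identical ACKs in a row (one original plus three duplicates) are needed before the retransmission fires.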
24
Fast Retransmit
Generally, fast retransmit eliminates about half of the timeouts. This yields roughly a 20% improvement in throughput. Note: fast retransmit does not eliminate all the timeouts, because small window sizes at the source may not produce enough duplicate ACKs.
25
Fast Retransmit
[Figure: fast retransmit based on three duplicate ACKs.]
26
TCP Congestion Window Trace
27
Fast Recovery
Fast recovery was added with TCP Reno. In congestion avoidance mode, if duplicate ACKs are received, reduce cwnd to half. If n successive duplicate ACKs are received, we know that the receiver got n segments after the lost segment: advance cwnd by that number.
28
Adaptive Retransmissions
RTT: Round Trip Time between a pair of hosts on the Internet. How to set the TimeOut value? The timeout value is set as a function of the expected RTT. Consequences of a bad choice?
29
Original Algorithm
Keep a running average of RTT and compute TimeOut as a function of this RTT. Send packet and keep timestamp ts. When ACK arrives, record timestamp ta. SampleRTT = ta - ts
30
Original Algorithm
Compute a weighted average: EstimatedRTT = α x EstimatedRTT + (1 - α) x SampleRTT. Original TCP spec: α in range (0.8, 0.9). TimeOut = 2 x EstimatedRTT
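The weighted average is an exponentially weighted moving average, sketched below in Python. The value α = 0.875 is one assumed choice inside the stated range, seeding the estimate with the first sample is also an assumption, and the function name is hypothetical.

```python
def original_rtt_estimator(samples, alpha=0.875):
    """Return (EstimatedRTT, TimeOut) after processing all RTT samples."""
    # Seeding with the first sample is an assumption; the slide does not say.
    estimated_rtt = samples[0]
    for sample_rtt in samples[1:]:
        # EstimatedRTT = alpha * EstimatedRTT + (1 - alpha) * SampleRTT
        estimated_rtt = alpha * estimated_rtt + (1 - alpha) * sample_rtt
    timeout = 2 * estimated_rtt
    return estimated_rtt, timeout
```

With α close to 1 the estimate moves slowly: a single sample of 180 ms after an estimate of 100 ms only raises the estimate to 110 ms.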
31
Karn/Partridge Algorithm
An obvious flaw in the original algorithm: whenever there is a retransmission, it is impossible to know whether to associate the ACK with the original packet or the retransmitted packet.
32
Associating the ACK?
33
Karn/Partridge Algorithm
Do not measure SampleRTT when a packet is sent more than once. For each retransmission, set TimeOut to double the last TimeOut. {Note: this is a form of exponential backoff based on the belief that the lost packet is due to congestion.}
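The two rules can be combined with the original weighted-average estimator in a short Python sketch. The class name, the α value of 0.875, and the choice to recompute TimeOut as 2 x EstimatedRTT on the next clean sample are assumptions for illustration.

```python
class KarnPartridgeTimer:
    """RTT estimator that skips ambiguous samples and backs off on retransmit."""

    def __init__(self, estimated_rtt, alpha=0.875):
        self.estimated_rtt = estimated_rtt
        self.alpha = alpha  # assumed value within the original (0.8, 0.9) range
        self.timeout = 2 * estimated_rtt

    def on_retransmit(self):
        # Rule 2: double the last TimeOut for each retransmission.
        self.timeout *= 2

    def on_ack(self, sample_rtt, was_retransmitted):
        # Rule 1: an ACK for a retransmitted packet gives no usable sample.
        if was_retransmitted:
            return
        self.estimated_rtt = (self.alpha * self.estimated_rtt
                              + (1 - self.alpha) * sample_rtt)
        self.timeout = 2 * self.estimated_rtt
```

An ACK for a retransmitted packet leaves both the estimate and the backed-off timeout untouched, so a misattributed sample can never shrink the timer.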
34
Jacobson/Karels Algorithm
The problem with the original algorithm is that it did not take into account the variance of SampleRTT. Difference = SampleRTT - EstimatedRTT; EstimatedRTT = EstimatedRTT + (δ x Difference); Deviation = Deviation + δ x (|Difference| - Deviation), where δ is a fraction between 0 and 1.
35
Jacobson/Karels Algorithm
TCP computes the timeout using both the mean and the variance of RTT: TimeOut = µ x EstimatedRTT + Φ x Deviation, where based on experience µ = 1 and Φ = 4.
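A minimal Python sketch of the Jacobson/Karels update follows, using the running form Deviation = Deviation + δ x (|Difference| - Deviation). The value δ = 0.125, the seeding with the first sample, and the initial deviation of zero are assumptions; µ = 1 and Φ = 4 come from the slide.

```python
def jacobson_karels(samples, delta=0.125, mu=1.0, phi=4.0):
    """Return (EstimatedRTT, Deviation, TimeOut) after all RTT samples."""
    estimated_rtt = samples[0]  # assumed seed: the first sample
    deviation = 0.0             # assumed initial deviation
    for sample_rtt in samples[1:]:
        difference = sample_rtt - estimated_rtt
        estimated_rtt += delta * difference
        deviation += delta * (abs(difference) - deviation)
    timeout = mu * estimated_rtt + phi * deviation
    return estimated_rtt, deviation, timeout
```

When the samples are steady the deviation stays near zero and the timeout sits close to the estimate; a jittery path inflates the deviation term and therefore the timeout.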
36
Algorithms
37
Early TCP
Pre-1988. Go-back-N ARQ: detects loss from timeout; retransmits from lost packet onward. Receiver window flow control: prevents overflows at the receive buffer. Flow control: self-clocking.
38
Why Flow Control?
October 1986: the Internet had its first congestion collapse. Link from LBL to UC Berkeley: 400 yards, 3 hops, 32 Kbps; throughput dropped to 40 bps, a factor of ~1000 drop! 1988: Van Jacobson proposed TCP flow control.
39
Effect of Congestion
Packet loss; retransmission; reduced throughput. Congestion collapse is due to: unnecessarily retransmitted packets; undelivered or unusable packets. Congestion may continue after the overload! [Figure: throughput versus load.]
40
Window Flow Control
[Figure: source and destination timelines; packets 1, 2, ..., W go out as data and come back as ACKs each RTT.] ~ W packets per RTT. Lost packet detected by missing ACK.
41
Window flow control
Limit the number of packets in the network to window W. Source rate = W x MSS / RTT bps (with MSS in bits). If W is too small, then rate << capacity. If W is too big, then rate > capacity => congestion. Adapt W to network (and conditions): W = BW x RTT
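The relation W = BW x RTT (the bandwidth-delay product) and the resulting source rate can be checked with a short Python sketch. The MSS of 1460 bytes, the 10 Mb/s and 100 ms example values, and both function names are assumptions for illustration.

```python
MSS_BYTES = 1460  # assumed segment size; not given on the slide

def window_for_full_utilization(bandwidth_bps, rtt_seconds, mss_bytes=MSS_BYTES):
    """W = BW x RTT, converted from bits to packets of mss_bytes each."""
    return bandwidth_bps * rtt_seconds / (8 * mss_bytes)

def source_rate_bps(window_packets, rtt_seconds, mss_bytes=MSS_BYTES):
    """Rate achieved when W packets are sent every RTT."""
    return window_packets * mss_bytes * 8 / rtt_seconds
```

For a 10 Mb/s path with a 100 ms RTT this gives a window of about 85.6 packets; a window smaller than that leaves the link idle for part of every round trip.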
42
Congestion Control
TCP seeks to: achieve high utilization; avoid congestion; share bandwidth. Window flow control: source rate = W / RTT packets/sec. Adapt W to network (and conditions): W = BW x RTT
43
TCP Window Flow Controls
Receiver flow control: avoid overloading the receiver; set by receiver; awnd: receiver (advertised) window. Network flow control: avoid overloading the network; set by sender; infer available network capacity; cwnd: congestion window. Set W = min (cwnd, awnd)
44
Receiver Flow Control
Receiver advertises awnd with each ACK. Window awnd: closed when data is received and ack’d; opened when data is read. Size of awnd can be the performance limit (e.g. on a LAN); sensible default ~16 kB.
45
Network Flow Control
Source calculates cwnd from indications of network congestion. Congestion indications: losses; delay; marks. Algorithms to calculate cwnd: Tahoe, Reno, Vegas, RED, REM …
46
TCP Congestion Controls
Tahoe (Jacobson 1988): slow start; congestion avoidance; fast retransmit. Reno (Jacobson 1990): fast recovery. Vegas (Brakmo & Peterson 1994): new congestion avoidance. RED (Floyd & Jacobson 1993): probabilistic marking. REM (Athuraliya & Low 2000): clear buffer, match rate.
47
Variants
Tahoe & Reno: NewReno; SACK; Rate-halving; mods for high performance. AQM: RED, ARED, FRED, SRED; BLUE, SFB; REM, PI, AVQ.
48
TCP Tahoe (Jacobson 1988)
[Figure: window versus time, with SS and CA phases.] SS: Slow Start; CA: Congestion Avoidance.
49
Slow Start
Start with cwnd = 1 (slow start). On each successful ACK, increment cwnd: cwnd ← cwnd + 1. Exponential growth of cwnd; each RTT: cwnd ← 2 x cwnd. Enter CA when cwnd >= ssthresh.
50
Slow Start
[Figure: sender/receiver timeline of data packets and ACKs over successive RTTs, starting from cwnd = 1.] cwnd ← cwnd + 1 (for each ACK)
51
Congestion Avoidance
Starts when cwnd >= ssthresh. On each successful ACK: cwnd ← cwnd + 1/cwnd. Linear growth of cwnd; each RTT: cwnd ← cwnd + 1.
52
Congestion Avoidance
[Figure: sender/receiver timeline of data packets and ACKs over successive RTTs, starting from cwnd = 1.] cwnd ← cwnd + 1 (for each cwnd ACKs)
53
Packet Loss
Assumption: loss indicates congestion. Packet loss is detected by: retransmission timeouts (RTO timer); duplicate ACKs (at least 3). [Figure: packets 1 to 7 and their acknowledgements.]
54
Fast Retransmit
Waiting for a timeout takes quite long. Immediately retransmits after 3 dupACKs without waiting for a timeout. Adjusts ssthresh: flightsize = min(awnd, cwnd); ssthresh ← max(flightsize/2, 2). Enter Slow Start (cwnd = 1).
55
Successive Timeouts
When there is a timeout, double the RTO. Keep doing so for each lost retransmission. Exponential back-off: max 64 seconds [1]; max 12 retransmits [1]. [1] Net/3 BSD
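The back-off schedule can be sketched in Python as follows. The 64-second cap and the limit of 12 retransmits are the Net/3 BSD figures from the slide; the 1-second initial RTO in the usage note and the function name are assumptions.

```python
def backoff_schedule(initial_rto, max_rto=64.0, max_retransmits=12):
    """Return the RTO (seconds) used for each successive retransmission."""
    rto = initial_rto
    schedule = []
    for _ in range(max_retransmits):
        schedule.append(rto)
        # Double after every lost retransmission, but never exceed the cap.
        rto = min(rto * 2, max_rto)
    return schedule
```

Starting from an assumed RTO of 1 second, the timer reaches the 64-second cap on the seventh attempt and stays there for the remaining attempts.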
56
Summary: Tahoe
Basic ideas: gently probe network for spare capacity; drastically reduce rate on congestion; windowing: self-clocking. Other functions: round trip time estimation, error recovery. for every ACK { if (W < ssthresh) then W++ (SS) else W += 1/W (CA) } for every loss { ssthresh = W/2; W = 1 }
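The Tahoe pseudocode maps almost line for line onto a small Python class. The class name and the initial ssthresh of 8 packets are assumptions; the window W is counted in packets as on the slide.

```python
class TahoeWindow:
    """Window evolution for Tahoe: slow start, congestion avoidance, loss."""

    def __init__(self, ssthresh=8.0):
        self.w = 1.0              # window in packets
        self.ssthresh = ssthresh  # assumed initial threshold

    def on_ack(self):
        if self.w < self.ssthresh:
            self.w += 1            # slow start (SS)
        else:
            self.w += 1 / self.w   # congestion avoidance (CA)

    def on_loss(self):
        self.ssthresh = self.w / 2
        self.w = 1.0
```

Each loss records half the current window as the new threshold and restarts from one packet, so the next slow start stops where congestion avoidance should take over.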
57
TCP Tahoe
58
TCP Reno (Jacobson 1990)
[Figure: window versus time, with SS and CA phases and fast retransmission/fast recovery.]
59
Fast recovery
Motivation: prevent the `pipe’ from emptying after fast retransmit. Idea: each dupACK represents a packet having left the pipe (successfully received). Enter FR/FR after 3 dupACKs: set ssthresh ← max(flightsize/2, 2); retransmit lost packet; set cwnd ← ssthresh + ndup (window inflation); wait till W = min(awnd, cwnd) is large enough, then transmit new packet(s). On non-dup ACK (1 RTT later), set cwnd ← ssthresh (window deflation) and enter CA. After FR/FR, when CA is entered, cwnd is half of the window when the loss was detected, so the effect of a loss is halving the window. [Source: RFC 2581; Fall & Floyd, “Simulation-based Comparison of Tahoe, Reno, and SACK TCP”]
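Window inflation and deflation can be traced with a minimal Python sketch of the FR/FR steps. The class and method names are hypothetical, windows are counted in packets, and actual packet transmission is left out.

```python
class RenoSender:
    """Tracks cwnd through fast retransmit / fast recovery (FR/FR)."""

    def __init__(self, cwnd, awnd):
        self.cwnd = cwnd
        self.awnd = awnd
        self.ssthresh = None
        self.ndup = 0
        self.in_fast_recovery = False

    def on_dup_ack(self):
        """Return "retransmit" when the third dupACK triggers FR/FR."""
        self.ndup += 1
        if self.in_fast_recovery:
            # Each further dupACK means one more packet left the pipe.
            self.cwnd = self.ssthresh + self.ndup
            return None
        if self.ndup == 3:
            flightsize = min(self.awnd, self.cwnd)
            self.ssthresh = max(flightsize / 2, 2)
            self.cwnd = self.ssthresh + self.ndup  # window inflation
            self.in_fast_recovery = True
            return "retransmit"
        return None

    def on_new_ack(self):
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh  # window deflation, then enter CA
            self.in_fast_recovery = False
        self.ndup = 0
```

With cwnd = 8, the third dupACK sets ssthresh to 4 and inflates cwnd to 7; the non-duplicate ACK then deflates cwnd to 4, half the window at the time of the loss.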
60
Example: FR/FR
[Figure: sender and receiver timelines with cwnd and ssthresh, showing the retransmission and the exit from FR/FR.] Fast retransmit: retransmit on 3 dupACKs. Fast recovery: inflate window while repairing loss to fill pipe.
61
Summary: Reno
Basic ideas: fast recovery avoids slow start. dupACKs: fast retransmit + fast recovery. Timeout: retransmit + slow start. [Figure: state diagram with states congestion avoidance, FR/FR, and slow start; transitions on dupACKs, timeout, and retransmit.]
62
NewReno: Motivation
[Figure: sender S and receiver D timelines for packets 1 to 9, showing FR/FR, 8 unack’d packets, and a timeout.] On 3 dupACKs, the receiver has packets 2, 4, 6, 8; cwnd = 8; the sender retransmits pkt 1 and enters FR/FR. The next dupACK increments cwnd to 9. After an RTT, the ACK arrives for pkts 1 & 2; exit FR/FR with cwnd = 5 and 8 unack’ed pkts. No more ACKs arrive, so the sender must wait for a timeout. Example: cwnd = 10. Sender sends packets 1, 2, …, 10. Packets 1, 3, …, 9 are lost; packets 2, 4, …, 10 are received. When 3 dupACKs are received, the receiver has received (at least) packets 2, 4, 6, 8. The sender retransmits packet 1 and waits until the dupACK due to the arrival of packet 10 has arrived, and then the ACK due to retransmitted packet 1 arrives, acknowledging packets 1 and 2. This last ACK takes Reno out of fast recovery, with cwnd = 5. There are now 8 outstanding packets: 3, 4, …, 10. So the sender cannot transmit any packet. Note that the sender will not receive any more dupACKs, since the window has been exhausted. It must wait until the timer expires for packet 3, then retransmit and go to slow start.
63
NewReno: Fall & Floyd ’96 (RFC 2582)
Motivation: multiple losses within a window. A partial ACK acknowledges some but not all packets outstanding at the start of FR. A partial ACK takes Reno out of FR and deflates the window, so the sender may have to wait for a timeout before proceeding. Idea: a partial ACK indicates lost packets. NewReno stays in FR/FR and retransmits immediately; it retransmits 1 lost packet per RTT until all lost packets from that window are retransmitted. Eliminates the timeout.
64
SACK: Mathis, Mahdavi, Floyd, Romanow ’96 (RFC 2018, RFC 2883)
Motivation: Reno & NewReno retransmit at most 1 lost packet per RTT; the pipe can be emptied during FR/FR with multiple losses. Idea: SACK provides a better estimate of packets in the pipe. The SACK TCP option describes received packets. On 3 dupACKs: retransmits, halves window, enters FR. Updates pipe = packets in pipe: increment when lost or new packets are sent; decrement when a dupACK is received. Transmits a (lost or new) packet when pipe < cwnd. Exits FR when all packets outstanding when FR was entered are acknowledged. [Sources: M. Mathis, J. Mahdavi, S. Floyd and A. Romanow, “TCP Selective Acknowledgement Options”, RFC 2018, Oct. 1996; K. Fall and S. Floyd, “Simulation-based comparisons of Tahoe, Reno and SACK TCP”, Computer Communication Review, July 1996]
65
TCP Vegas (Brakmo & Peterson 1994)
[Figure: window versus time, with SS and CA phases.] Reno with a new congestion avoidance algorithm. Converges (provided buffer is large)!
66
Congestion avoidance
Each source estimates the number of its own packets in the pipe from RTT, and adjusts its window to maintain the estimate between ad and bd. for every RTT { if W/RTTmin - W/RTT < a then W++; if W/RTTmin - W/RTT > b then W-- } for every loss { W := W/2 }
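The per-RTT Vegas rule compares the expected rate W/RTTmin with the actual rate W/RTT, as in this Python sketch. The function names and the threshold values in the usage note are assumptions; a and b are taken to be in the same units as the rate difference (packets per second).

```python
def vegas_update(w, rtt, rtt_min, a, b):
    """Return the window for the next RTT under the Vegas rule."""
    # Expected rate with empty queues minus the rate actually achieved.
    diff = w / rtt_min - w / rtt
    if diff < a:
        return w + 1  # too little queued: probe for more bandwidth
    if diff > b:
        return w - 1  # too much queued: back off before loss occurs
    return w

def vegas_on_loss(w):
    """Halve the window on loss, as in the slide's loss rule."""
    return w / 2
```

With a = 1 and b = 3, a window of 10 packets grows when RTT equals RTTmin (no queueing), shrinks when RTT has doubled, and holds steady when the rate difference falls between the two thresholds.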
67
Implications
Congestion measure = end-to-end queueing delay. At equilibrium: zero loss; stable window at full utilization; approximately weighted proportional fairness; nonzero queue, larger for more sources. Convergence to equilibrium: converges if there is sufficient network buffer; oscillates like Reno otherwise.
68
Wireless TCP
Reno uses loss as congestion measure. In wireless, significant losses are due to fading, interference, and handover, not buffer overflow (congestion). Halving the window is too drastic: small throughput, low utilization.
69
Proposed solutions
Ideas: hide noncongestion losses from the source; inform the source of noncongestion losses. Approaches: link layer error control; Split TCP; Snoop agent; SACK+ELN (Explicit Loss Notification). Source: Balakrishnan, Padmanabhan, Seshan and Katz, “A comparison of mechanisms for improving TCP performance over wireless links”, ToN, 5(6): , Dec 1997
70
Third approach
Problem: Reno uses loss as congestion measure. Two types of losses: congestion loss: retransmit + reduce window; noncongestion loss: retransmit. Previous approaches: hide noncongestion losses; indicate noncongestion losses. Our approach: eliminate congestion losses (buffer overflows).
71
Third approach
[Figure: REM-capable routers and hosts.] Do not use loss as congestion measure (Vegas, REM). Idea: REM clears the buffer, leaving only noncongestion losses; retransmit lost packets without reducing the window.
72
Performance
[Figure: goodput.]