Advanced Computer Networking Internet Congestion Control

1 Advanced Computer Networking Internet Congestion Control

2 Principles of Congestion Control
Informally: "too many sources sending too much data too fast for the network to handle"
Manifestations:
- lost packets (buffer overflow at routers)
- long delays (queuing in router buffers)
A highly important problem!
[Figure: hosts H1, H2, H3 and router R1; labels A1(t) 10 Mb/s, A2(t) 100 Mb/s, D(t) 1.5 Mb/s]
behnam shafagaty

3 Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- no retransmission

4 Causes/costs of congestion: scenario 1
- Throughput increases with load
- Maximum total load C (each session C/2)
- Large delays when congested
- The load is stochastic

5 Causes/costs of congestion: scenario 2
- one router, finite buffers
- sender retransmission of lost packet

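6 Causes/costs of congestion: scenario 2
Here $\lambda_{in}$ is the original data rate, $\lambda'_{in}$ is the offered load including retransmissions, and $\lambda_{out}$ is the delivered rate.
- Always: $\lambda_{in} = \lambda_{out}$ (goodput). Like to maximize goodput!
- "Perfect" retransmission, only when loss: $\lambda'_{in} > \lambda_{out}$
- Actual retransmission of delayed (not lost) packets makes $\lambda'_{in}$ larger (than the perfect case) for the same $\lambda_{out}$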

7 Causes/costs of congestion: scenario 2
[Figure: $\lambda_{out}$ plotted against $\lambda'_{in}$]
"Costs" of congestion:
- more work (retransmissions) for a given "goodput"
- unneeded retransmissions: link carries (and delivers) multiple copies of a packet

8 Packet delay and throughput as functions of load

9 Congestion Control Congestion control involves two tasks:
- Detect congestion
- Limit sending rate

10 TCP & AQM
[Figure: TCP sources (Reno, Vegas) with rates xi(t), and AQM at the links (DropTail, RED, REM, PI, AVQ) with congestion measure pl(t)]
Example congestion measure pl(t):
- Loss (Reno)
- Queuing delay (Vegas)

11 TCP Congestion Control
- End-to-end control (no network assistance)
- Assumes that long delays (packet loss) are due to congestion

12 Congestion Control II
- TCP uses slow start and additive increase/multiplicative decrease (AIMD) to deal with congestion.
- Van Jacobson outlined these ideas in 1988.
- Slow start, roughly: whenever starting traffic or recovering from congestion, start cwnd at the size of a single segment and increase it (up to a point) as ACKs show up.

13 AIMD (Additive Increase / Multiplicative Decrease)
- CongestionWindow (cwnd) is a variable held by the TCP source for each connection.
- cwnd is set based on the perceived level of congestion. The host receives implicit (packet drop) or explicit (packet mark) indications of internal congestion.
MaxWindow = min(CongestionWindow, AdvertisedWindow)
EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)
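The two window formulas can be sketched in a few lines. This is only an illustration: the function name and the clamp at zero are assumptions added here, and all quantities are taken to be in bytes.

```python
def effective_window(cwnd: int, advertised_window: int,
                     last_byte_sent: int, last_byte_acked: int) -> int:
    """Bytes the sender may still transmit right now."""
    # MaxWindow = min(CongestionWindow, AdvertisedWindow)
    max_window = min(cwnd, advertised_window)
    # Data sent but not yet acknowledged
    in_flight = last_byte_sent - last_byte_acked
    # Assumption: clamp at zero so a shrinking window never yields a negative value
    return max(0, max_window - in_flight)
```

For example, with cwnd = 8000, an advertised window of 16000, and 3000 bytes in flight, the sender may transmit 5000 more bytes.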

14 Additive Increase
- Additive increase is a reaction to perceived available capacity.
- Linear increase, basic idea: for each "cwnd's worth" of packets sent, increase cwnd by 1 packet.
- In practice, cwnd (in bytes) is incremented fractionally for each arriving ACK:
increment = MSS x (MSS / cwnd)
cwnd = cwnd + increment
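A minimal sketch of the per-ACK increment, assuming cwnd is kept in bytes so that each ACK adds MSS x (MSS / cwnd); with cwnd counted in packets the same step is 1/cwnd. The MSS value and function names are illustrative assumptions.

```python
MSS = 1460  # assumed segment size in bytes (illustrative)

def on_ack_additive_increase(cwnd: float, mss: int = MSS) -> float:
    """Grow cwnd (bytes) by a fraction of one segment per ACK."""
    return cwnd + mss * (mss / cwnd)

def one_rtt_of_acks(cwnd: float, mss: int = MSS) -> float:
    """Apply the per-ACK step once for every full segment in the window."""
    acks = int(cwnd // mss)
    for _ in range(acks):
        cwnd = on_ack_additive_increase(cwnd, mss)
    return cwnd
```

Starting from 10 segments, one RTT of ACKs grows the window by slightly less than one segment, because each increment is computed against an already larger cwnd.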

15 Additive Increase
[Figure: Source/Destination timeline; one packet is added each RTT]

16 Multiplicative Decrease
- The key assumption is that a dropped packet and the resultant timeout are due to congestion at a router or a switch.
- Multiplicative decrease: TCP reacts to a timeout by halving cwnd.
- cwnd is not allowed to fall below the size of a single packet.

17 AIMD: Some Notes
- It has been shown that AIMD is a necessary condition for TCP congestion control to be stable.
- Because the simple CC mechanism involves timeouts that cause retransmissions, it is important that hosts have an accurate timeout mechanism.
- Timeouts are set as a function of the average RTT and the standard deviation of the RTT.

18 Typical TCP Congestion window Evolution

19 AIMD: Two users, One link
[Figure: rate of User 2 plotted against rate of User 1, with the fairness line and the BW limit line]
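The two-user picture can be explored numerically. The sketch below is a toy model, not taken from the slides: it assumes both users react in lockstep, add one unit per step, and both halve whenever their combined rate exceeds the link capacity. Under those assumptions the gap between the two rates is untouched by additive steps and halved by every multiplicative step, so the rates drift toward the fair share.

```python
def simulate_aimd(x1: float, x2: float, capacity: float, steps: int):
    """Synchronous two-user AIMD on one link (illustrative toy model)."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Multiplicative decrease: both users halve their rate
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both users add one unit
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2
```

Calling `simulate_aimd(1, 50, 60, 200)` starts from a very unfair split and ends with the two rates nearly equal.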

20 Slow Start
- Linear additive increase takes too long to ramp up a new TCP connection from cold start.
- Beginning with TCP Tahoe, the slow start mechanism was added to provide an initial exponential increase in the size of cwnd.

21 Slow Start
1- The source starts with cwnd = 1.
2- Every time an ACK arrives, cwnd is incremented by one packet.
cwnd is effectively doubled per RTT "epoch".
Two slow start situations:
- At the very beginning of a connection (cold start).
- When the connection goes dead waiting for a timeout to occur (i.e., the advertised window goes to zero!).
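A short sketch of why one increment per ACK doubles cwnd each RTT. It assumes an idealized round in which every packet of the current window is acknowledged and nothing is lost; the function name is illustrative.

```python
def slow_start_trace(rtts: int, cwnd: int = 1) -> list:
    """Return cwnd (in packets) at the start of each RTT epoch."""
    trace = [cwnd]
    for _epoch in range(rtts):
        acks = cwnd              # one ACK per packet sent in this epoch
        for _ack in range(acks):
            cwnd += 1            # each ACK adds one packet to cwnd
        trace.append(cwnd)
    return trace
```

With `slow_start_trace(4)` the window goes 1, 2, 4, 8, 16.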

22 Slow Start
[Figure: Source/Destination timeline; one packet is added per ACK]

23 Fast Retransmit
- Basic idea: use duplicate ACKs to signal a lost packet.
- Upon receipt of three duplicate ACKs, the TCP sender retransmits the lost packet.

24 Fast Retransmit
- Generally, fast retransmit eliminates about half of the timeouts.
- This yields roughly a 20% improvement in throughput.
- Note: fast retransmit does not eliminate all the timeouts, because of small window sizes at the source.

25 Fast Retransmit
[Figure: fast retransmit based on three duplicate ACKs]

26 TCP Congestion Window Trace

27 Fast Recovery
- Fast recovery was added with TCP Reno.
- In congestion avoidance mode, if duplicate ACKs are received, reduce cwnd to half.
- If n successive duplicate ACKs are received, we know that the receiver got n segments after the lost segment: advance cwnd by that number.

28 Adaptive Retransmissions
- RTT: round trip time between a pair of hosts on the Internet.
- How to set the TimeOut value? The timeout value is set as a function of the expected RTT.
- Consequences of a bad choice?

29 Original Algorithm
- Keep a running average of RTT and compute TimeOut as a function of this RTT.
- Send packet and keep timestamp ts.
- When ACK arrives, record timestamp ta.
SampleRTT = ta - ts

30 Original Algorithm
Compute a weighted average:
EstimatedRTT = α x EstimatedRTT + (1 - α) x SampleRTT
Original TCP spec: α in the range 0.8 to 0.9
TimeOut = 2 x EstimatedRTT
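As a worked sketch, the estimator is a weighted average of the old estimate and the new sample, and the timeout is twice the estimate. The default α = 0.875 is an assumption picked from inside the 0.8 to 0.9 range; the function names are illustrative.

```python
def update_estimated_rtt(estimated_rtt: float, sample_rtt: float,
                         alpha: float = 0.875) -> float:
    """Weighted average: alpha weights the history, (1 - alpha) the new sample."""
    return alpha * estimated_rtt + (1 - alpha) * sample_rtt

def timeout_from_estimate(estimated_rtt: float) -> float:
    """Original rule: TimeOut = 2 x EstimatedRTT."""
    return 2 * estimated_rtt
```

With an estimate of 100 ms and a new sample of 200 ms, the estimate moves to 112.5 ms and the timeout to 225 ms, which shows how slowly a single outlier shifts the average.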

31 Karn/Partridge Algorithm
An obvious flaw in the original algorithm: whenever there is a retransmission, it is impossible to know whether to associate the ACK with the original packet or the retransmitted packet.

32 Associating the ACK?

33 Karn/Partridge Algorithm
- Do not measure SampleRTT for a packet that is sent more than once.
- For each retransmission, set TimeOut to double the last TimeOut.
(Note: this is a form of exponential backoff based on the belief that the lost packet is due to congestion.)
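The two rules can be sketched as a small timer object. This is an illustration under assumptions: the class and method names are hypothetical, the estimator is the weighted average from the original algorithm with an assumed α = 0.875, and a fresh unambiguous sample resets TimeOut to 2 x EstimatedRTT.

```python
class KarnPartridgeTimer:
    """Toy retransmission timer following the two Karn/Partridge rules."""

    def __init__(self, initial_rtt: float = 1.0, alpha: float = 0.875):
        self.alpha = alpha
        self.estimated_rtt = initial_rtt
        self.timeout = 2 * initial_rtt

    def on_ack(self, sample_rtt: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            # Rule 1: the sample is ambiguous, so it is not used
            return
        self.estimated_rtt = (self.alpha * self.estimated_rtt
                              + (1 - self.alpha) * sample_rtt)
        self.timeout = 2 * self.estimated_rtt

    def on_retransmit(self) -> None:
        # Rule 2: exponential backoff, double the last TimeOut
        self.timeout *= 2
```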

34 Jacobson/Karels Algorithm
The problem with the original algorithm is that it did not take into account the variance of SampleRTT.
Difference = SampleRTT - EstimatedRTT
EstimatedRTT = EstimatedRTT + (δ x Difference)
Deviation = Deviation + δ x (|Difference| - Deviation)
where δ is a fraction between 0 and 1.

35 Jacobson/Karels Algorithm
TCP computes the timeout using both the mean and the variance of RTT:
TimeOut = µ x EstimatedRTT + Φ x Deviation
where, based on experience, µ = 1 and Φ = 4.
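A minimal sketch of one update step, in which both the estimate and the deviation are running averages nudged by a fraction δ of their error. The value δ = 0.125 is an assumption (the slides only say it lies between 0 and 1); µ = 1 and Φ = 4 come from the slide.

```python
def jacobson_karels_step(estimated_rtt: float, deviation: float,
                         sample_rtt: float, delta: float = 0.125,
                         mu: float = 1.0, phi: float = 4.0):
    """Return the updated (EstimatedRTT, Deviation, TimeOut)."""
    difference = sample_rtt - estimated_rtt
    estimated_rtt = estimated_rtt + delta * difference
    # Deviation tracks the mean absolute error of the estimate
    deviation = deviation + delta * (abs(difference) - deviation)
    timeout = mu * estimated_rtt + phi * deviation
    return estimated_rtt, deviation, timeout
```

For an estimate of 100, a deviation of 10, and a sample of 140, the step gives an estimate of 105, a deviation of 13.75, and a timeout of 160. A noisy path therefore widens the timeout much more than the mean alone would.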

36 Algorithms

37 Early TCP
Pre-1988
- Go-back-N ARQ
  - Detects loss from timeout
  - Retransmits from lost packet onward
- Receiver window flow control
  - Prevents overflows at the receive buffer
- Flow control: self-clocking

38 Why Flow Control?
- October 1986: the Internet had its first congestion collapse
- Link from LBL to UC Berkeley: 400 yards, 3 hops, 32 Kbps
- Throughput dropped to 40 bps, a factor of ~1000 drop!
- 1988: Van Jacobson proposed TCP flow control

39 Effect of Congestion
- Packet loss
- Retransmission
- Reduced throughput
- Congestion collapse due to
  - Unnecessarily retransmitted packets
  - Undelivered or unusable packets
- Congestion may continue after the overload!
[Figure: throughput versus load]

40 Window Flow Control
[Figure: Source sends packets 1, 2, ..., W in each window along a time axis; Destination returns ACKs]
- ~ W packets per RTT
- Lost packet detected by missing ACK

41 Window flow control
- Limit the number of packets in the network to window W
- Source rate = W x MSS / RTT bps
- If W is too small, then rate « capacity
- If W is too big, then rate > capacity => congestion
- Adapt W to network (and conditions): W = BW x RTT
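A small worked sketch of the window/rate relationship. It assumes the usual relation rate = W x MSS / RTT, with MSS given in bytes and converted to bits, and inverts W = BW x RTT to get the window that just fills the link. Function names and units are illustrative assumptions.

```python
def source_rate_bps(window_packets: float, mss_bytes: int,
                    rtt_seconds: float) -> float:
    """Sending rate in bits per second for a window of W packets per RTT."""
    return window_packets * mss_bytes * 8 / rtt_seconds

def bandwidth_delay_window(bandwidth_bps: float, rtt_seconds: float,
                           mss_bytes: int) -> float:
    """Window (in packets) that matches the bandwidth-delay product."""
    return bandwidth_bps * rtt_seconds / (mss_bytes * 8)
```

For example, 10 packets of 1250 bytes per 0.5 s RTT give 200 kb/s, while a 1 Mb/s path with the same RTT needs a window of 50 such packets to stay full.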

42 Congestion Control
TCP seeks to
- Achieve high utilization
- Avoid congestion
- Share bandwidth
Window flow control
- Source rate = W / RTT packets/sec
- Adapt W to network (and conditions): W = BW x RTT

43 TCP Window Flow Controls
Receiver flow control
- Avoid overloading receiver
- Set by receiver
- awnd: receiver (advertised) window
Network flow control
- Avoid overloading network
- Set by sender
- Infer available network capacity
- cwnd: congestion window
Set W = min(cwnd, awnd)

44 Receiver Flow Control
- Receiver advertises awnd with each ACK
- Window awnd
  - closed when data is received and ack'd
  - opened when data is read
- Size of awnd can be the performance limit (e.g. on a LAN)
  - sensible default ~16 kB

45 Network Flow Control
- Source calculates cwnd from indication of network congestion
- Congestion indications: losses, delay, marks
- Algorithms to calculate cwnd: Tahoe, Reno, Vegas, RED, REM …

46 TCP Congestion Controls
- Tahoe (Jacobson 1988): Slow Start, Congestion Avoidance, Fast Retransmit
- Reno (Jacobson 1990): Fast Recovery
- Vegas (Brakmo & Peterson 1994): New Congestion Avoidance
- RED (Floyd & Jacobson 1993): Probabilistic marking
- REM (Athuraliya & Low 2000): Clear buffer, match rate

47 Variants
Tahoe & Reno
- NewReno
- SACK
- Rate-halving
- Mods for high performance
AQM
- RED, ARED, FRED, SRED
- BLUE, SFB
- REM, PI, AVQ

48 TCP Tahoe (Jacobson 1988)
[Figure: window versus time, with SS and CA phases]
SS: Slow Start
CA: Congestion Avoidance

49 Slow Start
- Start with cwnd = 1 (slow start)
- On each successful ACK, increment cwnd: cwnd ← cwnd + 1
- Exponential growth of cwnd each RTT: cwnd ← 2 x cwnd
- Enter CA when cwnd >= ssthresh

50 Slow Start
[Figure: sender/receiver timeline of data packets and ACKs, one RTT per round, with cwnd growing from 1 to 8]
cwnd ← cwnd + 1 (for each ACK)

51 Congestion Avoidance
- Starts when cwnd >= ssthresh
- On each successful ACK: cwnd ← cwnd + 1/cwnd
- Linear growth of cwnd each RTT: cwnd ← cwnd + 1

52 Congestion Avoidance
[Figure: sender/receiver timeline of data packets and ACKs, with cwnd growing from 1 to 4, one packet per RTT]
cwnd ← cwnd + 1 (for each cwnd ACKs)

53 Packet Loss
- Assumption: loss indicates congestion
- Packet loss detected by
  - Retransmission TimeOuts (RTO timer)
  - Duplicate ACKs (at least 3)
[Figure: packets 1 to 7 and the corresponding acknowledgements]

54 Fast Retransmit
- Waiting for a timeout takes quite long
- Immediately retransmits after 3 dupACKs without waiting for timeout
- Adjusts ssthresh:
  flightsize = min(awnd, cwnd)
  ssthresh ← max(flightsize/2, 2)
- Enter Slow Start (cwnd = 1)

55 Successive Timeouts
- When there is a timeout, double the RTO
- Keep doing so for each lost retransmission
  - Exponential back-off
  - Max 64 seconds [1]
  - Max 12 retransmits [1]
[1] Net/3 BSD
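The back-off rule can be sketched as a schedule of timer values. The 64-second cap and the limit of 12 retransmits are the figures quoted on the slide for Net/3 BSD; the function itself is an illustrative assumption and does not reproduce the BSD implementation.

```python
def backoff_schedule(initial_rto: float, max_rto: float = 64.0,
                     max_retransmits: int = 12) -> list:
    """RTO used for each successive retransmission of the same packet."""
    schedule = []
    rto = initial_rto
    for _ in range(max_retransmits):
        schedule.append(rto)
        # Double the timer after every lost retransmission, up to the cap
        rto = min(2 * rto, max_rto)
    return schedule
```

Starting from a 1-second RTO, the timer reaches the cap on the seventh attempt and stays there until the sender gives up.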

56 Summary: Tahoe
Basic ideas
- Gently probe network for spare capacity
- Drastically reduce rate on congestion
- Windowing: self-clocking
- Other functions: round trip time estimation, error recovery

for every ACK {
    if (W < ssthresh) then W++      (SS)
    else W += 1/W                   (CA)
}
for every loss {
    ssthresh = W/2
    W = 1
}
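The Tahoe pseudocode translates almost directly into a per-event update. This is a sketch with the window counted in packets; the function name is hypothetical, and the lower bound of 2 on ssthresh is an assumption borrowed from the ssthresh rule on the Fast Retransmit slide.

```python
def tahoe_step(w: float, ssthresh: float, loss: bool):
    """Return the new (W, ssthresh) after one ACK or one loss event."""
    if loss:
        # Drastic reduction: remember half the window, restart from 1
        return 1.0, max(w / 2, 2.0)
    if w < ssthresh:
        return w + 1, ssthresh        # slow start (SS)
    return w + 1 / w, ssthresh        # congestion avoidance (CA)
```

For instance, an ACK at W = 4 with ssthresh = 8 gives W = 5, an ACK at W = 8 gives W = 8.125, and a loss at W = 16 resets W to 1 with ssthresh = 8.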

57 TCP Tahoe

58 TCP Reno (Jacobson 1990)
[Figure: window versus time, with slow start (SS), congestion avoidance (CA), and fast retransmission/fast recovery phases]

59 Fast recovery
- Motivation: prevent the "pipe" from emptying after fast retransmit
- Idea: each dupACK represents a packet having left the pipe (successfully received)
- Enter FR/FR after 3 dupACKs
  - Set ssthresh ← max(flightsize/2, 2)
  - Retransmit lost packet
  - Set cwnd ← ssthresh + ndup (window inflation)
  - Wait till W = min(awnd, cwnd) is large enough; transmit new packet(s)
  - On non-dup ACK (1 RTT later), set cwnd ← ssthresh (window deflation)
- Enter CA
After FR/FR, when CA is entered, cwnd is half of the window at the time the loss was detected. So the effect of a loss is halving the window.
[Source: RFC 2581; Fall & Floyd, "Simulation-based Comparisons of Tahoe, Reno, and SACK TCP"]
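The FR/FR steps can be sketched as a small state machine. This is an illustration, not a full Reno sender: the class and method names are hypothetical, windows are counted in packets, and normal congestion-avoidance growth on new ACKs is deliberately left out.

```python
from dataclasses import dataclass

@dataclass
class RenoSender:
    cwnd: float
    ssthresh: float = 64.0
    dup_acks: int = 0
    in_fast_recovery: bool = False

    def on_dup_ack(self, flightsize: float) -> bool:
        """Handle one dupACK; return True if the lost packet should be retransmitted."""
        self.dup_acks += 1
        if self.in_fast_recovery:
            self.cwnd += 1                      # window inflation: one more packet left the pipe
            return False
        if self.dup_acks == 3:
            self.ssthresh = max(flightsize / 2, 2)
            self.cwnd = self.ssthresh + self.dup_acks
            self.in_fast_recovery = True
            return True                         # fast retransmit
        return False

    def on_new_ack(self) -> None:
        """Handle a non-duplicate ACK (growth in CA is omitted in this sketch)."""
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh           # window deflation, continue in CA
            self.in_fast_recovery = False
        self.dup_acks = 0
```

With cwnd = 8 and a flightsize of 8, the third dupACK triggers the retransmit and sets cwnd to 4 + 3 = 7; the next new ACK deflates cwnd to 4, half the window at the time of the loss.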

60 Example: FR/FR
[Figure: packet and ACK timelines for a FR/FR episode, annotated with cwnd, ssthresh, and the point where FR/FR is exited]
- Fast retransmit: retransmit on 3 dupACKs
- Fast recovery: inflate window while repairing loss to fill pipe

61 Summary: Reno
Basic ideas
- Fast recovery avoids slow start
- dupACKs: fast retransmit + fast recovery
- Timeout: retransmit + slow start
[Figure: state diagram; congestion avoidance moves to FR/FR on dupACKs, and to slow start with a retransmit on timeout]

62 NewReno: Motivation
[Figure: sender (S) and receiver (D) timelines for a window with multiple losses, ending in a timeout]
- On 3 dupACKs, receiver has packets 2, 4, 6, 8; cwnd = 8; sender retransmits pkt 1 and enters FR/FR
- Next dupACK increments cwnd to 9
- After an RTT, ACK arrives for pkts 1 & 2; exit FR/FR; cwnd = 5; 8 unack'ed pkts
- No more ACKs; sender must wait for timeout
Example: cwnd = 10. The sender sends packets 1, 2, …, 10. Packets 1, 3, …, 9 are lost; packets 2, 4, …, 10 are received. When 3 dupACKs have been received, the receiver has received (at least) packets 2, 4, 6, 8. The sender retransmits packet 1 and waits until the dupACK caused by the arrival of packet 10 has arrived, and then until the ACK caused by the retransmitted packet 1 has arrived, acknowledging packets 1 and 2. This last ACK takes Reno out of fast recovery, with cwnd = 5. There are now 8 outstanding packets: 3, 4, …, 10. So the sender cannot transmit any packet. Note that the sender will not receive any more dupACKs, since the window has been exhausted. It must wait until the timer expires for packet 3, then retransmit and go to slow start.

63 NewReno
Fall & Floyd '96 (RFC 2582)
- Motivation: multiple losses within a window
  - Partial ACK acknowledges some but not all packets outstanding at start of FR
  - Partial ACK takes Reno out of FR, deflates window
  - Sender may have to wait for timeout before proceeding
- Idea: partial ACK indicates lost packets
  - Stays in FR/FR and retransmits immediately
  - Retransmits 1 lost packet per RTT until all lost packets from that window are retransmitted
  - Eliminates timeout

64 SACK
Mathis, Mahdavi, Floyd, Romanow '96 (RFC 2018, RFC 2883)
- Motivation: Reno & NewReno retransmit at most 1 lost packet per RTT
  - Pipe can be emptied during FR/FR with multiple losses
- Idea: SACK provides better estimate of packets in pipe
  - SACK TCP option describes received packets
  - On 3 dupACKs: retransmits, halves window, enters FR
  - Updates pipe = packets in pipe
    - Increment when lost or new packets sent
    - Decrement when dupACK received
  - Transmits a (lost or new) packet when pipe < cwnd
  - Exit FR when all packets outstanding when FR was entered are acknowledged
[Sources: M. Mathis, J. Mahdavi, S. Floyd and A. Romanow, "TCP Selective Acknowledgement Options", RFC 2018, Oct. 1996; K. Fall and S. Floyd, "Simulation-based comparisons of Tahoe, Reno and SACK TCP", Computer Communication Review, July 1996]

65 TCP Vegas (Brakmo & Peterson 1994)
[Figure: window versus time, with slow start (SS) and congestion avoidance (CA) phases]
- Reno with a new congestion avoidance algorithm
- Converges (provided buffer is large)!

66 Congestion avoidance
- Each source estimates number of its own packets in pipe from RTT
- Adjusts window to maintain estimate between αd and βd

for every RTT {
    if W/RTTmin - W/RTT < α then W++
    if W/RTTmin - W/RTT > β then W--
}
for every loss
    W := W/2
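The per-RTT Vegas rule can be sketched as a single function. This is an illustration under assumptions: the two thresholds are passed in as `alpha` and `beta` in the same units as W/RTT (packets per second), RTTs are in seconds, and the function name is hypothetical.

```python
def vegas_update(w: float, rtt: float, rtt_min: float,
                 alpha: float, beta: float) -> float:
    """Adjust the window once per RTT from the gap between expected and actual rate."""
    expected = w / rtt_min      # rate if no packets were queued
    actual = w / rtt            # rate actually achieved
    diff = expected - actual
    if diff < alpha:
        return w + 1            # too little queued: probe for more
    if diff > beta:
        return w - 1            # too much queued: back off
    return w                    # within the target band: hold
```

With W = 10 and RTTmin = 0.1 s, an RTT equal to RTTmin gives a zero gap and the window grows, while an RTT of 0.2 s gives a gap of 50 packets/s and, for thresholds of 20 and 40, the window shrinks.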

67 Implications
- Congestion measure = end-to-end queueing delay
- At equilibrium
  - Zero loss
  - Stable window at full utilization
  - Approximately weighted proportional fairness
  - Nonzero queue, larger for more sources
- Convergence to equilibrium
  - Converges if sufficient network buffer
  - Oscillates like Reno otherwise

68 Wireless TCP
- Reno uses loss as congestion measure
- In wireless, significant losses are due to
  - Fading
  - Interference
  - Handover
  - Not buffer overflow (congestion)
- Halving window too drastic
  - Small throughput, low utilization

69 Proposed solutions
Ideas
- Hide from source noncongestion losses
- Inform source of noncongestion losses
Approaches
- Link layer error control
- Split TCP
- Snoop agent
- SACK+ELN (Explicit Loss Notification)
Sources: Balakrishnan, Padmanabhan, Seshan and Katz, "A comparison of mechanisms for improving TCP performance over wireless links", ToN, 5(6), Dec 1997

70 Third approach
- Problem
  - Reno uses loss as congestion measure
  - Two types of losses
    - Congestion loss: retransmit + reduce window
    - Noncongestion loss: retransmit
- Previous approaches
  - Hide noncongestion losses
  - Indicate noncongestion losses
- Our approach
  - Eliminates congestion losses (buffer overflows)

71 Third approach
- Router: REM capable
- Host: do not use loss as congestion measure (Vegas, REM)
- Idea
  - REM clears buffer
  - Only noncongestion losses
  - Retransmits lost packets without reducing window

72 Performance Goodput

