Advanced Computer Networking Internet Congestion Control

Principles of Congestion Control. Informally: “too many sources sending too much data too fast for network to handle”. Manifestations: lost packets (buffer overflow at routers) and long delays (queuing in router buffers). A highly important problem! [Figure: hosts H1, H2, H3 and router R1; labels A1(t) 10 Mb/s, A2(t) 100 Mb/s, D(t) 1.5 Mb/s.] behnam shafagaty

Causes/costs of congestion, scenario 1: two senders, two receivers; one router with infinite buffers; no retransmission.

Causes/costs of congestion, scenario 1: throughput increases with load. Maximum total load is C (each session gets C/2). Delays become large when congested. The load is stochastic.

Causes/costs of congestion, scenario 2: one router with finite buffers; the sender retransmits lost packets.

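Causes/costs of congestion, scenario 2 (with $\lambda_{in}$ the original data rate and $\lambda'_{in}$ the offered load including retransmissions). Always: $\lambda_{in} = \lambda_{out}$ (goodput); we would like to maximize goodput! “Perfect” retransmission, only when there is a loss: $\lambda'_{in} > \lambda_{out}$. Actual retransmission of a delayed (not lost) packet makes $\lambda'_{in}$ larger (than in the perfect case) for the same $\lambda_{out}$.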

Causes/costs of congestion, scenario 2. [Figure: $\lambda_{out}$ plotted against $\lambda'_{in}$.] “Costs” of congestion: more work (retransmissions) for a given “goodput”; unneeded retransmissions mean a link carries (and delivers) multiple copies of a packet.

[Figure: packet delay and throughput as functions of load.]

Congestion Control: congestion control involves two tasks: detect congestion, and limit the sending rate.

TCP & AQM. [Figure: TCP sources (Reno, Vegas) with rates xi(t) interacting with AQM at the links (DropTail, RED, REM, PI, AVQ) through a congestion measure pl(t).] Example congestion measures pl(t): loss (Reno), queuing delay (Vegas).

TCP Congestion Control: end-to-end control (no network assistance). Assumes that a long delay (taken as packet loss) is due to congestion.

Congestion Control II: TCP uses slow start and additive increase/multiplicative decrease (AIMD) to deal with congestion. Van Jacobson outlined these ideas in 1988. Slow start, roughly: whenever starting traffic or recovering from congestion, start cwnd at the size of a single segment and increase it (up to a point) as ACKs show up.

AIMD (Additive Increase / Multiplicative Decrease): CongestionWindow (cwnd) is a variable held by the TCP source for each connection. cwnd is set based on the perceived level of congestion. The host receives implicit (packet drop) or explicit (packet mark) indications of internal congestion. MaxWindow = min(CongestionWindow, AdvertisedWindow). EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked).
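As a rough illustration of the window arithmetic on this slide, the following Python sketch computes the effective window from the four quantities named above. The function name, the byte values in the example, and the clamp at zero are assumptions made for illustration; they are not taken from the slides.

```python
def effective_window(cwnd, advertised_window, last_byte_sent, last_byte_acked):
    """Bytes the sender may still transmit without exceeding either window."""
    max_window = min(cwnd, advertised_window)
    in_flight = last_byte_sent - last_byte_acked
    # Assumption: never report a negative allowance; the sender simply waits.
    return max(0, max_window - in_flight)


# cwnd is the binding limit here: 8000 allowed, 3000 already in flight.
print(effective_window(8000, 16000, 12000, 9000))
```

When the result is zero the sender has to wait for further ACKs before sending more data.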

Additive Increase: additive increase is a reaction to perceived available capacity. Linear increase, basic idea: for each “cwnd’s worth” of packets sent, increase cwnd by 1 packet. In practice, cwnd is incremented fractionally for each arriving ACK. With cwnd counted in bytes: increment = MSS x (MSS / cwnd); cwnd = cwnd + increment.
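The per-ACK increment can be checked with a small Python sketch. It assumes cwnd is counted in bytes, so each ACK adds MSS x (MSS / cwnd) bytes (equivalently 1/cwnd packets when cwnd is counted in packets). The MSS value and the function name are hypothetical.

```python
MSS = 1000  # bytes; assumed value for illustration


def on_ack_additive_increase(cwnd_bytes):
    """Grow cwnd by a fraction of one MSS for a single arriving ACK."""
    increment = MSS * (MSS / cwnd_bytes)
    return cwnd_bytes + increment


cwnd = 4 * MSS
# One window's worth of ACKs (4 packets) arrives during one RTT.
for _ in range(4):
    cwnd = on_ack_additive_increase(cwnd)
print(cwnd / MSS)  # close to, but slightly below, 5 packets
```

The growth is slightly less than one full packet per RTT because the increment shrinks as cwnd grows within the round.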

Additive Increase. [Figure: source/destination timeline; add one packet each RTT.]

Multiplicative Decrease: the key assumption is that a dropped packet and the resultant timeout are due to congestion at a router or a switch. TCP reacts to a timeout by halving cwnd. cwnd is not allowed to fall below the size of a single packet.

AIMD, some notes: it has been shown that AIMD is a necessary condition for TCP congestion control to be stable. Because the simple congestion control (CC) mechanism involves timeouts that cause retransmissions, it is important that hosts have an accurate timeout mechanism. Timeouts are set as a function of the average RTT and the standard deviation of RTT.

[Figure: typical TCP congestion window evolution.]

AIMD: two users, one link. [Figure: rate of user 2 plotted against rate of user 1, with the fairness line and the bandwidth (BW) limit line.]
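The two-user picture can be reproduced numerically. The Python sketch below is a simplified, synchronized model in which both users see congestion in the same round; the function name, starting rates, and link capacity are hypothetical values chosen only for illustration.

```python
def aimd_two_users(x1, x2, capacity, rounds, add=1.0, mult=0.5):
    """Apply AIMD to two rates sharing one link; return the final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Congestion: both users cut their rate multiplicatively.
            x1, x2 = x1 * mult, x2 * mult
        else:
            # Spare capacity: both users add the same fixed amount.
            x1, x2 = x1 + add, x2 + add
    return x1, x2


x1, x2 = aimd_two_users(1.0, 40.0, capacity=50.0, rounds=500)
print(x1, x2, abs(x1 - x2))
```

Additive increase leaves the gap between the two rates unchanged, while every multiplicative decrease halves it, which is why the rates drift toward the equal-share line.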

Slow Start: linear additive increase takes too long to ramp up a new TCP connection from a cold start. Beginning with TCP Tahoe, the slow start mechanism was added to provide an initial exponential increase in the size of cwnd.

Slow Start: (1) The source starts with cwnd = 1. (2) Every time an ACK arrives, cwnd is incremented, so cwnd is effectively doubled per RTT “epoch”. Two slow start situations: at the very beginning of a connection (cold start), and when the connection goes dead waiting for a timeout to occur (i.e., the advertised window goes to zero!).
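The doubling per RTT follows directly from the per-ACK rule, as the following Python sketch shows. It assumes every segment sent in a round is acknowledged and uses a hypothetical threshold argument (`ssthresh`) to decide when to stop; neither detail comes from this slide.

```python
def slow_start_rounds(ssthresh, cwnd=1):
    """Return cwnd (in packets) at the start of each RTT during slow start."""
    history = [cwnd]
    while cwnd < ssthresh:
        # One ACK arrives for each packet sent this RTT; each ACK adds 1.
        # range(cwnd) is evaluated once, before the loop body changes cwnd.
        for _ in range(cwnd):
            cwnd += 1
        history.append(cwnd)
    return history


print(slow_start_rounds(16))
```

Because the check happens once per round in this sketch, cwnd can overshoot a threshold that is not a power of two.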

Slow Start. [Figure: source/destination timeline; add one packet per ACK.]

Fast Retransmit, basic idea: use duplicate ACKs to signal a lost packet. Upon receipt of three duplicate ACKs, the TCP sender retransmits the lost packet.
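A minimal Python sketch of the duplicate-ACK counter follows. It assumes cumulative ACK numbers arrive as a plain list and reports the ACK value whose third duplicate would trigger a retransmission; the function name and the sample ACK stream are hypothetical.

```python
def fast_retransmit_events(acks, threshold=3):
    """Return the ACK values whose duplicates reach the threshold."""
    retransmits = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, when the third duplicate arrives.
            if dup_count == threshold:
                retransmits.append(ack)
        else:
            # A new cumulative ACK resets the duplicate counter.
            last_ack, dup_count = ack, 0
    return retransmits


# ACK 3 arrives once and is then duplicated three times.
print(fast_retransmit_events([1, 2, 3, 3, 3, 3, 7]))
```

With fewer than three duplicates nothing is reported, and the sender would fall back to its retransmission timer.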

Fast Retransmit: generally, fast retransmit eliminates about half of the timeouts. This yields roughly a 20% improvement in throughput. Note: fast retransmit cannot eliminate every timeout; with a small window at the source, too few duplicate ACKs arrive to trigger it.

Fast Retransmit. [Figure: retransmission based on three duplicate ACKs.]

[Figure: TCP congestion window trace.]

Fast Recovery: fast recovery was added with TCP Reno. In congestion avoidance mode, if duplicate ACKs are received, reduce cwnd to half. If n successive duplicate ACKs are received, we know that the receiver got n segments after the lost segment: advance cwnd by that number.

Adaptive Retransmissions: RTT is the round trip time between a pair of hosts on the Internet. How to set the TimeOut value? The timeout value is set as a function of the expected RTT. Consequences of a bad choice?

Original Algorithm: keep a running average of RTT and compute TimeOut as a function of this RTT. Send a packet and keep timestamp ts. When the ACK arrives, record timestamp ta. SampleRTT = ta - ts.

Original Algorithm: compute a weighted average: EstimatedRTT = α x EstimatedRTT + (1 - α) x SampleRTT. Original TCP spec: α in the range (0.8, 0.9). TimeOut = 2 x EstimatedRTT.
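The weighted average can be sketched in a few lines of Python. The choice α = 0.875 (inside the stated range), the seeding of the estimate with the first sample, and the function name are assumptions made for this example.

```python
def original_timeout(samples, alpha=0.875):
    """Return (EstimatedRTT, TimeOut) after processing all RTT samples."""
    # Assumption: seed the estimate with the first measurement.
    estimated_rtt = samples[0]
    for sample_rtt in samples[1:]:
        estimated_rtt = alpha * estimated_rtt + (1 - alpha) * sample_rtt
    return estimated_rtt, 2 * estimated_rtt


# RTT samples in milliseconds; one slow sample barely moves the estimate.
print(original_timeout([100.0, 100.0, 180.0]))
```

A large α makes the estimate smooth but slow to react, which is one reason the fixed factor of 2 can be a poor fit when RTT varies widely.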

Karn/Partridge Algorithm: an obvious flaw in the original algorithm is that, whenever there is a retransmission, it is impossible to know whether to associate the ACK with the original packet or the retransmitted packet.

[Figure: associating the ACK?]

Karn/Partridge Algorithm: do not measure SampleRTT when a packet has been sent more than once. For each retransmission, set TimeOut to double the last TimeOut. (Note: this is a form of exponential backoff based on the belief that the lost packet is due to congestion.)

Jacobson/Karels Algorithm: the problem with the original algorithm is that it did not take into account the variance of SampleRTT. Difference = SampleRTT - EstimatedRTT. EstimatedRTT = EstimatedRTT + (δ x Difference). Deviation = Deviation + δ x (|Difference| - Deviation), where δ is a fraction between 0 and 1.

Jacobson/Karels Algorithm: TCP computes the timeout using both the mean and the variance of RTT: TimeOut = µ x EstimatedRTT + Φ x Deviation, where, based on experience, µ = 1 and Φ = 4.
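The update rules can be exercised with a short Python sketch. It uses the running-average form of the deviation update (Deviation = Deviation + δ x (|Difference| - Deviation)), and assumes δ = 0.125, an initial deviation of zero, and an estimate seeded with the first sample; these values and the function name are illustrative choices, not taken from the slides.

```python
def jacobson_karels(samples, delta=0.125, mu=1.0, phi=4.0):
    """Return (EstimatedRTT, Deviation, TimeOut) after all RTT samples."""
    estimated_rtt = samples[0]  # assumption: seed with the first sample
    deviation = 0.0             # assumption: start with no measured spread
    for sample_rtt in samples[1:]:
        difference = sample_rtt - estimated_rtt
        estimated_rtt += delta * difference
        deviation += delta * (abs(difference) - deviation)
    timeout = mu * estimated_rtt + phi * deviation
    return estimated_rtt, deviation, timeout


# One 180 ms sample after a 100 ms baseline widens the timeout noticeably.
print(jacobson_karels([100.0, 180.0]))
```

With steady samples the deviation stays at zero and the timeout sits at the estimate itself; variable samples push the timeout up through the Φ x Deviation term.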

Algorithms

Early TCP (pre-1988). Go-back-N ARQ: detects loss from timeout; retransmits from the lost packet onward. Receiver window flow control: prevents overflows at the receive buffer. Flow control: self-clocking.

Why Flow Control? In October 1986, the Internet had its first congestion collapse. Link from LBL to UC Berkeley: 400 yards, 3 hops, 32 Kbps. Throughput dropped to 40 bps, a factor of ~1000 drop! In 1988, Van Jacobson proposed TCP flow control.

Effect of Congestion: packet loss; retransmission; reduced throughput. Congestion collapse is due to unnecessarily retransmitted packets and undelivered or unusable packets. Congestion may continue after the overload! [Figure: throughput versus load.]

Window Flow Control. [Figure: source and destination timelines of data packets 1, 2, ..., W and their ACKs; ~W packets per RTT.] A lost packet is detected by a missing ACK.

Window flow control: limit the number of packets in the network to window W. Source rate = W x MSS / RTT bps. If W is too small, then rate « capacity. If W is too big, then rate > capacity, which leads to congestion. Adapt W to the network (and conditions): W = BW x RTT.
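A worked example of W = BW x RTT, written as a Python sketch. The packet size of 1500 bytes, the link speed, and the RTT are assumed values, and both function names are hypothetical.

```python
def window_packets(bandwidth_bps, rtt_s, mss_bytes=1500):
    """Window (in packets) that keeps a path of this bandwidth and RTT full."""
    return bandwidth_bps * rtt_s / (8 * mss_bytes)


def source_rate_bps(window_pkts, rtt_s, mss_bytes=1500):
    """Rate achieved when window_pkts packets are sent every RTT."""
    return window_pkts * 8 * mss_bytes / rtt_s


w = window_packets(10e6, 0.1)      # 10 Mb/s path with a 100 ms RTT
print(w)                           # about 83 packets
print(source_rate_bps(w / 2, 0.1)) # half the window gives about half the rate
```

A window below this value leaves the link idle part of the time, while a larger one only adds queued packets.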

Congestion Control: TCP seeks to achieve high utilization, avoid congestion, and share bandwidth. Window flow control: source rate = W / RTT packets/sec. Adapt W to the network (and conditions): W = BW x RTT.

TCP Window Flow Controls. Receiver flow control: avoid overloading the receiver; set by the receiver; awnd: receiver (advertised) window. Network flow control: avoid overloading the network; set by the sender, which infers available network capacity; cwnd: congestion window. Set W = min(cwnd, awnd).

Receiver Flow Control: the receiver advertises awnd with each ACK. The window awnd is closed when data is received and ack’d, and opened when data is read. The size of awnd can be the performance limit (e.g. on a LAN); a sensible default is ~16 kB.

Network Flow Control: the source calculates cwnd from indications of network congestion. Congestion indications: losses, delay, marks. Algorithms to calculate cwnd: Tahoe, Reno, Vegas, RED, REM, …

TCP Congestion Controls. Tahoe (Jacobson 1988): slow start, congestion avoidance, fast retransmit. Reno (Jacobson 1990): fast recovery. Vegas (Brakmo & Peterson 1994): new congestion avoidance. RED (Floyd & Jacobson 1993): probabilistic marking. REM (Athuraliya & Low 2000): clear buffer, match rate.

Variants. Tahoe & Reno: NewReno, SACK, rate-halving, modifications for high performance. AQM: RED, ARED, FRED, SRED; BLUE, SFB; REM, PI, AVQ.

TCP Tahoe (Jacobson 1988). [Figure: window versus time, showing SS followed by CA.] SS: Slow Start. CA: Congestion Avoidance.

Slow Start: start with cwnd = 1 (slow start). On each successful ACK, increment cwnd: cwnd ← cwnd + 1. Exponential growth of cwnd each RTT: cwnd ← 2 x cwnd. Enter CA when cwnd >= ssthresh.

Slow Start. [Figure: sender/receiver timeline of data packets and ACKs over successive RTTs; cwnd ← cwnd + 1 for each ACK.]

Congestion Avoidance: starts when cwnd >= ssthresh. On each successful ACK: cwnd ← cwnd + 1/cwnd. Linear growth of cwnd each RTT: cwnd ← cwnd + 1.

Congestion Avoidance. [Figure: sender/receiver timeline of data packets and ACKs; cwnd ← cwnd + 1 for each cwnd ACKs.]

Packet Loss. Assumption: loss indicates congestion. Packet loss is detected by retransmission timeouts (RTO timer) or by duplicate ACKs (at least 3). [Figure: packets 1 to 7 and their acknowledgements.]

Fast Retransmit: waiting for a timeout takes quite long. Immediately retransmit after 3 dupACKs without waiting for the timeout. Adjust ssthresh: flightsize = min(awnd, cwnd); ssthresh ← max(flightsize/2, 2). Enter slow start (cwnd = 1).

Successive Timeouts: when there is a timeout, double the RTO. Keep doing so for each lost retransmission (exponential back-off). Max 64 seconds and max 12 retransmits (values from Net/3 BSD).
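The back-off schedule can be listed with a small Python sketch, using the 64-second cap and the limit of 12 retransmits quoted above. The initial RTO of one second and the function name are assumptions made for the example.

```python
def rto_schedule(initial_rto, max_rto=64.0, max_retransmits=12):
    """Return the RTO (seconds) used before each successive retransmission."""
    schedule = []
    rto = initial_rto
    for _ in range(max_retransmits):
        schedule.append(rto)
        # Double after every lost retransmission, but never beyond the cap.
        rto = min(2 * rto, max_rto)
    return schedule


print(rto_schedule(1.0))
```

After the cap is reached the sender keeps retrying at 64-second intervals until the retransmit limit is exhausted.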

Summary: Tahoe. Basic ideas: gently probe the network for spare capacity; drastically reduce the rate on congestion; windowing: self-clocking. Other functions: round trip time estimation, error recovery.
    for every ACK {
        if (W < ssthresh) then W++      (SS)
        else W += 1/W                   (CA)
    }
    for every loss {
        ssthresh = W/2
        W = 1
    }
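The Tahoe summary translates almost line for line into runnable Python. In this sketch the class name, the initial ssthresh, and the floor of 2 on ssthresh (borrowed from the max(flightsize/2, 2) rule on the fast retransmit slide) are assumptions; the window is counted in packets.

```python
class TahoeWindow:
    """Window evolution for a Tahoe-style sender, in packets."""

    def __init__(self, ssthresh=16.0):
        self.w = 1.0
        self.ssthresh = ssthresh

    def on_ack(self):
        if self.w < self.ssthresh:
            self.w += 1.0           # slow start: +1 per ACK
        else:
            self.w += 1.0 / self.w  # congestion avoidance: +1 per RTT

    def on_loss(self):
        # Remember half the window, then restart from one packet.
        self.ssthresh = max(self.w / 2.0, 2.0)
        self.w = 1.0


sender = TahoeWindow(ssthresh=4.0)
for _ in range(4):
    sender.on_ack()
print(sender.w)                  # slow start up to 4, then one CA step
sender.on_loss()
print(sender.w, sender.ssthresh)
```

Every loss sends the window back to one packet, which is the behaviour that Reno's fast recovery later avoids for losses detected by duplicate ACKs.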

[Figure: TCP Tahoe.]

TCP Reno (Jacobson 1990). [Figure: window versus time, showing SS, CA and fast retransmission/fast recovery.]

Fast recovery. Motivation: prevent the “pipe” from emptying after fast retransmit. Idea: each dupACK represents a packet having left the pipe (successfully received). Enter FR/FR after 3 dupACKs: set ssthresh ← max(flightsize/2, 2); retransmit the lost packet; set cwnd ← ssthresh + ndup (window inflation); wait till W = min(awnd, cwnd) is large enough, then transmit new packet(s). On a non-dup ACK (1 RTT later), set cwnd ← ssthresh (window deflation) and enter CA. After FR/FR, when CA is entered, cwnd is half of the window at the time the loss was detected, so the effect of a loss is halving the window. [Source: RFC 2581; Fall & Floyd, “Simulation-based Comparisons of Tahoe, Reno, and SACK TCP”]
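Window inflation and deflation can be traced with a short Python sketch. It models only the arithmetic of the steps above (no packets are actually sent), counts the window in whole packets, and uses a hypothetical function name and example values.

```python
def reno_fast_recovery(flightsize, total_dupacks):
    """Return (ssthresh, inflated cwnd per dupACK, cwnd after recovery)."""
    ssthresh = max(flightsize // 2, 2)
    inflated = []
    # FR/FR starts at the third dupACK; each dupACK inflates the window.
    for ndup in range(3, total_dupacks + 1):
        inflated.append(ssthresh + ndup)
    # The first non-duplicate ACK deflates the window back to ssthresh.
    cwnd_after = ssthresh
    return ssthresh, inflated, cwnd_after


# 8 packets in flight, 5 duplicate ACKs before the repair is acknowledged.
print(reno_fast_recovery(8, 5))
```

The sender leaves recovery with half of its earlier window and continues in congestion avoidance rather than restarting from one packet.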

Example: FR/FR. [Figure: sender/receiver timelines and the evolution of cwnd and ssthresh up to the exit from FR/FR. Fast retransmit: retransmit on 3 dupACKs. Fast recovery: inflate the window while repairing the loss to fill the pipe.]

Summary: Reno. Basic ideas: fast recovery avoids slow start. dupACKs: fast retransmit + fast recovery. Timeout: retransmit + slow start. [Figure: state diagram with slow start, congestion avoidance and FR/FR, with transitions on dupACKs, timeout and retransmit.]

NewReno: Motivation. [Figure: sender S and receiver D timelines; FR/FR with 8 unack’d packets, followed by a timeout.] On 3 dupACKs, the receiver has packets 2, 4, 6, 8; cwnd = 8; the sender retransmits pkt 1 and enters FR/FR. The next dupACK increments cwnd to 9. After an RTT, the ACK arrives for pkts 1 & 2; the sender exits FR/FR with cwnd = 5 and 8 unack’ed pkts. No more ACKs arrive, so the sender must wait for a timeout. Example: cwnd = 10. The sender sends packets 1, 2, …, 10. Packets 1, 3, …, 9 are lost; packets 2, 4, …, 10 are received. When 3 dupACKs are received, the receiver has received (at least) packets 2, 4, 6, 8. The sender retransmits packet 1 and waits until the dupACK due to the arrival of packet 10 has arrived, and then until the ACK due to the retransmitted packet 1 has arrived, acknowledging packets 1 and 2. This last ACK takes Reno out of fast recovery, with cwnd = 5. There are now 8 outstanding packets: 3, 4, …, 10, so the sender cannot transmit any packet. Note that the sender will not receive any more dupACKs since the window has been exhausted. It must wait until the timer expires for packet 3, and then retransmit and go to slow start.

NewReno: Fall & Floyd ‘96 (RFC 2582). Motivation: multiple losses within a window. A partial ACK acknowledges some but not all packets outstanding at the start of FR. In Reno, a partial ACK takes the sender out of FR and deflates the window; the sender may then have to wait for a timeout before proceeding. Idea: a partial ACK indicates lost packets. NewReno stays in FR/FR and retransmits immediately; it retransmits 1 lost packet per RTT until all lost packets from that window are retransmitted. This eliminates the timeout.

SACK: Mathis, Mahdavi, Floyd, Romanow ’96 (RFC 2018, RFC 2883). Motivation: Reno & NewReno retransmit at most 1 lost packet per RTT, so the pipe can be emptied during FR/FR with multiple losses. Idea: SACK provides a better estimate of the packets in the pipe; the SACK TCP option describes received packets. On 3 dupACKs: retransmit, halve the window, enter FR. Maintain pipe = packets in pipe: increment it when lost or new packets are sent; decrement it when a dupACK is received. Transmit a (lost or new) packet when pipe < cwnd. Exit FR when all packets outstanding when FR was entered are acknowledged. [Sources: M. Mathis, J. Mahdavi, S. Floyd and A. Romanow, “TCP Selective Acknowledgement Options”, RFC 2018, Oct. 1996; K. Fall and S. Floyd, “Simulation-based comparisons of Tahoe, Reno and SACK TCP”, Computer Communication Review, July 1996]

TCP Vegas (Brakmo & Peterson 1994): Reno with a new congestion avoidance algorithm. Converges (provided the buffer is large)! [Figure: window versus time, showing SS and CA.]

Congestion avoidance: each source estimates the number of its own packets in the pipe from RTT, and adjusts its window to maintain the estimate between αd and βd.
    for every RTT {
        if W/RTTmin - W/RTT < α then W++
        if W/RTTmin - W/RTT > β then W--
    }
    for every loss
        W := W/2
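The per-RTT rule can be tried out with a small Python sketch. The two thresholds of the rule (named `alpha` and `beta` in the code) are treated as rates in packets per second, and their values, the RTT figures, and the function name are assumptions chosen for illustration.

```python
def vegas_update(w, rtt, rtt_min, alpha, beta):
    """Return the window after one RTT of Vegas congestion avoidance."""
    # Expected rate with empty queues minus the rate actually achieved.
    diff = w / rtt_min - w / rtt
    if diff < alpha:
        return w + 1   # too little data queued: probe for more
    if diff > beta:
        return w - 1   # too much data queued: back off
    return w           # estimate is inside the target band


# Thresholds in packets per second (assumed values).
print(vegas_update(10, rtt=0.100, rtt_min=0.1, alpha=10, beta=30))  # grows
print(vegas_update(10, rtt=0.125, rtt_min=0.1, alpha=10, beta=30))  # holds
print(vegas_update(10, rtt=0.200, rtt_min=0.1, alpha=10, beta=30))  # shrinks
```

Because the signal is queueing delay rather than loss, the window can settle at a steady value instead of oscillating.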

Implications. Congestion measure = end-to-end queueing delay. At equilibrium: zero loss; stable window at full utilization; approximately weighted proportional fairness; nonzero queue, larger for more sources. Convergence to equilibrium: converges if there is sufficient network buffer; oscillates like Reno otherwise.

Wireless: TCP Reno uses loss as its congestion measure. In wireless, significant losses are due to fading, interference and handover, not buffer overflow (congestion). Halving the window is too drastic: small throughput, low utilization.

Proposed solutions. Ideas: hide noncongestion losses from the source; inform the source of noncongestion losses. Approaches: link layer error control; split TCP; snoop agent; SACK + ELN (Explicit Loss Notification). Source: Balakrishnan, Padmanabhan, Seshan and Katz, “A comparison of mechanisms for improving TCP performance over wireless links”, ToN, 5(6):756-769, Dec 1997.

Third approach. Problem: Reno uses loss as its congestion measure. Two types of losses: congestion loss (retransmit + reduce window) and noncongestion loss (retransmit). Previous approaches: hide noncongestion losses; indicate noncongestion losses. Our approach: eliminate congestion losses (buffer overflows).

Third approach. Router: REM capable. Host: do not use loss as the congestion measure. [Figure labels: Vegas, REM.] Idea: REM clears the buffer, leaving only noncongestion losses; lost packets are retransmitted without reducing the window.

Performance. [Figure: goodput.]