Flow Control An Engineering Approach to Computer Networking by S. Keshav /

Slides:



Advertisements
Similar presentations
Congestion Control and Fairness Models Nick Feamster CS 4251 Computer Networking II Spring 2008.
Advertisements

Slide Set 14: TCP Congestion Control. In this set... We begin Chapter 6 but with 6.3. We will cover Sections 6.3 and 6.4. Mainly deals with congestion.
Flow Control An Engineering Approach to Computer Networking.
Congestion Control Reasons: - too many packets in the network and not enough buffer space S = rate at which packets are generated R = rate at which receivers.
24-1 Chapter 24. Congestion Control and Quality of Service (part 1) 23.1 Data Traffic 23.2 Congestion 23.3 Congestion Control 23.4 Two Examples.
Computer Networks Transport Layer. Topics F Introduction (6.1)  F Connection Issues ( ) F TCP (6.4)
Traffic Shaping Why traffic shaping? Isochronous shaping
Jaringan Komputer Lanjut Traffic Management Aurelio Rahmadian.
1 TCP Congestion Control. 2 TCP Segment Structure source port # dest port # 32 bits application data (variable length) sequence number acknowledgement.
Congestion Control Created by M Bateman, A Ruddle & C Allison As part of the TCP View project.
Flow Control Mario Gerla, CS 215 W2001. Flow Control - the concept Flow Control: “ set of techniques which allow to match the source offered rate to the.
1 TCP - Part II. 2 What is Flow/Congestion/Error Control ? Flow Control: Algorithms to prevent that the sender overruns the receiver with information.
Computer Networks: TCP Congestion Control 1 TCP Congestion Control Lecture material taken from “Computer Networks A Systems Approach”, Fourth Edition,Peterson.
Introduction 1 Lecture 14 Transport Layer (Transmission Control Protocol) slides are modified from J. Kurose & K. Ross University of Nevada – Reno Computer.
Congestion Control Tanenbaum 5.3, /12/2015Congestion Control (A Loss Based Technique: TCP)2 What? Why? Congestion occurs when –there is no reservation.
1 Congestion Control Outline Queuing Discipline Reacting to Congestion Avoiding Congestion.
Week 9 TCP9-1 Week 9 TCP 3 outline r 3.5 Connection-oriented transport: TCP m segment structure m reliable data transfer m flow control m connection management.
1 Internet Networking Spring 2003 Tutorial 11 Explicit Congestion Notification (RFC 3168)
1 Chapter 3 Transport Layer. 2 Chapter 3 outline 3.1 Transport-layer services 3.2 Multiplexing and demultiplexing 3.3 Connectionless transport: UDP 3.4.
TDC375 Winter 03/04 John Kristoff - DePaul University 1 Network Protocols Transmission Control Protocol (TCP)
Data Communication and Networks
Lecture 5: Congestion Control l Challenge: how do we efficiently share network resources among billions of hosts? n Last time: TCP n This time: Alternative.
CMPE 257 Spring CMPE 257: Wireless and Mobile Networking Spring 2005 E2E Protocols (point-to-point)
1 K. Salah Module 6.1: TCP Flow and Congestion Control Connection establishment & Termination Flow Control Congestion Control QoS.
Flow and Congestion Control CS 218 F2003 Oct 29, 03 Flow control goals Classification and overview of main techniques TCP Tahoe, Reno, Vegas TCP Westwood.
Flow control problem Consider file transfer Sender sends a stream of packets representing fragments of a file Sender should try to match rate at which.
3: Transport Layer3b-1 Principles of Congestion Control Congestion: r informally: “too many sources sending too much data too fast for network to handle”
Transport Layer3-1 Chapter 3 outline r 3.1 Transport-layer services r 3.2 Multiplexing and demultiplexing r 3.3 Connectionless transport: UDP r 3.4 Principles.
Wireless TCP Prasun Dewan Department of Computer Science University of North Carolina
CSE 461 University of Washington1 Topic How TCP implements AIMD, part 1 – “Slow start” is a component of the AI portion of AIMD Slow-start.
ACN: RED paper1 Random Early Detection Gateways for Congestion Avoidance Sally Floyd and Van Jacobson, IEEE Transactions on Networking, Vol.1, No. 4, (Aug.
Congestion Control - Supplementary Slides are adapted on Jean Walrand’s Slides.
Chapter 12 Transmission Control Protocol (TCP)
CS640: Introduction to Computer Networks Aditya Akella Lecture 20 - Queuing and Basics of QoS.
Lecture 9 – More TCP & Congestion Control
Spring 2003CS 3321 Congestion Avoidance. Spring 2003CS 3322 Congestion Avoidance TCP congestion control strategy: –Increase load until congestion occurs,
ECS5365 Lecture 6 ATM Traffic and Network Management
CS640: Introduction to Computer Networks Aditya Akella Lecture 15 TCP – III Reliability and Implementation Issues.
Computer Networking Lecture 18 – More TCP & Congestion Control.
TCP: Transmission Control Protocol Part II : Protocol Mechanisms Computer Network System Sirak Kaewjamnong Semester 1st, 2004.
1 CS 4396 Computer Networks Lab TCP – Part II. 2 Flow Control Congestion Control Retransmission Timeout TCP:
CS640: Introduction to Computer Networks Aditya Akella Lecture 15 TCP – III Reliability and Implementation Issues.
Transport Layer3-1 Chapter 3 outline r 3.1 Transport-layer services r 3.2 Multiplexing and demultiplexing r 3.3 Connectionless transport: UDP r 3.4 Principles.
Jennifer Rexford Fall 2014 (TTh 3:00-4:20 in CS 105) COS 561: Advanced Computer Networks TCP.
1 TCP - Part II. 2 What is Flow/Congestion/Error Control ? Flow Control: Algorithms to prevent that the sender overruns the receiver with information.
TCP. TCP ACK generation [RFC 1122, RFC 2581] Event at Receiver Arrival of in-order segment with expected seq #. All data up to expected seq # already.
Random Early Detection (RED) Router notifies source before congestion happens - just drop the packet (TCP will timeout and adjust its window) - could make.
TCP continued. Discussion – TCP Throughput TCP will most likely generate the saw tooth type of traffic. – A rough estimate is that the congestion window.
© Janice Regan, CMPT 128, CMPT 371 Data Communications and Networking Congestion Control 0.
 Last Class  Resource Allocation  This Class  Chapter 6.3. ~ 6.4.  TCP congestion control.
Providing QoS in IP Networks
@Yuan Xue A special acknowledge goes to J.F Kurose and K.W. Ross Some of the slides used in this lecture are adapted from their.
Chapter 10 Congestion Control in Data Networks and Internets 1 Chapter 10 Congestion Control in Data Networks and Internets.
Other Methods of Dealing with Congestion
TCP - Part II Relates to Lab 5. This is an extended module that covers TCP flow control, congestion control, and error control in TCP.
Internet Networking recitation #9
Topics discussed in this section:
Chapter 3 outline 3.1 transport-layer services
Introduction to Congestion Control
TCP - Part II Relates to Lab 5. This is an extended module that covers TCP flow control, congestion control, and error control in TCP.
Lecture 19 – TCP Performance
So far, On the networking side, we looked at mechanisms to links hosts using direct linked networks and then forming a network of these networks. We introduced.
Other Methods of Dealing with Congestion
CS640: Introduction to Computer Networks
Internet Networking recitation #10
Congestion Control Reasons:
TCP Congestion Control
Transport Layer: Congestion Control
TCP: Transmission Control Protocol Part II : Protocol Mechanisms
Lecture 6, Computer Networks (198:552)
Presentation transcript:

Flow Control An Engineering Approach to Computer Networking by S. Keshav /

Flow control problem Consider file transfer Consider file transfer Sender sends a stream of packets representing fragments of a file Sender sends a stream of packets representing fragments of a file Sender should try to match rate at which receiver and network can process data Sender should try to match rate at which receiver and network can process data Can’t send too slow or too fast Can’t send too slow or too fast Too slow Too slow  wastes time Too fast Too fast  can lead to buffer overflow How to find the correct rate? How to find the correct rate?

Other considerations Simplicity Simplicity Overhead Overhead Scaling Scaling Fairness Fairness Stability Stability Many interesting tradeoffs Many interesting tradeoffs  overhead for stability  simplicity for unfairness

Where? Usually at transport layer Usually at transport layer Also, in some cases, in datalink layer Also, in some cases, in datalink layer

Model Source, sink, server, service rate, bottleneck, round trip time Source, sink, server, service rate, bottleneck, round trip time

Classification Open loop Open loop  Source describes its desired flow rate  Network admits call  Source sends at this rate Closed loop Closed loop  Source monitors available service rate  Explicit or implicit  Sends at this rate  Due to speed of light delay, errors are bound to occur Hybrid Hybrid  Source asks for some minimum rate  But can send more, if available

Open loop flow control Two phases to flow Two phases to flow  Call setup  Data transmission Call setup Call setup  Network prescribes parameters  User chooses parameter values  Network admits or denies call Data transmission Data transmission  User sends within parameter range  Network polices users  Scheduling policies give user QoS

Hard problems Choosing a descriptor at a source Choosing a scheduling discipline at intermediate network elements Admitting calls so that their performance objectives are met (call admission control).

Traffic descriptors Usually an envelope Usually an envelope  Constrains worst case behavior Three uses Three uses  Basis for traffic contract  Input to regulator  Input to policer The regulator and policer are identical in the way they identify descriptor violations. The difference is that a regulator typically delays excess traffic, while a policer typically drops it. The regulator and policer are identical in the way they identify descriptor violations. The difference is that a regulator typically delays excess traffic, while a policer typically drops it.

Descriptor requirements Representativity Representativity  adequately describes flow, so that network does not reserve too little or too much resource Verifiability Verifiability  verify that descriptor holds Preservability Preservability  Doesn’t change inside the network Usability Usability  Easy to describe and use for admission control

Examples Representative, verifiable, but not useble  Time series of interarrival times Verifiable, preservable, and useable, but not representative Verifiable, preservable, and useable, but not representative  peak rate

Some common descriptors Peak rate Peak rate Average rate Average rate Linear bounded arrival process Linear bounded arrival process

Peak rate Highest ‘rate’ at which a source can send data Highest ‘rate’ at which a source can send data Two ways to compute it Two ways to compute it For networks with fixed-size packets For networks with fixed-size packets  min inter-packet spacing For networks with variable-size packets For networks with variable-size packets  highest rate over all intervals of a particular duration Regulator for fixed-size packets Regulator for fixed-size packets  timer set on packet transmission  if timer expires, send packet, if any Problem Problem  sensitive to extremes

Average rate Rate over some time period (window) Rate over some time period (window) Less susceptible to outliers Less susceptible to outliers Parameters: t and a Parameters: t and a Two types: jumping window and moving window Two types: jumping window and moving window Jumping window Jumping window  over consecutive intervals of length t, only a bits sent  regulator reinitializes every interval Moving window Moving window  over all intervals of length t, only a bits sent  regulator forgets packet sent more than t seconds ago

Linear Bounded Arrival Process Source bounds # bits sent in any time interval by a linear function of time Source bounds # bits sent in any time interval by a linear function of time the number of bits transmitted in any active interval of length t is less than rt + s r is the long term rate s is the burst limit insensitive to outliers

Leaky bucket A regulator for an LBAP A regulator for an LBAP Token bucket fills up at rate r Token bucket fills up at rate r Largest # tokens < s Largest # tokens < s

Variants Token and data buckets Token and data buckets  Sum is what matters (i.e., packet loss rate) Peak rate regulator Peak rate regulator Average rate regulator Average rate regulator Policer: no data buckets Policer: no data buckets

Choosing LBAP parameters Tradeoff between r and s Tradeoff between r and s Minimal descriptor Minimal descriptor  doesn’t simultaneously have smaller r and s  presumably costs less How to choose minimal descriptor? How to choose minimal descriptor? Three way tradeoff Three way tradeoff  choice of s (data bucket size)  loss rate  choice of r

Choosing minimal parameters Keeping loss rate the same Keeping loss rate the same  if s is more, r is less (smoothing)  for each r we have least s Choose knee of curve Choose knee of curve

LBAP Popular in practice and in academia Popular in practice and in academia  sort of representative  verifiable  sort of preservable  sort of usable Problems with multiple time scale traffic Problems with multiple time scale traffic  large burst messes up things

Open loop vs. closed loop Open loop Open loop  describe traffic  network admits/reserves resources  regulation/policing Closed loop Closed loop  can’t describe traffic or network doesn’t support reservation  monitor available bandwidth  adapt to it  if not done properly either  too much loss  unnecessary delay

Taxonomy First generation First generation  ignores network state  only match receiver  E.g., On-off, Stop-and-wait, Static window Second generation Second generation  responsive to state  three choices  State measurement explicit or implicitexplicit or implicit  Control flow control window size or rateflow control window size or rate  Point of control endpoint or within networkendpoint or within network

Explicit vs. Implicit Explicit Explicit  Network tells source its current rate  Better control  More overhead Implicit Implicit  Endpoint figures out rate by looking at network  Less overhead Ideally, want overhead of implicit with effectiveness of explicit Ideally, want overhead of implicit with effectiveness of explicit

Flow control window Recall error control window Recall error control window Largest number of packet outstanding (sent but not acked) Largest number of packet outstanding (sent but not acked) If endpoint has sent all packets in window, it must wait => slows down its rate If endpoint has sent all packets in window, it must wait => slows down its rate Thus, window provides both error control and flow control Thus, window provides both error control and flow control This is called transmission window This is called transmission window Coupling can be a problem Coupling can be a problem  Few buffers are receiver => slow rate!

Window vs. rate In adaptive rate, we directly control rate In adaptive rate, we directly control rate Needs a timer per connection Needs a timer per connection Plusses for window Plusses for window  no need for fine-grained timer  self-limiting Plusses for rate Plusses for rate  better control (finer grain)  no coupling of flow control and error control Rate control must be careful to avoid overhead and sending too much Rate control must be careful to avoid overhead and sending too much

Hop-by-hop vs. end-to-end Hop-by-hop Hop-by-hop  first generation flow control at each link  next server = sink  easy to implement End-to-end End-to-end  sender matches all the servers on its path Plusses for hop-by-hop Plusses for hop-by-hop  simpler  distributes overflow  better control Plusses for end-to-end Plusses for end-to-end  cheaper

On-off Receiver gives ON and OFF signals If ON, send at full speed If OFF, stop OK when RTT is small What if OFF is lost? Bursty Used in serial lines or LANs

Stop and Wait Send a packet Send a packet Wait for ack before sending next packet Wait for ack before sending next packet

Static window Stop and wait can send at most one pkt per RTT Stop and wait can send at most one pkt per RTT Here, we allow multiple packets per RTT (= transmission window) Here, we allow multiple packets per RTT (= transmission window)

What should window size be? Let bottleneck service rate along path = b pkts/sec Let round trip time = R sec Let flow control window = w packet Sending rate is w packets in R seconds = w/R To use bottleneck w/R > b => w > bR This is the bandwidth delay product or optimal window size

Static window Works well if b and R are fixed Works well if b and R are fixed But, bottleneck rate changes with time! But, bottleneck rate changes with time! Static choice of w can lead to problems Static choice of w can lead to problems  too small  too large So, need to adapt window So, need to adapt window Always try to get to the current optimal value Always try to get to the current optimal value

DECbit flow control Intuition Intuition  every packet has a bit in header  intermediate routers set bit if queue has built up => source window is too large  sink copies bit to ack  if bits set, source reduces window size  in steady state, oscillate around optimal size

DECbit When do bits get set? When do bits get set? How does a source interpret them? How does a source interpret them?

DECbit details: router actions Measure demand Measure demand and mean queue length of each source Computed over queue regeneration cycles Balance between sensitivity and stability

Router actions If mean queue length > 1.0 If mean queue length > 1.0  set bits on sources whose demand exceeds fair share If it exceeds 2.0 If it exceeds 2.0  set bits on everyone  panic!

Source actions Keep track of bits Keep track of bits Can’t take control actions too fast! Can’t take control actions too fast! Wait for past change to take effect Wait for past change to take effect Measure bits over past + present window size Measure bits over past + present window size If more than 50% set, then decrease window, else increase If more than 50% set, then decrease window, else increase Additive increase, multiplicative decrease Additive increase, multiplicative decrease

Evaluation Works with FIFO Works with FIFO  but requires per-connection state (demand) Software Software But But  assumes cooperation!  conservative window increase policy

Sample trace

TCP Flow Control Implicit Implicit Dynamic window Dynamic window End-to-end End-to-end Very similar to DECbit, but Very similar to DECbit, but  no support from routers  increase if no loss (usually detected using timeout)  window decrease on a timeout  additive increase multiplicative decrease

TCP details Window starts at 1 Window starts at 1 Increases exponentially for a while, then linearly Increases exponentially for a while, then linearly Exponentially => doubles every RTT Exponentially => doubles every RTT Linearly => increases by 1 every RTT Linearly => increases by 1 every RTT During exponential phase, every ack results in window increase by 1 During exponential phase, every ack results in window increase by 1 During linear phase, window increases by 1 when # acks = window size During linear phase, window increases by 1 when # acks = window size Exponential phase is called slow start Exponential phase is called slow start Linear phase is called congestion avoidance Linear phase is called congestion avoidance

More TCP details On a loss, current window size is stored in a variable called slow start threshold or ssthresh On a loss, current window size is stored in a variable called slow start threshold or ssthresh Switch from exponential to linear (slow start to congestion avoidance) when window size reaches threshold Switch from exponential to linear (slow start to congestion avoidance) when window size reaches threshold Loss detected either with timeout or fast retransmit (duplicate cumulative acks) Loss detected either with timeout or fast retransmit (duplicate cumulative acks) Two versions of TCP Two versions of TCP  Tahoe: in both cases, drop window to 1  Reno: on timeout, drop window to 1, and on fast retransmit drop window to half previous size (also, increase window on subsequent acks)

TCP vs. DECbit Both use dynamic window flow control and additive-increase multiplicative decrease Both use dynamic window flow control and additive-increase multiplicative decrease TCP uses implicit measurement of congestion TCP uses implicit measurement of congestion  probe a black box Operates at the cliff Operates at the cliff Source does not filter information Source does not filter information

Evaluation Effective over a wide range of bandwidths Effective over a wide range of bandwidths A lot of operational experience A lot of operational experience Weaknesses Weaknesses  loss => overload? (wireless)  overload => self-blame, problem with FCFS  ovelroad detected only on a loss  in steady state, source induces loss  needs at least bR/3 buffers per connection

Sample trace

TCP Vegas Expected throughput = transmission_window_size/propagation_delay Expected throughput = transmission_window_size/propagation_delay Numerator: known Numerator: known Denominator: measure smallest Denominator: measure smallest RTT Also know actual throughput Difference = how much to reduce/increase rate Algorithm  send a special packet  on ack, compute expected and actual throughput  (expected - actual)* RTT packets in bottleneck buffer  adjust sending rate if this is too large Works better than TCP Reno Works better than TCP Reno

NETBLT First rate-based flow control scheme Separates error control (window) and flow control (no coupling) So, losses and retransmissions do not affect the flow rate Application data sent as a series of buffers, each at a particular rate Rate = (burst size + burst rate) so granularity of control = burst Initially, no adjustment of rates Later, if received rate < sending rate, multiplicatively decrease rate Change rate only once per buffer => slow

Packet pair Improves basic ideas in NETBLT Improves basic ideas in NETBLT  better measurement of bottleneck  control based on prediction  finer granularity Assume all bottlenecks serve packets in round robin order Assume all bottlenecks serve packets in round robin order Then, spacing between packets at receiver (= ack spacing) = 1/(rate of slowest server) Then, spacing between packets at receiver (= ack spacing) = 1/(rate of slowest server) If all data sent as paired packets, no distinction between data and probes If all data sent as paired packets, no distinction between data and probes Implicitly determine service rates if servers are round-robin-like Implicitly determine service rates if servers are round-robin-like

Packet pair

Packet-pair details Acks give time series of service rates in the past Acks give time series of service rates in the past We can use this to predict the next rate We can use this to predict the next rate Exponential averager, with fuzzy rules to change the averaging factor Exponential averager, with fuzzy rules to change the averaging factor Predicted rate feeds into flow control equation Predicted rate feeds into flow control equation

Packet-pair flow control Let X = # packets in bottleneck buffer S = # outstanding packets R = RTT b = bottleneck rate Then, X = S - Rb (assuming no losses) Let l = source rate l(k+1) = b(k+1) + (setpoint -X)/R

Sample trace

ATM Forum EERC Similar to DECbit, but send a whole cell’s worth of info instead of one bit Similar to DECbit, but send a whole cell’s worth of info instead of one bit Sources periodically send a Resource Management (RM) cell with a rate request Sources periodically send a Resource Management (RM) cell with a rate request  typically once every 32 cells Each server fills in RM cell with current share, if less Each server fills in RM cell with current share, if less Source sends at this rate Source sends at this rate

ATM Forum EERC details Source sends Explicit Rate (ER) in RM cell Source sends Explicit Rate (ER) in RM cell Switches compute source share in an unspecified manner (allows competition) Switches compute source share in an unspecified manner (allows competition) Current rate = allowed cell rate = ACR Current rate = allowed cell rate = ACR If ER > ACR then ACR = ACR + RIF * PCR else ACR = ER If ER > ACR then ACR = ACR + RIF * PCR else ACR = ER If switch does not change ER, then use DECbit idea If switch does not change ER, then use DECbit idea  If CI bit set, ACR = ACR (1 - RDF) If ER < AR, AR = ER If ER < AR, AR = ER Allows interoperability of a sort Allows interoperability of a sort If idle 500 ms, reset rate to Initial cell rate If idle 500 ms, reset rate to Initial cell rate If no RM cells return for a while, ACR *= (1-RDF) If no RM cells return for a while, ACR *= (1-RDF)

Comparison with DECbit Sources know exact rate Sources know exact rate Non-zero Initial cell-rate => conservative increase can be avoided Non-zero Initial cell-rate => conservative increase can be avoided Interoperation between ER/CI switches Interoperation between ER/CI switches

Problems RM cells in data path a mess Updating sending rate based on RM cell can be hard Interoperability comes at the cost of reduced efficiency (as bad as DECbit) Computing ER is hard

Comparison among closed-loop schemes On-off, stop-and-wait, static window, DECbit, TCP, NETBLT, Packet-pair, ATM Forum EERC On-off, stop-and-wait, static window, DECbit, TCP, NETBLT, Packet-pair, ATM Forum EERC Which is best? No simple answer Which is best? No simple answer Some rules of thumb Some rules of thumb  flow control easier with RR scheduling  otherwise, assume cooperation, or police rates  explicit schemes are more robust  hop-by-hop schemes are more resposive, but more comples  try to separate error control and flow control  rate based schemes are inherently unstable unless well- engineered

Hybrid flow control Source gets a minimum rate, but can use more Source gets a minimum rate, but can use more All problems of both open loop and closed loop flow control All problems of both open loop and closed loop flow control Resource partitioning problem Resource partitioning problem  what fraction can be reserved?  how?