Published by Asher Benson. Modified over 9 years ago.
1
Chapter 12 TCP Traffic Control 1 89-850 Communication Networks Traffic Control in TCP Last updated: Thursday, May 14, 2015 Prof. Amir Herzberg Dept of Computer Science, Bar Ilan University http://AmirHerzberg.com
2
Today
Traffic control in TCP
Router-based congestion control
–RED (Random Early Detect)
–ECN (Explicit Congestion Notification)
Reference and foils ©: Stallings, High-Speed Networks and Internets, Ch. 12
Covered in [KMK] (in depth)
3
TCP Flow and Congestion Control
TCP transmit rate is determined by the rate of incoming Acks
Rate of Ack arrival is determined by the round-trip path between source and destination
Bottleneck may be the destination or the internet
–Sender cannot tell which
–Only the internet bottleneck can be due to congestion
4
Figure 12.7 TCP Flow and Congestion Control
5
Figure 12.6 TCP Segment Pacing
6
TCP Flow Control
Simple, yet often has a critical impact on performance
Optimal: Window = Bits-In-Flight = Bandwidth x Delay
–Bandwidth: of the `bottleneck` (slowest) link on the path
–Delay: end-to-end (propagation, queuing, processing)
If window too small: slows down the connection
If window too large:
–Requires large buffers (or risks buffer overflow)
–Many bytes may be sent into a `broken connection`
–Extra segments queue up at routers: not good!
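A minimal sketch of the Window = Bandwidth x Delay rule, assuming "delay" is the round-trip delay and the result is wanted in bytes. The function name and the example link figures are hypothetical, not taken from the slides.

```python
def bandwidth_delay_product_bytes(bottleneck_bps: float, delay_seconds: float) -> float:
    """Bytes in flight needed to keep the bottleneck link fully used."""
    return bottleneck_bps * delay_seconds / 8  # 8 bits per byte


# Hypothetical path: 10 Mbit/s bottleneck link, 100 ms round-trip delay.
window = bandwidth_delay_product_bytes(10e6, 0.100)
print(f"window of about {window:.0f} bytes")
```

A window well below this value leaves the link idle part of the time; a window well above it only adds queued segments.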
7
TCP Header Traffic Control Fields
Sequence number (SN) of first byte in data segment [set by sender]
Acknowledgement number (AN)
Window (W)
–`Credit allocation` for flow control
Acknowledgement contains AN = i, W = j:
–Bytes through SN = i - 1 are acknowledged
–Permission is granted to send W = j more bytes, i.e., bytes i through i + j - 1
8
Figure 12.1 TCP Credit Allocation Mechanism
9
Credit Allocation is Flexible
Suppose the last message B issued was AN = i, W = j
To increase credit to k (k > j) when no new data has arrived, B issues AN = i, W = k
To acknowledge a segment containing m bytes (m < j) without granting additional credit, B issues AN = i + m, W = j - m
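The credit rules above can be sketched from the sender's side. This is an illustrative model only: the class `CreditSender`, its field names, and the byte counts are assumptions, and real TCP tracks considerably more state.

```python
from dataclasses import dataclass


@dataclass
class CreditSender:
    next_sn: int  # SN of the next byte the sender will transmit
    limit: int    # first SN the sender is NOT yet allowed to transmit

    def on_ack(self, an: int, w: int) -> None:
        # AN = i, W = j permits bytes i through i + j - 1.
        self.limit = an + w

    def send(self, nbytes: int) -> int:
        # Send at most the remaining credit; return the bytes actually sent.
        sent = max(0, min(nbytes, self.limit - self.next_sn))
        self.next_sn += sent
        return sent


a = CreditSender(next_sn=1001, limit=1001)
a.on_ack(an=1001, w=1400)   # B grants credit for bytes 1001..2400
a.send(600)                 # A sends one 600-byte segment
a.on_ack(an=1601, w=800)    # B acks m = 600 bytes: AN = i + m, W = j - m
a.on_ack(an=1601, w=2000)   # B raises the credit with no new data
```

Note that acknowledging with W = j - m leaves the upper edge of the window (AN + W) unchanged, which is why it grants no additional credit.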
10
Figure 12.2 Flow Control Perspectives
11
Credit Policy
Receiver needs a policy for how much credit to give the sender
Conservative approach: grant credit up to the limit of available buffer space
–May limit throughput in long-delay situations
Optimistic approach: grant credit based on expected free space when data arrives
–May cause `overbooking`
–But not if arrivals are through a `leaky bucket`
12
Effect of Window Size
W = TCP window size (bytes)
R = Data rate (bps) at TCP source
D = Propagation delay (seconds)
After the TCP source begins transmitting, it takes D seconds for the first byte to arrive, and D seconds for the acknowledgement to return
TCP source could transmit at most 2RD bits, or RD/4 bytes, before getting the Ack
13
Normalized Throughput S
$$S = \begin{cases} 1 & W \ge RD/4 \\ \dfrac{4W}{RD} & W < RD/4 \end{cases}$$
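As a quick numeric check of the normalized throughput formula, here is a small sketch with W in bytes, R in bps and D in seconds, as defined on the previous slide. The function name and the example path are hypothetical.

```python
def normalized_throughput(w_bytes: float, r_bps: float, d_seconds: float) -> float:
    """S = 1 if W >= RD/4, otherwise 4W/(RD)."""
    full_window = r_bps * d_seconds / 4  # 2RD bits = RD/4 bytes
    return min(1.0, w_bytes / full_window)


# Hypothetical numbers: 65535-byte window on a 1 Gbit/s path with D = 10 ms.
print(normalized_throughput(65535, 1e9, 0.010))
```

With these figures RD/4 is 2.5 million bytes, so the window, not the link, limits throughput to a few percent of R.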
14
Complicating Factors
Multiple TCP connections are multiplexed over the same network interface, reducing R and efficiency
For multi-hop connections, D is the sum of delays across each network plus delays at each router
If source data rate R exceeds the data rate on one of the hops, that hop will be a bottleneck
–Use the lower rate (goal of congestion control)
Lost segments are retransmitted, reducing throughput. Impact depends on retransmission policy
TCP’s delayed (accumulated) acks: send Ack only after receiving 2 MSS or 200 msec after Ack
–W = (2 MSS)*k [Why not just W ≥ 2 MSS?]
–Actually W ≥ 4 MSS [otherwise sender waits for Ack]
–Indeed W = 4 MSS is a popular default [but not always best]
15
Retransmission Strategy
Retransmission required when:
1. Segment arrives damaged, as indicated by checksum error
2. Segment fails to arrive
TCP relies exclusively on positive acknowledgements + timeouts:
–Retransmission on acknowledgement timeout
–No explicit negative acknowledgement
16
Retransmission Timer
A timer is associated with each segment as it is sent (or with the oldest unacknowledged segment)
If the timer expires before the segment is acknowledged, sender must retransmit
Value of retransmission timer: a key design issue
–Timer should be a bit longer than the round-trip delay (send segment, receive ack)
–Delay is variable
–Timer too small: many unnecessary retransmissions, wasting network bandwidth
–Timer too large: delay in handling a lost segment
17
Difficulties in Measuring Delay
Delayed (accumulated) Acks
For retransmitted segments, can’t tell whether the acknowledgement is a response to the original transmission or to the retransmission
–Don’t estimate delay from retransmissions [Karn’s algorithm]
Network conditions may change suddenly
–Yet don’t overreact to the impact of burst traffic
18
Which Round-trip Samples?
If an ack is received for a retransmitted segment, there are 2 possibilities:
1. Ack is for first transmission
2. Ack is for second transmission
TCP source cannot distinguish the 2 cases
No valid way to calculate RTT:
–From first transmission to ack, or
–From second transmission to ack?
19
Karn’s Algorithm
Do not use the measured RTT of a retransmitted segment to update SRTT and SDEV
Calculate backoff RTO when a retransmission occurs
Use backoff RTO for segments until an ack arrives for a segment that has not been retransmitted
Then use Jacobson’s algorithm to calculate RTO
20
Adaptive Retransmission Timer: Simple Averaging Method
Average Round-Trip Time (ARTT)
$$\mathrm{ARTT}(K+1) = \frac{1}{K+1}\sum_{i=1}^{K+1}\mathrm{RTT}(i) = \frac{K}{K+1}\,\mathrm{ARTT}(K) + \frac{1}{K+1}\,\mathrm{RTT}(K+1)$$
21
RFC 793 Exponential Averaging
Smoothed Round-Trip Time (SRTT)
SRTT(K + 1) = α × SRTT(K) + (1 – α) × RTT(K + 1)
The older the observation, the less it is counted in the average.
22
Figure 12.4 Exponential Smoothing Coefficients
SRTT(K + 1) = α × SRTT(K) + (1 – α) × RTT(K + 1) (α = 0.5, 0.875)
23
Figure 12.5 Exponential Averaging
24
RFC 793 Retransmission Timeout
RTO(K + 1) = Min(UB, Max(LB, β × SRTT(K + 1)))
UB, LB: prechosen fixed upper and lower bounds (typical/default: LB = 1 sec, UB = 64 sec)
Example values for α, β:
0.8 < α < 0.9
1.3 < β < 2.0
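The RFC 793 smoothing and timeout rules fit in a few lines. The sketch below assumes times in seconds and picks α = 0.875 and β = 1.5 from inside the example ranges, with LB = 1 and UB = 64; the function name is hypothetical.

```python
def rfc793_update(srtt, rtt, alpha=0.875, beta=1.5, lb=1.0, ub=64.0):
    """Return (new SRTT, new RTO) after one RTT sample; times in seconds."""
    srtt = alpha * srtt + (1 - alpha) * rtt   # exponential averaging
    rto = min(ub, max(lb, beta * srtt))       # clamp beta * SRTT to [LB, UB]
    return srtt, rto


# SRTT starts at 1 s, then three samples of 2 s arrive.
srtt = 1.0
for sample in (2.0, 2.0, 2.0):
    srtt, rto = rfc793_update(srtt, sample)
    print(srtt, rto)
```

With α close to 1 the estimate moves only slowly toward the new RTT level, which is the lag that motivates tracking the deviation as well.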
25
Problem: RTT Instability Periods
3 sources of high variance in RTT:
1. Queuing and collisions due to other sources
2. Peer may not acknowledge segments immediately
3. At low data rates, different packet sizes cause different delays
26
Solution: Jacobson’s Algorithm
SRTT(K + 1) = (1 – g) × SRTT(K) + g × RTT(K + 1)
SERR(K + 1) = RTT(K + 1) – SRTT(K)
SDEV(K + 1) = (1 – h) × SDEV(K) + h × |SERR(K + 1)|
RTO(K + 1) = SRTT(K + 1) + f × SDEV(K + 1)
g = 0.125, h = 0.25, f = 2 or f = 4 (most current implementations use f = 4)
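The four equations above translate directly into one update step. This is a sketch with the slide's constants as defaults; the function name and the tuple return shape are choices made for illustration.

```python
def jacobson_update(srtt, sdev, rtt, g=0.125, h=0.25, f=4):
    """One step of Jacobson's estimator; returns (SRTT, SDEV, RTO)."""
    serr = rtt - srtt                          # error against the OLD SRTT
    new_srtt = (1 - g) * srtt + g * rtt
    new_sdev = (1 - h) * sdev + h * abs(serr)
    rto = new_srtt + f * new_sdev
    return new_srtt, new_sdev, rto


# A single 2 s sample after a stable 1 s history with zero deviation.
print(jacobson_update(srtt=1.0, sdev=0.0, rtt=2.0))
```

Because RTO includes f × SDEV, a sudden jump in RTT widens the timeout immediately, while a long run of steady samples lets it shrink back toward SRTT.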
27
Figure 12.8 Jacobson’s RTO Calculations
28
Exponential RTO Backoff
RTT is measured only when there are no failures
–`Karn’s algorithm`: don’t measure on resend
What if RTO is too small? → failures → no RTT measurements…
Increase RTO each time the same segment is retransmitted: the backoff process
Multiply RTO by a constant on each retransmit: RTO[K+1] = q × RTO[K]
q = 2 is called binary exponential backoff
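A small sketch of the backoff rule together with Karn's sampling rule. The 64-second cap reuses the upper bound from the RFC 793 slide and is an assumption here, as are the function names and the starting RTO.

```python
def backed_off_rto(rto: float, q: float = 2.0, ub: float = 64.0) -> float:
    """RTO to use after one more retransmission of the same segment."""
    return min(ub, q * rto)


def may_sample_rtt(transmissions: int) -> bool:
    # Karn: only a segment sent exactly once gives an unambiguous RTT sample.
    return transmissions == 1


rto = 1.5
history = []
for _ in range(6):          # six consecutive timeouts for the same segment
    rto = backed_off_rto(rto)
    history.append(rto)
print(history)
```

The backed-off value stays in force until an ack arrives for a segment that was never retransmitted, at which point normal RTO estimation resumes.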
29
TCP Implementation Options
Send: Nagle’s algorithm or immediate
–Nagle: when waiting for an Ack, send partial segments only on receipt of the Ack or on time-out
Deliver: cumulative or immediate (`push`)
Ack: delayed (cumulative) or immediate
Accept: in-order or in-window
Retransmit:
–First-only: one timer for the entire queue (Q), retransmit only the first segment in Q
–Batch: one timer for the entire Q, retransmit the entire Q
–Individual: one timer per packet, retransmit each packet separately
Which is best if receiver uses in-order accept policy?
30
TCP Congestion Control
Dynamic routing can alleviate congestion by spreading load more evenly
–But only effective for unbalanced loads and brief surges in traffic
Congestion can only be controlled by limiting the total amount of data entering the network
ICMP Source Quench message is crude and not effective
RSVP may help but is not widely implemented
31
TCP Congestion Control is Difficult
IP is connectionless and stateless, with no provision for detecting or controlling congestion
TCP only provides end-to-end flow control
No cooperative, distributed algorithm to bind together various TCP entities
32
Congestion Window Management
Slow start / rapid accelerate
Dynamic window sizing on congestion
Fast retransmit
Fast recovery
Limited transmit
33
Slow Start / Rapid Accelerate
awnd = MIN[credit, cwnd]
where
awnd = allowed window in segments
cwnd = congestion window in segments
credit = amount of unused credit granted in most recent ack
cwnd = 1 for a new connection, and is increased by 1 for each ack received, up to a maximum; from the maximum: increase by 1 per RTT
34
Figure 12.9 Effect of Slow Start
35
Dynamic Window Sizing on Congestion
A lost segment indicates congestion
Prudent to reset cwnd = 1 and begin slow start process
May not be conservative enough: “easy to drive a network into saturation but hard for the net to recover” (Jacobson)
Instead:
–Set slow-start threshold ssthresh = cwnd/2
–Use slow start from cwnd = 1 until cwnd = ssthresh
–For cwnd > ssthresh, increase cwnd by only one per RTT (about 1/cwnd per Ack)
This is called congestion avoidance
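The window rules above can be traced one round trip at a time. This is a simplified sketch: it counts cwnd in whole segments, assumes every segment in a round is acked, and caps slow start exactly at ssthresh. The function names are hypothetical.

```python
def next_cwnd(cwnd: int, ssthresh: int) -> int:
    """cwnd (in segments) after one loss-free round trip."""
    if cwnd < ssthresh:
        # Slow start: +1 per ack, so the window doubles each RTT.
        return min(2 * cwnd, ssthresh)
    # Congestion avoidance: +1 per RTT.
    return cwnd + 1


def on_timeout(cwnd: int) -> tuple[int, int]:
    """Return (new cwnd, new ssthresh) after a lost segment."""
    return 1, max(cwnd // 2, 1)


# Loss detected at cwnd = 16, followed by five loss-free round trips.
cwnd, ssthresh = on_timeout(16)
trace = [cwnd]
for _ in range(5):
    cwnd = next_cwnd(cwnd, ssthresh)
    trace.append(cwnd)
print(ssthresh, trace)
```

The trace shows the exponential phase stopping at half the window that caused the loss, after which growth is linear.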
36
Figure 12.10 Slow Start and Congestion Avoidance
37
Figure 12.11 Illustration of Slow Start and Congestion Avoidance
38
Fast Retransmit
RTO is generally noticeably longer than the actual RTT
If a segment is lost, TCP may be slow to retransmit
TCP rule: if a segment is received out of order, an ack must be issued immediately for the last in-order segment
Fast Retransmit rule: if 4 acks are received for the same segment (the original plus 3 duplicates), it is highly likely that the following segment was lost, so retransmit immediately rather than waiting for timeout
39
Figure 12.12 Fast Retransmit
40
Fast Recovery
When TCP retransmits a segment using Fast Retransmit, a segment was assumed lost
Congestion avoidance measures are appropriate at this point
–E.g., slow-start/congestion avoidance procedure
This may be unnecessarily conservative, since multiple acks indicate segments are getting through (and out of the net)
Fast Recovery: retransmit lost segment, cut cwnd in half, proceed with linear increase of cwnd
This avoids the initial exponential slow-start
41
Figure 12.13 Fast Recovery Example
[Plot legend: small ticks = dup acks; acks line; window line]
Upon 3rd dup ack:
a. Set ssthresh = cwnd/2
b. Retransmit segment
c. Set cwnd = ssthresh + 3
Upon additional dup ack:
a. cwnd++
b. If pending < cwnd, send another segment
Upon new ack: set cwnd = ssthresh
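The duplicate-ack steps listed for Figure 12.13 can be sketched as a tiny state machine. It only tracks the window variables: actual segment transmission is replaced by a counter, cwnd is held in whole segments with integer halving, and the class name is hypothetical.

```python
class FastRecoverySender:
    def __init__(self, cwnd: int, ssthresh: int):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_recovery = False
        self.retransmits = 0   # stands in for actually resending a segment

    def on_dup_ack(self) -> None:
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd // 2    # a. ssthresh = cwnd/2
            self.retransmits += 1             # b. retransmit the segment
            self.cwnd = self.ssthresh + 3     # c. cwnd = ssthresh + 3
            self.in_recovery = True
        elif self.dup_acks > 3:
            self.cwnd += 1                    # each extra dup ack: cwnd++

    def on_new_ack(self) -> None:
        if self.in_recovery:
            self.cwnd = self.ssthresh         # deflate the window
            self.in_recovery = False
        self.dup_acks = 0


s = FastRecoverySender(cwnd=16, ssthresh=64)
for _ in range(5):
    s.on_dup_ack()
print(s.cwnd, s.ssthresh)   # window inflated during recovery
s.on_new_ack()
print(s.cwnd, s.ssthresh)   # window deflated to ssthresh
```

The temporary inflation by one segment per duplicate ack reflects that each duplicate signals a segment leaving the network, so a new one may be sent.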
42
Limited Transmit
If congestion window at sender is small, fast retransmit may not get triggered, e.g., cwnd = 3
1. Under what circumstances does sender have a small congestion window?
2. Is the problem common?
3. If the problem is common, why not reduce the number of duplicate acks needed to trigger retransmit?
43
Limited Transmit
If congestion window at sender is small, fast retransmit may not get triggered, e.g., cwnd = 3
1. Under what circumstances does sender have a small congestion window?
–Limited amount of data to send
–Small limit on receive window (credit)
–Small bandwidth*delay (e.g. very low delay)
2. Is the problem common?
–Yes, e.g. about 56% of retransmissions are due to RTO expiry, and only 44% are by fast retransmit
3. If the problem is common, why not reduce the number of duplicate acks needed to trigger retransmit?
–Packet reordering is not all that rare
44
Limited Transmit Algorithm (RFC 3042)
Sender can transmit a new segment when 3 conditions are met:
1. Two consecutive duplicate acks are received
2. Destination advertised window allows transmission of the segment
3. Amount of outstanding data after sending is less than or equal to cwnd + 2 segments
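The three conditions can be written as a single predicate. This sketch counts everything in segments and reads the first condition the way RFC 3042 phrases it, as the arrival of either of the first two consecutive duplicate acks; the function and parameter names are hypothetical.

```python
def may_send_limited_transmit(dup_acks: int, outstanding: int,
                              cwnd: int, rwnd: int) -> bool:
    """True if one new segment may be sent; all quantities in segments."""
    after_send = outstanding + 1
    return (
        dup_acks in (1, 2)            # first or second duplicate ack
        and after_send <= rwnd        # receiver's advertised window allows it
        and after_send <= cwnd + 2    # at most cwnd + 2 segments outstanding
    )


# cwnd = 3 with a full window outstanding, as in the small-window case above.
print(may_send_limited_transmit(dup_acks=1, outstanding=3, cwnd=3, rwnd=10))
print(may_send_limited_transmit(dup_acks=2, outstanding=4, cwnd=3, rwnd=10))
print(may_send_limited_transmit(dup_acks=3, outstanding=5, cwnd=3, rwnd=10))
```

The new segments sent on the first two duplicate acks trigger further acks from the receiver, which gives the sender a chance to reach the third duplicate ack and use fast retransmit instead of waiting for the RTO.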
45
Mice vs. Elephants
80% of the traffic is due to a small number of large, steady flows {elephants}.
The remaining traffic volume is due to many short-lived, bursty flows {mice}.
With TCP congestion control mechanisms, these short flows receive less than their fair share when they compete for the bottleneck bandwidth.
Elephants fill up router queues; when mice packets arrive (in bursts), they are dropped!
46
Another view: Buffer Size
Small buffers:
–small delays
–but packets often lost (dropped, due to bursts)
Large buffers:
–reduce number of packet drops (due to bursts)
–but increase delays: filled up by `elephants`
–So, it may not reduce drops so much!
Can we have the best of both worlds?
Can elephants and mice prosper in peace?
47
Proactive Packet Discard
Congestion management by proactive packet discard
–Before buffer is full
–Used on a single FIFO queue
–Or multiple queues for elastic traffic (but not here…)
–Random Early Detection (RED) [Floy97]
48
Random Early Discard (RED)
Basic premise:
–Router should signal congestion when the queue starts building up (slow elephants)
RED: signal congestion by dropping a packet
–Give flows time to reduce their sending rates before dropping more packets
–Don’t penalize burst traffic (mice)!
Therefore, packet drops should be:
–early: don’t wait for queue to overflow
–random: better slow an elephant than kill a mouse
49
Random Early Discard (RED): Motivation
If a burst fills the buffers, packets get discarded… (if TCP)
–Reduce window, probably: slow start
–Lost packets need to be resent: adds to load and delay
–Global synchronization:
  Traffic burst fills queues so packets are lost
  Many TCP connections enter slow start
  Traffic drops so network is underutilized
  Connections may leave slow start at the same time, causing a burst
Bigger buffers? High cost, limited returns, more delays
Idea: try to anticipate the onset of congestion and tell one connection (at a time) to slow down
–Classical RED (in [Stallings]): notify by dropping packets
–Or: ECN (Explicit Congestion Notification)
50
RED Design Goals
Congestion avoidance
Global synchronization avoidance
–Current systems inform (all) connections to back off implicitly by dropping packets
Avoidance of bias against bursty traffic
–Burst fills tail of queue; so `drop-tail` is bad
Bound on average queue length
–Hence control on average delay
51
RED: Random Early Detect/Discard
FIFO scheduling
Buffer management:
–Probabilistically discard packets
–Probability is computed as a function of average queue length (why average?)
[Plot: discard probability (0 to 1) vs. average queue length, with thresholds min_th and max_th marked on the queue_len axis]
52
RED
[Diagram: packet queue with thresholds TH_min and TH_max]
TH_min: average queue length threshold for triggering probabilistic drops/marks.
TH_max: average queue length threshold for triggering forced drops.
53
RED (cont’d)
TH_min – minimum threshold
TH_max – maximum threshold
avg – exponential average queue length
q – sample queue length
–avg = (1-w)*avg + w*q
–Forget factor w: 0 < w < 1
[Plot: discard probability (0 to 1) vs. average queue length, with TH_min and TH_max marked on the queue_len axis]
54
RED (cont’d)
TH_min – minimum threshold
TH_max – maximum threshold
avg[k] – average queue length (at period k)
q[k] – sample queue length (at period k)
–avg[k] = (1-w)*avg[k-1] + w*q[k]
–Forget factor w: 0 < w < 1
[Plot: discard probability (0 to 1) vs. average queue length, with TH_min and TH_max marked on the queue_len axis]
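A short sketch of the averaging step shows why RED reacts to sustained queues rather than to bursts. The default w = 0.002 is an assumed illustrative value, not one given on the slide, and the function name is hypothetical.

```python
def update_avg(avg: float, q: float, w: float = 0.002) -> float:
    """avg[k] = (1 - w) * avg[k-1] + w * q[k], with 0 < w < 1."""
    return (1 - w) * avg + w * q


# A short burst: the instantaneous queue jumps to 100 packets for 10 samples.
avg = 0.0
for _ in range(10):
    avg = update_avg(avg, 100)
print(round(avg, 2))
```

With a small forget factor the average barely moves during a brief burst, so mice are not penalized; only a queue that stays long pushes avg toward the thresholds.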
55
Average vs Instantaneous Queue Length
56
RED (cont’d)
If (avg < TH_min) enqueue packet
If (avg > TH_max) drop packet
Else discard with probability P_a
[Plot: discard probability (P_a, 0 to 1) vs. average queue length, with TH_min and TH_max marked on the queue_len axis]
57
RED Algorithm – Overview
Calculate average queue size avg
if avg < TH_min
    queue packet
else if TH_min ≤ avg < TH_max
    calculate probability P_a
    with probability P_a: discard packet
    else (with probability 1 - P_a): queue packet
else if avg ≥ TH_max
    discard packet
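The overview above can be turned into a small decision function. This sketch uses the simplified linear discard probability and takes the random source as a parameter so the behaviour can be checked deterministically; the names and the example thresholds are hypothetical.

```python
import random


def red_decision(avg: float, th_min: float, th_max: float, max_p: float,
                 rng=random.random) -> str:
    """Return "enqueue" or "discard" for one arriving packet."""
    if avg < th_min:
        return "enqueue"
    if avg >= th_max:
        return "discard"
    # Between the thresholds: discard probability grows linearly with avg.
    p_a = max_p * (avg - th_min) / (th_max - th_min)
    return "discard" if rng() < p_a else "enqueue"


# Hypothetical thresholds: TH_min = 5, TH_max = 15 packets, max_P = 0.1.
for avg in (3, 10, 20):
    print(avg, red_decision(avg, 5, 15, 0.1))
```

Only the middle case is random; below TH_min every packet is queued and at or above TH_max every packet is discarded.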
58
RED Buffer
59
RED: discard probability (simplified)
P_a = max_P*(avg – TH_min)/(TH_max – TH_min)
[Plot: discard probability (0 to 1) vs. average queue length; P_a at avg lies on the line rising from 0 at TH_min to max_P at TH_max]
60
RED: discard probability (real)
P_b = max_P*(avg – TH_min)/(TH_max – TH_min)
count: # of times we rolled `don’t discard` (since the last discard)
P_a = P_b/(1 - count*P_b)
[Plot: discard probability vs. average queue length, marking P_b (e.g. 0.2), max_P (e.g. 0.35), and P_a (e.g. 0.5, for count = 3)]
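The count correction can be checked against the example values on the slide (P_b = 0.2, max_P = 0.35, count = 3). In this sketch the thresholds and avg are hypothetical values chosen so that P_b comes out at 0.2, and the guard for a non-positive denominator is an added assumption.

```python
def red_probabilities(avg, th_min, th_max, max_p, count):
    """Return (P_b, P_a) for an average queue length between the thresholds."""
    p_b = max_p * (avg - th_min) / (th_max - th_min)
    denom = 1 - count * p_b
    # Assumption: once count * P_b reaches 1, the next packet is discarded.
    p_a = 1.0 if denom <= 0 else min(1.0, p_b / denom)
    return p_b, p_a


# TH_min = 5, TH_max = 12, avg = 9 gives P_b = 0.35 * 4/7 = 0.2.
p_b, p_a = red_probabilities(avg=9, th_min=5, th_max=12, max_p=0.35, count=3)
print(round(p_b, 3), round(p_a, 3))
```

The longer the run of packets that escaped discard, the higher P_a becomes, which spreads discards more evenly than independent coin flips with probability P_b.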
61
RED Algorithm Detail
62
RED Summary
Average queue length is kept small
–Reduces latency
–Yet: reserves room to absorb bursts
Does not always work well
–Better if we can avoid dropping packets…
–RED is random: may hit mice, too…
–Painful for short connections, e.g. Web traffic
ECN (Explicit Congestion Notification):
–Same as RED, but mark instead of drop
63
ECN Routers [RFC3168]
Explicit Congestion Notification (ECN) marks packets to signal congestion.
ECN works only if supported by router, sender and recipient
ECN-compliant TCP senders initiate their congestion avoidance algorithm after receiving marked ACK packets from the TCP receiver.
Packets from non-ECN flows are dropped (RED)
64
ECN Signalling
Two bits (in DS field of IP header)
Explicit congestion notification (ECN) bit:
–Set by ECN router (instead of drop in RED)
–If data packet has ECN bit set, then set it in ACK
ECN capable bit (for backward compatibility)
–Indicates that sender implements ECN
–To be ignored (and passed) by non-ECN router
65
Picture
[Plot labels: W, W/2, A, B]
66
ECN Advantages
No extra delay (to receive retransmit)
No wait for ack on retransmit (before shifting window to new packets)
–Most critical for short-lived flows (e.g. Web transfers)
No waste of bandwidth (retransmit)
Indication of corruption losses?
67
ECN/RED: (Simplified) Analysis
[KMK], mostly section 7.6.6
Simplifications:
–Single source
–Fixed delays: d = τ_f + τ_r + τ_q
–Rate of source at time t is r(t), buffer is x(t)
[Diagram: Client, Router, wide area link, Server; source rate r(t), router buffer x(t), link rate c bps, delays τ_f, τ_r, τ_q]
68
ECN/RED: (Simplified) Analysis
Dynamic system model:
[Diagram: Client, Router, wide area link, Server; source rate r(t), router buffer x(t), link rate c bps, delays τ_f, τ_r, τ_q]
69
Analysis: (more) Simplifications
Fixed delays: d = τ_f + τ_r + τ_q
Rate of source at time t is r(t) = w(t)/d
Sender in congestion avoidance (AIMD):
–Ack without ECN: w(t+) = w(t) + 1/w(t) [AddInc]
–Ack with ECN: w(t+) = w(t)/2 [MultDec]
Ack sent (immediately) for every packet
Simplified mark probability:
70
So far we have….
Equilibrium ?? r(t) = c, x(t) = x_0 s.t.:
71
Equilibrium ??
Equilibrium point: r(t) = c, x(t) = x_0 s.t.:
Sufficient condition:
72
END