Buffer requirements for TCP
Damon Wischik
www.wischik.com/damon
DARPA grant W911NF05-1-0254

Motivation
Internet routers have buffers,
–to accommodate bursts in traffic
–to keep utilization high
Large buffers are challenging:
–optical packet buffers are hard to build
–large electronic buffers are unsustainable, since data volumes double every 10–12 months [Odlyzko, 2003] and CPU speeds double every 18 months, but DRAM memory access speeds double every 10 years
How big do the buffers need to be, to accommodate TCP traffic?
–3 GByte? Rule of thumb says buffer = bandwidth×delay
–300 MByte? buffer = bandwidth×delay/√#flows [Appenzeller, Keslassy, McKeown, 2004]
–30 kByte? constant buffer size, independent of line rate [Kelly, Key, etc.]

History of TCP
Transmission Control Protocol
–1974: First draft of TCP/IP [“A protocol for packet network interconnection”, Vint Cerf and Robert Kahn]
–1983: ARPANET switches on TCP/IP
–1986: Congestion collapse
–1988: Congestion control for TCP [“Congestion avoidance and control”, Van Jacobson]
“A Brief History of the Internet”, the Internet Society

End-to-end congestion control
Internet congestion is controlled by the end-systems. The network operates as a dumb pipe. [“End-to-end arguments in system design” by Saltzer, Reed, Clark, 1981]
The TCP algorithm, running on the server, decides how fast to send data.
[Figure: User and Server (TCP), exchanging a request, data, and acknowledgements]

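TCP algorithm

    // Body of the sender's ACK handler; the enclosing function header is not
    // shown on the slide. seqno is the sequence number carried by the ACK.
    if (seqno > _last_acked) {                 // the ACK acknowledges new data
        if (!_in_fast_recovery) {              // normal case: grow the window
            _last_acked = seqno;
            _dupacks = 0;
            inflate_window();
            send_packets(now);
            _last_sent_time = now;
            return;
        }
        if (seqno < _recover) {                // partial ACK during fast recovery
            uint32_t new_data = seqno - _last_acked;
            _last_acked = seqno;
            if (new_data < _cwnd) _cwnd -= new_data;   // deflate by the amount acked
            else _cwnd = 0;
            _cwnd += _mss;
            retransmit_packet(now);
            send_packets(now);
            return;
        }
        // full ACK: leave fast recovery
        uint32_t flightsize = _highest_sent - seqno;
        _cwnd = min(_ssthresh, flightsize + _mss);
        _last_acked = seqno;
        _dupacks = 0;
        _in_fast_recovery = false;
        send_packets(now);
        return;
    }
    // From here on the ACK is a duplicate.
    if (_in_fast_recovery) {                   // each duplicate inflates the window
        _cwnd += _mss;
        send_packets(now);
        return;
    }
    _dupacks++;
    if (_dupacks != 3) {                       // act only on the third duplicate
        send_packets(now);
        return;
    }
    // Third duplicate ACK: halve the window and enter fast recovery.
    _ssthresh = max(_cwnd/2, (uint32_t)(2 * _mss));
    retransmit_packet(now);
    _cwnd = _ssthresh + 3 * _mss;
    _in_fast_recovery = true;
    _recover = _highest_sent;

[Figure: traffic rate (0-100 kB/sec) against time (0-8 sec)]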
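The sawtooth in the plot can be reproduced with a much cruder model than the code on the slide. The sketch below is a simplification, not the author's simulator: it tracks one flow's window once per round trip, adds one packet per round trip while there is no loss, and halves the window once it exceeds what the link can hold in flight plus what the buffer can queue. The function names and the per-round-trip granularity are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Window (in packets) of one AIMD flow at each round trip.
// bdp_pkts    : packets the link holds in flight (bandwidth x delay)
// buffer_pkts : packets the router buffer can queue
std::vector<double> aimd_sawtooth(double bdp_pkts, double buffer_pkts, int rounds) {
    assert(bdp_pkts > 0.0 && buffer_pkts >= 0.0 && rounds >= 0);
    std::vector<double> windows;
    double w = 1.0;
    for (int i = 0; i < rounds; ++i) {
        windows.push_back(w);
        if (w > bdp_pkts + buffer_pkts) {
            w /= 2.0;   // overflow, hence loss: multiplicative decrease
        } else {
            w += 1.0;   // no loss: additive increase
        }
    }
    return windows;
}

// Lowest link utilization seen after the first loss (1.0 if no loss occurs).
// Utilization in a round trip is taken to be min(window / bdp, 1).
double min_utilization_after_first_loss(double bdp_pkts, double buffer_pkts, int rounds) {
    std::vector<double> w = aimd_sawtooth(bdp_pkts, buffer_pkts, rounds);
    double lowest = 1.0;
    bool seen_loss = false;
    for (std::size_t i = 1; i < w.size(); ++i) {
        if (w[i] < w[i - 1]) seen_loss = true;   // the window dropped: a loss happened
        if (seen_loss) lowest = std::fmin(lowest, std::fmin(w[i] / bdp_pkts, 1.0));
    }
    return lowest;
}
```

In this model the window swings between (bdp + buffer)/2 and bdp + buffer, so the link stays busy after a loss only if the buffer is at least as large as bdp. With bdp_pkts = 100, a buffer of 100 packets keeps the minimum utilization at 1, while a buffer of 0 lets it fall to about one half.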

Buffer sizing I: TCP sawtooth
The traffic rate produced by a TCP flow follows a ‘sawtooth’
[Figure: rate against time]
To prevent the link from going idle, the buffer must be big enough to smooth out a sawtooth
–buffer = bandwidth×delay
But is this a worthwhile goal? Why not sacrifice a little utilization, so that virtually no buffer is needed?
When there are many TCP flows, the sawteeth should average out, so a smaller buffer is sufficient
–buffer = bandwidth×delay/√N, where N = number of TCP flows [Appenzeller, Keslassy, McKeown, 2004]
But what if they don’t average out, i.e. they remain synchronized?
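As a rough illustration of the two sizing rules on this slide, the following sketch evaluates buffer = bandwidth×delay and buffer = bandwidth×delay/√N. The function names and the example link used in the note below (10 Gbit/s, 250 ms round trip, 10,000 flows) are assumptions chosen for illustration and do not come from the slides.

```cpp
#include <cassert>
#include <cmath>

// Rule of thumb: buffer (bytes) = bandwidth (bit/s) x round trip delay (s) / 8.
double bdp_buffer_bytes(double bandwidth_bps, double rtt_sec) {
    assert(bandwidth_bps >= 0.0 && rtt_sec >= 0.0);
    return bandwidth_bps * rtt_sec / 8.0;
}

// Smaller buffer for many desynchronized flows:
// buffer (bytes) = bandwidth x delay / sqrt(N), where N = number of TCP flows.
double sqrt_n_buffer_bytes(double bandwidth_bps, double rtt_sec, double n_flows) {
    assert(n_flows >= 1.0);
    return bdp_buffer_bytes(bandwidth_bps, rtt_sec) / std::sqrt(n_flows);
}
```

For the assumed link, bdp_buffer_bytes(10e9, 0.25) comes to about 312.5 MByte, and sqrt_n_buffer_bytes(10e9, 0.25, 10000) to about 3.1 MByte, a hundredfold reduction.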

Buffer sizing II: TCP dynamics
–When the aggregate data rate is less than the service rate, the queue stays small
–No packet drops, so TCPs increase their data rate
–Eventually the aggregate data rate exceeds the service rate, and a queue starts to build up
–When the queue is full, packets start to get dropped
–One round trip time later, TCPs respond and cut back
–They may overreact, leading to synchronization, i.e. periodic fluctuations
Is it possible to keep the link slightly underutilized, so that the queue size remains small?
[Figure: aggregate data rate (with the service rate and drops marked) and queue size (0-160 pkt) against time (0-3.5 s)]

Buffer sizing III: packet burstiness
TCP traffic is made up of packets
–there may be packet clumps, if the access network is fast
–or the packets may be spaced out
Even if we manage to keep total data rate < service rate, packet-level bursts will still cause small, rapid fluctuations in queue size
If the buffer is small, these fluctuations will lead to packet loss
–which makes TCPs keep their data rates low
–which keeps the queue size small, and reduces the chance of synchronization
[Figure: packets and rate against time; aggregate data rate and queue size (0-16 pkt)]

Mathematical models for the queue
Over short timescales (<1ms), TCP traffic is approximately Poisson [“Internet traffic tends toward Poisson and independent as the load increases”, Cao, Cleveland, Lin, Sun, 2002]
With small buffers, queue fluctuations are rapid (average duration of a busy period is inversely proportional to line rate) [Cao, Ramanan, 2002]
Therefore, we model small-buffer routers using Poisson traffic models
Long-range dependence, self similarity, and heavy tails are irrelevant!
The packet loss probability in an M/D/1/B queue, at link utilization $\rho$, is roughly $p = \rho^B$
[Figure: queue size (0-16 pkt) against time (0-0.1 s); 500 flows, service rate 0.8Mbit/s (100pkt/s/flow), average RTT 120ms, buffer size 16pkt]
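To get a feel for the small-buffer loss approximation, the sketch below evaluates the rough rule that loss probability equals utilization raised to the power of the buffer size, and inverts it to find the smallest buffer that meets a target loss rate. The function names are hypothetical, and the rule is only the approximation quoted on the slide, not an exact M/D/1/B result.

```cpp
#include <cassert>
#include <cmath>

// Approximate packet loss probability for a buffer of buffer_pkts packets
// at link utilization rho: p = rho^B.
double loss_probability(double rho, double buffer_pkts) {
    assert(rho > 0.0 && rho <= 1.0 && buffer_pkts >= 0.0);
    return std::pow(rho, buffer_pkts);
}

// Smallest whole number of packets B such that rho^B <= target_loss.
int buffer_for_target_loss(double rho, double target_loss) {
    assert(rho > 0.0 && rho < 1.0);
    assert(target_loss > 0.0 && target_loss < 1.0);
    return static_cast<int>(std::ceil(std::log(target_loss) / std::log(rho)));
}
```

For example, under this approximation a 16-packet buffer at 50% utilization loses about 1 packet in 65,536, and holding loss to 1% at 90% utilization needs a buffer of 44 packets.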

Mathematical models for TCP
Over long timescales (>10ms), TCP traffic can be modelled using differential equations & control theory [Misra, Gong, Towsley, 2000]
Use this to
–calculate equilibrium link utilization (the “TCP throughput formula”)
–work out whether the system is synchronized or stable
–evaluate the effect of changing buffer size
–investigate alternatives to TCP
[Equation (not captured in this transcript), with annotated terms: average data rate at time t; pkt drop probability at time t-RTT (calculated from Poisson model for the queue); round trip time (“delay”); available bandwidth per flow]
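The differential equation on this slide is not preserved in the transcript. One commonly used fluid model that fits the annotated terms is dx/dt = 1/RTT² − p(t−RTT)·x(t−RTT)·x(t)/2, where x is the average data rate per flow and p is the drop probability; treating this as the slide's equation is an assumption. The sketch below integrates it with a forward-Euler step, taking p from the small-buffer approximation p = (x/C)^B, which is also an assumption here. All function and parameter names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed small-buffer loss model: p = (rate / capacity)^B, with utilization capped at 1.
double drop_probability(double rate, double capacity, double buffer_pkts) {
    double rho = std::fmin(rate / capacity, 1.0);
    return std::pow(rho, buffer_pkts);
}

// Assumed fluid model: dx/dt = 1/RTT^2 - p(t-RTT) * x(t-RTT) * x(t) / 2.
// Rates are in pkt/s per flow, rtt in seconds.
double rate_derivative(double x_now, double x_delayed, double rtt,
                       double capacity, double buffer_pkts) {
    double p_delayed = drop_probability(x_delayed, capacity, buffer_pkts);
    return 1.0 / (rtt * rtt) - p_delayed * x_delayed * x_now / 2.0;
}

// Forward-Euler integration of the delay differential equation.
// Returns the rate at times 0, dt, 2*dt, ... up to duration.
std::vector<double> simulate_rate(double x0, double rtt, double capacity,
                                  double buffer_pkts, double dt, double duration) {
    assert(dt > 0.0 && rtt >= dt && duration > 0.0);
    const std::size_t lag = static_cast<std::size_t>(std::lround(rtt / dt));
    const std::size_t steps = static_cast<std::size_t>(std::lround(duration / dt));
    std::vector<double> x(steps + 1, x0);
    for (std::size_t i = 0; i < steps; ++i) {
        // Before one RTT has elapsed, use the initial rate as the history.
        double delayed = (i >= lag) ? x[i - lag] : x0;
        double next = x[i] + dt * rate_derivative(x[i], delayed, rtt, capacity, buffer_pkts);
        x[i + 1] = std::fmax(next, 0.0);   // a data rate cannot be negative
    }
    return x;
}
```

A call such as simulate_rate(50.0, 0.12, 100.0, 16.0, 0.001, 10.0) borrows the 100 pkt/s per flow, 120 ms and 16 pkt figures from the queue simulation caption earlier in the deck; plotting the returned trace shows whether the rate settles or keeps oscillating for those parameters.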

link utilization  log 10 of pkt loss probability p TCP throughput formula C = bandwidth per flow RTT = average round trip time C RTT = TCP window size loss probability for M/D/1/B queue Instability plot

link utilization  log 10 of pkt loss probability p TCP throughput formula C = bandwidth per flow RTT = average round trip time C RTT = TCP window size loss probability for M/D/1/B queue extent of oscillations in  Instability plot

Instability plot
[Figure: three panels, for C RTT = 4 pkts, C RTT = 20 pkts, and C RTT = 100 pkts, each showing log10 of pkt loss probability p against link utilization $\rho$; B = 20 pkts]
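The equilibrium in these plots is where the TCP throughput curve crosses the queue's loss curve. The slides do not spell out the throughput formula, so the sketch below assumes the common form x = √2/(RTT·√p), with x the data rate per flow, together with the small-buffer approximation p = ρ^B and ρ = x/C. Solving the two together gives ρ = (√2/(C·RTT))^(2/(B+2)), capped at 1. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Equilibrium link utilization under two assumed models:
//   throughput: rho * window_pkts = sqrt(2 / p)   (window_pkts = C x RTT)
//   loss:       p = rho^buffer_pkts
// Eliminating p gives rho = (sqrt(2) / window_pkts)^(2 / (buffer_pkts + 2)).
double equilibrium_utilization(double window_pkts, double buffer_pkts) {
    assert(window_pkts > 0.0 && buffer_pkts >= 0.0);
    double rho = std::pow(std::sqrt(2.0) / window_pkts, 2.0 / (buffer_pkts + 2.0));
    return std::fmin(rho, 1.0);   // utilization cannot exceed 1
}
```

Under these assumptions, with B = 20 pkts the function gives roughly 0.91, 0.79 and 0.68 for C RTT = 4, 20 and 100 pkts, so flows with larger windows settle at lower utilization. It says nothing about whether that equilibrium is stable, which is what the oscillation extent in the plots addresses.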

Different buffer size / TCP
–Intermediate buffers: buffer = bandwidth×delay/√#flows
–or Large buffers: buffer = bandwidth×delay
–Large buffers with AQM: buffer = bandwidth×delay×{¼,1,4}
–Small buffers: buffer = {10,20,50} pkts
–Small buffers, ScalableTCP: buffer = {50,1000} pkts [Vinnicombe 2002, T.Kelly 2002]

Results
Small buffers [20-50 pkts]
–low-bandwidth flows [window <5pkts]: slightly synchronized; high utilization; low delay/jitter.
–high-bandwidth flows [window >10pkts]: desynchronized; moderate utilization; low delay/jitter.
Large buffers [>50ms queue delay]
–low-bandwidth flows [window <5pkts]: desynchronized; high utilization; high delay, low jitter.
–high-bandwidth flows [window >10pkts]: severely synchronized; high utilization; high delay/jitter.
There is a virtuous circle: desynchronization permits small buffers; small buffers induce desynchronization.

Limitations/concerns
Surely bottlenecks are at the access network, not the core network?
–Unwise to rely on this!
–The small-buffer theory works fine for as few as 20 flows
–If the core is underutilized, it definitely doesn’t need big buffers
The Poisson model sometimes breaks down
–because of short-timescale packet clumps (not LRD!)
–need more measurement of short-timescale Internet traffic statistics
Limited validation so far [McKeown et al. at Stanford, Level3, Internet2]
Proper validation needs
–goodly amount of traffic
–full measurement kit
–ability to control buffer size

Conclusion
Buffer sizes can be very small
–a buffer of 25pkt gives link utilization > 90%
–small buffers mean that TCP flows get better feedback, so they can better judge how much capacity is available
–use Poisson traffic models for the router, differential equation models for aggregate traffic
–Further questions: What is the impact of packet ‘clumpiness’? How well would non-FIFO buffers work?
TCP can be improved with simple changes
–e.g. spacing out TCP packets
–e.g. modify the window increase/decrease rules [ScalableTCP: Vinnicombe 2004, Kelly 2004; XCP: Katabi, Handley, Rohrs 2000]
–any future transport protocol should be designed along these lines
–improved TCP may find its way into Linux/Windows within 5 years
