Queueing theory for TCP
Damon Wischik, University College London
2 Some Internet history
1974: First draft of TCP/IP [“A protocol for packet network interconnection”, Vint Cerf and Robert Kahn]
1983: ARPANET switches on TCP/IP
1986: Congestion collapse
1988: Congestion control for TCP [“Congestion avoidance and control”, Van Jacobson]
3 TCP: transmission control protocol
Internet congestion is controlled by the end-systems. The TCP algorithm, running on the server, decides how fast to send data.
– It sends data packets, and waits for ACKnowledgement packets
– It sets a target window size w_t, the number of packets that have been sent but not yet acknowledged (here 5+3)
– The average transmission rate is x_t ≈ w_t/RTT, where RTT is the round trip time
– It adapts w_t based on the ACKs it receives
[Figure: the server (TCP) sends data to the user; the user returns acknowledgements]
4 TCP algorithm
    if (seqno > _last_acked) {                  // ACK for new data
        if (!_in_fast_recovery) {               // normal operation: grow the window
            _last_acked = seqno;
            _dupacks = 0;
            inflate_window();
            send_packets(now);
            _last_sent_time = now;
            return;
        }
        if (seqno < _recover) {                 // partial ACK during fast recovery
            uint32_t new_data = seqno - _last_acked;
            _last_acked = seqno;
            if (new_data < _cwnd) _cwnd -= new_data; else _cwnd = 0;
            _cwnd += _mss;
            retransmit_packet(now);
            send_packets(now);
            return;
        }
        // full ACK: leave fast recovery
        uint32_t flightsize = _highest_sent - seqno;
        _cwnd = min(_ssthresh, flightsize + _mss);
        _last_acked = seqno;
        _dupacks = 0;
        _in_fast_recovery = false;
        send_packets(now);
        return;
    }
    if (_in_fast_recovery) {                    // duplicate ACK during fast recovery
        _cwnd += _mss;
        send_packets(now);
        return;
    }
    _dupacks++;
    if (_dupacks != 3) {                        // not yet the third duplicate ACK
        send_packets(now);
        return;
    }
    // third duplicate ACK: halve the threshold, retransmit, enter fast recovery
    _ssthresh = max(_cwnd/2, (uint32_t)(2 * _mss));
    retransmit_packet(now);
    _cwnd = _ssthresh + 3 * _mss;
    _in_fast_recovery = true;
    _recover = _highest_sent;
    }                                           // closes the enclosing function, whose header is not shown
[Figure: transmission rate (0–100 kB/sec) against time (0–8 sec)]
5 Models for TCP
For a single TCP "sawtooth", one can explicitly calculate the mean transmission rate etc.
– Consider a single link, and a single flow with round trip time RTT
– When the link is not congested, the transmission rate x_t increases by 1/RTT² pkt/sec every second
– When the link becomes congested and the queue fills, packets are dropped and TCP cuts its transmission rate by 50%
[Figure: transmission rate x(t) against time, with the service capacity marked, and queue size against time, with the buffer size marked]
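The sawtooth calculation can be sketched numerically. This is a minimal illustration rather than the speaker's own calculation. It assumes the rate is halved the instant it reaches a given peak rate (standing in for the moment the queue fills) and ignores queueing delay. The function name `sawtooth_mean_rate` and its parameters are hypothetical.

```cpp
#include <cassert>

// Mean transmission rate (pkt/sec) of an idealised TCP sawtooth.
// Assumptions (not from the slides): the rate climbs linearly at
// 1/RTT^2 pkt/sec per second and is halved the instant it reaches
// peak_rate; queueing delay is ignored.
double sawtooth_mean_rate(double peak_rate, double rtt,
                          int cycles = 100, double dt = 1e-4) {
    assert(peak_rate > 0 && rtt > 0 && cycles > 0 && dt > 0);
    const double slope = 1.0 / (rtt * rtt);  // additive increase, pkt/sec^2
    double x = peak_rate / 2.0;              // start just after a rate cut
    double sent = 0.0;                       // packets sent so far
    double elapsed = 0.0;                    // simulated time, sec
    int drops = 0;
    while (drops < cycles) {
        sent += x * dt;
        elapsed += dt;
        x += slope * dt;
        if (x >= peak_rate) {                // congestion: cut the rate by 50%
            x = peak_rate / 2.0;
            ++drops;
        }
    }
    return sent / elapsed;
}
```

Because the rate ramps linearly from half the peak up to the peak, the mean should be close to three quarters of the peak: `sawtooth_mean_rate(100.0, 0.1)` should return roughly 75 pkt/sec.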
6 Models for TCP
For a single TCP "sawtooth", one can explicitly calculate the mean transmission rate etc.
With a few TCP flows, use a "billiards model": random matrices describe the evolution of the system from one congestion instant to the next. Study the system's stability, fairness, convergence properties. [Baccelli+Hong 2002; Leith+Shorten 2005]
When there are many TCP flows, their aggregate behaviour can be modelled as a dynamical system, and analysed using control theory. [Kelly+Maulloo+Tan 1998; Misra+Gong+Towsley 2000; Baccelli+McDonald+Reynier 2002]
7 Models for TCP
Where does queueing theory fit in?
– What should the buffer size be?
– Whose packets get dropped?
– What's the packet drop probability?
8 Simulation results
[Figure: queue size (0–100%), drop probability (0–30%) and traffic intensity (0–100%) against time (0–5 s), for small, intermediate and large buffers]
9 Simulation results
When we vary the buffer size, the system behaviour changes dramatically. Why? How?
Answer: we need careful stochastic analysis of what the queue is doing, to understand this.
10 Formulate a maths question
What is the limiting queue-length process, as N → ∞, in the three regimes
– large buffers, B^(N) = BN
– intermediate buffers, B^(N) = B√N
– small buffers, B^(N) = B
Why are these the interesting limiting regimes? What are the alternatives?
[Figure: N TCP flows share a queue with service rate NC and buffer size B^(N); round trip time RTT]
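The three scalings can be written down directly. A minimal sketch; the names `BufferRegime` and `buffer_size` are hypothetical and only restate the formulas on the slide.

```cpp
#include <cassert>
#include <cmath>

// The three buffer-sizing regimes from the slide.
enum class BufferRegime { Small, Intermediate, Large };

// Buffer size B^(N) in packets for N flows, given the constant B.
double buffer_size(BufferRegime regime, double b, double n_flows) {
    assert(b > 0 && n_flows >= 1);
    switch (regime) {
        case BufferRegime::Small:        return b;                       // B
        case BufferRegime::Intermediate: return b * std::sqrt(n_flows);  // B*sqrt(N)
        case BufferRegime::Large:        return b * n_flows;             // B*N
    }
    return b;  // not reached; keeps compilers from warning
}
```

With B = 3 and N = 100 flows, the three regimes give buffers of 3, 30 and 300 packets, so the gap between them widens quickly as N grows.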
11 Intermediate buffers B^(N) = B√N
Consider a standalone queue
– fed by N Poisson flows, each of rate x pkts/sec where x varies slowly, served at constant rate NC, with buffer size B√N
– i.e. an M_{Nx}/D_{NC}/1/B√N queue
– x_t = 0.95 then 1.05 pkts/sec, C = 1 pkt/sec, B = 3 pkt
Observe
– queue size fluctuates very rapidly; busy periods have duration O(1/N), so packet drop probability p_t is a function of instantaneous arrival rate x_t
– queue size flips quickly from near-empty to near-full, in time O(1/√N)
– packet drop probability is p_t = (x_t − C)^+ / x_t
[Figure: queue size and traffic intensity against time, for N = 50, 100, 500 and 1000]
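The open-loop queue above can be simulated in a few lines. This is a rough sketch, not the simulator used in the talk. It assumes the queue is tracked as unfinished work in packets and drains continuously at rate NC, and that an arriving packet is dropped if it would overflow the buffer. The function name `drop_fraction` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Fraction of packets dropped at a finite-buffer queue with Poisson
// arrivals of total rate N*x, constant service rate N*c and buffer
// b*sqrt(N). Assumption: the queue is tracked as unfinished work (in
// packets) that drains continuously between arrivals.
double drop_fraction(int n_flows, double x, double c, double b,
                     long n_arrivals = 1000000, unsigned seed = 1) {
    assert(n_flows > 0 && x > 0 && c > 0 && b > 0 && n_arrivals > 0);
    const double arrival_rate = n_flows * x;
    const double service_rate = n_flows * c;
    const double buffer = b * std::sqrt(static_cast<double>(n_flows));
    std::mt19937_64 rng(seed);
    std::exponential_distribution<double> gap(arrival_rate);  // inter-arrival time
    double queue = 0.0;
    long dropped = 0;
    for (long i = 0; i < n_arrivals; ++i) {
        // Drain the queue for the time since the previous arrival.
        queue = std::max(0.0, queue - service_rate * gap(rng));
        if (queue + 1.0 > buffer) {
            ++dropped;     // no room for this packet
        } else {
            queue += 1.0;  // accept the packet
        }
    }
    return static_cast<double>(dropped) / n_arrivals;
}
```

With the slide's values C = 1 and B = 3, the formula p = (x − C)^+ / x gives 0.05/1.05 ≈ 0.048 at x = 1.05 and zero at x = 0.95. For large N, `drop_fraction(1000, 1.05, 1.0, 3.0)` should land near the first value and `drop_fraction(1000, 0.95, 1.0, 3.0)` near the second.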
12 Closing the loop
In the open-loop queue
– we assumed Poisson inputs with slowly varying arrival rate
– queue size flips quickly from near-empty to near-full
– queue size fluctuates very rapidly, with busy periods of duration O(1/N), so drop probability p_t is a function of instantaneous arrival rate x_t
In the closed-loop TCP system
– the round trip time means that we can treat inputs over time [t, t+RTT] as an exogenous arrival process to the queue
– the average arrival rate should vary slowly, according to a differential equation
– the aggregate of many point processes converges to a Poisson process, and this carries through to queue size [Cao+Ramanan 2002]
There is thus a separation of timescales
– queue size fluctuates over short timescales, and we obtain p_t from the open-loop M/D/1/B^(N) model
– TCP adapts its rate over long timescales, according to the fluid model differential equation
13 Intermediate buffers B^(N) = B√N
[Figure: queue size (0–619 pkt), drop probability (0–30%) and traffic intensity (0–100%) against time (0–5 s), with a 50 ms close-up of queue size (594–619 pkt)]
Simulation setup: 2 Gb/s link, C = 50 pkt/sec/flow; 5000 TCP flows; RTT 150–200 ms; buffer 619 pkt; simulator htsim by Mark Handley, UCL
14 Instability and synchronization
– When the queue is not full, there is no packet loss, so transmission rates increase additively
– When the queue becomes full, there are short periods of concentrated packet loss
– This makes the flows become synchronized
This is the billiards model for TCP.
[Figure: queue size (0–619 pkt) and TCP window sizes (0–25 pkt) against time (0–5 s)]
15 Large buffers B^(N) = BN
[Figure: queue size (0–BN pkt), drop probability (0–30%) and traffic intensity (0–100%) against time (0–5 s), with a 50 ms close-up of queue size ((B−25)–B pkt)]
When the queue jitters around full, queue size fluctuations are O(1), invisible on the O(N) scale. The total arrival rate is N x_t and the service rate is NC, so it takes time O(1) for the buffer to fill or drain, and we can calculate the packet drop probability as a function of excess arrival rate. This induces a differential equation for queue size.
16 Small buffers B^(N) = B
[Figure: queue size (0–BN pkt), drop probability (0–30%) and traffic intensity (0–100%) against time (0–5 s), with a 50 ms close-up of queue size ((B−25)–B pkt)]
Queue size fluctuations are of size O(1), over timescale O(1/N), enough to fill or drain the buffer. The queue size changes too quickly to be controlled, but the queue size distribution is controlled by the average arrival rate x_t.
17 Modelling conclusion
– Sawtooth model is good for a single flow. It also looks much like the large-buffer many-flows limit.
– Billiards model is good for a few flows. It also looks much like the intermediate-buffer many-flows limit.
– Dynamical systems model is appropriate when there are many flows. There are different differential equations, depending on the buffer size.
18 Dynamical system summary
x_t = average traffic rate at time t
p_t = packet loss probability at time t
C = capacity/flow
B = buffer size
RTT = round trip time
N = # flows
[Equations: TCP traffic model; small buffers; intermediate buffers; large buffers]
19 Dynamical system analysis
For some parameter values the dynamical system is stable
– we can calculate the steady-state traffic intensity, loss probability etc.
For others it is unstable and there are oscillations
– we can calculate the amplitude of the oscillations, either by numerical integration or by algebraic approximation [Gaurav Raina, PhD thesis, 2005]
[Figure: traffic intensity x_t/C against time]
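Numerical integration of such a system can be sketched as follows. The differential equation here is an assumption, not taken from the slides: dx/dt = 1/RTT² − p(x(t−RTT)) · x(t−RTT) · x(t)/2. It combines the additive increase of 1/RTT² with a halving of the rate for each drop, where drops are signalled one round trip late. The drop probability is the intermediate-buffer formula p = (x − C)^+ / x. All function names are hypothetical, and forward Euler is chosen only for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Drop probability for intermediate buffers: p = (x - C)^+ / x.
double drop_probability(double x, double c) {
    return x > c ? (x - c) / x : 0.0;
}

// Forward-Euler integration of the ASSUMED delayed fluid model
//   dx/dt = 1/RTT^2 - p(x(t-RTT)) * x(t-RTT) * x(t) / 2.
// Returns x sampled every dt, with x held at x0 for all t <= 0.
std::vector<double> integrate_fluid_model(double c, double rtt, double x0,
                                          double t_end, double dt = 1e-3) {
    assert(c > 0 && rtt > 0 && x0 > 0 && t_end > 0 && dt > 0 && dt < rtt);
    const std::size_t lag = static_cast<std::size_t>(rtt / dt + 0.5);
    const std::size_t steps = static_cast<std::size_t>(t_end / dt);
    std::vector<double> x(steps + 1, x0);
    for (std::size_t i = 0; i < steps; ++i) {
        const double delayed = (i >= lag) ? x[i - lag] : x0;  // x(t - RTT)
        const double increase = 1.0 / (rtt * rtt);
        const double decrease =
            drop_probability(delayed, c) * delayed * x[i] / 2.0;
        x[i + 1] = x[i] + dt * (increase - decrease);
    }
    return x;
}

// Peak-to-trough amplitude of x over its final `tail` samples.
double tail_amplitude(const std::vector<double>& x, std::size_t tail) {
    assert(tail > 0 && tail <= x.size());
    const auto range = std::minmax_element(
        x.end() - static_cast<std::ptrdiff_t>(tail), x.end());
    return *range.second - *range.first;
}
```

For example, `integrate_fluid_model(50.0, 0.15, 25.0, 20.0)` uses C = 50 pkt/sec/flow and RTT = 150 ms. Passing the result to `tail_amplitude` with a tail of a few seconds measures the oscillation amplitude, if any, once the initial transient has passed. A linearisation around the equilibrium of this assumed model suggests these parameters are in the unstable regime.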
20 Instability & buffer size
On the basis of this analysis, we claimed
– a buffer of 30–50 packets should be sufficient
– intermediate buffers will make the system become synchronized, and utilization will suffer
– large buffers get high utilization, but cause synchronization and delay
Other people ran simulations and found
– small buffers lead to terribly low utilization
– intermediate buffers are much better
[Figure: queue size distribution (0–100%) and traffic intensity distribution (85–110%, with ρ=1 marked) for B = 10, 20, 30, 40, 50 and 100 pkt. Plot by Darrell Newman, Cambridge]
21 Burstiness
The problem is that the dynamical system captures TCP's rate x_t but not its burstiness
– with a fast access link, packets are sent back-to-back in bursts
– with a slow access link, packets are spaced out
Simulations of a system with a single bottleneck link with capacity NC, where each flow is throttled by a slow access link, confirm that this matters.
[Figure: simulation results for access speed = 5C and 40C]
22 Burstiness models
The original queueing model needs to be updated to incorporate burstiness
– our Poisson approximation, from the Cao+Ramanan limit theorem, only works well when each flow contributes at most one packet to the queue per busy cycle
– we can estimate how fast the access links need to be for this to break down; when it does we use e.g. a batch-Poisson traffic model, plus large deviations estimates for queues with small buffers [James Cruise 2007]
Burstiness leads to low utilization when buffers are small
– this is because the drop probability p(x_t) is higher for burstier traffic
Burstiness cuts down synchronization problems in intermediate-size buffers
– burstiness makes the drop probability p(x_t) become a smoother function of x_t, and this in turn makes the dynamical system more stable
[Figure: simulation results for access speed = 5C and 40C]
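The effect of burstiness on a small buffer can be illustrated with a batch-Poisson arrival stream. This is a rough sketch and not the large deviations analysis cited above. It assumes batches of a fixed size arrive as a Poisson process, all packets in a batch arrive at the same instant, and the queue is served at 1 packet per unit time. The function name `batch_drop_fraction` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Fraction of packets dropped when batches of `batch` back-to-back
// packets arrive as a Poisson process at a queue served at 1 packet
// per unit time. `load` is the mean packet arrival rate (the traffic
// intensity) and `buffer` is the buffer size in packets.
// Setting batch = 1 gives plain Poisson arrivals.
double batch_drop_fraction(double load, int buffer, int batch,
                           long n_packets = 1000000, unsigned seed = 1) {
    assert(load > 0 && buffer > 0 && batch > 0 && n_packets >= batch);
    std::mt19937_64 rng(seed);
    // Batches arrive at rate load/batch so the packet rate stays at `load`.
    std::exponential_distribution<double> gap(load / batch);
    double queue = 0.0;  // unfinished work, in packets
    long offered = 0;
    long dropped = 0;
    while (offered < n_packets) {
        queue = std::max(0.0, queue - gap(rng));  // drain since the last batch
        for (int k = 0; k < batch; ++k, ++offered) {
            if (queue + 1.0 > buffer) {
                ++dropped;
            } else {
                queue += 1.0;
            }
        }
    }
    return static_cast<double>(dropped) / offered;
}
```

Comparing `batch_drop_fraction(0.9, 20, 1)` with `batch_drop_fraction(0.9, 20, 5)` holds the traffic intensity and buffer fixed and changes only the burstiness. The bursty stream should show a noticeably higher drop fraction.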
23 Conclusion
When we model TCP using dynamical systems, we need qualitatively different models depending on the buffer size. We need quantitatively different models, depending on fine-grained stochastic analysis of traffic & queueing.
In queueing theory we often choose arbitrary ways to scale the system, to obtain interesting stochastic outcomes. Here, TCP forces our hand, through its adaptive feedback mechanism.