Open Issues in Router Buffer Sizing
Amogh Dhamdhere, Constantine Dovrolis
College of Computing, Georgia Tech
Outline
- Motivation and previous work
- The Stanford model for buffer sizing
- Important issues in buffer sizing
- Simulation results for the Stanford model
- Summary and future work
9/20/2018 NTG Seminar
Motivation
- Router buffers are crucial elements of packet networks
  - Absorb rate variations of incoming traffic
  - Prevent packet losses during traffic bursts
- Increasing the router buffer size:
  - Can increase link utilization (especially with TCP traffic)
  - Can decrease packet loss rate
  - Can also increase queuing delays
- So the million-dollar question is: how much buffer should a router have?
Motivation (cont’d)
- Some recent results suggest that small buffers are sufficient to achieve full utilization
  - The loss rate is not considered!
- Other results propose larger buffers to achieve full utilization and a bounded loss rate
- Why these contradictory results?
  - Different assumptions, applicability of these models?
- Is there a single answer to the buffer sizing problem? NO!
- Is that answer a very small buffer?
Previous work
- Approaches based on queuing theory (e.g. M|M|1|B)
  - Assume a certain input traffic model, service model and buffer size
  - Loss probability for the M|M|1|B system is given by p = ρ^B (1 - ρ) / (1 - ρ^(B+1))
  - TCP is not open-loop; TCP flows react to congestion
  - There is no universally accepted Internet traffic model
- Morris’ Flow Proportional Queuing (Infocom ’00)
  - Proposed a buffer size proportional to the number of active TCP flows (B = 6*N)
  - Did not specify which flows to count in N
  - Objective: limit loss rate
  - High loss rate causes unfairness and poor application performance
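
The M|M|1|B loss formula quoted above can be evaluated directly. The following is a minimal sketch, not part of the original slides: the function name, the load of 0.95 and the buffer sizes are illustrative assumptions, and the result only holds under the open-loop Poisson assumptions that the slide itself questions for TCP traffic.

```python
def mm1b_loss_probability(rho: float, buffer_size: int) -> float:
    """Loss (blocking) probability of an M|M|1|B queue.

    rho is the offered load (arrival rate / service rate) and
    buffer_size is B, the number of packets the system can hold.
    """
    if rho < 0 or buffer_size < 0:
        raise ValueError("rho and buffer_size must be non-negative")
    if abs(rho - 1.0) < 1e-12:
        # Limit of the formula as rho -> 1 (the expression is 0/0 at rho == 1).
        return 1.0 / (buffer_size + 1)
    return rho ** buffer_size * (1.0 - rho) / (1.0 - rho ** (buffer_size + 1))


# Hypothetical load of 0.95; buffer sizes chosen only for illustration.
for b in (10, 18, 250):
    print(b, mm1b_loss_probability(0.95, b))
```

Under this model the loss probability falls geometrically with B for ρ < 1, which is one reason open-loop models tend to favor small buffers.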
Previous work (cont’d)
- BSCL (Dhamdhere et al., Infocom 2005)
  - Proposed a buffer sizing formula to achieve full utilization and a bounded loss rate
  - Applicable to congested edge links
  - Proposes a buffer proportional to the number of active large flows
  - Can lead to a large queuing delay!
Stanford Model - Appenzeller et al.
- Objective: find the minimum buffer size to achieve full utilization of the target link
- Assumptions:
  - Most traffic is from “long” TCP flows
  - Long flows are in congestion avoidance for most of their lifetime (follow the TCP throughput equation)
  - The number of flows is large enough that flows are independent and unsynchronized
  - Aggregate window size distribution tends to normal
  - Queue size distribution also tends to normal
Stanford Model (cont’d)
- Buffer for full utilization is given by B = CT / √N
  - N is the number of “long” flows at the link
  - CT: bandwidth-delay product
- If the link has only short flows, buffer size depends only on offered load and average flow size
  - Flow size determines the size of bursts during slow start
- For a mix of short and long flows, buffer size is determined by the number of long flows
  - Small flows do not have a significant impact on buffer sizing
- Resulting buffer can achieve full utilization of the target link
- Loss rate at the target link is not taken into account
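
As a quick check of B = CT / √N, the sketch below evaluates the formula for the two parameter sets used elsewhere in this talk (CT=200, N=400 and CT=250, N=200). The function name is an assumption made for illustration; only the formula itself comes from the slides.

```python
import math


def stanford_buffer(bdp_packets: float, n_long_flows: int) -> float:
    """Buffer size B = CT / sqrt(N), in packets.

    bdp_packets is the bandwidth-delay product CT in packets and
    n_long_flows is N, the number of "long" TCP flows at the link.
    """
    if n_long_flows < 1:
        raise ValueError("the formula needs at least one long flow")
    return bdp_packets / math.sqrt(n_long_flows)


# Parameter sets taken from the examples in this talk.
print(stanford_buffer(200, 400))
print(stanford_buffer(250, 200))
```

Because B shrinks with √N, doubling the number of long flows reduces the buffer by only about 29%, so very small buffers require a very large N.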
Stanford Model (cont’d)
- More recent results (Wischik, McKeown et al. ’05)
  - Sacrifice some utilization to make buffers smaller
    - Of the order of 20 packets
  - If TCP sources are “paced”, even smaller buffers are sufficient
    - O(log W), where W is the TCP window size
    - Pacing makes the sources less bursty
    - Pacing can occur automatically, due to slow access links and fast backbone links
- Don’t want to sound like a broken record, but… WHAT ABOUT THE LOSS RATE?
What are the objectives?
- Network layer vs. application layer objectives
  - Network’s perspective: utilization, loss rate, queuing delay
  - User’s perspective: per-flow throughput, fairness, etc.
- Stanford Model: focus on utilization & queuing delay
  - Can lead to high loss rate (> 10% in some cases)
- BSCL (Infocom ’05): both utilization and loss rate
  - Can lead to large queuing delay
- Buffer sizing scheme that bounds queuing delay
  - Can lead to high loss rate and low utilization
- No single buffer size can meet all objectives
  - Which problem should we try to solve?
Saturable/congestible links
- A link is saturable when the offered load is sufficient to fully utilize it, given a large enough buffer
- A link may not be saturable at all times
- Some links may never be saturable
  - Flows limited by the advertised window, by other bottlenecks, or by their size
- Small buffers are sufficient for non-saturable links
  - Only needed to absorb short-term traffic bursts
- Stanford model is targeted at backbone links
  - Backbone links are usually not saturable due to over-provisioning
- Edge links are more likely to be saturable
  - But N may not be large for such links
  - Stanford model requires large N
Which flows to count?
- N: number of “long” flows at the link
  - “Long” flows show TCP’s saw-tooth behavior
  - “Short” flows do not exit slow start
- Does size matter?
  - Size does not indicate slow start or congestion avoidance behavior
  - If no congestion, even large flows do not exit slow start
  - If highly congested, small flows can enter congestion avoidance
- Should the following flows be included in N?
  - Flows limited by congestion at other links
  - Flows limited by sender/receiver socket buffer size
- N varies with time. Which value should we use?
  - Min? Max? Time average?
Which traffic model to use?
- Traffic model has major implications on buffer sizing
- Early work considered traffic as an exogenous process
  - Not realistic: the offered load due to TCP flows depends on network conditions
- Stanford model considers mostly persistent connections
  - No ambiguity about the number of “long” flows (N)
  - N is time-invariant
- In practice, TCP connections have finite size and duration, and N varies with time
- Open-loop vs closed-loop flow arrivals
Traffic model (cont’d)
- Open-loop TCP traffic:
  - Flows arrive randomly with average size S, average rate λ
  - Offered load λS, link capacity C
  - Offered load is independent of system state (delay, loss)
  - The system is unstable if λS > C
- Closed-loop TCP traffic:
  - Each user starts a new transfer only after the completion of the previous transfer
  - Random think time between consecutive transfers
  - Offered load depends on system state
  - The system can never be unstable
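
The contrast between the two models can be sketched numerically. The code below is an illustration under stated assumptions, not the simulation model used in this talk. The open-loop load is the λS / C from the slide. The closed-loop expression assumes each user alternates between one transfer and one think period, so it offers one flow per (think time + transfer time); all function and parameter names are hypothetical.

```python
def open_loop_load(arrival_rate: float, mean_size: float, capacity: float) -> float:
    """Normalized offered load lambda * S / C; values above 1 mean instability.

    arrival_rate is in flows/s, mean_size in packets, capacity in packets/s.
    """
    return arrival_rate * mean_size / capacity


def closed_loop_load(users: int, mean_size: float, think_time: float,
                     transfer_time: float, capacity: float) -> float:
    """Normalized offered load when each user cycles transfer -> think -> transfer.

    Assumption: one flow of mean_size packets per (think_time + transfer_time)
    seconds per user. transfer_time depends on the system state (delay, loss).
    """
    return users * mean_size / (think_time + transfer_time) / capacity


# Open loop: the load ignores congestion, so it can exceed capacity.
print(open_loop_load(arrival_rate=120.0, mean_size=10.0, capacity=1000.0))

# Closed loop: as congestion stretches transfers, the offered load drops.
for transfer_time in (1.0, 5.0, 20.0):
    print(transfer_time, closed_loop_load(10, 100.0, 5.0, transfer_time, 100.0))
```

The self-throttling in the second function is the sense in which a closed-loop system "can never be unstable": slower transfers delay the next arrivals.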
Why worry about loss rate?
- The Stanford model gives a very small buffer if N is large
  - E.g., CT=200 packets, N=400 flows: B=10 packets
- What is the loss rate with such a small buffer size?
- Per-flow throughput and transfer latency?
  - Compare with BDP-based buffer sizing
  - Distinguish between large and small flows
- Small flows that do not see losses: limited only by RTT
  - Flow size: k segments
- Large flows depend on both losses & RTT
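
The two flow classes above can be illustrated with textbook approximations. These are assumptions for illustration and not necessarily the expressions used on the original slide: a loss-free small flow is modeled as pure slow start (initial window of one segment, window doubling every RTT, no delayed ACKs), and a large flow uses the well-known square-root approximation of TCP throughput, (MSS / RTT) · sqrt(3 / (2p)). All names and numeric inputs are hypothetical.

```python
import math


def slow_start_latency(k_segments: int, rtt: float) -> float:
    """Transfer latency (seconds) of a loss-free flow of k segments in slow start.

    Assumes an initial window of 1 segment that doubles every RTT, so r rounds
    deliver 2**r - 1 segments and r = ceil(log2(k + 1)) rounds are needed.
    """
    if k_segments < 1:
        raise ValueError("flow must contain at least one segment")
    rounds = k_segments.bit_length()  # equals ceil(log2(k + 1)) for k >= 1
    return rounds * rtt


def sqrt_p_throughput(mss_bytes: float, rtt: float, loss_rate: float) -> float:
    """Square-root approximation of long-flow TCP throughput, in bytes/s."""
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes / rtt) * math.sqrt(1.5 / loss_rate)


# Hypothetical inputs: a 15-segment flow and a long flow, both with RTT = 100 ms.
print(slow_start_latency(15, 0.1))
print(sqrt_p_throughput(1500, 0.1, 0.08))
print(sqrt_p_throughput(1500, 0.1, 0.01))
```

In this sketch the small flow's latency depends only on the RTT, while the large flow's throughput degrades with both the RTT and the square root of the loss rate.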
Simulation setup
- Use ns-2 simulations to study the effect of buffer size on loss rate for different traffic models
  - Heterogeneous RTTs (20 ms to 530 ms)
  - TCP NewReno with SACK option
  - BDP = 250 packets (1500 B each)
- Model-1: persistent flows + mice
  - 200 “infinite” connections, active for the whole simulation duration
  - Mice flows: 5% of capacity, size between 3 and 25 packets, exponential inter-arrivals
Simulation setup (cont’d)
- Flow size distribution for finite-size flows:
  - Sum of 3 exponential distributions: small files (avg. 15 packets), medium files (avg. 50 packets) and large files (avg. 200 packets)
  - 70% of total bytes come from the largest 30% of flows
- Model-2: closed-loop traffic
  - 675 source agents
  - Think time exponentially distributed with average 5 s
  - Time average of 200 flows in congestion avoidance
- Model-3: open-loop traffic
  - Exponentially distributed flow inter-arrival times
  - Offered load is 95% of link capacity
Simulation results – Loss rate
- CT=250 packets, N=200 for all traffic types
  - Stanford model gives a buffer of 18 packets
- High loss rate with Stanford buffer
  - Greater than 10% for open-loop traffic
  - 7-8% for persistent and closed-loop traffic
- Increasing the buffer to BDP or a small multiple of BDP can significantly decrease loss rate
Why the different loss rate trends?
- Open-loop traffic:
  - The offered load does not depend on the buffer size
  - Possible to decrease loss rate to zero with sufficient buffer size
  - Loss rate decreases quickly with buffer size
- Closed-loop traffic:
  - Larger buffer leads to smaller loss rate, flows complete faster, and new flows arrive faster
  - Loss rate decreases slowly with buffer size
Per-flow throughput
- Transfer latency = flow-size / flow-throughput
- Flow throughput depends on both loss rate and queuing delay
  - Loss rate decreases with buffer size (good)
  - Queuing delay increases with buffer size (bad)
- Major tradeoff: should we have low loss rate or low queuing delay?
- Answer depends on various factors
  - Which flows are considered: long or short?
  - Which traffic model is considered?
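
The tradeoff can be made concrete with a small sketch. It combines the slide's definition (transfer latency = flow size / flow throughput) with two assumptions that are not from the slides: the queuing delay is bounded by B / C for a full buffer, and long-flow throughput follows the square-root approximation sqrt(3 / (2p)) / RTT in packets per second. The loss rates and link parameters below are hypothetical inputs, not the simulation results of this talk.

```python
import math


def max_queuing_delay(buffer_packets: float, capacity_pps: float) -> float:
    """Worst-case queuing delay (seconds) of a full buffer drained at capacity_pps."""
    return buffer_packets / capacity_pps


def long_flow_throughput(base_rtt: float, queuing_delay: float, loss_rate: float) -> float:
    """Assumed square-root approximation of TCP throughput, in packets/s."""
    return math.sqrt(1.5 / loss_rate) / (base_rtt + queuing_delay)


def transfer_latency(flow_size_packets: float, throughput_pps: float) -> float:
    """Transfer latency = flow size / flow throughput (as defined on the slide)."""
    return flow_size_packets / throughput_pps


# Hypothetical link: 2500 packets/s, 100 ms base RTT, a 1000-packet flow.
# Each tuple is (buffer in packets, assumed loss rate at that buffer size).
for buffer_packets, loss_rate in ((18, 0.08), (500, 0.005)):
    delay = max_queuing_delay(buffer_packets, 2500.0)
    rate = long_flow_throughput(0.1, delay, loss_rate)
    print(buffer_packets, delay, transfer_latency(1000, rate))
```

Which buffer wins in this sketch depends entirely on how fast the loss rate falls as the buffer grows, which is why the answer differs between traffic models.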
Persistent connections and mice
- Application layer throughput for B=18 (Stanford buffer) and larger buffer B=500
- Two flow categories: large (>100 KB) and small (<100 KB)
- Majority of large flows get better throughput with the large buffer
  - Large difference in loss rates
  - Smaller variability of per-flow throughput with the larger buffer
- Majority of short flows get better throughput with the small buffer
  - Lower RTT and smaller difference in loss rates
Why the difference between large and small flows?
- Persistent flows: larger buffer is better
- Mice flows: smaller buffer is better
- Reason: different effect of packet loss
  - Persistent flows: large congestion window halved due to packet loss
    - Take longer to reach original window = decreased throughput
  - Mice flows: congestion windows never become very large
    - Quickly return to original window, especially with a smaller buffer
- For persistent flows, the tradeoff is in favor of low loss rate, while for mice it is in favor of low queuing delay
Variability of per-flow throughputs
- A large buffer reduces the variability of throughput for persistent flows
- Two reasons:
  - All RTTs are increased by a constant (the queuing delay)
  - Smaller loss rate decreases the chance of a flow getting “unlucky” and seeing repeated losses
  - In our simulations, the RTT increase accounts for most of the variability reduction
- Why is variability important?
  - For N persistent connections, we have a zero-sum game
  - If one flow gets high throughput, some other must be losing
Closed-loop traffic
- Per-flow throughput for large flows is slightly better with the larger buffer
- Majority of small flows see better throughput with the smaller buffer
  - Similar to the persistent case
- Smaller difference in per-flow loss rate
  - Reason: loss rate decreases slowly with buffer size
Open-loop traffic
- Both large and small flows get much better throughput with the large buffer
- Significantly smaller per-flow loss rate with the larger buffer
  - Reason: loss rate decreases very quickly with buffer size
Summary and Future Work
- The buffer size required at a router depends on multiple factors:
  - The provisioning objective
  - The model which the input traffic follows
  - The nature of flows that populate that link (large vs small)
- Very small buffers proposed in recent work can cause a large loss rate and harm application performance
- Most work so far has focused on network layer performance
  - What is the optimal buffer when some application layer performance metric is considered?
Thank You!