Google’s BBR congestion control algorithm
(“Bottleneck Bandwidth and RTT”)

An end-to-end approach to better network behavior:
Dealing with long RTTs with loss
Long-lived connections
... And ending bufferbloat
Overview
BBR Background
BBR Analysis
BBR Upsides
BBR Downsides
BBR Conclusions and recommendations
BBR Background
Under development at Google for 3 years
Deployed as part of their B4 backbone: https://people.eecs.berkeley.edu/~sylvia/cs268-2014/papers/b4-sigcomm13.pdf
Selective deployments on YouTube, google.com, and other streaming servers
Published in ACM Queue: http://queue.acm.org/app/
Video at: https://www.youtube.com/watch?v=hIl_zXzU3DA
The first TCP CC designed from Big Data:
  From real applications
  From worldwide connectivity
I had nothing to do with it! http://blog.cerowrt.org/post/a_bit_about_bbr/
What is congestion control for?
Cubic v BBR – CMTS emulation
BBR basics
NOT delay- or loss-based: models the pipe
Tests for RTT and bandwidth in separate phases:
  BW-Probe: an 8-phase pacing-gain cycle of 5/4, 3/4, 1, 1, 1, 1, 1, 1
  RTT-Probe: 200 ms out of every 10 seconds, otherwise opportunistic (taking advantage of natural application pauses)
Detects policers and works around them
Relies on modifying the gain of the paced rate
Cwnd becomes a secondary metric
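The gain cycle above can be sketched in a few lines of Python. This is a minimal illustration (the 5/4, 3/4, 1×6 phase constants come from the BBR paper; the function name and 10 Mbit/s figure are assumptions for the example, not kernel code):

```python
# BBR's PROBE_BW pacing-gain cycle: one probing phase above the bandwidth
# estimate, one draining phase below it, then six cruise phases at it.
PROBE_BW_GAINS = [5/4, 3/4, 1, 1, 1, 1, 1, 1]

def pacing_rate(btl_bw_bps: float, phase: int) -> float:
    """Paced sending rate: the bottleneck bandwidth estimate scaled by
    the gain of the current phase (this is the 'gain of the paced rate')."""
    return btl_bw_bps * PROBE_BW_GAINS[phase % len(PROBE_BW_GAINS)]

# Example: a 10 Mbit/s bottleneck bandwidth estimate.
rates = [pacing_rate(10e6, p) for p in range(8)]
# rates[0] probes above the estimate, rates[1] drains the resulting queue,
# and the remaining six phases cruise at the estimated rate.
```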
Some BBR evaluations
Typical CMTS cable modem setup
11 ms and 48 ms RTTs
64K and 512K CMTS buffers
pfifo (DSL); PIE, CoDel, FQ_CoDel AQMs
Modeling the pipe wins
No loss + low RTTs = serious wow factor.
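“Modeling the pipe” boils down to estimating the bandwidth-delay product (BDP): how many bytes in flight exactly fill the path without queuing. A minimal sketch (the 48 ms RTT matches one of the CMTS emulations above; the 100 Mbit/s figure is an assumed example):

```python
def bdp_bytes(btl_bw_bps: float, min_rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight that fill the pipe
    with no standing queue (bandwidth in bits/s, RTT in seconds)."""
    return btl_bw_bps / 8 * min_rtt_s

# Example: 100 Mbit/s bottleneck, 48 ms minimum RTT.
bdp = bdp_bytes(100e6, 0.048)
# ~600,000 bytes, i.e. roughly 400 full-size (1500 byte) packets:
# inflight below this underutilizes the link; above it only builds queue.
```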
BBR measures RTT and bandwidth in separate phases
Worst-case behavior:
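The reason the two measurements need separate phases: the bandwidth estimate is a windowed *max* (a queue can only hide bandwidth), while min_rtt is a windowed *min* (a queue can only inflate delay), and you cannot observe both at once. A toy sketch of the max side, assuming a simple sample window (the real kernel uses a more efficient min/max filter):

```python
from collections import deque

class MaxFilter:
    """Windowed max over recent delivery-rate samples. Because the estimate
    is a max over ~10 RTTs, a transient loss-induced dip simply ages out of
    the window instead of halving the rate, as loss-based cc would."""
    def __init__(self, window: int):
        self.samples = deque(maxlen=window)  # old samples fall off the left

    def update(self, sample: float) -> float:
        self.samples.append(sample)
        return max(self.samples)
```

Usage: feed it per-ACK delivery-rate samples; the returned max is the bandwidth model until better samples displace it or it ages out.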
BBR game-theoretic wins
Outcompetes Cubic on short transactions
Competes fairly (on long timescales)
Outperforms Cubic by a factor of 133 on long, lossy RTTs
Deals well with policers
Some BBR Issues
Requires a modern Linux with sch_fq
Low-latency bare metal, or good VM tech
Latecomer advantage
No ECN support (yet?)
Very aggressive startup
Bad single-queue AQM interactions
Latecomer advantage
No ECN (currently)
Shows that non-ECN-respecting senders can kill single-queue AQMs
AQMs inflict more packet loss...
...which BBR disregards as noise
Single Queue AQM: BBR vs Cubic
FQ_Codel vs Cubic & BBR
SLAs
Packet loss as a metric just got even more useless
Most folk overprovision at loads > 50% to avoid loss
Google runs at 100% utilization and 1-10% loss
Here is a state transition diagram for BBR (from the tcp_bbr.c comments in net-next):

             |
             V
    +---> STARTUP  ----+
    |        |         |
    |        V         |
    |      DRAIN   ----+
    |        |         |
    |        V         |
    +---> PROBE_BW ----+
    |      ^    |      |
    |      |    |      |
    |      +----+      |
    |                  |
    +---- PROBE_RTT <--+

A BBR flow starts in STARTUP, and ramps up its sending rate quickly.
When it estimates the pipe is full, it enters DRAIN to drain the queue.
In steady state a BBR flow only uses PROBE_BW and PROBE_RTT.
A long-lived BBR flow spends the vast majority of its time remaining
(repeatedly) in PROBE_BW, fully probing and utilizing the pipe's bandwidth
in a fair manner, with a small, bounded queue. *If* a flow has been
continuously sending for the entire min_rtt window, and hasn't seen an RTT
sample that matches or decreases its min_rtt estimate for 10 seconds, then
it briefly enters PROBE_RTT to cut inflight to a minimum value to re-probe
the path's two-way propagation delay (min_rtt). When exiting PROBE_RTT, if
we estimated that we reached the full bw of the pipe then we enter PROBE_BW;
otherwise we enter STARTUP to try to fill the pipe.
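That state machine can be sketched as a toy transition table. The event names here are simplified assumptions for illustration; the kernel drives these transitions from its bandwidth and min_rtt filters, not from discrete events:

```python
# Toy BBR state machine. Events (hypothetical names):
#   pipe_full       - startup estimates it has reached full bandwidth
#   drained         - inflight has fallen to the estimated BDP
#   min_rtt_expired - no new min_rtt sample for 10 seconds
#   done_full_bw    - leaving PROBE_RTT with a full-pipe estimate
#   done_not_full   - leaving PROBE_RTT without one
TRANSITIONS = {
    ("STARTUP",   "pipe_full"):       "DRAIN",
    ("DRAIN",     "drained"):         "PROBE_BW",
    ("STARTUP",   "min_rtt_expired"): "PROBE_RTT",
    ("DRAIN",     "min_rtt_expired"): "PROBE_RTT",
    ("PROBE_BW",  "min_rtt_expired"): "PROBE_RTT",
    ("PROBE_RTT", "done_full_bw"):    "PROBE_BW",
    ("PROBE_RTT", "done_not_full"):   "STARTUP",
}

def step(state: str, event: str) -> str:
    """Follow one edge of the diagram; unknown events leave state unchanged."""
    return TRANSITIONS.get((state, event), state)
```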
Recommendations
If: streaming apps, DCs
If you can replace multiple flows with one:
  Eliminate sharding
  HTTP/2
Try BBR
LPCC
Sources
Linux net-next
Many thanks to: Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Van Jacobson, and Soheil Yeganeh