Revisiting Transport Congestion Control Jian He UT Austin 1.

Presentation transcript:

Revisiting Transport Congestion Control Jian He UT Austin 1

Why is Congestion Control necessary? 2
[Figure: data packets and ACKs crossing a congested link]
- Congested link vs. reliability: long queuing delay, packet loss
- But can delay or packet loss always explain congestion well?

Can we distinguish the causes of congestion? 3
- Congestion-related signals:
  - packet loss: duplicate ACKs, retransmission timeout (TCP Reno, TCP Cubic)
  - round-trip delay: TCP packet RTT (TCP Vegas, FAST TCP, Compound TCP)
  - queue size: explicit congestion notification (ECN) (DCTCP)

Existing TCP Variants 4
TCP:
- Throughput-latency tradeoff exploration [Remy SIGCOMM’13]
- Datacenter TCP
  - Tail performance [TIMELY SIGCOMM’15]
  - New architectures [R2C2 SIGCOMM’15]
  - RDMA [DCQCN SIGCOMM’15]
- Persistently high performance
  - Large flows [PCC NSDI’15]
- Highly variable network conditions
  - Cellular transport [Verus SIGCOMM’15, Sprout NSDI’13]
- Reducing start-up delay [Halfback CoNext’15], [RC3 NSDI’14]
- Performance interference for competing flows
  - Application heterogeneity [QJUMP NSDI’15]

TCP Evolution 5
[Figure: protocol stack (Application, TCP, IP, Link, Hardware) in which TCP gains an Application Sensing Layer, fed by application-specific performance requirements, and a Networking Sensing Layer, fed by network condition]

Optimizing Datacenter Transport Tail Performance
Mittal, Radhika, et al. "TIMELY: RTT-based congestion control for the datacenter." In ACM SIGCOMM

Why does tail performance matter? 7
- TCP incast: many servers reply to the client simultaneously.
- All replies should meet their deadlines.
- Datacenter transport must deliver high throughput (>> Gbps) and utilization with low delay (<< msec).

Hardware-Assisted RTT Measurement 8
Why was RTT not widely used?
- RTT-based congestion control performed poorly in WANs.
- RTT estimation is highly noisy (system kernel scheduling, etc.).
- Datacenter RTT measurement needs µs-level granularity.
- Hardware timestamps and hardware acknowledgements can remove much of this noise.

RTT As a Congestion Control Signal 9
- RTT is a multi-bit signal; ECN is a single-bit signal.
- ECN cannot reflect how much end-to-end latency is inflated by network queuing, because of traffic priorities, multiple congested switches, etc.

RTT Correlates with Queuing Delay 10

TIMELY Framework 11

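RTT Measurement 12
[Figure: timeline from t_send to t_completion showing serialization delay, propagation and queuing delay, ACK turnaround time, and the resulting RTT]
- One RTT sample per segment (NIC offload)
- Hardware ACKs make the ACK turnaround time negligible
- $\mathrm{RTT} = \text{propagation delay} + \text{queuing delay} = t_{\mathrm{completion}} - t_{\mathrm{send}} - \dfrac{\text{segment\_size}}{\text{NIC\_line\_rate}}$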
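The RTT formula on this slide can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `timely_rtt`, its parameter names, and the sample timestamps are assumptions made for the example and are not taken from the TIMELY implementation.

```python
def timely_rtt(t_send, t_completion, segment_bytes, line_rate_bps):
    """Return RTT in seconds: completion time minus send time minus
    the time the NIC needs to serialize the segment onto the wire."""
    if line_rate_bps <= 0:
        raise ValueError("line rate must be positive")
    serialization_delay = segment_bytes * 8 / line_rate_bps
    return t_completion - t_send - serialization_delay


# Hypothetical sample: a 64,000-byte segment on a 10 Gbps NIC whose
# completion event arrives 101.2 microseconds after the send timestamp.
rtt = timely_rtt(t_send=0.0, t_completion=101.2e-6,
                 segment_bytes=64_000, line_rate_bps=10e9)
print(f"RTT = {rtt * 1e6:.1f} us")
```

Subtracting the serialization delay matters for large segments: here it is 51.2 µs, which is comparable to the propagation and queuing delay being measured.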

Transmission Rate Control 13
[Figure: the message to be sent is split into segments; RTT estimation feeds the rate controller, which inserts delay between segments before they enter the transmission queue]
- The target rate is determined by the segment size and the delay between segments.
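A minimal sketch of how a pacing delay could be derived from a target rate, assuming the delay is the idle time inserted after a segment has been serialized at line rate. The function name `inter_segment_gap` and this exact definition of the gap are assumptions for illustration; the slide only states that rate follows from segment size and inter-segment delay.

```python
def inter_segment_gap(segment_bytes, target_rate_bps, line_rate_bps):
    """Idle time (seconds) to insert after each segment so that the
    long-run sending rate equals target_rate_bps."""
    if target_rate_bps <= 0 or line_rate_bps <= 0:
        raise ValueError("rates must be positive")
    bits = segment_bytes * 8
    time_on_wire = bits / line_rate_bps   # serialization at NIC speed
    period = bits / target_rate_bps       # spacing the target rate needs
    # A target at or above line rate needs no extra delay.
    return max(0.0, period - time_on_wire)


# Hypothetical sample: pace 64,000-byte segments at 5 Gbps on a 10 Gbps NIC.
gap = inter_segment_gap(64_000, target_rate_bps=5e9, line_rate_bps=10e9)
print(f"gap = {gap * 1e6:.1f} us")
```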

Rate vs. Window 14
- Segment size can be as high as 64 KB.
- 32 µs RTT × 10 Gbps = 40 KB window size
- 40 KB < 64 KB: the window is smaller than a single segment, so window-based control makes no sense.
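The arithmetic behind the window-size figure is a bandwidth-delay product. The sketch below reproduces it, assuming decimal units (1 KB = 1,000 bytes); the helper name `bdp_bytes` is hypothetical.

```python
def bdp_bytes(rtt_s, rate_bps):
    """Bandwidth-delay product in bytes: the data in flight that
    keeps a link of rate_bps busy for one RTT."""
    return rtt_s * rate_bps / 8


window = bdp_bytes(32e-6, 10e9)        # about 40,000 bytes
segments_in_window = window / 64_000   # fraction of one 64 KB segment
print(f"window = {window / 1000:.0f} KB, "
      f"{segments_in_window:.3f} segments")
```

Because the window covers less than one segment, a window can only be "zero or one segment", which is why the slide argues for controlling rate instead.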

Rate Update 15

Evaluation 16

Datacenter Transport for Emerging Architectures 17
Costa, Paolo, et al. "R2C2: A Network Stack for Rack-scale Computers." In ACM SIGCOMM 2015.

Rack-Scale Computing 18
- Building block for future datacenters
- High-bandwidth, low-latency network
- Direct-connect topology

Rack-Scale Network Topology 19
[Figure: 3D torus compared with a fat-tree topology]
- Distributed switches (each node works as a switch)
- High path diversity

Broadcasting-Assisted Rack Congestion Control 20
- Broadcast flow information (e.g., start time, finish time)
- Each node has a global view of the network
- Locally optimize the flow rate with the global view
Broadcasting overhead is low (around 1.3%).

Evaluation 21

Congestion Control for RDMA-enabled Datacenters 22
Zhu, Yibo, et al. "Congestion Control for Large-Scale RDMA Deployments." In ACM SIGCOMM, 2015.

Congestion Spreading in Lossless Networks 23
[Figure: PAUSE frames propagating between switches]
- Port-based congestion control causes congestion spreading
- DCQCN: incorporates explicit congestion notification to support flow-based congestion control

Wireless Congestion Control 24
Zaki, Yasir, et al. "Adaptive Congestion Control for Unpredictable Cellular Networks." In SIGCOMM 2015.

What Does Cellular Traffic Look Like? 25
[Figure panels: burst scheduling; competing traffic]

What Does Cellular Traffic Look Like? 26
[Figure panel: channel unpredictability]

Verus Protocol 27
[Figure: timeline of epoch i with sending window $W_i$, followed by epoch i+1 with sending window $W_{i+1}$]
- Epoch: a short period of time (e.g., 5 ms)
- The sending window is updated at each epoch.
- The sending window represents the number of packets in flight.

Verus Overview 28
- Delay Estimator: estimates future delay based on the changes in delay
- Delay Profiler: records the relationship between delay and sending window
- Window Estimator: estimates the sending window for the next epoch
- Packet Scheduler: calculates the number of packets to be sent in the next epoch
- Go to the next epoch

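Delay Estimation 29
- $D_{\max,i} = \alpha \times [\ldots] + (1-\alpha) \times [\ldots]$
- $\Delta D_i = D_{\max,i} - D_{\max,i-1}$
[Figure: estimated delay vs. time across epochs i-1 and i, showing $D_{\max,i-1}$, $D_{\max,i}$, and the step from $D_{\mathrm{est},i}$ to $D_{\mathrm{est},i+1}$ for the cases $\Delta D_i \le 0$ and $\Delta D_i > 0$]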
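The two steps on this slide, smoothing the per-epoch maximum delay and taking its change between epochs, can be sketched as follows. Several parts are assumptions rather than content from the slide: the two smoothing operands (taken here to be the maximum delay measured in the current epoch and the previous smoothed value), the direction of the estimate update (lowered when delay is rising, raised otherwise), and the names `step_up`, `step_down`, and `d_min`.

```python
def smooth_max_delay(measured_max, prev_smoothed, alpha):
    """Exponentially weighted moving average of the per-epoch max delay.
    ASSUMPTION: the operands are the current epoch's measured maximum
    and the previous epoch's smoothed value."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * measured_max + (1 - alpha) * prev_smoothed


def next_delay_estimate(d_est, delta_d, d_min, step_up, step_down):
    """ASSUMPTION: back off the delay estimate when delay is rising
    (delta_d > 0), never going below d_min; otherwise probe upward."""
    if delta_d > 0:
        return max(d_min, d_est - step_down)
    return d_est + step_up


# Hypothetical numbers, all in milliseconds.
d_max_prev = 40.0
d_max = smooth_max_delay(measured_max=60.0, prev_smoothed=d_max_prev,
                         alpha=0.5)
delta_d = d_max - d_max_prev
d_est_next = next_delay_estimate(d_est=50.0, delta_d=delta_d, d_min=20.0,
                                 step_up=1.0, step_down=2.0)
print(d_max, delta_d, d_est_next)
```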

Window Update 30
- Delay-window profile: updated based on historical data
- Each epoch can contribute many points to the profile.
- The profile is initialized using data from the slow-start phase.

Packet Scheduler 31
[Figure: epoch i with sending window $W_i$, followed by epoch i+1 with sending window $W_{i+1}$]
- How many packets should be sent in the current epoch?
  $S_{i+1} = \max\left[0,\; W_{i+1} + \dfrac{2-n}{n-1}\, W_i\right]$
  where $n$ is the number of epochs over the current estimated RTT
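The scheduling formula translates directly into code. The sketch below is illustrative: the function name `packets_to_send` is hypothetical, and the requirement that n exceed 1 is an assumption added to avoid division by zero, since the slide does not state how that case is handled.

```python
def packets_to_send(w_next, w_cur, n):
    """S_{i+1} = max(0, W_{i+1} + ((2 - n) / (n - 1)) * W_i),
    where n is the number of epochs per estimated RTT."""
    if n <= 1:
        # ASSUMPTION: the formula is only defined for n > 1.
        raise ValueError("n must be greater than 1")
    return max(0.0, w_next + (2 - n) / (n - 1) * w_cur)


# With n = 2 the correction term vanishes and the whole new window is sent.
print(packets_to_send(w_next=30, w_cur=30, n=2))
# A shrinking window can yield a negative value, which is clamped to zero.
print(packets_to_send(w_next=10, w_cur=40, n=3))
```

The result is a real number; an implementation would presumably round it to a whole number of packets, but the slide does not say how.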

Loss Handling 32
[Figure: epoch i with sending window $W_i$, followed by epoch i+1 after a loss]
- Multiplicative decrease: $W_{i+1} = M \times W_i$
- Stop updating the delay profile during the loss recovery phase

Evaluation 33

34 Thanks!