Understanding TCP Incast Throughput Collapse in Datacenter Networks
Presenters: Aditya Agarwal, Tyler Maclean



Motivation/Importance
Internet datacenters support a myriad of services and applications (Google, Microsoft, Yahoo, Amazon).
The vast majority of datacenters use TCP for communication between nodes.
The unique workload, scale, and environment of Internet datacenters violate the WAN assumptions on which TCP was originally designed.
Example: the minimum RTO is 200 ms (the default value in Linux), which is 2-3 orders of magnitude greater than the RTT in the datacenter.
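To make the size of that mismatch concrete, the short calculation below compares the 200 ms minimum RTO with two round-trip times. The RTT values are assumptions chosen only to bracket the "2-3 orders of magnitude" statement; they are not measurements from the talk.

```python
import math

RTO_MIN_US = 200_000  # 200 ms, the Linux default quoted on the slide

# Assumed datacenter RTTs (illustrative values, not measurements from the talk).
ASSUMED_RTTS_US = [200, 2_000]


def orders_of_magnitude(rto_us, rtt_us):
    """Return log10 of the RTO-to-RTT ratio."""
    return math.log10(rto_us / rtt_us)


for rtt_us in ASSUMED_RTTS_US:
    gap = orders_of_magnitude(RTO_MIN_US, rtt_us)
    print(f"RTT {rtt_us} us: RTO is about {gap:.1f} orders of magnitude larger")
```

A sender that falls back to a timeout therefore sits idle for hundreds to thousands of round trips before it retransmits.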

What Is the Problem
Incast communication pattern: many servers send to a single client through a shared switch. [Figure: client, switch, servers]
Goals:
Understand TCP incast throughput collapse.
Show that the problem is general.
Build an analytical model.
Evaluate modifications to TCP and verify that they work.

The Contributions
Reproduce the problem in their own experimental testbeds and demonstrate the generality of incast.
Propose a quantitative model that accounts for some of the observed incast behavior.
Implement several intuitive modifications to the TCP stack in Linux, and show that some modifications are more helpful than others.

Roadmap
Experiment setting: workload
Experiment results: initial findings
Deep analysis
Quantitative models
Conclusions

Workload Setting
MapReduce-like application: the receiver requests k blocks of data from S storage servers.
Each block of data is striped across the S storage servers.
Each server responds with a "fixed" amount of data (fixed-fragment workload).
The client won't request block k+1 until all the fragments of block k have been received.
Setting: k = 100, S = 1-48, fragment size = 256 KB.
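The request pattern described above can be sketched as a barrier-synchronized loop. This is a minimal illustration, not the presenters' test harness: `fake_fetch` is a hypothetical stand-in for a real network read, and the thread pool is only one possible way to issue the per-server requests in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

FRAGMENT_BYTES = 256 * 1024  # per-server fragment size from the slide
NUM_BLOCKS = 100             # k on the slide


def fake_fetch(server_id, block_id):
    """Hypothetical stand-in for a network read; returns bytes received."""
    return FRAGMENT_BYTES


def run_fixed_fragment_workload(num_servers, num_blocks=NUM_BLOCKS, fetch=fake_fetch):
    """Request num_blocks blocks, one fragment per server per block."""
    total_bytes = 0
    with ThreadPoolExecutor(max_workers=num_servers) as pool:
        for block_id in range(num_blocks):
            # Ask every server for its fragment of this block at the same time.
            futures = [pool.submit(fetch, s, block_id) for s in range(num_servers)]
            # Barrier: the next block is not requested until every fragment
            # of the current block has arrived.
            total_bytes += sum(f.result() for f in futures)
    return total_bytes


if __name__ == "__main__":
    print(run_fixed_fragment_workload(num_servers=4, num_blocks=3))
```

Because every server answers at once and the client waits for the slowest one, a single timed-out sender stalls the whole block.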

DETER Network Security Testbed
400 PCs, located at USC ISI and UC Berkeley.
Supported operating systems include Linux, FreeBSD, and Windows.

Initial Results

Different senders experience long, synchronized TCP retransmission timeout (RTO) events.
RTO = 200 ms (default value in a WAN environment).

Minor and Intuitive Modifications
Decrease the minimum RTO timer from 200 ms.
Randomize the minimum RTO timer.
Use a smaller multiplier for the RTO exponential backoff.
Randomize the multiplier for the RTO exponential backoff.
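The four knobs listed above can be shown on a simplified timeout calculation. This is a sketch under stated assumptions, not the Linux implementation or the presenters' patch: the function name, the parameter names, and the randomization ranges are all hypothetical. It relies only on the standard behavior that the timeout is clamped to a minimum and multiplied by a backoff factor after each consecutive timeout.

```python
import random


def retransmission_timeout_ms(estimated_rto_ms, num_backoffs,
                              min_rto_ms=200.0, backoff_multiplier=2.0,
                              randomize_min=False, randomize_multiplier=False,
                              rng=random):
    """Simplified RTO with the four modifications exposed as parameters."""
    floor_ms = min_rto_ms
    if randomize_min:
        # Assumed range: spread the floor over +/-50% of its nominal value.
        floor_ms = min_rto_ms * rng.uniform(0.5, 1.5)

    multiplier = backoff_multiplier
    if randomize_multiplier:
        # Assumed range: pick a factor between 1 and the nominal multiplier.
        multiplier = rng.uniform(1.0, backoff_multiplier)

    # Clamp the RTT-based estimate to the floor, then apply the exponential
    # backoff once per consecutive timeout.
    base_ms = max(estimated_rto_ms, floor_ms)
    return base_ms * multiplier ** num_backoffs


if __name__ == "__main__":
    print(retransmission_timeout_ms(1.0, 0))                  # default floor
    print(retransmission_timeout_ms(1.0, 0, min_rto_ms=1.0))  # lowered floor
    print(retransmission_timeout_ms(1.0, 2))                  # two backoffs
```

When the RTT-based estimate is far below the floor, as in a datacenter, the floor alone sets the timeout, so lowering it has the most direct effect.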

Initial Results
Smaller multiplier for the RTO exponential backoff: useless.
Randomized multiplier for the RTO exponential backoff: useless.
There are only a tiny number of exponential backoffs in the entire transfer.

Initial Results
Randomizing the RTO timer: useless, but also no penalty.
Because the servers share the same switch, all subsequent switch buffer overflow events will be synchronized for all senders. ???

Analysis in Depth
Different RTO timers. Observations:
The initial goodput minimum occurs at the same number of servers.
With a larger minimum RTO timer value, the maximum goodput occurs at a larger number of senders.
A smaller RTO timer value gives a faster goodput "recovery" rate.
The rate of decrease after the local maximum is the same across different minimum RTO settings.

Delayed ACKs and High-Resolution Timers
Improvements proposed by [11]:
Turn off delayed ACKs (the default delayed ACK threshold is 40 ms).
Use high-resolution timers.

Congestion Windows With/Without Delayed ACKs

Smoothed RTT With/Without Delayed ACKs

Different Workload

Sub-optimal behavior with regard to delayed ACKs is workload independent.

Cannot match the results in previous work [11]

Smoothed RTT With/Without Delayed ACKs

Quantitative Models
Net goodput is modeled in terms of the following quantities:
D: total amount of data to be sent, 100 blocks of 256 KB.
L: total transfer time of the workload without any RTO events.
R: the number of RTO events during the transfer.
S: the number of servers.
r: the value of the minimum RTO timer.
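The slide's equation was not captured in the transcript. One plausible form, given the definitions above, treats each RTO event as adding r to the ideal transfer time, so that goodput is D / (L + R * r). The sketch below encodes that assumed form; it should not be read as the authors' exact equation, and the function name, the link rate, and the example event count are hypothetical.

```python
def modeled_goodput_bps(total_bytes, ideal_transfer_s, num_rto_events, min_rto_s):
    """Assumed model: goodput = D / (L + R * r), returned in bits per second."""
    total_time_s = ideal_transfer_s + num_rto_events * min_rto_s
    return total_bytes * 8 / total_time_s


if __name__ == "__main__":
    D = 100 * 256 * 1024      # bytes: 100 blocks of 256 KB, as on the slide
    L = D * 8 / 1e9           # assumed: ideal time on a 1 Gbps link
    R = 10                    # assumed number of RTO events, for illustration
    for r in (0.200, 0.001):  # 200 ms and 1 ms minimum RTO
        mbps = modeled_goodput_bps(D, L, R, r) / 1e6
        print(f"min RTO {r * 1000:.0f} ms: {mbps:.1f} Mbps")
```

Under this form, goodput falls as R grows, and for the same R a smaller r always gives higher goodput.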

Fit the Curve of the Number of RTO Events

Equation of L
I is the inter-packet waiting time.

How Good Is Their Analytical Model?

Further Analysis on R and I
The number of RTO events is similar for different RTO values (200 ms and 1 ms).
The inter-packet waiting time is very different for different RTO values (200 ms and 1 ms).

Qualitative Refinement for Their Model
As the number of senders increases, the number of RTO events per sender increases.
Beyond a certain number of senders, the number of RTO events is constant.
When a network resource becomes saturated, it is saturated at the same time for all senders.
After a congestion event, the senders enter the TCP RTO state.
The RTO timer expires at each sender with a uniform distribution in time and a constant delay after the congestion event.
T increases as the number of senders increases; however, T is bounded.

More Explanations
A smaller minimum RTO timer value means larger goodput values at the initial minimum.
The initial goodput minimum occurs at the same number of senders, regardless of the value of the minimum RTO timer.
The second-order goodput peak occurs at a higher number of senders for a larger RTO timer value.
The smaller the RTO timer value, the faster the rate of recovery between the goodput minimum and the second-order goodput maximum.
After the second-order goodput maximum, the slope of the goodput decrease is the same for different RTO timer values.

Conclusions
Study the dynamics of incast.
Propose a simple mathematical model to explain the observed trends.
Account for the difference between their observations and those in previous work.