Sustained Wide-Area TCP Memory Transfers Over Dedicated Connections


Sustained Wide-Area TCP Memory Transfers Over Dedicated Connections
Nagi Rao, Oak Ridge National Laboratory (raons@ornl.gov)
Don Towsley and Gayane Vardoyan, University of Massachusetts
Brad Settlemyer, Los Alamos National Laboratory
Ian Foster and Raj Kettimuthu, Argonne National Laboratory
IEEE International Symposium on High Performance and Smart Computing, August 26, 2015, New York
Sponsored by the U.S. Department of Defense and the U.S. Department of Energy

Outline
- Motivation and Background
- Throughput Measurements
- Emulation Testbed
- Throughput Analytical Models
  - TCP over dedicated connections and small memory transfers
  - Monotonicity of throughput
  - Concavity analysis
- Conclusions

Background
- A class of HPC applications requires memory transfers over wide-area connections:
  - computational monitoring of code running on a supercomputer
  - coordinated computations across two remote supercomputers
- Dedicated network connections are increasingly deployed, also in the commercial space (big data).
- Network transfers: current models and experiments are driven mainly by Internet connections:
  - shared connections: losses due to other traffic
  - analytical models: complicated to handle "complex" traffic
- Memory transfers over dedicated connections:
  - limited measurements: most published measurements are over the Internet
  - analytical models: the majority are based on losses (internal and external)
  - somewhat "easier" environments to collect measurements: dedicated connections are easier to emulate/simulate

TCP Throughput Models and Measurements
- Existing TCP memory-transfer models:
  - the congestion avoidance mode is prominent: throughput is derived from loss rates, and slow start is often "approximated out"
  - round-trip time appears in the denominator (with a coefficient)
  - the loss-rate parameter also appears in the denominator
- For small datasets over dedicated connections there are very few, often no, losses; models with a loss-rate parameter in the denominator become unstable (illustrated below).
- Summary of our measurements:
  - 10 Gbps emulated connections
  - five TCP versions: Reno, Scalable TCP, Hamilton TCP, Highspeed TCP, CUBIC (default in Linux)
  - rtt range: 0-366 ms (cross-country connections: ~100 ms)
- Observations:
  - throughput depends critically on rtt
  - different TCP versions are effective in different rtt ranges
  - throughput profiles have two modes, in sharp contrast to the well-known convex profile:
    - convex: longer rtt
    - concave: shorter to medium rtt
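To make the instability concrete: loss-based models of the Mathis et al. form predict throughput proportional to MSS·C/(τ·√p), so the prediction diverges as the measured loss rate p approaches zero. A minimal Python sketch, purely illustrative; the constant and the parameter values are assumptions, not numbers from the talk:

```python
import math

def mathis_throughput(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Loss-based steady-state TCP model (Mathis et al. form):
    throughput ~ MSS * C / (rtt * sqrt(p)). Returns bits per second."""
    return 8 * mss_bytes * c / (rtt_s * math.sqrt(loss_rate))

# On a dedicated connection the measured loss rate tends toward zero, and
# the model's prediction grows without bound instead of saturating at the
# 10 Gbps link capacity.
for p in (1e-3, 1e-6, 1e-9, 1e-12):
    gbps = mathis_throughput(mss_bytes=1460, rtt_s=0.100, loss_rate=p) / 1e9
    print(f"p = {p:.0e} -> predicted {gbps:12.4f} Gbps")
```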

TCP Throughput Profiles
- The most common TCP throughput profile is a convex function of rtt (for example, Mathis et al.).
- Observed dual-mode profiles in our throughput measurements (CUBIC and Scalable TCP):
  - smaller rtt: concave region
  - larger rtt: convex region
[Figure: measured throughput (Gbps) versus rtt (ms) for CUBIC and Scalable TCP, showing a concave region at smaller rtt and a convex region at larger rtt.]

Desired Features of the Concave Region
- The concave region is highly desirable:
  - throughput does not decay as fast
  - the rate of decrease slows down as rtt increases
- A function is concave iff its derivative is non-increasing; this is not satisfied by the Mathis model (see the check below).
- Measurements: throughput profiles for rtt 0-366 ms
  - concavity appears in the small-rtt region
  - only for some TCP versions: CUBIC, Hamilton TCP, Scalable TCP
  - not for others: Reno, Highspeed TCP
- These TCP versions are loadable Linux kernel modules.
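Spelled out for the Mathis-form profile, with K = MSS·C/√p collecting the rtt-independent terms (a short LaTeX sketch of the slide's claim):

```latex
\[
\Theta_{\mathrm{Mathis}}(\tau) = \frac{K}{\tau}, \qquad
\frac{d\Theta}{d\tau} = -\frac{K}{\tau^{2}}, \qquad
\frac{d^{2}\Theta}{d\tau^{2}} = \frac{2K}{\tau^{3}} > 0 .
\]
% The derivative -K/tau^2 is strictly increasing in tau, so the profile is
% convex everywhere; a concave region at small rtt requires a non-increasing
% derivative there, which no profile of the form K/tau can provide.
```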

Our Contributions
- An analytical model that captures the concave region:
  - for smaller datasets over dedicated connections there are very few, often no, losses
  - models with a loss-rate parameter in the denominator become unstable
  - transfers last beyond slow start (unlike, e.g., Mellia et al. 2002)
- Result 1: throughput is a decreasing function of rtt.
- Result 2: robust slow start leads to a concave profile for smaller rtt.
- Observations from measurements:
  - throughput depends critically on rtt
  - different TCP versions are effective in different rtt ranges
  - throughput profiles have two modes, in sharp contrast to the well-known convex profile of TCP:
    - convex: longer rtt
    - concave: shorter to medium rtt
- Practical solution: choose the TCP version to match the connection rtt (see the sketch below):
  - five TCP versions: Reno, Scalable TCP, Hamilton TCP, Highspeed TCP, CUBIC (default in Linux)
  - load the corresponding kernel module in Linux
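One way to act on this recommendation from an application: Linux lets a program select the congestion control algorithm per socket through the TCP_CONGESTION socket option (Linux-only). A hedged sketch; the rtt cutoffs below are illustrative assumptions, not thresholds reported in the talk:

```python
import socket

def pick_algorithm(rtt_ms):
    """Illustrative rtt -> algorithm table; the cutoffs are assumptions
    for this sketch, not values from the measurements."""
    if rtt_ms < 30:
        return b"cubic"     # showed a concave region at small rtt
    elif rtt_ms < 150:
        return b"htcp"      # Hamilton TCP also showed a concave region
    else:
        return b"scalable"  # Scalable TCP for longer rtt

def connect_with_cc(host, port, rtt_ms):
    """Open a TCP connection using the algorithm chosen for this rtt.
    The matching tcp_* kernel module must already be loaded (e.g. via
    "modprobe tcp_htcp") and permitted in
    /proc/sys/net/ipv4/tcp_allowed_congestion_control."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                    pick_algorithm(rtt_ms))
    sock.connect((host, port))
    return sock
```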

Multiple Multi-Bus Host-to-Host Memory/Disk/File Transfers
[Diagram: hosts connected over the wide-area path through NICs, LAN switches, and routers; each host also reaches disk arrays through HBAs/FCAs, a storage switch, and a storage controller. Memory-to-memory measurements use iperf; disk/file profiles use xddprof.]

Throughput Measurements: 10GigE ANUE Connection Emulator
- Hosts: bohr04 and bohr05, HP 48-core 4-socket Linux hosts, connected through an ANUE 10GigE connection emulator.
- Connection emulation:
  - latency: 0-800 ms
  - segment loss: 0.1%, with Periodic, Poisson, Gaussian, and Uniform models (not used here)
- Physical packets are sent to/from the hosts and delayed inside the emulator for the configured rtt; this is more accurate than simulators.
- Setup: an iperf TCP connection between bohr04 and bohr05 over the emulated 10 Gbps path with configurable rtt and loss rate.

Throughput Measurements: SONET OC192 ANUE Emulator
- Hosts: feynman1 and feynman2, HP 16-core 2-socket Linux hosts.
- Path: Ethernet-SONET conversion (e300 10GE-OC192) with fiber loopback through an OC192 ANUE emulator.
- Connection emulation: latency 0-800 ms; loss 0% only, since the OC192 ANUE is unable to emulate losses.
- Available congestion control modules: CUBIC (default), Scalable TCP, Reno, Hamilton TCP, Highspeed TCP.
- Setup: an iperf TCP connection between feynman1 and feynman2 over the emulated 9.6 Gbps path with configurable rtt.
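A sweep like the ones behind these measurements is typically scripted. A minimal client-side sketch, assuming iperf (version 2) is installed and already running as `iperf -s` on the peer; the `set_emulated_rtt` helper is a hypothetical placeholder, since the ANUE emulator is configured through its own management interface:

```python
import csv
import re
import subprocess

def set_emulated_rtt(rtt_ms):
    # Hypothetical placeholder: the real ANUE emulator is configured out of
    # band through its management interface, not from this script.
    pass

def run_iperf(server, seconds=60):
    """Run one iperf (v2) client transfer; return throughput in Gbps."""
    out = subprocess.run(
        ["iperf", "-c", server, "-t", str(seconds), "-f", "g"],
        capture_output=True, text=True, check=True).stdout
    match = re.search(r"([\d.]+)\s+Gbits/sec", out)
    return float(match.group(1)) if match else 0.0

# Sweep a set of rtt values and record one throughput profile. The exact
# rtt grid used in the talk is not listed, so these points are assumptions.
with open("profile.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["rtt_ms", "gbps"])
    for rtt in (0, 11, 23, 45, 91, 183, 366):
        set_emulated_rtt(rtt)
        writer.writerow([rtt, run_iperf("bohr05")])
```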

CUBIC and Scalable TCP
- For dedicated 10G links: fairly stable statistics.
[Figure: throughput (Gbps) versus rtt (ms) for CUBIC and Scalable TCP on the emulated 10GigE connection.]

CUBIC and Scalable TCP
- For dedicated OC192 (9.6 Gbps) connections, not all TCP versions have concave regions.
[Figure: throughput versus rtt for Scalable TCP, Hamilton TCP, CUBIC, Reno, and Highspeed TCP on the OC192 connection.]

TCP Parameters and Dynamics
- TCP and connection parameters:
  - congestion window at time t: W_t
  - connection round-trip time: τ
  - instantaneous throughput: θ_t = W_t / τ
  - connection capacity: C
  - average throughput during an observation window T: Θ_T = (1/T) ∫₀ᵀ θ_t dt
- Dedicated connections operate under close-to-zero losses.
- Two regimes:
  (a) Slow start: W_t increases by 1 for each acknowledgement (doubling every rtt) until it reaches the slow-start threshold.
  (b) Congestion avoidance: W_t increases with acknowledgements and decreases on an inference of loss (such as a timeout); the details of the increase and decrease depend on the TCP version.
- Simple case, Reno: Additive Increase, Multiplicative Decrease (AIMD):
  - on acknowledgement: W ← W + 1/W
  - on inferred loss: W ← W/2
A simulation sketch of these dynamics follows below.
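These dynamics are easy to simulate at the fluid level. A minimal sketch under the slide's zero-loss, dedicated-connection assumptions, with the window capped at the delay-bandwidth product; all parameter values are illustrative:

```python
def window_trace(rtt_s, capacity_pps, ssthresh, duration_s):
    """Per-rtt trace of (time, throughput in segments/s). Slow start
    doubles the window each rtt; Reno-style congestion avoidance then adds
    one segment per rtt. With no losses the window saturates at the
    delay-bandwidth product."""
    dbp = capacity_pps * rtt_s          # delay-bandwidth product (segments)
    w, t, trace = 1.0, 0.0, []
    while t < duration_s:
        trace.append((t, min(w, dbp) / rtt_s))
        w = w * 2 if w < ssthresh else w + 1   # slow start vs. avoidance
        w = min(w, dbp)
        t += rtt_s
    return trace

# 10 Gbps with 1500-byte segments is ~833k segments/s. A huge ssthresh
# models "robust" slow start that runs until capacity. Compare a short and
# a long rtt: capacity is reached during vs. well after slow start.
for rtt in (0.01, 0.2):
    trace = window_trace(rtt, capacity_pps=833_000, ssthresh=1e9, duration_s=5)
    avg = sum(rate for _, rate in trace) / len(trace)
    print(f"rtt {rtt * 1000:5.0f} ms: avg ~{avg * 1500 * 8 / 1e9:.2f} Gbps")
```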

TCP Dynamics for Dedicated Connections
- Dynamics are dominated by acknowledgements:
  - most losses, if any, are caused by TCP itself (buffer overflows)
  - very rarely, physical-layer losses lead to TCP/IP losses
- Resultant dynamics: the congestion window mostly increases, through slow start and into congestion avoidance.
- Sending and receiving rates are limited by:
  - send and receive host buffers: TCP, IP, kernel, NIC, IO, and others
  - connection capacity and round-trip time
- Throughput increases until it reaches connection capacity:
  - small delay-bandwidth product: transition during slow start
  - large delay-bandwidth product: transition after slow start
- To a first-order approximation, the overall qualitative properties of throughput for smaller data transfers can be inferred without the finer details of the congestion avoidance phase.

TCP Throughput Peaks During Slow Start
- Short connections, smaller delay-bandwidth product: link capacity is reached during slow start.
[Figure: congestion window versus time, showing link capacity reached in slow start, a loss event, and the congestion avoidance phase.]

TCP Throughput Peaks During Slow Start
- Longer connections, higher delay-bandwidth product: capacity is not reached during slow start; a loss event ends slow start and congestion avoidance follows.
[Figure: congestion window versus time for the higher delay-bandwidth case.]

Throughput Estimation: Decrease with rtt
- For smaller rtt τ, link capacity C is reached while still in slow start.
- Throughput during the observation period T: Θ_T(τ) = (1/T) ∫₀ᵀ θ_t dt.
- During slow start, throughput doubles with every rtt, so θ_t ≈ 2^{t/τ}/τ and capacity is reached at time t_C ≈ τ log₂(Cτ).
- Monotonicity of throughput thus reduces to showing that 2^{t/τ}/τ is decreasing in τ, which holds since both the exponent t/τ and the factor 1/τ decrease as τ grows (checked numerically below).
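A quick numerical look at this claim, reusing the `window_trace` sketch from the TCP Parameters slide (same illustrative parameters); the averages should decrease steadily with rtt:

```python
# Average modeled throughput for increasing rtt; monotone decrease expected.
for rtt in (0.01, 0.05, 0.10, 0.20, 0.366):
    trace = window_trace(rtt, capacity_pps=833_000, ssthresh=1e9, duration_s=5)
    avg = sum(rate for _, rate in trace) / len(trace)
    print(f"rtt {rtt * 1000:5.0f} ms -> ~{avg * 1500 * 8 / 1e9:5.2f} Gbps")
```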

Concave Profile of Throughput
- Condition for a concave throughput profile: for τ₁ < τ₂,
  Θ_T((τ₁ + τ₂)/2) ≥ (Θ_T(τ₁) + Θ_T(τ₂))/2.
- By substituting the slow-start terms, this reduces to concavity of a simpler form, which follows from the decreasing term 2^{t/τ}/τ (a numerical check follows below).
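A numerical second-difference probe of the same property in the small-rtt region, again reusing `window_trace` (non-positive second differences indicate concavity; parameters remain illustrative):

```python
def avg_gbps(rtt_s):
    """Average modeled throughput in Gbps at a given rtt."""
    trace = window_trace(rtt_s, capacity_pps=833_000, ssthresh=1e9,
                         duration_s=5)
    return sum(rate for _, rate in trace) / len(trace) * 1500 * 8 / 1e9

# Discrete second difference: f(x-h) - 2 f(x) + f(x+h) <= 0 where concave.
h = 0.005
for rtt in (0.02, 0.04, 0.06, 0.08):
    d2 = avg_gbps(rtt - h) - 2 * avg_gbps(rtt) + avg_gbps(rtt + h)
    print(f"rtt {rtt * 1000:.0f} ms: second difference {d2:+.4f}")
```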

TCP Throughput: Decreases in General
- Using the monotonicity result for smaller rtt, it suffices to show the decrease for the congestion avoidance regions (CA1).
- Using the generic form of the congestion-avoidance window update, it suffices to show conditions (i) and (ii) on the window increase and decrease terms.
- In summary, throughput decreases with rtt for all TCP versions considered.

UDP-Based Transport: UDT
- For dedicated 10G links, UDT provides higher throughput than CUBIC (the Linux default).
- The TCP/UDT throughput transition point depends on connection parameters: rtt, loss rate, host and NIC parameters, and IP and UDP parameters.
- Disk-to-disk transfers (single-stream xdd read and write) have lower transfer rates.
[Figure: throughput versus rtt for UDT, CUBIC, xdd-read, and xdd-write.]

Conclusions and Future Work
Summary, for smaller memory transfers over dedicated connections:
- Collected systematic throughput measurements over different rtt ranges:
  - throughput profiles have two modes, in sharp contrast to the well-known convex profiles
  - convex: longer rtt; concave: shorter to medium rtt
- Developed analytical models to explain the two modes.
- The results also lead to practical solutions:
  - choose the TCP version for the connection rtt among Reno, Scalable TCP, Hamilton TCP, Highspeed TCP, and CUBIC (default in Linux)
  - load the corresponding congestion avoidance kernel module
Future work:
- large data transfers: the congestion avoidance region
- disk and file transfers: effects of IO limits
- TCP-UDT trade-offs
- parallel TCP streams

Thank you