Parallel TCP
Bill Allcock, Argonne National Laboratory

Definitions

- Logical Transfer: The transfer of interest to the initiator, e.g. move file foo from server A to server B.
- Network Endpoint: In general, something that has an IP address, such as a Network Interface Card (NIC).
- Parallel Transfer: Use of multiple TCP streams between a given pair of network endpoints during a logical transfer.
- Striped Transfer: Use of multiple pairs of network endpoints during a logical transfer.
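To make the "parallel transfer" definition concrete, here is a minimal toy sketch (this is not GridFTP code; the function names and the 8-byte offset framing are invented for illustration): the sender splits a buffer into N slices and ships each slice over its own TCP connection, and the receiver reassembles by offset.

```python
import socket
import threading

def receive_parallel(srv, n_streams, chunks):
    """Toy receiver: accept n_streams connections; each carries an
    8-byte big-endian offset header followed by that slice's bytes."""
    def handle(conn):
        with conn:
            header = b""
            while len(header) < 8:          # read the full offset header
                header += conn.recv(8 - len(header))
            offset = int.from_bytes(header, "big")
            parts = []
            while True:
                buf = conn.recv(65536)
                if not buf:
                    break
                parts.append(buf)
            chunks[offset] = b"".join(parts)
    threads = []
    for _ in range(n_streams):
        conn, _ = srv.accept()
        t = threading.Thread(target=handle, args=(conn,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()

def send_parallel(data, host, port, n_streams):
    """Toy sender: one TCP connection per slice of the logical transfer."""
    slice_len = -(-len(data) // n_streams)  # ceiling division
    def send(offset):
        with socket.create_connection((host, port)) as s:
            s.sendall(offset.to_bytes(8, "big"))
            s.sendall(data[offset:offset + slice_len])
    threads = [threading.Thread(target=send, args=(i * slice_len,))
               for i in range(n_streams)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Loopback demonstration with 4 parallel streams.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
srv.listen(4)
port = srv.getsockname()[1]

chunks = {}
receiver = threading.Thread(target=receive_parallel, args=(srv, 4, chunks))
receiver.start()

payload = bytes(range(256)) * 1024           # 256 KB test payload
send_parallel(payload, "127.0.0.1", port, 4)
receiver.join()
srv.close()

reassembled = b"".join(chunks[off] for off in sorted(chunks))
print(reassembled == payload)                # True
```

A striped transfer would look similar, except the connections would come from (or go to) different hosts or NICs rather than one pair of endpoints.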

What's Wrong with TCP?

- You probably wouldn't be here if you didn't know that.
- TCP was designed for Telnet- and Web-like applications.
- It was designed when a T1 was a fast network, big memory was 2 MB, not 2 GB, and a big file transfer was 100 MB, not 100 GB or even terabytes.

AIMD and BWDP

The primary problems are:
- The Additive Increase Multiplicative Decrease (AIMD) congestion control algorithm of TCP
- The requirement of a buffer equal to the Bandwidth-Delay Product (BWDP)
- The interaction between those two

We use parallel and striped transfers to work around these problems.

AIMD

To first order, this algorithm:
- Exponentially increases the congestion window (CWND) until it gets a congestion event (slow start)
- Cuts the CWND in half
- Linearly increases the CWND until it reaches the next congestion event (congestion avoidance)

This assumes that congestion is the limiting factor. Note that CWND size determines the maximum achievable bandwidth: roughly CWND / RTT bytes per second.
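The sawtooth this produces can be seen in a toy simulation (a sketch only, not the real kernel algorithm; it ignores timeouts, fast retransmit, and ACK clocking, and counts the window in segments):

```python
def aimd_trace(rtts, loss_at, cwnd=1.0, ssthresh=64.0):
    """Toy AIMD, one step per RTT: slow start doubles CWND below
    ssthresh, congestion avoidance adds 1 segment per RTT, and a
    congestion event halves the window."""
    trace = []
    for t in range(rtts):
        trace.append(cwnd)
        if t in loss_at:          # congestion event this RTT
            ssthresh = cwnd / 2
            cwnd = ssthresh       # multiplicative decrease
        elif cwnd < ssthresh:
            cwnd *= 2             # slow start: exponential growth
        else:
            cwnd += 1             # congestion avoidance: additive increase
    return trace

trace = aimd_trace(12, loss_at={7})
print(trace)
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 65.0, 32.5, 33.5, 34.5, 35.5]
```

Note how quickly the window grows before the loss and how slowly it climbs back afterward; on long fat networks that slow linear climb is the killer.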

BWDP

Use a tank as an analogy:
- I can keep putting water in until it is full.
- After that, I can only put in one gallon for each gallon removed.
- You can calculate the volume of the tank by multiplying the cross-sectional area by the height.
- Think of the bandwidth as the area and the RTT as the length of the network pipe: BWDP = bandwidth x RTT.
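The tank's volume corresponds to the amount of data that must be "in the pipe" (and therefore buffered at the sender) to keep the link full. A quick sketch of the arithmetic:

```python
def bwdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-Delay Product: bits in flight on the path, in bytes."""
    return bandwidth_bps * rtt_s / 8

# A T1 (1.544 Mb/s) at 50 ms needs only ~10 KB of buffer...
print(bwdp_bytes(1.544e6, 0.050))   # 9650.0 bytes
# ...while GigE at the same RTT needs 6.25 MB.
print(bwdp_bytes(1e9, 0.050))       # 6250000.0 bytes
```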

Recovery Time

Recovery Time for a Single Congestion Event

- T1 (1.544 Mb/s) with 50 ms RTT: BWDP ~10 KB
  - Recovery time (1500-byte MTU): 0.16 s
- GigE with 50 ms RTT: BWDP 6,250 KB
  - Recovery time (1500-byte MTU): 104 s
- GigE to Amsterdam (100 ms RTT): BWDP 12,500 KB
  - Recovery time (1500-byte MTU): 416 s
- GigE to CERN (160 ms RTT): BWDP 20,000 KB
  - Recovery time (1500-byte MTU): 1066 s (17.8 min)
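These figures follow directly from the AIMD rules: after a loss halves a full BWDP-sized window, congestion avoidance regains roughly one MSS of window per RTT, so recovery takes (BWDP / 2) / MSS round trips. A sketch of that arithmetic:

```python
def recovery_seconds(bandwidth_bps, rtt_s, mss_bytes=1500):
    """Time to climb back after one loss halves a full window,
    assuming congestion avoidance gains 1 MSS per RTT."""
    cwnd = bandwidth_bps * rtt_s / 8           # BWDP in bytes
    rtts_to_recover = (cwnd / 2) / mss_bytes
    return rtts_to_recover * rtt_s

print(round(recovery_seconds(1.544e6, 0.050), 2))  # 0.16 (T1, 50 ms)
print(round(recovery_seconds(1e9, 0.050)))         # 104  (GigE, 50 ms)
print(round(recovery_seconds(1e9, 0.100)))         # 417  (Amsterdam)
print(round(recovery_seconds(1e9, 0.160)))         # 1067 (CERN)
```

Note the quadratic dependence on RTT: the window to recover grows with RTT, and each increment also takes one RTT longer.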

How Does Parallel TCP Help?

We are basically cheating... that is, we are taking advantage of loopholes in the system:
- It reduces the severity of a congestion event
- Buffers are divided across streams, so recovery is faster
- You probably get more than your fair share in the router

Reduced Severity from Congestion Events

Don't put all your eggs in one basket.
- With normal TCP, your bandwidth reduction is 50%:
  - 1000 Mb/s * 50% = 500 Mb/s reduction
- With parallel TCP, the bandwidth reduction is (total BW / N streams) * 50%:
  - 1000 / 4 * 50% = 125 Mb/s reduction

Note that we are assuming only one stream receives a congestion event.
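The arithmetic above can be written out as a one-line sketch (it assumes the streams share the load evenly and only one of them sees the loss):

```python
def throughput_lost(total_mbps, n_streams):
    """Aggregate bandwidth lost when one of n_streams halves its window."""
    return (total_mbps / n_streams) / 2

print(throughput_lost(1000, 1))   # 500.0 Mb/s lost with a single stream
print(throughput_lost(1000, 4))   # 125.0 Mb/s lost with four streams
```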

Faster Recovery from Congestion Events

- The optimum TCP buffer size is now BWDP / N, where N is the number of streams.
- Since the buffers are reduced in size by a factor of 1/N, so is the recovery time.
- This can also help work around host limitations: if the maximum buffer size is too small for maximum bandwidth, you can get multiple smaller buffers.
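Combining the two effects: each stream only needs BWDP / N of buffer, and because recovery time is proportional to the window that must be regained, it shrinks by the same factor of 1/N (a sketch under the same AIMD assumptions as above):

```python
def per_stream(bw_bps, rtt_s, n_streams, mss_bytes=1500):
    """Per-stream buffer (bytes) and per-stream recovery time (s)."""
    buf = bw_bps * rtt_s / 8 / n_streams       # BWDP / N
    recovery = (buf / 2) / mss_bytes * rtt_s   # AIMD climb-back
    return buf, recovery

# GigE at 50 ms: one stream needs 6.25 MB of buffer and ~104 s to recover;
# four streams need ~1.56 MB each and recover in ~26 s.
print(per_stream(1e9, 0.050, 1))
print(per_stream(1e9, 0.050, 4))
```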

More than Your Fair Share

- This part is inferred; we have no data with which to back it up.
- Routers apply fair-sharing algorithms to the streams being processed.
- Since your logical transfer now has N streams, it is getting N times the service it otherwise normally would.
- I am told there are routers that can detect parallel streams and will hold you to your fair share, though I have not run into one yet.

What about Striping?

- Typically used in a cluster with a shared file system, but it can also be a multi-homed host.
- You get all the advantages of parallel TCP.
- You also get parallelism of CPUs, disk subsystems, buses, NICs, etc.
- You can, in certain circumstances, also get parallelism of network paths.
- This is a much more complicated implementation and beyond the scope of what we are primarily discussing here.

Nothing Comes for Free...

- As noted earlier, we are cheating.
- Congestion control is there for a reason.
- Buffer limitations may or may not be there for a reason.
- Other netizens may ostracize you.

Congestion Control

- Congestion control is in place for a reason.
- If every TCP application started using parallel TCP, overall performance would decrease and there would be a risk of congestive network collapse.
- Note that in the absence of congestion, parallel streams do not help.
- In the face of heavy congestion, they can perform worse.

Buffer Limitations

- More often than not, the system limits are there simply because that is the way the host came out of the box, and it requires root privilege to change them.
- Sometimes, however, they reflect real resource limitations of the host, and you risk crashing the host by over-extending its resources.

Cheat Enough, but Not Too Much

If your use of parallel TCP causes too many problems, you could find yourself in trouble:
- Admins get cranky when you crash their machines.
- Other users get cranky if you are hurting overall network performance.

Be a good netizen.

When Should You Use Parallel TCP?

- Engineered, private, semi-private, or heavily over-provisioned networks are good places to use parallel TCP.
- Bulk data transport: it makes no sense at all to use parallel TCP for most interactive applications.
- QoS: if you are guaranteed the bandwidth, use it.
- Community agreement: you have been given permission to hog the network.
- Lambda-switched networks: you have your own circuit, go nuts.

William (Bill) E. Allcock

Bill Allcock is the technology coordinator and evangelist for GridFTP within the Globus Alliance. Bill has a BS in Computer Science and an MS in Paper Science. In his 15 years of work experience he has been involved in a wide array of areas, including computer networking, distributed systems, embedded systems, data acquisition, process engineering, control system tuning, and colloidal chemistry. Bill's current research focus involves applications requiring access to large (terabyte- and petabyte-sized) data sets, the so-called DataGrid problems. He is also heavily involved in the Global Grid Forum. Bill has presented several previous tutorials on the Globus Toolkit(R), primarily on GridFTP use and development libraries. He has also led tutorials on Introduction to Grids, as well as I/O and security in the Globus Toolkit.