High Speed WAN Data Transfers for Science
J. Bunn, D. Nae, H. Newman, S. Ravot, X. Su, Y. Xia (California Institute of Technology)
Session "Recent Results", TERENA Networking Conference 2005, Poznan, Poland, 6-9 June

AGENDA
- TCP performance over high speed/latency networks
- End-systems performance
- Next generation network
- Advanced R&D projects

Single TCP stream performance under periodic losses
- Bandwidth available = 1 Gbps; loss rate = 0.01%: LAN BW utilization = 99%, WAN BW utilization = 1.2%
- TCP throughput is much more sensitive to packet loss in WANs than in LANs
- TCP's congestion control algorithm (AIMD) is not well suited to gigabit networks
- The effect of packet loss can be disastrous
- TCP is inefficient in high bandwidth*delay networks
- The performance outlook for computational grids is poor if we continue to rely solely on the widely deployed TCP Reno
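The LAN/WAN contrast above can be checked with the well-known Mathis et al. approximation for Reno throughput under random loss. The sketch below is not from the talk and assumes illustrative RTTs of about 1 ms for the LAN and 120 ms for the transatlantic WAN path:

```python
from math import sqrt

# Steady-state TCP Reno throughput under random loss (Mathis et al. approximation):
#   throughput ~= MSS / (RTT * sqrt(2p/3))
# The RTT values below are illustrative assumptions, not figures from the slide.
def reno_throughput_bps(mss_bytes, rtt_s, loss_rate, capacity_bps):
    rate = (mss_bytes * 8) / (rtt_s * sqrt(2 * loss_rate / 3))
    return min(rate, capacity_bps)        # a flow cannot exceed the link capacity

CAPACITY = 1e9      # 1 Gbps available bandwidth (from the slide)
LOSS = 1e-4         # 0.01 % packet loss rate (from the slide)
MSS = 1460          # bytes: standard 1500-byte MTU minus IP/TCP headers

for name, rtt in [("LAN", 0.001), ("Transatlantic WAN", 0.120)]:
    bw = reno_throughput_bps(MSS, rtt, LOSS, CAPACITY)
    print(f"{name:18s} utilization ~ {100 * bw / CAPACITY:.1f} %")
# Prints roughly 100 % for the LAN and ~1.2 % for the WAN,
# matching the 99 % vs. 1.2 % comparison on the slide.
```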

Single TCP stream between Caltech and CERN
- Available (PCI-X) bandwidth = 8.5 Gbps
- RTT = 250 ms (16,000 km)
- 9000-byte MTU
- 15 min to increase throughput from 3 to 6 Gbps
- Sending station: Tyan S2882 motherboard, 2x Opteron 2.4 GHz, 2 GB DDR
- Receiving station: CERN OpenLab HP rx4640, 4x 1.5 GHz Itanium-2, zx1 chipset, 8 GB memory
- Network adapter: S2io 10 GbE
[Throughput plot annotations: burst of packet losses, single packet loss, CPU load = 100%]

Responsiveness

Path                  Bandwidth   RTT (ms)   MTU (Byte)   Time to recover
LAN                   10 Gb/s     ...        ...          ... ms
Geneva-Chicago        10 Gb/s     ...        ...          ... hr 32 min
Geneva-Los Angeles     1 Gb/s     ...        ...          ... min
Geneva-Los Angeles    10 Gb/s     ...        ...          ... hr 51 min
Geneva-Los Angeles    10 Gb/s     ...        ...          ... min
Geneva-Los Angeles    10 Gb/s     ...        ...k (TSO)   5 min
Geneva-Tokyo           1 Gb/s     ...        ...          ... hr 04 min

- A large MTU accelerates the growth of the window
- Time to recover from a packet loss decreases with a large MTU
- A larger MTU reduces per-frame overhead (saves CPU cycles, reduces the number of packets)

Time to recover from a single packet loss: t = C · RTT² / (2 · MSS), where C is the capacity of the link.
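A minimal sketch evaluating the recovery-time formula above; the RTT and MTU inputs are illustrative assumptions rather than the values missing from the table:

```python
# Time for standard TCP to recover its sending rate after a single loss,
# using the formula above: t = C * RTT^2 / (2 * MSS).
# Path parameters are illustrative assumptions, not measurements from the talk.
def recovery_time_s(capacity_bps, rtt_s, mtu_bytes, header_bytes=40):
    mss_bits = (mtu_bytes - header_bytes) * 8      # strip IP + TCP headers
    return capacity_bps * rtt_s ** 2 / (2 * mss_bits)

examples = [
    ("LAN, 10 Gb/s, MTU 1500",                10e9, 0.001, 1500),
    ("Geneva-Chicago, 10 Gb/s, MTU 1500",      10e9, 0.120, 1500),
    ("Geneva-Los Angeles, 1 Gb/s, MTU 1500",    1e9, 0.180, 1500),
    ("Geneva-Los Angeles, 10 Gb/s, MTU 9000",  10e9, 0.180, 9000),
]
for name, cap, rtt, mtu in examples:
    t = recovery_time_s(cap, rtt, mtu)
    print(f"{name:40s} {t:8.1f} s  (~{t / 60:.0f} min)")
# Recovery time grows linearly with capacity and quadratically with RTT;
# raising the MTU from 1500 to 9000 bytes cuts it by roughly a factor of six.
```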

Fairness: TCP Reno MTU & RTT bias
- Two TCP streams share a 1 Gbps bottleneck between CERN (GVA) and Starlight (CHI); RTT = 117 ms
- MTU = 1500 bytes: avg. throughput over a period of 4000 s = 50 Mb/s
- MTU = 9000 bytes: avg. throughput over a period of 4000 s = 698 Mb/s
- A factor of 14!
- Connections with a large MTU increase their rate quickly and grab most of the available bandwidth
[Testbed diagram: Host #1 and Host #2 at each site, GbE switches, 1 GE host links (the bottleneck), POS 2.5 Gbps WAN link]
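The mechanism behind the imbalance is that Reno adds one MSS per RTT, so a jumbo-frame flow ramps roughly six times faster. The toy model below, a sketch assuming perfectly synchronized loss events, illustrates the effect:

```python
# Toy fluid model of two Reno-style AIMD flows sharing a 1 Gbps bottleneck
# with the same RTT but different MTUs, assuming perfectly synchronized losses.
# A sketch of the mechanism only, not a reproduction of the measurement.
C = 1e9                                 # bottleneck capacity, bit/s
RTT = 0.117                             # seconds (from the slide)
mss_bits = [1460 * 8, 8960 * 8]         # payload bits per segment (MTU 1500 / 9000)
rate = [0.0, 0.0]                       # current sending rates, bit/s
sent = [0.0, 0.0]                       # cumulative bits delivered
steps = int(4000 / RTT)                 # ~4000 s of simulated time, as on the slide
for _ in range(steps):
    for i in range(2):
        rate[i] += mss_bits[i] / RTT    # additive increase: one MSS per RTT
        sent[i] += rate[i] * RTT
    if sum(rate) > C:                   # shared loss event: both flows back off
        rate = [r / 2 for r in rate]
avg_mbps = [s / 4000 / 1e6 for s in sent]
print(f"MTU 1500: {avg_mbps[0]:.0f} Mb/s   MTU 9000: {avg_mbps[1]:.0f} Mb/s")
# The jumbo-frame flow gets roughly 6x the throughput in this idealized model;
# the measured factor of 14 is larger still, since real losses are not
# perfectly synchronized.
```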

TCP Variants
- HSTCP, Scalable TCP, Westwood+, BIC TCP, H-TCP
- FAST TCP:
  - Based on TCP Vegas
  - Uses end-to-end delay and loss to dynamically adjust the congestion window
  - Defines an explicit equilibrium
  - 7.3 Gbps between CERN and Caltech for hours
- FAST vs. Reno (capacity = 1 Gbps; 180 ms round-trip latency; 1 flow; 3600 s): Linux TCP 19% BW utilization, Linux TCP (optimized) 27%, FAST 95%
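For reference, here is a sketch of the window update rule published for FAST TCP; the alpha and gamma values and the fixed RTT sample are illustrative assumptions, not parameters from the experiments described here:

```python
# Periodic FAST TCP window update as published by Jin, Wei and Low:
#   w <- min( 2w, (1 - gamma)*w + gamma*((baseRTT / RTT)*w + alpha) )
# The congestion signal is queueing delay (RTT - baseRTT) rather than loss.
# alpha (target number of packets buffered in the path) and gamma are tuning
# parameters; the values below are illustrative assumptions.
def fast_update(w, base_rtt, rtt, alpha=200, gamma=0.5):
    return min(2 * w, (1 - gamma) * w + gamma * ((base_rtt / rtt) * w + alpha))

# Holding the measured RTT fixed (ignoring the feedback of w on queueing delay),
# the update converges to the fixed point w* = alpha * rtt / (rtt - baseRTT).
w, base_rtt, rtt = 100.0, 0.180, 0.200       # window in packets, RTTs in seconds
for _ in range(200):
    w = fast_update(w, base_rtt, rtt)
print(f"window after 200 updates: {w:.0f} packets "
      f"(analytic fixed point: {200 * rtt / (rtt - base_rtt):.0f})")
```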

TCP Variants Performance
- Tests between CERN and Caltech
- Capacity = OC-192; 264 ms round-trip latency; 1 flow
- Sending station: Tyan S2882 motherboard, 2x Opteron 2.4 GHz, 2 GB DDR
- Receiving station (CERN OpenLab): HP rx4640, 4x 1.5 GHz Itanium-2, zx1 chipset, 8 GB memory
- Network adapter: Neterion 10 GE NIC
- Results: Linux TCP 3.0 Gbps, Linux Westwood+ 4.1 Gbps, Linux BIC TCP 5.0 Gbps, FAST TCP 7.3 Gbps

Internet2 Land Speed Record (LSR): Redefining the Role and Limits of TCP
- Metric: product of transfer speed and distance using standard Internet (TCP/IP) protocols
- Single stream 7.5 Gbps x 16 kkm with Linux: July 2004
- IPv4 multi-stream record with FAST TCP: 6.86 Gbps x 27 kkm: Nov 2004
- IPv6 record: 5.11 Gbps between Geneva and Starlight: Jan 2005
- Concentrate now on reliable Terabyte-scale file transfers
- Disk-to-disk marks: 536 MBytes/sec (Windows); 500 MBytes/sec (Linux)
- Note system issues: PCI-X bus, network interface, disk I/O controllers, CPU, drivers
[Chart: history of Internet2 LSRs, throughput in Petabit-m/sec, blue = HEP; Nov. record annotated at 7.2 Gbps x 20.7 kkm]
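For reference, a small arithmetic sketch, not from the talk, evaluating the LSR metric for the records listed above:

```python
# The LSR metric is throughput multiplied by distance. Evaluating it for the
# records listed above (decimal units; 1 kkm = 1000 km):
def lsr_petabit_metres_per_sec(gbps, kkm):
    return gbps * 1e9 * kkm * 1e6 / 1e15      # (bit/s x m) expressed in Pb-m/s

records = [("Single stream, July 2004", 7.5, 16),
           ("IPv4 multi-stream with FAST TCP, Nov 2004", 6.86, 27)]
for label, gbps, kkm in records:
    print(f"{label}: {lsr_petabit_metres_per_sec(gbps, kkm):.0f} petabit-m/s")
# Prints roughly 120 and 185 petabit-metres per second respectively.
```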

High Throughput Disk-to-Disk Transfers: From 0.1 to 1 GByte/sec
Server hardware (rather than network) bottlenecks:
- Write/read and transmit tasks share the same limited resources: CPU, PCI-X bus, memory, I/O chipset
- PCI-X bus bandwidth: 8.5 Gbps [133 MHz x 64 bit]
- Neterion SR 10 GE NIC
- 10 GE NIC: 7.5 Gbits/sec (memory-to-memory, with 52% CPU utilization)
- 2x 10 GE NIC (802.3ad link aggregation): 11.1 Gbits/sec (memory-to-memory)
- 1 TB at 536 MB/sec from CERN to Caltech
Performance in this range (from 100 MByte/sec up to 1 GByte/sec) is required to build a responsive Grid-based Processing and Analysis System for LHC.
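A quick back-of-the-envelope check, not from the talk, of the bus and transfer figures quoted above, using decimal units throughout:

```python
# PCI-X raw bandwidth as quoted on the slide: 133 MHz clock x 64-bit wide bus.
pcix_bps = 133e6 * 64
print(f"PCI-X 133 MHz / 64 bit: {pcix_bps / 1e9:.2f} Gb/s")      # ~8.51 Gb/s

# A single 10 GbE NIC is therefore limited by the host bus before the wire,
# consistent with the ~7.5 Gb/s memory-to-memory result per NIC quoted above.
disk_to_disk_bps = 536e6 * 8                  # 536 MB/s expressed on the wire
print(f"536 MB/s disk-to-disk = {disk_to_disk_bps / 1e9:.2f} Gb/s; "
      f"1 TB takes ~{1e12 / 536e6 / 60:.0f} minutes at that rate")
```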

SC2004: High Speed TeraByte Transfers for Physics
- Demonstrating that many 10 Gbps wavelengths can be used efficiently over continental and transoceanic distances
- A preview of the globally distributed grid system now being developed in preparation for the next generation of high-energy physics experiments at CERN's Large Hadron Collider (LHC)
- Monitoring the WAN performance using the MonALISA agent-based system
- Major partners: Caltech, FNAL, SLAC
- Major sponsors: Cisco, S2io, HP, Newisys
- Major networks: NLR, Abilene, ESnet, LHCnet, AmPath, TeraGrid
- Bandwidth Challenge award: the 101 Gigabit per second mark

SC2004 Network (I)
- 80 10 GE ports
- 50 10 GE network adapters
- 10 Cisco 7600/6500 dedicated to the experiment
- Backbone for the experiment built in two days

101 Gigabit Per Second Mark
[Bandwidth Challenge throughput plot: peak at 101 Gbps; annotations: unstable end-systems, end of demo. Source: Bandwidth Challenge committee]

UltraLight: Developing Advanced Network Services for Data Intensive HEP Applications
- UltraLight: a next-generation hybrid packet- and circuit-switched network infrastructure
  - Packet-switched: cost-effective solution; requires ultrascale protocols to share 10G efficiently and fairly
  - Circuit-switched: scheduled or sudden “overflow” demands handled by provisioning additional wavelengths; use path diversity, e.g. across the US, Atlantic, Canada,…
- Extend and augment existing grid computing infrastructures (currently focused on CPU/storage) to include the network as an integral component
- Using MonALISA to monitor and manage global systems
- Partners: Caltech, UF, FIU, UMich, SLAC, FNAL, MIT/Haystack; CERN, NLR, CENIC, Internet2; Translight, UKLight, Netherlight; UvA, UCL, KEK, Taiwan
- Strong support from Cisco

UltraLight 10GbE Caltech
- L1, L2 and L3 services
- Hybrid packet- and circuit-switched PoP
- Control plane is L3

LambdaStation
- A joint Fermilab and Caltech project
- Enabling HEP applications to send high throughput traffic between mass storage systems across advanced network paths
- Dynamic path provisioning across USNet, UltraLight, NLR; plus an ESnet/Abilene “standard path”
- DoE funded

LambdaStation: Results
- Plot shows functionality
- Alternate use of lightpaths and conventional routed networks
- GridFTP traffic

Summary
- For many years the Wide Area Network has been the bottleneck; this is no longer the case in many countries, thus making deployment of a data-intensive Grid infrastructure possible!
- Some transport protocol issues still need to be resolved; however, there are many encouraging signs that practical solutions may now be in sight.
- The 1 GByte/sec disk-to-disk challenge. Today: 1 TB at 536 MB/sec from CERN to Caltech.
- Next generation network and Grid system: UltraLight and LambdaStation
  - The integrated, managed network
  - Extend and augment existing grid computing infrastructures (currently focused on CPU/storage) to include the network as an integral component.

Thanks & Questions