State of the art in the use of long distance network
J. Bunn, D. Nae, H. Newman, S. Ravot, X. Su, Y. Xia, California Institute of Technology
International ICFA Workshop on HEP Networking, Daegu, Korea, May 23, 2005

AGENDA
- TCP performance over high speed/latency networks
- Recent results
- End-systems performance
- Next generation network
- Advanced R&D projects

Single TCP stream performance under periodic losses
- Bandwidth available = 1 Gbps; loss rate = 0.01%: LAN BW utilization = 99%, WAN BW utilization = 1.2%
- TCP throughput is much more sensitive to packet loss in WANs than in LANs
- TCP's congestion control algorithm (AIMD) is not well suited to gigabit networks
- The effect of packet loss can be disastrous
- TCP is inefficient in high bandwidth*delay networks
- The performance outlook for computational grids is poor if we continue to rely solely on the widely deployed TCP Reno
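The slide does not show the arithmetic behind these utilization figures. As a rough, non-authoritative illustration, the Mathis approximation for steady-state TCP Reno throughput (on the order of MSS / (RTT * sqrt(loss))) reproduces the contrast; the LAN and WAN RTT values below are assumptions chosen for illustration, not numbers from the talk.

```python
# Rough illustration (not from the slides) of why the same 0.01% loss rate hurts
# a WAN far more than a LAN: the Mathis approximation bounds steady-state TCP
# Reno throughput by roughly MSS / (RTT * sqrt(loss)).
from math import sqrt

def reno_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate TCP Reno throughput in bits/s (Mathis et al. model)."""
    return (mss_bytes * 8 / (rtt_s * sqrt(loss_rate))) * sqrt(3.0 / 2.0)

LINK_BPS = 1e9   # 1 Gbps available bandwidth, as on the slide
LOSS = 1e-4      # 0.01% packet loss
MSS = 1460       # bytes; standard Ethernet MSS (assumed)

for name, rtt in [("LAN, RTT ~0.2 ms (assumed)", 0.0002),
                  ("WAN, RTT ~180 ms (assumed)", 0.180)]:
    tput = min(reno_throughput_bps(MSS, rtt, LOSS), LINK_BPS)
    print(f"{name}: ~{tput / LINK_BPS:.0%} of the 1 Gbps link")
```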

Single TCP stream between Caltech and CERN
- Available (PCI-X) bandwidth = 8.5 Gbps
- RTT = 250 ms (16,000 km)
- 9000-byte MTU
- 15 min to increase throughput from 3 to 6 Gbps
- Sending station: Tyan S2882 motherboard, 2x Opteron 2.4 GHz, 2 GB DDR
- Receiving station: CERN OpenLab HP rx4640, 4x 1.5 GHz Itanium-2, zx1 chipset, 8 GB memory
- Network adapter: S2io 10 GbE
[Throughput plot annotations: burst of packet losses, single packet loss, CPU load = 100%]

Responsiveness
- Time to recover from a single packet loss: T = C · RTT^2 / (2 · MSS), where C is the capacity of the link (C and MSS in consistent units)
- A large MTU accelerates the growth of the window
- The time to recover from a packet loss decreases with a larger MTU
- A larger MTU reduces the per-frame overhead (saves CPU cycles, reduces the number of packets)

Path                  Bandwidth   RTT (ms)   MTU (byte)   Time to recover
LAN                   10 Gb/s     ...        ...          ... ms
Geneva - Chicago      10 Gb/s     ...        ...          ... hr 32 min
Geneva - Los Angeles   1 Gb/s     ...        ...          ... min
Geneva - Los Angeles  10 Gb/s     ...        ...          ... hr 51 min
Geneva - Los Angeles  10 Gb/s     ...        ...          ... min
Geneva - Los Angeles  10 Gb/s     ...        ...k (TSO)   5 min
Geneva - Tokyo         1 Gb/s     ...        ...          ... hr 04 min
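The recovery-time formula above is easy to evaluate directly. The sketch below is only an illustration of how recovery time scales with capacity, RTT and MSS; the example link parameters are assumptions, not the (partly missing) entries of the table.

```python
# Time to recover after a single loss for an AIMD (Reno-like) flow, per the
# slide's formula T = C * RTT^2 / (2 * MSS): after the window is halved it grows
# by one MSS per RTT until the link is full again.
def recovery_time_s(capacity_bps, rtt_s, mss_bytes):
    return capacity_bps * rtt_s ** 2 / (2 * mss_bytes * 8)

# Example parameters (assumptions for illustration only):
examples = [
    ("10 Gb/s, RTT 120 ms, MSS 1460 B", 10e9, 0.120, 1460),
    ("10 Gb/s, RTT 120 ms, MSS 8960 B", 10e9, 0.120, 8960),
    (" 1 Gb/s, RTT 300 ms, MSS 1460 B", 1e9, 0.300, 1460),
]
for label, c, rtt, mss in examples:
    print(f"{label}: ~{recovery_time_s(c, rtt, mss) / 60:.0f} min")
```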

Fairness: TCP Reno MTU & RTT bias
- Flows with different MTUs share the available bandwidth very poorly.

TCP Variants
- HSTCP, Scalable TCP, Westwood+, BIC TCP, H-TCP
- FAST TCP
  - Based on TCP Vegas
  - Uses end-to-end delay and loss to dynamically adjust the congestion window
  - Defines an explicit equilibrium
  - 7.3 Gbps between CERN and Caltech for hours
- FAST vs Reno at 1G (capacity = 1 Gbps; 180 ms round-trip latency; 1 flow; 3600 s):
  Linux TCP 19% BW utilization, Linux TCP (optimized) 27%, FAST 95%
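FAST TCP's published window update (Jin, Wei and Low) rescales the window by the ratio of the propagation RTT to the observed RTT, so the flow backs off as queueing delay builds instead of waiting for loss. Below is a minimal sketch of that update rule; the alpha and gamma values and the toy queueing model are illustrative assumptions, not the configuration used in the talk.

```python
# Minimal sketch of the FAST TCP periodic window update (delay-based):
#   w <- min(2w, (1 - gamma) * w + gamma * ((baseRTT / RTT) * w + alpha))
# The flow settles where roughly alpha packets of its own data sit in the queue.
def fast_window_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    target = (base_rtt / rtt) * w + alpha
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

# Toy example: one flow on a 180 ms path whose queueing delay grows with the
# window (a crude stand-in for a real bottleneck queue; an assumption).
w, base_rtt = 10.0, 0.180
for _ in range(200):
    rtt = base_rtt + w * 1e-5
    w = fast_window_update(w, base_rtt, rtt)
print(f"congestion window settles near {w:.0f} packets")
```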

TCP Variants Performance
- Tests between CERN and Caltech
- Capacity = OC-192; 264 ms round-trip latency; 1 flow
- Sending station: Tyan S2882 motherboard, 2x Opteron 2.4 GHz, 2 GB DDR
- Receiving station (CERN OpenLab): HP rx4640, 4x 1.5 GHz Itanium-2, zx1 chipset, 8 GB memory
- Network adapter: Neterion 10 GE NIC

Measured throughput (single flow):
Linux TCP        3.0 Gbps
Linux Westwood+  4.1 Gbps
Linux BIC TCP    5.0 Gbps
FAST TCP         7.3 Gbps

Internet2 Land Speed Record (LSR): Redefining the Role and Limits of TCP
- Product of transfer speed and distance using standard Internet (TCP/IP) protocols
- Single-stream 7.5 Gbps x 16 kkm with Linux: July 2004
- IPv4 multi-stream record with FAST TCP: 6.86 Gbps x 27 kkm: Nov 2004
- IPv6 record: 5.11 Gbps between Geneva and StarLight: Jan
- Concentrate now on reliable Terabyte-scale file transfers
- Disk-to-disk marks: 536 MBytes/sec (Windows); 500 MBytes/sec (Linux)
- Note system issues: PCI-X bus, network interface, disk I/O controllers, CPU, drivers
[Chart: Internet2 LSRs over time, blue = HEP; Nov. record 7.2 Gbps x 20.7 kkm; vertical axis: throughput (Petabit-m/sec)]
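The LSR metric is simply throughput multiplied by distance. As a quick, illustrative check, the records quoted above convert into the Petabit-meter/second units of the chart as follows.

```python
# The Internet2 LSR metric is throughput x distance; convert the records quoted
# on this slide into the chart's Petabit-meter/second units.
def lsr_petabit_m_per_s(throughput_gbps, distance_kkm):
    bits_per_s = throughput_gbps * 1e9
    meters = distance_kkm * 1e6   # 1 kkm = 1000 km
    return bits_per_s * meters / 1e15

print(lsr_petabit_m_per_s(7.5, 16))    # single-stream mark:     ~120 Pb-m/s
print(lsr_petabit_m_per_s(6.86, 27))   # multi-stream FAST mark: ~185 Pb-m/s
```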

SC2004: High Speed TeraByte Transfers for Physics
- Demonstrating that many 10 Gbps wavelengths can be used efficiently over continental and transoceanic distances
- Preview of the globally distributed grid system now being developed in preparation for the next generation of high-energy physics experiments at CERN's Large Hadron Collider (LHC)
- Monitoring the WAN performance using the MonALISA agent-based system
- Major partners: Caltech, FNAL, SLAC
- Major sponsors: Cisco, S2io, HP, Newisys
- Major networks: NLR, Abilene, ESnet, LHCnet, AMPATH, TeraGrid
- Bandwidth Challenge award: 101 Gigabit per second mark

SC2004 Network (I)
- 80 10 GE ports
- 50 10 GE network adapters
- 10 Cisco 7600/6500 switches dedicated to the experiment
- Backbone for the experiment built in two days

101 Gigabit Per Second Mark
[Aggregate bandwidth plot (source: Bandwidth Challenge committee); annotations: 101 Gbps peak, unstable end-systems, end of demo]

High Throughput Disk-to-Disk Transfers: From 0.1 to 1 GByte/sec
- Server hardware (rather than network) bottlenecks:
  - Write/read and transmit tasks share the same limited resources: CPU, PCI-X bus, memory, I/O chipset
  - PCI-X bus bandwidth: 8.5 Gbps [133 MHz x 64 bit]
- Performance in this range (from 100 MByte/sec up to 1 GByte/sec) is required to build a responsive Grid-based processing and analysis system for the LHC
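A quick, illustrative computation shows where the 8.5 Gbps figure comes from and why a single PCI-X bus sits right at the edge of the 1 GByte/sec target.

```python
# Raw PCI-X bandwidth is bus width x clock rate (before protocol overhead),
# which is why ~1 GByte/s disk-to-disk pushes a single 133 MHz / 64-bit bus.
bus_width_bits = 64
clock_hz = 133e6
raw_bps = bus_width_bits * clock_hz
print(f"PCI-X raw bandwidth: {raw_bps / 1e9:.1f} Gbps "
      f"(~{raw_bps / 8 / 1e9:.2f} GByte/s before overhead)")
```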

Transferring a TB from Caltech to CERN
- 3 Supermicro Marvell SATA disk controllers + 24 SATA 7200 rpm disks
- Local disk I/O: 9.6 Gbits/sec (1.2 GBytes/sec read/write, with <20% CPU utilization)
- Neterion SR 10 GE NIC
  - 10 GE NIC: 7.5 Gbits/sec (memory-to-memory, with 52% CPU utilization)
  - 2x 10 GE NIC (802.3ad link aggregation): 11.1 Gbits/sec (memory-to-memory)
- Memory-to-memory WAN data flow and local memory-to-disk read/write flow are not matched when the two operations are combined
- Quad AMD Opteron processors with 3 AMD-8131 chipsets: 4x 64-bit/133 MHz PCI-X slots
- 4.3 Gbits/sec (~500 MBytes/sec): single TCP stream from Caltech to CERN, 1 TB file
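As a back-of-the-envelope illustration, the rates quoted on this slide translate into the following wall-clock times for a 1 TB file.

```python
# Illustrative only: time to move a 1 TB (decimal terabyte) file at the rates
# quoted on this slide, ignoring protocol overhead and slow-start.
TB_BITS = 1e12 * 8

for label, gbps in [("single TCP stream, 4.3 Gb/s", 4.3),
                    ("PCI-X bus ceiling, 8.5 Gb/s", 8.5)]:
    minutes = TB_BITS / (gbps * 1e9) / 60
    print(f"{label}: ~{minutes:.0f} minutes per TB")
```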

New services aspirations
- Circuit-based services
  - Layer 1 & 2 switching, "the light path"
  - High-bandwidth point-to-point circuits for big users (up to 10 Gbps)
  - Redundant paths
  - On demand
  - Advance reservation system; Authentication, Authorization and Accounting (AAA)
  - Control plane: GMPLS, UCLP, MonALISA
- New initiatives/projects
  - GEANT2, USNet, HOPI
  - GLIF (Global Lambda Integrated Facility)
  - OMNInet, DRAGON, CHEETAH
  - UltraLight and LambdaStation

New Technology Candidates: Opportunities and Issues
- New standards for SONET infrastructures
  - An alternative to the expensive Packet-over-SONET (POS) technology currently used
  - May significantly change the way in which we use SONET infrastructures
- 10 GE WAN-PHY standard
  - Ethernet frames carried across OC-192 SONET networks
  - Ethernet as an inexpensive linking technology between LANs and WANs
  - Supported by only a few vendors
- LCAS/VCAT standards
  - Point-to-point circuit-based services
  - Transport capacity adjusted according to the traffic pattern
  - "Bandwidth on demand" becomes possible for SONET networks
  - "Intelligent" optical multiplexers (available at the end of 2005)

UltraLight: Developing Advanced Network Services for Data Intensive HEP Applications
- UltraLight: a next-generation hybrid packet- and circuit-switched network infrastructure
  - Packet-switched: cost-effective solution; requires ultrascale protocols to share 10G efficiently and fairly
  - Circuit-switched: scheduled or sudden "overflow" demands handled by provisioning additional wavelengths; use path diversity, e.g. across the US, Atlantic, Canada, ...
- Extend and augment existing grid computing infrastructures (currently focused on CPU/storage) to include the network as an integral component
- Using MonALISA to monitor and manage global systems
- Partners: Caltech, UF, FIU, UMich, SLAC, FNAL, MIT/Haystack; CERN, NLR, CENIC, Internet2; TransLight, UKLight, NetherLight; UvA, UCL, KEK, Taiwan
- Strong support from Cisco

UltraLight 10GbE Caltech
- L1, L2 and L3 services
- Hybrid packet- and circuit-switched PoP
- Control plane is L3

LambdaStation
- A joint Fermilab and Caltech project
- Enabling HEP applications to send high-throughput traffic between mass storage systems across advanced network paths
- Dynamic path provisioning across Ultranet and NLR, plus an Abilene "standard path"
- DOE funded

Summary
- For many years the Wide Area Network was the bottleneck; this is no longer the case in many countries, which makes the deployment of a data-intensive Grid infrastructure possible!
- Some transport-protocol issues still need to be resolved; however, there are many encouraging signs that practical solutions may now be in sight
- 1 GByte/sec disk-to-disk challenge
  - Today: 1 TB at 536 MBytes/sec from CERN to Caltech
- Next generation network and Grid system: UltraLight and LambdaStation
  - The integrated, managed network
  - Extend and augment existing grid computing infrastructures (currently focused on CPU/storage) to include the network as an integral component

Thanks & Questions