1 High Performance WAN Testbed Experiences & Results Les Cottrell – SLAC Prepared for the CHEP03, San Diego, March 2003

Presentation transcript:

1 High Performance WAN Testbed Experiences & Results Les Cottrell – SLAC Prepared for the CHEP03, San Diego, March 2003. Partially funded by the DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM) and by the SciDAC base program.

2 Outline Who did it? What was done? How was it done? Who needs it? So what’s next? Where do I find out more?

3 Who did it: Collaborators and sponsors
Caltech: Harvey Newman, Steven Low, Sylvain Ravot, Cheng Jin, Xiaoling Wei, Suresh Singh, Julian Bunn
SLAC: Les Cottrell, Gary Buhrmaster, Fabrizio Coccetti
LANL: Wu-chun Feng, Eric Weigle, Gus Hurwitz, Adam Englehart
NIKHEF/UvA: Cees DeLaat, Antony Antony
CERN: Olivier Martin, Paolo Moroni
ANL: Linda Winkler
DataTAG, StarLight, TeraGrid, SURFnet, NetherLight, Deutsche Telecom, Information Society Technologies
Cisco, Level(3), Intel
DoE, European Commission, NSF

4 What was done? Beat the 1 Gbps limit for a single TCP stream across the Atlantic – transferred a TByte in an hour.

| When | From | To | Bottleneck | MTU | Streams | TCP | Throughput |
|------|------|----|------------|-----|---------|-----|------------|
| Nov '02 (SC02) | Amsterdam | Sunnyvale | 1 Gbps | 9000B | 1 | Standard | 923 Mbps |
| Nov '02 (SC02) | Baltimore | Sunnyvale | 10 Gbps | – | 10 | FAST | 8.6 Gbps |
| Feb '03 | Sunnyvale | Geneva | 2.5 Gbps | 9000B | 1 | Standard | 2.38 Gbps |

Set a new Internet2 TCP land speed record of 10,619 Tbit-meters/sec (see www-iepm.slac.stanford.edu/lsr/). With 10 streams achieved 8.6 Gbps across the US. One Terabyte transferred in less than one hour.
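As a quick sanity check on the headline figures, a short Python sketch (not part of the original slides; it uses only the rates quoted in the table above):

```python
# Data moved by a sustained TCP stream in one hour at the rates quoted above.
def tbytes_per_hour(rate_gbps, seconds=3600.0):
    """TBytes (10**12 bytes) transferred at a sustained rate given in Gbits/s."""
    return rate_gbps * 1e9 * seconds / 8 / 1e12

for rate in (0.923, 2.38, 8.6):
    print(f"{rate:5.3f} Gbits/s for one hour -> {tbytes_per_hour(rate):.2f} TBytes")
# 2.38 Gbits/s sustained for an hour is ~1.07 TBytes, consistent with
# "one Terabyte transferred in less than one hour".
```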

5 10GigE Data Transfer Trial (original slide by Olivier Martin, CERN; European Commission). On February 27-28, 2003, over a Terabyte of data was transferred in 3700 seconds by S. Ravot of Caltech between the Level3 PoP in Sunnyvale, near SLAC, and CERN. The data passed through the TeraGrid router at StarLight from memory to memory as a single TCP/IP stream at an average rate of 2.38 Gbps (using large windows and 9 KByte "jumbo" frames). This beat the former record by a factor of approximately 2.5 and used the US-CERN link at 99% efficiency.

6 How was it done: Typical testbed. [Testbed diagram: 12*2-CPU servers, 6*2-CPU servers and 4 disk servers at Sunnyvale and Geneva, Cisco GSRs and a Juniper T640, linked by OC192/POS (10 Gbits/s) and 2.5 Gbits/s circuits; SNV-CHI-AMS-GVA path > 10,000 km (EU+US); Sunnyvale section deployed for SC2002 (Nov '02).]

7 Typical Components. CPU – Pentium 4 (Xeon) at 2.4 GHz; for GE used SysKonnect NICs, for 10GE used Intel NICs; running Linux. Routers – Cisco GSR with OC192/POS and 1 & 10 GE server interfaces (loaned, list price > $1M), Cisco 760x, Juniper T640 (Chicago). Level(3) OC192/POS fibers (loaned; SNV-CHI monthly lease cost ~ $220K). [Photo labels: GSR, heat sink, earthquake strap, bootees, disk servers, compute servers]

8 Challenges. After a loss it can take over an hour for stock TCP (Reno) to recover to maximum throughput at 1 Gbits/s – i.e. it needs a loss rate of no more than 1 in ~2 Gpkts (3 Tbits), or a BER of 1 in 3.6*10^12. PCI bus limitations (66 MHz * 64 bit = 4.2 Gbits/s at best). At 2.5 Gbits/s and 180 msec RTT a 120 MByte window is required, and some tools (e.g. bbcp) will not allow a large enough window (bbcp is limited to 2 MBytes). Slow start at 1 Gbits/s takes about 5-6 secs on a 180 msec link – i.e. if 90% of the measurement should be in the stable (non-slow-start) regime, one needs to measure for 60 secs and ship >700 MBytes at 1 Gbits/s. [Plot: Sunnyvale-Geneva, 1500 Byte MTU, stock TCP]
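The numbers above follow from the bandwidth-delay product and from Reno regaining roughly one segment per RTT; a minimal Python sketch of that arithmetic (RTT, rates and MTU as quoted on this slide; the single-loss recovery shown is an idealised best case):

```python
import math

RTT = 0.180          # seconds, Sunnyvale-Geneva (from the slide)
MSS = 1500 * 8       # bits per segment at 1500 Byte MTU (headers ignored)

def bdp_mbytes(rate_bps, rtt_s=RTT):
    """Bandwidth-delay product: MBytes that must be in flight to fill the pipe."""
    return rate_bps * rtt_s / 8 / 1e6

def slow_start_secs(rate_bps, rtt_s=RTT, growth=1.5):
    """Time for slow start to open the window, with cwnd growing by a factor
    `growth` per RTT (about 1.5 with delayed ACKs, 2 without)."""
    target_segments = rate_bps * rtt_s / MSS
    return math.log(target_segments, growth) * rtt_s

def reno_recovery_secs(rate_bps, rtt_s=RTT):
    """Idealised single-loss recovery: Reno halves cwnd, then regains about
    one segment per RTT, i.e. ~W/2 RTTs to return to full rate."""
    w = rate_bps * rtt_s / MSS
    return (w / 2) * rtt_s

print(f"Raw BDP at 2.5 Gbits/s: {bdp_mbytes(2.5e9):.0f} MBytes "
      f"(the 120 MByte window above includes headroom beyond this)")
print(f"Slow start at 1 Gbits/s: ~{slow_start_secs(1e9):.1f} s")
print(f"Idealised single-loss recovery at 1 Gbits/s: "
      f"~{reno_recovery_secs(1e9)/60:.0f} minutes (best case; the recovery "
      f"observed on the real path is longer)")
```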

9 Windows and Streams. It is well accepted that multiple streams (n) and/or big windows are important to achieve optimal throughput: using n streams effectively reduces the impact of a loss by 1/n and improves the recovery time by 1/n (see the sketch below). However, the optimum windows & streams change as the path changes (e.g. with utilization), so n is hard to optimize, and many streams can be unfriendly to others.
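A rough illustration of the 1/n argument above, as a Python sketch under the same 1 Gbits/s, 180 ms assumptions (it is not the tool used for the measurements):

```python
# With n parallel streams each stream only needs 1/n of the total window,
# a single loss halves only one stream (so the aggregate rate drops by
# about 1/(2n)), and the smaller per-stream window is rebuilt n times faster.
def parallel_stream_effect(rate_bps, rtt_s, n, mss_bytes=1500):
    w_total = rate_bps * rtt_s / (mss_bytes * 8)   # segments needed to fill the pipe
    w_per_stream = w_total / n
    aggregate_drop = 0.5 / n                       # fraction of aggregate lost on one loss
    recovery_s = (w_per_stream / 2) * rtt_s        # ~1 segment per RTT additive recovery
    return w_per_stream, aggregate_drop, recovery_s

for n in (1, 4, 16):
    w, drop, rec = parallel_stream_effect(1e9, 0.180, n)
    print(f"n={n:2d}: ~{w:7.0f} segments/stream, one loss costs {drop:5.1%} "
          f"of the aggregate, recovered in ~{rec:6.0f} s")
```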

10 Even with big windows (1 MByte) one still needs multiple streams with standard TCP. The measured curves for ANL, Caltech & RAL reach a knee (at between 2 and 24 streams); above the knee throughput still improves, but only slowly, perhaps because the large number of streams squeezes out other traffic and takes more than a fair share. The optimal streams and windows can also change during the day, making them hard to optimize.

11 New TCP Stacks. Reno (AIMD) based, where loss indicates congestion – back off less when congestion is seen, and recover more quickly after backing off: Scalable TCP has exponential recovery (Tom Kelly, "Scalable TCP: Improving Performance in Highspeed Wide Area Networks", submitted for publication, December 2002); High Speed TCP behaves the same as Reno at low performance, then increases the window more and more aggressively as the window grows, using a table. Vegas based, where RTT indicates congestion – Caltech's FAST TCP gives a quicker response to congestion, but … [Plot legend: Standard, Scalable, High Speed]
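The three loss-based update rules can be compared in a few lines. A hedged Python sketch: the Scalable TCP constants (a = 0.01, b = 0.125) are the values given in Kelly's paper, the High Speed TCP rows below are a tiny illustrative stand-in rather than the real RFC 3649 table, and all updates are approximated per RTT:

```python
def reno_rtt(cwnd, loss):
    """Standard Reno: add ~1 segment per RTT, halve the window on loss."""
    return cwnd / 2 if loss else cwnd + 1

def scalable_rtt(cwnd, loss, a=0.01, b=0.125):
    """Scalable TCP: multiplicative increase (~x(1+a) per RTT), cut by b on
    loss - recovery time becomes independent of the window size."""
    return cwnd * (1 - b) if loss else cwnd * (1 + a)

# Illustrative (cwnd, a, b) rows only - the real High Speed TCP table is larger.
_HSTCP_TOY = [(38, 1, 0.50), (1000, 4, 0.36), (10000, 20, 0.22)]

def highspeed_rtt(cwnd, loss):
    """High Speed TCP: Reno-like at small windows, then increasingly aggressive
    increase a(w) and gentler decrease b(w) looked up from a table."""
    a, b = 1, 0.5
    for threshold, ai, bi in _HSTCP_TOY:
        if cwnd >= threshold:
            a, b = ai, bi
    return cwnd * (1 - b) if loss else cwnd + a

# How long each stack takes to rebuild a 10,000-segment window after one loss.
for name, step in (("Reno", reno_rtt), ("Scalable", scalable_rtt), ("HighSpeed", highspeed_rtt)):
    w, rtts = step(10000.0, loss=True), 0
    while w < 10000.0:
        w, rtts = step(w, loss=False), rtts + 1
    print(f"{name:9s}: ~{rtts:5d} RTTs to recover")
```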

12 Stock vs FAST TCP, MTU=1500B. One needs to measure all parameters to understand the effects of parameters and configurations – windows, streams, txqueuelen, TCP stack, MTU, NIC card – a lot of variables. Comparing the two TCP stacks: FAST TCP no longer needs multiple streams, which is a major simplification (it reduces the number of variables to tune by one). [Plots: Stock TCP and FAST TCP, 1500B MTU, 65 ms RTT]
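FAST's delay-based update can be sketched from the published description by Jin, Wei and Low: each update the window moves toward (baseRTT/RTT)·w + α. The α and γ values below are illustrative, not the tuning used in these experiments:

```python
def fast_update(w, base_rtt, rtt, alpha=200, gamma=0.5):
    """One FAST TCP window update:
    w <- min(2w, (1 - gamma)*w + gamma*((base_rtt/rtt)*w + alpha)).
    alpha is roughly the number of packets the flow tries to keep queued."""
    return min(2 * w, (1 - gamma) * w + gamma * (base_rtt / rtt * w + alpha))

base = 0.065                                       # 65 ms propagation RTT, as in the plots above
print(fast_update(5000, base, rtt=base))           # no queueing delay: window grows (by gamma*alpha)
print(fast_update(5000, base, rtt=base * 1.05))    # 5% extra (queueing) delay: window backs off
# Because the signal is delay rather than loss, a single FAST stream can hold
# a large window steadily - which is why multiple streams are no longer needed.
```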

13 Jumbo frames become more important at higher speeds: they reduce interrupts to the CPU and the number of packets to process, so they reduce CPU utilization (a similar effect to using multiple streams, T. Hacker); see the packet-rate sketch below. Jumbo frames can achieve >95% utilization from SNV to CHI or GVA with one or multiple streams at Gbit/s rates – a factor of 5 improvement over single-stream 1500B MTU throughput for stock TCP (SNV-CHI (65 ms) & CHI-AMS (128 ms)). They are a complementary approach to a new stack, but widespread deployment is doubtful: few sites have deployed them and they are not part of the GE or 10GE standards. [Plot legend: Jumbos, 1500B]
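The interrupt/packet-processing argument is simple packet-rate arithmetic; a small sketch (full-sized frames assumed, interrupt coalescing ignored):

```python
# Packets per second at line rate for standard vs jumbo frames: the ~6x
# fewer packets is where the CPU and interrupt saving comes from.
def packets_per_sec(rate_bps, mtu_bytes):
    return rate_bps / (mtu_bytes * 8)

for rate_gbps in (1, 10):
    std = packets_per_sec(rate_gbps * 1e9, 1500)
    jumbo = packets_per_sec(rate_gbps * 1e9, 9000)
    print(f"{rate_gbps:2d} Gbits/s: {std:9,.0f} pkts/s at 1500B vs {jumbo:9,.0f} pkts/s at 9000B")
```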

14 TCP stacks with 1500B txqueuelen

15 Jumbo frames, new TCP stacks at 1 Gbits/s SNV-GVA

16 Other gotchas: large windows and a large number of streams can cause the last stream to take a long time to close; Linux memory leaks; Linux TCP configuration caching; knowing what window size is actually used/reported; 32-bit counters in iperf and routers wrap, so the latest releases with 64-bit counters are needed (see below); effects of txqueuelen (the number of packets queued for the NIC); routers that do not pass jumbos; performance differs between drivers and NICs from different manufacturers and may require tuning a lot of parameters.
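The 32-bit counter wrap mentioned above is easy to quantify; a quick sketch assuming a byte counter:

```python
# Time for a 32-bit byte counter to wrap at various line rates - why older
# iperf releases and router counters report nonsense at Gbits/s speeds.
def wrap_seconds(rate_bps, bits=32):
    return (2 ** bits) / (rate_bps / 8.0)   # counter counts bytes

for rate in (0.155e9, 1e9, 2.5e9, 10e9):
    print(f"{rate/1e9:5.2f} Gbits/s: wraps every {wrap_seconds(rate):6.1f} s")
```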

17 Who needs it? HENP is the current driver; data-intensive science – astrophysics, global weather, fusion, seismology; industries such as aerospace, medicine, security …; and in future, media distribution (a Gbits/s-class link moves roughly 2 full-length DVD movies per minute). 2.36 Gbits/s is equivalent to transferring a full CD in 2.3 seconds (i.e. over 1,500 CDs/hour), or 200 full-length DVD movies in one hour (i.e. 1 DVD in 18 seconds); see the arithmetic below. Will sharing movies be like sharing music today?
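The everyday-unit equivalents above are straightforward arithmetic; a sketch assuming ~0.68 GBytes per CD and the ~5.3 GByte DVD size implied by "1 DVD in 18 seconds":

```python
RATE = 2.36e9   # bits/s, the single-stream rate quoted above

def transfer_secs(size_gbytes, rate_bps=RATE):
    """Seconds to move `size_gbytes` (10**9 bytes) at the given rate."""
    return size_gbytes * 8e9 / rate_bps

cd, dvd = transfer_secs(0.68), transfer_secs(5.3)
print(f"CD  (~0.68 GB): {cd:4.1f} s each -> ~{3600/cd:4.0f} per hour")
print(f"DVD (~5.3  GB): {dvd:4.1f} s each -> ~{3600/dvd:4.0f} per hour")
```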

18 What's next? Break the 2.5 Gbits/s limit. Move to disk-to-disk throughput & useful applications – this needs faster CPUs (disk-to-disk costs an extra ~60% MHz per Mbit/s over memory-to-memory TCP) and an understanding of how to use multi-processors. Evaluate the new stacks on real-world links and with other equipment – other NICs, response to congestion and pathologies, fairness – and deploy them for some major (e.g. HENP/Grid) customer applications. Understand how to make 10GE NICs work well with 1500B MTUs.

19 More Information
Internet2 Land Speed Record publicity:
– www-iepm.slac.stanford.edu/lsr/
– www-iepm.slac.stanford.edu/lsr2/
10GE tests:
– www-iepm.slac.stanford.edu/monitoring/bulk/10ge/
– sravot.home.cern.ch/sravot/Networking/10GbE/10GbE_test.html
TCP stacks:
– netlab.caltech.edu/FAST/
– datatag.web.cern.ch/datatag/pfldnet2003/papers/kelly.pdf
Stack comparisons:
– www-iepm.slac.stanford.edu/monitoring/bulk/fast/

20 Impact on others