Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 1
TCP/IP and Other Transports for High Bandwidth Applications
TCP/IP on High Performance Networks
Richard Hughes-Jones, University of Manchester

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 2
The Bandwidth Challenge at SC2003
- The peak aggregate bandwidth from the 3 booths was 23.21 Gbit/s
- 1-way link utilisations of >90%
- 6.6 TBytes moved in 48 minutes
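A quick sanity check on those figures (illustrative arithmetic only): 6.6 TBytes in 48 minutes corresponds to an average of roughly 18 Gbit/s, consistent with the 23.21 Gbit/s peak.

```python
# Average rate implied by "6.6 TBytes in 48 minutes" (decimal TBytes assumed).
tbytes, minutes = 6.6, 48
avg_gbps = tbytes * 1e12 * 8 / (minutes * 60) / 1e9
print(f"average ~ {avg_gbps:.1f} Gbit/s (peak was 23.21 Gbit/s)")
```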

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 3
Multi-Gigabit flows at SC2003 BW Challenge
- Three server systems with 10 Gigabit Ethernet NICs
- Used the DataTAG altAIMD stack, 9000 byte MTU
- Sent mem-mem iperf TCP streams from the SLAC/FNAL booth in Phoenix to:
  - Palo Alto PAIX: rtt 17 ms, window 30 MB; shared with the Caltech booth
    - 4.37 Gbit HighSpeed TCP, I=5%; then 2.87 Gbit, I=16% - fall when 10 Gbit on the link
    - 3.3 Gbit Scalable TCP, I=8%; tested 2 flows, sum 1.9 Gbit, I=39%
  - Chicago Starlight: rtt 65 ms, window 60 MB; Phoenix CPU 2.2 GHz
    - 3.1 Gbit HighSpeed TCP, I=1.6%
  - Amsterdam SARA: rtt 175 ms, window 200 MB; Phoenix CPU 2.2 GHz
    - 4.35 Gbit HighSpeed TCP, I=6.9% - very stable
  - Both used Abilene to Chicago

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 4
- SCINet Collaboration at SC2004
- Setting up the BW Bunker
- The BW Challenge at the SLAC Booth
- Working with S2io, Sun, Chelsio

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 5
The Bandwidth Challenge - SC2004
- The peak aggregate bandwidth from the booths was 101.13 Gbit/s
- That is 3 full-length DVDs per second!
- 4 times greater than SC2003!
- Saturated ten 10 Gigabit Ethernet waves
- SLAC Booth: Sunnyvale to Pittsburgh, LA to Pittsburgh and Chicago to Pittsburgh (with UKLight)

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 6
Just a Well Engineered End-to-End Connection
- End-to-end "no loss" environment: NO contention, NO sharing on the end-to-end path
- Processor speed and system bus characteristics
- TCP configuration - window size and frame size (MTU)
- Tuned PCI-X bus
- Tuned Network Interface Card driver
- A single TCP connection on the end-to-end path
- Memory-to-memory transfer, no disk system involved
- No real user application (but did file transfers!!)
- Not a typical user or campus situation, BUT ...
So what's the matter with TCP - did we cheat?
[Diagram: Client and Server connected across Campus, Regional network and Internet, and via UKLight - from Robin Tasker]

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 7
TCP (Reno) - What's the problem?
- TCP has 2 phases:
  - Slowstart: probe the network to estimate the available BW; exponential growth
  - Congestion Avoidance: main data transfer phase - the transfer rate grows "slowly"
- AIMD and High Bandwidth - Long Distance networks: the poor performance of TCP in high bandwidth wide area networks is due in part to the TCP congestion control algorithm
  - For each ACK in an RTT without loss: cwnd -> cwnd + a/cwnd (Additive Increase, a = 1)
  - For each window experiencing loss: cwnd -> cwnd - b*cwnd (Multiplicative Decrease, b = 1/2)
- Packet loss is a killer!!
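To make the AIMD sawtooth concrete, here is a minimal sketch (not from the original slides; the loss spacing and starting window are assumed values) stepping cwnd through the two rules above:

```python
# AIMD (TCP Reno congestion avoidance) sketch: cwnd in segments, one step per RTT.
# The per-ACK rule cwnd += a/cwnd, summed over the ~cwnd ACKs in an RTT,
# gives +a per RTT; a loss multiplies the window by (1 - b), i.e. halves it.

def aimd_trace(n_rtts, cwnd=10.0, a=1.0, b=0.5, loss_every=200):
    trace = []
    for rtt in range(n_rtts):
        if rtt > 0 and rtt % loss_every == 0:
            cwnd -= b * cwnd      # multiplicative decrease on loss
        else:
            cwnd += a             # additive increase per RTT
        trace.append(cwnd)
    return trace

for rtt, w in enumerate(aimd_trace(600)):
    if rtt % 100 == 0:
        print(f"RTT {rtt:3d}: cwnd ~ {w:6.1f} segments")
```

The slow linear climb back after each halving is exactly what makes loss so costly on long fat pipes.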

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 8
TCP (Reno) - Details
- Time for TCP to recover its throughput from 1 lost packet is given by:
    tau = C * RTT^2 / (2 * MSS)
  where C is the path capacity and MSS the maximum segment size
- For an rtt of ~200 ms: ~2 min
[Plot: recovery time vs RTT, with typical RTTs marked - UK 6 ms, Europe 20 ms, USA 150 ms]
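Plugging example numbers into this formula shows why long paths hurt (a sketch; the 1 Gbit/s capacity and 1460-byte MSS are assumed illustration values):

```python
# Recovery time after a single loss: tau = C * RTT^2 / (2 * MSS).
# C in bit/s, RTT in seconds, MSS converted from bytes to bits.

def recovery_time_s(capacity_bps, rtt_s, mss_bytes=1460):
    return capacity_bps * rtt_s**2 / (2 * 8 * mss_bytes)

for name, rtt in [("UK", 0.006), ("Europe", 0.020), ("USA", 0.150)]:
    t = recovery_time_s(1e9, rtt)   # assume a 1 Gbit/s path
    print(f"{name:6s} rtt {rtt*1e3:5.1f} ms -> ~{t:7.1f} s to recover")
```

At 1 Gbit/s the USA path already needs roughly 16 minutes to recover from a single loss; since tau scales with C, a 10 Gbit/s path takes ten times longer.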

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 9
Investigation of new TCP Stacks
- The AIMD Algorithm - Standard TCP (Reno)
  - For each ACK in an RTT without loss: cwnd -> cwnd + a/cwnd (Additive Increase, a = 1)
  - For each window experiencing loss: cwnd -> cwnd - b*cwnd (Multiplicative Decrease, b = 1/2)
- High Speed TCP
  - a and b vary depending on the current cwnd, using a table
  - a increases more rapidly with larger cwnd - returns to the 'optimal' cwnd size sooner for the network path
  - b decreases less aggressively and, as a consequence, so does the cwnd; the effect is that there is not such a decrease in throughput
- Scalable TCP
  - a and b are fixed adjustments for the increase and decrease of cwnd
  - a = 1/100 - the increase is greater than TCP Reno
  - b = 1/8 - the decrease on loss is less than TCP Reno
  - Scalable over any link speed
- Fast TCP
  - Uses round trip time as well as packet loss to indicate congestion, with rapid convergence to a fair equilibrium for throughput
- HSTCP-LP, H-TCP, BiC-TCP
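A sketch of the per-ACK and per-loss rules for Reno and Scalable TCP as summarised above (HighSpeed TCP's cwnd-dependent a and b come from the RFC 3649 lookup table and are omitted here); the starting window is an example value:

```python
# Per-ACK / per-loss cwnd rules for Reno and Scalable TCP (cwnd in segments).

def reno_ack(cwnd):      return cwnd + 1.0 / cwnd   # a = 1
def reno_loss(cwnd):     return cwnd * 0.5          # b = 1/2
def scalable_ack(cwnd):  return cwnd + 0.01         # a = 1/100 per ACK
def scalable_loss(cwnd): return cwnd * (1 - 1/8)    # b = 1/8

def one_rtt(cwnd, ack_rule):
    """Apply the per-ACK rule once per segment in flight (~one RTT of ACKs)."""
    for _ in range(int(cwnd)):
        cwnd = ack_rule(cwnd)
    return cwnd

w = 1000.0  # example window, segments
print("after one RTT  - Reno:", round(one_rtt(w, reno_ack), 1),
      " Scalable:", round(one_rtt(w, scalable_ack), 1))
print("after one loss - Reno:", reno_loss(w), " Scalable:", scalable_loss(w))
```

The key point: Reno gains one segment per RTT regardless of window size, whereas Scalable gains ~1% of the window per RTT, so its recovery time is independent of the link speed.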

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 10
Packet Loss with new TCP Stacks
- TCP Response Function: throughput vs loss rate - the further to the right, the faster the recovery
- Drop packets in the kernel
[Plots: response functions measured on MB-NG, rtt 6 ms, and DataTAG, rtt 120 ms]

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 11
Packet Loss and new TCP Stacks
- TCP Response Function
- UKLight London-Chicago-London, rtt 177 ms
- Kernel
- Agreement with theory good

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 12
Topology of the MB-NG Network
[Diagram: Manchester domain (man01-man03), UCL domain (lon01-lon03) and RAL domain (ral01, ral02) joined by Cisco 7609 edge/boundary routers across the UKERNA Development Network; key: Gigabit Ethernet, 2.5 Gbit POS, access, MPLS, administrative domains; HW RAID on end hosts]

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 13
SC2004 UKLIGHT Overview
[Diagram: SC2004 SLAC booth and Caltech booth (UltraLight IP, Cisco 6509, Caltech 7600, K2 Ci) connected over an NLR lambda (NLR-PITT-STAR-10GE-16) to Chicago Starlight, then via UKLight 10G to ULCC, UCL HEP and the UCL network (MB-NG 7600 OSR, Manchester), and via Surfnet/EuroLink 10G to Amsterdam; two and four 1GE channel tails]

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 14
High Throughput Demonstrations
[Diagram: dual 2.2 GHz Xeon hosts, man03 at Manchester (Geneva) and a London (Chicago) host, on 1 GEth into Cisco 7609 routers, across the MB-NG core over 2.5 Gbit SDH via a Cisco GSR]
- Send data with TCP
- Drop packets
- Monitor TCP with Web100

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 15
High Performance TCP - MB-NG
- Drop 1 in 25,000
- rtt 6.2 ms
- Recover in 1.6 s
[Plots: cwnd traces for Standard, HighSpeed and Scalable TCP]

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 16
High Performance TCP - DataTAG
- Different TCP stacks tested on the DataTAG Network
- rtt 128 ms
- Drop 1 in 10^6
- High-Speed: rapid recovery
- Scalable: very fast recovery
- Standard: recovery would take ~20 mins

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 17
FAST demo via OMNInet and DataTAG
FAST Demo: Cheng Jin, David Wei (Caltech); A. Adriaanse, C. Jin, D. Wei (Caltech), S. Ravot (Caltech/CERN), J. Mambretti, F. Yeh (Northwestern)
[Diagram: 10,000 km path from CERN Geneva (Cisco 7606, workstations, 2 x GE) over the DataTAG OC-48 link (Alcatel 1670, 2 x GE) to StarLight Chicago (CalTech Cisco 7609, Photonic Switch), then via OMNInet (Nortel Passport 8600s, Photonic Switch, 10 GE) to workstations at NU-E (Leverone); FAST display in San Diego; layer 2 and layer 2/3 path segments marked]

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 18
FAST TCP vs newReno
- Channel #1: newReno - utilization 70%
- Channel #2: FAST - utilization 90%
[Plots: throughput traces for the two channels]

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 19
Is TCP fair? A look at Round Trip Times & Maximum Transfer Unit

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 20
MTU and Fairness
- Two TCP streams share a 1 Gb/s bottleneck
- RTT = 117 ms
- MTU = 3000 bytes: average throughput over a period of 7000 s = 243 Mb/s
- MTU = 9000 bytes: average throughput over a period of 7000 s = 464 Mb/s
- Link utilization: 70.7%
[Diagram: Host #1 and Host #2 at CERN (GVA) on 1 GE into a GbE switch, POS 2.5 Gbps to Starlight (Chi), 1 GE to the receiving hosts; 1 Gb/s bottleneck]
Sylvain Ravot, DataTag 2003

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 21
RTT and Fairness
- Two TCP streams share a 1 Gb/s bottleneck
- CERN-Sunnyvale: RTT = 181 ms; average throughput over a period of 7000 s = 202 Mb/s
- CERN-Starlight: RTT = 117 ms; average throughput over a period of 7000 s = 514 Mb/s
- MTU = 9000 bytes
- Link utilization = 71.6%
[Diagram: hosts at CERN (GVA) via a GbE switch and POS 2.5 Gb/s to Starlight (Chi), then POS 10 Gb/s and 10GE on to Sunnyvale; 1 Gb/s bottleneck]
Sylvain Ravot, DataTag 2003
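Both results (slides 20 and 21) follow the standard TCP response function, throughput ~ (MSS/RTT) * 1.22/sqrt(p) (the Mathis approximation): bigger segments help, longer RTTs hurt. A sketch with an assumed loss probability, chosen only to illustrate the trend:

```python
import math

# Mathis approximation for steady-state TCP throughput (bit/s).
# p is an assumed loss probability for illustration, not a measured value.

def mathis_bps(mss_bytes, rtt_s, p):
    return (mss_bytes * 8 / rtt_s) * 1.22 / math.sqrt(p)

p = 1e-6
for mtu in (3000, 9000):                      # slide 20: same path, different MTU
    bw = mathis_bps(mtu - 40, 0.117, p)       # MSS ~ MTU - 40 bytes of headers
    print(f"MTU {mtu}, RTT 117 ms: ~{bw/1e6:5.0f} Mbit/s")
for rtt in (0.117, 0.181):                    # slide 21: same MTU, different RTT
    bw = mathis_bps(9000 - 40, rtt, p)
    print(f"MTU 9000, RTT {rtt*1e3:.0f} ms: ~{bw/1e6:5.0f} Mbit/s")
```

A single fixed p is an oversimplification, but the trend it predicts - the larger-MSS, shorter-RTT flow always wins under AIMD - is exactly the unfairness both measurements show.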

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 22
Is TCP fair? Do TCP Flows Share the Bandwidth?

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 23
Test of TCP Sharing: Methodology (1 Gbit/s)
- Chose 3 paths from SLAC (California): Caltech (10 ms), Univ Florida (80 ms), CERN (180 ms)
- Used iperf/TCP and UDT/UDP to generate traffic
- Each run was 16 minutes, in 7 regions
[Diagram: iperf or UDT flows from SLAC to Caltech/UFL/CERN through the TCP/UDP bottleneck, with 1/s ICMP/ping traffic alongside; 2 min and 4 min test regions]
Les Cottrell, PFLDnet 2005

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 24
TCP Reno single stream
- Low performance on fast long distance paths
  - AIMD (add a=1 packet to cwnd per RTT; decrease cwnd by factor b=0.5 on congestion)
  - Net effect: recovers slowly, does not effectively use the available bandwidth, so poor throughput
  - Unequal sharing
- SLAC to CERN:
  - Congestion has a dramatic effect
  - Recovery is slow (need to increase the recovery rate)
  - RTT increases when it achieves best throughput
  - Remaining flows do not take up the slack when a flow is removed
Les Cottrell, PFLDnet 2005

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 25
Fast
- As well as packet loss, FAST uses RTT to detect congestion
  - RTT is very stable: sigma(RTT) ~ 9 ms, vs 37 +/- 0.14 ms for the others
- SLAC-CERN:
  - Big drops in throughput which take several seconds to recover from
  - 2nd flow never gets an equal share of the bandwidth

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 26
Hamilton TCP
- One of the best performers
  - Throughput is high
  - Big effects on RTT when it achieves best throughput
  - Flows share equally
- Appears to need >1 flow to achieve best throughput
- SLAC-CERN:
  - Two flows share equally
  - >2 flows appears less stable

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 27
SC2004 & Transfers with UKLight
A Taster for Lambda & Packet Switched Hybrid Networks

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 28
Transatlantic Ethernet: TCP Throughput Tests
- Supermicro X5DPE-G2 PCs
- Dual 2.9 GHz Xeon CPU, FSB 533 MHz
- 1500 byte MTU
- 2.6.6 Linux kernel
- Memory-to-memory TCP throughput
- Standard TCP
- Wire-rate throughput of 940 Mbit/s
[Plot: throughput trace, first 10 sec]
- Work in progress to study:
  - Implementation detail
  - Advanced stacks
  - Effect of packet loss
  - Sharing
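940 Mbit/s is essentially the theoretical TCP goodput ceiling for Gigabit Ethernet with a 1500-byte MTU; a quick check (assuming the TCP timestamp option is on, costing 12 bytes per segment):

```python
# Maximum TCP goodput on 1 Gbit/s Ethernet with a 1500-byte MTU.
# Ethernet overhead per frame: preamble 8 + MAC header 14 + FCS 4 + inter-frame gap 12.
line_rate_bps = 1e9
wire_bytes = 1500 + 8 + 14 + 4 + 12
payload_bytes = 1500 - 20 - 20 - 12     # IP header, TCP header, timestamp option
goodput = line_rate_bps * payload_bytes / wire_bytes
print(f"max TCP goodput ~ {goodput/1e6:.0f} Mbit/s")   # ~941 Mbit/s
```

So the measured 940 Mbit/s really is the wire rate: there was nothing left to gain on this path without a larger MTU.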

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 29
SC2004 Disk-Disk bbftp (work in progress)
- bbftp file transfer program uses TCP/IP
- UKLight path: London-Chicago-London; PCs: Supermicro + 3Ware RAID0
- MTU 1500 bytes; socket size 22 Mbytes; rtt 177 ms; SACK off
- Move a 2 Gbyte file
- Web100 plots:
  - Standard TCP: average 825 Mbit/s (bbcp: 670 Mbit/s)
  - Scalable TCP: average 875 Mbit/s (bbcp: 701 Mbit/s, ~4.5 s of overhead)
- Disk-TCP-Disk at 1 Gbit/s is here!
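A quick arithmetic check of the bbcp overhead remark (treating the 2 Gbyte file as 2 x 10^9 bytes, an assumption):

```python
# Transfer times implied by the Scalable TCP averages above.
file_bits = 2e9 * 8
t_bbftp = file_bits / 875e6    # ~18.3 s at 875 Mbit/s
t_bbcp  = file_bits / 701e6    # ~22.8 s at 701 Mbit/s
print(f"bbftp ~{t_bbftp:.1f} s, bbcp ~{t_bbcp:.1f} s, "
      f"difference ~{t_bbcp - t_bbftp:.1f} s")   # ~4.5 s - the quoted overhead
```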

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 30
Summary, Conclusions & Thanks
- The Super Computing Bandwidth Challenge gives an opportunity to make world-wide high performance tests
- The Land Speed Record shows what can be achieved with state-of-the-art kit
- Standard TCP is not optimum for high throughput, long distance links
- Packet loss is a killer for TCP
  - Check on-campus links & equipment, and access links to backbones
  - Users need to collaborate with the Campus Network Teams and the Dante PERT
- New stacks are stable and give better response & performance
  - Still need to set the TCP buffer sizes! (see the sketch below)
  - Check other kernel settings, e.g. the window-scale maximum
  - Watch for "TCP stack implementation enhancements"
- The host is critical: think server quality, not supermarket PC
- Motherboards, NICs, RAID controllers and disks matter
  - The NIC should use 64 bit 133 MHz PCI-X; 66 MHz PCI can be OK, but 32 bit 33 MHz is too slow for Gigabit rates
  - Worry about the CPU-memory bandwidth as well as the PCI bandwidth: data crosses the memory bus at least 3 times
  - Separate the data transfers - use motherboards with multiple 64 bit PCI-X buses
  - Choose a modern high throughput RAID controller; consider SW RAID0 over RAID5 HW controllers
- Users are now able to perform sustained 1 Gbit/s transfers
MB-NG
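On the buffer-size point: the window required is the bandwidth-delay product of the path. A minimal sketch, assuming a 1 Gbit/s path with the 177 ms UKLight rtt from slide 29 (the kernel will clamp the request unless net.core.rmem_max/wmem_max and tcp_rmem/tcp_wmem are raised):

```python
import socket

# Bandwidth-delay product: the TCP window needed to fill the path.
bandwidth_bps = 1e9          # assumed path capacity
rtt_s = 0.177                # London-Chicago-London rtt from slide 29
bdp_bytes = int(bandwidth_bps / 8 * rtt_s)
print(f"required window ~ {bdp_bytes / 2**20:.1f} MBytes")   # ~22 MB, as on slide 29

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Request one BDP of send/receive buffer; set before connect() so the
# window scale option is negotiated to match.
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp_bytes)
print("granted send buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
```

Note that the 22 Mbyte socket size used in the slide 29 tests is exactly one bandwidth-delay product for that path.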

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 31
More Information - Some URLs
- UKLight web site
- MB-NG project web site
- DataTAG project web site
- UDPmon / TCPmon kit + writeup
- Motherboard and NIC tests: "Performance of 1 and 10 Gigabit Ethernet Cards with Server Quality Motherboards", FGCS Special Issue
- TCP tuning information
- TCP stack comparisons: "Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks", Journal of Grid Computing, 2004
- PFLDnet
- Dante PERT

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 32
Any Questions?

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 33
Backup Slides

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 34
10 Gigabit Ethernet: UDP Throughput Tests
- 1500 byte MTU gives ~2 Gbit/s
- Used the maximum user-length MTU (16080 byte user payloads; see the next slide)
- DataTAG Supermicro PCs
  - Dual 2.2 GHz Xeon CPU, FSB 400 MHz
  - PCI-X mmrbc 512 bytes
  - Wire-rate throughput of 2.9 Gbit/s
- CERN OpenLab HP Itanium PCs
  - Dual 1.0 GHz 64 bit Itanium CPU, FSB 400 MHz
  - PCI-X mmrbc 512 bytes
  - Wire rate of 5.7 Gbit/s
- SLAC Dell PCs
  - Dual 3.0 GHz Xeon CPU, FSB 533 MHz
  - PCI-X mmrbc 4096 bytes
  - Wire rate of 5.4 Gbit/s

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester 35
10 Gigabit Ethernet: Tuning PCI-X
- 16080 byte packets every 200 µs
- Intel PRO/10GbE LR Adapter
- PCI-X bus occupancy vs mmrbc
  - Measured times, and times based on PCI-X timing from the logic analyser
  - Expected throughput ~7 Gbit/s; measured 5.7 Gbit/s
[Plots: PCI-X bus occupancy for mmrbc = 512, 1024, 2048 and 4096 bytes; PCI-X sequence showing CSR access, data transfer, interrupt & CSR update]
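Why mmrbc (maximum memory read byte count) matters: each PCI-X read completes in bursts of at most mmrbc bytes, and every burst pays a fixed arbitration/addressing overhead on the bus. A toy model (the 40-cycle per-burst overhead is an assumed figure, not taken from the logic-analyser data):

```python
import math

# Toy model: time on a 64-bit, 133 MHz PCI-X bus to move one 16080-byte packet.
BUS_HZ = 133e6
BYTES_PER_CYCLE = 8            # 64-bit bus
OVERHEAD_CYCLES = 40           # assumed per-burst arbitration/address overhead
PKT_BYTES = 16080

for mmrbc in (512, 1024, 2048, 4096):
    bursts = math.ceil(PKT_BYTES / mmrbc)
    cycles = PKT_BYTES / BYTES_PER_CYCLE + bursts * OVERHEAD_CYCLES
    gbps = PKT_BYTES * 8 * BUS_HZ / cycles / 1e9
    print(f"mmrbc {mmrbc:4d}: {bursts:2d} bursts -> ~{gbps:.1f} Gbit/s on the bus")
```

Larger bursts amortise the per-burst overhead, which is why raising mmrbc from 512 to 4096 bytes moves the achievable rate towards the ~8.5 Gbit/s raw bus bandwidth, consistent with the ~7 Gbit/s expected and 5.7 Gbit/s measured above.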

Summer School, Brasov, Romania, July 2005, R. Hughes-Jones Manchester36 10 Gigabit Ethernet: SC2004 TCP Tests uSun AMD opteron compute servers v20z uChelsio TOE Tests between Linux hosts 10 Gbit ethernet link from SC2004 to CENIC/NLR/Level(3) PoP in Sunnyvale Two 2.4GHz AMD 64 bit Opteron processors with 4GB of RAM at SC B MTU, all Linux in one direction 9.43G i.e. 9.07G goodput and the reverse direction 5.65G i.e. 5.44G goodput Total of 15+G on wire. 10 Gbit ethernet link from SC2004 to ESnet/QWest PoP in Sunnyvale One 2.4GHz AMD 64 bit Opteron each end 2MByte window, 16 streams, 1500B MTU, all Linux in one direction 7.72Gbit/s i.e Gbit/s goodput 120mins (6.6Tbits shipped) uS2io NICs with Solaris 10 in 4*2.2GHz Opteron cpu v40z to one or more S2io or Chelsio NICs with Linux or in 2*2.4GHz V20Zs LAN 1 S2io NIC back to back: 7.46 Gbit/s LAN 2 S2io in V40z to 2 V20z : each NIC ~6 Gbit/s total Gbit/s