GNEW2004, CERN, March 2004
Lessons Learned in Grid Networking, or: How do we get end-2-end performance to Real Users?
Richard Hughes-Jones, The University of Manchester

Slide 2: Network Monitoring is Essential
What we do:
 End2End time series: throughput (UDP/TCP), rtt, packet loss
 Passive monitoring: routers & switches via SNMP / MRTG; historical MRTG
 Packet / protocol dynamics: tcpdump, web100
 Output from application tools
What we use it for:
 Detect or cross-check problem reports
 Isolate / determine a performance issue or throughput bottleneck – work with real user problems
 Capacity planning
 Publication of data: network "cost" for middleware
  RBs for optimized matchmaking
  WP2 Replica Manager
 SLA verification
 Test conditions for protocol / hardware investigations
 Protocol performance / development
 Hardware performance / development
 Application analysis
 Input to middleware – e.g. gridftp throughput
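The throughput and loss figures in these end-to-end time series come from simple burst arithmetic: send a train of fixed-size UDP frames, count what arrives, and convert. A minimal sketch in that spirit (the function name and parameters are hypothetical, illustrating the calculation behind burst-based monitors such as UDPmon, not its actual code):

```python
def udp_burst_stats(frames_sent, frames_received, frame_bytes, burst_seconds):
    """Derive achieved throughput (Mbit/s) and packet loss (%) from one
    burst of fixed-size UDP frames.  Hypothetical helper showing the
    arithmetic used by burst-based network monitors."""
    if frames_sent <= 0 or burst_seconds <= 0:
        raise ValueError("need a positive burst")
    loss_pct = 100.0 * (frames_sent - frames_received) / frames_sent
    # Bits that actually arrived, divided by the burst duration.
    throughput_mbit = frames_received * frame_bytes * 8 / burst_seconds / 1e6
    return throughput_mbit, loss_pct
```

For example, 990 of 1000 standard 1472-byte frames received over a 10 ms burst corresponds to about 1.17 Gbit/s with 1% loss.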

Slide 3: Multi-Gigabit transfers are possible and stable
10 GigEthernet at the SC2003 BW Challenge
 Three server systems with 10 GigEthernet NICs
 Used the DataTAG altAIMD stack, 9000 byte MTU
 Sent mem-mem iperf TCP streams from the SLAC/FNAL booth in Phoenix to:
 Palo Alto PAIX: rtt 17 ms, window 30 MB
  Shared with the Caltech booth
  4.37 Gbit hstcp, I=5%; then 2.87 Gbit, I=16% (the fall corresponds to 10 Gbit on the link)
  3.3 Gbit Scalable, I=8%; tested 2 flows, sum 1.9 Gbit, I=39%
 Chicago Starlight: rtt 65 ms, window 60 MB
  Phoenix CPU 2.2 GHz
  3.1 Gbit hstcp, I=1.6%
 Amsterdam SARA: rtt 175 ms, window 200 MB
  Phoenix CPU 2.2 GHz
  4.35 Gbit hstcp, I=6.9%; very stable
 Both used Abilene to Chicago
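The window sizes quoted for each destination follow from the bandwidth-delay product: to keep a long path full, the TCP window must cover rate times round-trip time. A quick sketch of the arithmetic (the function name is illustrative):

```python
def tcp_window_for(rate_gbit, rtt_ms):
    """Bytes of TCP window needed to keep a path busy:
    window >= bandwidth * round-trip time (the bandwidth-delay product)."""
    bytes_per_s = rate_gbit * 1e9 / 8
    return bytes_per_s * rtt_ms / 1e3
```

At 10 Gbit/s, an rtt of 17 ms needs about 21 MB of window and 175 ms about 219 MB, broadly in line with the 30 MB and 200 MB windows configured for the Palo Alto and Amsterdam paths.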

Slide 4: The performance of the end host / disks is really important
BaBar Case Study: RAID Throughput & PCI Activity
 3Ware RAID5, parallel EIDE
 3Ware forces the PCI bus to 33 MHz
 BaBar Tyan to MB-NG SuperMicro: network mem-mem 619 Mbit/s
 Disk – disk throughput with bbcp: Mbytes/s (320 – 360 Mbit/s)
 PCI bus effectively full!
 User throughput ~250 Mbit/s
 User surprised!!
(Plots: read from RAID5 disks; write to RAID5 disks)
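The user's surprise is just bus arithmetic. A PCI bus forced to 33 MHz carries about 1 Gbit/s in total, and a disk-to-network transfer crosses it more than once (disk controller to memory, then memory to NIC), so the usable end-to-end rate is well under half the raw figure. A sketch of the numbers (the function name is illustrative; the defaults model the 33 MHz / 32-bit case on this slide):

```python
def pci_bus_capacity_mbit(clock_mhz=33, width_bits=32):
    """Raw capacity of a shared PCI bus: clock rate times bus width.
    MHz * bits = Mbit/s.  Defaults model a 33 MHz / 32-bit bus."""
    return clock_mhz * width_bits

# 33 MHz * 32 bit = 1056 Mbit/s raw, shared by the RAID controller and
# the NIC.  With the data crossing the bus twice plus protocol overhead,
# ~250 Mbit/s of user throughput is roughly what this bus allows.
```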

Slide 5: Application design – Throughput + Web100
 2 Gbyte file transferred from RAID0 disks
 Web100 output every 10 ms
 Gridftp: see alternating 600/800 Mbit and zero
 Apache web server + curl-based client: see steady 720 Mbit
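The difference between the two applications shows up directly in the averaged Web100 samples: a stream that alternates bursts with stalls achieves far less than its peak rate. A toy illustration (hypothetical helper):

```python
def average_throughput(samples_mbit):
    """Mean of per-interval throughput samples, e.g. 10 ms Web100 readings.
    Idle intervals drag the achieved rate well below the burst peaks."""
    return sum(samples_mbit) / len(samples_mbit)

# A transfer alternating 600-800 Mbit bursts with zeros averages near
# half its peak; a steady 720 Mbit stream averages 720.
```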

Slide 6: Summary
 Network monitoring is vital
 Development of new TCP stacks and non-TCP protocols is required
 Multi-Gigabit transfers are possible and stable on current networks
 Complementary provision of packet IP & λ-networks is needed
 The performance of the end host / disks is really important
 Application design can determine perceived network performance
 Helping real users is a must – can be harder than herding cats
 Cooperation between network providers, network researchers, and network users has been impressive
 Standards (e.g. GGF / IETF) are the way forward
 Many grid projects just assume the network will work!!!
 It takes lots of co-operation to put all the components together

Slide 7 (no transcript text)

Slide 8: Tuning PCI-X: Variation of mmrbc (IA32)
 Intel PRO/10GbE LR Adapter
 byte packets every 200 µs
 PCI-X bus occupancy vs mmrbc
 Plots show:
  Measured times
  Times based on PCI-X timings from the logic analyser
  Expected throughput
(Plot legend: mmrbc 512, 1024, 2048, 4096 bytes; CSR access, PCI-X sequence, data transfer, interrupt & CSR update)
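The effect of mmrbc (maximum memory read byte count) can be sketched with a toy occupancy model: each packet crosses the bus in read bursts of at most mmrbc bytes, and each burst pays a fixed setup cost. The per-burst overhead below is an assumed figure for illustration, not the logic-analyser value from the slide:

```python
import math

def pcix_transfer_time_us(packet_bytes, mmrbc_bytes,
                          bus_mhz=133, width_bytes=8,
                          burst_overhead_us=0.1):
    """Toy model of one packet crossing a PCI-X bus in memory-read bursts
    of at most mmrbc bytes each.  burst_overhead_us is an assumed
    per-burst arbitration/setup cost, not a measured figure."""
    bursts = math.ceil(packet_bytes / mmrbc_bytes)
    # The bus moves bus_mhz * width_bytes bytes per microsecond.
    data_us = packet_bytes / (bus_mhz * width_bytes)
    return data_us + bursts * burst_overhead_us
```

Under these assumptions an 8 KB packet needs 16 bursts at mmrbc 512 but only 2 at mmrbc 4096, so a larger mmrbc cuts bus occupancy per packet, which is the trend the occupancy plots show.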

Slide 9: GGF: Hierarchy Characteristics Document
 "A Hierarchy of Network Performance Characteristics for Grid Applications and Services"
 The document defines terms & relations:
  Network characteristics
  Measurement methodologies
  Observation
 Discusses nodes & paths
 For each characteristic:
  Defines the meaning
  Attributes that SHOULD be included
  Issues to consider when making an observation
 Status:
  Originally submitted to GFSG as a Community Practice Document, draft-ggf-nmwg-hierarchy-00.pdf, Jul 2003
  Revised to Proposed Recommendation, 7 Jan 04
  Now in 60-day public comment from 28 Jan 04 – 18 days to go