GNEW2004, CERN, March 2004. R. Hughes-Jones, Manchester. Lessons Learned in Grid Networking, or How do we get end-2-end performance to Real Users?


1 Lessons Learned in Grid Networking, or How do we get end-2-end performance to Real Users?
Richard Hughes-Jones, Manchester
GNEW2004, CERN, March 2004

2 Network Monitoring is Essential
• End2End Time Series
  - Throughput UDP/TCP
  - RTT
  - Packet loss
• Passive Monitoring
  - Routers, switches: SNMP, MRTG
  - Historical MRTG
• Packet/Protocol Dynamics
  - tcpdump
  - web100
• Output from Application tools
• Detect or cross-check problem reports
• Isolate / determine a performance issue
• Capacity planning
• Publication of data: network “cost” for middleware
  - RBs for optimized matchmaking
  - WP2 Replica Manager
• Capacity planning
• SLA verification
• Isolate / determine throughput bottleneck – work with real user problems
• Test conditions for Protocol/HW investigations
• Protocol performance / development
• Hardware performance / development
• Application analysis
• Input to middleware – e.g. gridftp throughput
• Isolate / determine a (user) performance issue
• Hardware / protocol investigations

3 Multi-Gigabit transfers are possible and stable
10 GigEthernet at SC2003 BW Challenge
• Three server systems with 10 GigEthernet NICs
• Used the DataTAG altAIMD stack, 9000 byte MTU
• Sent mem-mem iperf TCP streams from the SLAC/FNAL booth in Phoenix to:
• Palo Alto PAIX
  - RTT 17 ms, window 30 MB
  - Shared with Caltech booth
  - 4.37 Gbit hstcp I=5%
  - Then 2.87 Gbit I=16%
  - Fall corresponds to 10 Gbit on link
  - 3.3 Gbit Scalable I=8%
  - Tested 2 flows, sum 1.9 Gbit I=39%
• Chicago Starlight
  - RTT 65 ms, window 60 MB
  - Phoenix CPU 2.2 GHz
  - 3.1 Gbit hstcp I=1.6%
• Amsterdam SARA
  - RTT 175 ms, window 200 MB
  - Phoenix CPU 2.2 GHz
  - 4.35 Gbit hstcp I=6.9%
  - Very stable
• Both used Abilene to Chicago
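The RTT and window figures quoted for each path can be related through the usual window-limited bound for TCP: a sender can have at most one window of data in flight per round trip. The sketch below computes that ceiling for the three paths. It assumes "MB" means 10^6 bytes, which the slide does not state, and the function name is illustrative only.

```python
# Window-limited TCP throughput: at most one window of data per round trip.
# Assumption (not stated on the slide): "MB" means 10**6 bytes.

def window_limited_gbit(window_mbytes: float, rtt_ms: float) -> float:
    """Upper bound on throughput in Gbit/s for a given window and RTT."""
    window_bits = window_mbytes * 1e6 * 8
    rtt_s = rtt_ms / 1e3
    return window_bits / rtt_s / 1e9

# (destination, RTT in ms, window in MB) as quoted on the slide
PATHS = [
    ("Palo Alto PAIX", 17, 30),
    ("Chicago Starlight", 65, 60),
    ("Amsterdam SARA", 175, 200),
]

if __name__ == "__main__":
    for name, rtt_ms, window_mb in PATHS:
        bound = window_limited_gbit(window_mb, rtt_ms)
        print(f"{name}: window allows up to {bound:.2f} Gbit/s")
```

Under that assumption the configured windows allow roughly 14.1, 7.4 and 9.1 Gbit/s respectively, so on each path the window was sized above the measured rate rather than being the limiting factor.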

4 The performance of the end host / disks is really important
BaBar Case Study: RAID Throughput & PCI Activity
• 3Ware 7500-8 RAID5 parallel EIDE
• 3Ware forces PCI bus to 33 MHz
• BaBar Tyan to MB-NG SuperMicro: network mem-mem 619 Mbit/s
• Disk-to-disk throughput with bbcp: 40-45 Mbytes/s (320 – 360 Mbit/s)
• PCI bus effectively full!
• User throughput ~ 250 Mbit/s
• User surprised !!
[Figure panels: Read from RAID5 Disks; Write to RAID5 Disks]
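The unit conversion on this slide, and a rough ceiling for a 33 MHz PCI bus, can be checked with a few lines. This is a sketch only: the slide does not give the bus width, so both 32-bit and 64-bit are shown as assumptions, and the peak figure ignores arbitration and protocol overhead.

```python
# Convert the quoted disk-to-disk rates and estimate a theoretical PCI peak.
# Assumption: the bus width is not given on the slide; 32 and 64 bits are
# shown purely for illustration.

def mbytes_to_mbit(mbytes_per_s: float) -> float:
    """Convert Mbytes/s to Mbit/s (8 bits per byte)."""
    return mbytes_per_s * 8

def pci_peak_mbit(clock_mhz: float, width_bits: int) -> float:
    """Theoretical peak: one bus-width transfer per clock, no overhead."""
    return clock_mhz * width_bits

if __name__ == "__main__":
    for rate in (40, 45):
        print(f"{rate} Mbytes/s = {mbytes_to_mbit(rate):.0f} Mbit/s")
    for width in (32, 64):
        print(f"33 MHz x {width} bit: {pci_peak_mbit(33, width):.0f} Mbit/s peak")
```

In a disk-to-network transfer the same data typically crosses the bus more than once (controller to memory, then memory to NIC) if both devices share it, which is one plausible reason a bus can be saturated well below its nominal peak.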

5 Application design – Throughput + Web100
• 2 Gbyte file transferred, RAID0 disks
• Web100 output every 10 ms
• Gridftp
  - See alternate 600/800 Mbit and zero
• Apache web server + curl-based client
  - See steady 720 Mbit
[Figure label: MB-NG]
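A throughput trace of this kind can be derived from a cumulative byte counter sampled at a fixed interval. The sketch below shows the arithmetic only; it does not use the Web100 API, and the sample values are hypothetical numbers chosen to resemble a steady rate followed by a stall, not data from the talk.

```python
# Per-interval throughput from a cumulative byte counter sampled every 10 ms.
# The counter values below are hypothetical, not measurements from the talk.

def interval_throughput_mbit(byte_counts, interval_ms=10):
    """Return Mbit/s for each gap between consecutive counter samples."""
    rates = []
    for prev, cur in zip(byte_counts, byte_counts[1:]):
        bits = (cur - prev) * 8
        # bits per millisecond / 1000 == Mbit/s
        rates.append(bits / (interval_ms * 1000))
    return rates

if __name__ == "__main__":
    samples = [0, 900_000, 1_800_000, 1_800_000, 2_800_000]
    for i, rate in enumerate(interval_throughput_mbit(samples)):
        print(f"interval {i}: {rate:.0f} Mbit/s")
```

Sampling at 10 ms is fine enough to show an application alternating between bursts and idle periods, which a multi-second average would hide.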

6
• Network Monitoring is vital
• Development of new TCP stacks and non-TCP protocols is required
• Multi-Gigabit transfers are possible and stable on current networks
• Complementary provision of packet IP & λ-Networks is needed
• The performance of the end host / disks is really important
• Application design can determine Perceived Network Performance
• Helping Real Users is a must – can be harder than herding cats
• Cooperation between Network providers, Network Researchers, and Network Users has been impressive
• Standards (e.g. GGF / IETF) are the way forward
• Many grid projects just assume the network will work !!!
• It takes lots of co-operation to put all the components together

8 Tuning PCI-X: Variation of mmrbc, IA32
[Figure panels: mmrbc 512 bytes, 1024 bytes, 2048 bytes, 4096 bytes; annotations: CSR Access, PCI-X Sequence, Data Transfer, Interrupt & CSR Update]
• 16080 byte packets every 200 µs
• Intel PRO/10GbE LR Adapter
• PCI-X bus occupancy vs mmrbc
• Plot:
  - Measured times
  - Times based on PCI-X times from the logic analyser
  - Expected throughput
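mmrbc is the PCI-X maximum memory read byte count, which caps the size of a single read sequence on the bus. A minimal sketch of why it matters for the 16080 byte packets used here: the smaller the mmrbc, the more sequences are needed per packet. The model assumes the whole packet is fetched in reads of at most mmrbc bytes and ignores alignment and descriptor traffic; it is an illustration, not the analysis behind the plot.

```python
import math

# Packet size and spacing as quoted on the slide.
PACKET_BYTES = 16080
INTERVAL_US = 200

def sequences_per_packet(mmrbc_bytes: int) -> int:
    """Number of read sequences needed if each carries at most mmrbc bytes.

    Simplifying assumption: no alignment effects or descriptor reads.
    """
    return math.ceil(PACKET_BYTES / mmrbc_bytes)

def offered_rate_mbit() -> float:
    """Offered load in Mbit/s (bits per microsecond equals Mbit/s)."""
    return PACKET_BYTES * 8 / INTERVAL_US

if __name__ == "__main__":
    print(f"offered load: {offered_rate_mbit():.1f} Mbit/s")
    for mmrbc in (512, 1024, 2048, 4096):
        print(f"mmrbc {mmrbc}: {sequences_per_packet(mmrbc)} sequences per packet")
```

Each sequence carries fixed setup overhead on the bus, so fewer, larger sequences generally leave more of the bus time for data transfer.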

9 GGF: Hierarchy Characteristics Document
• “A Hierarchy of Network Performance Characteristics for Grid Applications and Services”
• Document defines terms & relations:
  - Network characteristics
  - Measurement methodologies
  - Observation
• Discusses Nodes & Paths
• For each Characteristic:
  - Defines the meaning
  - Attributes that SHOULD be included
  - Issues to consider when making an observation
• Status:
  - Originally submitted to GFSG as Community Practice Document draft-ggf-nmwg-hierarchy-00.pdf, Jul 2003
  - Revised to Proposed Recommendation http://www-didc.lbl.gov/NMWG/docs/draft-ggf-nmwg-hierarchy-02.pdf, 7 Jan 04
  - Now in 60 day Public comment from 28 Jan 04 – 18 days to go.

