MAGGIE Monitoring and Analysis for the Global Grid and Internet End-to-end performance Warren Matthews (SLAC) Presented at the Measurement SIG ESCC/Internet2.

Presentation transcript:

MAGGIE Monitoring and Analysis for the Global Grid and Internet End-to-end performance Warren Matthews (SLAC) Presented at the Measurement SIG ESCC/Internet2 Techs Workshop. Lawrence, Kansas, August 3-7, 2003.

2 What is MAGGIE? MAGGIE – Measurement and Analysis for the Global Grid and Internet End-to-end performance. A proposal for a monitoring infrastructure for network/grid management – DoE/MICS – Worldwide HENP/Grid. Closely related to e2epi and PIPES.

3 Toward a Monitoring Infrastructure Certainly the need – DOE Science Community – Grid – Troubleshooting – E2Epi has recognized the need. Many of the ingredients – Many monitoring projects, many tools – PIPES

4 Network Management “Unfortunately, network management research has historically been very under-funded, because it is difficult to get funding bodies to recognize this as legitimate networking research.” Sally Floyd, IAB Concerns & Recommendations Regarding Internet Research & Evolution.

5 MAGGIE [Architecture diagram. Labels: MAGGIE, NIMI, Security and scheduling, IEPM-BW, Measurement Engine, Publishing, Fault Finding, Analysis Engine, Other tools, NMWG, AMP, RIPE, SLAC, FNAL, PSC, ICIR, LBNL, SLAC, ANL, SCIDAC, UCL]

6 IEPM-BW SLAC package for monitoring and analysis. Currently 10 monitoring sites: SLAC, FNAL, GATech (SOX), INFN (Milan), NIKHEF, APAN (Japan), UMich, Internet2 (Michigan), UManchester, UCL (both UK). 2-36 targets.

7 [Network map. Labels: SNV, SLAC, CHI, ESnet, NY, Stanford, CalREN, NERSC, LANL, JLAB, TRIUMF, KEK, Abilene, FNAL, ANL, NIKHEF, CERN, IN2P3, CALTECH, SDSC, BNL, JAnet, HSTN, SEA, ATL, CLV, IPLS, RAL, UCL, UManc, DL, NNW, Rice, UTDallas, NCSA, UMich, I2, SOX, UFL, APAN, RIKEN, INFN-Roma, INFN-Milan, CESnet, Geant, ORNL. Legend: EDG, PPDG/GriPhyN, Monitoring Site]

8 Measurement Engine Ping, Traceroute; Iperf, Bbftp, Bbcp (mem and disk); ABwE; GridFTP, UDPmon; Web100; Passive (netflow)

9

10 Analysis Engine Publishing – Usual method is on the web – Too much to view frequently, time delay (resolve problem before user complains). Alarm system based on Web/Grid Service – GGF NMWG Schema. Detect performance hits without human intervention. Find location of fault as a starting point – PIPES contact database

11 Troubleshooting RIPE-TT Testbox alarm – Rolling average (morning-afternoon-evening-night). AMP Automatic Event Detection – Mean and variance. Our approach is diurnal changes – Median and standard deviation of measurements on Monday 7pm-8pm
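The diurnal approach on this slide can be sketched roughly as follows. This is a minimal illustration, not the IEPM-BW implementation: the function name `diurnal_baselines`, the (weekday, hour) bucketing, the use of the population standard deviation, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median, pstdev


def diurnal_baselines(samples):
    """Group (timestamp, throughput_mbps) samples by (weekday, hour) and
    return {(weekday, hour): (median, std_dev)} for each bucket.

    Assumption: one bucket per hour of the week, e.g. Monday 7pm-8pm.
    """
    buckets = defaultdict(list)
    for when, mbps in samples:
        buckets[(when.weekday(), when.hour)].append(mbps)
    return {
        slot: (median(values), pstdev(values))
        for slot, values in buckets.items()
    }


# Hypothetical throughput samples in Mbps: three Mondays at 7pm, one Tuesday at 3am.
samples = [
    (datetime(2003, 7, 7, 19, 5), 88.0),
    (datetime(2003, 7, 14, 19, 10), 92.0),
    (datetime(2003, 7, 21, 19, 20), 90.0),
    (datetime(2003, 7, 22, 3, 0), 95.0),
]
baselines = diurnal_baselines(samples)
monday_7pm = baselines[(0, 19)]  # weekday 0 is Monday in Python's datetime
```

A new Monday 7pm measurement would then be compared against `monday_7pm` rather than against a single all-hours average, so regular daily load patterns are not flagged as faults.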

12 Case Study Iperf measurements between SLAC and the Internet2 office in Ann Arbor. Determine the baseline. What should be considered a performance hit? If you don’t measure, you don’t know (Kevin Walsh)

13

14 Busiest Routes 26,969 traceroutes; 116 routes, 9 seen >1%, 3 seen >4%. [Route diagram. Labels: SLAC, Stanford, CalREN, dnvr-sunv, kscy-dnvr, ipls-kscy, clev-ipls, mich.net, Thunderbird, sunvng-sunv, kscyng-sunvng, iplsng-kscyng, mich.net] Route 19 – 3447 (12.8%); Route 60 – 2865 (10.6%); Route 70 – (54.3%)
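The route percentages on this slide are counts of identical traceroute paths divided by the total number of traceroutes. A small sketch of that tally is below. The function name `route_shares`, the representation of a route as a tuple of hop names, and the ten sample traceroutes are hypothetical. Only the two arithmetic checks at the end use figures from the slide.

```python
from collections import Counter


def route_shares(routes):
    """Return [(route, count, percent)] sorted from most to least frequent.

    Each route is a hashable sequence of hops (here, a tuple of hop names).
    """
    counts = Counter(routes)
    total = sum(counts.values())
    return [(route, n, 100.0 * n / total) for route, n in counts.most_common()]


# Hypothetical traceroute results, reduced to their hop sequences.
route_a = ("SLAC", "Stanford", "CalREN", "mich.net")
route_b = ("SLAC", "Stanford", "Thunderbird", "mich.net")
route_c = ("SLAC", "CalREN", "mich.net")
observed = [route_a] * 6 + [route_b] * 3 + [route_c]

shares = route_shares(observed)

# Percentages quoted on the slide, recomputed from its own counts.
route_19_percent = round(100.0 * 3447 / 26969, 1)
route_60_percent = round(100.0 * 2865 / 26969, 1)
```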

15

16

17 [Plot. Labels: All Data, Good, Bad, Better]

18

19

20 Conclusion Not a network problem! Systematic error: competing periodic test/transfer. Fix it! What if I don’t/can’t? Users will hit this even if I move the test.

21 Mean=85.97 Mbps Median=90 Mbps Std Dev=15.1 Mbps

22 Concern Threshold (ctresh): Median – 1 std dev. Alarm Threshold (atresh): Median – 2 std dev. [Plot. Labels: Concern, Alarm, Calm, Long Term]
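As a worked example, the two thresholds can be computed from the median and standard deviation quoted on slide 21 (median 90 Mbps, standard deviation 15.1 Mbps). This is a sketch under stated assumptions: the function names and the sample measurements are invented, and treating a value exactly on a threshold as the milder state is a choice made here, not something the slides specify.

```python
def thresholds(median_mbps, std_dev_mbps):
    """Concern threshold = median - 1 std dev; alarm threshold = median - 2 std dev."""
    concern = median_mbps - 1 * std_dev_mbps
    alarm = median_mbps - 2 * std_dev_mbps
    return concern, alarm


def classify(throughput_mbps, concern, alarm):
    """Map one measurement to 'calm', 'concern' or 'alarm'.

    Assumption: a value exactly equal to a threshold counts as the milder state.
    """
    if throughput_mbps < alarm:
        return "alarm"
    if throughput_mbps < concern:
        return "concern"
    return "calm"


# Median and standard deviation from slide 21.
concern, alarm = thresholds(90.0, 15.1)  # about 74.9 and 59.8 Mbps

# Hypothetical new measurements in Mbps.
states = [classify(x, concern, alarm) for x in (88.0, 70.0, 45.0)]
```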

23 [Plot. Labels: Calm, Concern, Alarm]

24 To be continued … Determine statistics for all end-to-end pairs: long-term (all data), medium-term (last 30 days) and short-term (last 5 measurements). Not manually.
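The three windows named on this slide could be computed automatically for one end-to-end pair along these lines. This is only a sketch: the function name `window_stats`, the choice of median and population standard deviation as the statistics, and the sample data are assumptions for illustration.

```python
from datetime import datetime, timedelta
from statistics import median, pstdev


def window_stats(samples, now, medium_days=30, short_count=5):
    """Return {window: (median, std_dev)} for long, medium and short term.

    samples is a list of (timestamp, throughput_mbps) for one pair.
    Long term = all data, medium term = last `medium_days` days,
    short term = last `short_count` measurements.
    """
    ordered = sorted(samples, key=lambda sample: sample[0])
    values = [value for _, value in ordered]
    cutoff = now - timedelta(days=medium_days)
    windows = {
        "long": values,
        "medium": [value for when, value in ordered if when >= cutoff],
        "short": values[-short_count:],
    }
    return {
        name: (median(vals), pstdev(vals)) if vals else None
        for name, vals in windows.items()
    }


# Hypothetical measurements, one every 10 days, oldest first.
now = datetime(2003, 8, 1)
values = [60.0, 70.0, 80.0, 90.0, 90.0, 90.0, 50.0, 40.0]
samples = [
    (now - timedelta(days=10 * (len(values) - 1 - i)), value)
    for i, value in enumerate(values)
]
stats = window_stats(samples, now)
```

Running the same function over every source and target pair is one way to avoid doing this manually.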

25 Next Steps Continue publishing Web/Grid Services – GGF/NMWG Workshop Parameters for Autoshooting NetRat (fault finding) MonALISA (visualization) – IEPM-BW is plugged in, also with service interface Advisor Funding for MAGGIE

26 Links IEPM-BW, ABwE, AMP, NIMI, RIPE-TT, SLAC Web Services, GGF NMWG, AMP TroubleShooting

27 Credits Les Cottrell, Connie Logg, Jerrod Williams, Jiri Navratil, Fabrizio Coccetti; Frank Nagy, Maxim Grigoriev; Brian Tierney; Eric Boyd, Jeff Boote; Vern Paxson, Andy Adams; Tony McGregor; Iosif Legrand; Local Admins and other volunteers; DoE/MICS