Published by Jasmin Ashlynn Hardy. Modified over 8 years ago.
1
CETull@lbl.gov - GMA Athena (24mar03 - CHEP 2003 @ La Jolla, CA)
GMA Instrumentation of the Athena Framework using NetLogger
Dan Gunter, Wim Lavrijsen, David Quarrie, Brian Tierney, Craig Tull
HCG/NERSC/LBNL
CHEP 2003, La Jolla, CA - March 24, 2003
2
The Problem
The Atlas Athena Framework has a large number of components.
When a job runs in a Grid environment and something goes wrong (e.g., the job runs slower than expected or crashes), it is very difficult to determine which component is at fault.
Constant, verbose logging generates too much information.
Solution: We are using NetLogger and pyGMA to instrument and monitor Athena.
3
Athena/GAUDI Architecture
[Diagram: the Application Manager coordinates Algorithms, Converters, and services. The Event Data Service, Detector Data Service, and Histogram Service each manage a transient store (Transient Event Store, Transient Detector Store, Transient Histogram Store) and a Persistency Service backed by Data Files. Also shown: Message Service, JobOptions Service, Particle Property Service, and other services.]
4
Grid Testbed Topologies (2002)
EDG Testbed (star)
US ATLAS (mesh)
NorduGrid (mesh)
5
Review: Grid Monitoring Architecture (GMA): Terminology and Architecture
(Performance) Event:
  - Typed collection of data with a specific structure
Producer Interface:
  - makes performance data (events) available
Consumer Interface:
  - receives performance data (events)
Directory Service:
  - supports information publication and discovery
  - must be distributed and/or replicated
http://www.ggf.org/Documents/GFD/GFD-I.7.pdf
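The four GMA terms above can be illustrated with a small in-process sketch. This is not pyGMA and does not follow its API; the class names, method names, and the "algorithm.timing" event type are all assumptions made up for illustration. A real Directory Service would also be distributed and/or replicated, which this toy version is not.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    """A typed collection of data with a specific structure."""
    type: str
    data: Dict[str, object]


class Producer:
    """Makes performance events available to subscribed consumers."""

    def __init__(self, name: str, event_type: str):
        self.name = name
        self.event_type = event_type
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, **data) -> None:
        # Wrap the data in a typed event and deliver it to every subscriber.
        event = Event(self.event_type, data)
        for callback in self._subscribers:
            callback(event)


class Directory:
    """Supports publication (register) and discovery (lookup) of producers."""

    def __init__(self):
        self._by_type: Dict[str, List[Producer]] = {}

    def register(self, producer: Producer) -> None:
        self._by_type.setdefault(producer.event_type, []).append(producer)

    def lookup(self, event_type: str) -> List[Producer]:
        return list(self._by_type.get(event_type, []))


class Consumer:
    """Receives events from producers it discovers through the directory."""

    def __init__(self):
        self.received: List[Event] = []

    def attach(self, directory: Directory, event_type: str) -> int:
        producers = directory.lookup(event_type)
        for producer in producers:
            producer.subscribe(self.received.append)
        return len(producers)


directory = Directory()
producer = Producer("athena-job-1", "algorithm.timing")  # hypothetical names
directory.register(producer)

consumer = Consumer()
consumer.attach(directory, "algorithm.timing")
producer.publish(algorithm="TrackFinder", seconds=0.42)
```

The point of the split is that the consumer never needs to know the producer in advance: it only knows the event type it wants and asks the directory who offers it.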
6
Athena Distributed Instrumentation
Part of SuperComputing 2002 ATLAS demo
IGMASvc IMonitorSvc extension?
  - Abstract application monitoring service.
NetLogger (http://www-didc.lbl.gov/NetLogger/)
  - End-to-End Monitoring & Analysis of Distributed Systems
  - C, C++, Java, Python, Perl, Tcl APIs
  - Web Service Activation
Prophesy (http://prophesy.mcs.anl.gov/)
  - An Infrastructure for Analyzing & Modeling the Performance of Parallel & Distributed Applications
  - Normally a Parse & auto-instrument approach (C & FORTRAN).
7
DIDC Technologies Used
LBNL's Data Intensive Distributed Computing Group
NetLogger provides
  - Easy to use instrumentation library
  - Ability to correlate data from various sources based on time
  - Easy way to collect data from multiple clients/servers reliably
  - Visualization and analysis tools
pyGMA provides
  - Easy to use producer and consumer Python library for constructing GGF-defined GMA services
Activation Service provides
  - Ability to remotely trigger and collect monitoring data in running Grid applications
8
NetLogger Toolkit
DIDC have developed the NetLogger Toolkit (short for Networked Application Logger), which includes:
  - tools to make it easy for distributed applications to log interesting events at every critical point
      - NetLogger client library (C, C++, Java, Perl, Python)
  - tools for host and network monitoring
  - event visualization tools that allow one to correlate application events with host/network events
  - NetLogger event archive and retrieval tools (new)
NetLogger combines network, host, and application-level monitoring to provide a complete view of the entire system.
Open Source (http://www-didc.lbl.gov/NetLogger/)
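To make "log interesting events at every critical point" concrete, here is a minimal sketch of a timestamped event writer. It does not use the NetLogger client library, and the key=value line layout, the field names (DATE, HOST, PROG, EVNT), and the function and class names are assumptions chosen for illustration only.

```python
import io
import socket
from datetime import datetime, timezone


def format_event(name, when=None, host=None, prog="athena", **fields):
    """Render one event as a single key=value line with a microsecond timestamp."""
    when = when or datetime.now(timezone.utc)
    host = host or socket.gethostname()
    parts = [
        "DATE=" + when.strftime("%Y%m%d%H%M%S.%f"),  # %f is microseconds
        "HOST=" + host,
        "PROG=" + prog,
        "EVNT=" + name,
    ]
    # Any extra keyword arguments become additional upper-cased fields.
    parts += [f"{key.upper()}={value}" for key, value in fields.items()]
    return " ".join(parts)


class EventLog:
    """Writes formatted events to any file-like stream, one per line."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, name, **fields):
        self.stream.write(format_event(name, **fields) + "\n")


# Bracket a critical section with a start and an end event.
buffer = io.StringIO()
log = EventLog(buffer)
log.write("ReadEvent.Start", host="node01", size=2048)
log.write("ReadEvent.End", host="node01", size=2048)

# A fixed timestamp makes the line layout easy to inspect.
fixed = datetime(2003, 3, 24, 12, 0, 0, 123456, tzinfo=timezone.utc)
line = format_event("ReadEvent.Start", when=fixed, host="node01", size=2048)
# line == "DATE=20030324120000.123456 HOST=node01 PROG=athena EVNT=ReadEvent.Start SIZE=2048"
```

Because every line carries a host name and a precise timestamp, lines written by different programs on different machines can later be merged and sorted by time.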
9
GMASvc Service
Typical Athena Abstract Interface design.
  - Dual Use Library Linking Algorithms, etc. & Loading DL
  - Concrete implementation using NetLogger
  - Properties to adjust: NetLogger: On/Off/Level, Distinguished User Name, Activation Service
  - Controlled by Environment Variables.
  - Use in Algorithms, Converters, StoreGate Store/Retrieve, etc.
GMAAuditor
  - Typical Athena Auditor bracketing standard Algorithm methods (initialize, execute, finalize)
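The auditor idea, bracketing each standard Algorithm method with a pair of events, can be sketched as follows. This is plain Python rather than the Athena/Gaudi C++ interfaces, and the class names, the `run` helper, and the `GMA_MONITOR` environment variable are hypothetical; the slides do not name the actual variables or signatures.

```python
import os
import time


class Auditor:
    """Records a start and an end event around each audited method call."""

    def __init__(self, enabled=None):
        if enabled is None:
            # Hypothetical variable name, used only to illustrate
            # switching monitoring on and off from the environment.
            enabled = os.environ.get("GMA_MONITOR", "off").lower() == "on"
        self.enabled = enabled
        self.events = []  # list of (event name, wall-clock time) pairs

    def call(self, algorithm, method_name):
        method = getattr(algorithm, method_name)
        if not self.enabled:
            return method()
        label = f"{type(algorithm).__name__}.{method_name}"
        self.events.append((label + ".start", time.time()))
        try:
            return method()
        finally:
            # The end event is recorded even if the method raises.
            self.events.append((label + ".end", time.time()))


class ToyAlgorithm:
    """Stand-in for an algorithm with the three standard methods."""

    def initialize(self):
        return "ready"

    def execute(self):
        return sum(range(1000))

    def finalize(self):
        return "done"


def run(algorithm, auditor, n_events=2):
    """Drive the algorithm through its life cycle via the auditor."""
    auditor.call(algorithm, "initialize")
    for _ in range(n_events):
        auditor.call(algorithm, "execute")
    auditor.call(algorithm, "finalize")


auditor = Auditor(enabled=True)
run(ToyAlgorithm(), auditor)
names = [name for name, _ in auditor.events]
```

With monitoring disabled the auditor simply forwards the call and records nothing, so the algorithm code itself never changes.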
10
Atlas Athena Monitoring Activation: SC02 Demo
11
Activation Service Architecture
12
Activation Service GUI
13
NetLogger Analysis: Key Concepts
NetLogger visualization tools are based on time correlated and object correlated events.
  - precision timestamps (default = microsecond)
If applications specify an “object ID” for related events, this allows the NetLogger visualization tools to generate an object “lifeline”.
In order to associate a group of events into a “lifeline”, you must assign an “Event ID” to each NetLogger event.
  - Sample Event ID: file name, block ID, frame ID, etc.
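The lifeline concept amounts to grouping events by a shared ID and ordering each group by timestamp. The sketch below shows that grouping in plain Python; it is not how the NetLogger visualization tools are implemented, and the dictionary keys ("ts", "name", "id"), the event names, and the sample values are assumptions for illustration.

```python
from collections import defaultdict


def build_lifelines(events, id_key="id"):
    """Group events that share an ID and sort each group by timestamp."""
    lifelines = defaultdict(list)
    for event in events:
        lifelines[event[id_key]].append(event)
    for group in lifelines.values():
        group.sort(key=lambda e: e["ts"])
    return dict(lifelines)


def elapsed(lifeline):
    """Time between the first and last event of one lifeline."""
    return lifeline[-1]["ts"] - lifeline[0]["ts"]


# Timestamps are integer microseconds; events arrive interleaved and unordered.
events = [
    {"ts": 1500, "name": "read.end", "id": "block-7"},
    {"ts": 1200, "name": "read.start", "id": "block-8"},
    {"ts": 1000, "name": "read.start", "id": "block-7"},
    {"ts": 2100, "name": "read.end", "id": "block-8"},
]

lifelines = build_lifelines(events)
durations = {key: elapsed(group) for key, group in lifelines.items()}
# durations == {"block-7": 500, "block-8": 900}
```

Plotting each group as a connected line against time shows at a glance where a given block, file, or frame spent its time.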
14
NLV Athena Example
15
Completed Tasks
Instrumented several Athena components with NetLogger
Developed prototype activation service
Developed prototype interface to the activation service for Athena monitoring events
Demonstrated at SC02
16
Current Work
We are now expanding on the components used in the SC02 demo
  - Develop a “proof of concept” general purpose Grid troubleshooting architecture in concert with GANGA, Athena, DOE Science Grid
Tasks include
  - Further integration of Atlas Software with Globus (Large ITR work related)
  - Further NetLogger instrumentation of Globus, GANGA, and Athena
  - Redesign of activation service for increased performance
  - Integration with Karlo Berket’s scalable and secure peer-to-peer resource discovery service, which will be used to locate producers
17
For More Information
NetLogger: http://www-didc.lbl.gov/NetLogger/
SC02 Demo: http://annwm.lbl.gov/henp/meet/sc02_nov02/
Athena: http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO/architecture/General/index.html
Email: BLTierney@LBL.GOV, CETull@LBL.GOV
18
Extra Slides if you want more details
19
Monitoring Components
20
Activation Service
21
Ganglia Cluster Monitoring