1
UK Particle Physics GridPP Collaboration meeting, 23-25th May 2001 - R.P.Middleton (RAL/PPD)
Grid Monitoring Services
Robin Middleton, RAL/PPD, 24-May-01
2
Overview
  What is Monitoring?
  GGF Perf-WG
  DataGrid WP3
  Example : Netlogger
  Summary
3
Introduction
  Information Services part dealt with separately today
  DataGrid WorkPackage 3 (WP3)
    UK leadership / responsibility
    WP3 = Grid Monitoring AND Information Services
  Global Grid Forum - Perf Mon Workgroup
    http://www-didc.lbl.gov/GridPerf/
4
What is Monitoring?
  Application performance
  Fabric availability
  Network availability / performance
  Event / Alert Archives
  Forecasting (e.g. NWS)
  Issues
    update/read frequency
    information streaming
    hierarchical vs. relational
    relaxed coherence; timestamps
    scalable; non-invasive
    non-repeatable
  Monitoring vs. Monitoring & Information?
5
Boundaries
  [Diagram labels: Monitoring; Mass Storage, Computing Fabric, Network; Application, Workload Mgt, DataMan; End-Users, Sys/Grid-Admin]
6
GGF : Perf-WG
  “The Grid Performance working group is focused on defining standards and best practices for the gathering, representation, storage, distribution, and query of performance information about Grid resources and applications.”
  Four Projects (!)
    1. Define a schema for data formats for performance monitoring. This would be a common interchange format that tools could use to interoperate.
    2. Taxonomy / classification of performance monitoring and analysis tools.
    3. Survey of existing tools classified by the above taxonomy.
    4. Recommendations on the aspects of grid applications, services and resources that should be monitored.
    5. The development of performance monitoring tools based upon the survey of tools.
7
GGF Perf-WG : Use Cases
  1: Instrumented library for performance measurement (e.g. I/O system)
  2: Netlogger/DPSS monitoring streams to log file
  3: JAMM (Java) sensors stream data to a GUI
  4: JAMM/Port Monitor
  5: Fault detection & analysis
  6: Job progress monitoring
  7: Distributed system performance analysis
  8: Network-aware, self-tuning applications
  9: Data replication (choice of “best” location)
  10: Scheduling & prediction services
  11: Auditing systems
  12: Configuration monitoring
  13: User application monitoring
  14: Application self-tuning
  15: Real-time adaptive simulation & presentation
8
DataGrid : WorkPackage 3
  The aim of this workpackage is to specify, develop, integrate and test tools and infrastructure to enable end-user and administrator access to status and error information in a Grid environment and to provide an environment in which application monitoring can be carried out. This will permit both job performance optimisation as well as allowing for problem tracing and is crucial to facilitating high performance Grid computing.
9
Architecture (GGF : Perf-WG)
  [Diagram labels: Sensor and Producer on Host - A; Sensor and Producer on Host - B; Consumer; Directory Service. Interactions: Publish, Subscribe, Discovery]
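The producer / consumer / directory split named on this slide can be sketched in a few lines. This is an illustrative in-process model only, not the GGF specification or any real library: the class names, method names and the "cpu.load" event type are all assumptions. It also assumes the directory holds registrations only, while event data passes directly from producer to consumer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class Producer:
    """Makes events from the sensors on one host available to consumers."""
    name: str
    subscribers: List[Callable[[Event], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self.subscribers.append(callback)

    def emit(self, event: Event) -> None:
        # Event data goes straight to each subscriber, not via the directory.
        for callback in self.subscribers:
            callback(event)


class DirectoryService:
    """Holds only registrations: which producers offer which event type."""

    def __init__(self) -> None:
        self._registry: Dict[str, List[Producer]] = {}

    def publish(self, event_type: str, producer: Producer) -> None:
        self._registry.setdefault(event_type, []).append(producer)

    def discover(self, event_type: str) -> List[Producer]:
        return list(self._registry.get(event_type, []))


# Hypothetical set-up: one producer per sensor host, both registered.
directory = DirectoryService()
host_a = Producer("host-a")
host_b = Producer("host-b")
directory.publish("cpu.load", host_a)
directory.publish("cpu.load", host_b)

# The consumer discovers producers, then subscribes to each one directly.
received: List[Event] = []
for producer in directory.discover("cpu.load"):
    producer.subscribe(received.append)

host_a.emit({"host": "host-a", "type": "cpu.load", "value": 0.42})
host_b.emit({"host": "host-b", "type": "cpu.load", "value": 0.87})
```

Keeping the directory out of the data path means it only has to scale with the number of registrations, not with the event rate.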
10
WP3 : Tasks
  Umbrellas
  Task 3.1: Requirements & Design (month 1-12)
  Task 3.2: Current Technology (month 1-12)
  Task 3.3: Infrastructure (month 7-24)
  Task 3.4: Analysis & Presentation (month 7-24)
  Task 3.5: Test & Refinement (month 19-36)
11
WP3 : Deliverables (as in the TA)
  D3.1 (Report) Month 12: Evaluation Report of current technology
  D3.2 (Report) Month 9: Detailed architectural design report and evaluation criteria (also input to WP12 architecture deliverable)
  D3.3 (Prototype) Month 9: Components and documentation for the First Project Release (see WP 6)
  D3.4 (Prototype) Month 21: Components and documentation for the Second Project Release (see WP 6)
  D3.5 (Prototype) Month 33: Components and documentation for the Final Project Release (see WP 6)
  D3.6 (Report) Month 36: Final evaluation report
12
WP3 : Milestones (as in the TA)
  M3.1 Month 6: Decide baseline architecture & technologies.
  M3.2 Month 9: Provide requirements for collation by Project Architect
  M3.3 Month 9: Prototype components integrated into First Project Release (see WP 6)
  M3.4 Month 21: Interim components integrated into Second Project Release (see WP 6)
  M3.5 Month 33: Final components integrated into Final Project Release (see WP 6)
13
WP3 : First Release (PM9)
  Information services based on a new version of the Globus MDS (soon to be in alpha release).
  Rudimentary implementation of a relational approach to information services.
  A set of APIs in support of both MDS and GMA approaches.
  Basic presentation of performance monitoring data based around Netlogger
14
WP3 : Effort
                 Funded   Unfunded   Total
  PPARC           3.0      1.83      4.83
  SZTAKI (HU)     2.08     0.92      3.0
  INFN (IT)       0.0      1.16      1.16
  IBM-UK          1.0      0.0       1.0
  Total           6.08     3.91     10.0
  + Trinity College Dublin
  (NB : for both Monitoring and Information Services)
15
WP3 : Use Cases
  Fault Detection & Analysis, Heartbeats [5]
  Job Status & Progress Monitoring [6]
  Application Performance Monitoring [1,13]
  Performance Analysis of Distributed Systems [7]
  Scheduling Services and Self Tuning Applications [8,10,14,(15)]
  Data Replication Services [9]
  Accounting & Auditing [11]
  Configuration monitoring [12]
16
WP3 : Decisions (end 2000)
  Try to track standards & best practice from Global Grid Forum
    evaluate, steer, adopt, …
  Other WPs should provide the majority of sensors
    network, fabric, mass-storage
  WP3 will provide the instrumentation API
  Key deliverables will be
    Performance Services
    Error / Alert Services
    Status / Parameter Services
    Logging / Archival Services
    (forecasting) - information to enable other WPs to do this
  WP3 subcontracts archival services (in terms of the data management aspects) ?
17
Netlogger
  [Diagram labels: Supervisor, Processing Node, Readout Buffer]
  Acknowledgement : Weidong Li
18
Netlogger
  [Diagram labels: Supervisor, Processing Node, Readout Buffer]
  Acknowledgement : Weidong Li
19
Sequence Diagram
  [Diagram: lifelines Supervisor, Readout Buffer, Processing Node; messages Request, Fetch Data, Return data, Result; numbered time points 1-7 along the TIME axis]
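The timestamped event points in a sequence like this can be illustrated with a minimal logging sketch. It does not use the Netlogger API. The `EventLog` class, the event names, and the assignment of each message to a component are assumptions made for illustration; the flattened diagram does not show which component sends which message. The clock is injectable so the example runs deterministically.

```python
import time
from typing import Callable, List, Tuple

Record = Tuple[float, str, str]  # (timestamp in seconds, component, event)


class EventLog:
    """Collects timestamped events so intervals between them can be analysed."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self.records: List[Record] = []

    def log(self, component: str, event: str) -> None:
        self.records.append((self._clock(), component, event))

    def intervals(self) -> List[Tuple[str, float]]:
        """Elapsed seconds between consecutive events, in logging order."""
        pairs = zip(self.records, self.records[1:])
        return [(f"{a[2]} -> {b[2]}", b[0] - a[0]) for a, b in pairs]


# Fake clock values (seconds) stand in for real timestamps in this sketch.
ticks = iter([0.000, 0.002, 0.010, 0.011])
log = EventLog(clock=lambda: next(ticks))

# Assumed flow: the supervisor requests work, the processing node fetches
# data from the readout buffer, then returns a result.
log.log("supervisor", "request")
log.log("processing_node", "fetch_data")
log.log("readout_buffer", "return_data")
log.log("processing_node", "result")

for label, seconds in log.intervals():
    print(f"{label}: {seconds * 1000:.1f} ms")
```

Plotting the timestamps against the event number gives the kind of timeline view that the following Results slide refers to.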
20
Results
  [Plot: event points 1-7; X : secs; Y : “count”]
  Acknowledgment : Weidong Li
21
Netlogger Summary
  Example deployment
  Time resolution
    NTP (~5 ms)
    Custom h/w (~50 µs)
  Thread safety ?
  Variety of visualisation methods
  “non-invasive” ?
  Moving towards the GMA
    e.g. integration of directory service
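Why time resolution matters can be shown with a small worked check. Events timestamped on two different hosts can only be put in order if they are further apart than the combined clock uncertainty. This is a sketch under stated assumptions: the `can_order` helper is hypothetical, the quoted resolutions are treated as per-host uncertainty bounds, and the custom-hardware figure is read as roughly 50 microseconds.

```python
def can_order(t_a: float, t_b: float, unc_a: float, unc_b: float) -> bool:
    """True if two timestamps (seconds) differ by more than the combined
    clock uncertainty of the two hosts, so their order is unambiguous."""
    return abs(t_a - t_b) > unc_a + unc_b


NTP_UNCERTAINTY = 5e-3      # ~5 ms, as quoted for NTP
CUSTOM_HW_UNCERTAINTY = 50e-6  # ~50 microseconds (assumed reading of the slide)

# Two events on different hosts, logged 2 ms apart.
t_request, t_fetch = 0.000, 0.002

ntp_ok = can_order(t_request, t_fetch, NTP_UNCERTAINTY, NTP_UNCERTAINTY)
hw_ok = can_order(t_request, t_fetch,
                  CUSTOM_HW_UNCERTAINTY, CUSTOM_HW_UNCERTAINTY)
print("orderable with NTP clocks:", ntp_ok)
print("orderable with custom h/w clocks:", hw_ok)
```

Under these assumptions a 2 ms gap is lost in the noise with NTP-synchronised clocks but is clearly resolved with the finer hardware timestamps.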
22
Summary
  Information Service is KEY to Monitoring
    …and nature of service to be determined !
  Unified Information Architecture is important
    …otherwise duplication and inconsistencies
  Align with Global Grid Forum for “standards”, etc.
  Starting point is Netlogger
  DataGrid deliverable details are testbed “driven”
  Cross-DataGrid WP - service to many areas