Grid Monitoring Services. Robin Middleton (RAL/PPD), 24 May 2001. UK Particle Physics GridPP Collaboration meeting, 23-25th May 2001.

Presentation transcript:

Grid Monitoring Services
Robin Middleton, RAL/PPD, 24-May-01

Overview
- What is Monitoring?
- GGF Perf-WG
- DataGrid WP3
- Example: Netlogger
- Summary

Introduction
- Information Services part dealt with separately today
- DataGrid WorkPackage 3 (WP3)
- UK leadership / responsibility
- WP3 = Grid Monitoring AND Information Services
- Global Grid Forum - Perf Mon Workgroup

What is Monitoring?
- Application performance
- Fabric availability
- Network availability / performance
- Event / Alert
- Archives
- Forecasting (e.g. NWS)
- Issues
  - update/read frequency
  - information streaming
  - hierarchical vs. relational
  - relaxed coherence; timestamps
  - scalable; non-invasive
  - non-repeatable
- Monitoring vs. Monitoring & Information?

Boundaries
[Diagram: Monitoring and its boundaries with Mass Storage, Computing Fabric, Network, Application, Workload Mgt, DataMan, End-Users and Sys/Grid-Admin]

GGF : Perf-WG
- “The Grid Performance working group is focused on defining standards and best practices for the gathering, representation, storage, distribution, and query of performance information about Grid resources and applications.”
- Four Projects (!)
  1. Define a schema for data formats for performance monitoring. This would be a common interchange format that tools could use to interoperate.
  2. Taxonomy / classification of performance monitoring and analysis tools.
  3. Survey of existing tools classified by the above taxonomy.
  4. Recommendations on the aspects of grid applications, services and resources that should be monitored.
  5. The development of performance monitoring tools based upon the survey of tools.

GGF Perf-WG : Use Cases
 1: Instrumented library for performance measurement (e.g. I/O system)
 2: Netlogger/DPSS monitoring streams to log file
 3: JAMM (Java) sensors stream data to a GUI
 4: JAMM/Port Monitor
 5: Fault detection & analysis
 6: Job progress monitoring
 7: Distributed system performance analysis
 8: Network-aware, self-tuning applications
 9: Data replication (choice of “best” location)
10: Scheduling & prediction services
11: Auditing systems
12: Configuration monitoring
13: User application monitoring
14: Application self-tuning
15: Real-time adaptive simulation & presentation

DataGrid : WorkPackage 3
The aim of this workpackage is to specify, develop, integrate and test tools and infrastructure to enable end-user and administrator access to status and error information in a Grid environment and to provide an environment in which application monitoring can be carried out. This will permit both job performance optimisation as well as allowing for problem tracing and is crucial to facilitating high performance Grid computing.

Architecture (GGF : Perf-WG)
[Diagram: Producers on Sensor Host A and Sensor Host B, a Consumer, and a Directory Service, connected by Publish, Subscribe and Discovery]
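The producer, consumer and directory service roles named on the architecture slide can be illustrated with a small in-memory sketch. This is not the GGF design or any real toolkit API: the class names, the `publish`/`discover`/`subscribe`/`send` methods and the `cpu.load` event type are all hypothetical, and the sketch assumes that "Publish" means a producer advertising itself to the directory, "Discovery" means a consumer looking producers up there, and "Subscribe" means the consumer then attaching directly to the producer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class Producer:
    """Hypothetical producer: forwards sensor events from one host to subscribers."""
    host: str
    event_type: str
    subscribers: List[Callable[[Event], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        # Consumers attach directly to the producer; the directory is not in the data path.
        self.subscribers.append(callback)

    def send(self, event: Event) -> None:
        # Deliver one event to every subscribed consumer.
        for callback in self.subscribers:
            callback(event)


class DirectoryService:
    """Hypothetical directory: records which producers offer which event type."""

    def __init__(self) -> None:
        self._by_type: Dict[str, List[Producer]] = {}

    def publish(self, producer: Producer) -> None:
        # A producer advertises its existence (assumed meaning of "Publish").
        self._by_type.setdefault(producer.event_type, []).append(producer)

    def discover(self, event_type: str) -> List[Producer]:
        # A consumer looks up producers of a given event type ("Discovery").
        return list(self._by_type.get(event_type, []))


directory = DirectoryService()
producer_a = Producer(host="sensor-host-a", event_type="cpu.load")
producer_b = Producer(host="sensor-host-b", event_type="cpu.load")
directory.publish(producer_a)
directory.publish(producer_b)

# The consumer here is simply a list that collects whatever it is sent.
received: List[Event] = []
for producer in directory.discover("cpu.load"):
    producer.subscribe(received.append)

producer_a.send({"host": producer_a.host, "load": 0.42})
producer_b.send({"host": producer_b.host, "load": 1.10})
```

The point of the split is that the directory only holds metadata about who produces what; the event data itself flows from producer to consumer without passing through it.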

WP3 : Tasks
- Umbrellas
- Task 3.1: Requirements & Design (month 1-12)
- Task 3.2: Current Technology (month 1-12)
- Task 3.3: Infrastructure (month 7-24)
- Task 3.4: Analysis & Presentation (month 7-24)
- Task 3.5: Test & Refinement (month 19-36)

WP3 : Deliverables (as in the TA)
- D3.1 (Report) Month 12: Evaluation Report of current technology
- D3.2 (Report) Month 9: Detailed architectural design report and evaluation criteria (also input to WP12 architecture deliverable)
- D3.3 (Prototype) Month 9: Components and documentation for the First Project Release (see WP 6)
- D3.4 (Prototype) Month 21: Components and documentation for the Second Project Release (see WP 6)
- D3.5 (Prototype) Month 33: Components and documentation for the Final Project Release (see WP 6)
- D3.6 (Report) Month 36: Final evaluation report

WP3 : Milestones (as in the TA)
- M3.1 Month 6: Decide baseline architecture & technologies.
- M3.2 Month 9: Provide requirements for collation by Project Architect
- M3.3 Month 9: Prototype components integrated into First Project Release (see WP 6)
- M3.4 Month 21: Interim components integrated into Second Project Release (see WP 6)
- M3.5 Month 33: Final components integrated into Final Project Release (see WP 6)

WP3 : First Release (PM9)
- Information services based on a new version of the Globus MDS (soon to be in alpha release).
- Rudimentary implementation of a relational approach to information services.
- A set of APIs in support of both MDS and GMA approaches.
- Basic presentation of performance monitoring data based around Netlogger

WP3 : Effort
[Table: Funded / Unfunded / Total effort for PPARC, SZTAKI (HU), INFN (IT), IBM-UK, and Total]
Trinity College Dublin
(NB: for both Monitoring and Information Services)

WP3 : Use Cases
- Fault Detection & Analysis, Heartbeats [5]
- Job Status & Progress Monitoring [6]
- Application Performance Monitoring [1,13]
- Performance Analysis of Distributed Systems [7]
- Scheduling Services and Self Tuning Applications [8,10,14,(15)]
- Data Replication Services [9]
- Accounting & Auditing [11]
- Configuration monitoring [12]

WP3 : Decisions (end 2000)
- Try to track standards & best practice from Global Grid Forum
  - evaluate, steer, adopt, …
- Other WPs should provide the majority of sensors
  - network, fabric, mass-storage
- WP3 will provide the instrumentation API
- Key deliverables will be
  - Performance Services
  - Error / Alert Services
  - Status / Parameter Services
  - Logging / Archival Services
  - (forecasting) - information to enable other WPs to do this
- WP3 subcontracts archival services (in terms of the data management aspects)?

Netlogger
[Diagram: Supervisor, Processing Node, Readout Buffer]
Acknowledgement: Weidong Li


Sequence Diagram
[Diagram: messages exchanged over time between Supervisor, Readout Buffer and Processing Node: Request, Fetch Data, Return data, Result]
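A rough sketch of how timestamped events from such a sequence could be turned into per-stage timings follows. It does not use the Netlogger API; the `stage_durations` helper, the component labels and the event names are hypothetical, the timestamps are made-up illustrative values rather than measurements from the talk, and the assignment of each message to a component is an assumption read off the diagram labels.

```python
from typing import List, Tuple

# One log entry: (timestamp in seconds, component that logged it, event name).
LogEntry = Tuple[float, str, str]


def stage_durations(log: List[LogEntry]) -> List[Tuple[str, float]]:
    """Return the elapsed time between each pair of consecutive events.

    Entries are ordered by timestamp first, so logs merged from several
    components do not need to arrive in order.
    """
    ordered = sorted(log)
    return [
        (f"{earlier[2]} -> {later[2]}", later[0] - earlier[0])
        for earlier, later in zip(ordered, ordered[1:])
    ]


# Illustrative, invented timestamps for one request cycle.
log: List[LogEntry] = [
    (0.010, "readout_buffer", "return_data"),
    (0.000, "supervisor", "request"),
    (0.025, "processing_node", "result"),
    (0.002, "processing_node", "fetch_data"),
]

for stage, seconds in stage_durations(log):
    print(f"{stage}: {seconds * 1e3:.1f} ms")
```

Plotting such per-stage intervals for many requests is one way to see where time is spent along the chain, provided the clocks on the hosts involved are synchronised well enough.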

Results
[Plot: X axis in µsecs, Y axis “count”]
Acknowledgement: Weidong Li

Netlogger Summary
- Example deployment
- Time resolution
  - NTP (~5 ms)
  - Custom h/w (~50 µs)
- Thread safety?
- Variety of visualisation methods
- “non-invasive”?
- Moving towards the GMA
  - e.g. integration of directory service
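The time-resolution point can be made concrete with a small check of whether an interval measured across two hosts is larger than the clock uncertainty. This is only a sketch: the constants take the NTP figure of about 5 ms from the slide and read the custom hardware figure as about 50 µs (the unit glyph is damaged in the transcript, so that reading is an assumption), and the rule that an interval should exceed twice the per-clock uncertainty is a simple rule of thumb introduced here, not something stated in the talk.

```python
# Approximate clock synchronisation uncertainties, in seconds.
NTP_UNCERTAINTY_S = 5e-3         # ~5 ms, as quoted for NTP
CUSTOM_HW_UNCERTAINTY_S = 50e-6  # ~50 microseconds (assumed reading of the slide)


def is_resolvable(interval_s: float, clock_uncertainty_s: float) -> bool:
    """Rule-of-thumb check for an interval measured between two hosts.

    Each of the two timestamps may be off by up to the clock uncertainty,
    so the interval is treated as meaningful only if it exceeds twice that.
    """
    return interval_s > 2 * clock_uncertainty_s


# A 1 ms stage is lost in NTP-level skew but visible with tighter synchronisation.
one_millisecond = 1e-3
print(is_resolvable(one_millisecond, NTP_UNCERTAINTY_S))
print(is_resolvable(one_millisecond, CUSTOM_HW_UNCERTAINTY_S))
```

Intervals measured on a single host do not suffer from this skew, so the check matters mainly for events logged by different machines.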

Summary
- Information Service is KEY to Monitoring
  - …and nature of service to be determined!
- Unified Information Architecture is important
  - …otherwise duplication and inconsistencies
- Align with Global Grid Forum for “standards”, etc.
- Starting point is Netlogger
- DataGrid deliverable details are testbed “driven”
- Cross-DataGrid WP - service to many areas