EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Multi-level monitoring - an overview James.

Presentation transcript:

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Multi-level monitoring - an overview James Casey, OAT EGEE’08 Istanbul, Turkey

Why are we here…

What is the Operations Automation Team (OAT)
EGEE MSA1.1: Operations Automation Strategy
– Due end of PM1
– Delivered mid-June
– In review – comments welcome
Abstract: In EGEE-III, within the SA1 activity, a group called the ‘Operations Automation Team’ was formed with the task of coordinating operational tools and their development, with the specific goal of advising on the strategic directions to take in terms of automating the operations effort. This will entail replacing manual processes with automated ones so that the overall staffing level of operations can be significantly reduced in a long-term, sustainable infrastructure. This document outlines a strategy for achieving this automation using an integration architecture based on messaging. It describes how current tools and processes, such as operational alarming and ticketing, will evolve during the lifetime of EGEE-III and lays out a roadmap for this evolution.

Operational Tools in EGEE-III

Current Operational Model
Several teams involved
– Operations Management (OCC)
– Monitoring system operators (SAM)
– Grid operators (COD)
– Regional Operations Centres (ROC)
– First line support teams (ROC)
– Resource Centres/sites (RC)
– User support team (GGUS)

Current operational model(s)

Future operational model

Multi-level monitoring
Based on existing work in CE ROC
– Replace central SAM with Nagios at ROC and site
– Tie together with the messaging system (see later)
– Regional operations dashboard and alarms DB
– Link into regional ticketing
  • E.g., via GGUS
Follow new operational model
– Raise alarms immediately at the site
– 1st level support sees them and can respond if needed
– Central COD only involved after 2-3 weeks, e.g. site banning
Data is aggregated at the ROC for availability calculation
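The escalation path in the new operational model can be sketched in a few lines of Python. This is only an illustration, not part of the OAT tooling: the `Alarm` class, the team labels and the `teams_involved` function are hypothetical names, and the 14-day threshold is an assumption taken from the lower end of the "2-3 weeks" quoted on the slide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# Assumption: the slide says "2-3 weeks"; 14 days is used as the lower bound.
CENTRAL_COD_AFTER = timedelta(days=14)


@dataclass
class Alarm:
    """Hypothetical alarm record raised by site-level monitoring."""
    site: str
    service: str
    raised_at: datetime


def teams_involved(alarm: Alarm, now: datetime) -> List[str]:
    """Return the teams expected to look at an alarm at time `now`."""
    # The site and its ROC first-line support see the alarm as soon as it is raised.
    teams = ["site", "roc_first_line"]
    # Central COD only steps in if the alarm is still open after the threshold.
    if now - alarm.raised_at >= CENTRAL_COD_AFTER:
        teams.append("central_cod")
    return teams


raised = datetime(2008, 9, 22, 9, 0)
alarm = Alarm(site="EXAMPLE-SITE", service="CE", raised_at=raised)
print(teams_involved(alarm, raised + timedelta(days=1)))
print(teams_involved(alarm, raised + timedelta(days=21)))
```

The point of the sketch is that the alarm never leaves the region unless it stays open for weeks, which is what keeps the central COD workload low.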

Multi level monitoring framework

Messaging for integration
Use commodity messaging middleware (Apache ActiveMQ) to integrate systems
– Reliable, scalable, industry standard, open protocols
Broker already in production
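To show why a broker helps with integration, here is a minimal publish/subscribe sketch in Python. It does not talk to ActiveMQ: `InMemoryBroker` is a hypothetical in-process stand-in, and the topic name, message fields and consumer functions are all assumptions made for illustration. The status values follow the usual Nagios state names.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Message = Dict[str, str]
Handler = Callable[[Message], None]


class InMemoryBroker:
    """Stand-in for a message broker: each message goes to every subscriber of its topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> None:
        for handler in self._subscribers[topic]:
            handler(message)


TOPIC = "grid.probe.results"  # hypothetical topic name

alarms: List[Message] = []
results_by_site: Dict[str, List[str]] = defaultdict(list)


def dashboard_consumer(msg: Message) -> None:
    # A regional dashboard only cares about failing results.
    if msg["status"] != "OK":
        alarms.append(msg)


def roc_consumer(msg: Message) -> None:
    # The ROC keeps every result so it can aggregate them later.
    results_by_site[msg["site"]].append(msg["status"])


broker = InMemoryBroker()
broker.subscribe(TOPIC, dashboard_consumer)
broker.subscribe(TOPIC, roc_consumer)

# A site-level check publishes once; both consumers receive the result.
broker.publish(TOPIC, {"site": "SITE-A", "service": "CE", "status": "OK"})
broker.publish(TOPIC, {"site": "SITE-A", "service": "SRM", "status": "CRITICAL"})

print(len(alarms))
print(results_by_site["SITE-A"])
```

The publisher does not know who consumes its results, so a new tool can be added by subscribing to the topic, without changing the site-level checks.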

Roadmap for tools
Milestone ‘Messaging 1’: August 2008
– Production level messaging broker in production. This should have internal failover capabilities, but will not have the WAN failover capabilities of a network of brokers
Milestone ‘Messaging 2’: December 2008
– A scalable and reliable network of brokers, consisting of a deployment over at least 3 sites, is in place
Milestone ‘Site Monitoring 1’: September 2008
– A release of the site components for the multi-level monitoring, including packaging and configuration as part of an EGEE middleware release, exists and is ready for deployment to the sites
Milestone ‘ROC Monitoring 1’: December 2008
– The ROC components for the multi-site monitoring are ready for deployment to sites
Milestone ‘ROC Monitoring 2’: February 2009
– The alarm component has been integrated with the regionalized dashboard
Milestone ‘ROC Monitoring 3’: July 2009
– The regional dashboard is now available to be deployed at the ROCs

Roadmap for distributed COD
Milestone ‘rCOD 1’: September 2008
– 4 ROCs carry out r-COD and 1st line support roles directly. This will be done with a ‘regionalized’ version of the current operations dashboard, and with SAM as the alarm generation system
Milestone ‘rCOD 2’: April 2009
– 4 additional ROCs carry out r-COD and 1st line support roles using the regionalized dashboard
Milestone ‘rCOD 3’: April 2009
– 2 additional ROCs carry out r-COD and 1st line support roles directly using the new multi-level monitoring framework
Milestone ‘rCOD 4’: September 2009
– All 11 ROCs carry out r-COD and 1st line support roles directly. The c-COD is fully established
Milestone ‘rCOD 5’: December 2009
– All 11 ROCs carry out r-COD and 1st line support roles using the new multi-level monitoring framework

Summary
EGEE-III is moving to a new monitoring model
Key concept is that sites:
– are responsible for the reliability of their sites
  • with the help of their ROC as 1st line support
– are provided with the tools to allow them to run reliable services
  • Site monitoring component is provided, based on Nagios
Part of an overall strategy
Since Nagios will become a core component within SA1 for administrators, we need to provide training…
Now onto the Nagios specific bits from the experts…