EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks The network monitoring in grid context Operations.


The network monitoring in grid context: Operations Perspective
Emir Imamagic / SRCE
EGEE’09, Barcelona, Spain

Overview
Monitoring In Operations
Service Availability Monitoring
  – Architecture
  – Network Monitoring
Performance Monitoring
Possible Future Work
Conclusion

Monitoring In Operations
Give site and grid operators the means to monitor their resources
Focus on improving availability and reliability by spotting problems and issuing alarms
Define procedures for escalation and resolution of more complex problems

Service Availability Monitoring
[Figure: schema provided by Karolis Eigelis]

The New Architecture
[Figure: schema provided by Karolis Eigelis]

The New Architecture (continued)

Which Other Systems Are Used?
Database components
  – Aggregated Topology Provider (ATP)
  – Metric Description Database (MDDB)
Operations services
  – GOCDB, ENOC, OIM
Grid information services
  – BDII

What Do We Check?
SAM probes
  – various grid services (CE, WN and SRM)
WLCG probes (SRCE, CERN)
  – various grid services (e.g. GridFTP, LFC)
BDII & GStat probes
  – validation of content in the BDII information system
Nagios native probes
  – standard services (e.g. web, ftp, ssh servers)
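As an illustration of what a probe for a "standard service" can look like, here is a minimal sketch of a TCP connect check that follows the usual Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). It is not one of the probes named on the slide; the function name `check_tcp` and the mapping of errors to states are assumptions made for the example.

```python
import socket

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, timeout=5.0):
    """Hypothetical probe: try to open a TCP connection to host:port.

    Returns (exit_code, status_line). A real plugin would print the
    status line and exit with the code.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} accepts connections"
    except socket.gaierror as exc:
        # The name could not be resolved, so the service state is unknown.
        return UNKNOWN, f"TCP UNKNOWN - cannot resolve {host}: {exc}"
    except OSError as exc:
        # Refused, timed out, or otherwise unreachable (assumed CRITICAL here).
        return CRITICAL, f"TCP CRITICAL - {host}:{port}: {exc}"
```

A wrapper script would typically call `check_tcp(...)`, print the returned line, and pass the code to `sys.exit`.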

Network Monitoring
Collaboration with ENOC
  – integration of ENOC Downcollector features into SAM
Added lightweight service checks
  – based on nmap
  – executed with high frequency
  – used for masking other alarms
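The idea of using a frequent lightweight check to mask other alarms can be sketched as follows. This is an assumed, simplified model rather than the SAM implementation: `ProbeResult`, `alarms_to_raise` and the `host-unreachable` alarm name are hypothetical, and the reachability flags are taken as given instead of being produced by nmap.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    host: str
    probe: str
    ok: bool


def alarms_to_raise(reachable, results):
    """Return (host, alarm) pairs after masking.

    reachable: dict host -> bool from the lightweight check.
    results:   list of ProbeResult from the heavier service probes.
    Failed probes on an unreachable host are suppressed and replaced
    by a single 'host-unreachable' alarm for that host.
    """
    alarms = [(host, "host-unreachable")
              for host, up in sorted(reachable.items()) if not up]
    for r in results:
        if r.ok:
            continue
        if not reachable.get(r.host, True):
            continue  # masked: the host itself is already reported
        alarms.append((r.host, r.probe))
    return alarms
```

With two failed probes on an unreachable host and one failed probe on a reachable host, the operator would see two alarms instead of three, and the first of them points at the likely root cause.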

Network Monitoring (continued)
Integrated network topology data
  – ENOC provided a static list of border routers for all sites
  – Nagios supports network hierarchy
  – in case of router failure, site resources are flagged as unreachable
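In Nagios the hierarchy is declared through the `parents` directive of a host definition. The sketch below models the resulting behaviour in plain Python under simplifying assumptions (an acyclic topology, one check result per host); the function name `classify_hosts` and the host names in the usage note are hypothetical.

```python
def classify_hosts(parents, check_ok):
    """Classify each host as 'UP', 'DOWN' or 'UNREACHABLE'.

    parents:  dict host -> list of parent hosts (e.g. the site border
              router); an empty or missing list means no parent.
    check_ok: dict host -> bool, result of the host's own check.
    A failing host is DOWN if it has no parents or at least one parent
    is UP, and UNREACHABLE if none of its parents is UP.
    Assumes the parent graph has no cycles.
    """
    state = {}

    def resolve(host):
        if host in state:
            return state[host]
        if check_ok[host]:
            state[host] = "UP"
        else:
            host_parents = parents.get(host, [])
            any_parent_up = any(resolve(p) == "UP" for p in host_parents)
            if not host_parents or any_parent_up:
                state[host] = "DOWN"
            else:
                state[host] = "UNREACHABLE"
        return state[host]

    for host in check_ok:
        resolve(host)
    return state
```

For example, if `router.site-a` fails its check and is the only parent of `ce.site-a` and `se.site-a`, both services come out as UNREACHABLE rather than DOWN, so only the router needs attention.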

Performance Monitoring - Grid
Several grid systems gather performance data
  – BDII, GridFTP transfers
  – Dashboards and VO-specific systems
Some raise alarms based on performance data

Performance Monitoring - Network
The majority of sites are without dedicated links
  – without SLAs, what should we alarm on?
Severe degradation of network performance
  – e.g. failure of the primary link
  – interpreted as service unavailability

Possible Future Work – Availability Monitoring
Lightweight checks improvement?
Dynamic network topology info?
Better integration with network monitoring systems?
End-to-end monitoring between sites?

Possible Future Work – Performance Monitoring
Dynamic performance testing
  – to distinguish between failure and severe degradation
  – interesting for grid services (job & file transfer management)
With dedicated links
  – monitoring network parameters
  – raising alarms in case of degradation
Monitoring dynamic link reservation
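One way to picture the distinction between failure and severe degradation is a test that compares a fresh measurement against a baseline of earlier ones. The sketch below is purely illustrative: the function `classify_transfer`, the use of the median as reference, and the 25% threshold are assumptions, not something proposed in the talk.

```python
from statistics import median


def classify_transfer(baseline_mbps, measured_mbps, degraded_fraction=0.25):
    """Classify one dynamic performance test.

    baseline_mbps: earlier throughput measurements (Mbit/s) for the path.
    measured_mbps: the new measurement, or None if the test did not finish.
    degraded_fraction: assumed threshold relative to the baseline median.
    """
    if measured_mbps is None or measured_mbps <= 0:
        return "FAILURE"  # nothing got through at all
    if not baseline_mbps:
        return "UNKNOWN"  # no history to compare against
    reference = median(baseline_mbps)
    if measured_mbps < degraded_fraction * reference:
        return "SEVERE_DEGRADATION"  # service works, but far below normal
    return "OK"
```

With a baseline of 800, 900 and 1000 Mbit/s, a measurement of 100 Mbit/s falls below a quarter of the median and is reported as severe degradation, while a test that never completes is reported as a failure.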

Conclusion
Multilevel monitoring provides the means for administrators to better monitor their services
Integration with existing components automates the operation of monitoring instances
Network monitoring is mainly focused on end-to-end links

Links
OAT web page
OAT Multi-level monitoring architecture itoringOverview

Thank You! Questions?