EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks WMSMonitor: a tool to monitor gLite WMS/LB.

Slides:



Advertisements
Similar presentations
LHC Experiment Dashboard Main areas covered by the Experiment Dashboard: Data processing monitoring (job monitoring) Data transfer monitoring Site/service.
Advertisements

ATLAS Off-Grid sites (Tier-3) monitoring A. Petrosyan on behalf of the ATLAS collaboration GRID’2012, , JINR, Dubna.
HPDC 2007 / Grid Infrastructure Monitoring System Based on Nagios Grid Infrastructure Monitoring System Based on Nagios E. Imamagic, D. Dobrenic SRCE HPDC.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Simply monitor a grid site with Nagios J.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Spyros Kopsidas Center for Research and.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks C. Loomis (CNRS/LAL) M.-E. Bégin (SixSq.
A.Guarise – F.Rosso 1 Enabling Grids for E-sciencE INFSO-RI Comprehensive Accounting Views on large computing farms. Andrea Guarise & Felice Rosso.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Julia Andreeva CERN (IT/GS) CHEP 2009, March 2009, Prague New job monitoring strategy.
CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services Job Monitoring for the LHC experiments Irina Sidorova (CERN, JINR) on.
INFSO-RI Enabling Grids for E-sciencE Workload Management System Mike Mineter
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The network monitoring in grid context Operations.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks C. Loomis (CNRS/LAL) M.-E. Bégin (SixSq.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks GStat 2.0 Joanna Huang (ASGC) Laurence Field.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Feb. 06, Introduction to High Performance and Grid Computing Faculty of Sciences,
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Multi-level monitoring - an overview James.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Performance Improvements to BDII - Grid Information.
Enabling Grids for E- sciencE EGEE and gLite are registered trademarks EGEE-III INFSO-RI Analysis of Overhead and waiting times.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Overview of STEP09 monitoring issues Julia Andreeva, IT/GS STEP09 Postmortem.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Stuart Kenny and Stephen Childs Trinity.
Enabling Grids for E-sciencE EGEE-III INFSO-RI EGEE Training Follow-on survey 2009 Conducted on past EGEE training course participants 120 respondents.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Services for advanced workflow programming.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Wojciech Lapka SAM Team CERN EGEE’09 Conference,
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Implementation and performance analysis of.
EGEE is a project funded by the European Union under contract INFSO-RI Practical approaches to Grid workload management in the EGEE project Massimo.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Ricardo Rocha CERN (IT/GS) EGEE’08, September 2008, Istanbul, TURKEY Experiment.
INFSO-RI Enabling Grids for E-sciencE GridICE: Grid and Fabric Monitoring Integrated for gLite-based Sites Sergio Fantinel INFN.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks MSG - A messaging system for efficient and.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Using GStat 2.0 for Information Validation.
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
Recent improvements in HLRmon, an accounting portal suitable for national Grids Enrico Fattibene (speaker), Andrea Cristofori, Luciano Gaido, Paolo Veronesi.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid2Win : gLite for Microsoft Windows Roberto.
Testing and integrating the WLCG/EGEE middleware in the LHC computing Simone Campana, Alessandro Di Girolamo, Elisa Lanciotti, Nicolò Magini, Patricia.
INFSO-RI Enabling Grids for E-sciencE /10/20054th EGEE Conference - Pisa1 gLite Configuration and Deployment Models JRA1 Integration.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI How to integrate portals with the EGI monitoring system Dusan Vudragovic.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks APEL CPU Accounting in the EGEE/WLCG infrastructure.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Update Authorization Service Christoph Witzig,
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks User traceability and log analysis tools.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Computational chemistry with ECCE on EGEE.
Computing Facilities CERN IT Department CH-1211 Geneva 23 Switzerland t CF CF Monitoring: Lemon, LAS, SLS I.Fedorko(IT/CF) IT-Monitoring.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid Monitoring Tools E. Imamagic, SRCE CE.
CERN IT Department CH-1211 Genève 23 Switzerland t CERN IT Monitoring and Data Analytics Pedro Andrade (IT-GT) Openlab Workshop on Data Analytics.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Deliverable DSA1.4 Jules Wolfrat ARM-9 –
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Xavier Jeannin (CNRS/UREC Paris, FR) 24.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CharonGUI A Graphical Frontend on top of.
EGEE is a project funded by the European Union under contract INFSO-RI Grid accounting with GridICE Sergio Fantinel, INFN LNL/PD LCG Workshop November.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Regional Nagios Emir Imamagic /SRCE EGEE’09,
EGEE-II INFSO-RI Enabling Grids for E-sciencE Practical using WMProxy advanced job submission.
03/09/2007http://pcalimonitor.cern.ch/1 Monitoring in ALICE Costin Grigoras 03/09/2007 WLCG Meeting, CHEP.
Enabling Grids for E-sciencE CMS/ARDA activity within the CMS distributed system Julia Andreeva, CERN On behalf of ARDA group CHEP06.
INFSO-RI Enabling Grids for E-sciencE gLite Test and Certification Effort Nick Thackray CERN.
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Overview of gLite, the EGEE middleware Mike Mineter Training Outreach Education National.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Job Management Claudio Grandi.
TIFR, Mumbai, India, Feb 13-17, GridView - A Grid Monitoring and Visualization Tool Rajesh Kalmady, Digamber Sonvane, Kislay Bhatt, Phool Chand,
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The Dashboard for Operations Cyril L’Orphelin.
HLRmon Enrico Fattibene INFN-CNAF 1EGI-TF Lyon, France19-23 September 2011.
Using HLRmon for advanced visualization of resource usage Enrico Fattibene INFN - CNAF ISCG 2010 – Taipei March 11 th, 2010.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CYFRONET site report Marcin Radecki CYFRONET.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Astrophysical Cluster Session Claudio Vuerli,
Enabling Grids for E-sciencE INFN Workshop – May 7-11 Rimini 1 Grid Accounting Status at INFN Riccardo Brunetti INFN-TORINO.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
Enabling Grids for E-sciencE Claudio Cherubino INFN DGAS (Distributed Grid Accounting System)
A Statistical Analysis of Job Performance on LCG Grid David Colling, Olivier van der Aa, Mona Aggarwal, Gidon Moont (Imperial College, London)
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Argus: command line usage and banning Christoph.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Status of the SAM/Nagios/GSTAT Components.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks MyEGEE David Horat (
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Nagios Grid Monitor E. Imamagic, SRCE OAT.
Job monitoring and accounting data visualization
Danilo Dongiovanni INFN-CNAF
Presentation transcript:

EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMSMonitor: a tool to monitor gLite WMS/LB cluster status and job workflow Daniele Cesini, Danilo Dongiovanni, Enrico Fattibene INFN-CNAF EGEE08 – Sept Istanbul

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 2 Motivation of the work Workload Management System (WMS) and Logging and Bookkeeping (LB) Service have a complex internal structure and knowing their status, who and how is using them is challenging A site can run many WMS/LB instances At the same time WMS/LB services are an interesting source of information about Job Lifecycle and resource usage by the VOs The middleware is not currently providing any monitoring facilities Importance of having an efficient monitoring system aggregating information from internal components and from various instances

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 3 WMS/LB administrators to check the cluster status, who is using it and how WMS developers and advanced users to benchmark the service performance and test its scalability Resource Center managers that need per-VO aggregated statistics on usage and service availability VO managers to obtain aggregated job statistics, e.g. to cross check their monitoring systems Target Users

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 4 Web presentation: cluster overview

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 5 WMS/LB instance details view Textual boxes report latest series of acquired data Top charts represent status history of Condor Jobs (left) and WMS internal components queues (right) Bottom charts represent history of job flow rates between components A CMS use case using collections and BulkMM

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 6 WMS instance details/ Daily Report Daily summary of Job flow through the WMS components, including: -Resubmission of failed jobs -Number of jobs in successful final state -Number of jobs in aborted final status.

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 7 WMS instance details/ Custom Plot

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 8 WMS cluster VO stats Statistics on per WMS usage by a single VO (chart or tabular format). Time interval is configurable

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 9 Working on… User level statistics  Dynamical VO discover Resource Usage Statistics: –Destination CE –Number of matched CE per job DB redesign Distributed instances monitoring

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 10 Architecture/Implementation SNMP based data transport MySQL backend Sensors and data collector written mostly in PYTHON Web interface developed in PHP Open Flash Chart libraries based plots Periodically sends information to a NAGIOS server which acts as a notification system

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 11 Contacts and Acknowledgments CNAF Production Instance: PADOVA/EU-INDIA Production Instance: Wiki, Documentation, Download, Support: wms-support cnaf.infn.it Special Thanks to all gLite WMS / LB developers

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 12 Backup slides

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 13 gLiteWMS / gLiteLB architecture Interface Core Back End Cache of Grid Information System Logging & Bookkeeping

Enabling Grids for E-sciencE EGEE-III INFSO-RI WMSMonitor - EGEE08, Istanbul - Turkey 14 Metrics considered Adopted metrics are of three types: –Grid service metrics: daemons status, number of opened file descriptors, entries in component queues, number of available CE queues, open connections on ports, Condor Job stats –System metrics: CPU load average, % occupacy of disk partitions –Job flow metrics: Job submitted from users, Job Input/Output for each component in the WMS, Job Successfully Completed / Aborted Jobs inJobs out - Daemons Status - File Descriptors - Queues - Open connections on ports - Available Grid Information - Treated Job status WMS Service Component