CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it. The Experiment Dashboard, ISGC 2008, 9-11 April 2008. Pablo Saiz, Julia Andreeva, Benjamin.

Presentation transcript:

CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it
The Experiment Dashboard
ISGC 2008, 9-11 April 2008
Pablo Saiz, Julia Andreeva, Benjamin Gaidioz, Anastasia Ivanchenko, Gerhild Maier, Ricardo Rocha, Irina Sidorova
IT-GS-MND

Overview
- Dashboard structure
- Dashboard in production
  o Job Monitoring
  o Grid reliability
  o Prodsys
  o Data Management
  o SAM
  o FTS monitoring
  o Site status board
- Future development
- Conclusions

Dashboard Framework
- Components: Web / HTTP Interface, Data Access Layer (DAO), Agents, Oracle DB
- Data Access Layer (DAO)
  o DB reading and writing via the DAO layer
  o Connection pooling
  o Easy to add an interface for a different backend
- Agents
  o Collectors of information
  o Common configuration and management
- Web / HTTP Interface
  o Multiple clients: CLI, web
  o Multiple output formats: plain text, CSV, XML, XHTML
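
The separation between the DAO layer and the output formats described on this slide can be illustrated with a minimal Python sketch. Everything named here (`JobDAO`, `InMemoryJobDAO`, `render`, the row fields and the site names) is hypothetical and chosen for illustration; it is not the Dashboard's actual code, and the in-memory backend only stands in for a real database.

```python
import csv
import io
from abc import ABC, abstractmethod
from xml.etree import ElementTree as ET


class JobDAO(ABC):
    """Hypothetical data-access interface: clients never query the DB directly."""

    @abstractmethod
    def jobs_by_site(self, site):
        """Return a list of dict rows for the given site."""


class InMemoryJobDAO(JobDAO):
    """Illustrative backend; a database-backed class would implement the same method."""

    def __init__(self, rows):
        self._rows = rows

    def jobs_by_site(self, site):
        return [row for row in self._rows if row["site"] == site]


def render(rows, fmt):
    """Serialise the same rows as plain text, CSV or XML."""
    if fmt == "txt":
        return "\n".join(" ".join(f"{k}={v}" for k, v in row.items()) for row in rows)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=list(rows[0]) if rows else [])
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()
    if fmt == "xml":
        root = ET.Element("jobs")
        for row in rows:
            ET.SubElement(root, "job", {k: str(v) for k, v in row.items()})
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported format: {fmt}")


# Made-up sample data.
dao = InMemoryJobDAO([
    {"id": 1, "site": "SITE_A", "status": "done"},
    {"id": 2, "site": "SITE_B", "status": "failed"},
])
print(render(dao.jobs_by_site("SITE_A"), "csv"))
```

Because the clients depend only on the `JobDAO` interface, swapping the backend or adding an output format touches one class or one branch, which is the property the slide highlights.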

Dashboard activities
- Experiment Dashboard common applications (ALICE, ATLAS, CMS, LHCb, VLEMED)
  o Job monitoring
  o Site reliability
- Experiment specific applications / CMS integration and commissioning
  o Transfer monitoring for ALICE
  o Data management monitoring for ATLAS
  o Production monitoring for ATLAS and CMS (prototypes)
  o IO rate monitoring between WN and SE (prototype)
  o Site availability based on the results of SAM tests
  o Job Robot monitoring
  o Accounting information from APEL and Gratia for ATLAS (prototype)
  o Task monitoring for CMS analysis users (ATLAS on the way)

Job Monitoring
- Display all the jobs submitted by a VO
  o Follow the status of the jobs
- Collect information from different sources
  o RGMA, IC Real Time Monitor, BDII, MonALISA, …
- Very useful for VO managers, site admins, users
- Possibility to get the output in different formats
- Deployed for ALICE, ATLAS, CMS, LHCb and VLEMED
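
Collecting job status from several sources implies some rule for reconciling conflicting reports. The Python sketch below assumes the simplest possible rule, "the most recent report wins"; the slide does not say how the Dashboard actually merges sources, and `StatusReport`, `merge_reports` and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusReport:
    job_id: str
    status: str
    timestamp: int  # seconds, as reported by the source (illustrative)
    source: str     # label only, e.g. "RGMA" or "MonALISA"


def merge_reports(reports):
    """Keep, for every job, the report with the latest timestamp (assumed rule)."""
    latest = {}
    for report in reports:
        current = latest.get(report.job_id)
        if current is None or report.timestamp > current.timestamp:
            latest[report.job_id] = report
    return latest


# Made-up sample reports from two sources.
reports = [
    StatusReport("job-1", "submitted", 100, "RGMA"),
    StatusReport("job-1", "running", 160, "MonALISA"),
    StatusReport("job-2", "done", 120, "RGMA"),
]
merged = merge_reports(reports)
for job_id, report in sorted(merged.items()):
    print(job_id, report.status, report.source)
```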

Site Reliability
- Efficiency of the different sites
  o Jobs and job attempts
- List of most common errors
  o And recipes for solving them
- Generic application
- Automatic generation of monthly reports
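
The distinction between job efficiency and job-attempt efficiency can be made concrete with a short Python sketch. It assumes that a job counts as successful if at least one of its attempts succeeded, which is an assumption for illustration rather than the Dashboard's documented definition; the record layout, site names and error strings are made up.

```python
from collections import Counter, defaultdict

# One record per attempt: (site, job_id, error); error is None on success.
attempts = [
    ("SITE_A", "j1", "timeout"),
    ("SITE_A", "j1", None),
    ("SITE_A", "j2", None),
    ("SITE_B", "j3", "timeout"),
    ("SITE_B", "j3", "no space left"),
]


def site_efficiency(attempts):
    """Return per-site job efficiency and attempt efficiency."""
    stats = defaultdict(
        lambda: {"attempts": 0, "ok_attempts": 0, "jobs": set(), "ok_jobs": set()}
    )
    for site, job_id, error in attempts:
        s = stats[site]
        s["attempts"] += 1
        s["jobs"].add(job_id)
        if error is None:
            s["ok_attempts"] += 1
            s["ok_jobs"].add(job_id)  # assumed: one good attempt makes the job good
    return {
        site: {
            "job_efficiency": len(s["ok_jobs"]) / len(s["jobs"]),
            "attempt_efficiency": s["ok_attempts"] / s["attempts"],
        }
        for site, s in stats.items()
    }


def most_common_errors(attempts, n=3):
    """Rank error messages by how often they occur."""
    return Counter(err for _, _, err in attempts if err is not None).most_common(n)


efficiency = site_efficiency(attempts)
print(efficiency)
print(most_common_errors(attempts))
```

In this sample, a site can finish every job and still show a lower attempt efficiency, because retries hide failures at the job level.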

Production System
- ATLAS Prodsys
- Identify failing tasks and jobs
- Evaluate the performance of the sites
- Daily/weekly/monthly statistics
- User guide

Data Management
- Monitoring of T0 and the Production system
- Report of transfers to the different sites
- Integrated with the ATLAS management system
- Information on the clouds, sites, SEs and datasets
- History of errors

FTS reliability
- Daily report on the success of transfers
- Drill-down list of errors
- Integrated in the ALICE environment
- Extremely useful during the different ALICE challenges: PDC06, PDC07, CCRC08
- Working on making it generic
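
A daily success report with a drill-down list of errors could be aggregated along the following lines. This Python sketch only illustrates the idea under stated assumptions: the record layout, the dates and the error strings are invented, and `daily_report` is a hypothetical helper, not part of FTS or the Dashboard.

```python
from collections import Counter, defaultdict
from datetime import date

# One record per transfer: (day, error); error is None on success. Made-up data.
transfers = [
    (date(2008, 4, 9), None),
    (date(2008, 4, 9), "destination error"),
    (date(2008, 4, 9), "destination error"),
    (date(2008, 4, 10), None),
]


def daily_report(transfers):
    """Return, per day, the success rate and the errors ranked by frequency."""
    days = defaultdict(lambda: {"total": 0, "errors": Counter()})
    for day, error in transfers:
        days[day]["total"] += 1
        if error is not None:
            days[day]["errors"][error] += 1
    report = {}
    for day, d in sorted(days.items()):
        failed = sum(d["errors"].values())
        report[day] = {
            "success_rate": (d["total"] - failed) / d["total"],
            "errors": d["errors"].most_common(),  # drill-down list
        }
    return report


report = daily_report(transfers)
for day, entry in report.items():
    print(day.isoformat(), round(entry["success_rate"], 2), entry["errors"])
```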

SAM monitoring
- Service Availability Monitoring
- Clickable plots to drill down:
  o Site availability
  o Service availability
  o Service tests
- Links to the SAM results
- At the moment, only for CMS
- ATLAS requested a similar interface
- Ongoing work to make it generic

Site Status Board
- Table with the status of the different sites for CMS
- Easy definition of new 'metrics'
  o The 'metrics' can come from different sources
- Links to more detailed information
- At the moment, deployed for CMS
  o It could be used by other VOs
- Working on providing history
  o And aggregation…
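
One way to picture "easy definition of new metrics from different sources" is a small registry in which each metric is a named function that returns a status for a site. The Python sketch below is an assumption-laden illustration: `register_metric`, `status_table`, the two sample sources and the 0.8 threshold are all hypothetical and not taken from the Site Status Board itself.

```python
from typing import Callable, Dict, List

Metric = Callable[[str], str]  # takes a site name, returns a status string
METRICS: Dict[str, Metric] = {}


def register_metric(name: str, fetch: Metric) -> None:
    """Add a metric; the board itself needs no other change."""
    METRICS[name] = fetch


def status_table(sites: List[str]) -> Dict[str, Dict[str, str]]:
    """Build one row per site with one column per registered metric."""
    return {
        site: {name: fetch(site) for name, fetch in METRICS.items()}
        for site in sites
    }


# Two made-up sources standing in for external systems.
sam_results = {"SITE_A": "ok", "SITE_B": "error"}
job_success = {"SITE_A": 0.95, "SITE_B": 0.40}

register_metric("sam", lambda site: sam_results.get(site, "unknown"))
register_metric(
    "jobs",
    # 0.8 is an arbitrary illustrative threshold.
    lambda site: "ok" if job_success.get(site, 0.0) >= 0.8 else "warning",
)

table = status_table(["SITE_A", "SITE_B"])
for site, row in table.items():
    print(site, row)
```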

Experiment Dashboard plans
- Include more data sources: condor_g, L&B
- Security: X509 authentication
- New applications:
  o Pilot jobs
  o Input collections
- Improve existing applications
  o Make the SAM interface generic
  o More in-depth failure analysis
  o User requests and suggestions
- Integration with the GridMap technology

Conclusions
- The Experiment Dashboard provides:
  o Several monitoring applications
  o Integration of information from different sources
  o Multiple output formats: html, xml, csv, txt..
- Generic applications:
  o Job Monitoring, Grid reliability
- Experiment specific:
  o DDM, ProdSys, Site Status Board, SAM, …
- Used in production by multiple VOs
- User, installation and developer guides