On behalf of D. Colling, G. Moont, M. Aggarwal


Presentation transcript:

RTM for monitoring
http://gridportal.hep.ph.ic.ac.uk/rtm/
O. van der Aa, o.vanderaa@imperial.ac.uk
e-Science, HEP, Imperial College London
On behalf of D. Colling, G. Moont, M. Aggarwal

Changes in the RTM
Big changes in the underlying design allow for more flexibility.
51 Resource Brokers are now monitored.
Other EGEE Grid projects have requested to be monitored: EUMED, EUCHINA, EELA.
Historical data is available and has been taken by several groups.
Real-time data is being visualised in new ways.
15/09/2006 RTM for monitoring – o. van der Aa

RTM, the Applet
The original form of the Monitor, popular as a demo.
Problem: users are unaware of the full capabilities available by clicking in the Key, such as selection by VO and/or RB.

RTM, Google Earth
Static view of the Grid.
Shows a plot of running jobs for each site you click on.

RTM, real-time plots
The RTM keeps all job states in a PostgreSQL database.
Round-robin archives are then produced to allow real-time plotting of the number of jobs in any given state.
Good for real-time monitoring of the Grid activity.
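To illustrate the round-robin idea mentioned above, here is a minimal sketch in Python. It is not the RTM's implementation and does not use any archive tool's API: the class name, the slot count and the state names are all assumptions made for illustration. It only shows the defining property of a round-robin archive, namely a fixed number of slots in which the newest sample overwrites the oldest.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive: once full, each new sample overwrites the oldest."""

    def __init__(self, slots):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.samples = deque(maxlen=slots)

    def add(self, timestamp, counts_by_state):
        # Store a copy so later changes to the caller's dict do not leak in.
        self.samples.append((timestamp, dict(counts_by_state)))

    def series(self, state):
        """Return (timestamp, count) pairs for one job state, oldest first."""
        return [(t, counts.get(state, 0)) for t, counts in self.samples]


# Hypothetical samples: one per minute, with made-up job counts.
archive = RoundRobinArchive(slots=3)
for minute, running in enumerate([10, 12, 15, 11]):
    archive.add(minute, {"Running": running, "Scheduled": 2})

running_series = archive.series("Running")
print(running_series)
```

Because the storage size is fixed, plotting cost does not grow with the age of the archive, which is what makes this layout suitable for real-time views.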

What does it look like?
See https://gfe03.hep.ph.ic.ac.uk:4175
Select a set of VOs and CEs and the time period for the plot.
One plot stacked by VO.
One plot stacked by CE.

RTM, running jobs 1 month back
[Plot: running jobs for the whole Grid over the last month, by VO: lhcb, cms, atlas, alice, biomed]

View per country
[Plots: UK, France, Italy, Switzerland]

Embedding graphs in your web pages
https://gfe03.hep.ph.ic.ac.uk:4175/cgi-bin/googlegraph.cgi?
Arguments:
ce=[yource1]&ce=[yource2] (if no ce is given, all the existing ones are plotted)
filter=[country] (only the CEs in that country are shown)
date=-1w
w=800 (width)
h=400 (height)
Examples:
googlegraph.cgi?ce=gw39.ph.ic.ac.uk&date=-1w&w=800&h=400
googlegraph.cgi?filter=uk&date=-1w&w=800&h=400
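As a sketch of how such a graph URL could be assembled programmatically, the Python helper below builds the query string from the arguments listed on the slide (ce, filter, date, w, h). The function name graph_url and its default values are assumptions for illustration. The script name is written in lowercase, as in the full URL on the slide.

```python
from urllib.parse import urlencode

# Base URL as given on the slide.
BASE_URL = "https://gfe03.hep.ph.ic.ac.uk:4175/cgi-bin/googlegraph.cgi"


def graph_url(ces=(), country=None, date="-1w", width=800, height=400):
    """Build a graph URL; graph_url itself is a hypothetical helper."""
    # The ce argument may be repeated, once per computing element.
    params = [("ce", ce) for ce in ces]
    if country is not None:
        params.append(("filter", country))
    params += [("date", date), ("w", width), ("h", height)]
    return BASE_URL + "?" + urlencode(params)


print(graph_url(ces=["gw39.ph.ic.ac.uk"]))
print(graph_url(country="uk"))
```

The resulting URL can then be used as the source of an image element in a web page.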

RTM for detailed analysis
Round-robin archives are fast for rendering views of real-time data over long periods.
They contain averages of the number of jobs in a given state.
For more detailed analysis we need the full data on a per-job basis (jobid).
ROOT is used to store the timings of the job state transitions.
All the states the job went through are also stored.

Where to find the ROOT and ASCII data
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/reports/ascii_report_data_2006-05-01.dat
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/reports/root_report_data_2006-05-01.root
The daily data covers jobs that the RTM considers "finished" within a 24-hour period (midnight to midnight, UK local time).
"Finished" means that a job was either CLEARED by a user or had been sitting in a DONE / ABORTED / CANCELLED state for over 2 hours.
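The "finished" rule described above can be written as a small predicate. This is only a sketch of the rule as stated on the slide, not the RTM's own code: the function name, the use of Python datetime objects for the time a state was entered, and the case-insensitive comparison are assumptions.

```python
from datetime import datetime, timedelta

# States in which a job counts as finished once it has sat there long enough.
TERMINAL_STATES = {"DONE", "ABORTED", "CANCELLED"}
GRACE_PERIOD = timedelta(hours=2)


def is_finished(status, state_entered, now):
    """Apply the slide's rule: CLEARED, or terminal for over 2 hours."""
    status = status.upper()
    if status == "CLEARED":
        return True
    return status in TERMINAL_STATES and (now - state_entered) > GRACE_PERIOD


now = datetime(2006, 5, 1, 12, 0)
print(is_finished("DONE", datetime(2006, 5, 1, 9, 0), now))
print(is_finished("DONE", datetime(2006, 5, 1, 11, 0), now))
```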

Job states data
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/all.dat
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/update.dat
Their format is given by the following Java code snippet (all.dat does NOT have the rtm_timestamp):
println( rbAddress + "\t" + jobid + "\t" + status + "\t" + state_entered + "\t" + registered + "\t" + ui + "\t" + ce + "\t" + queue + "\t" + wn + "\t" + vo + "\t" + rtm_timestamp ) ;
By reading all.dat once, and then rereading update.dat exactly once a minute afterwards, you should be able to maintain a current view.
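A minimal Python sketch of a reader for this tab-separated format follows. The field order is taken from the println snippet on the slide. Everything else is an assumption: the function names, keeping all values as strings, keying the current view by jobid, and the sample lines, which are invented and do not come from the real files. Fetching the two files (once for all.dat, then once a minute for update.dat) is left out.

```python
# Field order as in the println snippet; all.dat lacks the last field.
FIELDS = ["rbAddress", "jobid", "status", "state_entered", "registered",
          "ui", "ce", "queue", "wn", "vo", "rtm_timestamp"]


def parse_line(line):
    """Split one tab-separated record into a dict of strings."""
    values = line.rstrip("\n").split("\t")
    # zip stops at the shorter sequence, so all.dat lines simply
    # produce no rtm_timestamp key.
    return dict(zip(FIELDS, values))


def apply_records(view, lines):
    """Merge records into the view; assumes jobid identifies a job."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = parse_line(line)
        view[record["jobid"]] = record
    return view


# Hypothetical sample lines, for illustration only.
all_lines = ["rb.example.org\tjob-1\tScheduled\t100\t90\t"
             "ui.example.org\tce.example.org\tshort\twn01\tatlas\n"]
update_lines = ["rb.example.org\tjob-1\tRunning\t160\t90\t"
                "ui.example.org\tce.example.org\tshort\twn01\tatlas\t170\n"]

view = apply_records({}, all_lines)      # initial snapshot
apply_records(view, update_lines)        # one update cycle
print(view["job-1"]["status"])
```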

Examples (Jan-June data)
Fractional useful time for atlas
Fractional useful time = Total Successful Hours / Total Hours
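The ratio of successful hours to total hours can be computed as in the short Python sketch below. The input layout (a list of pairs of hours and a success flag) and the sample numbers are assumptions for illustration. The slides do not define which jobs count as successful, so that decision is left to the caller.

```python
def fractional_useful_time(jobs):
    """Return successful hours / total hours, or None if there are no hours.

    jobs is a list of (hours, successful) pairs; this layout is hypothetical.
    """
    total = sum(hours for hours, _ in jobs)
    if total == 0:
        return None  # avoid dividing by zero when there is no data
    useful = sum(hours for hours, ok in jobs if ok)
    return useful / total


# Made-up sample: 7 successful hours out of 8 in total.
sample_jobs = [(3.0, True), (1.0, False), (4.0, True)]
fraction = fractional_useful_time(sample_jobs)
print(fraction)
```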

More examples: fractional useful time per VO
[Plot: fractional useful time]

Example: WMS monitoring
Job scheduling (match time) versus load (mean number of jobs/sec during the matching)

Conclusion
RTM is more than the applet.
It can provide RRD archives for real-time plotting:
  Number of jobs in a given state
  Per-CE view
  Per-VO view
  Could measure abort rate and trigger alarms
It also provides ROOT files for detailed historical analysis:
  Timing analysis of job cycles
  WMS monitoring
  Efficiency (useful time)
Please feel free to use the ROOT/ASCII and round-robin data.

URLs
http://gridportal.hep.ph.ic.ac.uk/rtm/
https://gfe03.hep.ph.ic.ac.uk:4175
The historical data:
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/reports/ascii_report_data_2006-05-01.dat
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/reports/root_report_data_2006-05-01.root
The real-time data (job states, CE, RB, etc.):
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/all.dat
http://gridportal.hep.ph.ic.ac.uk/rtm/resource-brokers/update.dat