Accounting in LCG/EGEE Can We Gauge Grid Usage via RBs? Dave Kant CCLRC, e-Science Centre.



EGEE’04 PISA, Oct
APEL in LCG/EGEE: Overview
- Accounting in LCG/EGEE
- Data Consolidation
- Determining Grid Usage at the RB Level

APEL, Job Accounting Flow Diagram
[1] Build Job Accounting Records at site.
[2] Send Job Records to a central repository.
[3] Data Aggregation.

Accounting for Grid Jobs: Build Job Records at Site
APEL maps grid users to the resource usage on local farms.

GOC Consolidation of Data
- Job records come in via RGMA (RGMA MON): 1 record per grid job (millions of records expected).
- SQL query to the accounting server, 1 query per hour: summary data refreshed every hour (max records about 100K per year).
- On-demand accounting pages (home page, user queries, graphs) are based on SQL queries to the summary data.
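The consolidation step described on this slide (many per-job records collapsed into a much smaller summary table) can be sketched roughly as below. This is only an illustration: the record fields (`site`, `vo`, `end_time`, `cpu_s`, `wall_s`) and the grouping key are assumptions made for the example, not the actual APEL schema.

```python
from collections import defaultdict

def summarise(job_records):
    """Collapse per-job records into one summary row per (site, VO, month).

    Field names are hypothetical; the real APEL schema may differ.
    """
    summary = defaultdict(lambda: {"njobs": 0, "cpu_s": 0, "wall_s": 0})
    for rec in job_records:
        month = rec["end_time"][:7]  # 'YYYY-MM' prefix of an ISO timestamp
        row = summary[(rec["site"], rec["vo"], month)]
        row["njobs"] += 1
        row["cpu_s"] += rec["cpu_s"]
        row["wall_s"] += rec["wall_s"]
    return dict(summary)

# Made-up records, for illustration only.
records = [
    {"site": "SITE-A", "vo": "atlas", "end_time": "2005-08-03T10:00:00",
     "cpu_s": 3600, "wall_s": 4000},
    {"site": "SITE-A", "vo": "atlas", "end_time": "2005-08-17T12:30:00",
     "cpu_s": 1800, "wall_s": 2000},
    {"site": "SITE-A", "vo": "cms", "end_time": "2005-08-20T09:15:00",
     "cpu_s": 600, "wall_s": 900},
]

for key, row in sorted(summarise(records).items()):
    print(key, row)
```

Because the summary has one row per group rather than one per job, its size depends on the number of sites, VOs and months, not on the number of jobs.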

Accounting Home Page
- 127 sites publishing data (Oct )
- 3.9 million job records
- ~100K records per week (period June – Oct 2005)

Batch Support in APEL
Currently available in LCG 2.6:
- OpenPBS, Torque, PBSPro and Vanilla PBS → ~90% of sites in LCG/EGEE
- Load Sharing Facility (Versions 5 and 6) → CERN, Italy
Work in progress (LCG 2.7):
- Condor → Canada, Cambridge
- Sun Grid Engine → Imperial College

Demos of Accounting Aggregation
Global views of resource consumption.
- LCG View → shows aggregation for each LHC VO. Requirements driven by CRRB / Kors Bos. Tier-1 and country entry points. All data normalised in units of SI2000.Hour. Tabular summaries per Tier1 / country.
- GridPP View → shows aggregation for ROCs and organisations.
- CESGA View → prototype for an EGEE view of resource consumption?
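The slide states only that data are normalised in units of SI2000.Hour. A minimal sketch of one plausible normalisation follows, assuming it is a simple scaling of CPU hours by the SpecInt2000 rating of the machine that ran the job; the function name and the ratings used are hypothetical.

```python
def si2000_hours(cpu_seconds, si2000_rating):
    """Scale raw CPU time by a SpecInt2000 rating (assumed linear scaling).

    Returns usage in SI2000.Hour, so work done on fast and slow
    machines can be added together on a common scale.
    """
    return cpu_seconds / 3600.0 * si2000_rating

# One hour on a node rated 500 counts the same as half an hour
# on a node rated 1000 (ratings are made up for illustration).
slow_node = si2000_hours(3600, 500)
fast_node = si2000_hours(1800, 1000)
print(slow_node, fast_node)
```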

LHC View: Data Aggregation For VOs per Tier1, per Country

Organisation Structures
Consolidated views of resource usage: Tier1 and Tier2 within an EGEE ROC

Aggregation of Data for GridPP Community

Aggregation of Data for Tier2

Data Aggregation at Site Level
- Breakdown of data per VO per month showing Njobs, CPUt, WCT, record history
- Total CPU usage per VO
- Gantt chart
NB: Gaps across all VOs are consistent with scheduled downtimes in GOCDB.

Provides reporting for the SWE federation. Protected area for users to get usage at DN level.

Determining Usage of the Grid
How many different ways are there to submit a job on the grid? Broadly speaking, there are two categories: indirect and direct.
INDIRECT: using a resource broker and a JDL
- Example: via a known “core” RB for SFT monitoring jobs
- Or via a private RB to target national resources
DIRECT: globus-job-run, specifying the target resource
- Example: an ATLAS production run earlier in 2005 which required 1 hr CPU time, but took 7 seconds to submit via an RB. Thus, the RBs were a bottleneck. CPU usage per site for this run is available from the ATLAS production database.
- Significant usage via this path: 50% Rome LCG Production

Determining Usage of the Grid (diagram)
- Indirect submission goes via the RB to the computing element / batch farm manager and its worker node (WN) pool. The RB's L&B captures usage via state changes.
- Direct job submission (globus-job-run) goes straight to the computing element.
- APEL/DGAS capture job usage at the end point.

Comparing Numbers: APEL vs JRA2
Can RB L&B be used to gauge use of the grid?
- Not all RBs have RGMA publishers installed
- RBs are not the only gateway to grid resources
Sum over RBs to find total usage for each site and compare with the numbers at the end point.
Example: SWE (PIC) Tier-1 that has deployed APEL (complete dataset).
Table columns: PERIOD, Total (Njobs), ATLAS, CMS, DTEAM, LHCB, BIOMED
FEB 2005: LB = 20, APEL =
AUG 2005: LB = 4327, APEL =
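The comparison described here (sum the L&B job counts over all RBs for each site, then set them against the APEL count at the end point) might look like the sketch below. The data layout, the function name and every number are invented for illustration; they are not results from the talk.

```python
def compare_counts(lb_by_rb, apel_by_site):
    """Sum L&B job counts over all RBs per site and compare with APEL.

    lb_by_rb:     {rb_name: {site: njobs seen by that RB's L&B}}
    apel_by_site: {site: njobs recorded by APEL at the end point}
    Both layouts are assumptions made for this example.
    """
    lb_by_site = {}
    for per_site in lb_by_rb.values():
        for site, njobs in per_site.items():
            lb_by_site[site] = lb_by_site.get(site, 0) + njobs

    report = {}
    for site in sorted(set(lb_by_site) | set(apel_by_site)):
        lb = lb_by_site.get(site, 0)
        apel = apel_by_site.get(site, 0)
        # Jobs APEL saw that no publishing RB reported, e.g. direct
        # submissions or RBs without an RGMA publisher.
        report[site] = {"lb": lb, "apel": apel, "unseen_by_rbs": apel - lb}
    return report

# Made-up numbers, for illustration only.
lb_by_rb = {"rb1": {"SITE-A": 30, "SITE-B": 10}, "rb2": {"SITE-A": 20}}
apel_by_site = {"SITE-A": 120, "SITE-B": 10}

for site, row in compare_counts(lb_by_rb, apel_by_site).items():
    print(site, row)
```

A large positive `unseen_by_rbs` for a site would suggest that the RB view alone under-reports usage there.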

Service Metrics for RB
Rather than gauge grid usage on a site or VO basis, why not try it on a job-by-job basis?
The Metrics WG wants to monitor the availability of RBs using test jobs that have a known finite WCT on the target resource.
Compare APEL information with L&B information for these test jobs.

RB Service Monitoring: Metrics WG
Step 1: Submit a known job via a known RB to a known CE.
Diagram labels: GOC (UI), job script, edg-job-submit, RB to test, EDG_WL_JOBID, Timestamp_1; GOCDB (CEs, RBs), select good CEs from SFT; CE, Other.GlueCEUniqueID; WN, OK; received acknowledgement, Timestamp_2.

RB L&B Data
RGMA query of the JobStatusRaw table, which holds L&B data.
Record the states of the RB for the specified job:
- Submitted
- Waiting
- Ready
- Scheduled
- Running
- Done
- Cleared
Correlate time in the running state with the APEL WCT.
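The correlation step can be sketched as follows: take the state changes recorded for one job, measure the interval between entering Running and entering Done, and check it against the wall-clock time reported by APEL. The event layout, the function names and the 60 second tolerance are assumptions for this example; the real JobStatusRaw columns are not shown in the slides.

```python
from datetime import datetime

def running_seconds(events):
    """Return seconds between the Running and Done states, or None.

    events: list of (state, ISO timestamp) tuples for a single job.
    If a state occurs more than once, the last occurrence is used.
    """
    times = {state: datetime.fromisoformat(ts) for state, ts in events}
    if "Running" not in times or "Done" not in times:
        return None  # job never ran, or has not finished yet
    return (times["Done"] - times["Running"]).total_seconds()

def matches_wct(events, apel_wct_s, tolerance_s=60):
    """True if the L&B running time agrees with APEL's WCT within tolerance."""
    running = running_seconds(events)
    return running is not None and abs(running - apel_wct_s) <= tolerance_s

# Made-up state changes for one test job.
events = [
    ("Submitted", "2005-10-01T10:00:00"),
    ("Waiting",   "2005-10-01T10:00:05"),
    ("Ready",     "2005-10-01T10:00:20"),
    ("Scheduled", "2005-10-01T10:01:00"),
    ("Running",   "2005-10-01T10:05:00"),
    ("Done",      "2005-10-01T10:35:00"),
]

print(running_seconds(events))
print(matches_wct(events, apel_wct_s=1790))
```

For a test job with a known finite WCT, a mismatch between the two figures would point at a gap in either the L&B record or the APEL record for that job.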

Alternative Approaches
Compare on a job-by-job basis via the WMS job ID:
- DGAS – APEL integration
- APEL support for Condor
Any other ways?