Grid Operations Centre LCG Accounting Trevor Daniels, John Gordon GDB 8 Mar 2004.



LCG Accounting Overview

1. The PBS log is processed daily on the site CE to extract the required data; the filter acts as an R-GMA DBProducer -> PbsRecords table.
2. The gatekeeper log is processed daily on the site CE to extract the required data; the filter acts as an R-GMA DBProducer -> GkRecords table.
3. The job manager log is processed daily on the site CE to extract the required data; the filter acts as an R-GMA DBProducer -> PbsJobIds table.
4. The site GIIS is interrogated daily on the site CE to obtain SpecInt and SpecFloat values for the CE; this acts as a DBProducer -> SpecRecords table, one dated record per day.
5. These four tables are joined daily on MON to produce the LcgRecords table. As each record is produced, the program acts as a StreamProducer to send the entries to the LcgRecords table at the GOC site.
6. The site now has a table containing its own accounting data; the GOC has an aggregated table over the whole of LCG.
7. Interactive and regular reports are produced by the site, or at the GOC site, as required.
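The daily four-table join in step 5 can be sketched with an in-memory database. The table names follow the slides, but the column names and join keys (LocalJobId, GkJobId, EventDate) are illustrative assumptions, not the real R-GMA schema:

```python
import sqlite3

# In-memory stand-in for the site MON database. Table names are from the
# slides; columns and join keys are assumptions for illustration only.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE PbsRecords (LocalJobId TEXT, CpuDuration INT, WallDuration INT, EventDate TEXT);
CREATE TABLE GkRecords  (GkJobId TEXT, GlobalUserName TEXT, EventDate TEXT);
CREATE TABLE PbsJobIds  (GkJobId TEXT, LocalJobId TEXT);
CREATE TABLE SpecRecords(EventDate TEXT, SpecInt2000 REAL, SpecFloat2000 REAL);
""")
db.execute("INSERT INTO PbsRecords VALUES ('42.ce', 3600, 4000, '2004-03-01')")
db.execute("INSERT INTO GkRecords  VALUES ('gk-7', '/C=UK/O=eScience/CN=A User', '2004-03-01')")
db.execute("INSERT INTO PbsJobIds  VALUES ('gk-7', '42.ce')")
db.execute("INSERT INTO SpecRecords VALUES ('2004-03-01', 1000.0, 900.0)")

# The daily join that builds one LcgRecords row per job: PBS usage,
# gatekeeper identity and the day's SpecInt rating are stitched together
# via the GK-to-PBS job-id mapping.
rows = db.execute("""
SELECT g.GlobalUserName, p.CpuDuration, p.WallDuration, s.SpecInt2000
FROM PbsRecords p
JOIN PbsJobIds   m ON m.LocalJobId = p.LocalJobId
JOIN GkRecords   g ON g.GkJobId    = m.GkJobId
JOIN SpecRecords s ON s.EventDate  = p.EventDate
""").fetchall()
print(rows)
```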

[Diagram: LCG Accounting Flow - on each LCG site, the CE's PBS log, gatekeeper log and site GIIS feed filters; the site MON node joins the streams; records flow to the Accounting DB at the GOC site, which drives the GOC Reports.]

Changes to GK Logs

- The way in which the gatekeeper records information relevant to accounting was changed in LCG1_1_1_0, issued on 24 Oct 2003, when the lcgpbs job manager was introduced. The implications of this change were not communicated to GOC, and were not discovered until February, during integration tests of the final system with live data. The code had been developed against the earlier log formats and required substantial changes to accommodate the new ones.
- An extra log file now has to be processed to generate a fourth intermediate table, which then has to be entered into the final 4-table join.
- The code to do this has been designed and is now being written.
- The result is a delay of perhaps 2 weeks in deploying the software.
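The fourth intermediate table holds the mapping from gatekeeper job-manager IDs to local PBS job IDs, extracted from the job-manager messages file. A minimal parsing sketch follows; the log lines and the regular expression are purely illustrative, since the real message format differed between releases (which is exactly the problem described above):

```python
import re

# Hypothetical job-manager message lines; the real LCG formats varied
# between releases, so this pattern is an assumption for illustration.
messages = [
    "JM 12345: submitting to pbs",
    "JM 12345: local job id 42.ce.example.org",
]

mapping = {}
for line in messages:
    m = re.match(r"JM (\d+): local job id (\S+)", line)
    if m:
        mapping[m.group(1)] = m.group(2)   # GK JM ID -> PBS job ID

print(mapping)
```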

R-GMA Infrastructure

- The Grid Deployment Area meeting on 23 Feb 2004 discussed the requirement to deploy an R-GMA infrastructure. It was needed for the immediate applications of accounting, network monitoring and CMS monitoring.
- It was proposed that RAL would provide effort to package, deploy and support R-GMA and work with the Deployment Team to achieve this. This was agreed, with a target of end-March for deployment to a few test sites following certification on the CERN testbed.
- RAL would also assist with installing and testing the applications, particularly accounting.

Interim Arrangements

- Until R-GMA becomes operational throughout LCG, RAL will make arrangements to upload accounting files to the GOC system so they may be processed there.
- All sites have been instructed to preserve the relevant logs from 1 Feb 2004 until they can be uploaded.
- A script has been written to automate the uploading of the required files. These are:
  - the gzipped globus gatekeeper logs in /var/log/
  - the gzipped gatekeeper job manager messages files in /var/log/
  - the pbs log files in /var/spool/pbs/server_priv/accounting/
- The script has been packaged and will shortly be deployed to a few test sites.
- The files will be uploaded to a set of directories on the GOC system using a mutually authenticated transfer, to ensure that the files in a particular directory can only come from the CE associated with that directory.
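The file-gathering half of such an upload script can be sketched as below. The three directories are the ones listed on the slide; the glob patterns and the function name are assumptions, not the actual script:

```python
import glob
import os

# Directories from the slide; the file-name patterns are assumptions.
LOG_GLOBS = [
    "/var/log/globus-gatekeeper.log*.gz",
    "/var/log/globus-job-manager*.log*.gz",
    "/var/spool/pbs/server_priv/accounting/*",
]

def files_to_upload(globs=LOG_GLOBS):
    """Return the sorted list of existing log files matching the patterns."""
    found = []
    for pattern in globs:
        found.extend(glob.glob(pattern))
    return sorted(f for f in found if os.path.isfile(f))
```

The transfer itself would then send each file to the per-CE directory at the GOC over a mutually authenticated channel, as the slide describes.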

Progress  Status on 2 Mar 2004:  The code to extract data from the pbs and gk logs and to obtain the estimate of CE power is written, working and tested  Code to extract the mapping from GK JM ID to pbs jobID from the job manager messages and globus gatekeeper logs is being written  Scripts to automate the uploading of files to GOC are written, packaged and awaiting deployment  Directory structure at GOC to receive files is set up -  Logs are being preserved from 1 Feb for later processing – Some lost  An option flag has been added to suppress the publication of the DN if sites are unable to do this due to data protection or privacy laws  To do:  Complete rewriting of accounting client following log change - Done (5/3)  Package accounting client for deployment – 3 days  Write the report generators – 30 days (estimate – they are not yet designed)


Remaining Issues

1. The VO associated with a user is not available in the batch or gatekeeper logs. It will be assumed that the group ID used to execute user jobs, which is available, is the same as the VO name. This needs to be acknowledged as an LCG requirement.
2. The global jobID assigned by the Resource Broker is not available in the batch or gatekeeper logs, so it cannot appear in the accounting reports. The RB Events Database contains it, but that database is neither accessible nor designed to be easily processed.
3. At present the logs provide no means of distinguishing sub-clusters of a CE which have nodes of differing processing power. Changes to the information logged by the batch system will be required before such heterogeneous sites can be accounted properly. At present it is believed all sites are homogeneous.
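The group-ID-equals-VO assumption in issue 1 amounts to a lookup like the following sketch. The VO names are real LCG VOs of the period, but the table and function are illustrative, not part of the accounting code:

```python
# Interim workaround sketch: treat the Unix group that executed the job
# as the VO name. The VO list and the fallback value are assumptions.
KNOWN_VOS = {"alice", "atlas", "cms", "lhcb", "dteam"}

def vo_for_job(unix_group: str) -> str:
    """Assume group ID == VO name; flag anything unrecognised."""
    return unix_group if unix_group in KNOWN_VOS else "unknown"

print(vo_for_job("atlas"))
print(vo_for_job("users"))
```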

LcgRecords Table

- Where possible, the field names in the LcgRecords table have been chosen to correspond with the schema developed by the Global Grid Forum's Usage Record Working Party. There is one record per job.

  SiteName        site at which the job executed
  JobName         as known to the executing site
  LocalUserID     as known to the executing site for that job
  GlobalUserName  submitting user's Distinguished Name
  ProjectName     user's group name; assumed to be the VO name
  WallDuration    elapsed time while the job was running
  CpuDuration     cpu time used by the job
  StartTime       time the job started  } in ISO 8601 format,
  EndTime         time the job finished } local time and UTC
  SubmitHost      domain name of the CE
  MemoryReal      real memory used
  MemoryVirtual   virtual memory used
  SpecInt2000     of the Cluster/SubCluster associated with the CE
  SpecFloat2000   of the Cluster/SubCluster associated with the CE
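One record of this table can be modelled directly from the field names above. The field names are from the slide; the Python types and the ISO 8601 helper are assumptions for illustration:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Sketch of one LcgRecords row; field names follow the slide, types are
# illustrative assumptions.
@dataclass
class LcgRecord:
    SiteName: str
    JobName: str
    LocalUserID: str
    GlobalUserName: str
    ProjectName: str        # user's group name, assumed equal to the VO
    WallDuration: int       # seconds
    CpuDuration: int        # seconds
    StartTime: str          # ISO 8601, UTC
    EndTime: str            # ISO 8601, UTC
    SubmitHost: str
    MemoryReal: int
    MemoryVirtual: int
    SpecInt2000: float
    SpecFloat2000: float

def iso_utc(ts: float) -> str:
    """Render a Unix timestamp in an ISO 8601 UTC form."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(iso_utc(0))
```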

Accounting Reports

- The way in which the accounting records will be summarised in reports has not yet been designed in detail. The following reflect our current thinking, and comments from the GDB on these points would be helpful:
- Regular summary reports (monthly?) will be published automatically, showing usage by site and by VO.
- Interactive reports generated with various selection criteria, including DN, site, VO and dates, will be available from the GOC website.
- 'Usage' could include: raw cpu time; cpu time normalised to SpecInt-seconds, or to some agreed combination of SpecInt and SpecFloat powers (probably the default, once agreed); or a notional 'charge' based on any combination of cpu time, real memory and virtual memory. However, the available logs include no data on storage used, nor on the queue through which the job was submitted, both of which would be desirable for calculating a notional charge.
- The same reports will be available for running at sites on local data, and at the GOC on aggregated data.
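The two usage measures mentioned above can be sketched as follows. The reference SpecInt2000 rating and the charge weights are assumptions, not agreed LCG values:

```python
# Illustrative normalisation: scale raw cpu time by the CE's SpecInt2000
# rating so usage is comparable across sites of different power.
# REFERENCE_SI2K is an assumed reference rating, not an agreed value.
REFERENCE_SI2K = 1000.0

def normalised_cpu(cpu_seconds: float, specint2000: float) -> float:
    """Cpu time in SpecInt-seconds relative to a reference machine."""
    return cpu_seconds * specint2000 / REFERENCE_SI2K

def notional_charge(cpu_seconds, mem_real_mb, mem_virtual_mb,
                    w_cpu=1.0, w_real=0.0, w_virt=0.0):
    """A charge as a weighted combination of cpu time and memory use;
    the weights would be a matter for policy, defaults here are arbitrary."""
    return w_cpu * cpu_seconds + w_real * mem_real_mb + w_virt * mem_virtual_mb

print(normalised_cpu(3600, 1500.0))
```

As the slide notes, storage and queue information would be needed for a fuller charge model, but neither is present in the available logs.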

Summary  An accounting prototype has been deployed at GOC  Using logs transferred from sites  Need all sites to transfer records  Have written a tool but it needs deploying.  GOC will start to publish a few standard reports  Next Steps  Package and distribute  Develop reports  Consider normalisation of heterogeneous clusters  Include more batch types.