Accounting in EGEE … and beyond John Gordon and David Kant CCLRC, e-Science Centre.


Operations Workshop, Sept
History
EDG – EU DataGrid
  → developed DGAS, a full economic scheduling and accounting package developed in Italy
  → wasn’t mature enough to be deployed by end of EDG
LCG – 2004-….
  → wanted resource reporting across the grid
  → commissioned APEL from RAL
SWEGrid
  → developed SGAS for Swedish Supercomputing

Types of Accounting
Job Accounting AFTER the event (APEL Domain)
  Concept of a “Job” as a unit of resource consumption
  Determination of value after job execution
  Job usage record as a complete description of resource consumption
  Suitable for post-paid services
Real Time Accounting (DGAS, SGAS Domain)
  Incremental determination of resource value while the job is being executed
  Incremental decrement of account balance
  Can enforce user quotas
  Suitable for pre-paid services
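The contrast between the two models can be sketched in a few lines of Python. This is an illustration only and is not taken from APEL, DGAS or SGAS: the names `PrepaidAccount`, `run_with_quota` and `post_paid_value`, and the abstract "credit" unit, are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class PrepaidAccount:
    """Hypothetical pre-paid account holding abstract credits."""
    balance: float

    def charge(self, amount: float) -> bool:
        """Deduct one increment; refuse it if the balance is too low."""
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


def run_with_quota(account, increments):
    """Real-time model: charge while the job runs, stop when a charge is refused.

    Returns the number of increments that were accepted.
    """
    accepted = 0
    for cost in increments:
        if not account.charge(cost):
            break
        accepted += 1
    return accepted


def post_paid_value(increments):
    """After-the-event model: the value is determined once the job has finished."""
    return sum(increments)
```

In the real-time case the quota is enforced during execution, so a job can be stopped part-way; in the after-the-event case the full cost is only known, and billed, once the usage record is complete.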

APEL, Job Accounting Flow Diagram
[1] Build Job Accounting Records at site.
[2] Send Job Records to a central repository.
[3] Data Aggregation.

Accounting for Grid Jobs
Build Job Records at Site
  APEL mapping grid users to the resource usage on local farms

Consolidation of Data (GOC)
  Job records in via R-GMA: 1 record per grid job (millions of records expected)
  R-GMA MON: SQL query to the accounting server, 1 query per hour
  Summary data refreshed every hour (max records about 100K per year)
  On-demand accounting pages based on SQL queries to summary data
  Home page: user queries, graphs
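The consolidation step, one record per job rolled up into a much smaller summary table, can be sketched as follows. This is a minimal sketch rather than the APEL implementation: the record fields (`site`, `vo`, `end_time`, `cpu_hours`, `wall_hours`) and the choice of a per-site, per-VO, per-month key are assumptions made for illustration.

```python
from collections import defaultdict


def summarise(job_records):
    """Roll individual job records up into one summary row per (site, VO, month).

    Each record is a dict with hypothetical keys:
    site, vo, end_time ("YYYY-MM-DD..."), cpu_hours, wall_hours.
    """
    summary = defaultdict(lambda: {"njobs": 0, "cpu_hours": 0.0, "wall_hours": 0.0})
    for rec in job_records:
        month = rec["end_time"][:7]  # keep only "YYYY-MM"
        row = summary[(rec["site"], rec["vo"], month)]
        row["njobs"] += 1
        row["cpu_hours"] += rec["cpu_hours"]
        row["wall_hours"] += rec["wall_hours"]
    return dict(summary)
```

Because the summary grows with the number of distinct sites, VOs and months rather than with the number of jobs, queries from the web pages can run against a table that stays small even when the job table reaches millions of rows.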

Accounting Home Page
  159 sites publishing data (9 Jan 2006)
  5.5 million job records
  ~100K records per week (period June – Dec 2005)

Demos of Accounting Aggregation
Global views of resource consumption.
LCG View
  → Shows aggregation for each LHC VO
      Requirements driven by RRB / Kors Bos
      Tier-1 and Country entry points
      LHC VO only
      All data normalised in units of SI2000.Hour
      Tabular summaries per Tier1 / Country
GridPP View
  → Shows aggregation for EGEE partner
  → Prototype for EGEE View
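As a worked illustration of the SI2000-hour normalisation, the sketch below assumes that normalised usage is simply CPU time in hours multiplied by the SPECint2000 rating of the machine that ran the job. The slides do not spell out the formula, so this definition and the function names are assumptions.

```python
def normalised_usage(cpu_hours, si2000_rating):
    """CPU hours scaled by the node's SI2000 rating (assumed definition)."""
    return cpu_hours * si2000_rating


def total_normalised(jobs):
    """Sum normalised usage over (cpu_hours, si2000_rating) pairs."""
    return sum(normalised_usage(hours, rating) for hours, rating in jobs)
```

Under this assumption, 10 CPU hours on a node rated at 1000 SI2000 count as 10,000 SI2000 hours, while the same 10 hours on a node rated at 1500 count as 15,000, which is what makes usage on heterogeneous farms comparable.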

LHC View: Data Aggregation For VOs per Tier1, per Country

Aggregation of Data for GridPP

Aggregation of Data for Tier2

Data Aggregation at Site Level
  Breakdown of data per VO per month showing Njobs, CPUt, WCT, record history
  Total CPU usage per VO
  Gantt chart
  NB: Gaps across all VOs are consistent with scheduled downtimes in GocDB

Batch Support in APEL
Currently available in LCG 2.6
  OpenPBS, Torque, PBSPro and Vanilla PBS → ~90% of sites in LCG/EGEE
  Load Share Facility (Versions 5 and 6) → CERN, Italy
Available in LCG 2.7
  Condor → Canada
  Sun Grid Engine in development → Imperial College

APEL Summary
APEL is not a banking system.
  → Job accounting AFTER the event; not in real time.
APEL is designed to build accounting records at a site.
  Supports PBS and LSF; SGE (done); Condor in development.
Middleware independent.
  Although APEL uses R-GMA in LCG/EGEE, it could quite happily use any other mechanism for transportation (e.g. MySQL, Web Services, GridFTP).
  Can be deployed on other grids, e.g. OSG.
Implementation is simple.
  One database per site
  One central repository
APEL provides high-level views of usage data.
  Can also show usage at the DN level with restricted access via ACLs (GridSite).
APEL has been running on the production EGEE grid for >1 year.

DGAS vs. APEL (?)
The aims of DGAS and APEL are different:
DGAS:
  → Focused on storing detailed accounting information and controlling authorised access to it.
  → Provides resource and user (VO) level accounting.
  → Can serve as a basis for economic accounting and quota management.
  → Provides security and authorisation to information access.
APEL:
  → Focused on publishing accounting data and providing an easy graphical view to aggregate information.
  → Provides accounting suitable to upper (VO) level management view.
  → Focuses on after-the-fact, resource-oriented accounting.
DGAS & APEL!
  → We believe that these two software packages are not competitors. Although they have some (needed) overlap, used together they can furnish what is actually needed for grid accounting and benefit from cooperation.

Issues
Full Deployment
  → political, legal, security, paranoia
  → batch system support
Validation
  → are all records captured?
  → is normalisation correct?
  → is site meeting commitments?
Account other resources
  → storage, memory, network
Standards
Interoperability
Global Repository

Challenges Ahead
Recognise that accounting isn’t just about “job usage”; it’s about resource usage, which encompasses many things:
  → CPU usage
  → Also storage and network usage
  → How do we describe this data?
  → Luckily there is a GGF Usage Record which provides a generic description of resource usage
  → Are these descriptors stable?
  → Are they sufficient to describe the data?
  → Can we get network and storage people to use the same schema?
  → CPU is consumed; storage is occupied and can be recycled
How important is accounting?
  → Compute resource viewed as a grid currency
  → Need a guarantee that the data has not been tampered with in an unfair way
  → How does normalisation fit into this? The concept of a raw usage record has no meaning if internal scaling is applied to heterogeneous farms.
  → GGF UR allows a “cost” descriptor
  → Do we need an agreement on cost?

Challenges Ahead
Data Collection
  → Many implementations for collecting accounting data in the LCG world:
      APEL/DGAS in EGEE
      SGAS in SweGrid
      Sites that implement their own systems (Fermilab, IN2P3, SARA: multiple grid job managers from different grids feed a single condor pool)
      Discussion with OSG on deploying APEL with their own transport mechanism
  → Switching one for another doesn’t resolve the problem of data sharing across the project. No mechanism is in place to share this data in a consistent way.
  → GGF is working on a Resource Usage Service
  → What would the model for data sharing look like? Low level or high level?
  → Low level: sensors publishing data via a web service?
  → High level: data collected within the infrastructure, aggregated in a meaningful way, then reviewed and approved before it can be passed on (Fermilab)
  → Some Tier-1 centres have concerns about data association: “LCG not EGEE”, “Will the service be separate?”

Challenges Ahead
Usage Reporting at what Level?
  → Anonymous level: how much resource has been provided to each VO
  → Aggregation across: VOs, Countries, Regions, Grids, Organisations
  → Granularity: summed over units of hours, days, weeks, months?
User Level Reporting?
  → If 10,000 CPU hours were consumed by Atlas VO, who are the users that submitted the work?
  → Data privacy laws
  → A Grid “DN” is personal information which could be used to target an individual.
  → Who has access to this data and how do you get it?
  → Can CA policies change to support anonymous DNs and reverse DN mappings?
  → What are the consequences? Are there any lawyers in the audience?
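One way to picture "anonymous DNs and reverse DN mappings" is a keyed pseudonym plus a lookup table held by an authorised party. The sketch below is purely illustrative and does not describe any CA policy or existing accounting tool: the use of HMAC-SHA256, the 16-character truncation, and the names `pseudonymise` and `ReverseMap` are all assumptions.

```python
import hashlib
import hmac


def pseudonymise(dn: str, secret: bytes) -> str:
    """Return a stable pseudonym for a DN using a keyed hash (illustrative)."""
    digest = hmac.new(secret, dn.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


class ReverseMap:
    """Hypothetical lookup table kept only by a party allowed to see real DNs."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._table = {}

    def register(self, dn: str) -> str:
        """Record the DN and return the pseudonym to publish in its place."""
        pseudonym = pseudonymise(dn, self._secret)
        self._table[pseudonym] = dn
        return pseudonym

    def resolve(self, pseudonym: str):
        """Return the original DN, or None if the pseudonym is unknown."""
        return self._table.get(pseudonym)
```

Published usage records would then carry only the pseudonym, and the question raised on the slide becomes who is allowed to hold the secret and the reverse table.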

World Wide Accounting Service for LCG
Project involves combining results from all three peer infrastructures and presenting an aggregated view of resource usage for LHC VOs to the RRB
  → Peer Infrastructures in LCG
      Open Science Grid + Others (Ruth Pordes, Philippe Canal, Matteo Melani)
      Nordugrid (Per Oster, Thomas Sandholm)
      LCG/EGEE (Kors Bos, Dave Kant)

Resource Usage Service
Based on emerging GGF standards and Web Services
  → GGF UR, OGSI
An implementation exists in “Market for Computational Science” – UK e-Science project
Use case might be:
  → A user invokes the query service through a web browser, using SSL for client authentication, to ensure that usage information at user level belongs to the user. Servlet sends query to RUS web service and gets user data.
Diagram labels: Service Interface, RUS WS Application, ACL, DB, Web Service Container
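The access rule in this use case, where an authenticated user sees only their own usage unless an ACL grants more, can be sketched as below. This is not the RUS interface: the function name `query_usage`, the `user_dn` field and the shape of the ACL (a mapping from a DN to the other DNs it may read) are assumptions for illustration.

```python
def query_usage(records, authenticated_dn, acl=None):
    """Return only the usage records the authenticated DN is allowed to see.

    records: list of dicts with a hypothetical "user_dn" key.
    acl: optional dict mapping a DN to an iterable of other DNs it may read.
    """
    allowed = {authenticated_dn}
    if acl:
        allowed |= set(acl.get(authenticated_dn, ()))
    return [rec for rec in records if rec["user_dn"] in allowed]
```

The DN would come from the SSL client certificate presented to the web front end, so the filter is applied to an identity the service has verified rather than one the user typed in.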

Possible Roadmap
  → Stage 1: Let’s try to get some data from each of the Tier-1s: summary records describing VO usage over a finite period of time
      Before end 2005
      SweGrid and Fermilab and DGAS ARE providing data!
  → Stage 2: Centralised database with a web service interface (RUS) to publish/query accounting data (summary records)
      Sometime in 2006
  → Stage 3: Distributed databases with a complete RUS implementation including permission model
      Sometime early 2007

Summary
EGEE has had a production accounting infrastructure in place since 2004
  → but still has a long way to go
We are developing a central repository
  → to sit above all the grid infrastructures
  → to meet the requirement for global reporting on LHC Computing
Accounting is a controversial subject
Thank you to everyone who has cooperated