EGEE SA1 Operations Workshop Stockholm, 13-15/06/2007 Enabling Grids for E-sciencE Service Level Agreement Metrics SLA SA1 Working Group Łukasz Skitał.


Service Level Agreement Metrics
SLA SA1 Working Group
Łukasz Skitał, Central European ROC, ACK CYFRONET AGH

2. Introduction
Objectives
To provide a formal description of the resources and services provided by Resource Centers (RCs) for:
– EGEE (the SLA is included in the contract between RCs and EGEE)
– Virtual Organisations
The SLA makes it possible to evaluate site operation in EGEE and to enforce the declared service level.

3. Areas of Interest
The SLA should cover the following areas:
– Resources and performance (CPU, storage)
– Connectivity
– Availability
– Software/Middleware
– VO support
– Support and expertise
– Data privacy
Metrics should be defined for each area. They are available and editable in the SA1 SLA WG twiki.

4. Resource and performance
Metric | Unit | Description
CPU count | # | Total number of CPUs available for EGEE VOs (excluding service CPUs)
Cores per CPU | # | Number of cores in a single CPU
CPU performance | SI2K | Benchmark; should be measured, not (easily) configurable by the site admin
If multicore CPUs are used, each core can have one job slot. The total number of job slots cannot be greater than the number of CPU cores. If a site publishes an N-core CPU as N job slots, then the CPU performance should be the average of the values returned by N benchmarks run simultaneously on a single N-core CPU.
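The job-slot limit and the averaging rule for multicore CPUs can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the sample values are hypothetical and are not part of the SLA proposal or of any EGEE tool.

```python
from statistics import mean


def slots_within_limit(cpu_count: int, cores_per_cpu: int, job_slots: int) -> bool:
    """Check that the published job slots do not exceed the total number of CPU cores."""
    return job_slots <= cpu_count * cores_per_cpu


def published_cpu_performance(simultaneous_results: list[float]) -> float:
    """Average the values of N benchmarks run at the same time on one N-core CPU."""
    if not simultaneous_results:
        raise ValueError("at least one benchmark result is required")
    return mean(simultaneous_results)


# Hypothetical site: 10 quad-core CPUs published as 40 job slots.
print(slots_within_limit(cpu_count=10, cores_per_cpu=4, job_slots=40))
# Hypothetical per-core results from four simultaneous benchmark runs.
print(published_cpu_performance([1000.0, 1200.0, 1100.0, 1100.0]))
```

Averaging simultaneous runs, rather than quoting a single-core run on an otherwise idle CPU, is what keeps the published figure representative of a fully loaded node.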

5. Resource and performance
Metric | Unit | Description
RAM per node | MB | Amount of memory installed on a single node (shared by all jobs running on the node)
RAM per job slot | MB | Amount of memory available for a single (non-MPI) job

6. Resource and performance: cluster interconnection
Metric | Unit | Description
Type | - | Type of interconnection (Ethernet, InfiniBand, etc.)
Latency | ms | Message delivery latency
Bandwidth | Mbit/s | Interconnection bandwidth
These metrics are crucial for MPI jobs.

7. Resource and performance: storage
Metric | Unit | Description
Type | - | Type of storage (disk, tape, etc.)
Size | TB | Size of storage (of the given type)
Avg. access time | ms | Depends on storage type and performance
Storage bandwidth | Mbit/s | Maximum bandwidth for reading and writing

8. Resource and performance
Measuring methods?
– GSTAT (GSTAT uses BDII, which can be easily altered by site admins)
– GridICE with WN monitoring
Do we trust sites? An SLA is not about trust; it is a contract and should be effectively enforced.
How to treat heterogeneity?
– Define each resource type
– Define minimum guaranteed resources

9. Network and connectivity
Metric | Unit | Description
Minimum connectivity | - | Site should provide enough connectivity to allow correct execution of SAM test jobs
Outbound connectivity from WN | - | Required
Inbound connectivity to WN | - | Optional (recommended?)

10. Network and connectivity
Metric | Unit | Description
Site’s uplink bandwidth | Mbit/s | Bandwidth to a backbone network
Bandwidth to GEANT2 | Mbit/s | Bandwidth between the site and GEANT2
Packet loss | % | Percentage of lost packets
Latencies* | ms | Between site and … (?)
MTU* | KB | Maximum Transmission Unit
Reordering* | - | Packet reordering
* Is it the site's responsibility?

11. Network and connectivity
The minimal acceptable inbound/outbound bandwidth should be relative to the CPU count.
How are these “Network and connectivity” metrics related to the SA2 Network SLA?

12. Availability
Metric | Unit | Description
Site availability | % | Percentage of time when the site was available, i.e. all SAM critical tests were OK
Site declared downtime | % | Percentage of time when the site was in downtime
Is SAM accurate enough? Taking a long-term average (month, year) should be enough. Error relevance should be taken into consideration (from site reports).
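A minimal sketch of how the two availability metrics could be computed over a reporting period. It assumes, hypothetically, that monitoring yields one record per equal-length time interval, with one flag for "all SAM critical tests OK" and one for "declared downtime". The record layout and function name are illustrative and do not come from SAM itself.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    critical_tests_ok: bool   # all SAM critical tests passed in this interval
    declared_downtime: bool   # the site had declared downtime for this interval


def availability_metrics(intervals: list[Interval]) -> tuple[float, float]:
    """Return (site availability %, declared downtime %) over equal-length intervals."""
    if not intervals:
        raise ValueError("no monitoring intervals supplied")
    total = len(intervals)
    available = sum(1 for i in intervals if i.critical_tests_ok)
    downtime = sum(1 for i in intervals if i.declared_downtime)
    return 100.0 * available / total, 100.0 * downtime / total


# Hypothetical period: three good intervals and one interval of declared downtime.
period = [Interval(True, False)] * 3 + [Interval(False, True)]
print(availability_metrics(period))
```

A monthly or yearly average is obtained simply by passing every interval recorded in that period.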

13. Software/Middleware
Metric | Unit | Description
Middleware flavor | - | Middleware required to be installed on the site (gLite, LCG, version?)
Time for update | days | Time to install the latest middleware patches and updates
Time for new service deployment | days | This varies depending on the type of service
Core services provided by site | - | -
Should this SLA cover core services? Core services should be covered by a separate SLA because of their higher relevance for the infrastructure.

14. VO Support
Metric | Unit | Description
Support of mandatory VOs | - | Ops and Dteam
Time for new VO configuration | days | It should take days (hours?), not months
Supported VOs | - | List of supported VOs
Support for “catch-all” VOs | - | VOCE, …
Minimum number of non-mandatory VOs which should be supported?

15. Support and expertise
Metric | Unit | Description
Ticket response time | h | Taken from GUS
Effectiveness in ticket solving | % | Percentage of tickets solved
Site administrators and security officers | - | FTEs, working hours
Incident response procedures | - | Reaction time, conformance to EGEE/ROC procedures
Number of tickets and their severity | - | For monitoring only. A site does not have any control over the tickets it receives, therefore this cannot be taken into consideration during site operation evaluation.
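The two ticket metrics above could be derived from ticket records roughly as follows. This is a sketch under assumptions: the `Ticket` fields and both function names are hypothetical and do not reflect the actual export format of the ticketing system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Ticket:
    opened: datetime
    first_response: Optional[datetime]  # None if the site has not responded yet
    solved: bool


def mean_response_hours(tickets: list[Ticket]) -> Optional[float]:
    """Mean time from opening to first response, in hours, over answered tickets."""
    delays = [
        (t.first_response - t.opened).total_seconds() / 3600.0
        for t in tickets
        if t.first_response is not None
    ]
    return sum(delays) / len(delays) if delays else None


def solved_percentage(tickets: list[Ticket]) -> float:
    """Percentage of tickets marked as solved."""
    if not tickets:
        raise ValueError("no tickets supplied")
    return 100.0 * sum(t.solved for t in tickets) / len(tickets)
```

Unanswered tickets are left out of the response-time mean in this sketch; an enforced SLA would more likely have to count them against the site once a deadline passes.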

16. Other metrics/requirements
Data privacy
– storage
– pool accounts
Sites should configure their resources (storage element, pool accounts, ...) to prevent any unauthorized data access.

17. Summary and conclusions
SLA areas:
– Resource and performance, Network and connectivity, Availability, Software/Middleware, Support and expertise, Data privacy
– anything else?
Measurement methods
– easy to use, difficult to cheat
How to effectively enforce the SLA?
– ROC responsibility
– appropriate tools are necessary