Grid Operations Centre Progress to Aug 03

Grid Operations Centre Progress to Aug 03
Trevor Daniels, John Gordon
GDB 2 Sept 2003

GOC Group
The June GDB agreed that a task force should be created to define the requirements and agree on a prototype for a Grid Operations Service.
The members of this GOC Steering Group are:
- Trevor Daniels, RAL (Convenor)
- Markus Shultz, CERN
- John Gordon, RAL
- Rolf Rumler, IN2P3
- Cristina Vistoli, INFN
- Claude Wang, Taipei (observer)
- Eric Yen, Taipei
- Ian Fisk, FNAL, US-CMS
- Bruce Gibbard, BNL, US-Atlas
Trevor.Daniels@rl.ac.uk

GOC Group
The views of the group have been sought on several topics:
- Revised proposal for GOC
  - resulted in submission to July GDB
- Prototype website
  - general layout
  - restrictions on certain pages
  - monitoring pages
- Approaches to monitoring SLAs
  - possible tests for CE and RB services
- Security proposals
  - as presented to Sept GDB

GOC Phase 1
Jul 03 – Oct 03
- Set up initial monitoring centre by end-Jul 03 using monitoring tools available for immediate deployment
- Develop Grid operations security policy in consultation with security officers
- Define the service level parameters which must be published and monitored for each of the critical grid services
- Develop draft reporting formats and establish a monitoring regime for determining and presenting service level information
- Evaluate and select tools which will be deployed in Phase 2
Status key: Done, In progress, Started, About to start, Not yet begun

GOC Website
http://www.grid-support.ac.uk/GOC/
Main Areas:
- GOC Overview: Phase 1 complete
- Participating Institutions: Up to date
- LCG Home: Complete (link)
- Contact us: Phase 1 complete
- Service Level Parameters: Marker
- Change Notification: Marker
- Configuration: Awaiting details
- Monitoring: Phase 1 complete
- Security: In progress
- News: Marker
- Meetings: Marker
- Links: Partly done

Monitoring
This page brings together the several LCG monitoring tools which are readily available, together with a touch-sensitive map which links to pertinent information about each LCG site, including a link to each site’s published status.
The monitors currently running and displayed are:
- GridICE monitoring of LCG-1 (at CERN)
- GridICE monitoring of LCG-0 (at CNAF)
- MapCenter monitoring of LCG-1 (at RAL)
- LCG-1 overall rollout status page (at CERN)
- LCG-1 status measured with GridPP (at RAL)
Each of these provides multiple views of status information.

GridICE VO view
- Partial view of DTEAM VO showing infn, fzk and sinica
- Shows info on CPU loading, jobs, and storage by cluster

MapCenter
Performs low-level tests and aggregates these up through several levels to country, showing best and worst status at each level.
This is the top level world view showing individual sites.

MapCenter
Part of the MapCenter full list view showing aggregation up to country.
Tests include icmp, gk, gsiftp, nfs, ssh.
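The aggregation MapCenter is described as performing (low-level test results rolled up level by level, keeping the best and worst status at each level) can be sketched roughly as follows. This is an illustration only, not MapCenter's actual code: the `Status` scale, the country/site/host hierarchy, the host names and the sample results are all assumptions made for the example.

```python
from collections import defaultdict
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical severity scale for the sketch; higher is worse.
    OK = 0
    WARNING = 1
    ERROR = 2


def aggregate(results, level):
    """Group rows of (country, site, host, test, status) by their first
    `level` fields and return {key: (best_status, worst_status)}."""
    groups = defaultdict(list)
    for row in results:
        groups[row[:level]].append(row[-1])
    return {key: (min(statuses), max(statuses)) for key, statuses in groups.items()}


# Made-up sample results; test names are taken from the slide, hosts are invented.
RESULTS = [
    ("UK", "RAL", "ce01", "icmp", Status.OK),
    ("UK", "RAL", "ce01", "gsiftp", Status.ERROR),
    ("UK", "RAL", "se01", "ssh", Status.OK),
    ("IT", "CNAF", "ce01", "icmp", Status.OK),
    ("IT", "CNAF", "ce01", "gk", Status.WARNING),
]

by_site = aggregate(RESULTS, 2)     # keys like ("UK", "RAL")
by_country = aggregate(RESULTS, 1)  # keys like ("UK",)
```

Keeping both the best and the worst status per group means a single failing test stays visible at country level without hiding the fact that other services there are healthy.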

GridPP Monitor
- Submits job via globus-job-run and via CERN RB
- Displays coloured dot to indicate recent results on map and also in list form
- Gives user-level view of status

Monitoring Issues
- Monitors must be able to rely on published information about the configuration (services in production) at a site. Static lists are too difficult to maintain. At present the information being published is incomplete, so this is being gleaned from a variety of sources.
- All the monitors present views which are potentially useful for operational monitoring. They are complementary and it is expected that all will have a place in the GOC.
- Not all are immediately suited to the end-user, so some monitors may be hidden from the general user.
- It is not yet clear which monitor, if any, will be most suited to monitoring compliance with SLAs. One which can provide historical information of Availability, Reliability and Performance for each Service type will be required.

Security Policy
- Security and Availability Policy drafted late August
- Discussed with Security Group on 28 Aug 03
- Revised and extended draft prepared and circulated to Security Group for comment 2 Sep 03
- Final draft presented to GDB at this meeting
- Further discussion under that agenda item

Approach to Service SLAs
- Formal Contract with GOC? No, because:
  - GOC is not (likely to be) a legal body
  - GOC will not (be likely to) have any formal powers over Service Providers
  - GOC will not (be likely to) pay for any Services
  - So difficult for GOC to enforce a traditional SLA
- Instead, prefer a virtual contract between Service Provider and the LCG Grid Community
  - Any Centre wishing to provide a Service must publish its design levels for the specified service level parameters of that Service
  - GOC will then monitor the actual levels achieved and publish them so they may be compared with the design levels
  - Service Providers (Centres) will then compete on quality or possibly quality/cost, either to attract work or enhance reputation
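The comparison of published design levels against the levels actually achieved might look something like the sketch below. It is only an illustration: the parameter names, the numeric values and the assumption that a higher value is always better are hypothetical, since the slides do not define how levels are expressed.

```python
def compare_levels(design, measured):
    """Return {parameter: (design_level, measured_level, met)} for every
    parameter that has both a published design level and a measurement.
    Assumes, for this sketch, that higher values are better."""
    report = {}
    for name, target in design.items():
        if name in measured:
            actual = measured[name]
            report[name] = (target, actual, actual >= target)
    return report


# Hypothetical figures for one service instance, expressed as fractions.
design = {"availability": 0.99, "reliability": 0.95}
measured = {"availability": 0.97, "reliability": 0.96}

report = compare_levels(design, measured)
for name, (target, actual, met) in sorted(report.items()):
    print(f"{name}: design {target:.0%}, measured {actual:.0%}, met={met}")
```

Publishing the design and measured values side by side, rather than only a pass/fail flag, matches the stated aim that Centres compete on quality and not merely on compliance.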

Form of SLA
- One for each instance of an LCG Service
- Published on the GOC website in standard format exactly as provided by the Service Administrator
- Format yet to be developed and agreed, but likely to contain as a minimum:
  - Identification of Service (type, release, etc)
  - Statement on compliance with Security and Availability Policy (standard wording)
  - Limitations on use (if any)
  - Designed Availability
  - Designed Reliability
  - Designed Performance (Service-specific; to be defined for each type of Service)
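Since the format had not yet been agreed, the following is only a guess at how the minimum contents listed above could be held as a record and rendered for publication. The class name, field names, example values and output layout are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ServiceSLA:
    # Hypothetical field names mirroring the minimum contents on the slide.
    service_type: str
    release: str
    policy_compliant: bool
    designed_availability: float  # fraction between 0 and 1 (assumed unit)
    designed_reliability: float   # fraction between 0 and 1 (assumed unit)
    designed_performance: dict = field(default_factory=dict)  # service-specific
    limitations: Optional[str] = None

    def publish(self) -> str:
        """Render the record as plain text, one parameter per line."""
        lines = [
            f"Service: {self.service_type} (release {self.release})",
            "Complies with Security and Availability Policy: "
            + ("yes" if self.policy_compliant else "no"),
            f"Limitations on use: {self.limitations or 'none'}",
            f"Designed Availability: {self.designed_availability:.1%}",
            f"Designed Reliability: {self.designed_reliability:.1%}",
        ]
        for name, value in sorted(self.designed_performance.items()):
            lines.append(f"Designed Performance, {name}: {value}")
        return "\n".join(lines)


# Invented example values for a Compute Element instance.
sla = ServiceSLA("CE", "LCG-1", True, 0.99, 0.95, {"max_queued_jobs": 500})
print(sla.publish())
```

A frozen record fits the requirement that the SLA be published exactly as the Service Administrator provided it: once created, the values cannot be altered in place.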

Next steps
- Continue to develop GOC website and extend configuration of monitors as rollout continues
- Work with Security Group on Policy, Procedures, Codes of Conduct and Guides
- Incorporate drafts of these in GOC website as they become available for community comment
- Devise precise form of SLAs and develop GOC website to publish them
- Define service level parameters for Compute Element, Resource Broker, Job Submission and Information Services
- Develop monitoring regime to measure service level parameters for CE, RB, JSS and IS