WLCG Operations Coordination and Commissioning
Maria Girone, CERN IT
On behalf of the Operations Coordination Team
OSG All Hands Meeting, 11th March 2013



Outline
  – Mandate and Scope
  – Organization
  – Ongoing activities
  – Opportunities during LS1
  – Conclusions

Introduction
The Operations Coordination Team was formed in mid-2012, after the Technical Evolution Groups (TEGs) had finished
  – It was recognized that many deployment and commissioning activities were ongoing and needed to be organized

Mandate
WLCG Operations Coordination and Commissioning Team
“This group will be the core operations and deployment coordination team in the future, and will manage ongoing operational issues as well as new deployments. It will replace some of the existing operations/deployment meetings and teams. Intent is to work together with EGI+OSG operations teams.”

Evolution
Even though the WLCG infrastructure has been in constant operation for years, it is also continuously evolving
  – Infrastructure improvements like CVMFS
  – Changes in monitoring and availability
  – Data management improvements like FTS3 and the broader deployment of data federations
  – Security changes like gLExec and SHA-2
  – Transitions of middleware to resource provisioning tools like OpenStack and Agile Infrastructure projects

Organization
Operations Coordination Work Group activities were kicked off on 24th September 2012
Large spectrum of activities to be ensured in operations
  – Internal task forces are the driving force and source of specific and dedicated expertise
  – Primary communication channel on operational issues to WLCG MB
  – Regular reports to the monthly GDB
Regular fortnightly meetings established on 1st and 3rd Thursdays as main forum
  – First Operations Coordination meeting held on 4th October 2012
  – Minutes and action list
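The "1st and 3rd Thursdays" rule can be checked with a short calculation. The following is a minimal sketch using only the Python standard library; the function name `first_and_third_thursdays` is made up for illustration and is not part of any WLCG tool. Applied to October 2012, it gives 4th October as the first Thursday, consistent with the date of the first meeting stated on the slide.

```python
import calendar
from datetime import date


def first_and_third_thursdays(year, month):
    """Return the dates of the 1st and 3rd Thursdays of the given month."""
    cal = calendar.Calendar()
    # itermonthdates also yields padding days from adjacent months,
    # so keep only the days that belong to the requested month.
    thursdays = [
        d for d in cal.itermonthdates(year, month)
        if d.month == month and d.weekday() == calendar.THURSDAY
    ]
    # Every month has at least four Thursdays, so index 2 always exists.
    return thursdays[0], thursdays[2]


first, third = first_and_third_thursdays(2012, 10)
print("1st Thursday:", first.isoformat())
print("3rd Thursday:", third.isoformat())
```

Running the same function for any later month gives the two nominal meeting dates, assuming the schedule is followed without exceptions for holidays or conferences.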

Logistics
Operations Coordination handles the daily operations meetings (now twice weekly)
  – Operations reports and tickets
Bi-weekly Coordination meetings
  – Follow task force progress. Report experiment issues
Quarterly Experiment Planning Meetings
  – Prepares plans and proposes them to the MB
  – Reviews operational needs from experiments and sites
  – Creates and dissolves internal ops task forces
      E.g. Squid Monitoring task force concluded
Large and diverse group participates in the task forces and the operations. Only works with contributions from many areas

Focusing on LHC
All the grid and infrastructure projects (OSG, EGI, NorduGrid, etc.) have to expand their focus to continue to be viable
  – Expand into campus infrastructures, focus on new communities and areas, and focus on new resource providers
Operations Coordination has the ability to focus entirely on the LHC
  – Form a resource for the LHC experiments in their interactions with the sites, the infrastructure providers, and the projects
This makes Operations Coordination unique

Team Composition
Moving the effort towards the WLCG collaboration
  – Chairs and task force leaders from CERN, Tier-1 and Tier-2
  – Significant fraction of the coordination effort resides in CERN-IT
Fortnightly meeting
  Chairs: M. Girone (CERN), J. Flix (PIC), A. Forti (Manchester)
  Secretary: A. Sciabà (CERN)
  Experiment representatives: M. Litmaath (ALICE), A. Klimentov (ATLAS), I. Ueda (ATLAS), I. Fisk (CMS), O. Gutsche (CMS), M. Cattaneo (LHCb), S. Roiser (LHCb)
  Tier-1 representatives
  Site/region representatives: M. Jouvin (GDB chair), J. Flix (PIC), J. Coles (GridPP), I. Collier (GridPP), A. Forti (GridPP), L. Dell'Agnello (CNAF), R. Santana (ROC-LA), M. Barroso (CERN)
  Infrastructure projects: T. Ferrari (EGI), R. Quick (OSG), Tim Cartwright (OSG)

Ongoing Activities
  – CVMFS deployment
  – gLExec deployment
  – Tracking tools evolution
  – perfSONAR deployment
  – Middleware deployment
  – FTS 3 integration and deployment
  – Squid monitoring (concluded)
  – SHA-2 migration
  – Xrootd deployment
  – Scientific Linux 6 migration

Ongoing Activities: CVMFS
Deployment status, Dec ’12 (last GDB)
Clear from the progress on tasks that having people take ownership of the activities is an effective strategy
OSG has a program to support opportunistic computing through user mounts of CVMFS using Parrot; this only works with a healthy and scalable CVMFS system to start

Ongoing Activities: gLExec
  – The long-running story: it is mandated, but not installed
  – The task force is closing out the deployment
      Expect to ramp up after winter conferences
      Already used in production at several CMS sites (20 EGI sites, 15 OSG sites)
      193 CEs tested (93 successfully)
      Milestones to be defined with experiment input
  – CMS would like to have at least 90% of sites with gLExec enabled by July 1st
  – LHCb needs to test gLExec support (already in the code)
      Will start soon
  – Integration in PanDA proceeding well

Ongoing Activities: perfSONAR
Coordinated by the WLCG Ops Task Force
  – Good representation from USATLAS/USCMS and Internet2 (4+ people)
Sites indicated by experiments have been asked to install perfSONAR
  – And configure through the central mesh configuration
  – 70 sites of 113 have PS installed (100% of OSG sites)
OSG involvement is critical
  – Liaison with PS developers (ESnet and Internet2)
  – Operational support for deployment in the OSG infrastructure
  – Development of the perfSONAR Modular Dashboard
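The deployment counts quoted on the slides are easier to compare as percentages. The sketch below is purely illustrative arithmetic on the figures stated in the talk (70 of 113 sites with perfSONAR installed, 93 of 193 CEs passing the gLExec test); the `coverage` helper and the labels are assumptions made for this example. Note that the two figures count different things (sites versus CEs), and neither is directly comparable to the CMS target of 90% of sites.

```python
def coverage(done, total):
    """Return the completed fraction as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * done / total


# Counts as quoted on the slides; the labels are illustrative only.
figures = {
    "perfSONAR installed (sites)": (70, 113),
    "gLExec test passed (CEs)": (93, 193),
}

for label, (done, total) in figures.items():
    print(f"{label}: {done}/{total} = {coverage(done, total):.1f}%")
```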

Ongoing Activities: Xrootd deployment
To help the ATLAS (FAX) and CMS (AAA) collaborations in deploying at sites
Current focus is on systematically monitoring the health of the service at each site
  – Study the possibility of consolidating tests currently deployed separately by CMS and ATLAS
      Deployment of SAM tests, registration of the service in GOCDB
File access monitoring is in an advanced state
  – MonALISA monitoring, Dashboard transfer monitoring, data popularity monitoring
  – Discussion ongoing on how to maintain the Xrootd UDP collectors in operation
      Pool of collectors maintained by a single organization, or regional collectors: US (OSG) + EU (CERN)

Ongoing Activities: FTS 3
  – Several new features demonstrated
      MySQL backend, new monitoring page, user and SE blacklists, etc.
      xrootd 3rd-party transfers, explicit file staging (for LHCb)
      Central FTS server (Oracle)
  – Stress tests ongoing and ramping up after winter conferences
  – Proposal for a deployment schedule will follow (April 2013)

Opportunities during LS1
  – Cloud infrastructure testing with experiment workflows
      Leveraging IT expertise on Agile Infrastructure
      Agile Infrastructure still under testing but will evolve rapidly into production
      Commissioning of the remote Tier-0 (Budapest)
  – SL6 deployment
      Task force formed. Target migration date by end 2013
  – Data placement
      Optimization of storage resources based on data popularity information
  – Data access
      Transparent access to remote and shared Tier-1 storage, WNs on the OPN
  – Operations of Common Analysis Framework

Working Together
Operations Coordination is a convenient forum for interaction between sites and experiments
  – EGI will use Operations Coordination for release discussions as well
It would be good to also have closer interaction and collaboration with OSG through Operations Coordination
Operations Coordination is also itself a resource
  – The task force model has proven itself efficient in commissioning and deployment

Conclusions
The Operations Coordination team has taken over the responsibility for WLCG Operations
  – “Impressive progress in deployment in the first 6 months” (from the Feb. 2013 GDB minutes)
  – Task forces progressing well. One concluded
Use of common tools and approaches by the LHC experiments is the only way forward towards sustainable operations
  – Need help from experiments, sites, EGI & OSG
Constructive collaboration spirit, sharing of responsibilities and tasks are key
  – Push for stronger active participation of Tier-2 people in the task forces

Twiki and Mailing lists
Up-to-date twiki and action list at
  – Coordination Coordination
Mailing lists
  – - general communication
  – - team member’s communication
  – - internal task forces communications - for
  – contain at least all contacts registered in GOCDB and OIM for the WLCG sites

Backup Slides

Concluded tasks: Squid monitoring
  – Task force concluded its activity with some agreements
      Register all Squid servers in GOCDB/OIM
      Propose specific SAM tests based on MRTG monitoring information and possibly on hits to Frontier/CVMFS