CERN Remote Hosting: First Experiences. Wayne Salter (with input from many colleagues). Computing Facilities (CF), CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it

Presentation transcript:

CERN Remote Hosting First Experiences
Wayne Salter (with input from many colleagues)
HEPiX Autumn Meeting in Lincoln
Computing Facilities (CF), CERN IT Department, CH-1211 Geneva 23, Switzerland

Overview
Brief History
Installation Status
Experience
  – General
  – Commercial
  – Procurement
  – Operations
  – Networking
  – End User Utilisation
Lessons Learnt
Conclusions

Brief History
Call for interest launched in June 2010
Responses received in November 2010, followed up by many visits/meetings
Decision to proceed taken in spring 2011
Tender sent out in September 2011 for adjudication in March 2012
Contract placed with the Wigner Research Centre for Physics in May 2012 and construction started
First room ready and equipment delivery started in January 2013
Continual build up in capacity since
Official inauguration in June 2013
Building works finished in September 2013

Brief History (timeline not to scale)
June 2010: Call for interest launched
Nov 2010: Responses received, followed by many visits/meetings
Spring 2011: Decision to proceed taken
Sep 2011: Tender sent out
March 2012: FC adjudication
May 2012: Contract placed with the Wigner Research Centre for Physics
January 2013: First room ready and equipment delivery started, followed by continual build up in capacity
June 2013: Official inauguration
Sep 2013: Building works finished

Brief History in pictures

Installation Status
Two rooms are in operation for CERN with 122 racks used
1276 CPU servers – 319 2U quads (25216 cores, GB RAM, 5904 TB disk)
568 external storage units – 4U JBODs each with 24 disks (52608 TB in total TB on 3TB drives and on 4TB drives)
Network equipment: 7 high-end routers, 43 10GbE and 47 1GbE switches, 1 management router and 107 management switches
Additional large deliveries expected in December
  – More than doubling CPU capacity and adding 40% more disk storage capacity and requiring use of third room
  – Investigating possibility of having a 3rd 100Gbps link
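The counts on the Installation Status slide can be cross-checked with some simple arithmetic. The sketch below is illustrative only: it uses the figures quoted on the slide and assumes that a "2U quad" chassis holds four servers. The function and constant names are made up for this example.

```python
# Figures quoted on the Installation Status slide.
QUAD_CHASSIS = 319         # 2U quad chassis
SERVERS_PER_QUAD = 4       # assumption: one "quad" chassis holds four servers
CPU_SERVERS = 1276
CORES = 25216
JBODS = 568                # 4U external storage units
DISKS_PER_JBOD = 24
JBOD_CAPACITY_TB = 52608   # total external storage capacity


def summarise_installation():
    """Derive a few consistency figures from the quoted counts."""
    servers_from_quads = QUAD_CHASSIS * SERVERS_PER_QUAD
    jbod_disks = JBODS * DISKS_PER_JBOD
    return {
        "servers_from_quads": servers_from_quads,
        "matches_quoted_servers": servers_from_quads == CPU_SERVERS,
        "avg_cores_per_server": CORES / CPU_SERVERS,
        "jbod_disks": jbod_disks,
        "avg_tb_per_disk": JBOD_CAPACITY_TB / jbod_disks,
    }


if __name__ == "__main__":
    for name, value in summarise_installation().items():
        print(f"{name}: {value}")
```

Under that assumption the chassis count reproduces the quoted number of CPU servers, and the average capacity per JBOD disk falls between 3 TB and 4 TB, consistent with the mix of drive sizes mentioned on the slide.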

Experience - General
On the whole good – generally works well
  – Remote operation and monitoring works well
  – No out of hours support for CERN equipment
Teams visiting each other was very useful
  – Help given with initial setups
Over reliance on one person
Reporting
  – Regular bi-weekly operational telecom
  – Monthly reports (since 2014): Operations and Billing
Can be time consuming dealing with new requirements, e.g. Russian Tier 1 link

Experience - Commercial
Tendering process
  – Specification as open as possible
  – Adjudication based on a defined ramp up profile, failure rate estimates, and included networking from closest GEANT PoP
VAT Exemption
  – Took many months to sort out and required help from Wigner
Insurance split
  – Discussions still on-going!!
Billing
  – First bill only in 2014 after more than one year of running
  – Detailed spreadsheet as part of monthly operations report

Experience - Procurement
Detailed instructions to ease reception and installation
  – However, following deliveries is more complex
Delivery directly to Wigner except for network switches
  – One case of damaged equipment during transport
Need to provide detailed information in advance on deliveries as well as transport
Issues with unloading of equipment at Wigner
Effectively doubled the number of orders to be processed

Experience – Operations/I
Late availability of room for storage and repairs
Auto-registration and stress testing of machines works well
Room/rack layout responsibilities ‘unclear’
Various infrastructure issues
  – Two HV incidents but protected by UPS/diesel
  – Cooling pressure issue causing all chillers to be switched off
  – Leak in cooling pipe
  – Complex new facility not completely understood. Review conducted by TÜV
  – Often slow to get detailed reports

Experience – Operations/II
More difficult than expected to establish good workflows
Formal procedures and approach being gradually introduced as experience is gained
Difficult to use full available power
  – Tender estimate of power density does not reflect the reality
Difficult to verify power consumption figures
Non-standard setups and debugging of tricky issues are more complicated

Experience – Operations/III
Role of the SysAdmins has not been affected
Repair Service
  – Runs well
  – Good quality interventions
  – Good response time to SNOW tickets
  – Information flow is more complicated with more parties involved – has not been ideal
  – Data requested not always provided in a timely manner
Still very limited usage of Wigner for business continuity
  – Lack of second network hub
  – Priority on moving to new critical room at CERN
  – Difficulties in getting allocation of resources for BC

Experience - Networking

Experience - Networking
Long discussions on initial network setup in the rooms
Takes longer to solve simple problems/lot of mail exchange/no out-of-hours support
  – Required changes to operational approach
  – Now giving Wigner access to SPECTRUM monitoring
Less time for deployment of new equipment (for CERN)
Availability of 100Gbps links not as expected
  – Long running problems with one of the links (took many months to debug)
  – Over past 100 days: link 1 (99.7%), link 2 (99.96%)
Broken equipment takes longer to be replaced by manufacturer
  – Try to minimize the number of shipments
  – Shipments must come via CERN
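To put the quoted link availabilities into perspective, they can be converted into hours of downtime over the 100-day window. This is a minimal sketch that assumes availability is measured as the fraction of wall-clock time the link was up. The slide does not say how the percentages were obtained, and the function name is hypothetical.

```python
def downtime_hours(availability_percent, window_days):
    """Hours of unavailability implied by an availability percentage.

    Assumption: availability is the fraction of wall-clock time the link
    was up during the window.
    """
    window_hours = window_days * 24
    return window_hours * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    # Availability figures quoted on the slide for the past 100 days.
    for name, availability in (("link 1", 99.7), ("link 2", 99.96)):
        hours = downtime_hours(availability, 100)
        print(f"{name}: about {hours:.2f} hours down in 100 days")
```

On that reading, link 1 was unavailable for roughly seven hours in the window and link 2 for just under one hour.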

Experience - End User Utilisation
Complaints of performance of jobs at Wigner
However
  – Mixture of SLC5/6, Intel/AMD, VM/Bare metal
  – Different type of jobs
  – Locality of data
  – Optimisation of S/W for Intel whilst most CPU servers in Wigner were AMD
  – Configuration options, e.g. XROOT TTreeCache
When comparing like with like only a minimal drop in efficiency
  – However, EOS servers deployed to Wigner
  – Will soon deploy CVMFS service at Wigner
Investigations are still on-going

Lessons Learnt
New facility and hence some teething problems as well as one design issue
Lack of experience on both sides
  – but due to collaborative and flexible approach issues have generally been resolved quickly
Personal contact is VERY important
  – Help with first installations
  – Teams meeting each other
  – Regular telecoms
Good communication is important
Good documentation helps a LOT
  – Still need to improve SLA and other formal arrangements
Things always take longer than foreseen

Conclusions
In general everything is running smoothly
Issues have arisen
  – But in general have been resolved quickly due to flexibility and good relations on both sides
  – VAT and insurances have taken longer due to external parties
100Gbps links have not been as stable as expected
Some questions raised regarding job efficiency
Full power capacity usage will not be possible due to lower power density than expected
With experience it should be possible to produce more detailed formal documents next time (….)
Still waiting to implement more extensive BC
Contract due to run until end of 2019

Thank you for your attention! Questions?