CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it Tier0 Status - 1 Tier0 Status Tony Cass LCG-LHCC Referees Meeting 6th July 2009.

Presentation transcript:

Tier0 Status - 1: Tier0 Status
Tony Cass, LCG-LHCC Referees Meeting, 6th July 2009

Tier0 Status - 2: Agenda
- Resources
- CASTOR status and performance
- Progress with new data centre project

Tier0 Status - 3: Agenda
- Resources
- CASTOR status and performance
- Progress with new data centre project

Tier0 Status - 4: Procurements
2009 Status & 2010 outlook
- CPU & Disk
  - ~60% of foreseen 2009 pledges available in April (additional ATLAS request not included).
  - Balance to be operational in October. Tight schedule, but agreed with Purchasing dept.
  - Exploring options to purchase iSCSI disk storage: greater cost/TB, but avoids interruption to CASTOR service due to disk server failure (#1 cause of incidents; disk failures are handled transparently).
  - 2010 procurement planning underway. Tenders issued in June; adjudication in ~November.
- Tape
  - Expect ~20PB spare capacity by October.
  - Will purchase “high density” IBM robot in autumn: 14,000 slots, 14PB.
  - Can convert an existing IBM robot to “high density” version in 2010 (with no service interruption) if additional capacity required.
Slide annotations: February Status; On Schedule; Purchased.
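As a quick sanity check on the tape figures quoted on this slide (14,000 slots, 14PB), the sketch below computes the per-slot capacity they imply. It assumes decimal units (1PB = 1000TB) and one cartridge per slot; the helper name is hypothetical and not taken from the presentation.

```python
# Back-of-envelope check of the tape robot figures quoted on the slide.
SLOTS = 14_000           # from the slide
ROBOT_CAPACITY_PB = 14   # from the slide
TB_PER_PB = 1000         # assumption: decimal units


def implied_tb_per_slot(slots: int, capacity_pb: float) -> float:
    """Capacity per slot implied by a total capacity (hypothetical helper)."""
    return capacity_pb * TB_PER_PB / slots


print(implied_tb_per_slot(SLOTS, ROBOT_CAPACITY_PB))
```

Under those assumptions the slide's numbers correspond to about 1TB per slot.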

Tier0 Status - 5: Resource Usage Efficiency (VO Boxes)
- Now have 167 boxes dedicated to running VO-specific functions. CPU utilisation is poor.
- Clear opportunity to reduce server count (and power consumption) through virtualisation.
- Consolidation project underway.
  - Requires reliable storage for virtual machine images as we need to be able to support virtual machine migration.
  - Production service expected by end [...]
- Scheduling of virtual machine images for batch demonstrated end-June.
  - Expect autumn hardware to be installed with hypervisors.
  - More work needed to allow LCG-wide scheduling of virtual machines, however.

Tier0 Status - 6: SLC5 Migration
- Migration of batch resources underway.
  - All new capacity introduced will be SLC5 based.
  - Existing capacity migrated progressively.
- Migration of LXPLUS alias is an issue:
  - Principle is easy: switch when majority of batch capacity is SLC5. But the outcome depends on where that is measured: at CERN, switch early; on the grid, switch late.
  - No clear/obvious solution yet. [Rapid migration of other grid sites would help. And is maybe sensible before September anyway?]
Slide annotations: February Status; Still an issue; To be discussed at GDB on Wednesday.
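The "switch when majority of batch capacity is SLC5" principle can be written as a one-line rule, which also shows why the measurement scope matters. This is only a sketch: the function name and the capacity figures are made up for illustration and do not come from the presentation.

```python
def should_switch_alias(slc5_capacity: float, total_capacity: float) -> bool:
    """Return True once SLC5 provides a strict majority of batch capacity."""
    if total_capacity <= 0:
        raise ValueError("total_capacity must be positive")
    return slc5_capacity > total_capacity / 2


# Illustrative, made-up capacity figures (not from the slides).
cern = {"slc5": 60.0, "total": 100.0}
grid = {"slc5": 30.0, "total": 100.0}

print(should_switch_alias(cern["slc5"], cern["total"]))
print(should_switch_alias(grid["slc5"], grid["total"]))
```

With these hypothetical numbers the same rule says "switch" when measured at CERN and "wait" when measured across the grid, which is the tension the slide describes.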

Tier0 Status - 7: Agenda
- Resources
- CASTOR status and performance
- Progress with new data centre project

Tier0 Status - 8: Agenda
- Resources
- CASTOR status and performance
  - CASTOR status & plans
  - Metrics
- Progress with new data centre project

Tier0 Status - 9: CASTOR Status & Plans
February Status
- Status
  - Generally quiet/good...
  - ... except for tape repack
- BUT we are reasonably confident about our ability to support production; user analysis is the concern and there is no major load.
- CASTOR 2.1.8, with integrated xrootd redirector, should deliver improvements for analysis.
  - LSF bypass & reduced latency, but also improved scalability as xrootd daemon has smaller footprint than rfio (to be deprecated?)
- Also delivers
  - end-to-end checksumming for rfio
  - user space accounting (required for later deployment of quotas)
  - operational improvements (notably automatic draining of disk servers)
  - fixes to problems identified by repack (main reason for deployment delays)
- Schedule: end-Feb release, in production on c2cernt3 end-March, deployment for experiment instances in April.
Slide annotations:
- CASTOR deployment delayed, but in production for STEP.
- Improved xrootd implementation being tested by ALICE.
- Excellent performance for STEP.
- Repack much improved, even if some concerns remain.
- CASTOR readiness for Tier0/Tier1 production confirmed.
- Still lack experience supporting heavy analysis load. Need this experience to understand if/where improvements needed.

Tier0 Status - 10: Performance metrics
- Metrics have been implemented and deployed on preproduction cluster.
  - Data collected in Lemon.
  - RRD graphs not yet implemented.
- Production deployment delayed for several reasons.
  - New metrics imply several changes to exception/alarms and automated actions used in production.
  - An unexpected technical dependency on the late SRM 2.7 version.
  - Ongoing work to back-port the implementation.
Slide annotations: All still true; November Status.
- All but two of the agreed performance metrics now available via Lemon.
- Exceptions are
  - SRM time to TURL, which needs SRM 2.8 for reasonable time-stamp granularity;
  - migration rate, which was available but is currently broken after system update.

Tier0 Status - 11: CASTORATLAS (courtesy Miguel Santos)
[Figure labels: STEP’09; Adaptive migrations]
- Need to add plot with migration rate. Currently missing, to be done by fixing iptables with next sensor.

Tier0 Status - 12: CASTORATLAS (courtesy Miguel Santos)
- Average read open time of 4s (see disk cache read scores)
- Average write open time of 1.4s
- Peaks of 400 running transfers
- Peaks of 20 pending transfers
- Using ~22 tape drives
- ~5% of available transfer slots used.
- Only Tier0 function (t0atlas) exercised!
[Figure label: SRM time to TURL]

Tier0 Status - 13: Agenda
- Resources
- CASTOR status and performance
- Progress with new data centre project

Tier0 Status - 14: New data centre project
Reminder: the selected strategy is to do a single tender for an overall solution.
Four-phase process developed:
1. Request (many) conceptual designs
2. Commission 3-4 companies submitting conceptual designs to develop an outline design
3. In-house, turn a selected outline design into plans and documents enabling
4. Single tender for overall construction.

Tier0 Status - 15: Outline Design Phase
November Status
- Deadline: 28th November
  - Contacts with all 4 companies during design phase.
  - All 4 companies say deadline will be met.
- Meetings to review proposed designs scheduled in week of December 8th.
- Market Survey in preparation as first stage in selection of company for detailed design & construction.
- Discussions in Oslo on 28th November to further investigate possible remote server installation in 2011 (and beyond).
  - RAL also have power available in 2011, but not as much and for a shorter period.

Tier0 Status - 16: Current Status
February Status
- Four designs reviewed.
  - No clear winner, but consensus on leading design.
- New Management supports project. Good, but…
  - New requirements: “Green” & Prévessin heat recovery option.
  - New organisation brings new players to brief.
- “Single Contract for construction” agreed.
- Agreement to work with one company to deliver fully acceptable design with modifications for new requirements.
  - Will lead to ~6 month delay.
  - [Personal view] Plan to continue with only one company should be agreed by Directorate now to avoid potential hiccups later. Frédéric Hemmer discussing with Sergio Bertolucci.
- Will need to revisit option to install equipment at University of Oslo.

Tier0 Status - 17: Current Status
- “New organisation brings new players to brief”
  - Wolfgang von Rüden asked to review past assumptions (e.g. no power available on Meyrin site); reported early June.
  - Meeting with Sergio Bertolucci organised for July 20th.
- New interest from Norway to provide centre for CERN near Stavanger; design and costs to be available by end-August.
- Project delayed by 6-9 months. However, latest power projections extend usable lifetime of B513 by 1 year.

Tier0 Status - 18: Questions? Comments?