User Board Input Tier Storage Review 21 November 2008 Glenn Patrick Rutherford Appleton Laboratory.

2 UB: Castor Migration Path
- 21 December: CMS CSA06 worked; full production Castor service expected from January. Plan to switch off dCache by 30 June.
- June: Original schedule unrealistic. Agreed that dCache would not be terminated until at least end 2007, with a minimum of 6 months' notice to be given.
- 20 June: Separate ATLAS, CMS, LHCb & Gen Castor instances proposed.
- 24 June: Migration to be completed by end of November.
- November: Still on track... New building also looms.
[Chart: Castor Data]

3 UK Tier 1 – Castor2 Mass Storage
[Architecture diagram: separate CMS, ATLAS, LHCb and Repack/Small User stager instances, each with its own stager, DLF and LSF plus Oracle stager and DLF databases (the repack instance also has an Oracle repack database), driving its own disk servers and tape servers; shared services comprise Name Server 1 (+vmgr), Name Server 2 and the Oracle NS+vmgr database. UB Total = 2222TB]
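For reference, the layout in the diagram can also be written down as a plain data structure. The sketch below only restates what the slide shows; the per-instance component lists are a simplification of the diagram, not a real Castor configuration.

    # Summary of the Castor2 layout sketched on this slide (illustrative only;
    # component lists are a simplification of the diagram, not a real config).
    castor2_layout = {
        "shared_services": ["Name Server 1 (+vmgr)", "Name Server 2", "Oracle NS+vmgr"],
        "stager_instances": {
            "CMS":               ["stager", "DLF", "LSF", "Oracle stager", "Oracle DLF"],
            "ATLAS":             ["stager", "DLF", "LSF", "Oracle stager", "Oracle DLF"],
            "LHCb":              ["stager", "DLF", "LSF", "Oracle stager", "Oracle DLF"],
            "Repack/Small User": ["stager", "DLF", "LSF", "Oracle stager", "Oracle DLF",
                                  "Oracle repack"],
        },
        "per_instance_hardware": ["disk servers", "tape servers"],
        "ub_total_tb": 2222,
    }

    # Each stager instance drives its own disk and tape servers.
    for instance, components in castor2_layout["stager_instances"].items():
        print(f"{instance}: {', '.join(components)}")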

4 A Background of Shrinking Capacity... and of Experiment Uncertainties
- Terabyte (10^12) / Tebibyte (2^40) amnesty: ~10% inflation for those experiments which applied. From 2007/Q4 (see the worked example below).
- Disk0Tape1 caches: overhead not included in (some) experiment requests. Currently ATLAS TB, LHCb – 16.9TB, ALICE – 5.6TB (CMS – n/w buffers?).
- dCache & Castor: duplicate resources held in both dCache and Castor for experiment migration/testing/etc. From 2008/Q1 and before.
- 5% Castor storage inefficiency hit taken (plus capacity audit). Experiments get what their data requires! From 5 Nov.
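The ~10% figure follows directly from the two units: a tebibyte (2^40 bytes) is roughly 10% larger than a terabyte (10^12 bytes), so re-interpreting capacity quoted in one unit as the other shifts allocations by about that amount (the slide does not say in which direction the amnesty was applied). A quick check:

    # Terabyte vs tebibyte: the ratio behind the ~10% "amnesty" inflation.
    TB = 10**12        # terabyte (SI)
    TiB = 2**40        # tebibyte (binary)

    ratio = TiB / TB
    print(f"1 TiB = {ratio:.4f} TB")                 # 1 TiB = 1.0995 TB
    print(f"inflation ~ {(ratio - 1) * 100:.1f}%")   # ~10.0%

    # Example: LHCb's 16.9 TB disk0tape1 cache re-expressed in TiB.
    print(f"16.9 TB = {16.9 * TB / TiB:.1f} TiB")    # ~15.4 TiB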

5 ...But Still Made It!
No headroom! No reserve! LHC pledges ~met.
[Chart: LHCb Start-up Allocations 2008/Q4]

6 Who Has What? (Castor only)
[Chart: allocations for ATLAS, CMS, LHCb]
Not much left over for other experiments (see the rough sum below)!
Reminder: for "Other", the GridPP3 proposal only had:
- T1 Disk 2008 = 18TB, T1 Disk 2009 = 31TB
- T1 Tape 2008 = 180TB, T1 Tape 2009 = 310TB
- T1 CPU = 0
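To put the "Other" numbers in context, here is a rough comparison against the 2222TB UB total quoted on slide 3 (indicative only, since that total is not broken down into disk and tape here):

    # Rough scale of the GridPP3 "Other" allocation against the Castor UB total.
    # Figures are taken from this slide and slide 3; the comparison is only
    # indicative because the 2222 TB total is not split into disk and tape.
    ub_total_tb = 2222          # slide 3: "UB Total = 2222TB"

    other = {
        "disk_2008_tb": 18,
        "disk_2009_tb": 31,
        "tape_2008_tb": 180,
        "tape_2009_tb": 310,
        "cpu": 0,
    }

    disk_share_2008 = other["disk_2008_tb"] / ub_total_tb
    combined_share_2008 = (other["disk_2008_tb"] + other["tape_2008_tb"]) / ub_total_tb
    print(f"'Other' 2008 disk as a fraction of the UB total: {disk_share_2008:.1%}")           # ~0.8%
    print(f"'Other' 2008 disk+tape as a fraction of the UB total: {combined_share_2008:.1%}")  # ~8.9%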

7 “Other” Experiments 1
ALICE: No storage resources deployed in Castor for most of 2008 (requirements revised downwards, deployment deferred due to h/w delays, and ATLAS/CMS/LHCb given priority for CCRC08). Required xrootd (which has not been the highest Castor priority); finally in ALICE production in October. Low on manpower and suffers from having no RAL involvement. Getting back on track for 2009 (5.6TB disk0 to grow to ~90TB). Communications improved (e.g. Cristina).
MINOS: Still migrating dCache/NFS data. Also, the MC system draws down several hundred flux files from Castor at the start of each job (sketched below) – Castor seems to manage the load even from multiple jobs, except for some benign errors. Double disk allowance for migration – last overdraft left? Limited manpower.
[Chart: Batch jobs submitted to UK Tier 1]
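The MINOS start-of-job pattern above, with each batch job pulling several hundred flux files out of Castor before running, would look roughly like the sketch below. The Castor path, file names and file count are hypothetical placeholders, and rfcp is used only as the generic Castor copy command; the real MINOS jobs may well use different tooling.

    # Hedged sketch of a MINOS-style start-of-job stage-in: copy several hundred
    # flux files from Castor to local scratch before the MC job starts.
    # The Castor directory and file names below are hypothetical placeholders.
    import os
    import subprocess
    import sys

    CASTOR_DIR = "/castor/example.site/minos/flux"   # hypothetical path
    LOCAL_DIR = "/tmp/flux"

    def stage_flux_files(filenames):
        """Copy each flux file with rfcp; return the number of failed copies."""
        os.makedirs(LOCAL_DIR, exist_ok=True)
        failures = 0
        for name in filenames:
            # rfcp <castor source> <local destination>
            result = subprocess.run(["rfcp", f"{CASTOR_DIR}/{name}",
                                     f"{LOCAL_DIR}/{name}"])
            if result.returncode != 0:
                failures += 1   # the slide notes some errors are benign; carry on
        return failures

    if __name__ == "__main__":
        flux_files = [f"flux_{i:04d}.root" for i in range(300)]   # "several hundred"
        failed = stage_flux_files(flux_files)
        if failed:
            print(f"{failed} transfers reported errors", file=sys.stderr)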

8 “Other” Experiments 2
Silicon Detector Design Study: Urgent simulation required for physics benchmarks in the Letter of Intent (due April 2009). Enhanced CPU allocation (268 KSI2K) in the absence of LHC work. 4M events reconstructed out of 8M simulated. Castor server + SRM deployed for staging to SLAC, but it took time to deploy (PPD Tier 2 helped out in the meantime). Need to be flexible for this sort of sudden activity...
MICE: Currently setting up. RAL is the “Tier 0” for the experiment – pseudo real-time beam tuning, data distribution, etc. Castor server deployed.
BaBar: The end approaches... a long story since ~Sept 2006! 49.6TB on NFS disk plus ADS tape (35TB allocated).
UKQCD: Plan to access Tier 1 via SRM. Large bid for Tier 1 tape submitted to the HPC call. Need to engage on technical deployment (require 30TB disk; 1.8TB NFS so far). VO enabled; memory requirements?

9 Farewell (sort of)...
For lingering legacy experiments still in dCache, “others” and new small VOs, probably give some minimal deployment in Castor: a shared disk pool.
Others!

10 On the Horizon...
T2K: On the horizon. Some disk already pre-allocated: 16TB out of 20TB.
NA48/3: On the horizon. Some disk pre-allocated: 25TB out of 50TB.
SUPER-B?: Depends on UK proposals. No allocation yet. ~5TB disk1tape1 growing from mid-year.
SUPERNEMO: Mainly Tier 2 so far. Yet to feature at Tier 1 (except perhaps under “other”). No storage allocation.

11 Castor Experiment Planning
[Timeline chart: “Castor Bumpy Ride”, Apr 2007 to Oct 2008 (Apr–Sep 2007 marked)]
Once allocations and overall strategy are agreed in the UB, it is up to the experiments to engage with the T1 team over storage classes, space tokens, SRM end-points, etc. (illustrated below).
A series of weekly meetings evolved to deal with Castor technical issues, with monthly meetings for (mainly) other Tier 1 issues. The success of experiments is correlated with how well they engage at these.
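In practice, engaging with the T1 team amounts to agreeing a mapping from experiment activities to storage classes, space tokens and SRM end-points. The sketch below is purely illustrative: the endpoint URL, token names and sizes are invented placeholders, not RAL's actual configuration.

    # Illustrative (invented) storage plan of the kind an experiment would agree
    # with the Tier 1 team: activity -> storage class, space token, size, endpoint.
    # The endpoint URL, token names and sizes are placeholders, not real values.
    SRM_ENDPOINT = "srm://srm-example.gridpp.rl.ac.uk:8443/srm/managerv2"  # placeholder

    storage_plan = [
        # (activity,            storage class, space token,     size in TB)
        ("raw data custodial",  "D0T1",        "EXPT_RAW",       100),
        ("reprocessing output", "D1T1",        "EXPT_REPRO",      50),
        ("analysis disk pool",  "D1T0",        "EXPT_ANALYSIS",   30),
    ]

    for activity, storage_class, token, size_tb in storage_plan:
        print(f"{activity:22s} class={storage_class}  token={token:15s} "
              f"{size_tb:4d} TB  endpoint={SRM_ENDPOINT}")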

12 Looking Forward
- Can't keep everybody happy all of the time. PPRP numbers are assumed for the LHC experiments, and the smaller experiments only get the scraps that are left over.
- What will real LHC data look like? Background, increased event sizes, trigger rates, etc. can swallow up resources for the smaller experiments. What are the error bars? We won't know until later in 2009…
- Smaller experiments (including ALICE) are limited by manpower. Good communications + technical help (e.g. Janusz, Shaun, Matt, Catalin, Derek…) and documentation (Stephen) are vital.
- Evolution of regular (weekly/monthly) storage/T1 meetings has been very successful and efficient. Need to be more pro-active here (make it part of the process and a condition of deploying allocations).
- Improve disk deployment and make it more flexible (server quantisation). Easy for me to say, of course.
- Need to be realistic about the effort all this takes (T1 + experiments)!