NextGen Storage
Shaun de Witt (STFC)
With contributions from: James Adams, Rob Appleyard, Ian Collier, Brian Davies, Matthew Viljoen

Introduction
Why are we doing this?
What were the evaluation criteria?
Candidates
Selections
–And omissions
Tests
Status
Timeline
–Aim for production service for 2014 data run

Why are we doing this?
CASTOR issues
–CASTOR optimised for many disk servers
–Scheduling overhead/'hot spotting'
  TransferManager (LSF replacement) has improved this A LOT
–Oracle costs!
–Nameserver is a SPOF
–Not a mountable file system
  Limits take-up outside of WLCG/HEP
–Requires specific expertise
Disk-only not as widely deployed
–EOS (CERN ATLAS+CMS), DPM (ASGC)
–Could cause delays in support resolution
–Reduced 'community support'
–Risk to meeting future requirements for disk-only

Criteria
'Mandatory':
–Must make more effective use of existing h/w
–Must not restrict hardware purchasing
–Must be able to use ANY commodity disk
–Must support end-to-end checksumming
–Must support ADLER32 checksums (see the sketch after this slide)
–Must support xrootd and gridFTP
–Must support X509 authentication
–Must support ACLs
–Must be scalable to 100s of petabytes
–Must scale to > files
–Must at least match the I/O performance of CASTOR
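As an aside on the end-to-end checksumming and ADLER32 requirements, a minimal sketch of how a client-side ADLER32 value can be computed for comparison with whatever the storage system records after a transfer (the file path is a placeholder; this is illustrative, not part of the evaluation harness):

```python
import zlib

def adler32_of_file(path, chunk_size=1024 * 1024):
    """Compute the ADLER32 checksum of a file, reading it in 1 MB chunks."""
    checksum = 1  # ADLER32 is seeded with 1
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")

# Compare this value with the checksum the storage system reports for the
# same file (e.g. via its gridFTP/xrootd metadata) to verify end-to-end.
print(adler32_of_file("/tmp/testfile"))  # placeholder path
```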

Criteria
Desirable:
–Should provide an NFS-mountable file system
–Should be resilient to hardware failure (disk/memory/node)
–Should support checksum algorithms other than ADLER32
–Should be independent of licensed software
–Any required database should be supported by the STFC DB team
–Should provide an HTTP or WebDAV interface
–Should support an SRM interface
–Should provide a POSIX interface and file protocol
–Should support username/password authentication
–Should support Kerberos authentication
–Should support 'hot file' or 'hot block' replication

What else are we looking for?
Draining
–Speed up deployment of new hardware and decommissioning of old
Removal
–Great if you can get data in fast, but if deletes are slow... (see the timing sketch after this slide)
Directory entries
–We already know experiments have large numbers of files in each directory
Support
–Should have good support and wide usage
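To make the removal and directory-entry points concrete, a hypothetical micro-benchmark (not part of the original test plan) that times creating, listing and deleting many small files in a single directory on a mounted file system; the path and file count are placeholders:

```python
import os
import time

def time_dir_operations(base_dir, n_files=10_000):
    """Time bulk create / list / delete of small files in one directory."""
    os.makedirs(base_dir, exist_ok=True)

    start = time.monotonic()
    for i in range(n_files):
        with open(os.path.join(base_dir, f"f{i:07d}"), "wb") as f:
            f.write(b"x")
    create_s = time.monotonic() - start

    start = time.monotonic()
    entries = list(os.scandir(base_dir))   # listing a directory with many entries
    list_s = time.monotonic() - start

    start = time.monotonic()
    for entry in entries:
        os.unlink(entry.path)              # deletion rate matters as much as ingest
    delete_s = time.monotonic() - start

    print(f"create {n_files/create_s:.0f}/s, list {list_s:.2f} s, "
          f"delete {n_files/delete_s:.0f}/s")

# Example (placeholder path on the candidate file system):
# time_dir_operations("/mnt/candidate-fs/dirtest")
```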

Candidates
HEP(-ish) solutions
–dCache, DPM, StoRM+Lustre, AFS
Parallel and 'scale-out' solutions
–HDFS, OpenStack, OrangeFS, MooseFS, Ceph, Tahoe-LAFS, XtreemFS, GfarmFS, GlusterFS, GPFS
Integrated solutions
–IBM SONAS, Isilon, Panasas, DDN, NetApp
Paper-based review
–plus experience where we know sites that run these systems in operations

Selection

Storage Group
Some interaction with the Storage Group
–But this will increase
–SdW to attend the biweekly meeting and report as required
–Already using Chris at QMUL for Lustre support
–Used Sam's experience with an earlier version of Ceph

Selections
dCache
–Still has SPOFs
Ceph
–Still under development; no lifecycle checksumming
HDFS
–Partial POSIX support using FUSE
OrangeFS
–No file replication; fault-tolerant version in preparation
Lustre
–Requires a server kernel patch (although the latest release doesn't); no hot-file replication?

Some rejections...
DPM
–Well known and widely deployed within HEP
–No hot-file replication; SPOFs; NFS interface not yet stable
AFS
–Massively scalable (>25k clients)
–Not deployed in 'high-throughput production'; cumbersome to administer; security concerns
GPFS
–Excellent performance
–To get all required features, locked into IBM hardware; licensed
EOS
–Good performance, automatic file replication
–No support

Tests
IOZone tests
–À la disk server acceptance tests
Read/write throughput tests (sequential and parallel) (see the sketch after this slide)
–file/gridFTP/xroot
Deletions
Draining
Fault tolerance
–Rebuild times
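For the sequential throughput tests via the file protocol, a minimal sketch (assuming a POSIX mount of the candidate file system; not the actual IOZone-based harness) of measuring write and read rates for a single large file:

```python
import os
import time

def sequential_throughput(path, size_mb=1024, block_mb=4):
    """Measure sequential write then read throughput (MB/s) for one file."""
    block = b"\0" * (block_mb * 1024 * 1024)
    n_blocks = size_mb // block_mb

    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())     # ensure data has reached the storage system
    write_mbs = size_mb / (time.monotonic() - start)

    start = time.monotonic()
    with open(path, "rb") as f:
        while f.read(block_mb * 1024 * 1024):
            pass
    read_mbs = size_mb / (time.monotonic() - start)

    os.unlink(path)
    return write_mbs, read_mbs

# Example (placeholder path; note the read may be served from the client page
# cache unless caches are dropped between the write and read phases):
# print(sequential_throughput("/mnt/candidate-fs/tp_test.dat", size_mb=2048))
```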

Status
dCache
–Under deployment; testing not yet started
Lustre
–Deployed
–IOZone tests complete, functional tests ongoing
OrangeFS
–Deployed
–IOZone tests complete, functional tests ongoing
Ceph
–RPMs built, being deployed
HDFS
–Installation complete; problems running IOZone tests, other tests ongoing

Timeline
Provisional
–Dec 2012 – Final selection (primary and reserve)
–Mar 2013 – Pre-production instance available
  WAN testing and tuning
–Jun 2013 – Production instance
Depends on...
–Hardware
–Quattor profiling
–SL6 support (and IPv6 readiness)
–Results from the test instance
Open questions:
–One instance, or one per experiment as now
–Migration from CASTOR