CASTOR and EOS status and plans
Giuseppe Lo Presti, on behalf of the CERN IT-DSS group
Data & Storage Services, CERN IT Department, CH-1211 Genève 23, Switzerland (www.cern.ch/it)
20th HEPiX - Vancouver - October 2011

Outline
CASTOR and EOS strategies
CASTOR status and recent improvements
– Disk scheduling system
– Tape system performance
– Roadmap
EOS status and production experience
– EOS architecture
– Operations at CERN
– Roadmap/outlook

CASTOR and EOS
Strategy:
Keep Tier0/production activity in CASTOR
– Not necessarily only tape-backed data
– Typically larger files
– Focus on tape performance
Move xroot-based end-user analysis to EOS
– Disk-only storage
– Focus on light(er) metadata processing

Data in CASTOR
[chart omitted]

Key tape numbers
55 PB of data, 320M files
Peak writing speed: 6 GiB/s (Heavy Ion run, 2010)
Infrastructure:
– 5 CASTOR stager instances
– 7 libraries (IBM + STK), 46K 1 TB tapes, ~5K 4 TB or 5 TB tapes
– Enterprise drives (T10000B, TS1130, T10000C)

CASTOR's new disk scheduler
Transfer Manager, replacing LSF in CASTOR
Stress tested
– Performance ~10x higher than peak production levels
– Production throttled at 75 Hz (25 Hz per node)
In production on all instances at CERN and at ASGC
– Staged roll-in: first ATLAS, then CMS, then everybody else
– Current release includes fixes for all observed issues; smooth operations since then
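The per-node throttle quoted above can be pictured as a simple rate limiter on transfer starts. The sketch below is illustrative only: the class and parameter names are not taken from the CASTOR code base, it merely assumes a cap of 25 starts per second per disk server.

```python
import time

class TransferThrottle:
    """Illustrative token-bucket limiter: at most `rate_hz` transfer starts per second per node."""
    def __init__(self, rate_hz=25.0, burst=25):
        self.rate = rate_hz          # sustained starts per second
        self.capacity = burst        # maximum burst size
        self.tokens = burst
        self.last = time.monotonic()

    def try_start(self):
        """Return True if a new transfer may be scheduled now."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# One throttle per disk server: 3 nodes at 25 Hz each give ~75 Hz instance-wide.
throttles = {node: TransferThrottle(25.0) for node in ("node01", "node02", "node03")}
```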

Increasing tape performance
Improving read performance
– Recall policies already in production for ~1 year
Improving write performance
– Implemented buffered tape marks over multiple files
  Theoretically approaching the drive's native speed regardless of file size
  In practice, various overheads limit this
– Soon available for wide deployment
  Currently being burned in on a stager dedicated to repack operations
  Working on simplifying and optimizing the stager database by using bulk interfaces
  Expected timeframe for production deployment: spring 2012

Increasing tape performance
Measuring tape drive speed
– Current data rate to tape: MiB/s
  Dominated by the time to flush the tape mark for each file
  Average file size ~200 MB
– Preliminary tests with an STK T10000C
  Tape server with 10 GigE interface
  195 MiB/s avg., 214 MiB/s peak
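The effect of per-file tape-mark flushes can be seen with a small back-of-the-envelope model. Only the ~200 MB average file size comes from the slide; the native drive rate and flush time below are assumed, illustrative numbers.

```python
# Illustrative model of per-file tape-mark flush overhead (only the 200 MB average
# file size is from the talk; the other numbers are assumptions).
MiB = 1024 * 1024
file_size   = 200 * 1000 * 1000 / MiB   # ~200 MB average file, in MiB
native_rate = 200.0                     # assumed drive native speed, MiB/s
flush_time  = 2.0                       # assumed cost of a synchronous tape-mark flush, s

def effective_rate(files_per_flush):
    """Average write rate when a tape mark is flushed every `files_per_flush` files."""
    data = files_per_flush * file_size
    return data / (data / native_rate + flush_time)

print(effective_rate(1))    # flush after every file: throughput well below native speed
print(effective_rate(50))   # buffered tape marks: approaches the native drive rate
```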

Roadmap
Towards fully supporting small files
– Buffered tape marks and bulk metadata handling
– In preparation for the next repack exercise in 2012 (~40 PB archive to be moved)
Further simplification of the database schema
– Still keeping the full-consistency approach; NoSQL solutions deliberately left out
Focus on operations

Outline
CASTOR and EOS strategies
CASTOR status and recent improvements
– Disk scheduling system
– Tape system performance
– Roadmap
EOS status and production experience
– EOS architecture
– Operations at CERN
– Roadmap/outlook

EOS: what is it?
Easy-to-use, standalone, disk-only storage for user and group data, with an in-memory namespace
– Few-ms read/write open latency
– Focused on end-user analysis with chaotic access
– Based on the XROOT server plugin architecture
– Adopting ideas implemented in Hadoop, XROOT, Lustre et al.
– Running on low-cost hardware, no high-end storage
– Complementary to CASTOR

Architecture
Three components, each implemented as a plugin in the xrootd server:
MGM (Management Server)
– Pluggable namespace, quota
– Strong authentication
– Capability engine
– File placement, file location
MQ (Message Queue)
– Service state messages
– File transaction reports
– Shared objects (queue + hash)
FST (File Storage)
– File and file metadata store
– Capability authorization
– Check-summing and verification
– Disk error detection (scrubbing)
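The interplay of the components can be pictured as a redirect-plus-capability flow: the client asks the MGM, which resolves the namespace entry, picks a location and hands back a signed capability, and the client then talks directly to the chosen FST. The sketch below is a hand-written illustration of that idea, not EOS code; every class, field and host name in it is made up.

```python
# Rough sketch of an open-for-read flow in a redirect-based architecture such as EOS.
# All classes and fields here are illustrative, not actual EOS interfaces.
from dataclasses import dataclass

@dataclass
class Capability:
    path: str
    fst_host: str        # file server chosen by the file-location step
    mode: str            # "r" or "w"
    signature: str       # signed by the MGM so the FST can verify it on its own

class MGM:
    """Namespace + scheduling: knows where replicas live, issues signed capabilities."""
    def __init__(self, namespace):
        self.namespace = namespace          # path -> list of FST hosts

    def open(self, path, mode="r"):
        fst = self.namespace[path][0]       # pick one replica (placement policy omitted)
        return Capability(path, fst, mode, signature="...")

class FST:
    """File storage node: verifies the capability, then serves the data."""
    def serve(self, cap: Capability) -> bytes:
        assert cap.signature                # capability check instead of a namespace lookup
        return b"<file contents>"

# Client side: ask the MGM, get redirected to an FST, read the data there.
mgm = MGM({"/eos/user/foo": ["fst01.example.org"]})
cap = mgm.open("/eos/user/foo")
data = FST().serve(cap)
```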

Access protocol
EOS uses XROOT as its primary file access protocol
– The XROOT framework allows flexibility for enhancements
Protocol choice is not the key to performance, as long as it implements the required operations
– Client caching matters most
Actively developed, towards full integration in ROOT
SRM and GridFTP provided as well
– BeStMan, GridFTP-to-XROOT gateway
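From the user's side, access looks like any other xroot endpoint. A minimal sketch, assuming the standard xrdcp client is installed; the instance name and file path are hypothetical, chosen only for illustration.

```python
# Minimal example of fetching a file over the xroot protocol with the standard xrdcp client.
import subprocess

src = "root://eosexample.cern.ch//eos/example/user/j/jdoe/ntuple.root"  # hypothetical path
dst = "/tmp/ntuple.root"

# xrdcp <source> <destination> is the basic copy invocation of the xroot client.
subprocess.run(["xrdcp", src, dst], check=True)
```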

Features
Storage with single disks (JBODs, no RAID arrays)
– Redundancy by software, using cheap and unreliable hardware
Network RAID within disk groups
– Currently file-level replication
Online file re-replication
– Aiming at reduced/automated operations
Tunable quality of service
– Via redundancy parameters
Optimized for reduced latency
– Limits the namespace size and number of disks to manage
– Currently operating with 40M files and 10K disks
Additional scaling achieved by partitioning the namespace
– Implemented by deploying separate instances per experiment
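The "network RAID within disk groups" point boils down to placing the copies of a file on disks that do not share a failure domain. The sketch below illustrates that placement constraint only; it is not the actual EOS scheduler, and all names are invented.

```python
# Illustrative replica placement: put N copies of a file on disks belonging to
# different nodes of one scheduling group, so a single host failure never takes
# out all replicas. Not the actual EOS scheduler, just the idea.
import random
from collections import defaultdict

def place_replicas(disks, n_replicas=2):
    """disks: list of (node, disk_id). Returns n_replicas disks on distinct nodes."""
    by_node = defaultdict(list)
    for node, disk in disks:
        by_node[node].append((node, disk))
    if len(by_node) < n_replicas:
        raise RuntimeError("not enough independent nodes for the requested redundancy")
    chosen_nodes = random.sample(list(by_node), n_replicas)
    return [random.choice(by_node[node]) for node in chosen_nodes]

group = [("fst01", "d1"), ("fst01", "d2"), ("fst02", "d1"), ("fst03", "d1")]
print(place_replicas(group, n_replicas=2))   # two replicas, two different hosts
```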

Self-healing
Failures don't require immediate human intervention
– Metadata server (MGM) failover
– Disk drains automatically triggered by I/O or pattern-scrubbing errors, after a configurable grace period
  Drain time on the production instance: < 1 h for a 2 TB disk (10-20 disks per scheduling group)
– Sysadmin team replaces disks 'asynchronously', using admin tools to remove and re-add filesystems
  Procedure and software support are still undergoing refinement/fixing
Goal: run with best-effort support
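A quick back-of-the-envelope check of the quoted drain time, assuming the re-replication traffic is spread evenly over the other disks of the scheduling group (an assumption, not a statement about the actual EOS drain logic):

```python
# Aggregate rate needed to drain a 2 TB disk in under an hour, and the resulting
# per-disk rate when 10-20 group members share the work.
disk_size_tb = 2.0
drain_time_s = 3600                      # "< 1 h" from the slide
aggregate_mb_s = disk_size_tb * 1e6 / drain_time_s
print(f"aggregate rate needed: ~{aggregate_mb_s:.0f} MB/s")     # ~560 MB/s

for group_disks in (10, 20):
    per_disk = aggregate_mb_s / group_disks
    print(f"{group_disks} disks per group -> ~{per_disk:.0f} MB/s per disk")
```

At a few tens of MB/s per disk, the drain stays well within what a single commodity drive can sustain, which is what makes the sub-hour target plausible.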

Entering production
Field tests done (Oct 2010 – May 2011) with ATLAS and CMS; in production since the summer
EOS currently used in EOSCMS/EOSATLAS
– Software in bug-fixing mode, though with frequent releases
Pool migration from CASTOR to EOS ongoing
– Currently at 2.3 PB usable in CMS, 2.0 PB in ATLAS
– Required changes in the experiment frameworks
  User and quota management, user mapping
  Job wrappers
  Etc.
– Several pools already decommissioned in CASTOR
  E.g. CMSCAF

Statistics
[plots: ATLAS instance throughput over one month (entire traffic and GridFTP gateway); ATLAS instance file operations per second; pool throughput during a node drain; CMS instance hardware evolution]

Roadmap
Next EOS release expected by the end of the year
Main features
– File-based redundancy over hosts
  Dual-parity RAID layout driver (4+2)
  ZFEC driver (Reed-Solomon, N+M, user-defined)
  Integrity and recovery tools
– Client bundle for user EOS mounting (krb5 or GSI)
  MacOSX
  Linux 64-bit
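The space cost and fault tolerance of these layouts follow directly from the N+M parameters. The small calculation below just spells that out for the 4+2 layout named on the slide, plain 2-replica for comparison, and one made-up example of a user-defined Reed-Solomon setting.

```python
# Space overhead and fault tolerance of N+M redundancy layouts: a file is split
# into N data stripes plus M parity stripes, and any M stripe losses are tolerated.
layouts = {
    "2 replicas (current)": (1, 1),   # 1 data copy + 1 extra copy
    "dual parity 4+2":      (4, 2),
    "Reed-Solomon 10+4":    (10, 4),  # illustrative user-defined N+M choice
}

for name, (n, m) in layouts.items():
    overhead = (n + m) / n            # raw space used per byte of user data
    print(f"{name:22s} overhead x{overhead:.2f}, survives {m} failure(s)")
```

The point of the user-defined N+M driver is exactly this trade-off: lower space overhead than plain replication for the same or better failure tolerance, at the cost of more complex reads and repairs.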

Conclusions
CASTOR is in production for the Tier0
– New disk scheduler component in production
– New buffered tape marks soon to be deployed
EOS is in production for analysis
– Two production instances running, the result of very good cooperation with the experiments
– Expand usage and gain more experience
– Move from fast development and release cycles to a reliable production mode