Data Management Activities at CERN
Support for Concurrent Data Analysis by Many End-Users
Ákos Frohner, EGEE'08, September 2008

Outline
- What can we do now? The LHC data flow
- What else is needed? Analysis requirements
- How can we do that? Projects

What we can do now: handling the LHC data flow
- CERN Advanced STOrage Manager (CASTOR): through the computer center
  - From the experiments to disk
  - From disk, migration to tape
  - From disk to remote sites
  - Recall from tape
- File Transfer Service (FTS): to the Tier-1 centers
  - Regulated transfers to remote sites
- DPM: SE for Tier-2 centers
(a toy sketch of this flow follows below)
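As a rough illustration of the flow just listed (a sketch only; the step labels below are paraphrased from this slide and are not CASTOR or FTS component names):

```python
# Toy model of the Tier-0 data flow described above. Illustration only:
# the labels paraphrase the slide, they are not real interface names.
FLOW = [
    ("experiment -> CASTOR disk", "raw data lands on the disk cache"),
    ("CASTOR disk -> tape",       "migration to the tape back-end"),
    ("CASTOR disk -> Tier-1",     "export, regulated by FTS transfers"),
    ("tape -> CASTOR disk",       "recall on demand"),
]
for step, description in FLOW:
    print(f"{step:26s} {description}")
```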

Data flow in numbers
- LHC data: 15 PB/year
- CASTOR (September 2008): 13 PB of data, 100 M files
- FTS: 1500 MB/s average aggregated traffic during the last service challenge
- DPM: more than 100 instances
(a rough reading of these numbers follows below)
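A back-of-the-envelope reading of these figures (a sketch; decimal units and a full-year average are assumed, so real sustained rates during data taking are higher):

```python
# Rough arithmetic on the slide's numbers; decimal PB/MB assumed.
PB = 1e15
YEAR = 365 * 24 * 3600                 # seconds per year

ingest_rate = 15 * PB / YEAR           # implied by "15 PB/year"
avg_file_size = 13 * PB / 100e6        # implied by "13 PB in 100 M files"

print(f"15 PB/year ~ {ingest_rate / 1e6:.0f} MB/s averaged over the year")  # ~0.5 GB/s
print(f"13 PB / 100 M files ~ {avg_file_size / 1e6:.0f} MB per file")       # ~130 MB
```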

Outline
- What can we do now? Some impressive numbers
- What else is needed? Analysis requirements
- How can we do that? Projects

What else is needed?
WLCG planned for analysis to run at Tier-2 sites; recently it is also foreseen at CERN.
Analysis activities need:
- Random data access
- Low latency
- Small files
- Multiple disk-only copies
Improved usability:
- Easy access (e.g. a mounted file system)
- Flexible authorization, accounting and quotas
Tier-2 sites with disk-only storage (mostly DPM) can meet these requirements more easily.

Detailed requirements
Requirements in numbers:
- File opens: O(1000/s), currently O(10/s)
- Latency: O(10 ms), currently O(10 s)
End-user use cases (different from the coordinated data flow):
- Interactive usage (browsing, small changes)
- Random access patterns (hot servers)
- Tendency to use up space (hence the need for quotas)
- Users should see only their own files (authorization)
Data access should be easy: POSIX open() via a libc pre-load, FUSE or a network file system (see the sketch below)
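A minimal sketch of what "easy access" means in practice, assuming the name space is exposed through a FUSE or network-file-system mount; the mount point and file path are invented for illustration:

```python
import os

# Assumption: the CASTOR name space is mounted at /castor (hypothetical),
# so plain POSIX calls work without a storage-specific client library.
path = "/castor/cern.ch/user/a/auser/analysis/ntuple_0001.root"

fd = os.open(path, os.O_RDONLY)    # ordinary POSIX open()
chunk = os.read(fd, 1024)          # small, random reads: the analysis access pattern
os.close(fd)
print(f"read {len(chunk)} bytes from {path}")
```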

Outline
- What can we do now? Some impressive numbers
- What else is needed? Analysis requirements
- How can we do that? Projects

How can we do that?
- Security, accounting, quota
- Xrootd
- Stager improvements
- Integrated monitoring
- Database consolidation
- Tape efficiency

Security
Authentication:
- By host (internal usage, e.g. disk-to-tape transfers)
- Kerberos 5 (existing infrastructure)
- X.509 certificates/VOMS (grid use cases)
Authorization:
- POSIX ACLs
- Pool protection (guaranteed space and bandwidth)
Accounting:
- Mapping of each identity to a single uid/gid (see the sketch below)
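A toy sketch of the accounting idea on this slide: every authenticated identity (Kerberos principal, X.509 DN, VOMS attributes) ends up as a single uid/gid pair used for authorization and accounting. The table entries and numeric IDs are invented, not the real CASTOR configuration:

```python
# Illustration only: a toy identity -> (uid, gid) mapping.
IDENTITY_MAP = {
    "auser@CERN.CH":                     (10231, 1307),  # Kerberos 5 principal
    "/DC=ch/DC=cern/OU=Users/CN=A User": (10231, 1307),  # X.509 DN, same person
}
VO_GROUP_MAP = {"atlas": 1307, "cms": 1399}              # VOMS VO -> gid

def map_identity(principal_or_dn, vo=None):
    """Return the (uid, gid) under which requests are authorized and accounted."""
    uid, gid = IDENTITY_MAP[principal_or_dn]
    if vo is not None:
        gid = VO_GROUP_MAP[vo]           # VOMS attributes can steer the group
    return uid, gid

print(map_identity("/DC=ch/DC=cern/OU=Users/CN=A User", vo="atlas"))  # (10231, 1307)
```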

Xrootd
Fully integrated access protocol for CASTOR:
- Multi-user support (Kerberos 5, X.509/VOMS)
- Name space mapped into the CASTOR name server
Stager integration:
- Tape back-end (migration and recall)
- Selection of stager and service classes: each VO can have its own guaranteed services (see the sketch below)
Bypassing the I/O scheduler:
- Reduced latency for reads
- Large number of open files
Experimental feature: mountable via FUSE
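A sketch of how an xrootd-style URL can carry both the CASTOR name-space path and a per-VO service class. The host name, path and query parameter below are illustrative only; the real redirector configuration and client API are not shown:

```python
from urllib.parse import urlparse, parse_qs

def parse_castor_url(url):
    # Split an xrootd-style URL into redirector host, name-space path and
    # the requested service class (defaulting if none is given).
    u = urlparse(url)
    svcclass = parse_qs(u.query).get("svcClass", ["default"])[0]
    return u.hostname, u.path, svcclass

url = "root://castoratlas.cern.ch//castor/cern.ch/atlas/data/file.root?svcClass=atlasdata"
print(parse_castor_url(url))
# ('castoratlas.cern.ch', '//castor/cern.ch/atlas/data/file.root', 'atlasdata')
```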

Stager improvements
- Quota per pool/group/user (see the sketch below)
- Removal of LSF as the I/O scheduler
  - Currently file migration/recall and transfers are scheduled through the LSF job scheduler
  - Goal: reduced latency for reads and writes
- Multiple disk copies
  - Increased disk pool sizes allow multiple copies
  - Can replace tape backup (e.g. for small files)
  - Guaranteed resources for production users
- File clustering: an often-used application-level concept, but hard to capture at the storage level
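A toy sketch of what a per-pool quota check could look like; the data structure, IDs and limits are invented, not the stager schema:

```python
from collections import defaultdict

class PoolQuota:
    """Illustration only: admit new disk copies while users and groups stay under quota."""
    def __init__(self, user_limits, group_limits):
        self.user_limits = user_limits            # uid -> bytes allowed in this pool
        self.group_limits = group_limits          # gid -> bytes allowed in this pool
        self.user_used = defaultdict(int)
        self.group_used = defaultdict(int)

    def try_allocate(self, uid, gid, nbytes):
        if self.user_used[uid] + nbytes > self.user_limits.get(uid, 0):
            return False
        if self.group_used[gid] + nbytes > self.group_limits.get(gid, 0):
            return False
        self.user_used[uid] += nbytes
        self.group_used[gid] += nbytes
        return True

pool = PoolQuota(user_limits={10231: 2e12}, group_limits={1307: 50e12})
print(pool.try_allocate(10231, 1307, 150e9))      # True: 150 GB fits both quotas
```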

Monitoring
Goal: real-time metrics to optimize resource usage and to validate optimization ideas
- File lifetime and size distribution on disk
- Cache efficiency: hit/miss rate by user, time, fpath, fsize and tape
- Request clustering by user, time, fpath, fsize and tape
- Weekly/monthly: tape cost (mounts) per user, and top users
Ideas to test:
- File aggregation on disk/tape
- Migration/garbage-collection strategies
(a hit-rate sketch follows below)
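A small sketch of one such metric, the per-user disk-cache hit rate; the log format is invented and real monitoring feeds look different:

```python
from collections import defaultdict

# (user, served_from_disk) records; False means a tape recall was needed.
access_log = [
    ("auser", True), ("auser", True), ("auser", False),
    ("buser", False), ("buser", True),
]

hits, total = defaultdict(int), defaultdict(int)
for user, on_disk in access_log:
    total[user] += 1
    hits[user] += on_disk

for user in sorted(total):
    print(f"{user}: hit rate {hits[user] / total[user]:.0%} over {total[user]} accesses")
```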

Database Consolidation
CASTOR is DB-centric: the daemons are essentially stateless.
- Schema consolidation: SRM and stager request handling are similar, so their schemas (and service code) are being merged
- Disk server content / stager DB: improving the synchronization strategy
- Improving the synchronization of file locations:
  - Name Service DB: file metadata and tape info
  - Stager DB: files on disk (xrootd redirector: files on disk)

Tape Efficiency
Improved tape format:
- Storage of small files is inefficient (~160 MB/s drive speed plus 5-9 s of tape marks per file; see the estimate below)
- Metadata for recoverability and reliability
Aggregation of file sets:
- Can work with or without clustering
- On-the-fly aggregation for tapes (instead of writing small files individually)
- Dedicated disk cache layer (Gbit network ~ tape speed)
Repack strategies:
- New tape media or hardware require migrating old data formats to new ones (~a quarter of the internal data flow)
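Plugging the slide's own numbers into a quick estimate shows why aggregation matters; 7 s is taken as a representative tape-mark overhead (an assumption within the 5-9 s range quoted above):

```python
# Effective tape throughput for a single file: data time + per-file tape marks.
DRIVE_MBPS = 160      # drive speed from the slide
TAPE_MARK_S = 7       # assumed per-file overhead, within the quoted 5-9 s

for file_mb in (100, 1000, 10000):
    seconds = file_mb / DRIVE_MBPS + TAPE_MARK_S
    print(f"{file_mb:6d} MB file -> effective {file_mb / seconds:5.0f} MB/s")
# ~13 MB/s for 100 MB files vs ~144 MB/s for 10 GB files: aggregation pays off.
```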

Additional constraints
- Format or layout changes imply migration: 15 PB/year of new data!
- Smooth upgrade path in a live system
- Testing... testing... testing
- Incremental upgrades

Outline
- What can we do now? Some impressive numbers
- What else is needed? Analysis requirements
- How can we do that? Projects