CASTOR: possible evolution into the LHC era

Presentation transcript:

CASTOR: possible evolution into the LHC era
Data and storage management workshop, 17/3/2003
Olof Bärring, CERN IT-ADC

Outline
- CASTOR today
- LHC requirements
  - Some basic requirements
  - Hardware dependency
  - Security
  - "Gridification"
- Evolving CASTOR to the LHC era
  - Some design ideas
- Conclusions

CASTOR today

Strong points:
- Solid code base: modular and portable
- Scalable performance for most modules, in particular the tape mover (RTCOPY)
- Storage resource management optimization strategies built on long experience from production and data challenges

Weak points:
- Security
- Legacy code and dependencies inherited from SHIFT: RFIO and the stager have grown from code written more than 10 years ago and have become difficult to maintain
- Information overlap between databases
- Too many independent user interfaces
- Problem tracing is difficult because the modules are largely independent of each other (e.g. there is no unique request ID)

LHC requirements: some basic requirements

Scalability:
- Managed disk storage: ~10 PB
- Data accumulation: 5-8 PB/year
- Sustained recording rates: 0.1 - 1 GB/s

Performance:
- Streaming and random access performance limited by the hardware

Operation:
- Performance and exception monitoring
- Fault tolerance
- Allow for fair-share, request prioritization and storage resource optimization (e.g. load balancing)

LHC requirements: hardware dependency

Two types of hardware dependency:
- Support for new hardware devices
  - A wide range of tape devices and robotics is already supported; it is relatively easy to add support for new devices
  - SCSI and FC attached tape drives are supported
- Support for new hardware architectures
  - Server-attached storage (the current model)
  - New distributed filesystem technologies
  - SAN, iSCSI, fibre channel fabrics, ...

Goal: try to keep the software design independent of the underlying hardware.

LHC requirements: security

Current model:
- Trusted hosts with shared or mapped uid/gid (similar to NFS exports)
- Physical files owned by users

Would need to support:
- Authorization to top-level user services based on standard authentication mechanisms (GSI, Kerberos, ...)
- Internal service communication: host key authentication or trusted networks?
- Physical file ownership: logical files are owned by the real owner, while physical files are owned by root or some generic user
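To make the proposed ownership split concrete, here is a minimal C sketch; the struct and function names are hypothetical and do not correspond to the actual CASTOR name-server API.

    /* Hypothetical illustration only: the catalogue entry keeps the real
     * owner, the disk copy belongs to a generic service account, and
     * authorization decisions use the logical metadata alone. */
    #include <sys/types.h>

    struct logical_file {          /* name-server entry (/castor/... path)    */
        char   lfn[1024];          /* logical file name                       */
        uid_t  owner;              /* real owner from the authenticated user  */
        gid_t  group;
        mode_t mode;               /* POSIX-like permission bits              */
    };

    struct physical_file {         /* copy on a disk or tape server           */
        char  pfn[1024];           /* physical file name                      */
        uid_t service_uid;         /* root or a generic account, not the user */
    };

    /* Grant read access based on the logical entry only. */
    static int may_read(const struct logical_file *lf, uid_t uid, gid_t gid)
    {
        if (uid == lf->owner) return (lf->mode & 0400) != 0;
        if (gid == lf->group) return (lf->mode & 0040) != 0;
        return (lf->mode & 0004) != 0;
    }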

LHC requirements: "gridification"

Grid services on top of CASTOR:
- Data transfer/access: GridFTP
- Storage resource management: SRM (Storage Resource Management) interface, illustrated below
  - Space reservations
  - File "pinning"

Issue: logical file ownership for grid users without local logins.
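For illustration, the two SRM capabilities named above could look roughly like this; SRM is in reality a web-service protocol, so these C prototypes and names are assumptions, not the actual SRM or CASTOR interface.

    #include <time.h>

    typedef struct { char token[64]; } space_token;

    /* Reserve 'bytes' of managed storage for 'lifetime' seconds on behalf
     * of a grid identity (e.g. a certificate subject), without needing a
     * local login. */
    int reserve_space(const char *grid_identity, unsigned long long bytes,
                      time_t lifetime, space_token *out);

    /* Pin a file so its disk copy stays resident until the pin expires or
     * is explicitly released. */
    int pin_file(const char *logical_name, time_t pin_lifetime);
    int release_file(const char *logical_name);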

Evolving CASTOR to the LHC era: some design ideas (1)

Service hierarchy (slide diagram):
- Super stager: authorizes the request and selects a disk pool
- Disk pool managers (one per pool): select a disk server
- Local request broker on each disk server: selects a disk or a distributed file system
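The division of responsibilities in this hierarchy could be expressed as follows; this is only a sketch with invented names, since none of these components existed yet at the time of the talk.

    #include <sys/types.h>

    struct request     { const char *lfn; uid_t uid; gid_t gid; };
    struct disk_pool   { const char *name; };
    struct disk_server { const char *host; };
    struct file_system { const char *mount_point; };

    /* Super stager: authorize the caller and choose a disk pool. */
    struct disk_pool   *select_pool(const struct request *req);

    /* Disk pool manager: choose a disk server inside the pool. */
    struct disk_server *select_server(const struct disk_pool *pool,
                                      const struct request *req);

    /* Local request broker: choose a local disk or a distributed-FS mount. */
    struct file_system *select_disk(const struct disk_server *srv,
                                    const struct request *req);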

Evolving CASTOR to the LHC era: some design ideas (2)

Disk pools (see the sketch below):
- A disk pool defines the basic storage characteristics, e.g.
  - User group (or virtual organization)
  - Some tape migration attributes
  - Access type policies (streaming, random, ...)
  - Fair-share policies

Disk servers:
- A sea of shared resources, just like tape servers are nowadays
- Requests are pulled rather than pushed
- Responsible for applying the policies associated with the disk pools
- Distributed file systems: DMAPI?
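A disk-pool definition gathering the attributes above might look like the following C sketch; the field names and the example values are purely illustrative, not CASTOR configuration syntax.

    enum access_policy { ACCESS_STREAMING, ACCESS_RANDOM, ACCESS_MIXED };

    struct disk_pool_def {
        const char        *owner_group;     /* user group or virtual organization   */
        int                migration_delay; /* example tape-migration attribute (s) */
        enum access_policy access;          /* streaming, random, ...               */
        double             fair_share;      /* share of the pool's resources        */
    };

    /* Example: a pool dedicated to central data recording, tuned for
     * streaming access and migrated to tape after ten minutes. */
    static const struct disk_pool_def cdr_pool = {
        .owner_group     = "alice_cdr",
        .migration_delay = 600,
        .access          = ACCESS_STREAMING,
        .fair_share      = 0.5,
    };

In the pull model, a disk server holding such a definition would ask for the next request whenever it has spare capacity and apply the pool's policies locally before serving it, rather than having the stager push work onto it.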

Evolving CASTOR to the LHC era: some design ideas (3)

Disk server internals (slide diagram):
- Local request broker with a request scheduler driven by scheduling policy plug-ins (see the sketch below)
- Monitoring data feeding the scheduler: CPU load, disk activity, tape queues, group accounting, ...
- Local LFN - PFN mapping
- Migrator, recaller, disk-to-disk copier and garbage collector (GC)
- rfiod handling the incoming LFN requests
- LFN: Logical File Name (/castor/...); PFN: Physical File Name
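The scheduling policy plug-ins could be modelled as a table of function pointers consuming the monitoring data; again a hypothetical sketch, not the interface that was eventually implemented.

    #include <sys/types.h>

    struct pending_request {
        const char *lfn;             /* /castor/... logical file name   */
        uid_t       uid;
        gid_t       gid;
        int         is_tape_recall;  /* request comes from the recaller */
    };

    struct host_load {               /* monitoring data fed to the policy */
        double cpu_load;
        double disk_busy;
        int    tape_queue_length;
    };

    struct sched_policy {
        const char *name;
        /* Return a priority for 'req' given the current host load;
         * higher means schedule sooner. */
        double (*priority)(const struct pending_request *req,
                           const struct host_load *load);
    };

    /* Example policy: favour tape recalls while the tape queue is short. */
    static double recall_first(const struct pending_request *req,
                               const struct host_load *load)
    {
        double base = req->is_tape_recall ? 10.0 : 1.0;
        return base / (1.0 + load->tape_queue_length);
    }

    static const struct sched_policy recall_policy = { "recall-first", recall_first };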

Evolving CASTOR to the LHC era: some design ideas (4)

Other possible improvements:
- Support data access type hints in RFIO
  - Sequential access
  - Streaming sequential (e.g. the tape mover)
  - Random access
  - "Predictable" random access (ROOT I/O?)
- Stream prioritization in RFIO
  - CDR streams
  - Tape streams
  - Normal user streams
- Tape mover (RTCOPY) extensions
  - Multiple clients for the same tape mount
  - Dynamic backfill of incoming requests for already mounted tapes
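The access-type hints and stream classes could be encoded roughly as below; the enum values and the open call are invented for illustration and are not part of the real RFIO API.

    enum rfio_access_hint {
        RFIO_HINT_SEQUENTIAL,         /* plain sequential access            */
        RFIO_HINT_STREAMING,          /* streaming sequential, e.g. RTCOPY  */
        RFIO_HINT_RANDOM,             /* random access                      */
        RFIO_HINT_PREDICTABLE_RANDOM  /* e.g. ROOT I/O read-ahead patterns  */
    };

    enum rfio_stream_class {
        RFIO_CLASS_CDR,               /* central data recording streams     */
        RFIO_CLASS_TAPE,              /* tape migration/recall streams      */
        RFIO_CLASS_USER               /* normal user streams                */
    };

    /* A client would declare its intent when opening a file, letting the
     * disk server schedule and buffer the stream accordingly. */
    int rfio_open_with_hints(const char *path, int flags,
                             enum rfio_access_hint hint,
                             enum rfio_stream_class stream_class);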

Conclusions
- CASTOR has shown good performance and scalability in production and in recent data challenges; in particular, the tape mover (RTCOPY) has proven to scale well with the hardware.
- Nevertheless, some parts of CASTOR, in particular stage/RFIO, need to be modified or rewritten to better meet the LHC requirements.
- The current hardware architecture, and the final choice made for the LHC, may influence the software design; high software modularity can partly isolate us from this problem.