EGI-Engage Data Services and Solutions Part 1: Data in the Grid Vincenzo Spinoso EGI.eu/INFN Data Services and Solutions Webinar 1
Outline: Categorisation of data services in EGI; State of the art in the grid data services area: status and future plans; Use cases and technical details; Plans and next
Components: Data management is performed by interoperable components. Different components address different needs: storage management at site level; transfer between sites; security; catalogue, metadata.
Storage endpoints: How are data managed at site level?
Storage endpoints: Distributing data across several disk servers guarantees scalability at site level. If tapes are provided, access to tape is transparent. A unique namespace is provided to the client. Authentication and encryption guarantee confidentiality and integrity. Several protocols are supported for file access and transfer.
Storage endpoints (diagram labels): DPM; Lustre or GPFS; StoRM
Access, transfers: What about interoperability, access, transfers?
Access, transfers: [Diagram: DPM and StoRM behind an abstraction layer (SRM, GridFTP, WebDAV, NFS/pNFS), together labelled «Storage element».] Applications and users can interact with the endpoints using different protocols. SRM offers storage management: transparent disk/tape management; an interface between different transfer protocols; a standard interface. GridFTP offers advanced data transfer: parallel streams; fault tolerance; security (authorization, encryption); optimization.
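The «parallel streams» idea can be illustrated with a toy sketch: a file is split into byte ranges and each range is moved by its own worker. This is not GridFTP code; `split_ranges` and `parallel_copy` are hypothetical names, and the "transfer" here is only an in-memory copy standing in for a network stream.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Split [0, size) into at most `streams` contiguous (start, end) ranges."""
    if size <= 0 or streams <= 0:
        return []
    chunk = -(-size // streams)  # ceiling division
    return [(start, min(start + chunk, size)) for start in range(0, size, chunk)]


def parallel_copy(source, streams=4):
    """Copy `source` (bytes) range by range, one worker per range."""
    target = bytearray(len(source))

    def copy_range(bounds):
        start, end = bounds
        # Stands in for one data stream carrying one byte range.
        target[start:end] = source[start:end]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(copy_range, split_ranges(len(source), streams)))
    return bytes(target)


data = bytes(range(256)) * 10
copied = parallel_copy(data, streams=4)
```

Each range is independent, which is also what makes a per-range retry after a failure possible in principle.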
Access, transfers: [Diagram: DPM and StoRM behind an abstraction layer (SRM, GridFTP, WebDAV, NFS/pNFS), together labelled «Storage element».] Applications and users can interact with the endpoints using different protocols. WebDAV offers a «web-based network file system»: widely supported by many OSes; standard (IETF). NFS 4.1 provides «local access» (fast, POSIX).
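Because WebDAV is plain HTTP with a few extra methods, a directory listing request can be built with a standard library alone. The sketch below only constructs a `PROPFIND` request and does not send it; the URL is a hypothetical placeholder, and a real storage endpoint would normally also require authentication.

```python
import urllib.request

# Hypothetical endpoint, for illustration only.
URL = "https://se.example.org/webdav/voname/docs/"

# Minimal WebDAV request body asking for all properties.
PROPFIND_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
)


def build_listing_request(url):
    """Build (but do not send) a PROPFIND request for a collection and its children."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY,
        headers={"Depth": "1", "Content-Type": "application/xml"},
        method="PROPFIND",
    )


request = build_listing_request(URL)
print(request.get_method(), request.full_url)
```

`Depth: 1` limits the listing to the collection itself and its direct children.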
Access, transfers (diagram): DPM; abstraction layer: SRM, GridFTP, WebDAV, NFS/pNFS
Data transfer scheduling: Can transfers be scheduled?
Data transfer scheduling: Schedule continuous, sustained data transfers across multiple endpoints; prioritize inter-VO and intra-VO file transfers. Many different clients are available for several protocols (SRM, GridFTP, WebDAV, …). Useful in the VO management context to control data transfers.
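Inter-VO and intra-VO prioritization can be pictured as a two-level priority queue. The sketch below is a toy model under stated assumptions, not the scheduler of any real transfer service: `TransferQueue`, the VO names, the site names and the numeric priorities are all hypothetical.

```python
import heapq
import itertools


class TransferQueue:
    """Toy scheduler: the lowest (VO priority, job priority) is served first."""

    def __init__(self, vo_priority):
        self._vo_priority = vo_priority     # inter-VO ordering, e.g. {"vo_a": 0}
        self._heap = []
        self._counter = itertools.count()   # tie-breaker, keeps FIFO order

    def submit(self, vo, source, destination, job_priority=5):
        """Queue one transfer; job_priority orders transfers inside a VO."""
        key = (self._vo_priority[vo], job_priority, next(self._counter))
        heapq.heappush(self._heap, (key, (vo, source, destination)))

    def next_transfer(self):
        """Return the next (vo, source, destination), or None when idle."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[1]


queue = TransferQueue({"vo_a": 0, "vo_b": 1})
queue.submit("vo_b", "site1", "site2")
queue.submit("vo_a", "site1", "site3", job_priority=9)
queue.submit("vo_a", "site2", "site3", job_priority=1)
order = [queue.next_transfer() for _ in range(3)]
```

Here both `vo_a` transfers are served before the `vo_b` one, and inside `vo_a` the transfer with the lower `job_priority` value goes first.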
Catalogue: Where are my files? lfn:grid/ /store/data/run1312
Catalogue: LFC provides a hierarchical view of files to users, with a UNIX-like client interface; Logical File Name (LFN) to Storage URL (SURL) mappings; authorization on the namespace. EXAMPLE: lfn:grid/ /store/data/run1312 maps to srm://storm-se-01.ba.infn.it:8444/srm/managerv2?SFN=//cms/store/group
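The LFN-to-SURL mapping can be sketched as a dictionary from one logical name to its replicas, plus a small helper that splits a SURL into its parts. This is a toy model, not the LFC client API: `ToyCatalogue`, `describe_surl`, the LFN and the host name below are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


class ToyCatalogue:
    """Maps one logical file name (LFN) to the SURLs of its replicas."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, surl):
        self._replicas.setdefault(lfn, []).append(surl)

    def list_replicas(self, lfn):
        return list(self._replicas.get(lfn, []))


def describe_surl(surl):
    """Split a SURL into host, port, SRM endpoint path and site file name."""
    parts = urlsplit(surl)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "endpoint": parts.path,
        "sfn": parse_qs(parts.query).get("SFN", [None])[0],
    }


catalogue = ToyCatalogue()
# Hypothetical LFN and SURL, for illustration only.
LFN = "lfn:/grid/voname/docs/file1"
SURL = "srm://se.example.org:8444/srm/managerv2?SFN=/voname/docs/file1"
catalogue.register(LFN, SURL)
info = describe_surl(catalogue.list_replicas(LFN)[0])
```

The client only ever handles the LFN; the storage host, port and site path stay inside the catalogue entry.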
EGI «whole picture»: A really complex infrastructure based on elementary «bricks»; each VO chooses its «recipe» of components. Mature and stable: integration in a unified release controls the stability of the «off-line» machinery; operations control the stability of the «on-line» machinery.
What is next… (Storage Management overview, 24/09/2014)
Dynamic Federations (DynaFeds): A set of components that can aggregate storage and metadata farms on the fly, exposing standard protocols and supporting redirections and WAN data access. Directories are «merged», so that files in the same directory appear inside the same directory even if they come from different sites. Browse and access a huge repository made of many sites without requiring a static index: no «registration», no maintenance of catalogues. Intelligently redirects clients asking for a replica. Automatically detects and avoids sites that go offline. Accommodates a redirection choice based on client geography. Stable demo testbed, using HTTP/DAV.
Dynamic Federations (DynaFeds): [Diagram: a single view (/voname/docs/file1, /voname/docs/file2, /voname/docs/file3, /voname/software, /voname/pub, …) produced by an aggregation/abstraction layer over the files /voname/docs/file1, /voname/docs/file2, /voname/docs/file3.]
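The on-the-fly merge of directories and the replica choice can be sketched in a few lines. This is a toy model of the idea, not the DynaFeds implementation: the function names, site names, regions and file sets are hypothetical, and a real federation would query the sites over HTTP/DAV instead of reading a local dictionary.

```python
def federated_listing(sites, directory):
    """Merge the listing of `directory` across all online sites.

    `sites` maps a site name to {"online": bool, "region": str, "files": set}.
    Returns {entry name: sorted list of sites holding it}; no static index is kept.
    """
    prefix = directory.rstrip("/") + "/"
    merged = {}
    for name, site in sites.items():
        if not site["online"]:
            continue  # sites that went offline are skipped
        for path in site["files"]:
            if path.startswith(prefix):
                entry = path[len(prefix):].split("/", 1)[0]
                merged.setdefault(entry, set()).add(name)
    return {entry: sorted(names) for entry, names in merged.items()}


def pick_replica(sites, path, client_region):
    """Prefer an online replica in the client's region, else any online one."""
    holders = [
        name for name, site in sorted(sites.items())
        if site["online"] and path in site["files"]
    ]
    nearby = [name for name in holders if sites[name]["region"] == client_region]
    candidates = nearby or holders
    return candidates[0] if candidates else None


SITES = {
    "site_a": {"online": True, "region": "eu",
               "files": {"/voname/docs/file1", "/voname/docs/file2"}},
    "site_b": {"online": True, "region": "us",
               "files": {"/voname/docs/file2", "/voname/docs/file3"}},
    "site_c": {"online": False, "region": "eu",
               "files": {"/voname/docs/file3"}},
}
listing = federated_listing(SITES, "/voname/docs")
```

In this example `file3` is still listed through `site_b` although `site_c` is offline, and a client in region `us` asking for `file2` is sent to `site_b`.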
Globus Online: Provides robust and easy-to-use file transfer capabilities: web interface; transfer management; performance monitoring; retries after failures, with automatic recovery when possible. It is a service hosted in the US, but the files that the service moves among EGI sites DO NOT LEAVE Europe: GridFTP «3rd party transfer» is used, so files are copied directly between the EGI endpoints.
iRODS: Provides a high-level abstraction layer on top of storage resources: users focus on their data, not on where they are on the data grid. Provides a native metadata catalogue. Multiple authentication plugins (password, PAM, GSI, …). Multiple access protocols (POSIX, S3, RADOS, …). Rule-oriented approach: «policies» can be easily implemented as data management tasks. Integration in the EGI infrastructure is ongoing.