Advanced topic: The SRM protocol and the StoRM implementation Ezio Corso (EGRID Project, ICTP)

Advanced topic on data management I will briefly describe how the classic SE works, highlighting design points and their consequences for file security, i.e. POSIX-like ACL access to files from the GRID. I'll then talk about the SRM protocol: its origin in making tape resources accessible from the GRID, with particular attention to design differences from the classic SE, and its transition to an interface for disk storage resources, with the differences from tape-based systems. I'll finally talk about StoRM: an SRM implementation that allows POSIX-like ACL access.

I. Classic SE

Classic SE It allows disk resources to be accessed from the GRID. What makes a machine into an SE? Three components are needed: a component that publishes information telling the GRID that it is an available storage resource; the usual framework for authentication, GSI; and a component that actually moves the files around: the characterizing feature!

Classic SE The component that makes the GRID aware of the SE's presence, i.e. that includes it in the GRID information system. An LDAP server publishes information about the SE, organised according to the GlueSchema: specifically under the GlueSEUniqueID entity. It describes the SE itself, such as its name and the listening port of the service, plus information specific to each VO that the SE is serving, such as the local path of the file-holding directory, available space, etc. Part of the information is updated dynamically, especially that concerning available and occupied disk space. This is done through LDAP providers found in /opt/lcg/libexec: the providers periodically run scripts that refresh the dynamic information. Finally, the rest of the grid information system periodically polls the information that the SE publishes.
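
To make this concrete, here is a minimal sketch of how a client might query the SE's published information, assuming the Python ldap3 library, a hypothetical endpoint on the conventional BDII port 2170, and the usual GlueSchema base DN; these details vary per site and are not taken from the slides:

```python
# Sketch: query an SE's LDAP endpoint for its GlueSchema entries.
# Host, port and base DN are assumptions following common conventions.
from ldap3 import Server, Connection, ALL

server = Server("ldap://storage.egrid.it:2170", get_info=ALL)
conn = Connection(server, auto_bind=True)  # anonymous bind

conn.search(
    search_base="mds-vo-name=resource,o=grid",
    search_filter="(objectClass=GlueSE)",
    attributes=["GlueSEUniqueID", "GlueSEName", "GlueSEPort"],
)
for entry in conn.entries:
    print(entry.GlueSEUniqueID, entry.GlueSEName)
```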

Classic SE User authentication: Grid Security Infrastructure (GSI). The core GLOBUS 2.4 libraries are used by the service in charge of moving files around! i.e. /opt/globus/lib/libglobus_gsi_credential_gcc32dbg.so.0, /opt/globus/lib/libglobus_gsi_proxy-core_gcc32dbg.so.0, etc. A set of scripts run by cron jobs manages pool accounts: /opt/edg/sbin/edg-mkgridmap creates a gridmap file by reading a local configuration file that specifies the sources of allowed credentials, either an LDAP server or a specific file; /opt/edg/sbin/lcg-expiregridmapdir removes the mapping to local credentials when a grid user is no longer working on that machine; /opt/edg/sbin/edg-fetch-crl retrieves revocation lists of invalid certificates. A sketch of the gridmap file format follows below.
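
The gridmap file produced by edg-mkgridmap maps certificate subject DNs to local accounts; a leading dot on the account denotes a pool account. A minimal parsing sketch (the DN and account names are invented for illustration):

```python
# Sketch: parse a grid-mapfile of the kind produced by edg-mkgridmap.
# Each line maps a quoted certificate subject DN to a local account,
# e.g.  "/C=IT/O=INFN/CN=Some User" .egrid
import re

def parse_gridmap(path):
    mappings = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            m = re.match(r'"([^"]+)"\s+(\S+)', line)
            if m:
                dn, account = m.groups()
                mappings[dn] = account
    return mappings
```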

Classic SE The component that carries out the functionality of moving files around the GRID. In general it is just any implementation of a transport protocol that supports GSI! GridFTP is the most common; RFIO is another; anything that somebody comes up with works, as long as it is GSI-enabled: it is just a matter of who will adopt it and use it!

Classic SE GridFTP: Essentially an FTP server extended/optimized for large data transfers: Parallel streams for speed. Allows checkpoints during file transfers, for later resuming. Authentication through GSI certificates instead of user name + password

Classic SE Central point: it is FTP! A user can do whatever an FTP client allows! There is no separation between what can be done from the grid and the actual transport protocol: there is no explicit, separate list of file manipulation operations available from the grid! There is no uniform view of the possible file manipulations: they are tied to the underlying transport protocol. Depending on the protocol you may not have the same functionality, and for a given functionality the specific protocol must be used: it may not be possible to access all SEs seamlessly!

Classic SE Compare with CEs, which have an LRMS interface to forked jobs or to batch jobs: an abstraction layer over the kinds of computations that can be done. LRMS may not be a great protocol (gLite CEs are somewhat different)… yet it is an attempt to introduce an abstraction.

Classic SE A more serious consequence of the lack of abstraction is how to apply POSIX-ACL-like control on files from the grid: it is left up to the transport protocol! For GridFTP: it is FTP modified for GSI, and FTP allows file manipulation compatible with the underlying Unix filesystem permissions. If grid control on files is needed, it is the underlying filesystem that must be carefully managed: map users to specific local accounts (not pool accounts) so each grid user can be controlled individually once it gets onto the machine; partition local accounts into specially created groups that reflect data access patterns; let a carefully crafted directory tree guide data access. A grid user with no access rights to a file is then stopped because the GridFTP server is stopped in its tracks by the local filesystem! The sketch below illustrates the idea.
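
A sketch of such a setup, assuming hypothetical paths and group names; it shows only the group-ownership and permission pattern that lets the filesystem do the blocking:

```python
# Sketch: a per-VO directory whose group ownership and permissions do
# the access control the grid cannot express. Paths and group names
# are hypothetical; this must run as root on a Unix host.
import os
import grp
import stat

def create_vo_dir(path, group_name):
    gid = grp.getgrnam(group_name).gr_gid
    os.makedirs(path, exist_ok=True)
    os.chown(path, -1, gid)  # keep owner, set the access-pattern group
    # rwx for owner and group, nothing for others; setgid so new files
    # inherit the group -> only group members get past this directory.
    os.chmod(path, stat.S_IRWXU | stat.S_IRWXG | stat.S_ISGID)

create_vo_dir("/storage/egrid/stock-data", "egrid-traders")
```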

Classic SE In any case the proposed solution is problematic because data may be present in several SEs: users must have the same UID across all SEs; the directory structure must be replicated/synchronised across all SEs; and users must be supplied with tools to manage permissions coherently across all SEs.

Classic SE Central point: the GRID lacked the concept of access control within the same VO. It only materialised when passing to the local machine, which had the means to enforce it: users + group membership. Security therefore was set up behind the scenes, at the implementation level! No GRID concept was involved! No GRID abstraction was available to express fine-grained authorization, to express what can be accessed, or to check GRID credentials.

Classic SE VOMS proxies and GridFTP: VOMS allows roles and groups to be defined, and therefore allows fine tuning of who the GRID user is. It is up to the system receiving these detailed credentials to decide what local resources to use. For the SE there is still the same problem of explicitly listing what those resources are: the dependency on the transport protocol remains, as stated.

II. The SRM protocol

The SRM protocol Storage Resource Manager protocol: originally devised to allow grid access to tape-based resources that had a disk area acting as a cache. Staging of files: a request for a file arrives; if the file is in cache it is returned right away; otherwise it is first fetched from tape, copied to disk, and then returned. The system takes care of consistency between cache and tapes. This is needed to offset the latency of the robotic arm switching tapes; the sketch below illustrates the logic.
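
A toy sketch of that staging logic, assuming a hypothetical cache path and a placeholder tape-recall routine (no real HSM API is implied):

```python
# Sketch: serve from the disk cache when possible, otherwise recall
# from tape first. Cache path is hypothetical; the recall function is
# a placeholder for the slow robotic-tape operation.
import os

CACHE_DIR = "/var/cache/hsm"

def recall_from_tape(name, dest):
    """Placeholder for a slow robotic-tape recall into the cache."""
    raise NotImplementedError

def stage_file(name):
    cached = os.path.join(CACHE_DIR, name)
    if os.path.exists(cached):
        return cached                  # cache hit: return right away
    recall_from_tape(name, cached)     # cache miss: pay the tape latency
    return cached
```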

The SRM protocol SRM was designed to handle that tape/disk-cache scenario from the GRID. 1. The presence of the cache area introduces the concept of file type: Volatile: files get written in cache and the system then removes them automatically after their lifetime expires. Permanent: files that get into cache are never removed automatically by the system. Durable: files do have a lifetime that may expire, but the system does not remove them and instead sends a notification to the user. A toy model of these semantics follows below.
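
A small model of the three file types and their expiry semantics, written only to restate the rules above (the data layout is invented):

```python
# Sketch: the three SRM file types and what lifetime expiry means for
# each. This models the semantics described above, not a real server.
import enum
import time

class FileType(enum.Enum):
    VOLATILE = "volatile"    # removed automatically on expiry
    DURABLE = "durable"      # kept on expiry, but the user is notified
    PERMANENT = "permanent"  # no lifetime, never auto-removed

def sweep(files, now=None):
    """files: list of dicts with 'name', 'type', 'expires_at' keys."""
    now = now or time.time()
    for f in list(files):
        if f["type"] is FileType.PERMANENT or f["expires_at"] > now:
            continue                               # nothing to do
        if f["type"] is FileType.VOLATILE:
            files.remove(f)                        # system removes it
        elif f["type"] is FileType.DURABLE:
            print(f"notify owner: {f['name']} lifetime expired")
```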

The SRM protocol 2. File staging introduces the concept of asynchronous calls to get or put a file: an SRM request is issued to get a file; the server replies immediately, without waiting for staging to complete; the server returns a Request Token, which the client uses to periodically poll the request's status. The sketch below shows the pattern.
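
A sketch of the resulting client-side pattern; the client object and its method names are hypothetical stand-ins for a real SRM client library, and the status strings follow common SRM naming:

```python
# Sketch: issue srmPrepareToGet, receive a request token, poll until
# the file is staged. 'client' is a hypothetical SRM client object.
import time

def fetch_turl(client, surl, poll_interval=5.0):
    token = client.prepare_to_get(surl)        # server returns at once
    while True:
        status = client.status_of_get_request(token, surl)
        if status.state == "SRM_SUCCESS":
            return status.turl                 # staged: transfer can start
        if status.state in ("SRM_FAILURE", "SRM_ABORTED"):
            raise RuntimeError(f"request failed: {status.explanation}")
        time.sleep(poll_interval)              # still staging: poll again
```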

The SRM protocol 3. The cache area also introduces a partition of the file namespace. Tape must store files: there have to be names that uniquely identify a file on tape! The cache area must serve files: it may return a path to fetch the file on disk that is different from the name that uniquely identifies the file on tape, and it can easily support different fetching mechanisms… that is, different transport protocols! SRM reflects this distinction in the concepts of SURL and TURL: SURL (Storage URL) – a name that identifies a grid file in SRM storage: it is what the GRID sees! srm://storage.egrid.it:8334/old-stocks/NYSE.txt TURL (Transfer URL) – a name that identifies a transport protocol and the path to fetch the file: it is how the GRID moves the file around! gridftp://storage.egrid.it:2110/home/ecorso/examples/2005/data.txt
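
A toy illustration of SURL-to-TURL resolution using the example URLs above; the namespace mapping and the GridFTP door address are invented:

```python
# Sketch: resolve a grid-visible SURL to a protocol-specific TURL.
# The '/storage' path mapping and the door address are hypothetical.
from urllib.parse import urlparse

def surl_to_turl(surl, protocol="gridftp", door="storage.egrid.it:2110"):
    parsed = urlparse(surl)
    assert parsed.scheme == "srm", "SURLs use the srm:// scheme"
    local_path = "/storage" + parsed.path      # invented namespace map
    return f"{protocol}://{door}{local_path}"

print(surl_to_turl("srm://storage.egrid.it:8334/old-stocks/NYSE.txt"))
# -> gridftp://storage.egrid.it:2110/storage/old-stocks/NYSE.txt
```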

The SRM protocol Central point: SRM introduces an abstraction that separates the transfer protocol from the file operation itself. Although introduced to handle the cache area, it also solves the classic SE issues: it decouples file operations from the transfer protocol!

The SRM protocol Direct consequence: SRM servers do not move files in and out of GRID storage! They only return TURLs! It is up to the SRM client, once it gets a TURL, to invoke a GridFTP/RFIO/etc. client for moving files. SRM acts only as a broker for file management requests: transfer is decoupled from data presentation!

The SRM protocol Extra features and concepts in the protocol: a big issue is not running out of space during a large file transfer, since the system is used by the HEP community to store and manage huge amounts of data from the LHC. SRM therefore introduced a space management and reservation interface.

The SRM protocol It distinguishes three types of reserved disk space: Volatile: will be freed by the system as soon as its lifetime expires. Permanent: will not be freed by the system. Durable: will not be freed, but the user that allocated it will be warned. Space type and file type cannot be mixed in arbitrary ways: permanent space can host all three types of files, while volatile space can only host volatile files. The general way of working: a space request is made; the server returns a SpaceToken; all subsequent SRM calls made by the client pass on the token; the SRM server keeps track of tokens and recognises the allocated space. See the sketch below.
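
A sketch of that workflow with a hypothetical client object (method names are invented; only the token-passing pattern reflects the protocol):

```python
# Sketch: reserve space, receive a SpaceToken, quote it in subsequent
# puts so the server accounts files against the reservation.
def upload_with_reservation(client, surls, total_bytes):
    token = client.reserve_space(
        desired_size=total_bytes,
        space_type="permanent",        # can host all three file types
        lifetime=None,                 # permanent space: no expiry
    )
    for surl in surls:
        # every subsequent call passes the token, so the server can
        # charge the file against the reserved space
        client.prepare_to_put(surl, space_token=token)
    return token
```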

The SRM protocol The protocol calls: Data Transfer Functions. A misnomer… no data is moved by an SRM server! srmPrepareToPut and srmPrepareToGet put a file into GRID storage or get one out; srmStatusOfPutRequest and srmStatusOfGetRequest are for polling. They work on SURLs!

The SRM protocol The protocol calls: Cache area management. srmExtendFileLifeTime extends the lifetime of volatile files; srmRemoveFiles removes permanent files; srmReleaseFiles and srmPutDone force early lifetime expiry.

The SRM protocol The protocol calls: Directory functions to manage files on tape. srmRmdir, srmMkdir, srmRm, srmLs. They work on SURLs!

The SRM protocol The protocol calls: Space management functions. srmReserveSpace, srmReleaseSpace, srmGetSpaceMetaData. A SpaceToken is returned and then used with all data transfer functions.

III. SRM applied to disk storage!

SRM applied to disk storage! SRM addresses the issues of the classic SE, so it is natural to use it for disk resources too. There was also another important driving force for its adoption: many facilities were in place for the LHC analysis of data coming from the experiments' production centres. These facilities had high-performance storage solutions in place, employing parallel disk file systems such as GPFS and Lustre. With the advent of GRID technologies it became necessary to adapt the existing installations to the GRID.

SRM applied to disk storage! The context of operation is now different: there is no tape with a cache in between. In general all concepts are kept, with slight semantic adjustments: the SURL/TURL distinction is kept – it decouples the transfer protocol from data presentation, as stated; the three file types are kept – some files may be copied and live just for a certain amount of time; space reservation is kept – it is an important functionality; directory functions are kept.

SRM applied to disk storage! Some compromises: the asynchronous nature of srmPrepareToGet, srmPrepareToPut and srmCopy remains, although it no longer makes sense without tape latency. The space-type distinction makes less sense: arguably the whole disk can be seen as permanent space that allows all three file types, akin to tapes, which are permanent by their nature. Releasing of files and lifetime extension remain for volatile files; srmRemoveFiles, meant for managing cache files, does not make sense.

IV. StoRM SRM implementation

StoRM SRM implementation The result of a collaboration between INFN (the Grid.IT Project, from the Physics community) and ICTP (the EGRID Project, building a pilot national grid facility for research in Economics and Finance).

StoRM SRM implementation StoRM's implementation of SRM was meant to meet three important requirements from the Physics community: large volumes of data straining disk resources, so space reservation is paramount; boosted performance for data management, through direct POSIX I/O calls; and security on data as expressed by VOMS, through strategic integration with VOMS proxies.

StoRM SRM implementation EGRID requirements: data comes from stock exchanges, with very strict, legally binding disclosure policies, hence POSIX-like ACL access from the GRID environment. Promiscuous file access: the existing file organisation on disk must be seamlessly available from the grid, and files entering from the grid must blend seamlessly with the existing file organisation. Very challenging – probably only partly achievable! StoRM, a disk-based storage resource manager that allows controlled access to files, was a major opportunity for low-level intervention during implementation.

StoRM SRM implementation How StoRM solves POSIX-like ACL access from the GRID: all file requests are brokered with the SRM protocol. When StoRM receives an SRM request for a file, it asks the policy source for the access rights of the given grid credentials to the given SURL. The check is made at the grid-credential level, not for a local user as before! And it is done on the grid view of the file, as identified by the SURL! A sketch of the flow follows below.
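
A sketch of that decision flow; the policy-source interface and the credential object are hypothetical, not StoRM's actual API:

```python
# Sketch: the authorization decision is taken on the SURL and the grid
# credentials, before any local user is involved.
def authorize_srm_request(policy_source, surl, credentials, operation):
    # credentials: certificate subject DN plus VOMS attributes (FQANs)
    decision = policy_source.evaluate(
        resource=surl,                  # grid view of the file
        subject=credentials.dn,
        fqans=credentials.fqans,        # e.g. VOMS groups/roles
        action=operation,               # e.g. "read" or "write"
    )
    if not decision.permitted:
        raise PermissionError(f"{credentials.dn} may not {operation} {surl}")
    return decision
```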

StoRM SRM implementation The only part of the implementation outside of the protocol is the Policy Source: a GRID service that is able to formulate/express physical access rules to resources. StoRM leverages the grid's LogicalFileCatalogue (LFC) as policy source, even though it is intended for Logical Names: StoRM therefore stretches its use. Still, it is very GRID-friendly: it is not a proprietary solution! It would be better to have this explicitly in the SRM protocol: SRM does have some permission functions, but their expressive power is weak, and in the next version of the protocol they will be re-addressed (srmSetPermission, srmReassignToUser, srmCheckPermission).

StoRM SRM implementation A last note: physical enforcement happens through JustInTime ACL setup. Initially files have no ACLs set, so no local user can access them. When access is granted: the local Unix account corresponding to the grid credentials is determined; an ACL granting the requested access is set up for that local user; and the ACL is removed when the file is no longer needed. The sketch below illustrates the mechanism.
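
A sketch of the mechanism using the standard Linux setfacl tool; the paths, the account name, and where the grid-to-local mapping happens are assumptions:

```python
# Sketch: grant the mapped local user an ACL entry only for the
# duration of the access, then remove it. setfacl is the standard
# Linux ACL tool; the grid-to-'local_user' mapping happens elsewhere.
import subprocess

def grant_access(path, local_user, perms="rw"):
    subprocess.run(
        ["setfacl", "-m", f"u:{local_user}:{perms}", path], check=True)

def revoke_access(path, local_user):
    subprocess.run(["setfacl", "-x", f"u:{local_user}", path], check=True)

# Typical lifecycle around a transfer (names are hypothetical):
# grant_access("/storage/egrid/NYSE.txt", "egrid001")
# ... TURL handed out, GridFTP transfer happens ...
# revoke_access("/storage/egrid/NYSE.txt", "egrid001")
```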

Advanced topic on data management Thank you!