SRM v2.2 planning Critical features for WLCG


SRM v2.2 planning: critical features for WLCG
- Results of the May 22-23 workshop at FNAL: https://srm.fnal.gov/twiki/bin/view/WorkshopsAndConferences/GridStorageInterfacesWorkshop
- The SRM v2.2 definition is geared to WLCG usage, but still compatible with other implementations.
  - Some notions were backported from SRM v3; others were added for WLCG.
- WLCG "MoU": https://srm.fnal.gov/twiki/pub/WorkshopsAndConferences/GridStorageInterfacesWSAgenda/SRMLCG-MoU-day2.doc
  - Needs some updates and polishing.
- Schedule for implementation and testing: https://srm.fnal.gov/twiki/pub/WorkshopsAndConferences/GridStorageInterfacesWSAgenda/Schedule.pdf
- Friday phone conferences to monitor progress and discuss issues.

Maarten Litmaath (CERN), LCG Internal Review of Services, CERN, 2006/06/09

Critical features for WLCG
- Result of the WLCG Baseline Services Working Group: http://cern.ch/lcg/PEB/BS
- Originally planned to be implemented by WLCG Service Challenge 4; delayed until autumn 2006.
- Features from version 1.1 plus a critical subset of version 2.1 (Nick Brook, SC3 planning meeting, June 2005):
  - File types
  - Space reservation
  - Permission functions
  - Directory functions
  - Data transfer control functions
  - Relative paths
  - Query supported protocols

File types
- Volatile: temporary and sharable copy of an MSS-resident file. If not pinned, it can be removed by the garbage collector as space is needed (typically according to an LRU policy).
- Durable: the file can only be removed if the system has copied it to an archive.
- Permanent: the system cannot remove the file.
- Users can always explicitly delete files.
- The experiments only want to store files as permanent; even scratch files will be explicitly removed by the experiment.
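
The volatile-file behaviour above can be sketched as a small cache with LRU eviction that skips pinned files. This is purely illustrative: the class and method names (`VolatileCache`, `add`, `touch`, `free_space`) are made up and do not correspond to any real SRM implementation.

```python
from collections import OrderedDict

class VolatileCache:
    """Hypothetical sketch: unpinned files are evicted in LRU order."""

    def __init__(self, capacity):
        self.capacity = capacity       # total cache space, e.g. in bytes
        self.used = 0
        self.files = OrderedDict()     # name -> (size, pinned); order = LRU

    def add(self, name, size, pinned=False):
        self.free_space(size)          # evict until the new file fits
        self.files[name] = (size, pinned)
        self.used += size

    def touch(self, name):
        self.files.move_to_end(name)   # mark as most recently used

    def free_space(self, needed):
        # Evict unpinned files, oldest first, until 'needed' space is free.
        for name in list(self.files):
            if self.used + needed <= self.capacity:
                return
            size, pinned = self.files[name]
            if not pinned:             # pinned files survive collection
                del self.files[name]
                self.used -= size
        if self.used + needed > self.capacity:
            raise RuntimeError("not enough space: all remaining files are pinned")
```

A pinned file behaves like the slide describes: it is never collected, so a cache full of pinned files simply rejects new additions.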

Space reservation
- v1.1: space reservation is done on a file-by-file basis; the user does not know in advance whether the SE will be able to store all files in a multi-file request.
- v2.1: allows a user to reserve space.
  - But can 100 GB be used by a single 100 GB file, or by 100 files of 1 GB each?
  - MSS space vs. disk cache space.
  - A reservation has a lifetime; "PrepareToGet(Put)" requests fail if there is not enough space.
- v3.0: allows for "streaming": when space is exhausted, requests wait until space is released. Not needed for SC4.
- What about quotas? Strong interest from the LHC VOs, but not yet accepted as a task for SRM.
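
The v2.1 behaviour can be sketched as a reservation object with a lifetime, against which put requests fail up front rather than file by file. All names here (`Reservation`, `prepare_to_put`, the error strings) are hypothetical; the clock is injected so the lifetime check is testable.

```python
import time

class Reservation:
    """Illustrative sketch of a v2.1-style space reservation with a lifetime."""

    def __init__(self, total_bytes, lifetime_s, now=time.time):
        self.now = now                       # injected clock for testing
        self.total = total_bytes
        self.used = 0
        self.expires = now() + lifetime_s

    def prepare_to_put(self, size):
        # A put against an expired or exhausted reservation fails immediately.
        if self.now() > self.expires:
            raise RuntimeError("reservation lifetime expired")
        if self.used + size > self.total:
            raise RuntimeError("not enough reserved space")
        self.used += size                    # claim space for the incoming file
```

Note that this sketch deliberately mirrors the ambiguity on the slide: a 100-byte reservation says nothing about whether it will hold one large file or many small ones.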

Permission functions
- v2.1 allows for POSIX-like ACLs.
  - They can be associated per directory and per file; parent directory ACLs are inherited by default.
  - One can no longer let a simple UNIX file system deal with all the permissions: a file system with ACLs, or an ACL-aware permission manager in the SRM, is needed. May conflict with legacy applications.
- The LHC VOs want the storage system to respect permissions based on VOMS roles and groups; currently only supported by DPM.
- File ownership by individual users is not needed in SC4.
- Systems shall distinguish production managers from unprivileged users: write access to precious directories, dedicated stager pools. Supported by all implementations.

Directory functions
- Create/remove directories.
- Delete files: v1.1 only has an "advisory" delete, interpreted differently by different implementations, which complicates applications like the File Transfer Service.
- Rename files or directories (on the same SE).
- List files and directories:
  - Output will be truncated to an implementation-dependent maximum size.
  - A full (recursive) listing could tie up or complicate the server (and client), and may return a huge result.
  - The server could return chunks with cookies/offsets, but then it might need to be stateful.
  - It is advisable to avoid very large directories.
- No need for "mv" between SEs.

Data transfer control functions
- StageIn/stageOut-type functionality: prepareToGet, prepareToPut.
- (A way for) pinning and unpinning files:
  - Avoids untimely cleanup by the garbage collector.
  - A pin has a lifetime, but can be renewed by the client; this avoids depending on the client to clean up.
- Monitor the status of a request: how many files are ready, how many in progress, how many left to process.
- Suspend/resume request: not needed for SC4.
- Abort request.
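
The pin semantics above can be sketched in a few lines: a pin keeps the garbage collector away for a limited time, and the client renews it rather than the system relying on an explicit cleanup. The class name and the injected clock are illustrative only.

```python
class Pin:
    """Minimal sketch (hypothetical names) of a renewable pin lifetime."""

    def __init__(self, lifetime_s, now):
        self.now = now                        # injected clock for testing
        self.expires = now() + lifetime_s

    def renew(self, lifetime_s):
        # Renewal restarts the lifetime from the current time.
        self.expires = self.now() + lifetime_s

    def active(self):
        # The garbage collector would skip any file with an active pin.
        return self.now() < self.expires
```

If the client crashes and never renews, the pin simply expires and the file becomes eligible for collection, which is exactly the "avoid dependence on client to clean up" point on the slide.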

Relative paths
- Everything should be defined with respect to the VO base directory.
- Example: srm://srm.cern.ch/castor/cern.ch/grid/lhcb/DC04/prod0705/0705_123.dst
  - The SE is defined by protocol and hostname (and port).
  - The VO base directory is the storage root for the VO. It is advertised in the information system, but that is an unnecessary detail: it requires an information system lookup for storing files and clutters catalog entries afterwards.
- The SRM could insert the VO base path automatically (available in dCache); the VO namespace then lives below the base directory.
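
The automatic base-path insertion suggested above can be sketched as a simple lookup plus string concatenation. The per-VO table and the function name `expand_surl` are invented for illustration; only the LHCb example path comes from the slide.

```python
# Hypothetical per-VO storage roots, normally advertised in the
# information system; here hard-coded for the sketch.
VO_BASE = {
    "lhcb": "/castor/cern.ch/grid/lhcb",
    "atlas": "/castor/cern.ch/grid/atlas",
}

def expand_surl(host, vo, relative_path):
    """Build a full SURL from a host, a VO, and a VO-relative path."""
    base = VO_BASE[vo]
    return "srm://%s%s/%s" % (host, base, relative_path.lstrip("/"))
```

With this in the SRM, clients and catalogs would only ever handle the VO-relative part (`DC04/prod0705/0705_123.dst`), avoiding both the information-system lookup and the cluttered catalog entries.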

Query supported protocols
- The list of transfer protocols per SE is available from the information system, but that is a workaround and complicates the client.
- The SRM knows what it supports and can inform the client.
- The client always sends the SRM a list of acceptable protocols: gsiftp, (gsi)dcap, rfio, xrootd, root, http(s), ...
- The SRM returns a TURL with a protocol applicable to the site.
- The query itself is not needed for SC4.
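
The negotiation described above can be sketched as: the client sends an ordered protocol list, and the server returns a TURL using the first protocol the site supports. The site protocol set, host name, and function name are all assumptions for the sketch.

```python
SITE_PROTOCOLS = {"gsiftp", "rfio"}   # what this hypothetical SE offers

def negotiate_turl(surl_path, client_protocols, host="se.example.org"):
    """Return a TURL for the first client-acceptable protocol the site supports."""
    for proto in client_protocols:    # honour the client's preference order
        if proto in SITE_PROTOCOLS:
            return "%s://%s%s" % (proto, host, surl_path)
    raise RuntimeError("no transfer protocol in common with the client")
```

This is why the explicit query is dispensable: the client never needs to know the site's list in advance, it just states its own preferences.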

More coordination items
- SRM compatibility tests:
  - Test suite of Jiri Mencak (RAL)
  - Test suite for GGF-GIN by Alex Sim (LBNL)
  - Test suite of Gilbert Grosdidier (LCG)
  - ... Which one(s) will do the job for WLCG?
- Clients need to keep supporting v1.1; should they first try v2.x?
- Some implementations need v2.x to be on a separate port; should 8444 be the standard?
- xrootd integration; rfio incompatibility.
- Quotas for user files.

SRM v2.2 MoU for WLCG
- Summarizes the agreed client usage and server behavior for the SRM v2.2 implementations used by WLCG applications.
- Servers can ignore non-WLCG use cases for the time being.
- Clients: FTS, GFAL, lcg-utils.
- Servers: CASTOR, dCache, DPM.

Storage classes
- Stick with SRM v3 terminology for now, but with a WLCG understanding:
  - TRetentionPolicy { REPLICA, CUSTODIAL } (OUTPUT is not used)
  - TAccessLatency { ONLINE, NEARLINE } (OFFLINE is not used)
- Tape1Disk0 == CUSTODIAL + NEARLINE
- Tape1Disk1 == CUSTODIAL + ONLINE
- Tape0Disk1 == REPLICA + ONLINE
- All WLCG files (SURLs) are permanent: files can only be removed by the user.
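
The TapeNDiskM shorthand maps directly onto the two SRM v2.2 enums; a small table makes the correspondence from the slide explicit. The helper name `to_srm` is invented for the sketch.

```python
# The three WLCG storage classes, expressed as the SRM v2.2 enum pair
# (TRetentionPolicy, TAccessLatency) exactly as listed on the slide.
STORAGE_CLASSES = {
    "Tape1Disk0": ("CUSTODIAL", "NEARLINE"),
    "Tape1Disk1": ("CUSTODIAL", "ONLINE"),
    "Tape0Disk1": ("REPLICA", "ONLINE"),
}

def to_srm(storage_class):
    """Translate a WLCG storage class name into SRM v2.2 attributes."""
    retention, latency = STORAGE_CLASSES[storage_class]
    return {"TRetentionPolicy": retention, "TAccessLatency": latency}
```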

Information discovery
- WLCG does not need an SRM information interface for the time being.
- Client implementations provide a list of the required information; the GLUE schema will be modified accordingly.
- An interface to obtain (all) the relevant information can be defined later.
  - It would allow the SRM clients and servers to be self-sufficient, and would simplify the information provider implementations.

srmReserveSpace
- Only deals with disk: a cache in front of a tape back-end, or disk without a tape back-end. Tape space is considered infinite.
- The TapeNDiskM storage classes only require static reservations by VO admins.
  - These can be arranged out of band, without using the SRM interface (CASTOR); an agreement between the VO admin and the SE admin will be needed anyway.
  - The networks of the main clients can be indicated (dCache).
- Dynamic reservations by ordinary users are not needed in the short term; at least CMS would like this feature in the medium term.
- userSpaceTokenDescription attaches meaning to an opaque space token: "LHCbESD" etc.

srmLs
- Gets all metadata attributes for individual files, but only some for directories.
  - Directory listings quickly become very expensive; the directory listing use case would be to check consistency with the file catalog.
  - An implementation-dependent upper limit will apply for the time being.
  - Use of the offset and count parameters requires further discussion.
- TFileLocality { ONLINE, NEARLINE, ONLINE_AND_NEARLINE, LOST, NONE, UNAVAILABLE }
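
The truncation and paging behaviour under discussion can be sketched as follows. The cap value and the function name are assumptions, standing in for whatever implementation-dependent limit a server applies.

```python
MAX_ENTRIES = 1000   # hypothetical implementation-dependent limit

def srm_ls(entries, offset=0, count=None):
    """Return a (chunk, truncated) pair for a paged directory listing.

    'entries' stands in for the server-side directory content; 'offset'
    and 'count' mirror the srmLs parameters whose semantics the slide
    says still require discussion.
    """
    limit = MAX_ENTRIES if count is None else min(count, MAX_ENTRIES)
    chunk = entries[offset:offset + limit]
    truncated = offset + len(chunk) < len(entries)
    return chunk, truncated
```

A client checking catalog consistency would loop, advancing `offset` by the chunk length until `truncated` comes back false.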

srmPrepareToPut
- Stores a file in the space (i.e. storage class) indicated; WLCG clients will supply the space token.
- WLCG files are immutable and cannot be overwritten.
- TConnectionType { WAN, LAN }: will be set by FTS (for 3rd-party transfers).
- TAccessPattern { TransferMode, ProcessingMode }: ProcessingMode would apply to a file opened by GFAL (but not via lcg-utils).

srmPrepareToGet
- Prepares a file for "immediate" transfer or access.
- Recall from tape and/or copying to a pool accessible by the client should now be done through srmBringOnline.
- WLCG usage excludes changing the space or retention attributes of the file.
- TConnectionType { WAN, LAN }: will be set by FTS (for 3rd-party transfers).
- TAccessPattern { TransferMode, ProcessingMode }: ProcessingMode would apply to a file opened by GFAL (but not via lcg-utils).

srmBringOnline
- Indicates that a prepareToGet for the files is expected in the near future; a delay parameter can be used for further optimization.
- A prepareToGet could tie up resources, e.g. I/O movers in dCache.
- The signature is very similar to that of prepareToGet, but no TURLs are returned.

srmCopy
- Copies files or directories between SEs; directories will not be supported for the time being.
- The srmPrepareToGet and srmPrepareToPut restrictions apply.
- Individual copies in a multi-file request can be aborted; the target SURLs uniquely identify the copy requests.
- The removeSourceFiles flag has been deleted from the specification: too dangerous.

srmChangeSpaceForFiles
- Changes the storage class of the given files:
  - Tape1Disk0 <-> Tape1Disk1 (add/remove disk copy)
  - Tape0Disk1 <-> Tape1DiskN (add/remove tape copy)
  - It is still to be decided which transitions shall be supported.
- The SURL shall not be changed; the absolute path may change if the SURL only contains a relative path (as desired).
- Not required in the short term.

Removal functions
- srmRm: removes the SURL.
- srmReleaseFiles: removes pins, e.g. those originating from prepareToGet; may flag disk copies (TURLs) for immediate garbage collection.
- srmPurgeFromSpace: as the previous, but not associated with a request.
- srmAbortFiles: aborts individual copy requests.
- srmRemoveFiles has been deleted from the specification.

Schedule (1/3)
- WSDL and SRM v2.2 spec: June 6
  - Various inconsistencies have been fixed since; there is discussion about the need for some unexpected changes w.r.t. v2.1.
  - Still to be examined by Timur for dCache.
- srmPrepareToGet and srmPrepareToPut at the same level of functionality as is present now: June 20
  - Not technically challenging; 3 endpoints are needed by the end of this period.
  - A test suite is needed; Java, C and C++ clients are included:
    - LBNL tester
    - FNAL srmcp: Apache Axis + Globus CoG Kit
    - CASTOR C++ client: gSOAP + GSI plugin
    - DPM C client: gSOAP + GSI plugin

Schedule (2/3)
- Compatibility, one week after that: June 27
  - ML to run the tests and work with the developers.
- dCache srmCopy compatibility with the DPM and CASTOR srmPrepareTo(Get/Put), work by Fermilab: July 4
- Space reservation prerelease implementations: Sept 1
  - To coincide with the SRM v2/v3 workshop at CERN, Aug 30 - Sept 1.
- Space reservation / storage classes: Sept 30 (optimistic)
  - Proper SRM or out-of-band way to reserve space; srmGetSpaceTokens.
  - Modifications to srmPrepareToPut and srmCopy; srmPrepareToGet optional.
  - srmRm, srmReleaseFiles (srmPurgeFromSpace not needed).
  - Space reservation may only work for special deployment configurations.
  - Need to determine (per VO) whether disk pools should be externally reachable.

Schedule (3/3)
- srmBringOnline: Oct 6
- srmLs: the return of space tokens is not required for October.
- WLCG clients should follow the same schedule, ready to be used as testers by the end of Sept.
- There will be several SRM test suites: functionality, stress tests, error handling, and resilience to "malicious" clients.
- Integration week at RAL: Oct 9-13
- Firm dates to be decided as milestones (by end of June).
- It could all work sufficiently by Nov 1, allowing v2.2 to become the standard SRM service (with v1.1 for legacy apps).
- Development of less urgent features will continue.