SRM 2.2 Issues (well, er, and 2.3 too)
Jens Jensen (STFC RAL/GridNet2)
On behalf of GSM-WG
OGF22, Cambridge, MA



This Talk
Deviates from previous principles of being for beginners
–Technical
–Less polished…
–May be useful for others…
Exposes the standards and protocol process
–Not many answers – kickstart (restart) the process
Combines the two sessions
–Input (mainly) from dCache, CASTOR, StoRM

Aims
Revisit the specification
–Implementations' deviations from the OGF specification
–Ensure another group could interoperate if they started from scratch
–E.g. SRB (ASGC work)
The aim is not to start work on 2.3
–I.e. "it is not the aim", rather than "the aim is to not" – if that makes sense

A Very Brief History
Spec from 2006
Then came implementations
Then came WLCG
… → revisit spec
Now getting experience
… → revisit spec, highlight issues
… → think about next steps

Philosophies
Manage diverse storage systems (but nothing else)
User interface (not admin)
Open standard
–A standard is not a standard until it is a standard (next slide)
Open participation (no fees, no closed societies)
Protect storage from the Grid?
Encourage best practices? Encourage uniformity? Allow diversity?
The file is the unit of currency (not datasets)

Compare OASIS
"Approved within an OASIS Committee,"
"Submitted for public review,"
"Implemented by at least three organizations,"
"And finally ratified by the Consortium's membership at-large."
We would add that the three implementations "must interoperate"!

WLCG
Wide deployment – "now get experience" with the WLCG MoU
Significant changes to the spec… do they make sense? (Process.)
What about smaller customers?
Tape1Disk1 = ONLINE_AND_NEARLINE?
–…No. In cache does not mean always in cache

Space Tokens on Get
srmPrepareToPut uses a space token (description)
srmPrepareToGet doesn't
–The same applies to srmBringOnline
Problem for many implementations
–dCache, CASTOR
–dCache: the MSS doesn't see the space token
–StoRM: not needed
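The asymmetry above can be sketched as follows. This is an illustrative stub only: the field names are simplified placeholders, not the real SRM 2.2 WSDL element names, and nothing here talks to a server.

```python
# Hypothetical request builders illustrating the put/get asymmetry:
# a put request can carry a space token description, a get request
# (as discussed in this talk) has nowhere to put one.

def build_put_request(surl, space_token_descr):
    """A put request can carry a space token description."""
    return {
        "operation": "srmPrepareToPut",
        "SURL": surl,
        "spaceTokenDescription": space_token_descr,  # supported on put
    }

def build_get_request(surl):
    """A get request has no slot for a space token description."""
    return {
        "operation": "srmPrepareToGet",
        "SURL": surl,
        # no spaceTokenDescription field -- the asymmetry at issue
    }
```

For dCache this matters because the MSS behind the SRM never sees the token; for StoRM the token is simply not needed on get.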

Other Get Issues
Getting directories?
–Not supported?
–Or special permissions required?
–Does this also apply to large bulk requests?

Finance Use Cases
Ezio Corso (ICTP/E-Grid) (StoRM)
–Compare EGEE industry liaison
–"Complexity of financial instruments"
–"More stringent risking and reporting requirements"
–"Point solution" grids are inefficient (silos)
–Big computing makes data the bottleneck
–Access control by individuals

Spaces
Access control on spaces
–Also to be published in the GLUE 1.3 schema as ACBR on VOInfo
Reserving subspaces of spaces
Summarising spaces for an owner
Querying space status?

What is a Space Anyway?
A collection of at least one physical storage component/area?
With a common baseline set of capabilities (access latency etc.)?
Not to even mention "free" space, "used" space, etc.
–Tricky to define
–Even more tricky to measure
–Still more tricky to get agreement on

What is a Space Anyway? (cont.)
Is everything a space?
–Suggestion to have top-level static spaces
Is disk a space? Or can a space have disk?
Spaces can be named by token descriptions
–Always named by a space token description?
–Can they be referenced by path? Non-uniquely?
–Can they be referenced (non-uniquely) by capabilities?
Is a (static) space an SA?

Space Behaviour
What happens if a file is released?
–Is the space given back to the Space?
–Or does the space not re-grow?
Permanent file in limited space?
–Used to be: not permitted
–Now: the space is shrunk and released
–Keep the token around, or permit recycling?

Permissions
Simple Unix-style (POSIX) permissions
Default permissions on directories
–Inheritance from above?
–Consistent with space permissions, if applicable?
–Defaults (per VO?)
Permissions for roles and groups?
Stage-in permission (protects the write cache)
–Not the same as reading

Permissions (cont.)
StoRM calls out to the LFC
–The access control API in SRM is not adequate
–So it uses the LFC's API instead
Multiple StoRM instances can share an LFC
=> Can synchronise between SE and LFC

Return Codes
SRM_REQUEST_QUEUED
SRM_REQUEST_INPROGRESS
srmCopy()
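These two codes are transient: asynchronous calls such as srmPrepareToGet/Put and srmCopy() first report SRM_REQUEST_QUEUED, then SRM_REQUEST_INPROGRESS, and the client has to poll the matching status call until a terminal code appears. A minimal sketch of that client-side loop, with the server replaced by a stubbed status sequence:

```python
import time

# Transient return codes that mean "keep polling"
TRANSIENT = {"SRM_REQUEST_QUEUED", "SRM_REQUEST_INPROGRESS"}

def poll_until_done(status_calls, delay=0.0):
    """Poll a stubbed status source until a non-transient code appears."""
    for code in status_calls:
        if code not in TRANSIENT:
            return code
        time.sleep(delay)  # a real client would back off between polls
    raise RuntimeError("status stream ended while still transient")

# Stubbed sequence a server might return for one asynchronous request
stub = iter(["SRM_REQUEST_QUEUED",
             "SRM_REQUEST_INPROGRESS",
             "SRM_REQUEST_INPROGRESS",
             "SRM_SUCCESS"])
```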

Use of GSI Authentication
Currently using SOAP over GSI sockets
GSI is needed for delegation
Delegation is needed for srmCopy() (only)
Incompatible with SSL
Proposal to use gLite delegation
–A SOAP API specifically for delegation
–AstroGrid uses a home-made REST-based alternative
Not using WS-Anything
–Many are Java-only, too complex, or not mature

FileStorageType
Volatile, Durable, Permanent
Should have been: ReleaseWhenExpired, WarnWhenExpired, NeverExpire
–Avoids confusion with the overloaded terms from 1.1 – wrongly named in the spec
What is done on a Durable/WarnWhenExpired timeout? ("raise error condition")
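The proposed renaming is just a relabelling of the three existing values, which can be written down as a simple mapping. The left-hand names are the SRM 2.2 TFileStorageType values; the right-hand names are the clearer expiry-oriented names suggested above.

```python
# Slide's proposed clearer names for the SRM 2.2 TFileStorageType values
CLEARER_NAME = {
    "VOLATILE":  "ReleaseWhenExpired",  # file may be removed on expiry
    "DURABLE":   "WarnWhenExpired",     # expiry raises an error condition
    "PERMANENT": "NeverExpire",         # no lifetime-based removal
}
```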

Access Latency
OFFLINE is not defined
Not used by WLCG – but does that mean it doesn't exist?
ONLINE_AND_NEARLINE is mentioned
LOST… UNAVAILABLE…

Defaults
Certain aspects of the API are optional
–Standard default?
–Or implementation-defined default?
–E.g. the "default" space
Default filesize on put?
–Is it 1?
–Is it implementation-dependent? Space-dependent?
–Is it returned?

Implicit
Implicit pinning
Implicit reservations
Implicit lifetimes
Implicit changes on action; implicit changes on expiry
Surprising for users? Complicates implementations?
What if permission is denied for the implicit action?
What is reasonable?

Explicit But Unknown
Changing spaces (capabilities)
–WLCG restricted to D1T1 → D0T1 (more or less)

Best Practices for Clients
Propagate errors to the user
Clean up after yourself…
–Even after an unclean exit
Should SRM use request timeouts and keepalives?
–Cancel at any point?
–Or only while queueing?
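The "clean up after yourself, even after an unclean exit" advice can be sketched as a wrapper that aborts a request on any failure path. The `client` object and its `abort` method are hypothetical stand-ins (a real client would issue srmAbortRequest); nothing here contacts a server.

```python
from contextlib import contextmanager

@contextmanager
def managed_request(client, token):
    """Abort the request on any exit path that is not a clean finish."""
    done = False
    try:
        yield token
        done = True
    finally:
        if not done:
            client.abort(token)  # stand-in for srmAbortRequest

class StubClient:
    """Records abort calls instead of talking to an SRM server."""
    def __init__(self):
        self.aborted = []
    def abort(self, token):
        self.aborted.append(token)
```

With this shape, a crash inside the `with` block still releases the server-side request, which is exactly the unclean-exit case the slide warns about.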

srmCopy
Was always slightly tricky (also in 1.0 → 1.1)
Needs delegation (the GSI problem)
How and when does the client check status?
What if the remote host is not an SRM 2 endpoint?
Push modes and pull modes – and firewalls
And then the GridFTP modes (push/pull)
And the GridFTP streams
Can't always get good results if the implementation uses defaults or tries to guess
No way to set most parameters

srmLs Problem
The classical problem with large directories
Exercise: on a normal filesystem, run ls -R dir on a large directory tree; while you wait, try to use the system
Large data volumes in SOAP
–Are attachments supported?
Truncate, offset
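The truncate/offset idea amounts to client-side paging: instead of asking for the whole directory in one SOAP response, fetch it in bounded chunks. A minimal sketch, with a local list standing in for the server-side directory (a real client would pass the offset and count to srmLs rather than slice locally):

```python
def paged_ls(listing, count=1000):
    """Yield a directory listing in pages of at most `count` entries.

    `listing` simulates the server-side directory contents; each yielded
    page corresponds to one bounded srmLs-style response.
    """
    offset = 0
    while True:
        page = listing[offset:offset + count]
        if not page:        # past the end: stop
            return
        yield page
        offset += count     # next request continues where this one ended
```

Bounding each response keeps the SOAP payload small, which is the whole point: the server never has to serialise a million entries in one message.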

Which Bits Are Optional…?
Many features
Most parameters
TExtraInfo

Next Steps
Continue this process
Define terminology
Assess the "damage"
2.3?
–No, not yet
–Too soon: not enough experience with 2.2
–Adaptation would be difficult
Options:
–Do nothing – too late (WLCG)
–Document the differences
–Retrofit things into 2.2
–Add to 2.2 (incremental)
–Postpone to "2.3"
–Postpone to 3.1

Future Stuff
WSRF
–Rich Wellner (2004)
–(WSRT?)
Avoid duplication
Compare OGSA-D-Arch
–Proposes a modular architecture for data

More Capabilities
Integrity checking
–Act when integrity checking fails?
Service description, agreement (dynamic)
File content
Data sets, chunks
Dynamic resource allocation
–Networks, additional storage, disk servers (now known as virtualisation)
–Recovery