CASTOR SRM v1.1 experience Presentation at SRM meeting 01/09/2004, Berkeley Olof Bärring, CERN-IT.



Outline
CASTOR SRM v1.1 implementation
Interoperability tests
Problems found
  – SRM specification
  – GSI
GGF: GSM WG
  – Input to the definition of SRM-Basic
Conclusions and outlook

CASTOR SRM v1.1
Implements the vital operations
  – get, put, getRequestStatus, setFileStatus, getProtocols
No-ops:
  – pin, unPin, getEstGetTime, getEstPutTime
Implemented but optionally disabled (requested by LCG)
  – advisoryDelete
CASTOR GSI (CGSI) plug-in for gSOAP
  – Also used in GFAL
CERN:
  – First prototype in summer 2003
  – First production version deployed in December 2003
Other sites having deployed the CASTOR SRM
  – CNAF (INFN/Bologna)
  – PIC (Barcelona)
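
The grouping of operations on this slide can be sketched as a small lookup. This is only an illustration, not the CASTOR SRM code: the function name `classify`, the flag `advisory_delete_enabled`, and the returned labels are assumptions made for the example. Only the operation names come from the slide.

```python
# Operation names as listed on the slide.
IMPLEMENTED = {"get", "put", "getRequestStatus", "setFileStatus", "getProtocols"}
NO_OPS = {"pin", "unPin", "getEstGetTime", "getEstPutTime"}
OPTIONAL = {"advisoryDelete"}


def classify(operation: str, advisory_delete_enabled: bool = False) -> str:
    """Return how a server configured like the slide would treat an operation.

    The labels ("handled", "no-op", "disabled", "unsupported") are hypothetical.
    """
    if operation in IMPLEMENTED:
        return "handled"
    if operation in NO_OPS:
        # Accepted by the interface, but nothing happens.
        return "no-op"
    if operation in OPTIONAL:
        # Implemented, but a site may switch it off.
        return "handled" if advisory_delete_enabled else "disabled"
    return "unsupported"
```

A client that relies on pin or advisoryDelete cannot tell from the interface alone which of these cases applies at a given site.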

CASTOR SRM v1.1
[Architecture diagram. Labels: CASTOR tape archive, SRM request repository, Grid services, SRM, gridftp, GSI, CASTOR disk cache, stager, RFIO, Tape mover, Tape queue, CASTOR name space, Volume Manager, Local clients]

Deployment
[Deployment diagram. castorgrid.cern.ch uses DNS load balancing over six nodes, gridftp01 to gridftp06, each running srm and gridftp. Other labels: CASTOR (stager, nameserver, ...), Test/dev node, RFIO & co]

Interoperability tests
CASTOR SRM has been running interoperability tests with various clients, notably
  – GFAL (Jean-Philippe)
  – EDG replica manager (Peter)
  – FNAL/dCache SRM (Timur)

Problems found
The interoperability problems can be classified as:
  – Due to problems with the SRM specification
  – Due to assumptions in SRM or SOAP implementations
  – Due to GSI incompatibilities
Debugging GSI incompatibilities is by far the most difficult and time-consuming

Problems with SRM spec (1)
Lack of enumeration
  – All enumeration-like types are strings
  – Client needs to find a common denominator (e.g. convert all strings to upper case)
Request and file state lifecycles
  – Concise for ‘put’ or ‘get’
  – Draft proposal submitted by Timur for ‘copy’. Not yet adopted by the CASTOR SRM implementation.
  – Undefined for ‘mkPermanent’, ‘pin’, ‘unpin’ (probably irrelevant for the latter two?)
Request history
  – Unspecified what an SRM should do with requests that have reached the “Done” or “Failed” status
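
The "common denominator" workaround for string-typed states might look like the sketch below on the client side. It is a minimal illustration under stated assumptions: the enum `RequestState` and the helper `parse_state` are hypothetical names, and the members PENDING and ACTIVE are assumed for the example; only "Done" and "Failed" are named in the slides.

```python
from enum import Enum


class RequestState(Enum):
    """Hypothetical closed set of request states (PENDING and ACTIVE are assumed)."""
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    DONE = "DONE"
    FAILED = "FAILED"


def parse_state(raw: str) -> RequestState:
    """Map a free-form state string onto the enum, ignoring case and padding."""
    key = raw.strip().upper()
    try:
        return RequestState(key)
    except ValueError:
        # Surface unknown spellings instead of silently guessing.
        raise ValueError(f"unknown request state: {raw!r}") from None
```

With this, "Done", "done" and "DONE" from different servers all compare equal, while an unexpected value is rejected rather than passed through.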

Problems with SRM spec (2)
Immutability of request identifier
  – Request id is a 32-bit word
  – Unspecified if an SRM can reuse request ids for finished (“Done” or “Failed”) requests
SURL (Site URL) semantics
  – Is it a URL or a URI?
  – If URL, does it support relative and absolute paths?
  – If URI, the name space is virtually flat for an arbitrary client
Pin lifetime
  – Pin lifetime is defined to be subject to site policy
  – No way to query the remaining pin lifetime for a particular file
  – Current definition appears useless for any practical purpose
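
The URL versus URI question has a practical consequence that a short sketch can show. Under the URL reading a client may normalise the path hierarchically; under the URI reading the SURL is an opaque string. The helper name `same_file`, the `srm` scheme check, and the example paths are assumptions for illustration; only the host name castorgrid.cern.ch appears in the slides.

```python
import posixpath
from urllib.parse import urlsplit


def same_file(surl_a: str, surl_b: str, hierarchical: bool) -> bool:
    """Decide whether two SURLs name the same file under one of the two readings.

    hierarchical=True  treats the SURL as a URL with a normalisable path.
    hierarchical=False treats the SURL as an opaque identifier (flat name space).
    """
    if not hierarchical:
        return surl_a == surl_b
    a, b = urlsplit(surl_a), urlsplit(surl_b)
    if a.scheme != "srm" or b.scheme != "srm":
        raise ValueError("expected srm:// SURLs")
    return (a.hostname == b.hostname
            and posixpath.normpath(a.path) == posixpath.normpath(b.path))
```

For example, `srm://castorgrid.cern.ch/castor/x/../y` and `srm://castorgrid.cern.ch/castor/y` match under the hierarchical reading and differ under the opaque one, so two implementations can disagree about whether a file already exists.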

Problems with SRM spec (3)
Exception handling and error propagation
  – Unspecified if a multi-file request should fail when a subset of the files got an error
  – Unspecified if and when an SRM can do retries
  – Only one error message, global for all files in a multi-file request, is available for reporting
  – Format and contents of error message undefined
advisoryDelete != delete
  – It may be vital to know what the effect is
      – No effect at all (if so, what happens if the SURL is reused for a new file?)
      – Only remove disk resident copy (if so, when?)
      – Remove HSM file (if so, when?)
Directory creation on the fly for ‘put’ requests
  – If a ‘put’ request specifies a SURL corresponding to a path for which one or several sub-directory levels do not exist, should it create the missing dirs on the fly (provided the client has the appropriate permissions)?
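
The multi-file ambiguity can be made concrete with a sketch of the two readings a server might choose. None of this is taken from the SRM specification or from CASTOR: `FileResult`, `request_status`, the `fail_on_any` switch, and the "surl: error" message layout are assumptions for the example. It also shows why a single global message forces per-file errors to be packed into one string.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FileResult:
    surl: str
    error: Optional[str] = None  # None means the file succeeded


def request_status(results: List[FileResult], fail_on_any: bool) -> Tuple[str, str]:
    """Collapse per-file outcomes into one request state and one error message."""
    failed = [r for r in results if r.error is not None]
    if not failed:
        return "Done", ""
    # Reading 1: any failing file fails the whole request.
    # Reading 2: the request only fails when every file failed.
    if fail_on_any or len(failed) == len(results):
        state = "Failed"
    else:
        state = "Done"
    # One global message is all that can be reported, so per-file errors are joined.
    message = "; ".join(f"{r.surl}: {r.error}" for r in failed)
    return state, message
```

The same set of file outcomes yields "Failed" under one reading and "Done" under the other, which is exactly the kind of site-specific interpretation a client cannot anticipate.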

Problems due to SRM or SOAP implementation details
SRM WSDL discovery
  – FNAL client put severe constraints on the WSDL publication
Bug in gSOAP v2.3 WSDL importer
Various bugs in CASTOR SRM found but not reported here

GSI problems (1)
CASTOR (GSI) – EDG RC (Java TrustManager)
  – TrustManager does not use the GSI default of SSL handshake + credential delegation, but just an SSL handshake
  – TrustManager client would not work with SSL 3.0, which is forced by GSI
  – Solution: EDG RC uses CoG (Globus Java Security Implementation) instead
CASTOR (GSI) – FNAL dCache (Java CoG)
  – FNAL client only used a limited number of encryption algorithms, which did not match those provided by standard GSI
  – Limited Proxy certificate
GSI error reporting not working properly

GSI problems (2)
Administration and deployment issues
  – EDG globus patch for supporting dynamic pool accounts requires the GRIDMAPDIR environment variable to be declared, even if the default location was used for the security files
  – Configuration problems (right Root CA not trusted)
  – CERN CA changed the certificate naming scheme (number added at the end of DN). New certificates were not automatically propagated (to, for instance, FNAL).
The effort for debugging GSI problems will scale with the number of SRM implementations
  – Establishing an ‘SRM reference implementation’ for certifying new servers and clients would help

Conclusions and outlook
CASTOR SRM v1.1 has been in production for a couple of months at CERN and some other CASTOR Tier-1 sites
SRM interoperability does not come for free
  – Definition not concise enough, room for too much site-specific interpretation
  – Is GSI interoperability an illusion and, if so, will it continue to be so?
We currently have no plans for a CASTOR SRM v2.1 implementation. We would rather tighten up SRM v1.1 in the context of the GGF GSM WG and the SRM-Basic definition