Stephen Burke – Heidelberg - 26/9/2003 Partner Logo Overview of applications view of the data management middleware Stephen Burke.

Slides:



Advertisements
Similar presentations
DataTAG WP4 Meeting CNAF Jan 14, 2003 Interfacing AliEn and EDG 1/13 Stefano Bagnasco, INFN Torino Interfacing AliEn to EDG Stefano Bagnasco, INFN Torino.
Advertisements

30-31 Jan 2003J G Jensen, RAL/WP5 Storage Elephant Grid Access to Mass Storage.
1 WP2: Data Management Paul Millar eScience All Hands Meeting September
Data Management Expert Panel. RLS Globus-EDG Replica Location Service u Joint Design in the form of the Giggle architecture u Reference Implementation.
DataGrid is a project funded by the European Commission under contract IST WP2 – R2.1 Overview of WP2 middleware as present in EDG 2.1 release.
The google file system Cs 595 Lecture 9.
1 CHEP 2000, Roberto Barbera Tests of data management services in EDG 1.2 ALICE Off-line Week,
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 10: File-System Interface.
DataGrid is a project funded by the European Union CHEP 2003 – March 2003 – Title – n° 1 Grid Data Management in Action Experience in Running and.
Backup and Recovery Part 1.
QCDgrid Technology James Perry, George Beckett, Lorna Smith EPCC, The University Of Edinburgh.
MCTS Guide to Configuring Microsoft Windows Server 2008 Active Directory Chapter 6: Windows File and Print Services.
Chapter Oracle Server An Oracle Server consists of an Oracle database (stored data, control and log files.) The Server will support SQL to define.
UCL workshop – 4-5 March 2004 – HEP Assessment of EDG – n° 1 HEP Applications Evaluation of the EDG Testbed and Middleware Stephen Burke (EDG HEP Applications.
Don Quijote Data Management for the ATLAS Automatic Production System Miguel Branco – CERN ATC
WP9 – Earth Observation Applications – n° 1 Experiences with Testbed1, plans and objectives for Testbed 2 Testbed retreat th August 2002
KNMI Applications on Testbed 1 …and other activities.
INFSO-RI Enabling Grids for E-sciencE gLite Data Management Services - Overview Mike Mineter National e-Science Centre, Edinburgh.
Computing and the Web Operating Systems. Overview n What is an Operating System n Booting the Computer n User Interfaces n Files and File Management n.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
How to Install and Use the DQ2 User Tools US ATLAS Tier2 workshop at IU June 20, Bloomington, IN Marco Mambelli University of Chicago.
Author - Title- Date - n° 1 Partner Logo EU DataGrid, Work Package 5 The Storage Element.
Author - Title- Date - n° 1 Partner Logo WP5 Summary Paris John Gordon WP5 6th March 2002.
Registry Replication Registry calls are forwarded by a registry Service to a single registry instance (i.e. replica) per VDB. If a replica cannot be contacted.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan.
First attempt for validating/testing Testbed 1 Globus and middleware services WP6 Meeting, December 2001 Flavia Donno, Marco Serra for IT and WPs.
Enabling Grids for E-sciencE Introduction Data Management Jan Just Keijser Nikhef Grid Tutorial, November 2008.
GLite – An Outsider’s View Stephen Burke RAL. January 31 st 2005gLite overview Introduction A personal view of the current situation –Asked to be provocative!
Replica Management Services in the European DataGrid Project Work Package 2 European DataGrid.
Owen SyngeTitle of TalkSlide 1 Storage Management Owen Synge – Developer, Packager, and first line support to System Administrators. Talks Scope –GridPP.
Stephen Burke – Data Management - 3/9/02 Partner Logo Data Management Stephen Burke, PPARC/RAL Jeff Templon, NIKHEF.
1 Andrea Sciabà CERN Critical Services and Monitoring - CMS Andrea Sciabà WLCG Service Reliability Workshop 26 – 30 November, 2007.
DGC Paris WP2 Summary of Discussions and Plans Peter Z. Kunszt And the WP2 team.
WP1 WMS rel. 2.0 Some issues Massimo Sgaravatto INFN Padova.
Jens G Jensen RAL, EDG WP5 Storage Element Overview DataGrid Project Conference Heidelberg, 26 Sep-01 Oct 2003.
INFSO-RI Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical.
Managing Data DIRAC Project. Outline  Data management components  Storage Elements  File Catalogs  DIRAC conventions for user data  Data operation.
Distributed File Systems 11.2Process SaiRaj Bharath Yalamanchili.
INFSO-RI Enabling Grids for E-sciencE Introduction Data Management Ron Trompert SARA Grid Tutorial, September 2007.
David Adams ATLAS ATLAS distributed data management David Adams BNL February 22, 2005 Database working group ATLAS software workshop.
Data Management The European DataGrid Project Team
David Foster LCG Project 12-March-02 Fabric Automation The Challenge of LHC Scale Fabrics LHC Computing Grid Workshop David Foster 12 th March 2002.
David Adams ATLAS ATLAS-ARDA strategy and priorities David Adams BNL October 21, 2004 ARDA Workshop.
Data Management The European DataGrid Project Team
Author - Title- Date - n° 1 Partner Logo WP5 Status John Gordon Budapest September 2002.
Testing the HEPCAL use cases J.J. Blaising, F. Harris, Andrea Sciabà GAG Meeting April,
WP3 Security and R-GMA Linda Cornwall. WP3 UserVOMS service authr map pre-proc authr LCAS LCMAPS pre-proc LCAS Coarse-grained e.g. Spitfire WP2 service.
WP1 WMS release 2: status and open issues Massimo Sgaravatto INFN Padova.
Stephen Burke – Sysman meeting - 22/4/2002 Partner Logo The Testbed – A User View Stephen Burke, PPARC/RAL.
1 DIRAC Data Management Components A.Tsaregorodtsev, CPPM, Marseille DIRAC review panel meeting, 15 November 2005, CERN.
GridPP2 Data Management work area J Jensen / RAL GridPP2 Data Management Work Area – Part 2 Mass storage & local storage mgmt J Jensen
Storage Element Security Jens G Jensen, WP5 Barcelona, May 2003.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Architecture of LHC File Catalog Valeria Ardizzone INFN Catania – EGEE-II NA3/NA4.
StoRM + Lustre Proposal YAN Tian On behalf of Distributed Computing Group
WP3 WP3 at Budapest 2/9/2002 Steve Fisher / RAL. WP3 Steve Fisher/RAL - 2/9/2002WP3 at Budapest2 Summary News –EDG Retreat –EDG Tutorials –Quality –Release.
What to expect in TB 2.0 J. Templon, NIKHEF/WP8. Jeff Templon – AWG Meeting, NIKHEF, WP1 u New resource broker architecture Big potato.
LCG/EGEE Operational Issues Stephen Burke RAL. November 1 st 2004LCG Operations - Issues Introduction List of problems to initiate discussion –A personal.
Earth Observation inputs to ATF Annalisa Terracina EU-DataGrid Project Work Package 9 – EO Applications April 2003 CERN.
Maria Alandes Pradillo, CERN Training on GLUE 2 information validation EGI Technical Forum September 2013.
J Jensen / WP5 /RAL UCL 4/5 March 2004 GridPP / DataGrid wrap-up Mass Storage Management J Jensen
Work Package 9 – EO Applications
StoRM: a SRM solution for disk based storage systems
WP1 WMS release 2: status and open issues
John Gordon EDG Conference Barcelona, May 2003
SRM2 Migration Strategy
Data Management in Release 2
GSAF Grid Storage Access Framework
Stephen Burke, PPARC/RAL Jeff Templon, NIKHEF
Architecture of the gLite Data Management System
INFNGRID Workshop – Bari, Italy, October 2004
Presentation transcript:

Stephen Burke – Heidelberg - 26/9/2003 Partner Logo Overview of applications view of the data management middleware Stephen Burke

Stephen Burke – Heidelberg - 26/09/ /20 Software areas discussed u BrokerInfo et al u Replica Manager u Storage Element Based on testing on the EDG dev TB and LCG C&T TB, plus preliminary testing on the EDG app TB

Stephen Burke – Heidelberg - 26/09/ /20 Brokerinfo etc u Issues of file matching and use of BrokerInfo fall between WPs, not always obvious who is responsible n WP1: matching in JDL,.BrokerInfo file n WP2: general file management, edg-brokerinfo (???) n WP3 + GLUE: information schema n WP4: CESEBind info provider n WP5: TURLs and SE info provider n WP6: testbed configuration u This area is very important to users!

Stephen Burke – Heidelberg - 26/09/ /20 File matching u Basic matching works OK u Problems with more than one close SE per CE u Matching algorithm has some quirks n With multiple files will match on 1 even if the others are not available u Automatic uploading of output files was broken n Supposed to be fixed but not deployed u Gang-matching (matching on SE attributes) apparently does not work u Optimisation (access cost ranking) not yet tested u Can only match/use files from one VO

Stephen Burke – Heidelberg - 26/09/ /20 File access in jobs u Functionality of BrokerInfo getSelectedFile in TB 1 is missing n Should return a usable TURL given an LFN and protocol u Configuration of NFS mount point is less flexible than TB 1 n Schema does not properly describe mount points, interim solution relies on having a close CE for every SE with identical mount points u No file locking, job may have to call getBestFile again to recover a deleted file u Still no local disk space management on the WN u Slashgrid?

Stephen Burke – Heidelberg - 26/09/ /20 Replica Manager - General u Extensively tested on EDG dev TB u Some testing on LCG n No ROS, no WP5 SE, MDS instead of R-GMA u 134 bugs in bugzilla, 33 still open u Biggest problem is speed: various reasons, but much too slow u LFN/GUID/SURL/SFN/StFN/TURL system is very complex! n Hidden inside POOL?

Stephen Burke – Heidelberg - 26/09/ /20 RM - Outstanding bugs u RM does not make full use of Service/ServiceStatus tables u getBestFile does not deal properly with multiple close SEs u Some bugs are marked as fixed but not yet deployed

Stephen Burke – Heidelberg - 26/09/ /20 RM - Command line u edg-rm is a general interface to all DM functions n Metadata and wildcard operations excepted n Interface is fairly intuitive u Some errors generate a java backtrace or misleading error messages, but problems are being fixed as they are found u Exception recovery is not perfect n e.g. failed write to catalogue may leave file on disk n Different operations deal with failures in different ways u No equivalent of GDMP (everything is done by the client) n No bulk file transfers, no subscriptions, no server-based actions

Stephen Burke – Heidelberg - 26/09/ /20 RM - Catalogues u Currently have a single LRC and RMC for each VO n ROS is distributed u Distributed system desirable for fault tolerance, but may be complex to configure? u Nothing to keep catalogues and SEs consistent n Are the catalogue data backed up? u Still needs testing with large numbers of files u Not clear if experiments need metadata in RMC – or even LFNS? (POOL)

Stephen Burke – Heidelberg - 26/09/ /20 RM - Security u Not clear what experiments want, or what they will get! n Namespace control on LFNs? n ACLs on files? n ACLs on SEs? u Will we get the secure web service? n Denial of service etc.

Stephen Burke – Heidelberg - 26/09/ /20 WP8 view of replica management u Summary: n New replica manager works well in general n File registration is very slow (> 20 secs for 1 12-byte file!) u Issues: n Moving to the distributed RLI/LRC system is a big change, can we get it working in time? n BrokerInfo interaction / getSelectedFile functionality n VOMS integration – what do we want and can we get it? n Schema changes – lots of interested parties

Stephen Burke – Heidelberg - 26/09/ /20 Summary of Priorities for WP2 Existing system: u Registration time u getSelectedFile u Exception handling u Metadata u Optimisation Vital Desirable Low priority New features: u Secure web service u Distributed RLI/LRC u VOMS integration

Stephen Burke – Heidelberg - 26/09/ /20 SE - General u Tested on dev TB only, not installed by LCG n Basic functionality works n Castor and ADS interfaces seem to be (mostly) working u Configuration seems fragile, not much guidance for sysadmins (e.g. use of partitions) n Validation tests? u 220 bugs in bugzilla, 51 still open u Delete still does not work u Very limited user documentation n But largely hidden behind the RM

Stephen Burke – Heidelberg - 26/09/ /20 SE - Command line u ele* commands removed due to ssh problems, but were more intuitive than edg-se-webservice u edg-se-webservice command is clumsy (especially in insecure mode), but users will normally only use edg- rm u Error reporting is poor, most errors give an XML dump, error messages are often meaningless

Stephen Burke – Heidelberg - 26/09/ /20 SE - Architecture u Separation between permanent storage and cache area makes sense, but only if the cache gets cleaned, otherwise all files are stored twice u Create/write/commit and cache/getTURL/read cycles are reasonably intuitive u No space reservation yet u Storage under hashed names – what is the hash algorithm? Is the mapping from SFNs stored securely? u Should it be possible to overwrite an existing file? (General model is that files are read-only) u Each VO has a separate name space, no way to identify a file uniquely

Stephen Burke – Heidelberg - 26/09/ /20 SE - Usage u create, getTURL, commit, cache, ls, mkdir, getMetaData all OK n Easy to forget to call getTURL! u Can also register an existing file u No way to get back the TURL for a file which has been created but not committed u Metadata is limited (creator CN, size, and create, modify and access timestamps) and is only available when the file is cached u getSECost returns 1 if the file is on disk and –1 if not

Stephen Burke – Heidelberg - 26/09/ /20 SE - Security u Secure web service badly needed u No way to specify VO or user with insecure interface, hence everyone is John Gordon u ACLs?

Stephen Burke – Heidelberg - 26/09/ /20 SE - Information providers u Schema is part of GLUE, supposed to be generic for all storage systems n SE providers come from WP5 u Free disk space reporting is problematic n Does it report the space in the permanent file area or the cache? n Still doesn’t deal properly with partitions (but may not be such a problem) n Can cope with different areas per VO u Technical problem in the way the DN for the SA (Storage Area) object is constructed n First test of GLUE change procedure?  Policy values are published (MaxFileSize,MinFileSize, MaxData,MaxNumFiles,MaxPinDuration) but they are not enforced, hence they are meaningless

Stephen Burke – Heidelberg - 26/09/ /20 WP8 view of SE u Situation: n Functionality is mostly there n Lots of minor bugs and problems n Configuration seems delicate, SEs often fail to work for no obvious reason, diagnosing problems is hard n Error reporting is poor u Issues: n Space management n ACLs n GLUE schema

Stephen Burke – Heidelberg - 26/09/ /20 Summary of Priorities for WP5 Existing system: u Stability/reliability u Delete u Secure web service u Interaction with WP2 u Error reporting u ele* command line New features: u Space management u VOMS integration (ACLs)