ILDG Middleware Status Bálint Joó UKQCD University of Edinburgh, School of Physics on behalf of ILDG Middleware Working Group alternative title: Report on Middleware Working Group Meeting, Oct , NeSC, Edinburgh
Working Assumptions (v1) ● Gauge Connection-esque but decentralised – ILDG Services are “read only” – Publication, deletion and modification are “local” issues (within collaborations) – File name management is a “local” issue – Local issues not standardised – Middleware WG may go on to standardise publication etc in a future version once this simple one is in production
Working Assumptions (v1) ● Collaborations track their data in other collaborations' grids – Ties data to producing collaboration – Data appears only in the producing collaboration's Replica Catalogue (RC) – BUT data may still live “physically” near a consumer – Allows easy location of the RC if the producing collaboration is known.
Working Assumptions (v1) ● Data has UNIX-like access permissions – Owner, group, world ● Owner and group access authenticated through certificates ● Anonymous world access – no certificate required
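The owner/group/world model above can be sketched as a small read check. This is only an illustration under stated assumptions: the `Protection` record, its field names, and the rule that the owner can always read are hypothetical and not part of any ILDG specification. Certificate verification is assumed to have happened already, so the requester is represented by a certificate subject string, or `None` for an anonymous caller.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Protection:
    """Hypothetical access record for one piece of data."""
    owner_dn: str                # certificate subject of the owner
    group_dns: FrozenSet[str]    # certificate subjects of group members
    group_read: bool             # may group members read?
    world_read: bool             # may anonymous users read?


def may_read(prot: Protection, requester_dn: Optional[str]) -> bool:
    """Return True if the requester may read the data.

    requester_dn is None for anonymous access (no certificate presented).
    """
    if prot.world_read:
        return True              # world access needs no certificate
    if requester_dn is None:
        return False             # owner and group access need a certificate
    if requester_dn == prot.owner_dn:
        return True              # assumption: the owner can always read
    return prot.group_read and requester_dn in prot.group_dns
```

Anonymous callers are only admitted through the world bit, which matches the slide's statement that world access requires no certificate.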
Architecture Overview ● Client / Server model ● Services (Standardised) – will be Stateless Web Services – communicating through SOAP messages – will have MINIMAL interfaces – Minimal amount of standardisation ● Clients/Applications (Local issue) – May be complex – Maximal amount of local flexibility
The obvious... ● Clients/Applications are a local issue BUT – member collaborations may of course collaborate on writing clients – Can share effort/code – Outside current Middleware WG remit – ILDG may consider setting up an Applications WG to coordinate effort on clients?
Overview of Services ● Metadata Catalogue (MDC) – Maps metadata query to ● Metadata document AND/OR ● Global File Name (GFN) ● Replica Catalogue (RC) – Maps GFN to one or more URLs – URL can be ● SURL for use with SRM ● Normal URL for download without SRM
Overview of Services ● Storage Resource Manager (SRM) – Already standardised, and in version 2 – SRM standardisation body to become GGF working group – SRM maps SURL to Transfer URL (TURL) – SRM can negotiate transfer protocol – SRM can manage a large collection of storage devices, deal with certificates etc. – Version 2 reference implementation available from JLab.
Overview of Services ● SRM can be highly beneficial for managing storage ● Non-SRM-based data grids will need to add an SRM web service interface over their data grid implementation ● But this may be time-consuming, so initially the RC can return TURLs directly, and an SRM implementation is not mandated until everyone has one
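The chain of services described in this overview (metadata query to GFN, GFN to URLs, SURL to TURL) can be sketched with in-memory stand-ins. Everything concrete here is an assumption for illustration: the example query string, the host names, the use of an `srm://` prefix to recognise a SURL, and the `gsiftp://` scheme chosen by the stand-in SRM.

```python
from typing import Dict, List

# Stand-in for the Metadata Catalogue: metadata query -> GFNs.
MDC: Dict[str, List[str]] = {
    "beta=5.2": ["gfn://ukqcd/ens1/cfg_0001"],
}

# Stand-in for the Replica Catalogue: GFN -> one or more URLs.
RC: Dict[str, List[str]] = {
    "gfn://ukqcd/ens1/cfg_0001": [
        "srm://se.example.org/ens1/cfg_0001",    # SURL, needs an SRM
        "http://www.example.org/ens1/cfg_0001",  # plain URL, no SRM needed
    ],
}


def resolve_srm(surl: str) -> str:
    """Stand-in for an SRM: map a SURL to a transfer URL (TURL)."""
    return "gsiftp://" + surl[len("srm://"):]


def locate(query: str) -> List[str]:
    """Return directly usable URLs for all data matching a metadata query."""
    urls: List[str] = []
    for gfn in MDC.get(query, []):
        for url in RC.get(gfn, []):
            # Only SURLs go through the SRM; other URLs are used as they are.
            urls.append(resolve_srm(url) if url.startswith("srm://") else url)
    return urls
```

A client following this pattern works unchanged whether a collaboration's RC returns SURLs or directly usable URLs, which is the interim arrangement the last slide describes.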
MDC Interface Definition ● Mandated 4 functions: – doMetadataQuery() ● queries both ensemble and configuration metadata – doEnsembleQuery() ● queries only ensemble metadata – doConfigurationQuery() ● queries only configuration metadata – getSupportedQueryTypes() ● return types of query supported by server
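The four mandated functions can be written down as an abstract interface with a toy implementation. Only the function names come from the slide. The argument lists, the return types, and the substring matching used by `ToyMDC` are assumptions made for illustration; the actual signatures are those of the WSDL definition.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MetadataCatalogue(ABC):
    """The four mandated MDC functions (signatures are assumed)."""

    @abstractmethod
    def doMetadataQuery(self, query_type: str, query: str) -> List[str]: ...

    @abstractmethod
    def doEnsembleQuery(self, query_type: str, query: str) -> List[str]: ...

    @abstractmethod
    def doConfigurationQuery(self, query_type: str, query: str) -> List[str]: ...

    @abstractmethod
    def getSupportedQueryTypes(self) -> List[str]: ...


class ToyMDC(MetadataCatalogue):
    """In-memory catalogue that matches documents by substring."""

    def __init__(self, ensembles: Dict[str, str], configurations: Dict[str, str]):
        self._ensembles = ensembles            # name -> metadata text
        self._configurations = configurations  # name -> metadata text

    def getSupportedQueryTypes(self) -> List[str]:
        return ["substring"]

    def _match(self, docs: Dict[str, str], query_type: str, query: str) -> List[str]:
        if query_type not in self.getSupportedQueryTypes():
            raise ValueError("unsupported query type: " + query_type)
        return sorted(name for name, text in docs.items() if query in text)

    def doEnsembleQuery(self, query_type: str, query: str) -> List[str]:
        return self._match(self._ensembles, query_type, query)

    def doConfigurationQuery(self, query_type: str, query: str) -> List[str]:
        return self._match(self._configurations, query_type, query)

    def doMetadataQuery(self, query_type: str, query: str) -> List[str]:
        # Queries both kinds of metadata: ensemble hits, then configuration hits.
        return (self.doEnsembleQuery(query_type, query)
                + self.doConfigurationQuery(query_type, query))
```

A client would typically call `getSupportedQueryTypes()` first and pick a query type the server advertises.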
MDC Interface Definition ● Query language choice – Lots to choose from (SQL, XPath, XQuery, XSLT) – ILDG must support lowest common denominator... (XPath v1.0?) ● Mapping between SQL and XML Schema is difficult/inefficient – map only leaf nodes (automatic mapping)? – potential maintenance nightmare for all eternity as Schema changes?
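To make the XPath option concrete, here is a minimal sketch of a path query over a handful of XML metadata documents. The element names (`ensemble`, `action`, `beta`) are hypothetical and are not taken from the QCDML schema. Python's `xml.etree.ElementTree` implements only a subset of XPath 1.0, so this shows the idea rather than a full XPath engine.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical metadata documents, keyed by name.
DOCS: Dict[str, str] = {
    "ens_a": "<ensemble><action><beta>5.20</beta></action></ensemble>",
    "ens_b": "<ensemble><action><beta>5.29</beta></action></ensemble>",
}


def xpath_query(docs: Dict[str, str], path: str) -> List[str]:
    """Return the names of all documents in which the path selects something."""
    hits: List[str] = []
    for name, text in docs.items():
        root = ET.fromstring(text)
        if root.findall(path):   # the path is evaluated relative to the root
            hits.append(name)
    return hits


# Select ensembles whose <action> has a <beta> child with text "5.20".
matches = xpath_query(DOCS, "./action[beta='5.20']")
```

Because the query is evaluated against the XML directly, no mapping from the schema to relational tables is needed, which avoids the maintenance concern raised on the slide.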
MDC Interface Definition ● Adopted M. Sato's working prototype MDC spec ● Asked him to support mandated functions and produce WSDL definition. ● Definition and MDC Demo Service are now ready – – there is a new RC WS prototype there too
RC Interface Definition ● Mandated 2 functions: – getURL( GFN ) ● returns URL for a given GFN ● returned URL may be stale – addURL( GFN, URL) ● inform RC that a replica of data for GFN exists at URL. ● information may be queued in server for later processing ● always succeeds.
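The two mandated RC functions can be sketched as follows. The function names come from the slide; the return types, the in-memory storage, and the `process_pending` helper are assumptions for illustration. `getURL` returns a list here because the RC is described elsewhere in the talk as mapping a GFN to one or more URLs.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class ToyReplicaCatalogue:
    """In-memory sketch of the mandated RC interface."""

    def __init__(self) -> None:
        self._replicas: Dict[str, List[str]] = {}
        self._pending: Deque[Tuple[str, str]] = deque()

    def addURL(self, gfn: str, url: str) -> bool:
        """Record that a replica of gfn exists at url.

        The information is only queued, so the call always succeeds.
        """
        self._pending.append((gfn, url))
        return True

    def process_pending(self) -> None:
        """Hypothetical server-side step that applies queued additions."""
        while self._pending:
            gfn, url = self._pending.popleft()
            urls = self._replicas.setdefault(gfn, [])
            if url not in urls:
                urls.append(url)

    def getURL(self, gfn: str) -> List[str]:
        """Return the known URLs for gfn. They may be stale or incomplete."""
        return list(self._replicas.get(gfn, []))
```

Because additions are queued, a replica registered with `addURL` may not be visible to `getURL` until the server has processed its queue, so clients should treat results as hints.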
RC Interface Definition ● Associations in RC may be stale – local implementation may periodically check all associations (consistency agent?) ● Non-mandated management functions – setProtection(GFN, Protection) – createGFN(GFN) – adviseStaleURL(URL) ● Y. Chen has produced WSDL definition and reference implementation – see for details
Middleware Technologies
GFN Structure ● GFN to be a URI ● gfn://collaboration/local-name ● Control over local-name up to collaboration – may be flat/opaque strings, may have directory-like semantics, may or may not support reservation ● Collaboration part can be used to identify service instances. ● GFNs are unique & persistent - forever
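Since a GFN is a URI of the form gfn://collaboration/local-name, a standard URI parser can split it. The sketch below uses Python's `urllib.parse.urlsplit`; the example GFN is made up, and the sketch assumes the local-name contains no `?` or `#` characters, which a URI parser would treat as query and fragment separators.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_gfn(gfn: str) -> Tuple[str, str]:
    """Split a GFN into (collaboration, local_name).

    The local name is returned as an opaque string: whether it is flat
    or directory-like is up to the collaboration.
    """
    parts = urlsplit(gfn)
    if parts.scheme != "gfn":
        raise ValueError("not a GFN: " + gfn)
    local_name = parts.path.lstrip("/")
    if not parts.netloc or not local_name:
        raise ValueError("GFN needs a collaboration and a local name: " + gfn)
    return parts.netloc, local_name


collaboration, local_name = parse_gfn("gfn://ukqcd/ens1/cfg_0001")
```

The collaboration part is what a client would use to look up the service instances that know about the file.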
ILDG Group Files ● ILDG Group Files – contain public certificates of a group of people – may contain URLs to other group files – allows quick and easy collation of certificates. – anyone can create a group – sites can trust “groups” and if necessary reject individuals (within a trusted group) ILDG should patent this for the public domain before someone else does
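The collation of certificates from nested group files can be sketched as a recursive walk. The file layout is an assumption: each group file is modelled as a dictionary with a `certificates` list and a `groups` list of URLs to other group files, and `fetch` is a hypothetical callable that retrieves and parses one file. The `rejected` set models a site rejecting individuals within a trusted group.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Set

GroupFile = Dict[str, List[str]]


def collate(url: str,
            fetch: Callable[[str], GroupFile],
            rejected: FrozenSet[str] = frozenset(),
            _seen: Optional[Set[str]] = None) -> Set[str]:
    """Collect all certificates reachable from the group file at url."""
    seen = set() if _seen is None else _seen
    if url in seen:
        return set()              # already visited: group files may be cyclic
    seen.add(url)
    group = fetch(url)
    certs = set(group.get("certificates", []))
    for sub_url in group.get("groups", []):
        certs |= collate(sub_url, fetch, rejected, seen)
    return certs - set(rejected)  # drop individually rejected certificates
```

Tracking visited URLs matters because anyone can create a group, so two group files may well refer to each other.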
Bag Attributes localKeyID: subject=/C=UK/O=eScience/OU=Edinburgh/L=NeSC/CN=balint joo support.ac.uk -----BEGIN CERTIFICATE END CERTIFICATE
ILDG Service Description File ● File on ILDG Web site ● Maintained by hand for now – Only O(10) participating collaborations envisaged for now, can automate later if necessary ● Contains location of ILDG services ● Indexed by collaboration name from GFN
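A lookup in the service description file might look like the sketch below. The slides do not specify the file format, so JSON is used purely as a stand-in, and the collaboration name, service keys (`mdc`, `rc`), and endpoint URLs are invented for the example.

```python
import json
from typing import Dict

# Stand-in for the hand-maintained file on the ILDG web site.
SERVICES_TEXT = """
{
  "ukqcd": {
    "mdc": "http://mdc.ukqcd.example.org/",
    "rc": "http://rc.ukqcd.example.org/"
  }
}
"""


def services_for(gfn: str, services_text: str) -> Dict[str, str]:
    """Return the service locations of the collaboration named in a GFN."""
    prefix = "gfn://"
    if not gfn.startswith(prefix):
        raise ValueError("not a GFN: " + gfn)
    # The collaboration name is everything up to the first "/" after the prefix.
    collaboration = gfn[len(prefix):].split("/", 1)[0]
    table = json.loads(services_text)
    return table[collaboration]   # raises KeyError for unknown collaborations


endpoints = services_for("gfn://ukqcd/ens1/cfg_0001", SERVICES_TEXT)
```

With only O(10) collaborations, a client can fetch and parse the whole file on each run without needing a directory service.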
Metadata Envelopes ● Ownership & Access not part of “physics metadata” ● How to add this information to metadata “non-intrusively”? ● Metadata contains some (but not full) revision information? ● Can encapsulate metadata within an envelope
... Your QCDML Metadata goes Here subject=/C=UK/O=eScience/OU=Edinburgh/L=NeSC/ CN=balint joo true false true false Thu Dec 2 23:15:57 GMT 2004
Envelopes not Mandated ● Whether to use or not is a local issue ● an “implementation hint” if you like ● solve problem of “dressing” metadata ● when Metadata is modified, old revision can be kept. Envelope can hold revision information ● Queries/Query results may need to be transformed to take account of envelope
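The envelope idea, including the query transformation mentioned in the last bullet, can be sketched with `xml.etree.ElementTree`. All element names (`envelope`, `owner`, `revision`, `payload`, and the toy metadata) are hypothetical, as is the choice of fields stored in the envelope.

```python
import xml.etree.ElementTree as ET


def wrap(metadata_xml: str, owner_dn: str, revision: int) -> ET.Element:
    """Put a metadata document inside an envelope without modifying it."""
    envelope = ET.Element("envelope")
    ET.SubElement(envelope, "owner").text = owner_dn
    ET.SubElement(envelope, "revision").text = str(revision)
    ET.SubElement(envelope, "payload").append(ET.fromstring(metadata_xml))
    return envelope


def unwrap(envelope: ET.Element) -> ET.Element:
    """Return the original metadata root element."""
    return envelope.find("payload")[0]


def envelope_path(path: str) -> str:
    """Rewrite a path relative to the metadata root so that it can be
    evaluated against the envelope root instead."""
    if not path.startswith("./"):
        raise ValueError("expected a path starting with './'")
    # "*" stands for the metadata root element inside <payload>.
    return "./payload/*/" + path[2:]


env = wrap("<ensemble><action><beta>5.20</beta></action></ensemble>",
           "/CN=owner", 1)
betas = [e.text for e in env.findall(envelope_path("./action/beta"))]
```

Keeping ownership and revision data outside the payload means the physics metadata still validates against its own schema after unwrapping.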
Timeline (Plans) ● Production middleware by Dec 2005 (optimistically) or June 2006 (realistically). – EPCC committed in early 2005 to work on UKQCD “local” implementation... – JLab efforts focussed on cluster building ● Participants to cooperate on implementation ● Another middleware meeting between May 2005 & Dec 2005 (in Japan?)