OGSA Data Architecture

Presentation transcript:

OGSA Data Architecture Dave Berry & Allen Luniewski OGF20, 7th May 2007

OGF IPR Policies Apply
“I acknowledge that participation in this meeting is subject to the OGF Intellectual Property Policy.”

Intellectual Property Notices Note Well: All statements related to the activities of the OGF and addressed to the OGF are subject to all provisions of Appendix B of GFD-C.1, which grants to the OGF and its participants certain licenses and rights in such statements. Such statements include verbal statements in OGF meetings, as well as written and electronic communications made at any time or place, which are addressed to:
- the OGF plenary session,
- any OGF working group or portion thereof,
- the OGF Board of Directors, the GFSG, or any member thereof on behalf of the OGF,
- the ADCOM, or any member thereof on behalf of the ADCOM,
- any OGF mailing list, including any group list, or any other list functioning under OGF auspices,
- the OGF Editor or the document authoring and review process.

Statements made outside of an OGF meeting, mailing list or other function, that are clearly not intended to be input to an OGF activity, group or function, are not subject to these provisions.

Excerpt from Appendix B of GFD-C.1: “Where the OGF knows of rights, or claimed rights, the OGF secretariat shall attempt to obtain from the claimant of such rights, a written assurance that upon approval by the GFSG of the relevant OGF document(s), any party will be able to obtain the right to implement, use and distribute the technology or works when implementing, using or distributing technology based upon the specific specification(s) under openly specified, reasonable, non-discriminatory terms. The working group or research group proposing the use of the technology with respect to which the proprietary rights are claimed may assist the OGF secretariat in this effort. The results of this procedure shall not affect advancement of document, except that the GFSG may defer approval where a delay may facilitate the obtaining of such assurances. The results will, however, be recorded by the OGF Secretariat, and made available. The GFSG may also direct that a summary of the results be included in any GFD published containing the specification.”

OGF Intellectual Property Policies are adapted from the IETF Intellectual Property Policies that support the Internet Standards Process.

Contents
- Current Status
- Architecture Document
- Scenarios Document

Two Informational Documents
- OGSA Data Architecture
  - 70+ pages
  - Describes services and their interfaces
- OGSA Data Scenarios
  - 50+ pages
  - Describes how the services can be composed to address particular scenarios

Progress since OGF19
- Slow progress
  - Small group
  - We would welcome new contributors
- Focus on polishing the architecture document
  - Mostly complete; review/editing in progress
  - Updating and completing glossary
- We welcome feedback and comments

Current State
- Architecture Document
  - Technical content substantially complete
  - Review of presentation in progress
- Scenarios Document
  - Some scenarios still need work
  - Integration with execution management services
- Aiming for submission in August

Architecture Document

Scope
- Files and databases (& storage)
  - No stream management, session management, …
- Services and interfaces
  - Storage, Access, Transfer
  - Replication, Caching, Federation, Metadata catalogues
- Cross-cutting themes
  - Security, Policies, …
- Part of the bigger OGSA picture
  - E.g. Naming, Workflow, Transactions, Scheduling, Provisioning, …

Architecture Document
- Services
  - Data Transfer
  - Data Access
  - Storage Resource Management
  - Data Cache
  - Data Replication
  - Data Federation
  - Metadata Catalogues
- Appendices
  - Specifications referenced
  - Mappings to specifications: DAIS, ByteIO, SRM, DMI, …
- Glossary

Basic Structure
[Diagram. Labels: Client APIs (non-OGSA) / Other services; Data Service (Description, Access, Sink/Source, Storage Management); Stored Data Resources; Other Data Resources; Managed Storage. Legend: service interface, resource interface.]

Transfer and Replication
[Diagram. Labels: Client APIs (non-OGSA) / Other services; Replication; Transfer; Data Service (Description, Access, Sink/Source); Data Resources; Transfer Protocols. Legend: service interface, resource interface.]

Composite Entities
[Diagram. Labels: Federation; Cache; Data Service (Description, Access, Sink/Source). Legend: service interface, resource interface.]

Unstructured access
- RandomIO
  - Sessionless
  - Read(), Write(), Truncate()
- StreamableIO
  - Stateful session resource
  - Read(), Write()
  - Not part of data access layer: Create(), Destroy()
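The contrast between sessionless and session-based access can be sketched in a few lines of Python. This is only an illustration: the class names echo the slide, but the method signatures (offsets, lengths), the in-memory buffer, and the zero-padding behaviour are assumptions, not the ByteIO specification.

```python
class RandomIO:
    """Sessionless access: every call carries its own offset."""

    def __init__(self, data: bytes = b"") -> None:
        self._buf = bytearray(data)

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._buf[offset:offset + length])

    def write(self, offset: int, data: bytes) -> None:
        # Assumption: pad with zero bytes if the write starts past the end.
        if offset > len(self._buf):
            self._buf.extend(b"\x00" * (offset - len(self._buf)))
        self._buf[offset:offset + len(data)] = data

    def truncate(self, size: int) -> None:
        del self._buf[size:]


class StreamableIO:
    """Stateful session resource: the current position lives in the session."""

    def __init__(self, resource: RandomIO) -> None:  # stands in for Create()
        self._resource = resource
        self._pos = 0
        self._open = True

    def read(self, length: int) -> bytes:
        self._check_open()
        chunk = self._resource.read(self._pos, length)
        self._pos += len(chunk)
        return chunk

    def write(self, data: bytes) -> None:
        self._check_open()
        self._resource.write(self._pos, data)
        self._pos += len(data)

    def destroy(self) -> None:  # stands in for Destroy()
        self._open = False

    def _check_open(self) -> None:
        if not self._open:
            raise RuntimeError("session destroyed")
```

In this sketch the session object owns only the position, so destroying it leaves the underlying data untouched.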

Structured access
- BasicExecute()
- SQLExecute()
- XQuery()
- XUpdate()
- RDF…

Storage Management (1)
Directory Management Functions (synchronous & asynchronous):
- List Files
- Release File Locks
- Remove Files
- Copy Files
- Move Files
- Make Directory
- Delete Directory

Storage Management (2)
Space Management Functions:
- Reserve Space()
- Get Space()
- Release Space()
- Set Quota()
Sink / Source: covered by Transfer
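As a rough illustration of how the four space management functions might interact, the following Python sketch tracks reservations against a quota. The semantics are assumptions made for the example (sizes in bytes, opaque reservation tokens, a quota that cannot drop below what is already reserved); the slides give only the function names.

```python
import itertools


class SpaceManager:
    """Toy space manager: reservations are counted against a quota."""

    def __init__(self, quota: int) -> None:
        self._quota = quota
        self._reservations = {}  # token -> reserved size
        self._ids = itertools.count(1)

    def reserved(self) -> int:
        return sum(self._reservations.values())

    def set_quota(self, quota: int) -> None:
        if quota < self.reserved():
            raise ValueError("quota below space already reserved")
        self._quota = quota

    def reserve_space(self, size: int) -> str:
        if size <= 0:
            raise ValueError("size must be positive")
        if self.reserved() + size > self._quota:
            raise RuntimeError("reservation exceeds quota")
        token = f"space-{next(self._ids)}"  # hypothetical token format
        self._reservations[token] = size
        return token

    def get_space(self, token: str) -> int:
        return self._reservations[token]

    def release_space(self, token: str) -> None:
        del self._reservations[token]
```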

Data Transfer
See the DMI WG.
[Diagram. Labels: User; Transfer Factory; Transfer Instance; Source; Sink; Create; Control; EPR; e.g. GridFTP. An EPR is shown as <MyEPR> containing <wsa:Address> (a URL) and <ReferenceParameters>.]

The user wants to move file “foo” on the Source Service to file “bar” on the Sink Service.
1. The user contacts the Source Service to get a WS-Name_foo.
2. The user contacts the Sink Service to get a WS-Name_bar.
3. The user contacts the movement factory with WS-Name_foo and WS-Name_bar (may pass in QoS parameters + *).
4. The movement factory talks to the services represented by WS-Name_foo and WS-Name_bar to obtain their supported movement protocols.
5. The movement factory picks a protocol by some unspecified method.
6. The movement factory contacts the service represented by WS-Name_foo with the chosen protocol identifier and gets back a protocol-specific handle, URI_foo, which represents the information required to transfer data from the source.
7. The movement factory contacts WS-Name_bar with the chosen protocol identifier and gets back a protocol-specific handle, URI_bar, which represents the information required to transfer data to the sink.
8. The movement factory instantiates a movement service that initiates and monitors the transfer using URI_foo and URI_bar.
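The negotiation performed by the movement factory can be sketched as follows. This is a minimal Python illustration under stated assumptions: `DataService`, `Transfer`, `create_transfer`, and the URI format are hypothetical names invented here, and because the slide leaves the protocol selection method unspecified, the sketch simply picks the first protocol both sides support.

```python
from dataclasses import dataclass


@dataclass
class DataService:
    """Hypothetical source or sink service addressed by a WS-Name."""
    ws_name: str
    protocols: list  # protocol identifiers, in this service's order

    def supported_protocols(self) -> list:
        return list(self.protocols)

    def handle_for(self, protocol: str) -> str:
        # Return a protocol-specific URI (format invented for the sketch).
        if protocol not in self.protocols:
            raise ValueError(f"{protocol} not supported by {self.ws_name}")
        return f"{protocol}://{self.ws_name}"


@dataclass
class Transfer:
    """Stands in for the movement service instance created by the factory."""
    protocol: str
    source_uri: str
    sink_uri: str
    state: str = "created"


def create_transfer(source: DataService, sink: DataService) -> Transfer:
    """Movement factory: negotiate a protocol, then instantiate a transfer."""
    sink_protocols = set(sink.supported_protocols())
    common = [p for p in source.supported_protocols() if p in sink_protocols]
    if not common:
        raise RuntimeError("no common transfer protocol")
    # Assumption: take the first common protocol in the source's order.
    protocol = common[0]
    return Transfer(protocol,
                    source.handle_for(protocol),
                    sink.handle_for(protocol))
```

A real factory would also weigh any QoS parameters passed by the user; the sketch omits them.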

Replication
Entities: Factory, Replica, Replica Catalogue
- CreateReplica()
- Management Process + Targets
- ModifyReplicatedContents()
- SynchroniseReplica()
- ValidateReplica()
- Destroy Replica()
- Replica Catalogue: maps name to list of (replica, target) pairs
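The catalogue mapping described on this slide can be sketched as a small Python class. The method names `register`, `lookup`, and `remove` are assumptions for illustration; the slide states only that a name maps to a list of (replica, target) pairs.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Maps a logical name to a list of (replica, target) pairs."""

    def __init__(self) -> None:
        self._entries = defaultdict(list)

    def register(self, name: str, replica: str, target: str) -> None:
        pair = (replica, target)
        if pair not in self._entries[name]:  # ignore duplicate registrations
            self._entries[name].append(pair)

    def lookup(self, name: str) -> list:
        # Return a copy so callers cannot mutate the catalogue.
        return list(self._entries.get(name, []))

    def remove(self, name: str, replica: str) -> None:
        self._entries[name] = [
            pair for pair in self._entries[name] if pair[0] != replica
        ]
```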

Federation
Entities: Factory, Federation
- CreateFederation()
- AddResourceToFederation()
- RemoveResourceFromFederation()
- AddAccessMechanismToFederation()
- RemoveAccessMechanism()
- UpdateFederationProperties()
- Destroy()

Cache
Entities (as labelled on the slide): Factory, Federation
- Create()
- ReConfigure()
- Synchronise()
- Destroy()

Gaps: For version 2
- Information model for data resources
  - For management, resource reservation, …
- Simple standard for describing files
  - Name, size, creation time, etc.
- Registries of URIs that name languages, formats, etc.
- Security extensions
- Integration of access and transfer
  - 3rd-party delivery
- Policies
  - Replication coherency, caching coherency, catalogue consistency, etc.
- Sessions
- Streams
- Provisioning, etc.

Scenarios

Scenarios document
- Example scenarios of a generic nature
- Illustrates how the OGSA Data Architecture can be used to satisfy a selection of scenarios
- Tests the architecture
- Not generating requirements

Scenarios
- Current
  - Data replication
  - Data federation
  - Data pipelining
  - Data staging
  - Personal data service
  - Data storage
  - Data provenance
  - Grid file system
- Possible
  - Data caching
  - Data warehousing
  - Metadata catalogue service

Replication – Direct Access
[Diagram. Components: User; Replica Manager; Replication Service; Data Transfer Service; Registry Service; Data Service 1 with Data Storage 1; Data Service 2 with Data Storage 2. Labelled steps: 1. Specify replicas; 2. Provision; 3. Register; 4. Transfer copies; 5. Find data; 6. Access data; 7. Notify; 8. Update.]

Example interfaces between the consumer/client, the services and the data storage elements, with references to the relevant sections of the OGSA Data Architecture document [OGSA Data Arch]. The interfaces in the different steps of the scenario are as follows:
1. A data resource is:
   a. Registered with a replicating data service (details such as creation time, access control, etc. would also be included) – section 10.2 “Creating Replicas”.
   b. Entered into a replica catalogue by the replication service – section 10.3 “Discovering Replicas”.
2. The replication service uses a data transfer service to move copies of this data to different locations and tracks which data is kept where – section 6 “Data Transfer”.
3. Clients access the catalogue to find the data resource, or to return a list of resources that satisfy certain Quality of Service (QoS) requirements – section 10.3 “Discovering Replicas”.
4. Clients then access the stores either directly or indirectly – section 7 “Data Access”, i.e. any suitable data access interface such as DAIS or ByteIO [ByteIO].
5. Changes to the data are notified to the replication service – section 3.4 “Notification of Events”.
6. Updates then occur between the data services to synchronize the replicas – section 6 “Data Transfer”.
Here it has been assumed that a catalogue as described in section 12 “Metadata Catalogue & Registries” has been used. In steps 1b and 3, however, if a DAIS-exposed database were employed, for example, then DAIS [WS-DAI] updates or queries could also be performed.

Replication – Indirect Access
[Diagram. Components: User; Replica Manager; Replication Service; Data Transfer Service; Registry Service; Data Service 1 with Data Storage 1; Data Service 2 with Data Storage 2; Data Service 3. Labelled steps: 1. Specify replicas; 2. Provision; 3. Register; 4. Transfer copies; 5. Request data; 5. Find data; 6. Access data; 7. Notify; 8. Update.]

The interface notes for this slide are identical to those given for the direct access scenario.

Data Pipelining – 3rd Party Delivery
[Diagram. Components: Client; Analyst; Rendering Service; Data Service (Completed Animations); Data Transfer Service; Visualisation Service. Labelled steps: 1. Submit job; 2. Store results; 3. Transfer results; 4. Return results.]

In this scenario results may be streamed to the customers as soon as they are completed, or they could be sent as a whole in one batch on final completion of the rendering job. The scenario gives examples of 3rd Party Delivery and Data Streaming.

Example interfaces between the consumer/client, the services and the data storage elements, with references to the relevant sections of the OGSA Data Architecture document [OGSA Data Arch]. The interfaces in the different steps of the scenario are as follows:
1. Customer 1 (Designer) submits a rendering job to the Rendering Service – section 3.7 “Reservation and scheduling” and Execution Management Services (EMS) in OGSA. The details of the job would be defined in, say, JSDL [JSDL].
2. The completed animation is stored to a common storage device – section 3.7. It is assumed that the details of how and where to store the results would be controlled by the Execution Management Services.
3. The Rendering Service transfers the completed animations from the Data Service to the Visualization Service using the Data Transfer Service – section 6 “Data Transfer”.
4. The Visualization Service displays the animations to the clients (Designer & Reviewer/Customer) in an agreed format – most likely an application-specific interface.

Data Storage – Bringing data online
[Diagram. Components: Client; Data Storage Service; Data Transfer Service; Storage Devices (Online Storage, Nearline Storage). Labelled steps: 1. Make files online; 2. Transfer files; 3. Retire to nearline.]

The interfaces in the different steps of the scenario are as follows:
1. The files are made available online by the Data Storage Service – section 8.8 “SRM Interfaces”.
2. The data are read through an appropriate interface, such as the Transfer Service – section 6 “Data Transfer”.
3. The online attribute of the files may expire and they can be retired to nearline storage – section 8.8 “SRM Interfaces”.
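The online/nearline life cycle in this scenario can be illustrated with a short Python sketch. It is not the SRM interface: the class `StorageService`, its method names, and the use of an explicit `now` value with a numeric lifetime are all assumptions chosen to keep the example deterministic.

```python
class StorageService:
    """Toy storage service: files are nearline unless made online."""

    def __init__(self, files) -> None:
        # None means the file is currently on nearline storage.
        self._online_until = {name: None for name in files}

    def make_online(self, name: str, now: float, lifetime: float) -> None:
        # Step 1: bring the file online for a limited lifetime.
        self._online_until[name] = now + lifetime

    def is_online(self, name: str, now: float) -> bool:
        # Step 2 (reading or transferring) is only sensible while online.
        expiry = self._online_until[name]
        return expiry is not None and now < expiry

    def retire_expired(self, now: float) -> list:
        # Step 3: files whose online attribute has expired go back to nearline.
        retired = sorted(
            name for name, expiry in self._online_until.items()
            if expiry is not None and expiry <= now
        )
        for name in retired:
            self._online_until[name] = None
        return retired
```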

Data Staging
[Diagram. Components: Client; Data Transfer Service; Data Service 1 (Input Data, Output Data (copy)); Data Service 2 (Input Data (copy), Output Data); BES Container. Labelled steps: 1. Submit JSDL script; 2a. Stage input data; 2b. Transfer input data; 3. BES Container: run executable & save resulting output data; 4a. Stage output data; 4b. Transfer output data; 5. Delete input data (copy); 6. Delete output data.]
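The ordering of the staging steps can be sketched in Python. The sketch makes several simplifying assumptions that are not in the slides: one dictionary stands in for the client-side data service, a second for the BES container's working area, the executable is an ordinary function, and step 6 is taken to mean deleting the container-side output once it has been copied back. The function name `run_staged_job` is hypothetical.

```python
def run_staged_job(data_service, input_name, output_name, executable):
    """Follow the numbered steps of the staging diagram for one job."""
    steps = []
    container = {}  # working area of the (hypothetical) BES container

    # Step 2: stage in, copying the input data to the container.
    container[input_name] = data_service[input_name]
    steps.append("stage-in")

    # Step 3: run the executable and save the resulting output data.
    container[output_name] = executable(container[input_name])
    steps.append("run")

    # Step 4: stage out, copying the output data back to the data service.
    data_service[output_name] = container[output_name]
    steps.append("stage-out")

    # Steps 5 and 6: delete the input copy and the container-side output.
    del container[input_name]
    del container[output_name]
    steps.append("clean-up")

    return steps
```

For example, `run_staged_job(store, "in.txt", "out.txt", lambda data: data.upper())` leaves both the original input and the new output in `store`, with nothing left behind in the container.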

Questions?

Full Copyright Notice
Copyright (C) Open Grid Forum 2007. All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. The limited permissions granted above are perpetual and will not be revoked by the OGF or its successors or assignees.