OGSA Data Architecture Scenarios
Dave Berry & Stephen Davey
GGF18, 13th September 2006
© 2006 Open Grid Forum
Contents
  Overview
  Five sample scenarios
    Data Pipelining
    Data Storage
    Data Replication
    Data Staging (joint OGSA Data + EMS data staging scenario)
    Personal Data Service
Two Informational Documents
  OGSA Data Architecture
    70+ pages
    Describes the services and their interfaces
    Some work remaining to describe interfaces
    ogsa-d-wg/docman.root.working_drafts/doc12659
  OGSA Data Scenarios
    50+ pages
    Describes how the services can be combined to address particular scenarios
    Some work remaining to identify interfaces
    wg/docman.root.working_drafts/doc13605
Scenarios document
  Example scenarios of a generic nature to accompany the OGSA Data Architecture document.
  They illustrate how the components and interfaces described in the OGSA Data Architecture document can be put together in a selection of typical data scenarios.
  It is not a use case document for generating requirements.
Current Scope
  Files and databases (& storage)
    Not streams, sessions, …
  Services and interfaces
    Storage, Access, Transfer
    Replication, Caching, Federation, Metadata catalogues
  Cross-cutting themes
    Security, Policies, …
  Part of the bigger OGSA picture
    E.g. Naming, Workflow, Transactions, Scheduling, Provisioning, …
Progress since GGF16
  More scenarios
    E.g. Provenance, Grid File System
  More integration
    Particularly between the scenarios and architecture documents
    Also raising some issues from individual chapters to cross-cutting concerns
Scenarios done so far
  Data Storage: store file data in a Grid Data Service and retrieve it later.
  Data Replication: maintain a replica of data at a different location (for availability or performance).
  Data Staging: move data in preparation for performing operations on or with it.
  Data Pipelining: connect the output from one service to the input of another.
Also in the scenarios document:
  Data Integration: bring together the data that you require from disparate sources.
  Personal Data Service: organise an individual's data so that they can access it from many different locations.
  Data Discovery: discover data; register data and metadata.
  Data Provenance: the provenance of a piece of data is the process that led to that piece of data; the history of ownership of an object.
  Grid File System: provide a virtual file system in a Grid environment.
Data Pipelining
  Diagram components: Customer 1, Customer 2, Visualisation Service, Data Service, Data Transfer Service, Rendering Service, Completed Animations.
  Steps:
    1. Submit job.
    2. Store results.
    3. Transfer results.
    4. Return results.
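The idea of connecting the output of one service to the input of another can be sketched with chained Python generators. This is only an illustration: the function names, the string payloads, and the order in which the stages are chained are assumptions read from the slide labels, not interfaces defined by the OGSA Data Architecture.

```python
from typing import Callable, Iterable, Iterator

# A stage consumes a stream of items and yields a new stream (hypothetical type).
Stage = Callable[[Iterable[str]], Iterator[str]]


def visualise(jobs: Iterable[str]) -> Iterator[str]:
    """Stand-in for the Visualisation Service: turn each job into frames."""
    for job in jobs:
        yield f"frames({job})"


def transfer(items: Iterable[str]) -> Iterator[str]:
    """Stand-in for the Data Transfer Service: pass items through unchanged."""
    for item in items:
        yield item


def render(items: Iterable[str]) -> Iterator[str]:
    """Stand-in for the Rendering Service: turn frames into an animation."""
    for item in items:
        yield f"animation({item})"


def pipeline(source: Iterable[str], *stages: Stage) -> Iterable[str]:
    """Feed the output of each stage into the input of the next."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


results = list(pipeline(["job-1", "job-2"], visualise, transfer, render))
```

Because each stage is a generator, an item flows through all stages before the next one is read, which mirrors the streaming character of a pipeline rather than a store-then-forward design.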
Data Storage – Writing a file
  Diagram components: Customer, Data Storage Service, Access Service, Transfer Service, Storage Devices, File Space.
  Steps:
    1. Request file space.
    2. Get file name (SURL).
    3. Get Transfer URL (TURL) or Access URL.
    4a. Write file.
    4b. Access file.
    4c. Transfer file.
    5. Notify of completion.
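The write sequence above can be walked through with a small in-memory model. Everything in this sketch is hypothetical: the class, its method names, and the `surl://` and `turl://` URL forms are invented for illustration and are not the interfaces of any real storage service. Only step 4a (write through the TURL) is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class StorageService:
    """In-memory stand-in for a storage service; every name here is hypothetical."""
    capacity: int
    spaces: dict = field(default_factory=dict)     # space token -> bytes reserved
    turls: dict = field(default_factory=dict)      # TURL -> SURL
    files: dict = field(default_factory=dict)      # SURL -> file contents
    completed: set = field(default_factory=set)    # SURLs whose write was confirmed

    def request_space(self, size: int) -> str:
        """Step 1: reserve file space, failing if capacity would be exceeded."""
        if sum(self.spaces.values()) + size > self.capacity:
            raise RuntimeError("insufficient space")
        token = f"space-{len(self.spaces) + 1}"
        self.spaces[token] = size
        return token

    def get_surl(self, token: str, name: str) -> str:
        """Step 2: obtain a file name (SURL) inside the reserved space."""
        if token not in self.spaces:
            raise KeyError(token)
        return f"surl://storage.example/{token}/{name}"

    def get_turl(self, surl: str) -> str:
        """Step 3: obtain a transfer URL (TURL) for the given SURL."""
        turl = "turl://" + surl[len("surl://"):]
        self.turls[turl] = surl
        return turl

    def write(self, turl: str, data: bytes) -> None:
        """Step 4a: write the file through its TURL."""
        self.files[self.turls[turl]] = data

    def notify_completion(self, surl: str) -> None:
        """Step 5: tell the service that the write has finished."""
        if surl not in self.files:
            raise KeyError(surl)
        self.completed.add(surl)


service = StorageService(capacity=1024)
token = service.request_space(100)
surl = service.get_surl(token, "results.dat")
turl = service.get_turl(surl)
service.write(turl, b"payload")
service.notify_completion(surl)
```

Separating the stable name (SURL) from the short-lived transfer handle (TURL) lets the service choose where and how the bytes actually move without changing the name the customer keeps.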
Data Storage – Bringing data online
  Diagram components: Customer, Data Storage Service, Transfer Service, Storage Devices, Nearline Storage, Online Storage.
  Steps:
    1. Make files online.
    2. Transfer files.
    3. Retire to nearline.
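The three steps above amount to a small state change per file: nearline to online, transfer while online, then back to nearline. The sketch below models that under stated assumptions; `TieredStore`, `Tier`, and the rule that a transfer fails for a nearline file are all hypothetical choices made for this example.

```python
from enum import Enum


class Tier(Enum):
    NEARLINE = "nearline"
    ONLINE = "online"


class TieredStore:
    """In-memory stand-in for a storage service with two tiers (hypothetical)."""

    def __init__(self, files):
        self._data = dict(files)
        # Assumption: every file starts on nearline storage.
        self._tier = {name: Tier.NEARLINE for name in self._data}

    def tier(self, name):
        return self._tier[name]

    def make_online(self, names):
        """Step 1: bring the named files from nearline to online storage."""
        for name in names:
            self._tier[name] = Tier.ONLINE

    def transfer(self, name):
        """Step 2: hand the file contents over; only allowed while online."""
        if self._tier[name] is not Tier.ONLINE:
            raise RuntimeError(f"{name} is not online")
        return self._data[name]

    def retire(self, names):
        """Step 3: return the named files to nearline storage."""
        for name in names:
            self._tier[name] = Tier.NEARLINE


store = TieredStore({"run42.dat": b"abc"})
store.make_online(["run42.dat"])
payload = store.transfer("run42.dat")
store.retire(["run42.dat"])
```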
Data Replication – 1
  Diagram components: Customer 1, Customer 2, Registry Service, Replication Service, Data Transfer Service, Data Service 1, Data Service 2, Data Storage 1, Data Storage 2.
  Steps:
    1a. Register data.
    1b. Publish.
    2. Transfer copies.
    3. Find data.
    4. Access data.
    5. Notify.
    6. Update.
Data Replication – 2
  Diagram components: Customer 1, Customer 2, Replication Service, Replica Catalogue Service, Data Service, Data Transfer Service, Data Service 1, Data Service 2, Data Storage 1, Data Storage 2.
  Steps:
    1. Register.
    2. Transfer copies.
    3. Find data.
    4. Access data.
    5. Notify.
    6. Update.
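The replication steps can be sketched with a catalogue that maps a logical name to the stores holding a copy. This is a minimal model under assumptions: the class and method names are hypothetical, stores are plain dictionaries, and an update is pushed eagerly to every replica, which is only one of several possible consistency policies and is not stated on the slides.

```python
class ReplicaCatalogue:
    """Maps a logical data name to the stores that hold a copy (hypothetical)."""

    def __init__(self):
        self._locations = {}

    def register(self, logical_name, store_name):
        locations = self._locations.setdefault(logical_name, [])
        if store_name not in locations:
            locations.append(store_name)

    def find(self, logical_name):
        """Step 3: return the known locations of the data."""
        return list(self._locations.get(logical_name, []))


class ReplicationService:
    """Copies data between stores and keeps the catalogue in step (hypothetical)."""

    def __init__(self, catalogue, stores):
        self.catalogue = catalogue
        self.stores = stores  # store name -> {logical name: data}

    def publish(self, logical_name, store_name, data):
        """Step 1: place the original and register it."""
        self.stores[store_name][logical_name] = data
        self.catalogue.register(logical_name, store_name)

    def replicate(self, logical_name, target):
        """Step 2: transfer a copy to another store and register it."""
        source = self.catalogue.find(logical_name)[0]
        self.stores[target][logical_name] = self.stores[source][logical_name]
        self.catalogue.register(logical_name, target)

    def update(self, logical_name, data):
        """Steps 5 and 6: push a change to every replica; return who was updated."""
        locations = self.catalogue.find(logical_name)
        for store_name in locations:
            self.stores[store_name][logical_name] = data
        return locations


stores = {"storage-1": {}, "storage-2": {}}
catalogue = ReplicaCatalogue()
replication = ReplicationService(catalogue, stores)
replication.publish("dataset-a", "storage-1", b"v1")
replication.replicate("dataset-a", "storage-2")
found = catalogue.find("dataset-a")
updated = replication.update("dataset-a", b"v2")
```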
Joint OGSA Data + EMS Scenario
The steps of this simple scenario are as follows:
  1. Submit job to BES container (the JSDL contains execution and data staging information).
  2. Use the data transfer service to do the required data staging.
  3. Run the executable on the BES container with the input data.
  4. Stage the resulting output data back to Data Service 1.
  5. Delete the staged input data at the BES container.
  6. Delete the staged output data at the BES container.
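The six steps can be traced in a short sketch. It rests on several assumptions: the data service and the BES container's working area are modelled as plain dictionaries, the job description is a Python dictionary standing in for a JSDL document (it is not JSDL), and `run_staged_job` is a hypothetical helper, not part of any BES or data transfer interface.

```python
def run_staged_job(job, data_service, container):
    """Run one job with stage-in, execution, stage-out, and clean-up.

    Calling this function plays the role of step 1 (submitting the job).
    """
    # Step 2: stage the input data into the container.
    for name in job["stage_in"]:
        container[name] = data_service[name]

    # Step 3: run the executable on the staged inputs and save its outputs.
    inputs = {name: container[name] for name in job["stage_in"]}
    container.update(job["executable"](inputs))

    # Step 4: stage the output data back to the data service.
    for name in job["stage_out"]:
        data_service[name] = container[name]

    # Step 5: delete the staged input copies from the container.
    for name in job["stage_in"]:
        del container[name]

    # Step 6: delete the staged output data from the container.
    for name in job["stage_out"]:
        del container[name]


data_service_1 = {"input.txt": "hello grid"}
bes_container = {}
job = {
    # Hypothetical executable: upper-case the input file.
    "executable": lambda inputs: {"output.txt": inputs["input.txt"].upper()},
    "stage_in": ["input.txt"],
    "stage_out": ["output.txt"],
}
run_staged_job(job, data_service_1, bes_container)
```

After the call, the data service holds both the original input and the new output, and the container's working area is empty again, which is the end state the scenario describes.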
Data Staging
  Diagram components: Client, BES Container, Data Transfer Service, Data Service 1, Data Service 2, Input Data, Input Data (copy), Output Data, Output Data (copy).
  Steps:
    1. Submit JSDL script.
    2a. Stage input data.
    2b. Transfer input data.
    3. Run executable & save resulting output data (BES Container).
    4a. Stage output data.
    4b. Transfer output data.
    5. Delete input data (copy) (BES Container).
    6. Delete output data (BES Container).
Personal Data Service
  Diagram components: Customer 1 (site 1), Customer 1 (site 2), Personal Data Service, Global Name Resolver Service, Registry Service, Index, Data Service 1, Data Service 2, Data Service 3, Local Cache Service 1, Local Cache Service 2.
  Steps:
    1. Locate data.
    2. Create named space.
    3. Name collection.
    4. Use named space.
    5. Update.
    6. Use named space.
    7. Update.
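One way to picture the named space is as a shared mapping from personal names to data locations, used by the same customer from two sites. The sketch below is a loose illustration only: `NameResolver`, `SiteView`, the space name, and the data are hypothetical, and cache invalidation between sites is deliberately not modelled.

```python
class NameResolver:
    """Stand-in for a global name resolver: personal name -> data location."""

    def __init__(self):
        self._spaces = {}

    def create_space(self, space):
        """Step 2: create a named space for one individual."""
        self._spaces.setdefault(space, {})

    def bind(self, space, name, service, key):
        """Step 3: give a piece of located data a name in the space."""
        self._spaces[space][name] = (service, key)

    def resolve(self, space, name):
        return self._spaces[space][name]


class SiteView:
    """One site's access path, with a simple local cache (hypothetical)."""

    def __init__(self, resolver, services):
        self.resolver = resolver
        self.services = services  # service name -> {key: value}
        self.cache = {}

    def read(self, space, name):
        """Steps 4 and 6: use the named space to fetch the data."""
        service, key = self.resolver.resolve(space, name)
        value = self.services[service][key]
        self.cache[(space, name)] = value
        return value

    def update(self, space, name, value):
        """Steps 5 and 7: write a change through to the underlying service."""
        service, key = self.resolver.resolve(space, name)
        self.services[service][key] = value
        self.cache[(space, name)] = value


services = {"data-service-1": {"thesis.tex": "draft 1"}, "data-service-2": {}}
resolver = NameResolver()
resolver.create_space("alice")
resolver.bind("alice", "thesis", "data-service-1", "thesis.tex")

site_1 = SiteView(resolver, services)
site_2 = SiteView(resolver, services)
first = site_1.read("alice", "thesis")
site_1.update("alice", "thesis", "draft 2")
second = site_2.read("alice", "thesis")
```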
Questions?