OGSA-DAI
Open Grid Services Architecture – Data Access and Integration
NeSC Review, 18 March 2004
Description and Aims
  OGSA-DAI
    Provide a uniform access framework for heterogeneous data resources on the Grid
  Data resources:
    Relational databases
    XML collections
    Can widen scope, e.g. files, any data source/sink
  Middleware which:
    Reduces development cost of data-centric Grid applications
    Facilitates Grid data-centric application development
    Facilitates data integration
  Increased collaboration
Status
  Functional Scope defined goals
    Had a 66-point functional scope for Phase II
  All MUSTs/SHOULDs achieved except:
    Statement Metadata (partially complete)
      – We do not list all SQL operations a GDS supports
  MAYs completed:
    Transforming application data (compression, XSLT)
    Caching of data
    Block transfer of data
    Scripting (though we no longer call it that)
    Persistent components
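Two of the completed MAY items, block transfer and compression of result data, can be illustrated with a small sketch. It is not taken from OGSA-DAI: SQLite stands in for the data resource, and `iter_blocks`, `compress_block`, `decompress_block`, the table name and the block size are all hypothetical choices made for this example.

```python
import json
import sqlite3
import zlib


def iter_blocks(cursor, block_size):
    """Yield query results in blocks of at most block_size rows."""
    while True:
        rows = cursor.fetchmany(block_size)
        if not rows:
            break
        yield rows


def compress_block(rows):
    """Serialise one block as JSON and compress it for delivery."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def decompress_block(payload):
    """Inverse of compress_block; JSON turns row tuples into lists."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


def demo():
    # In-memory database used purely as a stand-in data resource.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reading (id INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO reading VALUES (?, ?)",
        [(i, i * 0.5) for i in range(10)],
    )
    cursor = conn.execute("SELECT id, value FROM reading ORDER BY id")
    # Sender side: one compressed payload per block of four rows.
    payloads = [compress_block(block) for block in iter_blocks(cursor, 4)]
    # Receiver side: decompress each payload and reassemble the rows.
    received = [row for p in payloads for row in decompress_block(p)]
    return len(payloads), received
```

Sending results in blocks keeps memory use bounded on both sides, and compressing each block independently means the receiver can start work before the full result set has arrived.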
Workplan
  All workpackages and deliverables completed
  The project lasted 92 ( ) weeks.
  Over the 92 weeks (4/2/02 to 31/10/03):
    138 SM of effort at EPCC were planned
    SM were expended
    1% overrun

  Workpackage (WP) and Status
  1. Programme Management: FINISHED
    1.1 End of Phase 2 review (Oracle, Complete)
      D1.1 End of Phase 2 Report
  2. Architecture: FINISHED
    2.1 High Level Design (EPCC/IBM, Complete)
      D2.1 High level system architecture. Due M3, 31/12/2002, delivered 25/11/
    Architectural definition (EPCC/IBM/NeSC/UoM, Complete)
      D2.2 Architecture framework definition. Due M3, 31/12/2002, delivered 7/11/2002
  3. Development: FINISHED
    3.1 Detailed Component design (EPCC/IBM, Complete)
      D3.1.1 Release design documentation (Release 1)
      D3.1.2 Release design documentation (Release 2)
      D3.1.3 Release design documentation (Release 3)
    3.2 Component implementation (EPCC/IBM, Complete)
      D3.2.1 Release implementation code (Release 1)
      D3.2.2 Release implementation code (Release 2)
      D3.2.3 Release implementation code (Release 3)
    3.3 Write documentation (EPCC/IBM, Complete)
      D3.3.1 Release system and user documentation (Release 1)
      D3.3.2 Release system and user documentation (Release 2)
      D3.3.3 Release system and user documentation (Release 3)
    3.4 Develop test strategy and test suites (EPCC/IBM, Complete)
      D3.4 Release test strategy and test suites
    3.5 Perform tests (EPCC/IBM, Complete)
      D3.5.1 Documented test results (Release 1)
      D3.5.2 Documented test results (Release 2)
      D3.5.3 Documented test results (Release 3)
  4. Distributed Query Processing: FINISHED
    4.1 Revise DQP model (UoM/UoN, Complete)
      D4.1 Revised DQP model
    4.2 Develop prototype (UoM/UoN, Complete)
      D4.2 Distributed query prototype
  5. Release Management: Complete (Phase 2)
Releases
  Releases added functionality in staged deliveries
  Kept on target
    Max slippage was 2 weeks due to GGF
  Made available through project website and GTR
    Early adopters had early access to release candidates
  1199 downloads at 31st Oct
    % from UK

  Release              Release Date
  Release              /09/03
  Release 3            31/07/03
  Release 2 interim    11/06/03
  Release 2            15/04/03
  Release 1 interim    28/02/03
  Release 1            15/01/03
The Basics
  [Diagram; labels: Data Resource, Container, DAISGR, Client, GDSF, GDS]
Technical Achievements
  Grid Data Service
    Perform documents allow for powerful “scripting”
      Composition of requests (encapsulation of activities)
    Activity Framework easily extended by developers
    Variety of delivery/upload mechanisms
      SOAP/HTTP, GridFTP, GDT
    Can achieve complex composition patterns
      e.g. distributed queries using temporary tables
  Grid Data Service Factory
    Simple to configure
    Supported databases: MySQL, DB2, Oracle, Xindice
    Other “working” databases: SQL Server, Postgres, Access (via JDBC/ODBC)
  DAI Service Group Registry
    Framework for service discovery
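The composition pattern named on this slide, a distributed query staged through a temporary table, can be sketched as follows. This is a minimal illustration and not OGSA-DAI code: two in-memory SQLite databases stand in for two data resources, and the activity functions, the `run_perform` helper, and all table and column names are hypothetical. The ordered list of activities loosely plays the role of a perform document, with each activity passing its output to the next.

```python
import sqlite3


def make_resources():
    """Create two stand-in data resources (assumption: in-memory SQLite)."""
    patients = sqlite3.connect(":memory:")
    patients.execute("CREATE TABLE patient (id INTEGER, name TEXT)")
    patients.executemany(
        "INSERT INTO patient VALUES (?, ?)",
        [(1, "Ann"), (2, "Bob"), (3, "Cy")],
    )
    scans = sqlite3.connect(":memory:")
    scans.execute("CREATE TABLE scan (patient_id INTEGER, result TEXT)")
    scans.executemany(
        "INSERT INTO scan VALUES (?, ?)",
        [(1, "clear"), (3, "review")],
    )
    return patients, scans


def query_activity(conn, sql):
    """Hypothetical activity: run a query and return all rows."""
    def run(_previous):
        return conn.execute(sql).fetchall()
    return run


def bulk_load_activity(conn, table, columns):
    """Hypothetical activity: load the previous output into a temp table."""
    def run(rows):
        conn.execute(f"CREATE TEMP TABLE {table} ({columns})")
        if rows:
            placeholders = ", ".join("?" for _ in rows[0])
            conn.executemany(
                f"INSERT INTO {table} VALUES ({placeholders})", rows
            )
        return len(rows)
    return run


def run_perform(activities):
    """Run activities in order, feeding each one the previous output."""
    output = None
    for activity in activities:
        output = activity(output)
    return output


def distributed_join():
    patients, scans = make_resources()
    return run_perform([
        # 1. Query the first resource.
        query_activity(patients, "SELECT id, name FROM patient"),
        # 2. Stage those rows in a temporary table on the second resource.
        bulk_load_activity(scans, "tmp_patient", "id INTEGER, name TEXT"),
        # 3. Join locally on the second resource.
        query_activity(
            scans,
            "SELECT t.name, s.result FROM tmp_patient t "
            "JOIN scan s ON s.patient_id = t.id ORDER BY t.name",
        ),
    ])
```

Calling `distributed_join()` returns the joined rows from the second resource. The temporary table means the join runs inside one database engine rather than in the client, which is the point of the pattern.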
Dissemination (1)
  Selected Presentations (EPCC during Phase II only)
    10th Anniversary of Poznan Supercomputing Centre, October 24, 2003
    Designing and Building Grid Services Workshop, Chicago, October 8, 2003
    Glasgow Kelvin Hub opening, September 17, 2003
    All Hands presentations and demonstrations, Nottingham, September 2-4, 2003
    DAIS F2F, NeSC, August 21 – 22, 2003
    ASTAR Visit, NeSC, July 14, 2003
    Virtual Observatory as a Data Grid, NeSC, June 30 – July 2, 2003
    Geoffrey Fox visit, EPCC, April 4, 2003
    NeSC Review, NeSC, March 28, 2003
    OGSA-DAI / Informatics meeting, NeSC, March 27, 2003
    OGSA Experiences Panel, GGF7 Tokyo, March 4-7, 2003
    NeSC Open Day, NeSC, January 17, 2003
Dissemination (2)
  Posters
    NeSC Review, NeSC, September 30, 2003
    UK e-Science All Hands, Nottingham, September 2-4, 2003
    GlobusWorld, January 13 – 17, 2003
  Publications through GGF
    DAIS – File Access, September 19, 2003
    DAIS – Grid Data Service Specification, September 19, 2003
    DAIS – Relational Specialisation, September 19, 2003
    DAIS – XML Specialisation, September 19, 2003
    DFDL – Basic Structures Ontology, August 5, 2003
    DFDL – Primitive Type Ontology, August 5, 2003
    DFDL – Structural Description, August 5, 2003
    DFDL – XML Representation, August 5, 2003
    DFDL – Primer, June 4, 2003
  Other notable publications
    “Grid Security for Dummies”, available from OGSA-DAI website, October 29, 2003
Training
  Courses and tutorials by EPCC staff were run at:
    eScience Summer School, NeSC, September 29 – October 3, 2003
    International Summer School on Grid Computing, Naples, July 13 – 25, 2003
    OGSA-DAI Training Course, NeSC, April 22, 2003
    OGSA-DAI Tutorial, GGF7 Tokyo, March 4, 2003
    Creating Grid Services using GT3 and Java course, NeSC, February 24, 2003
    OGSA-DAI Training Course, NeSC, February 11, 2003
    OGSA-DAI Training Course, NeSC, January 8, 2003
  “Show and tell” method of increasing exposure
Support
  Support for OGSA-DAI through Grid Support Centre from Release 2
    Very useful to encourage user take-up
  Query desk
    Regular stream of queries
  Active user list
    “Power Users” submitted answers to other users' questions
    Discussed innovative ways of extending OGSA-DAI
Exploitation
  Projects started at both EPCC/NeSC and IBM using OGSA-DAI:
    eDiaMoND
    FirstDIG
    INWA
    BRIDGES
    EdSkyQueryG
  Many more projects using OGSA-DAI
  Presentations, Visits and Training have been vital to uptake of OGSA-DAI
Future plans
  Work continues under the DAIT project
    Research and develop OGSA-DAI software
    Improve performance and scalability
  Liaise with technology adopters
    Make sure OGSA-DAI works for them
  Liaise with the Globus Alliance
    OGSA-DAI also distributed through Globus Toolkit
  Continue standardisation process through DAIS
  We’ve done Data Access … now it’s time for Data Integration!
Project Participants
  EPCC
    Ali Anjomshoaa, Mario Antonioletti, Rob Baxter, Neil Chue Hong, Ally Hume, Mike Jackson, Amy Krause, Jeremy Nowell, Charaka Palansuriya, Tom Sugden, Martin Westhead
  IBM UK
    Brian Collins, Simon Laws, Andrew Borley, James Magowan, Neil Hardman, George Hicken, Manfred Oevers, Alan Knox
  IBM US
    Susan Malaika, Inderpal Narang
  NeSC
    Malcolm Atkinson
  Oracle UK
    Dave Pearson
  University of Manchester
    Norman Paton, Nedim Alpdemir
  University of Newcastle
    Paul Watson, Arijit Mukherjee