DAIT (DAI Two) NeSC Review 18 March 2004
Description and Aims
- Grid is about resource sharing
  - Data forms an important part of that vision
- Data on Grids:
  - May be geographically distributed
  - Storage technology and formats are not homogeneous
  - Need to dynamically bind to data sources on demand
- Grid-enabled data resources:
  - Discoverable through published metadata
  - Robust against variations in standards
  - Easy to aggregate, federate and manage

Issues
Important facets that need to be considered:
- Metadata
  - Data structure, access policies, data content, provenance, physical properties, etc.
  - Registries required for resource discovery/matching
  - Allow dynamic binding to data sources based on provision of metadata
- Different modes of operation/delivery
  - JDBC, ODBC, bulk data transfer, third-party data transfer, incremental data transfer, delayed data transfer
- Security mechanisms
  - Authentication, authorisation, accounting, audit, privacy/encryption
- Data transformation processes
  - Reformatting, compression
- Facilitate transaction / workflow participation arrangements

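The metadata points above (publish a description, match it in a registry, bind only when needed) can be illustrated with a minimal sketch. This is not OGSA-DAI code: the class names, the metadata fields and the use of an in-memory SQLite database as a stand-in data resource are all assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class ResourceMetadata:
    """Published description of a data resource (hypothetical fields)."""
    name: str
    data_model: str                 # e.g. "relational", "xml", "file"
    delivery_modes: FrozenSet[str]  # e.g. {"jdbc", "bulk"}
    connect: Callable[[], sqlite3.Connection]  # invoked only at bind time


class Registry:
    """Toy registry: resources publish metadata, clients match against it."""

    def __init__(self) -> None:
        self._entries: List[ResourceMetadata] = []

    def publish(self, meta: ResourceMetadata) -> None:
        self._entries.append(meta)

    def find(self, data_model: str, delivery_mode: str) -> List[ResourceMetadata]:
        # Discovery: match purely on the published metadata.
        return [
            m for m in self._entries
            if m.data_model == data_model and delivery_mode in m.delivery_modes
        ]

    def bind(self, data_model: str, delivery_mode: str) -> sqlite3.Connection:
        # Dynamic binding: open a connection to the first match on demand.
        matches = self.find(data_model, delivery_mode)
        if not matches:
            raise LookupError(f"no {data_model} resource offering {delivery_mode}")
        return matches[0].connect()


# Stand-in resource: an in-memory SQLite database (assumption for the demo).
registry = Registry()
registry.publish(ResourceMetadata(
    name="catalogue",
    data_model="relational",
    delivery_modes=frozenset({"jdbc", "bulk"}),
    connect=lambda: sqlite3.connect(":memory:"),
))

conn = registry.bind("relational", "bulk")
result = conn.execute("SELECT 1 + 1").fetchone()[0]  # result == 2
```

The client never names the resource directly. It states what it needs, and the registry resolves that to a concrete connection, which is the sense of "dynamic binding" used on the slide.
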
Goals for DAIT
- Keep the OGSA-DAI name for DAIT
- Aim to deliver application mechanisms that:
  - Meet the data requirements of Grid applications
  - Reduce development cost of data-centric Grid applications
  - Provide consistent interfaces to data resources
    - Functionality, performance, etc.
  - Acceptable and supportable by database providers
  - Provide a standard framework that satisfies standard requirements
    - Trustable, imposed demand is acceptable, etc.
- A base for developing higher-level services
  - Distributed query processing
  - Data federation

Status
- Project started 1st October 2003
  - Funding for 2 years
  - EPCC, NeSC, Newcastle, Manchester
- Industry Collaboration
  - IBM engaged at start of 2004
  - Oracle and Fujitsu on Programme Board
- People:
  - 5 FTEs at EPCC, 5 FTEs at IBM
  - 1 PDRA at each of NeSC, Newcastle, Manchester
- Releases:
  - 6-monthly major releases
  - Next release scheduled for April

Project Structure
- Principal Investigators
- EPCC Development and Support Team
- Project Manager
- IBM Development Team
- IBM Exploitation Team
- Programme Management Board
- Research Team
- Technical Review Board

Projects Using OGSA-DAI
- OGSA-DAI
- AstroGrid
- BioSimGrid
- BioGrid
- Bridges
- eDiaMoND
- FirstDig
- GeneGrid
- GEON
- IU RGRBench
- myGrid
- N2Grid
- ODD-Genes
- OGSA-WebDB
- INWA

Points to Note
- Most projects seem to be biological
- Feedback from users largely positive
  - Need more feedback
- Hope to allow user contributions
  - Need to establish a policy/framework for this
- Engage more with User Community
  - Meetings scheduled for later this year:
    - OGSA-DAI users meeting at NeSC in April
    - OGSA-DAI mini-workshop at AHM 2004
    - OGSA-DAI tutorials at various meetings/locations

Protecting Your Users
- Very rapidly moving field
  - Technology changes
  - Standards changes
  - Middleware changes
- Need to ensure:
  - Technology adopters' investment in OGSA-DAI is protected
  - Adopters are shielded from future change
- Positives:
  - Document interface helpful
  - Client toolkit Tech Preview in R3.1

DAIT Roadmap
- R3.1 (Feb)
  - Technical Preview of part of R4
  - User Group: inaugural meeting, 7th April 2004
- R4.0 (April 2004)
  - Performance & monitoring
  - Additional DBMSs supported (SQL Server, Postgres)
  - DBMS management operations: archive, restore, bulk load
  - File access
  - Client libraries
  - Installation wizard
  - User support, courses, training material, performance report

DAIT Roadmap 2
- R5 (October 2004)
  - Alignment with DAIS Spec
    - Assuming specs settle post GGF11
  - Integration into Globus Toolkit
  - WS-RF friendly
    - Basic Web Services friendly as well?
- R6 (April 2005) and R7 (October 2005)
  - Increased data integration tools
  - Functionality driven by user group

Further information
- The OGSA-DAI Project Site
- The DAIS-WG site
- OGSA-DAI Users Mailing list
  - General discussion on grid DAI matters
- Formal support for OGSA-DAI releases
- OGSA-DAI training courses