edikt: e-Science Data, Information and Knowledge Transformation
NeSC Review, 30 September 2003
Dr. Denise Ecklund, edikt technical architect
2 What is edikt?
The team: 8 professional software engineers, an architect, a project manager, and support staff
SHEFC-funded research and development grant
– 3 years of funding: May 2002 – 2005
– A further 3 years of funding upon a successful project and review
[Diagram labels: Standards; CS Research; Commercial SW components and skills; e-Science Apps; edikt project (requirements analysis, technology matchmaking, gap filling, rigorous engineering); Grid Services for e-Science Data Management]
3 Current activities
ELDAS – Enterprise-level data access services
– Core data services supporting e-Science virtual organisations
BinX – Binary XML
– Supports data interchange for astronomy and other applications
OSAGE – Ontology-based Species Atlas for Gene Expression
– Defines a database schema for storing and annotating 3D anatomy and gene expression data for multiple species
Technology and research evaluations
4 Creating a Virtual Organisation
[Diagram labels: Radio spectrum; X-Ray spectrum; Optical spectrum; DB2 DB; MySQL DB; Xindice DB]
"Let's share our data!"
"Great! How do I get it?"
"Great! Where is it?"
"It's at X. Get it with Y."
"I can't find it!"
"I can't read it!"
ELDAS + Grid Directory Services
5 ELDAS – Extensibility via DACs
Data Access Components (DACs) interface to distinct DBMSs
Multiple DB drivers can be supported
– JDBC, ODBC for relational DBMSs
Plug-and-play installation of ELDAS
[Diagram labels: User1, User2, User3; Reusable ELDAS Core; DAC, DAC2; DB2 DB, MySQL DB, Xindice DB, Oracle 9i DB]
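The DAC idea on this slide can be illustrated with a small plug-in sketch. Everything in it is an assumption made for illustration: the slides describe ELDAS as a Java/EJB system, whereas this is Python, and the names `DataAccessComponent`, `SqliteDAC`, `InMemoryDAC`, `DataServiceCore` and `build_demo_core` are hypothetical rather than part of ELDAS. SQLite stands in for a relational DBMS reached through a driver, and a dictionary stands in for a non-relational store.

```python
import sqlite3
from abc import ABC, abstractmethod


class DataAccessComponent(ABC):
    """Hypothetical DAC interface: one implementation per kind of DBMS."""

    @abstractmethod
    def query(self, statement: str) -> list:
        """Run a query and return the result rows as a list of tuples."""


class SqliteDAC(DataAccessComponent):
    """Stand-in for a relational DAC (a real one might wrap JDBC or ODBC)."""

    def __init__(self, connection: sqlite3.Connection):
        self._connection = connection

    def query(self, statement: str) -> list:
        return self._connection.execute(statement).fetchall()


class InMemoryDAC(DataAccessComponent):
    """Stand-in for a non-relational store; the statement is a lookup key."""

    def __init__(self, documents: dict):
        self._documents = documents

    def query(self, statement: str) -> list:
        if statement in self._documents:
            return [(statement, self._documents[statement])]
        return []


class DataServiceCore:
    """Reusable core: routes each request to whichever DAC is plugged in."""

    def __init__(self):
        self._dacs = {}

    def register(self, source_name: str, dac: DataAccessComponent) -> None:
        # Adding a new DBMS means registering one more DAC; the core is unchanged.
        self._dacs[source_name] = dac

    def query(self, source_name: str, statement: str) -> list:
        if source_name not in self._dacs:
            raise KeyError(f"no DAC registered for data source {source_name!r}")
        return self._dacs[source_name].query(statement)


def build_demo_core() -> DataServiceCore:
    """Wire up a core with two made-up data sources for demonstration."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE spectra (band TEXT, flux REAL)")
    db.execute("INSERT INTO spectra VALUES ('radio', 1.5)")
    core = DataServiceCore()
    core.register("relational", SqliteDAC(db))
    core.register("documents", InMemoryDAC({"obs-1": "<spectrum band='optical'/>"}))
    return core
```

A caller would use `build_demo_core().query("relational", "SELECT band, flux FROM spectra")` without knowing which DBMS sits behind the name. That separation is what the slide appears to mean by a reusable core.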
6 ELDAS – EJB Implementation
Java 2 Enterprise Edition implements basic server tasks
Enterprise JavaBeans (EJB) container used to implement the ELDAS core
ELDAS runs anywhere
Suitable for grid & web
[Diagram labels: Java Framework; ELDAS; EJB - GDS; DAC; Web Servlet; Grid Proxy; Web User1; Grid User1, Grid User2; DB2 DB, MySQL DB, Xindice DB, Oracle 9i DB]
7 BinX – accessing legacy binary data
The Problem:
– Many binary data files
– Applications must know the data format
– Binary data formats are machine-specific
The Solution:
– Write a stand-aside format description in XML
– Provide a library to interpret the description and provide file access across different machines
– Build higher-level services
BinX file describes binary file structure
[Diagram labels: e-Science Application; simulations; BinX Library; Binary Data File]
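The "stand-aside description plus library" approach can be sketched in a few lines. This is an illustration under stated assumptions, not the BinX library or its schema: the XML vocabulary (`layout`, `field`, `byteOrder`, `type`) and the function `read_record` are invented for this example. Byte order is taken from the description instead of the host machine, which is the point made on the slide.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description vocabulary, invented for this sketch (not the BinX schema).
DESCRIPTION = """
<layout byteOrder="bigEndian">
  <field name="count" type="int32"/>
  <field name="temperature" type="float64"/>
</layout>
"""

# Map the made-up type and byte-order names onto struct format characters.
TYPE_CODES = {"int32": "i", "float64": "d"}
BYTE_ORDERS = {"bigEndian": ">", "littleEndian": "<"}


def read_record(description_xml: str, data: bytes) -> dict:
    """Decode one record from `data` using the layout given in the XML description."""
    root = ET.fromstring(description_xml.strip())
    fields = root.findall("field")
    # An explicit byte-order prefix makes struct ignore the host's native
    # byte order and alignment, so the result is the same on any machine.
    fmt = BYTE_ORDERS[root.get("byteOrder")] + "".join(
        TYPE_CODES[f.get("type")] for f in fields
    )
    values = struct.unpack(fmt, data[: struct.calcsize(fmt)])
    return {f.get("name"): value for f, value in zip(fields, values)}
```

For example, `read_record(DESCRIPTION, struct.pack(">id", 42, 21.5))` yields the two named fields regardless of the byte order of the machine running the code. The application never hard-codes the file layout.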
8 BinX – format transformation
Even when we try to agree, we disagree
Multiple data format standards require conversions
Data format transformations based on XML descriptions
[Diagram labels: Spectral Analysis Application; 3D Image Data Mining Application; Binary Data File (x2); FITS data format; VOTable data format; BinX description; BinX Utilities; BinX Library]
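A description-driven transformation can be sketched as "decode with the source description, re-encode with the target description". The sketch below is an assumption-laden illustration: it does not read FITS or VOTable, the XML vocabulary is invented, and `parse_layout`, `transform`, `SOURCE` and `TARGET` are hypothetical names rather than BinX utilities.

```python
import struct
import xml.etree.ElementTree as ET

# Two made-up layouts holding the same fields in a different order and byte order.
SOURCE = """
<layout byteOrder="bigEndian">
  <field name="id" type="int32"/>
  <field name="flux" type="float64"/>
</layout>
"""

TARGET = """
<layout byteOrder="littleEndian">
  <field name="flux" type="float64"/>
  <field name="id" type="int32"/>
</layout>
"""

TYPE_CODES = {"int32": "i", "float32": "f", "float64": "d"}
BYTE_ORDERS = {"bigEndian": ">", "littleEndian": "<"}


def parse_layout(description_xml: str):
    """Return (field names, struct format string) for a layout description."""
    root = ET.fromstring(description_xml.strip())
    fields = root.findall("field")
    names = [f.get("name") for f in fields]
    fmt = BYTE_ORDERS[root.get("byteOrder")] + "".join(
        TYPE_CODES[f.get("type")] for f in fields
    )
    return names, fmt


def transform(data: bytes, source_xml: str, target_xml: str) -> bytes:
    """Re-encode one record from the source layout into the target layout."""
    src_names, src_fmt = parse_layout(source_xml)
    dst_names, dst_fmt = parse_layout(target_xml)
    # Decode into named values, then emit them in the target's order and byte order.
    record = dict(zip(src_names, struct.unpack(src_fmt, data)))
    return struct.pack(dst_fmt, *(record[name] for name in dst_names))
```

Because the conversion logic depends only on the two descriptions, supporting another format in this sketch means writing another description, not another converter.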
9 OSAGE – Applying Computer Science
Extend the Edinburgh Mouse Atlas
– Data model to describe multiple species
– Support scientific collaboration via data sharing
Computer Science theory and best practice
– Generic data model for species anatomy
– Flexible data annotation and versioning with XML
[Diagram labels: CS theory; Data Access Services; DB2 DB]
10 The Future – bringing components together
CS research results layered over basic ELDAS services
BinX is an intelligent binary file data source
Extended Grid Data Services for Virtual Organisations
[Diagram labels: ELDAS ...; User Annotation Service; Data Versioning Service; Data Archiving Service; Constraint Mgmt Service; BinX Library; Binary Files; DB2 DB, MySQL DB, Xindice DB]
Thank you! Questions?