Extensible Framework for Data Access & Integration. Malcolm Atkinson, Director, www.nesc.ac.uk, 10th November 2004.

Presentation transcript:

Extensible Framework for Data Access & Integration
Malcolm Atkinson, Director
10th November 2004

Database Growth
[Chart: PDB Content Growth]

Wellcome Trust: Cardiovascular Functional Genomics
[Diagram labels: Glasgow, Edinburgh, Leicester, Oxford, London, Netherlands; Shared data; Public curated data; BRIDGES; IBM]

Biochemical Pathway Simulator
(Computing Science, Bioinformatics, Beatson Cancer Research Labs)
- DTI Bioscience Beacon Project
- Harnessing Genomics Programme
- Now largest EU project in the Life Sciences; see Walter Kolch
Slide from Muffy Calder, Glasgow

eDiaMoND – Compute
- Mammograms have different appearances, depending on image settings and acquisition systems
- Standard Mammo Format
- Temporal mammography
- Computer Aided Detection
- 3D View
Provided by eDiaMoND project: Prof. Sir Mike Brady et al.

Automatic registration technology
- Rigid registration of MR and CT images of the head
- Inter-subject image warping
Provided by IXI project: Prof. Derek Hill et al.

Move Computation to Data
- Code scale
  - Depends on wet-ware
    - No noticeable rate of improvement
- Data scale
  - Grows Moore's Law or Moore's Law 2
- Analysis of data
  - Extracts & derivatives used
    - Often smaller; more value for current investigation
- Implies move code to data
  - SQL, XQuery, Java code, …
  - Extensibility mechanisms used by OGSA-DAIers
  - Java mobility (e.g. DataCutter), database procedures, …
- Increasingly necessary
  - Application control or higher-level service decisions
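A minimal sketch of the "move code to data" point, not taken from the talk: it assumes an in-memory SQLite database as a stand-in for a remote data resource, and the table name `readings` and both function names are hypothetical. One function ships every row to the client and computes there; the other ships an SQL aggregate to the data so that only one row travels back.

```python
import sqlite3


def build_demo_db(n_rows: int = 1000) -> sqlite3.Connection:
    """Create an in-memory table standing in for a large remote data resource."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany(
        "INSERT INTO readings (id, value) VALUES (?, ?)",
        [(i, float(i % 10)) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def mean_by_moving_data(conn: sqlite3.Connection):
    """Move data to code: fetch every row, then compute on the client."""
    rows = conn.execute("SELECT value FROM readings").fetchall()
    mean = sum(value for (value,) in rows) / len(rows)
    return mean, len(rows)  # second item: number of rows transferred


def mean_by_moving_code(conn: sqlite3.Connection):
    """Move code to data: send an SQL aggregate, receive a single row."""
    rows = conn.execute("SELECT AVG(value) FROM readings").fetchall()
    return rows[0][0], len(rows)  # second item: number of rows transferred
```

Both functions return the same mean, but the second transfers one row instead of the whole table, which is the extract-is-smaller argument the slide makes.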

Integration is Everything
- Motivation
  - No business or research team is satisfied with one data resource
- Data Curation Expertise
- Human Centred Integration
  - Human centred
  - Domain-specialist driven
  - Dynamic specification of combination function
  - Iterative processes
    - Revised request minutes later
    - Revised request after months of thought
- Sources inevitably heterogeneous
  - Time-varying content, structure & policies
- Robust, stable steerable integration services
  - Higher-level services over multiple resources
  - Fundamental requirements for (re)negotiation
- Federation or Virtualisation preceding integration, or kit of integration tools to be interwoven with an application?

[Architecture diagram; labels only, layout not recoverable]
- Data Intensive X Scientists
- Data Intensive Applications for Science X
- Distributed Simulation, Analysis & Integration Technology for Science X
- Virtual Integration Architecture
- Generic Virtual Data Access and Integration Layer
- OGSA-DAI: Structured Data Integration, Structured Data Access, Transformation
- Structured Data: Relational, XML, Semi-structured
- OGSA: Registry, Job Submission, Data Transport, Resource Usage, Banking, Brokering, Workflow, Authorisation
- Infrastructure Architecture
- Grid or Web Service Infrastructure
- Compute, Data & Storage Resources

OGSA-DAI
Components: Analyst; Registry (GDSR); Factory (GDSF); Grid Data Service (GDS); Consumer; Database (Xindice, MySQL, Oracle, DB2)
Legend: SOAP/HTTP; service creation; API interactions
Interaction sequence:
1. Request to Registry for sources of data about "x"
2. Registry responds with Factory handle
3. Request to Factory for access to database
4. Factory creates GridDataService
5. Factory returns handle of GDS to client
6. Client queries GDS with SQL, XPath, XQuery etc.
7. GDS interacts with database
8. Query results returned as XML, OR delivered to consumer as XML
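The registry, factory and data-service interaction above can be mimicked with a toy in-process model. This is only a sketch under loud assumptions: the classes `Registry`, `Factory` and `GridDataService` and all of their methods are hypothetical names invented for illustration and are not the OGSA-DAI API; there is no SOAP/HTTP layer, and SQLite stands in for the backing database.

```python
import sqlite3
from xml.etree import ElementTree as ET


class GridDataService:
    """Hypothetical stand-in for a GDS: wraps one database connection."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def perform(self, sql: str) -> str:
        """Run the query against the database and return the rows as XML text."""
        cursor = self._conn.execute(sql)
        columns = [description[0] for description in cursor.description]
        root = ET.Element("results")
        for row in cursor:
            row_el = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_el, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class Factory:
    """Hypothetical stand-in for a GDSF: creates a data service on request."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def create_service(self) -> GridDataService:
        return GridDataService(self._conn)


class Registry:
    """Hypothetical stand-in for a GDSR: maps a topic to a factory."""

    def __init__(self):
        self._factories = {}

    def register(self, topic: str, factory: Factory) -> None:
        self._factories[topic] = factory

    def find(self, topic: str) -> Factory:
        return self._factories[topic]


def demo() -> str:
    # Set up a tiny database and publish its factory under the topic "x".
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE proteins (name TEXT, length INTEGER)")
    conn.execute("INSERT INTO proteins VALUES ('x1', 120)")
    registry = Registry()
    registry.register("x", Factory(conn))

    factory = registry.find("x")      # ask the registry, get a factory back
    gds = factory.create_service()    # ask the factory, get a data service back
    return gds.perform("SELECT name, length FROM proteins")  # query, XML returned
```

Calling `demo()` returns the single stored row wrapped in a `results` element; delivering the XML to a third-party consumer instead of the caller is not modelled here.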

OGSA-DAI Downloads
- R4: 690 downloads since May 04
  - Actual user downloads, not search engine crawlers
  - Does not include downloads as part of GT3.2 releases
- Total of 838 registered users

Release          Downloads
R1.0 (Jan 03)    104
R1.5 (Feb 03)    108
R2.0 (Apr 03)    250
R2.5 (Jun 03)    291
R3.0 (Jul 03)    792
R3.1 (Feb 04)    630
R4 (May 04)      690
Total            2865

Downloads by Country – OGSA-DAI R4.0
Country          Share
China            26%
United Kingdom   21%
United States    13%
Unknown          7%
Japan            5%
Germany          5%
Italy            5%
France           3%
Austria          2%
Australia        2%
Taiwan           2%

Multiple tasks / request
[Diagram: eight repeated boxes, each labelled Ident, Type, Value]

Be Direct
- Double handling costs too much
  - Memory cycles, bus capacity, cache disruption, …
  - Double handling via discs pathologically bad
- Data translation expensive
  - Avoid
    - Deliver as stored, …
- Compose
- Stream
  - Main memory is not big enough
    - Stream or use disk
  - Couple generator & consumer directly
  - Stream from RAM to RAM
  - Requires coupled computation execution
- Breaks down boundaries and merges data, execution & transport requirements
- Demands smart workflow enactment service & foundation services
- Models for process transformation and optimisation
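As an illustration of coupling a generator directly to a consumer, here is a minimal single-process sketch using Python generators. It is an analogy only: the function names are hypothetical, and it shows stages composed over a stream within one program, not RAM-to-RAM transport between machines or a workflow enactment service.

```python
from typing import Iterable, Iterator


def generate_records(n: int) -> Iterator[int]:
    """Producer: yields records one at a time instead of building a list."""
    for i in range(n):
        yield i


def transform(records: Iterable[int]) -> Iterator[int]:
    """Intermediate stage composed onto the stream; handles one record at a time."""
    for record in records:
        yield record * record


def consume(records: Iterable[int]) -> int:
    """Consumer: folds the stream into one result as records arrive."""
    total = 0
    for record in records:
        total += record
    return total


def run_pipeline(n: int) -> int:
    """Couple producer, transform and consumer with no intermediate collection."""
    return consume(transform(generate_records(n)))
```

No stage stores the full data set, so memory use stays constant however large `n` is; the cost is that the stages must execute together, which is the coupled execution the slide points to.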

Take Home Message
- Data Access & Integration
  - Two Models
    - kit of parts
    - Virtualisation
  - Ubiquitous Needs
  - Pervasive and growing number and diversity of data collections
  - Opportunity and power to integrate and mine
- OGSA-DAI
  - Pioneering
  - Talk by Amrey Krause, 5:15 today
  - Growing Community
  - Implementation
  - Standards
  - Users
- Join the party of users, contributors & researchers