Data access and integration with OGSA-DAI: OGSA-DQP Steven Lynden University of Manchester.

Slides:



Advertisements
Similar presentations
Tom Sugden EPCC OGSA-DAI Future Directions OGSA-DAI User's Forum GridWorld 2006, Washington DC 14 September 2006.
Advertisements

OGSA-DQP - A Service-Based Distributed Query Processor for The Grid Arijit Mukherjee University of Newcastle Arijit Mukherjee University.
PHP I.
Service-Based Distributed Query Processing on the Grid M.Nedim Alpdemir Department of Computer Science University of Manchester.
Eldas 1.0 Enterprise Level Data Access Services Design Issues, Implementation and Future Development Davy Virdee.
M.Nedim Alpdemir, Anastasios Gounaris¹, Arijit Mukherjee², Desmond Fitzgerald, Norman W. Paton¹, Paul Watson², Rizos Sakellariou¹, Alvaro A.A. Fernandes¹,
16-17 October 2003 Grids and Applied Language Theory: Declarative Grid Service Orchestration with OGSA-DQP (A A A Fernandes) 1 Declarative Grid Service.
Data Access & Integration in the ISPIDER Proteomics Grid L. Zamboulis, H. Fan, K. Bellhajjame, J. Siepen, A. Jones, N. Martin, A. Poulovassilis, S. Hubbard,
BiodiversityWorld GRID Workshop NeSC, Edinburgh – 30 June and 1 July 2005 Resource wrappers, web services, grid services Jaspreet Singh School of Computer.
Institute for Software Science – University of ViennaP.Brezany 1 Databases and the Grid Peter Brezany Institute für Scientific Computing University of.
1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.
Amy Krause Applications Consultant, EPCC Tom Sugden Applications Consultant, EPCC OGSA-DAI Client Toolkit Principles.
Slides thanks to Steve Lynden Amy Krause EPCC Distributed Query Processing with OGSA-DQP Principles and Architectures for Structured Data Integration:
1 An Introduction to OGSA-DAI Konstantinos Karasavvas 13 th September 2005.
Mike Jackson EPCC OGSA-DAI Today Release 2.2 Principles and Architectures for Structured Data Integration: OGSA-DAI.
17 July 2006ISSGC06, Ischia, Italy1 Agenda Session 26 – 14:30-16:00 An Overview of OGSA-DAI OGSA-DAI today – and future features How to extend OGSA-DAI.
Client-Server Processing and Distributed Databases
14-18 March 2004 EDBT'04 : Service-Based Distributed Query Processing for the Grid (M N Alpdemir) 1 Title, places, people, funding, projects Manchester.
Module 2: Using Transact-SQL Querying Tools. Overview SQL Query Analyzer Using the Object Browser Tool in SQL Query Analyzer Using Templates in SQL Query.
The SAM-Grid Fabric Services Gabriele Garzoglio (for the SAM-Grid team) Computing Division Fermilab.
Christopher Jeffers August 2012
Database Taskforce and the OGSA-DAI Project Norman Paton University of Manchester.
Data Management Kelly Clynes Caitlin Minteer. Agenda Globus Toolkit Basic Data Management Systems Overview of Data Management Data Movement Grid FTP Reliable.
DAIS Grid1 Database Access and Integration Services on the Grid * * Authors: N. Paton, M. Atkinson, V.
Proteome data integration characteristics and challenges K. Belhajjame 1, R. Cote 4, S.M. Embury 1, H. Fan 2, C. Goble 1, H. Hermjakob, S.J. Hubbard 1,
INFSO-RI Module 01 ETICS Overview Alberto Di Meglio.
The ACGT Workflow Editing & Enactment Environment Giorgos Zacharioudakis Institute of Computer Science, Foundation for Research & Technology – Hellas (ICS-FORTH)
INFSO-RI Module 01 ETICS Overview Etics Online Tutorial Marian ŻUREK Baltic Grid II Summer School Vilnius, 2-3 July 2009.
CERN - IT Department CH-1211 Genève 23 Switzerland t DB Development Tools Benthic SQL Developer Application Express WLCG Service Reliability.
Shannon Hastings Multiscale Computing Laboratory Department of Biomedical Informatics.
Chapter 1 Introduction Yonsei University 1 st Semester, 2015 Sanghyun Park.
1 1 EPCC 2 Curtin Business School & Edinburgh University Management School Michael J. Jackson 1 Ashley D. Lloyd 2 Terence M. Sloan 1 Enabling Access to.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Code Applications Tamas Kiss Centre for Parallel.
Wrapping Scientific Applications As Web Services Using The Opal Toolkit Wrapping Scientific Applications As Web Services Using The Opal Toolkit Sriram.
3-Tier Client/Server Internet Example. TIER 1 - User interface and navigation Labeled Tier 1 in the following graphic, this layer comprises the entire.
OGSA-DAI.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
INFSO-RI Enabling Grids for E-sciencE OGSA DAI Data Access and Integration Marek Ciglan Institute of Informatics, Slovac Academy.
Mike Jackson EPCC OGSA-DAI Architecture + Extensibility OGSA-DAI Tutorial GGF17, Tokyo.
INTRODUCTION TO DBS Database: a collection of data describing the activities of one or more related organizations DBMS: software designed to assist in.
Metadata Mòrag Burgon-Lyon University of Glasgow.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Applications.
Amy Krause EPCC OGSA-DAI An Overview OGSA-DAI Technology Update GGF17, Tokyo (Japan)
OGSA-DQP:Service-Based Distributed Query Processing on the Grid M.Nedim Alpdemir Department of Computer Science University of Manchester.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE User Forum, Manchester, 10 May ‘07 Nicola Venuti
RUS: Resource Usage Service Steven Newhouse James Magowan
Chapter 1 Introduction Yonsei University 1 st Semester, 2014 Sanghyun Park.
Data Integration in Bioinformatics Using OGSA-DAI The BioDA Project Shirley Crompton, Brian Matthews (CCLRC) Alex Gray, Andrew Jones, Richard White (Cardiff.
Module: Software Engineering of Web Applications Chapter 2: Technologies 1.
Neil Chue Hong Project Manager, EPCC OGSA-DAI Requirements Gathering Exercise 2 nd DIALOGUE workshop eSI, 9-10.
OGSA-DAI Users’ Meeting Introduction Malcolm Atkinson Director 7 th April 2004.
K. Harrison CERN, 22nd September 2004 GANGA: ADA USER INTERFACE - Ganga release status - Job-Options Editor - Python support for AJDL - Job Builder - Python.
Mike Jackson EPCC OGSA-DAI Today – Release 8 OGSA-DAI Tutorial GGF17, Tokyo.
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
David Adams ATLAS ATLAS Distributed Analysis and proposal for ATLAS-LHCb system David Adams BNL March 22, 2004 ATLAS-LHCb-GANGA Meeting.
OGSA-DQP Steven Lynden University of Manchester. Data access & integration with OGSA-DAI: GGF 17 2 Introduction OGSA-DQP is a service based distributed.
OGSA-DAI 简介及其它在 China-VO DAS 系统中的应用 杨阳 中国虚拟天文台研发团队 Chinese Virtual Observatory.
Grid Execution Management for Legacy Code Architecture Exposing legacy applications as Grid services: the GEMLCA approach Centre.
Advanced Databases COMP3017 Dr Nicholas Gibbins
1 LM 6 Database Applications Dr. Lei Li. Learning Objectives Explain three components of a client-server system Describe differences between a 2-tiered.
OGSA-DAI.
A Grid Data Integration Service (OGSA-DQP) Paul Watson, University of Newcastle-upon-Tyne based on the work of… Norman Paton, Tasos Gounaris,
Amy Krause EPCC OGSA-DAI An Overview OGSA-DAI on OMII 2.0 OMII The Open Middleware Infrastructure Institute NeSC,
PHP / MySQL Introduction
Module 01 ETICS Overview ETICS Online Tutorials
Introduction of Week 11 Return assignment 9-1 Collect assignment 10-1
Google Sky.
Web Application Development Using PHP
Presentation transcript:

Data access and integration with OGSA-DAI: OGSA-DQP Steven Lynden University of Manchester

Data access & integration with OGSA-DAI: GGF 17 2 Introduction OGSA-DQP is a service based distributed query processor It evaluates queries over distributed data sources wrapped by OGSA-DAI It is built using OGSA-DAI extensibility points People involved: University of Manchester Tasos Gounaris, Steven Lynden, Alvaro Fernandes, Rizos Sakellariou, Norman Paton University of Newcastle Jim Smith, Arijit Mukherjee, Paul Watson OGSA-DAI Prototype release 3.0 available from the OGSA-DAI website Install on OGSA-DAI WSRF/WS-I 2.1

Data access & integration with OGSA-DAI: GGF 17 3 OGSA-DQP high-level overview OGSA-DQP uses a middleware approach. It can be seen as a mediator over OGSA-DAI wrappers. Usability: use it as an OGSA- DAI data service. DQP is capable of planning, scheduling and executing in parallel the distributed queries Calls to analysis (Web) services can be declared within queries and invoked by DQP. DBMS data OGSA-DQP QueryResults OGSA-DAI DBMS data

Data access & integration with OGSA-DAI: GGF 17 4 Using OGSA-DQP All interactions are client-server based Firstly, configure OGSA-DQP by specifying the data sources and analysis services to be used (administration) DQP creates a global schema which can then be used to formulate queries The user may then submit queries Infrastructural requirements: - OGSA-DAI-wrapped relational databases - Analysis services (optional) - Evaluation infrastructure

Data access & integration with OGSA-DAI: GGF 17 5 OGSA-DQP architecture OGSA-DAI data service perform Evaluator QE Evaluator QE Evaluator QE The “OGSA-DQP service”, Grid Distributed Query Service (GDQS) AKA “Coordinator” AKA Grid Query Evaluation Service (GQES) DQP activities installed

Data access & integration with OGSA-DAI: GGF 17 6 OGSA-DQP architecture DQP evaluator services: Are plain Web services Implement the QueryEvaluation port type: evaluate – the input is a query plan partition which is subsequently executed receiveData – allows the evaluator to receive data from other evaluators OGSA-DAI extensions: DQP resource – a resource which encapsulates a distributed query infrastructure: DQP evaluator services, OGSA-DAI data services etc. Implemented as a data resource accessor. OQL query statement activity – enables the submission of a query in Object Query Language (OQL) DQP factory activity – enables the creation and configuration of DQP resources.

Data access & integration with OGSA-DAI: GGF 17 7 Example query Given two DBMSs and one analysis tool (i.e., a Web service): goTerm : a GO Gene Ontology table in a remote mySQL DB, exposed by an OGSA-DAI data service protein : a table in a protein sequence DB, exposed by an OGSA-DAI data service Blast (sequence alignment scoring Web service); We want to obtain alignment scores for a sequence against proteins of a certain kind The user submits a single query referencing data stored at multiple sites. The author of the query need not be aware of how/where data is stored. Queries are written in Object Query Language (OQL): select p.proteinId, Blast(p.sequence) from protein p, goTerm t where t.termId = ‘GO: ’ and p.proteinId=t.proteinId

Data access & integration with OGSA-DAI: GGF 17 8 Background: OQL Why? OGSA-DQP is based on a parallel distributed query processor for object databases (Polar*) The standard query language of object databases is OQL Polar* is still used by DQP to parse, optimise and schedule queries Instead of querying object databases, we are now querying relational databases OQL queries are compiled by Polar* into distributed query plans. During the execution of the query plan, DQP will query relational data sources using SQL.

Data access & integration with OGSA-DAI: GGF 17 9 Client interaction with OGSA-DQP Two main client/server interactions: 1. Configuration: the client sends a perform document requesting the service to create a DQP data service resource 2. Query submission: the client sends a perform document requesting the service to execute an Object Query Language (OQL) query, using a DQP data service resource created in (1) The data service resource created in (1) encapsulates the distributed query infrastructure used to execute queries. Differs from the typical OGSA-DAI data service resources e.g. relational data service resource

Data access & integration with OGSA-DAI: GGF DQP configuration OGSA-DAI data service perform Evaluator URLs OGSA-DAI data service resources Web service URLs OGSA-DAI data service GetRP DQP factory activity OGSA-DAI data service GetRP creates DQP DSR Global schema of imported DBs & analysis services Set of evaluators that can be used Physical DB metadata (used to optimise queries) Result: resource ID of created DSR

Data access & integration with OGSA-DAI: GGF DQP query evaluation OGSA-DAI data service perform OQL query OGSA-DAI data service perform OQLQueryStatement DQP DSR Evaluator QE transport OGSA-DAI data service perform Analysis service... Evaluator QE Evaluator QE Result: WebRowSet XML Stream

Data access & integration with OGSA-DAI: GGF Interacting with an OGSA-DQP service Three options: A command line client Allows configuration and query submission via the execution of Apache Ant scripts Client toolkit classes Allow you to integrate OGSA-DQP into your own applications [The above utilities are part of the main OGSA-DQP download] GUI client

Data access & integration with OGSA-DAI: GGF Command-line client Configuration example: $ ant factory -Ddqp.config.file=config.xml -Durl= -Dresource.id=dqp-factory Querying the global schema – example: $ ant getschemas -Durl= -Dresource.id=ogsadai-911acvd122

Data access & integration with OGSA-DAI: GGF Command-line client Query submission example: $ ant query -Durl= -Dresource.id=ogsadai-911acvd122 -Dclient.query=“%print select i.id from i in go_goterms;” -Dclient.output.file=results.xml Results will be saved as a WebRowSet, the standard XML representation of relational results used by OGSA-DAI

Data access & integration with OGSA-DAI: GGF Client toolkit classes Client toolkit classes are provided for the activities contributed by OGSA-DQP: GDQSFactory class used to construct DQPFactory activities OQLQuery class used to construct OQLQueryStatement activities The client toolkit allows the integration of DQP with other applications and seamless interaction with the OGSA-DAI client toolkit OGSA-DQP client toolkit is Java only…

Data access & integration with OGSA-DAI: GGF Query execution using client toolkit 1 GenericServiceFetcher fetcher = GenericServiceFetcher.getInstance(); 2 DataService service = fetcher.getDataService(url,resourceID); 3 OQLQuery oqlQuery = new OQLQuery(query); 4 OutputStreamActivity outputStream = new OutputStreamActivity(); 5 outputStream.setInput( oqlQuery.getOutput() ); 6 ActivityRequest request = new ActivityRequest(); 7 request.add( oqlQuery ); 8 service.perform(request); 9 oqlQuery.getResultSet(); 10 java.sql.ResultSet rs = outputStream.getResultSet();

Data access & integration with OGSA-DAI: GGF Demo: The GUI Client The GUI allows you to: Interact with OGSA-DQP services. The GUI is pre-configured with the URL of a OGSA-DQP service we have deployed at EPCC. View the configuration parameters of DQP data service resources View the global schema maintained by a DQP data service resource Submit OQL queries to DQP data service resources View the results of queries View graphical and XML representations of query plans

Data access & integration with OGSA-DAI: GGF Newcastle University Evaluator service giga01.ncl.ac.uk OGSA-DAI data service GO Term DB Evaluator service giga02.ncl.ac.uk OGSA-DAI data service Protein interaction DB Evaluator service giga03.ncl.ac.uk OGSA-DAI data service Protein Term DB Evaluator service giga04.ncl.ac.uk OGSA-DAI data service Protein property DB

Data access & integration with OGSA-DAI: GGF Newcastle University Evaluator service giga05.ncl.ac.uk OGSA-DAI data service Protein Sequence DB Evaluator service giga06.ncl.ac.uk Evaluator service giga07.ncl.ac.uk Evaluator service giga08.ncl.ac.uk Evaluator service giga09.ncl.ac.uk Entropy analyser service

Data access & integration with OGSA-DAI: GGF Database tables namelengthsqlTypeName id32varchar type55varchar name255varchar GO Terms extent name: “goterms_goterms” Protein interactions Extent name: “interaction_protein_interactions” namelengthsqlTypeName ORF150varchar ORF250varchar baitProtein50varchar interactionType5varchar repeats11int experimenter100varchar

Data access & integration with OGSA-DAI: GGF Database tables Protein terms extent name: “protein_term_protein_goterm” namelengthsqlTypeName ORF55varchar molecularWeight12float hydrophobicity12float Protein properties extent name: “protein_property_protein_propertys” namelengthsqlTypeName ORF50varchar sequence65535text Protein sequence extent name: “protein_sequence_protei n_sequences” namelengthsqlTypeName ORF55varchar GOTermIdenti fier 32varchar

Data access & integration with OGSA-DAI: GGF DQP EPCC OGSA-DAI data service DQP factory GIGA resource test.ogsadai.org.uk Encapsulates the distributed query environment deployed at Newcastle ogsadai-1092f60c1e1

Data access & integration with OGSA-DAI: GGF Conclusion OGSA-DQP is a service based distributed query processor that is: Exposed as a service Implemented as an orchestration of services It provides an example of how the OGSA-DAI extensibility points can be used… The activity extensibility points are used New data resource accessors are implemented Dynamic resource deployment is used during configuration to create new resources Benefits: OGSA-DAI manages activity concurrency – we didn’t need to write concurrent code OGSA-DQP can take advantage of the host of delivery options provided by OGSA-DAI OGSA-DQP is insulated from multiple platforms (WS-I, WSRF) by OGSA-DAI