Published by Angelina Gilmore. Modified over 9 years ago.
1
Automatic Deployment of Application-Specific Metadata and Code in MOCHA
Manuel Rodriguez-Martinez, Nick Roussopoulos
© Copyright 2000 M. Rodriguez-Martinez, All Rights Reserved
2
EDBT 2000 M. Rodriguez-Martinez – N. Roussopoulos

Introduction

Database Middleware Systems:
- Used to integrate data from multiple sources.
- Help to keep clients simple:
  - thin clients
  - economical ($$$) to deploy
  - Web-based GUI
- Re-use existing servers:
  - replacing them can be expensive and dangerous
- Examples: TSIMMIS, Garlic, DISCO, Oracle, Sybase, ...

[Figure: a client, an integration server with a catalog, translators, and Oracle, image, and XML data sources.]
3
Limitations of this Solution

Code Deployment Problem:
- Code for data types and operators is user-defined (for example, Polygon and Perimeter()).
- The code must be installed manually on:
  - clients
  - integration servers
  - translators
- The code must be ported (C/C++ code).
- Security: the code must not crash the system.

Result: does not scale well as the number of sites increases.
- The code is hard to deploy, upgrade, and maintain.

[Figure: a client, an integration server with a catalog, translators, and Oracle, image, and XML data sources.]
4
Limitations of this Solution

Query Processing Problem:
- Availability of code limits operator placement options.
  - Not all sites can evaluate the operators in a query.
- The integration server ends up doing most of the processing.
  - Data must be shipped to it.
- Too much data movement!

Result: does not scale well.
- The network becomes a major performance bottleneck.
- Limited bandwidth increases query execution time.

[Figure: a client, an integration server with a catalog, translators, and Oracle, image, and XML data sources; a transfer is labeled 100MB.]
5
The MOCHA Solution

- The middleware system automatically deploys the code:
  - ships Java classes for data types and operators
  - done dynamically at run time
- Provides information on how to use the code:
  - metadata and control in XML and RDF
- Exploits these features in query operator placement:
  - places operators at the sites that minimize data movement
    - remote data sources get operators that filter the data
    - the integration server gets operators that expand the data
  - more on this: SIGMOD 2000 paper
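
The placement rule on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not MOCHA's optimizer (the slide defers the details to the SIGMOD 2000 paper): the class PlacementSketch, the Site enum, and the size estimates are hypothetical names and inputs chosen for the example. An operator whose estimated output is smaller than its input is treated as data-reducing and is placed at the remote data source; any other operator stays at the integration server.

```java
/** Hypothetical sketch of the placement rule; names and estimates are assumptions. */
final class PlacementSketch {

    enum Site { DATA_SOURCE, INTEGRATION_SERVER }

    /**
     * Chooses a site from estimated input and output sizes.
     * A data-reducing operator runs next to the data so that only its smaller
     * result crosses the network; a data-inflating one runs at the integration
     * server so that only its smaller input is shipped.
     */
    static Site place(long inputBytes, long outputBytes) {
        if (inputBytes < 0 || outputBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        return outputBytes < inputBytes ? Site.DATA_SOURCE : Site.INTEGRATION_SERVER;
    }

    /** Bytes that cross the network for a given placement. */
    static long bytesShipped(long inputBytes, long outputBytes, Site site) {
        return site == Site.DATA_SOURCE ? outputBytes : inputBytes;
    }

    public static void main(String[] args) {
        long in = 100L * 1024 * 1024; // 100 MB of input tuples (illustrative value)
        long out = 150L * 1024;       // 150 KB of aggregated results (illustrative value)
        Site site = place(in, out);
        System.out.println(site + " ships " + bytesShipped(in, out, site) + " bytes");
    }
}
```

A real optimizer would estimate these sizes from catalog statistics and weigh them against processing cost; the sketch keeps only the data-movement comparison that the slide states.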
6
MOCHA Architecture

[Figure: a client, the QPC with its catalog and code repository, the network, and DAPs in front of the data sources: Oracle 8i, Informix, an XML repository, and text files.]
7
Automatic Code Deployment

Query:
  Select location, Composite(image)
  From Rasters
  Where week BETWEEN t1 and t2
  Group By location

[Figure: a client, the QPC with its code repository and catalog, and two DAPs (Informix and Oracle) connected over the Internet; sites are labeled Texas, Virginia, and Maryland.]
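
The slides do not say how a shipped class file becomes usable at the receiving site. One standard Java mechanism, shown here as an assumption rather than as MOCHA's implementation, is a custom ClassLoader that defines a class from bytes received over the network. ShippedCodeLoader, readAll, and install are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical loader for class files received at run time; not MOCHA's actual code. */
final class ShippedCodeLoader extends ClassLoader {

    ShippedCodeLoader(ClassLoader parent) {
        super(parent);
    }

    /** Reads a whole stream, for example a socket stream carrying a .class file. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    /**
     * Turns received bytecode into a Class object.
     * defineClass checks the class-file format and throws ClassFormatError
     * when the bytes are not a valid class file.
     */
    Class<?> install(String className, byte[] bytecode) {
        return defineClass(className, bytecode, 0, bytecode.length);
    }
}
```

A receiving site would call install with the class name taken from the metadata and then instantiate the class reflectively. Restricting what the loaded code may do (the security concern raised earlier in the deck) is a separate matter that this sketch does not address.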
8
Answering the Query

Query:
  Select location, Composite(image)
  From Rasters
  Where week BETWEEN t1 and t2
  Group By location

[Figure: the same deployment (client, QPC with code repository and catalog, Informix and Oracle DAPs, sites labeled Texas, Virginia, and Maryland). The DAPs read 200MB and 100MB of tuples and return 200KB and 150KB of results, for a combined 350KB of results.]
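
The sizes in the figure allow a quick check of how much network traffic is avoided. The sketch below assumes one reading of the figure: 300MB of tuples in total are read at the data sources, and 350KB of results in total are shipped onward. The class and method names are hypothetical.

```java
/** Arithmetic on the sizes shown in the figure; the totals are one reading of it. */
final class DataMovementSketch {

    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    /** Fraction of traffic saved when results are shipped instead of tuples. */
    static double reduction(long tupleBytes, long resultBytes) {
        return 1.0 - (double) resultBytes / (double) tupleBytes;
    }

    public static void main(String[] args) {
        long tuples = 200 * MB + 100 * MB;  // tuples read at the two data sources
        long results = 200 * KB + 150 * KB; // aggregated results sent onward
        System.out.printf("tuples=%d KB, results=%d KB, reduction=%.2f%%%n",
                tuples / KB, results / KB, 100.0 * reduction(tuples, results));
    }
}
```

Under this reading the reduction is a little under 99.9%, which agrees with the 99% cut in data movement reported on the performance slide.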
9
Components of MOCHA

- Client Application
- QPC (Query Processing Coordinator)
  - parsing (SQL)
  - optimizing
  - catalog management
  - code deployment
  - query execution
- DAP (Data Access Provider)
  - data translation
  - query execution
- Data Server
  - storage server

[Figure: a client and the QPC with its catalog and code repository, connected over the Internet to DAPs for Oracle, XML, and text sources.]
10
Catalog Organization

- Holds information describing the structure and proper use of tables, data types, and query operators.
  - These are generically referred to as “resources”.
- Each resource is uniquely identified by a URI:
  - mocha://cs1.umd.edu/EarthSci/Polygon
- Metadata is encoded using RDF (an XML derivative).
  - This makes it easy to understand, use, and exchange metadata.
- Each resource has a catalog entry of the form (URI, RDF File).
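
The (URI, RDF File) catalog entry maps naturally onto a keyed lookup. The following is a minimal in-memory sketch, assuming the catalog can be modeled as a map from resource URI to the name of its RDF file; CatalogSketch and the file name Polygon.rdf are made up for illustration, and only the mocha:// URI comes from the slide.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory catalog of (URI, RDF file) entries. */
final class CatalogSketch {

    private final Map<URI, String> entries = new HashMap<>();

    /** Registers a resource; this sketch accepts only URIs with the mocha scheme. */
    void register(URI resource, String rdfFile) {
        if (!"mocha".equals(resource.getScheme())) {
            throw new IllegalArgumentException("not a mocha URI: " + resource);
        }
        entries.put(resource, rdfFile);
    }

    /** Returns the RDF file recorded for a resource, or null if it is unknown. */
    String lookup(URI resource) {
        return entries.get(resource);
    }

    public static void main(String[] args) {
        CatalogSketch catalog = new CatalogSketch();
        URI polygon = URI.create("mocha://cs1.umd.edu/EarthSci/Polygon");
        catalog.register(polygon, "Polygon.rdf"); // the file name is an assumption
        System.out.println(polygon.getHost() + polygon.getPath() + " -> " + catalog.lookup(polygon));
    }
}
```

Using java.net.URI as the key gives the host and path components for free, which is convenient if the host part is also used to locate the site that holds the resource.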
11
Metadata Requirements

Query:
  Select location, Composite(image)
  From Rasters
  Where week BETWEEN t1 and t2
  Group By location

Table Rasters: location, image, week, band

1. What kind of metadata is needed?
2. How should it be specified?
12
RDF Model: Data Types

Resource: mocha://cs1.umd.edu/EarthSci/Raster

Property          Value
mocha:Type        Raster
mocha:Class       Raster.class
mocha:Repository  cs1.umd.edu/EarthSci
mocha:Size        1 megabyte
mocha:Creator     user1@cs.umd.edu
13
RDF Model: Query Operators

Resource: mocha://cs1.umd.edu/EarthSci/Composite

Property          Value
mocha:Aggregate   Composite
mocha:Class       Composite.class
mocha:Repository  cs1.umd.edu/EarthSci
mocha:Creator     user1@cs.umd.edu
mocha:Arguments   rdf:Seq (rdf:type) of entries, each with mocha:Type Raster and mocha:URI ...
mocha:Result      mocha:Type Raster, mocha:URI ...
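
The slide describes Composite only through its metadata: an aggregate that takes Raster arguments and returns a Raster. To make the idea of a shipped aggregate concrete, here is a hypothetical Java shape for such an operator. The interface AggregateSketch, its three methods, and the per-pixel maximum used as the combining rule are all assumptions for illustration; the slides define neither Composite's semantics nor MOCHA's operator API.

```java
/** Hypothetical aggregate contract; MOCHA's real operator interface is not shown in the slides. */
interface AggregateSketch<T> {
    void reset();         // start a new group
    void update(T value); // fold one more tuple value into the current group
    T result();           // final value for the current group
}

/** Toy stand-in for Composite(image): per-pixel maximum over equally sized rasters. */
final class MaxComposite implements AggregateSketch<int[]> {

    private int[] acc; // running composite for the current group, null when empty

    @Override
    public void reset() {
        acc = null;
    }

    @Override
    public void update(int[] raster) {
        if (acc == null) {
            acc = raster.clone();
            return;
        }
        if (raster.length != acc.length) {
            throw new IllegalArgumentException("raster sizes differ");
        }
        for (int i = 0; i < acc.length; i++) {
            acc[i] = Math.max(acc[i], raster[i]);
        }
    }

    @Override
    public int[] result() {
        return acc == null ? new int[0] : acc.clone();
    }
}
```

Whatever the real combining rule is, the shape explains why such an operator reduces data: many rasters go in for each group and a single raster comes out.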
14
RDF Model: Tables

Resource: mocha://cs1.umd.edu/EarthSciDB/Rasters

Property          Value
mocha:Table       Rasters
mocha:Database    cs1.umd.edu/EarthSciDB
mocha:Owner       user1@cs.umd.edu
mocha:Columns     rdf:Seq (rdf:type) of column entries:
                  - mocha:Column location, mocha:Type Polygon, mocha:URI ...
                  - mocha:Column image, mocha:Type Raster, mocha:URI ...
                  - ...
15
Metadata and Control Exchange

The QPC sends to each DAP:
- metadata for the data types and operators it will receive
- a query plan specifying the task to do

Metadata is serialized as XML:
- RDF serialization syntax

Plans:
- are XML documents
- are easy to use and understand
- can be mapped to a suitable form (tree, DAG, graph, etc.)
- prevent version inconsistencies caused by changes in Java classes

Example (RDF serialization of the Raster data type):
  <rdf:Description about="mocha://cs1.umd.edu/EarthSci/Raster">
    <mocha:Type>Raster</mocha:Type>
    <mocha:Class>Raster.class</mocha:Class>
    <mocha:Repository>cs1.umd.edu/EarthSci</mocha:Repository>
    <mocha:Size>1MB</mocha:Size>
    <mocha:Creator>user1@cs1.umd.edu</mocha:Creator>
  </rdf:Description>
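
A receiving site has to pull the class name and repository out of the serialized metadata before it can fetch any code. The sketch below shows one way to do that with the JDK's DOM parser. It assumes that the property names from the RDF model slides (Class, Repository) appear as XML elements in a mocha namespace; the namespace URI urn:example:mocha# is invented for the example, as are the class and method names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical reader for a type description; the mocha namespace URI is made up. */
final class MetadataReaderSketch {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String MOCHA_NS = "urn:example:mocha#"; // assumption, not from the slides

    /** Returns the text of the first mocha element with this local name, or null if absent. */
    static String property(String xml, String localName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for namespace-qualified lookups
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagNameNS(MOCHA_NS, localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    /** A small description modeled on the Raster example from the slides. */
    static String sample() {
        return "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:mocha=\"" + MOCHA_NS + "\">"
                + "<rdf:Description about=\"mocha://cs1.umd.edu/EarthSci/Raster\">"
                + "<mocha:Class>Raster.class</mocha:Class>"
                + "<mocha:Repository>cs1.umd.edu/EarthSci</mocha:Repository>"
                + "</rdf:Description></rdf:RDF>";
    }

    public static void main(String[] args) throws Exception {
        String xml = sample();
        System.out.println(property(xml, "Class") + " from " + property(xml, "Repository"));
    }
}
```

Because the metadata travels as text rather than as serialized Java objects, a change to a Java class does not by itself break the exchange, which is the version-consistency benefit the slide mentions.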
16
Processing a Query in MOCHA

1. Query Parsing
2. Resource Discovery
3. Query Optimization
4. Metadata and Control Exchange
5. Code Deployment Phase
6. Query Execution

Query:
  Select location, Composite(image)
  From Rasters
  Where week BETWEEN t1 and t2
  Group By location

Table Rasters: location, image, week, band
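
The phases can be written down as an ordered type, which makes the dependency between them explicit: a site needs the metadata before it can use deployed code, and it needs the code before execution starts. The enum below is illustrative only and assumes the phases run strictly in the order in which the slide lists them; QueryPhase and next are hypothetical names.

```java
/** The phases named on the slide, in slide order; the enum itself is illustrative. */
enum QueryPhase {
    QUERY_PARSING,
    RESOURCE_DISCOVERY,
    QUERY_OPTIMIZATION,
    METADATA_AND_CONTROL_EXCHANGE,
    CODE_DEPLOYMENT,
    QUERY_EXECUTION;

    /** Returns the phase that follows this one, or null after the last phase. */
    QueryPhase next() {
        QueryPhase[] all = values();
        return ordinal() + 1 < all.length ? all[ordinal() + 1] : null;
    }

    public static void main(String[] args) {
        // Walk the pipeline from the first phase to the last.
        for (QueryPhase p = QUERY_PARSING; p != null; p = p.next()) {
            System.out.println(p);
        }
    }
}
```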
17
Performance of MOCHA

Shipping the Composite() code to the DAP:
- cuts data movement by 99%
- gives a 4-to-1 performance improvement

Query:
  Select location, Composite(image)
  From Rasters
  Where week BETWEEN t1 and t2
  Group By location

[Figure: running time (secs) by middleware type, Non-MOCHA vs. MOCHA.]
18
Benefits of MOCHA

- Middle-tier solution
- Extensible
- Java Code Re-usability
  - across platforms
- Automatic Code Deployment
  - “Plug-n-Play”
- Easier to Administer
- XML-based Metadata
- XML-based Control
- Efficient Query Processing
  - data movement reduction
  - moving code vs. data
19
Conclusions

- Identified limitations in existing middleware systems:
  - Code Deployment Problem
  - Query Processing Problem
- Proposed a new framework to automate the deployment of new functionality:
  - automatic code deployment
  - efficient query processing
- Described its implementation in MOCHA, based on well-accepted technologies: Java, XML, RDF.

http://www.cs.umd.edu/projects/mocha/