1 A Query Translation Scheme for Rapid Implementation of Wrappers. Paper by Yannis Papakonstantinou, Ashish Gupta, Hector Garcia-Molina, Jeffrey Ullman. Presented by Preetham Swaminathan, 03/22/2007.

2 Introduction As part of the TSIMMIS project, many hard-coded wrappers have been developed for a variety of sources, including legacy systems. Some observations: –Only a small part of the code deals with the access details of the source. –Much of the code deals with communication, buffering, etc. –The rest implements query and data transformations that can be expressed in a high-level, declarative fashion.

3 Introduction Based on these observations, a wrapper implementation toolkit for rapid wrapper building was developed. The toolkit contains: –A library of commonly used functions. –A facility to translate queries into source-specific commands and queries. –A facility to translate results into a model useful to the application. The main focus here is the query translation component of the toolkit (the converter).

4 Converter The converter is the query translation component of the toolkit. An implementor gives the converter a set of templates. –These templates describe the queries accepted by the wrapper. –For each template, the implementor provides an action to run when an application query matches it. –The action is executed to produce the native query for the source, which answers the application query.

5 Example Consider a data source that can only do selections on the attribute dept. The source does not understand the notion of projecting attributes. Template describing the source: select * from $X where $X.dept = 'toy' The following query does not match this template because it contains a projection: select emp.name from emp where emp.dept = 'toy'

6 Example The wrapper could process the above query as follows: –Transform the query into one without a projection. –Perform the projection on the result of that query, a step also known as filtering. The wrapper toolkit can handle this type of query transformation. –The converter generates not only native queries for the source but also filters that describe additional processing on the results.
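The two steps above can be sketched in Python. This is only an illustration under stated assumptions: the SQL-like template from the previous slide is approximated with regular expressions, and the names `translate` and `apply_filter` are hypothetical, not part of the toolkit.

```python
import re

# Approximation of the slide's template: select * from $X where $X.dept = 'toy'
# (generalized here to any quoted dept value; that generalization is an assumption).
TEMPLATE = re.compile(
    r"select \* from (?P<X>\w+) where (?P=X)\.dept\s*=\s*'(?P<dept>[^']*)'$"
)
# A query with an explicit projection list (no '*' allowed in the list).
PROJECTION = re.compile(r"select (?P<cols>[\w.,\s]+?) from (?P<rest>.+)$")


def translate(query):
    """Return (native_query, columns); columns is None when no filter is needed."""
    if TEMPLATE.match(query):
        return query, None
    m = PROJECTION.match(query)
    if m:
        # Drop the projection and check whether the widened query is supported.
        widened = "select * from " + m.group("rest")
        if TEMPLATE.match(widened):
            cols = [c.strip().split(".")[-1] for c in m.group("cols").split(",")]
            return widened, cols
    raise ValueError("query not supported by the template")


def apply_filter(rows, cols):
    """Perform the projection that the source itself could not do."""
    if cols is None:
        return rows
    return [{c: row[c] for c in cols} for row in rows]


native, cols = translate("select emp.name from emp where emp.dept = 'toy'")
rows = apply_filter([{"name": "Ann", "dept": "toy"}], cols)
```

The source only ever sees `native`, which has no projection; the projection survives as the column list that the wrapper applies afterwards.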

7 Converter The converter in the toolkit targets the MSL query language. MSL is a logic-based language for a simple object-oriented data model called OEM. The converter is configured with templates written in QDTL. Each template is associated with an action. The converter takes an MSL query as input and generates: –Commands for the source. –A filter to be applied to the results.

8 Converter The converter can process: –Directly supported queries: queries that syntactically match a template. –Logically supported queries: queries that are logically equivalent to a directly supported query. –Indirectly supported queries: queries that can be processed as a combination of a directly supported query and a filter.

9 OEM Model OEM stands for Object Exchange Model. OEM does not support classes, methods, or inheritance, although classes and methods can be emulated.

10 OEM Model At each source, top-level OEM objects are defined. –They provide entry points into the object structure. –Sub-objects can be requested using an MSL query such as the following. (Q1) *P:- }> Tail is of form Matching –When a field is a constant, the pattern binds only with objects that have the same constant value. –When a field is a variable, the pattern can bind with any OEM object.

11 A Detailed Query Translation Example Build a wrapper for a university "lookup" facility that contains information about employees and students. The facility is accessed from the command line and offers limited query capabilities. –It can return only the full records of persons, including all fields such as first name, last name, and telephone. –There is no way for the user to retrieve just one field.

12 Query Translation The only queries accepted are: –Retrieve person records by specifying the last name. (L2) lookup -ln Smith –Retrieve person records by specifying the first and last name. (L3) lookup -ln Smith -fn John –Retrieve all person records. (L4) lookup

13 Query Translation Using the Query Description and Translation Language (QDTL), the description for the lookup facility can be written as below.
(D1)
(QT1.1) Query ::= *O :- <O person {<last_name $LN>}>
(QT1.2) Query ::= *O :- <O person {<last_name $LN> <first_name $FN>}>
(QT1.3) Query ::= *O :- <O person V>
Identifiers preceded by $ are constant placeholders. Upper-case identifiers are variable placeholders.

14 Query Translation Each template describes many more queries than those that match it syntactically. It describes the following classes of queries: –Directly supported queries. –Logically supported queries. –Indirectly supported queries.

15 Query Translation Directly supported queries –A query q is directly supported by a template t if q can be derived by substituting the constant placeholders of t with constants and the variables of t with variables. –*P :- <P person {<last_name 'Smith'>}> is directly supported by template QT1.1, by substituting O with P and $LN with 'Smith'.
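The substitution rule for direct support can be sketched as a small matcher. The encoding below is an assumption made for illustration: a pattern is a tuple of (object variable, label, fields), constants carry their quotes, and QT1.1 is taken to be a person pattern whose last_name field is the placeholder $LN. The function name `match_direct` is hypothetical.

```python
def is_const_placeholder(token):
    return token.startswith("$")


def is_constant(token):
    return token.startswith("'") and token.endswith("'")


def is_variable(token):
    # Upper-case identifiers are variables ('$LN' and quoted constants are not).
    return token[:1].isupper()


def match_direct(template, query):
    """Return placeholder bindings if query is a direct instance of template, else None."""
    t_var, t_label, t_fields = template
    q_var, q_label, q_fields = query
    # Direct support is syntactic: same label, same fields, same order.
    if t_label != q_label or list(t_fields) != list(q_fields):
        return None
    if not (is_variable(t_var) and is_variable(q_var)):
        return None
    bindings = {t_var: q_var}
    for name, t_val in t_fields.items():
        q_val = q_fields[name]
        if is_const_placeholder(t_val) and is_constant(q_val):
            bindings[t_val] = q_val      # constant placeholder := constant
        elif is_variable(t_val) and is_variable(q_val):
            bindings[t_val] = q_val      # variable := variable
        elif t_val != q_val:
            return None
    return bindings


QT1_1 = ("O", "person", {"last_name": "$LN"})
query = ("P", "person", {"last_name": "'Smith'"})
bindings = match_direct(QT1_1, query)
```

Because field order is compared literally, a query that lists the same conditions in a different order fails here; such queries fall under logical support rather than direct support.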

16 Query Translation Logically supported queries –A query q is logically supported by a template t if q is logically equivalent to some query q' directly supported by t. *O:- }> *O:- }> *O:- }> AND }> –All these queries are equivalent to *O:- }> (supported by QT1.2).

17 Query Translation Indirectly supported queries –A query q is indirectly supported by a template t if q can be answered by running a query that t directly supports and then applying a filter to the results. (Q6) *Q:- }> –The above query is not logically supported by any template in the description.

18 Query Translation The converter realizes that the answer to the original query is a subset of the answer to the following query. (Q7) *Q:- } Thus the converter matches Q6 to template QT1.1 as if it were Q7, binding $LN to 'Smith', and generates the filter *O:- }> The filter is an MSL query that is applied to the result of Q7 to produce the result of Q6.
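A rough sketch of this split between the supported query and the filter follows. It is a simplification, not the paper's algorithm: templates are reduced to the set of fields they can constrain, and the field names first_name and role, like the extra role condition in the sample query, are assumptions chosen for illustration.

```python
# Hypothetical summary of description D1: the fields each template can constrain.
TEMPLATES = {
    "QT1.1": ("last_name",),
    "QT1.2": ("last_name", "first_name"),
    "QT1.3": (),
}


def plan(conditions):
    """Pick the most selective usable template; leftover conditions become the filter."""
    usable = [
        name for name, fields in TEMPLATES.items()
        if set(fields) <= set(conditions)
    ]
    best = max(usable, key=lambda name: len(TEMPLATES[name]))
    pushed = {f: conditions[f] for f in TEMPLATES[best]}
    residual = {f: v for f, v in conditions.items() if f not in pushed}
    return best, pushed, residual


def apply_filter(records, residual):
    """Keep only records that satisfy the conditions the source could not evaluate."""
    return [r for r in records if all(r.get(f) == v for f, v in residual.items())]


template, pushed, residual = plan({"last_name": "Smith", "role": "student"})
records = [
    {"last_name": "Smith", "first_name": "John", "role": "student"},
    {"last_name": "Smith", "first_name": "Ann", "role": "employee"},
]
students = apply_filter(records, residual)
```

Here the last-name condition is sent to the source through QT1.1, and the remaining condition is checked by the wrapper on the returned records.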

19 Native Query Formulation
(D2)
(QT2.1) Query ::= *O :- <O person {<last_name $LN>}>
(AC2.1) {sprintf(lookup_query, 'lookup -ln %s', $LN);}
(QT2.2) Query ::= *O :- <O person {<last_name $LN> <first_name $FN>}>
(AC2.2) {sprintf(lookup_query, 'lookup -ln %s -fn %s', $LN, $FN);}
(QT2.3) Query ::= *O :- <O person V>
(AC2.3) {sprintf(lookup_query, 'lookup');}
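The pairing of templates with actions can be pictured as a lookup table from template id to a command format. This is a sketch in Python rather than the toolkit's C-style actions; `run_action` is a hypothetical name, and the shell quoting is an addition that the slides do not mention.

```python
import shlex

# One command format per template, mirroring actions AC2.1 to AC2.3.
ACTIONS = {
    "QT2.1": "lookup -ln {LN}",
    "QT2.2": "lookup -ln {LN} -fn {FN}",
    "QT2.3": "lookup",
}


def run_action(template_id, bindings):
    """Build the native command from the placeholder bindings of the matched template."""
    # Quote each bound constant so it is safe to pass through a shell.
    safe = {name: shlex.quote(value) for name, value in bindings.items()}
    return ACTIONS[template_id].format(**safe)


command = run_action("QT2.2", {"LN": "Smith", "FN": "John"})
```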

20 Nonterminals
(D4) /* A description with nonterminals */
(QT4.1) Query ::= *OP :- /* Query template */
(NT4.2) _OptLN ::= /* Nonterminal template */
(NT4.3) _OptLN ::= /* Empty nonterminal template */
(NT4.4) _OptFN ::=
(NT4.5) _OptFN ::= /* empty */
(NT4.6) _OptRole ::=
(NT4.7) _OptRole ::= /* empty */

21 Nonterminals - Actions
(D5)
(QT5.1) Query ::= *OP :-
(AC5.1) {sprintf(lookup_query, 'lookup %s %s %s', $_OptLN, $_OptFN, $_OptRole);}
(NT5.2) _OptLN ::=
(AC5.2) {sprintf($_OptLN, '-ln %s', $LN);}
(NT5.3) _OptLN ::=
(AC5.3) {$_OptLN = '';}
(NT5.4) _OptFN ::=
(AC5.4) {sprintf($_OptFN, '-fn %s', $FN);}
(NT5.5) _OptFN ::=
(AC5.5) {$_OptFN = '';}
(NT5.6) _OptRole ::=
(AC5.6) {sprintf($_OptRole, '-role %s', $R);}
(NT5.7) _OptRole ::=
(AC5.7) {$_OptRole = '';}
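The effect of these nonterminal actions can be sketched as follows: each optional nonterminal contributes either a flag fragment or the empty string, and the query action joins the fragments. The function names are hypothetical, and unlike the literal 'lookup %s %s %s' format, this version drops empty fragments so the command has no stray spaces.

```python
def opt(flag, value):
    """Mirror an optional nonterminal: '-flag value' when bound, '' when empty."""
    return f"-{flag} {value}" if value is not None else ""


def build_lookup(ln=None, fn=None, role=None):
    """Mirror action AC5.1: combine the fragments produced by the nonterminals."""
    parts = ["lookup", opt("ln", ln), opt("fn", fn), opt("role", role)]
    return " ".join(p for p in parts if p)


command = build_lookup(ln="Smith", role="student")
```

Three optional nonterminals cover all eight combinations of conditions, which would otherwise need eight separate query templates.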

22 Wrapper Architecture A wrapper's components include the converter and the driver. The implementor: –Provides the driver, which has primary control of query processing. –Provides the QDTL description for the converter. –Provides the Data Extraction (DEX) template for the extractor component of the toolkit.

23 Wrapper Architecture

24 Wrappers generated with the toolkit behave as servers in a client-server architecture. Clients use the client support library to issue queries and receive OEM results. The server support library component of the toolkit receives queries and sends them to the driver component for processing. The driver invokes the converter, which finds a query that supports the input query and returns native queries.

25 Wrapper Architecture The driver submits the native queries to the information source and receives the result as OEM objects. If a filter was generated during processing, the driver passes the OEM result and the filter to the filter processor. The Data Extractor (DEX) is used to parse the result and identify the required data. DEX is configured with a description of the source output and of which part of that output needs to be extracted.
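The control flow described on these architecture slides can be summarized in a short driver loop. This is a sketch only: the four components are passed in as plain callables, and the toy stand-ins below (including the comma-separated source output and the role field) are invented for illustration.

```python
def drive(msl_query, converter, source, extractor, filter_processor):
    """Hypothetical driver loop: convert, query the source, extract, then filter."""
    native_queries, result_filter = converter(msl_query)
    objects = []
    for native in native_queries:
        raw = source(native)             # submit the native query to the source
        objects.extend(extractor(raw))   # parse raw output into OEM-like objects
    if result_filter is not None:
        objects = filter_processor(objects, result_filter)
    return objects


# Toy stand-ins for the toolkit components (all assumptions).
def toy_converter(query):
    return ["lookup -ln Smith"], {"role": "student"}


def toy_source(command):
    return "Smith,John,student\nSmith,Ann,employee"


def toy_extractor(raw):
    names = ("last_name", "first_name", "role")
    return [dict(zip(names, line.split(","))) for line in raw.splitlines()]


def toy_filter(objects, conditions):
    return [o for o in objects if all(o.get(k) == v for k, v in conditions.items())]


result = drive("ignored", toy_converter, toy_source, toy_extractor, toy_filter)
```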

26 Correspondence of OEM to Relational Models OEM objects are represented relationally by flattening them into tuples of three relations: top, object, and member. OEM objects can be converted using a few straightforward rules. –For an object o with object id oid, label l, and atomic value v, we introduce the tuple object(oid, l, v). –If o is a set object, the tuple becomes object(oid, l, set).

27 OEM to SQL –If o has sub-objects o_i, where 1 ≤ i ≤ n, identified by oid_i, then we introduce the tuples member(oid, oid_i). –Finally, if o is a top-level object identified by oid, then we introduce the tuple top(oid). –The relational representation of an MSL query is obtained by querying the top, object, and member relations that represent the object structure referenced in the query.
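The flattening rules can be sketched as a short recursive function. The encoding is an assumption: an OEM object is written as a (label, value) pair whose value is either atomic or a list of sub-objects, and object ids are generated as o1, o2, and so on.

```python
from itertools import count


def flatten(obj, top, objects, members, ids, is_top=True):
    """Flatten an OEM-like object into top, object, and member tuples; return its oid."""
    label, value = obj
    oid = f"o{next(ids)}"
    if is_top:
        top.append((oid,))                       # top(oid)
    if isinstance(value, list):
        objects.append((oid, label, "set"))      # object(oid, l, set)
        for sub in value:
            sub_oid = flatten(sub, top, objects, members, ids, is_top=False)
            members.append((oid, sub_oid))       # member(oid, oid_i)
    else:
        objects.append((oid, label, value))      # object(oid, l, v)
    return oid


person = ("person", [("last_name", "Smith"), ("first_name", "John")])
top, objects, members = [], [], []
flatten(person, top, objects, members, count(1))
```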

28 Example Consider the query *O :- <O person {<last_name 'Smith'>}> The above MSL query can be written as the following datalog query, where LM stands for the object id of the last_name sub-object. answer(O) :- top(O), object(O, person, set), member(O, LM), object(LM, last_name, 'Smith') The paper contains an algorithm that, for a given MSL query, finds supporting queries from the QDTL description and, if required, creates a filter to be applied to the OEM result objects.
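To make the datalog rule concrete, it can be evaluated by hand over a small instance of the three relations. The sample tuples below are invented for illustration; only the rule itself comes from the slide.

```python
# Sample relations (hypothetical data): two person objects, one named Smith.
top = {"o1", "o4"}
object_ = {
    ("o1", "person", "set"),
    ("o2", "last_name", "Smith"),
    ("o3", "first_name", "John"),
    ("o4", "person", "set"),
    ("o5", "last_name", "Jones"),
}
member = {("o1", "o2"), ("o1", "o3"), ("o4", "o5")}


def answer():
    """answer(O) :- top(O), object(O, person, set), member(O, LM), object(LM, last_name, 'Smith')"""
    return sorted({
        o
        for o in top
        if (o, "person", "set") in object_
        for (parent, lm) in member
        if parent == o and (lm, "last_name", "Smith") in object_
    })


matches = answer()
```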

29 Conclusions A toolkit that facilitates the implementation of wrappers was developed. The heart of the toolkit is the converter, which maps incoming queries into native commands of the source. The converter provides the translation flexibility of systems like Yacc but gives substantially more power: it translates a wider class of queries.

