1 Le Thi Thu Thuy*, Doan Dai Duong*, Virendrakumar C. Bhavsar* and Harold Boley** * Faculty of Computer Science, University of New Brunswick, Fredericton,


1 Le Thi Thu Thuy*, Doan Dai Duong*, Virendrakumar C. Bhavsar* and Harold Boley**
* Faculty of Computer Science, University of New Brunswick, Fredericton, NB, Canada {Thuy_Thi_Thu.Le, Duong_Dai.Doan,
** Institute for Information Technology - e-Business, NRC, Fredericton, NB, Canada
A Bottom-up Strategy for Query Decomposition
First IEEE International Conference on Digital Information Management (ICDIM), December 6-8, 2006

2 Agenda
- Introduction
- Lausen and Marron (LM) Approach
- Proposed Approach
- Query Decomposition Algorithm
- Additional Cases of Input Queries
- Conclusion

3 Introduction
- Utilization of available heterogeneous web data sources is still a demanding task
  - Automatic retrieval of relevant data from distributed and chaotic sources
  - Avoid generating such data from scratch
- Global-As-View (GAV)
  - Distributed data sources follow their own schemas
  - The integration system integrates the heterogeneous schemas into a global schema
  - Users interact with the integration system through the global schema

4 Introduction
- A user's query is based on the global schema
  - It cannot be used directly to query the distributed sources, because the global schema is structured differently and the sources are distributed
- To access data from the distributed sources, the global query must be decomposed into subqueries that conform to the structures of the distributed sources
- Query decomposition plays an important role

5 General Scenario for Query Decomposition of Distributed Databases
[Figure: users send a query to the integration system; QUERY DECOMPOSITION turns it into Query 1, Query 2, ..., Query n for System 1, System 2, ..., System n; Result 1, Result 2, ..., Result n pass through DATA CONVERSION and are returned to the users as integrated data.]
- Mappings are needed

6 Problems with Mappings
- Building mappings is a difficult task
- Mappings are normally handcrafted
- Can we decompose a global query into subqueries without mappings?

7 Lausen and Marron (LM02) Approach
- XML data sources
- Use XPath query Qglobal='/p1/p2/…/pi/…/pn-1/pn'
- Decompose global query into subqueries without mappings
- Use top-down approach
  - Process from top (root node) of a tree (schema) to its bottom (leaf nodes)
  - Process global query from left to right (P1 → Pn)
G. Lausen and P.J. Marron, “Adaptive evaluation techniques for querying XML-based E-Catalogs,” DBLP, 2002, pp

8 Proposed Approach
- XML data sources
- Use XPath query Qglobal='/p1/p2/…/pi/…/pn-1/pn'
- Decompose global query into subqueries without mappings
- Use bottom-up approach
  - Process from bottom (leaf nodes) of a tree (schema) to the top (root node)
  - Process global query from right to left (Pn → P1)

9 Example
Fig. 1. Example of a global schema and two local schemas (from LM): a. Global schema; b. SESP schema; c. BIGGER schema
- XPath query based on global schema: Qglobal='/p1/p2/…/pi/…/pn-1/pn'
- Qglobal='/department/mobile/products/jammer[price<200]'
- Find Q_SESP and Q_BIGGER for schemas SESP and BIGGER?
  - Q_SESP='/products/jammer[price<200]'
  - Q_BIGGER='/department/mobile/jammer[price<200]'

10 Query Decomposition Algorithm
- Given Qglobal='/p1/p2/…/pi/…/pn-1/pn'
- Take the rightmost part pn to evaluate
  - If pn is not found in the local schema → no subquery for that schema; stop the algorithm
  - Else, pn is found at a node in the tree (local schema); mark that node so that the next search is performed only on its ancestor nodes
- Sequentially, consider pi (i = n-1, ..., 1) of the query for evaluation

11 Flowchart of the algorithm
[Flowchart; box labels as extracted, edges between boxes not recoverable]
- Initialization: Anchor := LeftmostLeafNode; Subquery := ''; i := |Qglobal|
- Decisions (Yes/No): Check(Pi, Anchor): Pi exists in the local schema S from Anchor up to the root node; Subquery = ''; Pi = Anchor; Pi is matched with the root of S; i > 1; i = 1
- Assignments: Subquery := Pi; Subquery := Pi + '/' + Subquery; Subquery := Pi + '//' + Subquery; Subquery := '/' + Subquery; Subquery := '//' + Subquery; Anchor := father(Pi) in S; i := i - 1
- Terminals: Return Subquery; Stop
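The bottom-up steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `decompose`, the child-to-parent dictionary used to represent a local schema, and the assumption that element names are unique within a schema are all hypothetical. The two schema dictionaries are assumed shapes, chosen only so that the sketch reproduces the Q_SESP and Q_BIGGER results of the example slide; Fig. 1 itself may differ.

```python
def decompose(global_query, parent):
    """Bottom-up sketch: rewrite a global XPath query for one local schema.

    `parent` maps every element name of the local schema to the name of its
    parent (the root maps to None). Assumption: names are unique per schema.
    """
    steps = [s for s in global_query.split("/") if s]
    # Split an optional filter such as [price<200] off the last step.
    last, bracket, rest = steps[-1].partition("[")
    predicate = bracket + rest
    if last not in parent:
        return None                      # pn is missing: no subquery for this schema
    subquery = last
    anchor = parent[last]                # later parts must match an ancestor of pn
    for part in reversed(steps[:-1]):    # p(n-1), ..., p1
        if anchor is None:               # the root has already been matched
            break
        node = anchor
        while node is not None and node != part:
            node = parent[node]          # climb from the anchor towards the root
        if node is None:
            continue                     # part not on this path: drop it
        sep = "/" if node == anchor else "//"
        subquery = part + sep + subquery
        anchor = parent[node]
    prefix = "/" if anchor is None else "//"
    return prefix + subquery + predicate


# Assumed schema shapes (hypothetical, see the note above).
SESP = {"products": None, "jammer": "products", "price": "jammer"}
BIGGER = {"department": None, "mobile": "department",
          "jammer": "mobile", "price": "jammer"}
Q_GLOBAL = "/department/mobile/products/jammer[price<200]"

print(decompose(Q_GLOBAL, SESP))     # /products/jammer[price<200]
print(decompose(Q_GLOBAL, BIGGER))   # /department/mobile/jammer[price<200]
```

A part that matches the anchor itself is joined with '/', while a part found further up the path is joined with '//', mirroring the two concatenation boxes of the flowchart. The dictionary lookup for pn stands in for the search over the schema nodes that the slides describe.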

12 Additional Cases of Input Queries (Constraints in Queries)
- XPath queries contain constraints (filter expressions)
  - Qglobal := '/department/mobile/products/jammer[price<200]'
- Idea
  - Examine price before jammer
  - Avoid transforming the whole query if price does not exist in the local schema → considerable reduction in execution time

13 Additional Cases of Input Queries (Constraints in Queries)
- Qglobal := '/department/mobile/products/jammer[price<200]'
- In this case, no subquery is generated for the local schema from the global query
Fig. 2. Local schema without price leaf node (adapted from LM); nodes: products, jammer, name, company
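The idea of examining the constraint first can be sketched as a cheap pre-check that runs before any decomposition. This is an illustration only: the helper names `filter_elements` and `worth_decomposing` are hypothetical, the regular expression handles only simple filters of the form `[name op value]`, and the schema is reduced to the set of element names listed for Fig. 2.

```python
import re


def filter_elements(query):
    """Return the element names tested inside [...] filters of an XPath query.

    Assumption: each filter is a simple comparison such as [price<200].
    """
    names = []
    for body in re.findall(r"\[([^\]]*)\]", query):
        match = re.match(r"\s*([A-Za-z_][\w.-]*)\s*(?:<=|>=|!=|<|>|=)", body)
        if match:
            names.append(match.group(1))
    return names


def worth_decomposing(query, schema_elements):
    """True only if every element used in a filter exists in the local schema."""
    return all(name in schema_elements for name in filter_elements(query))


# Element names of the local schema in Fig. 2 (no price leaf node).
FIG2 = {"products", "jammer", "name", "company"}
Q_GLOBAL = "/department/mobile/products/jammer[price<200]"

print(worth_decomposing(Q_GLOBAL, FIG2))               # False: skip this schema
print(worth_decomposing(Q_GLOBAL, FIG2 | {"price"}))   # True: go on to decompose
```

When the check fails, the rest of the query is never walked, which is where the reduction in execution time claimed on the slides would come from.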

14 Algorithm Analysis
- We evaluate the input query from right to left and the XML tree from bottom to top
- Worst case
  - No subquery for a local schema
  - The rightmost part pn of the global query has to be compared to all nodes of the local schema
  - Time complexity for a query having n parts to a full k-ary tree of height h
  - Number of nodes in a full k-ary tree of height h
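For reference, the standard node count of a full k-ary tree of height h is the geometric sum below, assuming the root sits at level 0 so that the tree has h + 1 levels. With k = 2 it gives 2^(h+1) - 1, the figure used for subquery generalization in the LM analysis. Since the worst case compares pn against every node of the local schema, the number of comparisons would be bounded by this count; this reading is an inference from the bullets above, not a formula taken from the slides.

```latex
N(k, h) \;=\; \sum_{i=0}^{h} k^{i} \;=\; \frac{k^{h+1} - 1}{k - 1}, \qquad k \ge 2
```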

15 Algorithm Analysis
- Best case
  - The rightmost part pn matches a leaf node of the tree at the first comparison
  - The same holds for all pi nodes at the upper levels of the tree
  - Time complexity for a query having n parts to a full k-ary tree of height h: min(n, h), because the algorithm stops when
    - all n parts of Qglobal are processed, or
    - all nodes from bottom to top of the tree (with height h) are traversed

16 LM Approach – Algorithm Analysis
- Transform XPath query Qglobal='/p1/p2/…/pi/…/pn-1/pn' into local subqueries for local schemas
- At each pi, to evaluate pi for a binary tree of height h, three operators are used to compute and select suitable elements from the global query to form local subqueries
  - No transformation: 1 unit time
  - Subquery generalization: 2^(h+1) - 1 unit times
  - Subquery elimination: 1 unit time

17 LM Approach: Algorithm Analysis (cont.)
- Time complexity to evaluate the whole query
- Time complexity of the algorithm for a query having n parts given a full k-ary tree of height h

18 Comparison
[Table. Columns: LM algorithm, Our algorithm. Rows: Worst case, Best case.]

19 Conclusion
- Proposed an efficient bottom-up algorithm for query decomposition without predefined mappings
- Global query is efficiently processed based on its constraints
- Our algorithm can be extended to work not only with XPath queries, but also with general path expressions like those in Object-Oriented Databases
