Inbal Yahav A Framework for Using Materialized XPath Views in XML Query Processing VLDB ‘04 DB Seminar, Spring 2005 By: Andrey Balmin Fatma Ozcan Kevin.

Presentation transcript:

Inbal Yahav A Framework for Using Materialized XPath Views in XML Query Processing VLDB ‘04 DB Seminar, Spring 2005 By: Andrey Balmin Fatma Ozcan Kevin S. Beyer Roberta J. Cochrane Hamid Pirahesh

Prologue
- Materialized views vs. views
- Materialized XPath views vs. materialized views (relational databases)?

Agenda
- Introduction
- Materialized XPath Views
- XPath Matching Algorithm
  - Definitions
  - Description
  - Examples
  - Complexity
- Compensation Expression
- Additional References

Introduction
With increasing amounts of data presented and exchanged as XML documents, there is a need to query those documents efficiently. To address this need:
- The W3C has proposed an XML query language, XQuery (discussed later).
- ANSI and ISO have defined SQL/XML, which extends relational databases to handle XML.
Both use XPath to navigate through the XML document.

XQuery
- An XML query language
- Input: XML document (tree)
- Output: XML subtree
- Syntax: FLWOR (For-[Let]-Where-[Order]-Return)
- Example (figure): an XML document, the XQuery query Q1, and its result

XPath Complexity
The containment problem, as discussed in "Containment and Equivalence for an XPath Fragment" by G. Miklau and D. Suciu (to be reviewed in the 8th lecture), is shown to be coNP-complete. That is, the complementary problem of deciding that T' is not contained in T is NP-complete, so no polynomial-time algorithm is known for containment, and none exists unless P = NP.
Just for intuition, consider the query //A//b evaluated against an XML document (figure).
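To make the containment question concrete, here is a minimal sketch that evaluates //A/b and //A//b on one small document using Python's standard xml.etree.ElementTree. The document and the id attributes are invented for illustration. A single document can only illustrate containment, not prove it, because containment quantifies over all documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this illustration.
doc = ET.fromstring(
    "<root>"
    "<A><b id='1'/><c><b id='2'/></c></A>"
    "<d><b id='3'/></d>"
    "</root>"
)

def ids(path):
    """Return the set of id attributes of the elements selected by path."""
    return {elem.get("id") for elem in doc.findall(path)}

child_only = ids(".//A/b")   # b elements that are children of some A
any_depth = ids(".//A//b")   # b elements anywhere below some A

# On this document every result of //A/b is also a result of //A//b.
print(sorted(child_only), sorted(any_depth), child_only <= any_depth)
```

The subset test passing here is consistent with //A/b being contained in //A//b, while the b element with id 3 shows that neither query selects nodes outside an A subtree.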

Materialized XPath Views
The Goal
- In relational databases, indexes and materialized views are two well-known techniques for accelerating the processing of expensive SQL queries.
- We would like to extend the materialized view idea to speed up the processing of XQuery or SQL/XML queries.
The System
- Suggest a materialized XPath view structure.
- Define an algorithm that checks whether a given view can be used to answer a given query (the match algorithm).
- Define a compensation expression that computes the query result using the information available from the view.

Materialized XPath Views: Storing XPath Views
- To be able to answer all kinds of queries, the view structure should contain:
  - Node path (a list of ancestors)
  - Typed values
  - Reference to the node
- Note that storing only typed values and node paths (without references to the document nodes) does not allow us to execute queries whose result is a node collection. (The slide's example is a figure.)
- Sometimes it is also beneficial to store actual copies of XML fragments in an XPath view.
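The three stored components can be pictured as one row per view node. The sketch below is only an illustration of that idea: the names ViewRow, node_ref, and document_nodes, and the sample data, are assumptions and not the storage format of the paper.

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class ViewRow:
    node_path: Tuple[str, ...]  # names from the root down to the node
    typed_value: Any            # typed value of the node
    node_ref: int               # hypothetical identifier of the node in the base document

# Hypothetical stand-in for the stored document: node id -> serialized node.
document_nodes = {3: "<name>Ann</name>", 5: "<name>Bob</name>"}

rows = [
    ViewRow(("employees", "employee", "name"), "Ann", 3),
    ViewRow(("employees", "employee", "name"), "Bob", 5),
]

def values(view_rows):
    """Answer a value query from the view alone."""
    return [row.typed_value for row in view_rows]

def nodes(view_rows):
    """Answer a node-collection query; this needs the node references."""
    return [document_nodes[row.node_ref] for row in view_rows]

print(values(rows))
print(nodes(rows))
```

Without node_ref, only values(rows) could be answered, which is the limitation the slide points out.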

Materialized XPath Views
Example: V =

XPath Matching Algorithm
Having defined the materialized XPath view structure, we need a way to decide whether a given view can be used to answer a user query.
Definitions
- We represent XPath expressions as labeled binary trees, called XPS (XPath Step) trees.
- An XPS node represents a step in the XPath expression and is labeled with:
  - Axis ∈ {"root", "child", "descendant", "self", "attribute", "descendant-or-self", "parent"}
  - Test ∈ {name test, wildcard test (*), kind test (node(), text())}
Example (figure): the XPS tree for //employees/*/employee, with node labels such as "Axis: dos, Test: employees" and "Axis: descendant, Test: *".

XPath Matching Algorithm: Definitions (cont.)
- Each XPS node has two children:
  - The first child is called predicate (and/or, a comparison operator, a constant, or an XPS node).
  - The second child, called next, points to the next step (an XPS node).
- If one of the children does not exist, we represent it with null.
For example: (figure)
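The binary-tree shape described above can be sketched as a small data class. This is a simplified illustration, not the paper's data structure: the class name XPS, the restriction to child and descendant axes, and predicates that are only nested steps are all assumptions. The two expressions built here, //a[b/c] and //a/b[c], are the ones used later in the matchNext example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class XPS:
    axis: str                      # "child" or "descendant" in this sketch
    test: str                      # a name test or "*"
    pred: Optional["XPS"] = None   # first child: predicate (only nested steps here)
    next: Optional["XPS"] = None   # second child: the next step, or None for null

def render(node, in_pred=False):
    """Turn an XPS tree back into XPath text, to check the tree shape."""
    if node is None:
        return ""
    if in_pred:
        prefix = "" if node.axis == "child" else ".//"
    else:
        prefix = "/" if node.axis == "child" else "//"
    pred = "[" + render(node.pred, in_pred=True) + "]" if node.pred else ""
    return prefix + node.test + pred + render(node.next)

# //a[b/c]: the path b/c hangs off the predicate child of a.
v = XPS("descendant", "a", pred=XPS("child", "b", next=XPS("child", "c")))
# //a/b[c]: b is the next child of a, and c is the predicate child of b.
q = XPS("descendant", "a", next=XPS("child", "b", pred=XPS("child", "c")))

print(render(v), render(q))
```

The two trees contain the same three steps and differ only in which child pointer (pred or next) links them.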

XPath Matching Algorithm: Transition Rules (figure)

XPath Matching Algorithm: General
- The algorithm computes all possible mappings from XPS nodes of the view to XPS nodes of the query expression, in a single top-down pass over the view expression.
- The basic algorithm deals only with and/or predicates and with the child, attribute, and descendant axes.
- Each function in the rule table evaluates to a Boolean.
- Important: the view expression cannot be more restrictive than the query.

XPath Matching Algorithm: matchStep(v,q)
Rule 1.1: It is sufficient for one of the conjuncts to be mapped by a node of v. (For the other conjunct we use the node reference.)
Rule 1.2: If the view is more restrictive than the query, it cannot be used. (For example: Q2.)

XPath Matching Algorithm: matchStep(v,q) (cont.)
Rule 1.3: When the view node has a "descendant" axis, we keep looking for matches further down the tree.
Rule 1.4: If the axis matches, we try to match the predicate and the next children of the view node (axis ∈ {predicate, child, attribute}).
Rule 1.5: If none of the rules matches, the algorithm returns false.

XPath Matching Algorithm: matchChildren(v,q)
Rule 2.1: If the tests match, we try to match the predicate and the next step of v.
matchPred(v_pred,q)
Rule 3.1: If v does not have a predicate, then the step trivially matches.

XPath Matching Algorithm: matchPred(v_pred,q) (cont.)
Rule 3.2: If v has a predicate and q does not (meaning that v is more restrictive than q), then the match fails.
Rules 3.3 and 3.4: Match both conjuncts in case of a conjunction, and one disjunct in case of a disjunction. (For example, if v contains all the orders with both price and amount attributes, it cannot be used for a query that requires either price or amount.)
Rule 3.5: Match both predicates, and the view's predicate with the query's next child, in case of nested XPath expressions.
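The and/or rules can be read as a check that the view's predicate is implied by the query's predicate. The sketch below is a simplified, sound-but-incomplete illustration of that reading and not the rule table from the paper. Predicates are modeled as strings (atoms) or tuples ("and", p1, p2) / ("or", p1, p2); this encoding, the function names, and the case for a disjunction in the query are assumptions.

```python
def implied_by(v, q):
    """True if this sketch can show that query predicate q implies view predicate v."""
    if v == q:
        return True
    if isinstance(v, tuple):
        op, left, right = v
        # View conjunction: both conjuncts must be matched (rule 3.3).
        if op == "and" and implied_by(left, q) and implied_by(right, q):
            return True
        # View disjunction: one disjunct is enough (rule 3.4).
        if op == "or" and (implied_by(left, q) or implied_by(right, q)):
            return True
    if isinstance(q, tuple):
        op, left, right = q
        # Query conjunction: it suffices to map one conjunct (rule 1.1).
        if op == "and" and (implied_by(v, left) or implied_by(v, right)):
            return True
        # Query disjunction: assumed here, every disjunct must imply v.
        if op == "or" and implied_by(v, left) and implied_by(v, right):
            return True
    return False

def view_pred_usable(view_pred, query_pred):
    if view_pred is None:
        return True    # rule 3.1: no view predicate, trivially matches
    if query_pred is None:
        return False   # rule 3.2: the view is more restrictive than the query
    return implied_by(view_pred, query_pred)

both = ("and", "@price", "@amount")
either = ("or", "@price", "@amount")
print(view_pred_usable(both, either), view_pred_usable(either, both))
```

The first call reproduces the slide's example: a view restricted to orders with both attributes cannot serve a query that accepts either one.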

XPath Matching Algorithm: matchNext(v_next,q)
Rules 4.1 and 4.2: Same as 3.1 and 3.2.
Rule 4.3: Match the next children, and the view's next child with the query's predicate, in case of nested XPath expressions. (For example: v = //a[b/c], q = //a/b[c].)

XPath Matching Algorithm: Example
Consider the following hierarchy of employees. Each employee has:
- Name, Salary, and Bonus attributes
- Zero or more sub-employee elements

XPath Matching Algorithm: Example (cont.)
Consider a view (expression shown in a figure) that contains all attributes in the subtree of any employee, and a query (expression shown in a figure) that asks for the salary of employees who, together with their direct managers, have bonuses.

XPath Matching Algorithm: Example trace (numbers refer to XPS nodes of the view and the query)
matchStep(1,11) → matchChildren(1,11) → matchPred(null,11) ∧ matchNext(2,11)
matchPred(null,11) = T, which leaves matchNext(2,11)

XPath Matching Algorithm: Example trace (cont.)
matchStep(2,null) ∨ matchStep(2,12); matchStep(2,null) = F, which leaves matchStep(2,12)
matchStep(2,12) → matchChildren(2,12) ∨ matchChildren(2,14); matchChildren(2,14) is evaluated later, continuing with matchChildren(2,12)

XPath Matching Algorithm: Example trace (cont.)
matchPred(null,12) ∧ matchNext(3,12); matchPred(null,12) = T, which leaves matchNext(3,12)
matchNext(3,12) → matchStep(3,13) ∨ matchStep(3,14); matchStep(3,13) is evaluated later, continuing with matchStep(3,14)

XPath Matching Algorithm: Example trace (cont.)
matchStep(3,14) → matchChildren(3,14) → matchPred(null,14) ∧ matchNext(4,14); matchPred(null,14) = T, which leaves matchNext(4,14)
matchNext(4,14) → matchStep(4,15) ∨ matchStep(4,16) → matchChildren(4,15) ∨ matchChildren(4,16)
The results shown for these final calls are all T.

XPath Matching Algorithm: Example, call tree (figure)
Calls shown: matchStep(XPS1, XPS11); matchChildren(XPS1, XPS11); matchPred(null, XPS11) = T; matchNext(XPS2, XPS11); matchStep(XPS2, null) = F; matchStep(XPS2, XPS12); matchChildren(XPS2, XPS12); matchChildren(XPS2, XPS14); matchPred(null, XPS12) = T; matchNext(XPS3, XPS12); matchStep(XPS3, XPS13); matchStep(XPS3, XPS14). The remaining calls are evaluated in the same way.

XPath Matching Algorithm: Recording the Match
Consider the following example:
- View v consists of n nodes: //a//a…//a
- Query q consists of m nodes: /a/a…/a, where m > n.
- Any view node n_v can map to any query expression node n_q such that n_v's parent maps to some ancestor of n_q.
- Hence, there are "n out of m" (m choose n) distinct tree mappings of the view to the query expression.
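The count can be checked by brute force for small sizes. The sketch below reads the slide's "n out of m" as the binomial coefficient and enumerates the mappings as increasing sequences of query positions; the function name tree_mappings is an assumption made for this illustration.

```python
from itertools import combinations
from math import comb

def tree_mappings(n, m):
    """Map view steps 1..n to query steps 1..m so that each view step
    lands strictly below the step its parent was mapped to."""
    return list(combinations(range(1, m + 1), n))

mappings = tree_mappings(2, 3)
print(mappings)      # [(1, 2), (1, 3), (2, 3)]
print(comb(20, 10))  # 184756 mappings already for n = 10, m = 20
```

The rapid growth of comb(m, n) is why enumerating the mappings one by one is impractical.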

XPath Matching Algorithm: Solution
Keep track of all mappings in a match matrix structure (matches between v and q).
- The match matrix allows us to encode an exponential number of tree mappings in a polynomial-size structure, by recording all possible contexts for each node mapping.
- It also reduces the running time of the algorithm to polynomial.

XPath Matching Algorithm: Match matrix structure
Match matrix (figure), v \ q: the rows are XPS1 (root), XPS2 (//Employee), XPS3 (dos::*), and one further view node; the columns are XPS11 (root), XPS12 (//Employee), XPS14 (/Employee), and three further query nodes. Every filled cell shown contains T.
- Each row corresponds to an XPS node of the view tree.
- Each column corresponds to an XPS node of the query tree.
- Each cell may contain one of three possible values: Empty, True, or False.
- Edges represent the context in which the mapping was detected.

XPath Matching Algorithm
- For example, in the previous XPS trees, matchStep(XPS_v2, XPS_q2) will be called twice, but will be calculated only once.
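The "called twice, calculated once" behaviour is ordinary memoization. The sketch below shows it on a heavily simplified matcher: linear paths only, no predicates, and only child and descendant axes. It is not the paper's rule table; the step encoding, the function names, and the use of functools.lru_cache in place of the match matrix are assumptions.

```python
from functools import lru_cache

# A step is (axis, test); axis is "child" or "descendant", test is a name or "*".
def match_linear(view, query):
    """Return (matches, cells), where cells counts the distinct (view step,
    query step) pairs that were actually evaluated."""
    view, query = tuple(view), tuple(query)

    @lru_cache(maxsize=None)
    def match_from(i, j):
        if i == len(view):
            return True    # every view step is mapped; the rest is compensation
        if j == len(query):
            return False   # the view has steps the query lacks: too restrictive
        v_axis, v_test = view[i]
        q_axis, q_test = query[j]
        test_ok = v_test == "*" or v_test == q_test
        axis_ok = v_axis == "descendant" or q_axis == "child"
        if test_ok and axis_ok and match_from(i + 1, j + 1):
            return True
        # A descendant step may also map further down the query.
        return v_axis == "descendant" and match_from(i, j + 1)

    result = match_from(0, 0)
    return result, match_from.cache_info().misses

view = [("descendant", "a")] * 10 + [("descendant", "z")]
query = [("child", "a")] * 20
print(match_linear(view, query))
```

Every one of the many candidate mappings fails in this example because of the final z step, yet the number of evaluated cells stays within (|V| + 1) * (|Q| + 1), since each pair is computed once and then looked up.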

Extensions: Handling Comparison Predicates
- The algorithm as shown so far lacks rules for matching comparison operations.
- To solve this problem, the XPS trees are preprocessed so that comparison operations are transformed into filters.
- After a successful match, for each filter in the view tree, we have to check that the filter of the matched node in the query tree is at least as specific.
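One way to picture the "at least as specific" check is interval containment for numeric comparisons against a constant. The sketch below assumes a filter is a pair (operator, constant) over a single numeric value; this encoding and the function names are assumptions for illustration and are not taken from the paper.

```python
import math

def to_interval(op, c):
    """Return (low, low_closed, high, high_closed) for the filter 'value op c'."""
    if op == ">":
        return (c, False, math.inf, False)
    if op == ">=":
        return (c, True, math.inf, False)
    if op == "<":
        return (-math.inf, False, c, False)
    if op == "<=":
        return (-math.inf, False, c, True)
    if op == "=":
        return (c, True, c, True)
    raise ValueError("unsupported operator: " + op)

def at_least_as_specific(query_filter, view_filter):
    """True if every value passing the query filter also passes the view filter."""
    ql, ql_closed, qh, qh_closed = to_interval(*query_filter)
    vl, vl_closed, vh, vh_closed = to_interval(*view_filter)
    low_ok = ql > vl or (ql == vl and (vl_closed or not ql_closed))
    high_ok = qh < vh or (qh == vh and (vh_closed or not qh_closed))
    return low_ok and high_ok

# A view keeping salaries above 1000 can serve a query asking for salaries above 2000.
print(at_least_as_specific((">", 2000), (">", 1000)))
# The reverse direction fails: the view would be missing salaries in (1000, 2000].
print(at_least_as_specific((">", 1000), (">", 2000)))
```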

XPath Matching Algorithm: Complexity
Space complexity:
- Match matrix size: O(|V| * |Q|), where |V| is the number of XPS nodes in the view and |Q| is the number of XPS nodes in the query expression.
- Each matrix cell can have at most |Q| incoming edges.
- The number of edges in the matrix: O(|V| * |Q|^2).

XPath Matching Algorithm: Complexity (cont.)
Constructing the matrix:
- The matchStep(v,q) function has only |V| * |Q| distinct sets of parameters.
- Each match is calculated at most once.
- In the worst case, a function call may expand into |Q| function calls.
- Total running time: O(|V| * |Q|^2), which is polynomial.

Compensation Expression
- After a successful match between the query and the view, we have to compute the query result from the materialized view.
- For simplicity, we assume that there is only one possible mapping between the view and the query.
- This extraction is called compensation, and is achieved in two steps.
- Step 1: Eliminate unnecessary conditions. Example (figure): expressions V and Q.
- The query expression does not have to include the condition, since it is implied by the view.

Compensation Expression
- Step 2: The query statement is transformed, or relaxed, so that data can be extracted easily from the view table:
  - We define the last matched node of the view as the extraction point, and the last matched node of the query expression as the compensation root node.
  - Then we rewrite Q as an equivalent expression that starts at the compensation root node.
  - Now, as in this example (figure), item elements can be extracted directly from the view table.

Compensation Expression
Figure labels: "Extract from the table", "Use reference".

Additional References
- XQuery and SQL/XML:
- XML Seminar:
- General:

The End