1 Rewriting Nested XML Queries Using Nested Views Nicola Onose joint work with Alin Deutsch, Yannis Papakonstantinou, Emiran Curtmola University of California,


1 Rewriting Nested XML Queries Using Nested Views Nicola Onose joint work with Alin Deutsch, Yannis Papakonstantinou, Emiran Curtmola University of California, San Diego

2 The problem
Views are defined by queries V1, …, Vn and materialized as docV1, …, docVn.
Can we answer the query Q using only view access paths?
[Diagram: the input XML data feeds the views V1, …, Vn, which produce docV1, …, docVn; the query Q produces the query result.]
(Section: INTRO)

3 The problem
Views are defined by queries V1, …, Vn and materialized as docV1, …, docVn.
Is there a query R such that R(V1(Input), …, Vn(Input)) = Q(Input)?
[Diagram: the query Q maps the input XML data to the query result; the rewriting query R must produce the same result from docV1, …, docVn.]

4 Motivation: caching & indexes
- caching: answer new queries using the results of previously answered ones
- (partial) indexes: materialized references to frequently accessed parts of the data
- materialized views: faster to access than the original input
[Diagram: the query Q over the input XML data; the rewriting query R over docV1, …, docVn.]

5 Motivation: security views
Checking the existence of R corresponds to a security problem: allow only queries that can be expressed in terms of certain permitted queries, the security views.
[Diagram: V1, …, Vn are the security views (permitted queries); is there a rewriting query R of the query Q over docV1, …, docVn?]

6 Motivation: data integration
Data integration: given a query expressed in global terms, rewrite it using the descriptions of the particular sources.
[Diagram: the query Q is posed against a virtual global DB; the rewriting query R runs over source1, …, sourcen; the local/global mappings are expressed as views.]

7 Rewritings enabled by pattern matching
- Previous literature: find parts of the query that are precomputed by the views.
- How to decide that: match the patterns of the views into the query.
  – In the relational case, the patterns were tableaux, conjunctive queries.
  – For XPath: tree patterns.
- Matching XML queries?
  – Until recently, there was no pattern-based description of XQuery semantics.
  – Nested XML Tableaux (NEXT) fill the gap: The NEXT Logical Framework for XQuery, A. Deutsch et al., VLDB’04.

8 Scope of Our Approach
Nested XML Tableaux (NEXT) extend previous work on tree patterns. NEXT+ extends NEXT to the whole of XQuery.
- Tree patterns: cover XPath.
- NEXT: extends tree patterns with nested for-loops, joins, element construction, etc.
- NEXT+: extends NEXT to the whole XQuery language, including function calls, universal quantification, disjunction, negation, etc.

9 Scope of Our Approach
- Tree patterns: cover XPath.
- NEXT: extends tree patterns with nested for-loops, joins, element construction, etc.
- NEXT+: extends NEXT to the whole XQuery language, including function calls, universal quantification, disjunction, negation, etc.
- Soundness guarantee: if a rewriting is found, it is equivalent to the original query.
- Completeness guarantee: if a rewriting exists, we will find one.

10 Rewriting using views example
Data: Data on the Web → bib.xml, with book, title and author elements.
Query Q (group titles by author): for each distinct author, output the titles of his/her books.
View V (group authors by title): for each book, output its title and the list of authors.
Rewriting R: scan the view and create an entry for each distinct author in the view output; add to it all the titles of the respective author.
The result of the view is cached and has a faster access time than getting the data directly from the source.

11 Rewriting using views example
Query Q (group titles by author): for each distinct author, output the titles of his/her books.
View V (group authors by title):
    for $b1 in $doc//book, $t1 in $b1/title
    return {$t1, $b1/author}
Rewriting R: scan the view and create an entry for each distinct author in the view output; add to it all the titles of the respective author.
Previous work captures:
- XPath navigation

12 Rewriting using views example
View V (group authors by title):
    for $b1 in $doc//book, $t1 in $b1/title
    return {$t1, $b1/author}
Query Q (group titles by author):
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $t in $b/title
             where some $a1 in $b/author satisfies $a1 eq $a
             return $t }
Previous work captures:
- XPath navigation
NEXT captures:
- XPath navigation
- nested for loops
- joins
- element construction etc.

13 Rewriting using views example
View V (group authors by title):
    for $b1 in $doc//book, $t1 in $b1/title
    return {$t1, $b1/author}
Query Q (group titles by author), with the join variables $a and $a1 highlighted on the slide:
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $t in $b/title
             where some $a1 in $b/author satisfies $a1 eq $a
             return $t }
Previous work captures:
- XPath navigation
NEXT captures:
- XPath navigation
- nested for loops
- joins
- element construction etc.


15 Rewriting using views example
Data: Data on the Web → bib.xml, with book, title and author elements.
View V (group authors by title):
    for $b1 in $doc//book, $t1 in $b1/title
    return {$t1, $b1/author}
Query Q (group titles by author):
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $t in $b/title
             where some $a1 in $b/author satisfies $a1 eq $a
             return $t }
Rewriting R ($docV is bound to the root of the view output; R navigates inside the view output):
    for $a3 in distinct-values($docV/authorlist[title]/author)
    return { $a3,
             for $p in $docV/authorlist, $t3 in $p/title
             where some $a4 in $p/author satisfies $a4 eq $a3
             return $t3 }
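The claim R(V(Input)) = Q(Input) for this example can be checked on toy data. The following is a minimal Python sketch, not the authors' implementation. It assumes a book is a dict with an optional title and a list of author values, and that each entry of the view output stands for one authorlist element. The names `DOC`, `distinct`, `view_v`, `query_q` and `rewriting_r` are illustrative.

```python
from typing import Dict, List, Tuple

# Hypothetical toy stand-in for bib.xml (assumption: one optional title per book).
DOC: List[Dict] = [
    {"title": "T1", "authors": ["A", "B"]},
    {"title": "T2", "authors": ["A", "C"]},
    {"title": None, "authors": ["D"]},  # untitled book: skipped by both V and Q
]


def distinct(values) -> List[str]:
    """Order-preserving stand-in for distinct-values()."""
    seen, out = set(), []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


def view_v(doc: List[Dict]) -> List[Dict]:
    """View V: one authorlist entry (title plus authors) per titled book."""
    return [{"title": b["title"], "authors": list(b["authors"])}
            for b in doc if b["title"] is not None]


def query_q(doc: List[Dict]) -> List[Tuple[str, List[str]]]:
    """Query Q: for each distinct author of a titled book, that author's titles."""
    titled = [b for b in doc if b["title"] is not None]
    authors = distinct(a for b in titled for a in b["authors"])
    return [(a, [b["title"] for b in titled
                 if any(a1 == a for a1 in b["authors"])])
            for a in authors]


def rewriting_r(doc_v: List[Dict]) -> List[Tuple[str, List[str]]]:
    """Rewriting R: the same grouping, but reading only the view output."""
    authors = distinct(a for p in doc_v for a in p["authors"])
    return [(a, [p["title"] for p in doc_v
                 if any(a4 == a for a4 in p["authors"])])
            for a in authors]
```

Calling `rewriting_r(view_v(DOC))` and `query_q(DOC)` should give the same list. Agreement on one input only illustrates the equivalence; it does not prove it, which is why the algorithm ends with an equivalence check.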

16 Outline
- NEXT (NEsted XML Tableaux)
- Rewriting Algorithm and Extensions
- Experiments
- Previous Work
- Conclusions


18 Architecture of the NEXT framework
[Diagram: XQuery query and views → Normalization → NEXT patterns → Minimization → Rewriting Using Views → NEXT; from there either Logical Optimization → Logical Plan → Plan Execution Engine, or Translate to XQuery → any XQuery processor. Annotations on the diagram: "VLDB’04" and "presented at this conference".]

19 The need for normalization
[Diagram: XQuery query and views → Normalization → NEXT]
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $t in $b/title
             where some $a1 in $b/author satisfies $a1 eq $a
             return $t }

20 Normalization into NEXT
[Diagram: XQuery query and views → Normalization → NEXT]
XQuery:
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $t in $b/title
             where some $a1 in $b/author satisfies $a1 eq $a
             return $t }
Normalization step (the "some … satisfies" condition becomes a for binding):
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $a1 in $b/author, $t in $b/title
             where $a1 eq $a
             return $t }

21 Normalization into NEXT
[Diagram: XQuery query and views → Normalization → NEXT]
XQuery:
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $t in $b/title
             where some $a1 in $b/author satisfies $a1 eq $a
             return $t }
Cardinality? Binding $a1 in the for clause can change how many times $t is returned, so the NEXT form adds a groupby clause:
    for $a in distinct-values($doc//book[title]/author)
    return { $a,
             for $b in $doc//book, $a1 in $b/author, $t in $b/title
             where $a1 eq $a
             groupby [$b], [$t]
             return $t }
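The cardinality issue can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the NEXT normalization itself: books are dicts, two author entries with the same string stand for two distinct author nodes with equal values, and `groupby [$b], [$t]` is read as grouping by node identity. The names `BOOKS`, `titles_some`, `titles_unnested` and `titles_grouped` are hypothetical.

```python
from typing import Dict, List

# Hypothetical data: the first book has two author nodes with the same value.
BOOKS: List[Dict] = [
    {"title": "T1", "authors": ["Smith", "Smith"]},
    {"title": "T2", "authors": ["Smith", "Jones"]},
]


def titles_some(books: List[Dict], a: str) -> List[str]:
    """where some $a1 in $b/author satisfies $a1 eq $a: one title per book."""
    return [b["title"] for b in books if any(a1 == a for a1 in b["authors"])]


def titles_unnested(books: List[Dict], a: str) -> List[str]:
    """$a1 moved into the for clause: one title per matching ($b, $a1) pair."""
    return [b["title"] for b in books for a1 in b["authors"] if a1 == a]


def titles_grouped(books: List[Dict], a: str) -> List[str]:
    """Unnested form plus grouping on the identity of ($b, $t).

    Each toy book has exactly one title, so the book position identifies
    the ($b, $t) pair.
    """
    seen, out = set(), []
    for book_id, b in enumerate(books):
        for a1 in b["authors"]:
            if a1 == a and book_id not in seen:
                seen.add(book_id)
                out.append(b["title"])
    return out
```

On `BOOKS`, the unnested form returns T1 twice for "Smith", while the existential form and the grouped form both return it once.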

22 NEXT Patterns
NEXT is an alternative way of defining the XQuery semantics (equivalent to the standard), given by matching patterns.
View V in NEXT:
    for $b1 in $doc//book, $t1 in $b1/title
    groupby [$b1], [$t1]
    return {$t1,
            for $a2 in $b1/author
            groupby [$a2]
            return $a2 }
Graphical representation of NEXT: nested patterns (a forest of tree patterns).
- B1(V): pattern $doc//book($b1)/title($t1); groupby [$b1], [$t1]; return $t1, B2(V)
- B2(V): pattern book($b1)/author($a2); groupby [$a2]; return $a2

23 NEXT Patterns
Same view V and nested patterns as on slide 22. Annotation added: pattern edges denote descendant navigation or child navigation.

24 NEXT Patterns
Same view V and nested patterns as on slide 22. Annotation added: each block has a return function.

25 NEXT Patterns
Same view V and nested patterns as on slide 22. Annotation added: each block has a list of groupby variables.

26 NEXT Patterns
NEXT is an alternative way of defining the XQuery semantics (equivalent to the standard), given by matching patterns.
View V in NEXT:
    for $b1 in $doc//book, $t1 in $b1/title
    groupby [$b1], [$t1]
    return {$t1,
            for $a2 in $b1/author
            groupby [$a2]
            return $a2 }
- B1(V): pattern $doc//book($b1)/title($t1); groupby [$b1], [$t1]; return $t1, B2(V)
- B2(V): pattern book($b1)/author($a2); groupby [$a2]; return $a2
Query Q in NEXT:
    for $b0 in $doc//book, $t0 in $b0/title, $a in $b0/author
    groupby $a
    return { $a,
             for $b in $doc//book, $a1 in $b/author, $t in $b/title
             where $a1 eq $a
             groupby [$b], [$t]
             return $t }
- B1(Q): pattern $doc//book($b0) with children title($t0) and author($a); groupby $a; return $a, B2(Q)
- B2(Q): pattern $doc//book($b) with children title($t) and author($a1); groupby [$b], [$t]; return $t

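One way to picture a NEXT block is as a record holding a forest of tree patterns, a list of groupby variables and a return function. The Python sketch below encodes the view V from the slides in that shape. The class and field names (`PatternNode`, `Block`, `axis`, `nested`) are assumptions made for illustration and are not taken from the NEXT papers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PatternNode:
    label: str                 # element name, or "$doc" for the document root
    var: str                   # variable bound at this node
    axis: str = "child"        # edge from the parent: "child" (/) or "descendant" (//)
    children: List["PatternNode"] = field(default_factory=list)


@dataclass
class Block:
    name: str
    patterns: List[PatternNode]          # forest of tree patterns
    groupby: List[str]                   # list of groupby variables
    returns: List[str]                   # return function: variables or nested block names
    nested: List["Block"] = field(default_factory=list)


def pattern_vars(node: PatternNode) -> List[str]:
    """Variables of one tree pattern, in pre-order."""
    out = [node.var]
    for child in node.children:
        out.extend(pattern_vars(child))
    return out


def block_vars(block: Block) -> List[str]:
    """Variables of all tree patterns of a block (nested blocks excluded)."""
    return [v for p in block.patterns for v in pattern_vars(p)]


# View V: B2(V) is nested inside B1(V) and reuses $b1, which B1(V) binds.
B2_V = Block(
    name="B2(V)",
    patterns=[PatternNode("book", "$b1", children=[PatternNode("author", "$a2")])],
    groupby=["[$a2]"],
    returns=["$a2"],
)
B1_V = Block(
    name="B1(V)",
    patterns=[PatternNode("$doc", "$doc", children=[
        PatternNode("book", "$b1", "descendant", [PatternNode("title", "$t1")]),
    ])],
    groupby=["[$b1]", "[$t1]"],
    returns=["$t1", "B2(V)"],
    nested=[B2_V],
)
```

Keeping the nested block as a separate object mirrors the slide's picture, where B2(V) appears by name in the return function of B1(V).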

28 Outline
- NEXT (NEsted XML Tableaux)
- Rewriting Algorithm and Extensions
- Experiments
- Previous Work
- Conclusions

29 Architecture of the NEXT framework
[Diagram: XQuery query and views → Normalization → NEXT → Minimization → Rewriting Using Views → NEXT; from there either Logical Optimization → Logical Plan → Plan Execution Engine, or Translate to XQuery → independent XQuery processor. Annotation on the diagram: "rewriting algorithm".]

30 Overview of the Rewriting Algorithm
Input: query Q, views V
1. Detect alternative access paths towards the variable bindings through the views.
2. Build a candidate rewriting R that uses only the access paths from phase 1.
3. Check that R is equivalent to Q.
[Diagram: Query Q → access paths through V → access paths (candidate rewriting)]
(Section: REWRITING ALGORITHM)

31 Step 1: Detect View Access Paths
Access paths: ways of accessing data using the view.
- Identify matching subqueries (extended tree pattern matching).
- Find a mapping and add navigation from the view return.
View: $doc//book($b1)/title($t1) and book($b1)/author($a2); return functions $t1, B2(V) and $a2.
Query body: $doc//book($b0) with title($t0) and author($a); $doc//book($b) with title($t) and author($a1).

32 Step 1: Detect View Access Paths
Same view and query body as on slide 31.
Extended query: adds $docV/authorlist($p0)/title($t2).

33 Step 1: Detect View Access Paths
Same view and query body as on slide 31. And another one…
Extended query: adds $docV/authorlist($p0) with title($t2) and author($a3).

34 Step 1: Detect View Access Paths
Same view and query body as on slide 31.
Computing all such mappings → a query extension that uses only view access paths.
Query extension: $docV/authorlist($p0) with title($t2) and author($a3); $docV/authorlist($p) with title($t3) and author($a4).
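The mapping search in Step 1 can be sketched as enumerating homomorphisms from the view's tree pattern into the query's tree pattern. The Python below is a simplified illustration, not the authors' extended tree pattern matching. It assumes a single pattern tree per side, ignores joins, groupby lists and block nesting, and merges B1(V) and B2(V) into one tree on $b1. `Node`, `candidates` and `mappings` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    label: str
    var: str
    axis: str = "child"        # edge from the parent: "child" or "descendant"
    children: List["Node"] = field(default_factory=list)


def descendants(n: Node) -> Iterator[Node]:
    """All proper descendants of n."""
    for c in n.children:
        yield c
        yield from descendants(c)


def candidates(q: Node, axis: str) -> List[Node]:
    """Query nodes that a view edge of the given axis may lead to from q."""
    if axis == "child":
        # a view child edge has to land on a query child edge
        return [c for c in q.children if c.axis == "child"]
    # a view descendant edge may land on any node below q
    return list(descendants(q))


def mappings(v: Node, q: Node) -> Iterator[Dict[str, str]]:
    """Yield every variable mapping that embeds the view subtree v at q."""
    if v.label != q.label:
        return
    partial = [{v.var: q.var}]
    for vc in v.children:
        partial = [{**m, **sub}
                   for m in partial
                   for qc in candidates(q, vc.axis)
                   for sub in mappings(vc, qc)]
    yield from partial


VIEW = Node("$doc", "$doc", children=[
    Node("book", "$b1", "descendant",
         [Node("title", "$t1"), Node("author", "$a2")]),
])
QUERY = Node("$doc", "$doc", children=[
    Node("book", "$b0", "descendant",
         [Node("title", "$t0"), Node("author", "$a")]),
    Node("book", "$b", "descendant",
         [Node("title", "$t"), Node("author", "$a1")]),
])
```

`list(mappings(VIEW, QUERY))` yields two mappings, one per book subtree of the query, which matches the two authorlist access paths ($p0 and $p) shown on the slides.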

35 Step 2: Candidate Rewriting
Same return function as the initial query, but with other variable bindings.
Original query: $doc//book($b0) with title($t0) and author($a); $doc//book($b) with title($t) and author($a1).
- B1(Q): groupby $a; return $a, B2(Q)
- B2(Q): groupby [$b], [$t]; return $t
Extended query: $docV/authorlist($p0) with title($t2) and author($a3); $docV/authorlist($p) with title($t3) and author($a4).

36 Step 2: Candidate Rewriting
Same return function as the initial query, but with other variable bindings.
Original query: $doc//book($b0) with title($t0) and author($a); $doc//book($b) with title($t) and author($a1).
- B1(Q): groupby $a; return $a, B2(Q)
- B2(Q): groupby [$b], [$t]; return $t
Candidate rewriting: $docV/authorlist($p0) with title($t2) and author($a3); $docV/authorlist($p) with title($t3) and author($a4).
- B1(R): groupby $a3; return $a3, B2(R)
- B2(R): groupby [$t3]; return $t3
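"Same return function, other variable bindings" amounts to a variable substitution over the return function. The sketch below shows only that renaming step, under the assumption that a return function is a nested list of variable names, with an inner list standing for a nested block. The encoding and the names `rename`, `Q_RETURN` and `BINDING` are illustrative, and the groupby lists are left out.

```python
from typing import Dict, List


def rename(ret: List, binding: Dict[str, str]) -> List:
    """Replace each variable in a (possibly nested) return function.

    Variables without an entry in `binding` are kept unchanged.
    """
    out = []
    for item in ret:
        if isinstance(item, list):
            out.append(rename(item, binding))   # nested block
        else:
            out.append(binding.get(item, item))
    return out


# Return function of Q: B1(Q) returns $a and the nested block B2(Q), which returns $t.
Q_RETURN = ["$a", ["$t"]]
# Bindings reached through the view access paths, as on the slides.
BINDING = {"$a": "$a3", "$t": "$t3"}
```

`rename(Q_RETURN, BINDING)` gives the shape of the return function of R on the slide: $a3 followed by a nested block returning $t3.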

37 Step 3: Equivalence Check
Check that R ≡ Q: containment mappings defined on the tree of query blocks.
Then (optional step) translate back to XQuery.
Candidate rewriting: $docV/authorlist($p0) with title($t2) and author($a3); $docV/authorlist($p) with title($t3) and author($a4).
- B1(R): groupby $a3; return $a3, B2(R)
- B2(R): groupby [$t3]; return $t3
Rewriting R:
    for $a3 in distinct-values($docV/authorlist[title]/author)
    return { $a3,
             for $p in $docV/authorlist, $t3 in $p/title
             where some $a4 in $p/author satisfies $a4 eq $a3
             return $t3 }

38 Under the Hood
- Two types of equality: by value and by node id.
  – Mappings must take this into consideration.
  – So must the groupby clause.
- XQuery results have order. We consider rewritings that:
  – do not respect order (for DB-centric applications)
  – respect order (for text-centric applications)
- For rewritings that respect order: look for an ordering of the view access paths that preserves the original query order (details in the paper).


40 Extensions to NEXT
- Extended NEXT to NEXT+:
  – extend the pattern-based representation to the whole of XQuery
  – functions and other expressions (negation, disjunction, aggregates, etc.) are modeled as uninterpreted functions
- Extended the algorithm to use NEXT+: need to identify maximal subparts that are pure NEXT blocks.
Example:
    for $x in $doc/book
    where count( for $a in $x/author
                 where $x/price eq 60
                 groupby [$a]
                 return $a ) eq count( …)
    groupby $x
    return $x
- Rewrite blocks inside function arguments, with free variables bound in upper blocks.
- Rewrite the outer block, disregarding function calls.

41 Formal Guarantees
- The rewriting algorithm is sound and complete for a large fragment of XQuery (the one that can be translated into NEXT), without order.
  – Completeness means that if there are any rewritings, we are guaranteed to find at least one.
- There is no hope for completeness for:
  – ordered rewritings: equivalence is undecidable
  – expressions beyond NEXT: negation and universal quantification also lead to undecidability
- In these cases, our algorithm is a best-effort approach, with guaranteed soundness.

42 Implementation (considerations)
- Completeness guarantees come at a price: computing mappings between view and query patterns is NP-complete in general, but PTIME if the patterns are trees (no equality conditions); based on M. Yannakakis, Algorithms for acyclic database schemes, 1981.
- Our goal: design an implementation whose running time is polynomial for pure tree patterns and degrades progressively with the number of added joins.

43 Implementation in practice
- When computing the query plan, apply techniques from the Yannakakis algorithm: push projections & selections.
- Performance degrades with the number of equalities: the problem is NP-complete in the width of the view pattern (see the paper) and in PTIME when there are no join equalities.
[Diagram: the view V is compiled into a query plan (SPJ) and the query Q is compiled into an XML instance; evaluating the plan over the instance yields the mappings.]
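For pure tree patterns, the polynomial bound can be illustrated with a bottom-up filtering pass in the spirit of the semijoin reductions of the Yannakakis algorithm. The Python below is a sketch under stated assumptions, not the implementation described on the slide: the pattern plays the role of the view, the instance tree plays the role of the compiled query, and there are no join equalities. `TNode`, `PNode`, `reachable` and `reduce_bottom_up` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Set


@dataclass(eq=False)           # identity-based hashing, so nodes can be put in sets
class TNode:
    label: str
    children: List["TNode"] = field(default_factory=list)


@dataclass(eq=False)
class PNode:
    name: str                  # key under which the candidate set is stored
    label: str
    axis: str = "child"        # edge from the parent: "child" or "descendant"
    children: List["PNode"] = field(default_factory=list)


def subtree(n: TNode) -> Iterator[TNode]:
    """n followed by all of its descendants."""
    yield n
    for c in n.children:
        yield from subtree(c)


def reachable(n: TNode, axis: str) -> List[TNode]:
    """Instance nodes reachable from n along one pattern edge of the given axis."""
    if axis == "child":
        return list(n.children)
    return [d for c in n.children for d in subtree(c)]


def reduce_bottom_up(p: PNode, root: TNode, out: Dict[str, Set[TNode]]) -> Set[TNode]:
    """Keep, for each pattern node, the instance nodes where its subtree embeds."""
    child_sets = [reduce_bottom_up(c, root, out) for c in p.children]
    keep = {
        n for n in subtree(root)
        if n.label == p.label
        and all(any(x in s for x in reachable(n, c.axis))
                for c, s in zip(p.children, child_sets))
    }
    out[p.name] = keep
    return keep


B_FULL = TNode("book", [TNode("title"), TNode("author")])
B_NO_TITLE = TNode("book", [TNode("author")])
INSTANCE = TNode("$doc", [B_FULL, B_NO_TITLE])
PATTERN = PNode("root", "$doc", children=[
    PNode("b", "book", "descendant", [PNode("t", "title"), PNode("a", "author")]),
])
SETS: Dict[str, Set[TNode]] = {}
reduce_bottom_up(PATTERN, INSTANCE, SETS)
```

Each pattern node is visited once and compared against each instance node, so the pass stays polynomial in the sizes of the pattern and the instance. It is only the bottom-up half of a full reduction: the set for "a" still contains the author of the untitled book, which a top-down pass would remove.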

44 Outline
- NEXT (NEsted XML Tableaux)
- Rewriting Algorithm and Extensions
- Experiments
- Previous Work
- Conclusions

45 Experiments: Design
- The running time of the algorithm increases with:
  – the number of nested levels: mappings are block by block
  – the size of the pattern: the number of mapped and target nodes increases
  – the number of views: more patterns to match
- Our experiments measured how the algorithm scales with these parameters.
- We designed a configuration in which we generated queries and views of increasing size and nesting depth.
(Section: EXPERIMENTS)

46 Experiments: Implementation
- Queries & views have similar basic patterns, in a vertical chain of blocks.
- Irrelevant views don’t matter (they can be quickly discarded), so we create only relevant views (with mappings into the query):
  – split the query recursively into fragments = views
  – make them overlap on basic patterns
[Diagram: the basic pattern is rooted at $doc and contains the nodes m_k, a and c_i; block B_k holds the basic patterns for m_k with c_1, c_2, …, and block B_k+1 holds those for m_k+1.]

47 Experiments: Good Scalability
d = depth (# of nested levels in a query)
b = breadth (# of basic patterns in a block)
Result: 1.25s for d=16, b=16 and 128 views

48 Previous work
- Rewriting XPath queries using XPath views: Rewriting XPath Queries Using Materialized Views, W. Xu et al., VLDB 2005
- Rewriting XQuery using XPath views: A Framework for Using Materialized XPath Views in XML Query Processing, A. Balmin et al., VLDB 2004
- Rewriting an XQuery with only one XQuery view that has to contain the query: ACE-XQ: A CachE-aware XQuery Answering System, L. Chen et al., WebDB 2002
- Caching common XQuery subexpressions: Implementing Memoization in a Streaming XQuery Processor, Y. Diao et al., XSym 2004

49 Conclusions
- NEXT is a pattern-based representation that describes what the query result is and not how it is computed → more opportunities for semantic optimizations.
- It is extensible to all of XQuery, using NEXT+.
- Rewriting using views algorithm:
  – sound for the whole language
  – complete for a large fragment of XQuery
  – good scalability
  – independent of the underlying algebra of the query processor

50 Online Demo