XQuery Implementation in a Relational Database System Shankar Pal Istvan Cseri, Oliver Seeliger, Michael Rys, Gideon Schaller, Wei Yu, Dragan Tomic, Adrian.

Presentation transcript:

XQuery Implementation in a Relational Database System
Shankar Pal, Istvan Cseri, Oliver Seeliger, Michael Rys, Gideon Schaller, Wei Yu, Dragan Tomic, Adrian Baras, Brandon Berg, Denis Churin, Eugene Kogan
SQL Server, Microsoft Corp

VLDB Sep 1 | S. Pal et al. | Slide 2: Overview
- Background
  - XML Support in SQL Server 2005
  - OrdPath labeling of XML nodes
  - XML indexes: PATH, VALUE, PROPERTY
- Main topic: XQuery compilation
  - Architecture
  - XML operators
  - Mapping XML operators to relational+ ops
- Conclusions

Slide 3: Background: XML Support in SQL Server 2005
CREATE TABLE DOCS (ID int PRIMARY KEY, XDOC xml)
- XML stored in an internal, binary form ('blob')
- Optionally typed by a collection of XML schemas
  - Used for storage and query optimizations
- 3 of 5 methods on XML data type:
  - query(): returns XML type
  - value(): returns scalar value
  - exist(): checks conditions on XML nodes
- XML indexing
- More information at (URL not preserved in the transcript)

Slide 4: Background: XQuery embedded in SQL
- Retrieve section titles from … wrapped in new … elements:
  SELECT ID, XDOC.query('
      for $s in /BOOK/SECTION
      return <…>{data($s/TITLE)}</…>
  ')
  FROM DOCS

Slide 5: Background: XQuery supported features
- XQuery clauses "for", "where", "return" and "order by"
- XPath axes: child, descendant, parent, attribute, self and descendant-or-self
- Functions: numeric, string, Boolean, nodes, context, sequences, aggregate, constructor, data accessor
- SQL Server extension functions to access SQL variable and column data within XQuery
- Numeric operators (+, -, *, div, mod)
- Value comparison operators (eq, ne, lt, gt, le, ge)
- General comparison operators (=, !=, <, >, <=, >=)

Slide 6: Background: ORDPATH Label of Nodes [SIGMOD04]
Example tree labels: BOOK 1; Section 1.3; Title 1.3.1; Figure 1.3.3; Section 1.5
- node1 precedes node2 in document order ⇔ ORDPATH(node1) < ORDPATH(node2)
- node1 is ancestor of node2 ⇔ ORDPATH(node1) is prefix of ORDPATH(node2)
- Subtree of node 1.3: ORDPATH(1.3) ≤ id < Descendant_Limit(1.3) = 1.4
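The three ORDPATH properties on this slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: labels are modelled as tuples of integers, the function names are made up for the sketch, and the compressed binary encoding used by the real system is not modelled.

```python
# Sketch only: ORDPATH labels modelled as tuples of ints, e.g. "1.3.1" -> (1, 3, 1).
# Function names are hypothetical; the real binary encoding is not modelled.

def parse(label):
    """Turn a dotted label such as '1.3.1' into a comparable tuple."""
    return tuple(int(part) for part in label.split("."))

def precedes(a, b):
    """Document order: a comes before b if its label compares lower."""
    return parse(a) < parse(b)

def is_ancestor(a, b):
    """Ancestor test: a's label is a proper prefix of b's label."""
    pa, pb = parse(a), parse(b)
    return len(pa) < len(pb) and pb[:len(pa)] == pa

def descendant_limit(a):
    """Smallest label that sorts after every descendant of a (1.3 -> 1.4)."""
    pa = parse(a)
    return pa[:-1] + (pa[-1] + 1,)

def in_subtree(node, root):
    """Range test from the slide: ORDPATH(root) <= id < Descendant_Limit(root)."""
    return parse(root) <= parse(node) < descendant_limit(root)
```

With the labels from the slide, in_subtree("1.3.3", "1.3") holds while in_subtree("1.5", "1.3") does not, so a subtree can be fetched with a single range comparison.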

Slide 7: Background: Indexing XML column [VLDB 2004]
- Primary XML index on an XML column
  - Creates a B+tree on the data model content of the XML nodes
  - Adds column Path_ID for the reversed, encoded path from each XML node to the root of the XML tree
- OrdPath labeling scheme is used for XML nodes
  - Relative order of nodes
  - Document hierarchy

Slide 8: Background: XML example
INSERT INTO myTable VALUES (7, ' Bad Bugs Tree frogs … ')

Slide 9: Background: Primary XML Index Entries
ID  ORDPATH  TAG          NODETYPE       VALUE         PATH_ID
7   1        1 (Book)     10 (ns:bT)     NULL          #1
7   1.1      2 (ISBN)     2 (xs:string)  '…'           #2#1
7   1.3      3 (Section)  11 (ns:sT)     NULL          #3#1
7   1.3.1    4 (Title)    2 (xs:string)  'Bad Bugs'    #4#3#1
7   1.3.3    5 (Figure)   12 (ns:fT)     NULL          #5#3#1
7   1.5      3 (Section)  11 (ns:sT)     NULL          #3#1
7   1.5.1    4 (Title)    2 (xs:string)  'Tree frogs'  #4#3#1
7   1.5.3    5 (Figure)   12 (ns:fT)     NULL          #5#3#1
- Clustering key: ID, ORDPATH
- Encoding of tags & types stored in system meta-data
- Additional details not shown
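How such rows might be produced can be sketched by shredding a small document in Python. Everything here is an assumption made for illustration: the sample document (including the ISBN placeholder value), the shred function, the use of symbolic tag names instead of encoded tag numbers in PATH_ID, and the simple rule of giving attributes and child elements successive odd ordinals. The NODETYPE column is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the ISBN value is a placeholder, not taken from the slides.
DOC = """<BOOK ISBN="0000000000">
  <SECTION><TITLE>Bad Bugs</TITLE><FIGURE/></SECTION>
  <SECTION><TITLE>Tree frogs</TITLE><FIGURE/></SECTION>
</BOOK>"""

def shred(doc_id, xml_text):
    """Return (ID, ORDPATH, TAG, VALUE, PATH_ID) rows in document order."""
    rows = []

    def visit(elem, ordpath, parent_path):
        # PATH_ID is the reversed path: this node's tag first, the root last.
        path_id = "#" + elem.tag + parent_path
        text = (elem.text or "").strip() or None
        rows.append((doc_id, ordpath, elem.tag, text, path_id))
        ordinal = 1  # children receive odd ordinals 1, 3, 5, ...
        for name, value in elem.attrib.items():
            rows.append((doc_id, f"{ordpath}.{ordinal}", name, value,
                         "#" + name + path_id))
            ordinal += 2
        for child in elem:
            visit(child, f"{ordpath}.{ordinal}", path_id)
            ordinal += 2

    visit(ET.fromstring(xml_text), "1", "")
    return rows

for row in shred(7, DOC):
    print(row)
```

Leaving gaps between sibling ordinals is what lets new nodes be inserted later without relabeling existing ones; the sketch only mimics the initial odd numbering.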

Slide 10: Background: Secondary XML indexes
- To speed up different classes of commonly occurring queries
- Statistics created on key columns of the primary and secondary XML indexes
  - Used for cost-based selection of secondary XML indexes
Index     Speeds up            Key columns
PATH      path-based queries   PATH_ID, VALUE, ID, ORDPATH
VALUE     value-based queries  VALUE, PATH_ID, ID, ORDPATH
PROPERTY  object properties    ID, PATH_ID, VALUE, ORDPATH

Slide 11: Background: Handling Types
- If XML column is typed
  - Values are stored in XML blob and XML indexes with appropriate typing
- Untyped XML
  - Values are stored as strings
  - Convert to appropriate types for operations
- SQL typed values stored in primary XML index
  - Most SQL types are compatible with XQuery types (integer)
  - Value comparisons on XML index columns suffice
  - Some types (e.g. xs:datetime) are stored in internal format and processed specially

Slide 12: XQuery Processing Architecture
- XQuery Compiler:
  - Parses XQuery expression
  - Checks static type correctness
  - Type annotations
  - Applies static optimizations
    - Path collapsing
    - Rewrites using XML schemas
- XML Operator Mapper
  - Recursively traverses XML algebra tree
  - Converts each XmlOp to a relational+ operator sub-tree
  - Mapping depends upon existence of primary XML index
Pipeline: XQuery expression → XQuery Compiler → XML algebra tree (XmlOp ops) → XML Operator Mapper → Relational Operator Tree (relational+ operators) → Relational Query Processor

Slide 13: Examples of XML Operators
- XmlOp_Select
  - In: list of items, condition
  - Out: items satisfying condition
- XmlOp_Path
  - In: simple paths, no predicates
  - Opt: path context to collapse paths
  - Out: eligible XML nodes
- XmlOp_Apply
  - In: two item lists
  - Out: one item list
  - Variable binding in "for" expression
- XmlOp_Construct
  - In: sub-nodes for element construction, otherwise value
  - Out: constructed node

Slide 14: XML Operator Mapping: Overview
[Diagram labels: XML, PK, XQUERY, REL+ tree, Primary XML Index, PATH Index, VALUE Index, PROPERTY Index, OrdPath]
- Special handling for SELECT * | XDOC

Slide 15: New operators
- Some produce N rows from M (≠ N) rows
  - XML_Reader: streaming, pull-model XML parser
  - XML_Serializer: to serialize query result as XML
- Some are for efficiency
  - Contains: to evaluate XQuery contains()
  - TextAdd: to evaluate the XQuery function string()
  - Data: to evaluate XQuery data() function
- Some are for specific needs
  - Check: validate XML during insertion or modification

Slide 16: XML Operator Mapping
- Following categories:
  - Mapping of XPath expressions
  - Mapping of XQuery expressions
  - Mapping of XQuery built-in functions

Slide 17: XPath Expressions
- Two cases:
  1. Fully known, forward paths without branching after path collapsing
  2. Paths without branching that are not fully known after path collapsing
     - Segments of the path cannot be collapsed, or a path is split into multiple segments
     - Occurs most commonly for paths containing wildcard steps, //, self and parent axes
     - Evaluated using LIKE operator on the XML index

Slide 18: Non-indexed XML, Full Path
- XML_Reader produces subtrees of Node table rows
  - Contains OrdPath
  - No PK or PATH_ID
- XML_Serialize reassembles those rows into XML data type
  - To output result
XML operator tree: XmlOp_Path, PATH = "/BOOK/SECTION"
Rel+ operator tree: XML_Serialize over XML_Reader (XDOC, "/BOOK/SECTION")

Slide 19: Query Execution on XML Blob
SELECT ID, XDOC.query('/BOOK/SECTION[2]') FROM DOCS
- XDOC column value in each row parsed at runtime
  - Parser is XmlReader (not DOM)
  - Evaluate simple XPath (without branching) during parsing
  - Rest of processing done in memory using relational operators
- // and * are also pushed into XML_Reader
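The idea of evaluating a simple path while the blob is being parsed can be sketched with Python's pull-style iterparse. This is an analogy, not the XML_Reader operator itself: the sample document and the stream_path helper are hypothetical, and only exact child-axis paths are handled.

```python
import io
import xml.etree.ElementTree as ET
from itertools import islice

# Hypothetical sample document.
DOC = """<BOOK>
  <SECTION><TITLE>Bad Bugs</TITLE></SECTION>
  <SECTION><TITLE>Tree frogs</TITLE></SECTION>
</BOOK>"""

def stream_path(xml_text, path):
    """Yield elements matching a simple path such as /BOOK/SECTION while parsing."""
    target = path.strip("/").split("/")
    stack = []  # tags of the currently open elements, root first
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
        else:
            if stack == target:
                yield elem  # element is complete once its end tag is seen
            stack.pop()

# /BOOK/SECTION[2]: skip the first match, take the second, then stop pulling.
second = next(islice(stream_path(DOC, "/BOOK/SECTION"), 1, None))
print(second.find("TITLE").text)
```

Because the generator is consumed lazily, parsing stops as soon as the second SECTION has been produced, which is the benefit of a pull model over building a full DOM first.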

Slide 20: Sample query execution using Primary XML Index
(Same primary XML index entries as on slide 9.)
- /Book/Section → #3#1 (by XML Op Mapper)

Slide 21: Indexed XML, Full Path
- XmlOp_Path mapped to SELECT
- GET(PXI): rows from primary XML index
  - Match PATH_ID
- Not shown: JOIN with base table on PK
Operator tree (top down):
  XML_Serialize
    Apply
      Select ($b) over GET(PXI): Path_ID = #SECTION#BOOK
      Select over GET(PXI): $b.OrdP ≤ OrdP < DL($b)  (Assemble Subtree)
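The Apply over two Selects can be read as a nested loop, sketched below in Python. The in-memory PXI list, the row layout, the symbolic path IDs and the function names are all assumptions for illustration; the real operators run inside the relational engine and seek the index rather than scan a list.

```python
# Hypothetical primary XML index rows: (ID, ORDPATH as tuple, VALUE, PATH_ID).
PXI = [
    (7, (1,), None, "#BOOK"),
    (7, (1, 3), None, "#SECTION#BOOK"),
    (7, (1, 3, 1), "Bad Bugs", "#TITLE#SECTION#BOOK"),
    (7, (1, 5), None, "#SECTION#BOOK"),
    (7, (1, 5, 1), "Tree frogs", "#TITLE#SECTION#BOOK"),
]

def descendant_limit(ordpath):
    """DL($b): smallest label sorting after all descendants of ordpath."""
    return ordpath[:-1] + (ordpath[-1] + 1,)

def get_pxi(predicate):
    """GET(PXI) followed by Select: keep the index rows satisfying predicate."""
    return [row for row in PXI if predicate(row)]

def eval_full_path(path_id):
    """Apply: for each $b with a matching Path_ID, assemble its subtree rows."""
    results = []
    for b in get_pxi(lambda r: r[3] == path_id):
        subtree = get_pxi(
            lambda r: r[0] == b[0] and b[1] <= r[1] < descendant_limit(b[1])
        )
        results.append(subtree)
    return results

for subtree in eval_full_path("#SECTION#BOOK"):
    print(subtree)
```

Each inner list holds the rows that a serializer would turn back into one SECTION element.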

Slide 22: XML index: PATH
PATH_ID  VALUE         ID  ORDPATH
#1       NULL          7   1
#2#1     '…'           7   1.1
#3#1     NULL          7   1.3
#3#1     NULL          7   1.5
#4#3#1   'Bad Bugs'    7   1.3.1
#4#3#1   'Tree frogs'  7   1.5.1
#5#3#1   NULL          7   1.3.3
#5#3#1   NULL          7   1.5.3
- Speeds up path evaluations
- Example: /Book/Section → #3#1

Slide 23: Indexed XML, Imprecise Paths
Path: /BOOK/SECTION//TITLE
- Matched using LIKE operator on Path_ID
Operator tree (top down):
  XML_Serialize
    Apply
      Select ($s) over GET(PXI): Path_ID LIKE #TITLE%#SECTION#BOOK
      Assemble subtree of …
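Why a reversed path ID turns // into a LIKE pattern can be sketched as follows. The helpers path_to_like and like_match are hypothetical, tag names stand in for the encoded tag numbers, and only the child (/) and descendant (//) steps are handled.

```python
import re

def path_to_like(xpath):
    """Build a LIKE pattern over the reversed path ID for a simple XPath."""
    steps = re.findall(r"(//|/)([^/]+)", xpath)  # [(axis, name), ...]
    pattern = ""
    for axis, name in steps:
        # A // step allows any number of tags between this name and its ancestors.
        gap = "%" if axis == "//" else ""
        pattern = "#" + name + gap + pattern
    return pattern

def like_match(pattern, value):
    """Minimal LIKE: '%' matches any run of characters, the rest is literal."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None

pattern = path_to_like("/BOOK/SECTION//TITLE")
print(pattern)
print(like_match(pattern, "#TITLE#FIGURE#SECTION#BOOK"))
```

Since the pattern begins with the literal #TITLE, an index ordered on Path_ID can be narrowed by that prefix before the wildcard part is checked. One limitation of using symbolic names here is that the pattern would also accept a tag that merely starts with TITLE; the sketch does not address that.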

Slide 24: Predicate Evaluation
Path: … = "12"]
- Search value compared with VALUE column in PXI
- Collapsed path
  - Induces index seeks
  - Reduces intermediate result size
- Parent check: Par($b)
  - Using OrdPath
- Value conversion might be needed
Operator tree (top down):
  XML_Serialize
    Apply
      Select ($b) over GET(PXI): Path_ID = #BOOK
      Apply
        Select over GET(PXI): Path_ID = …SBN#BOOK & VALUE = "12" & Par($b)
        Assemble subtree of …
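The combination of a value comparison with the parent check Par($b) can be sketched in Python. The rows, the second document with ID 8, the symbolic path ID #ISBN#BOOK and the helper names are assumptions; in particular, computing a parent by dropping the last label component is a simplification of how OrdPath labels work.

```python
# Hypothetical primary XML index rows: (ID, ORDPATH as tuple, VALUE, PATH_ID).
PXI = [
    (7, (1,), None, "#BOOK"),
    (7, (1, 1), "12", "#ISBN#BOOK"),
    (8, (1,), None, "#BOOK"),
    (8, (1, 1), "34", "#ISBN#BOOK"),
]

def parent(ordpath):
    """Simplified parent label: drop the last component (1.1 -> 1)."""
    return ordpath[:-1]

def books_with_isbn(value):
    """Keep each BOOK row $b that has an ISBN child row with the given VALUE."""
    hits = []
    for b in PXI:
        if b[3] != "#BOOK":
            continue
        # Collapsed path + value test + Par($b): same document, parent label is $b.
        if any(
            r[0] == b[0]
            and r[3] == "#ISBN#BOOK"
            and r[2] == value
            and parent(r[1]) == b[1]
            for r in PXI
        ):
            hits.append(b)
    return hits

print(books_with_isbn("12"))
```

Testing the collapsed path and the value together is what keeps the intermediate result small: only ISBN rows with the searched value ever reach the parent check.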

Slide 25: Ordinal Predicate
- /BOOK[n]
  - Adds ranking column to the rows for elements
  - Retrieves the nth node
- Special optimizations
  - [1] → TOP 1 ascending
  - [last()] → TOP 1 descending
  - Avoids sorting when input is sorted
    - Example: in XML_Serializer
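The difference between the general [n] case and the two TOP 1 shortcuts can be sketched as below. The function names are hypothetical and plain Python iterables stand in for sorted row streams.

```python
from itertools import islice

def nth(nodes, n):
    """[n]: attach a 1-based rank to each node and keep the one ranked n."""
    return [node for rank, node in enumerate(nodes, start=1) if rank == n]

def first(nodes_ascending):
    """[1] as TOP 1 ascending: read a single row from an ordered scan."""
    return list(islice(nodes_ascending, 1))

def last(nodes_descending):
    """[last()] as TOP 1 descending: read a single row from a reverse scan."""
    return list(islice(nodes_descending, 1))

sections = ["1.3", "1.5", "1.7"]  # already in document order, so no sort is needed
print(nth(sections, 2))
print(first(iter(sections)))
print(last(reversed(sections)))
```

The shortcuts touch one row each, whereas the ranking approach has to number rows up to position n.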

Slide 26: Error handling
- Static type errors at compilation time
  - Raises static type errors if an expression could fail at runtime due to type safety violation
    - Addition of string to integer
    - Querying non-existent node name in typed XML
    - Non-singleton in "eq"
  - Some can be fixed using explicit cast or ordinal specification
- Dynamic error converted to empty sequence
  - Yields correct result in predicates without negations
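The remark about negations can be illustrated with a small Python sketch. The helpers cast_int and ge and the sample values are hypothetical; the sketch assumes that a failed cast produces an empty sequence and that a general comparison over an empty sequence is false.

```python
def cast_int(value):
    """Cast to integer; a dynamic cast error becomes the empty sequence."""
    try:
        return [int(value)]
    except ValueError:
        return []

def ge(sequence, threshold):
    """General comparison: true if any item qualifies, false for empty input."""
    return any(item >= threshold for item in sequence)

values = ["5", "abc", "2"]

# Positive predicate: the uncastable value simply does not qualify.
kept = [v for v in values if ge(cast_int(v), 3)]

# Negated predicate: the uncastable value now qualifies, which may surprise.
negated = [v for v in values if not ge(cast_int(v), 3)]

print(kept)
print(negated)
```

In the positive predicate the bad value is filtered out just as an error-raising evaluation would never have returned it; under negation it is returned, which is why the slide restricts the claim to predicates without negations.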

Slide 27: "for" Iterator
Query: for $s in /BOOK//SECTION where … >= 3 return $s/TITLE
- XML op for "for" is XmlOp_Apply
  - Maps to APPLY
  - Binds $s and iterates over …
  - Determines its children
- Nested "for" and "for" with multiple bindings turn into nested APPLY
  - Each APPLY binds to a different variable
[Operator tree labels: XML_Serialize; Assemble; Apply ($s); Apply; Exists; Select ($s) over GET(PXI) with Path_ID LIKE #SECTION%#BOOK; Select over GET(PXI) with Path_ID LIKE #BK & VALUE >= 3 & Par($s); Select with Path_ID LIKE #TITLE#SECTION%#BOOK & Par($s)]

Slide 28: XQuery "order by" and "where"
- Order by:
  - Sorts rows based on order-by expression
  - Adds a ranking column to these rows
  - Ranking column converted into OrdPath values
    - Yield the new order of the rows
    - Fits rest of query processing framework
- Where
  - Becomes SELECT on input sequence
  - Filters rows satisfying specified condition

Slide 29: XQuery "return"
- Return nodes sequence in document order
  - Use OrdPath values and XML_Serialize operator
- New element and sequence constructions
  - Merge constructed and existing nodes into a single sequence (SWITCH_UNION)

Slide 30: XQuery Functions & Operators
- Built-in functions and operators are mapped to relational functions and operators if possible
  - fn:count() → count()
- Additional support for XQuery types, functions and operators that cannot be mapped directly
  - Intrinsics

Slide 31: Optimizations
- Exploiting Ordered Sets
  - Sorting information (OrdPath) made available to further relational operators
  - XML_Serialize is an example
- Using static type information
  - Eliminates CONVERT() in operations
  - Allows range scan on VALUE index
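Why static type information enables a range scan can be sketched by comparing two ways of answering "value >= 7". The lists and function names are hypothetical; a sorted Python list with bisect stands in for an ordered VALUE index.

```python
from bisect import bisect_left

# Typed case: values are stored as integers, so index order is numeric order.
typed_index = [3, 7, 12, 40]

def range_scan_ge(sorted_values, low):
    """Seek to the first qualifying entry, then read the rest in order."""
    return sorted_values[bisect_left(sorted_values, low):]

# Untyped case: values are stored as strings, so index order is string order
# ('12' < '3' < '40' < '7') and every entry must be converted and tested.
untyped_index = sorted(["3", "7", "12", "40"])

def scan_with_convert_ge(values, low):
    """Full scan with a per-row conversion, standing in for CONVERT()."""
    return sorted(int(v) for v in values if int(v) >= low)

print(range_scan_ge(typed_index, 7))
print(scan_with_convert_ge(untyped_index, 7))
```

Both calls return the same values, but the first touches only the qualifying tail of the index while the second converts and tests every entry.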

Slide 32: Conclusions
- Built-up infrastructure for query processing framework
  - Other XQuery features (such as "let" and typeswitch) can be implemented
  - Data modification language
- Fits into relational query processing framework
- XQuery features can be implemented using rel++ operators
- Optimizations pose the biggest challenges
- More cost-based optimizations can be done
  - Enhanced costing model (e.g. choice of PXI)
  - Matching materialized views

Slide 33: Thank you!