Order-sensitive XML Query Processing over Relational Sources: An Algebraic Approach Authors: Ling Wang, Song Wang, Brian Murphy and Elke A. Rundensteiner.

Presentation transcript:

Order-sensitive XML Query Processing over Relational Sources: An Algebraic Approach
Authors: Ling Wang, Song Wang, Brian Murphy and Elke A. Rundensteiner
Institute: Database Systems Research Group, Worcester Polytechnic Institute (WPI)
IDEAS'2005

Order in XML
- Order is important to XML
  - Document order
  - XML view can be ordered (OrderBy) ...
  - User query can be order-sensitive (OrderBy, position(), range() ...)

XML document (band: songs):
  Misfits: She
  Back Street Boy: Bullet, We Are 138
  Project X: SXE Revenge, Shutdown

Query:
  FOR $play in document("record.xml")/PLAY
  OrderBy $play/band
  RETURN $play[3]/SONG[range 1 to 2]/text()

  1. Sort PLAY by its band's name
  2. Find third PLAY
  3. Extract its first and second SONG

Result: SXE Revenge, Shutdown
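The three steps of the example query can be sketched in Python over an in-memory stand-in for the document. The `plays` list and the function name are illustrative assumptions, not part of the original slides; the point is only to show that the answer depends on two different orders: the band-name sort order and the document order of the songs.

```python
# Hypothetical stand-in for record.xml: (band, songs in document order).
plays = [
    ("Misfits", ["She"]),
    ("Back Street Boy", ["Bullet", "We Are 138"]),
    ("Project X", ["SXE Revenge", "Shutdown"]),
]


def third_play_first_two_songs(plays):
    # 1. Sort PLAY by its band's name.
    ordered = sorted(plays, key=lambda play: play[0])
    # 2. Find the third PLAY (XQuery positions are 1-based).
    if len(ordered) < 3:
        return []
    _, songs = ordered[2]
    # 3. Extract its first and second SONG, in document order.
    return songs[0:2]


result = third_play_first_two_songs(plays)
print(result)
```

With the sample data this yields the two Project X songs, matching the result shown on the slide.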

Why XML-to-SQL?
- XML is stored in relational database to ...
  - provide reliable persistent storage
  - exploit mature technologies
- XML-to-SQL systems
  - SilkRoute (AT&T), XPERANTO (IBM), RAINBOW (WPI), Rolex (Bell Labs), Agora, MARS
  - Oracle XML DB, Microsoft SQL Server 2000 SQLXML, IBM DB2 XML Extender

XML Views
- XML views
  - support XML view mechanism for XML data publishing
  - support queries (updates) over XML views
- XML publishing scenario
  - Relational model is not order-sensitive
  - Order in XML views over RDB has no meaning
- XML storage scenario
  - Order is essential!
  - Order-preserving loading
    - XML document -> relational database
    - Implicit order in XML document -> explicit order code in RDB
  - Order-restoring in extraction views
    - Explicit order code in RDB -> implicit order in XML view through view query
[Figure: XML, order encoding, RDB, XML view, user query]

Order-specific loading
- Order-specific loading:
  - Loading strategies: inline, edge, ...
  - Order encoding methods: global, local, dewey ...
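As a rough illustration of the three order encoding methods, the sketch below assigns every element a global code (position in a preorder walk of the whole document), a local code (position among its siblings), and a dewey code (the path of local positions from the root). The function name, the row layout, and the sample document with its element names are assumptions made for this example; the slides do not specify how the codes are computed.

```python
import xml.etree.ElementTree as ET


def encode_order(root):
    """Return one row per element with global, local and dewey order codes."""
    rows = []
    counter = 0

    def visit(node, local, dewey):
        nonlocal counter
        counter += 1  # global order: preorder position in the document
        rows.append({
            "tag": node.tag,
            "text": (node.text or "").strip(),
            "global": counter,
            "local": local,  # local order: position among siblings
            "dewey": ".".join(str(step) for step in dewey),
        })
        for i, child in enumerate(node, start=1):
            visit(child, i, dewey + (i,))

    visit(root, 1, (1,))
    return rows


# Hypothetical sample document (element names are assumed).
doc = ET.fromstring(
    "<RECORDLIST>"
    "<PLAY><SONG>Bullet</SONG><SONG>We Are 138</SONG></PLAY>"
    "<PLAY><SONG>SXE Revenge</SONG><SONG>Shutdown</SONG></PLAY>"
    "</RECORDLIST>"
)
rows = encode_order(doc)
for row in rows:
    print(row)
```

Inserting a new first SONG would shift every later global code, only the sibling local codes, and only the dewey codes in the affected subtree, which is one way to see why the choice of encoding depends on the workload.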

Example
XML schema:
  <xs:element name="PLAY" minOccurs="1" maxOccurs="unbounded">
  <xs:element name="SONG" type="xs:string" minOccurs="1"/>
XML document (band: songs):
  Misfits: She
  Back Street Boy: Bullet, We Are 138
  Project X: SXE Revenge, Shutdown

Example: inline loading + local order encoding

Relational database

RECORDLIST
IID  PID  POSITION
1    0    1

PLAY
IID  PID  POSITION  BAND_PCDATA
2    1    1         Misfits
3    1    2         Back Street Boy
4    1    3         Project X

SONG
IID  PID  POSITION  SONG_PCDATA
5    2    1         She
6    3    1         Bullet
7    3    2         We Are 138
8    4    1         SXE Revenge
9    4    2         Shutdown

View query:
  FOR $play IN document("dxv.xml")/PLAY/ROW
  ORDER BY $play/POSITION/text()
  RETURN $play/BAND_PCDATA
    FOR $song IN document("dxv.xml")/SONG/ROW[PID/text() = $play/IID/text()]
    ORDER BY $song/POSITION/text()
    RETURN $song/SONG_PCDATA/text()
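The loading direction of this example can be sketched as follows: a small Python function shreds a document into RECORDLIST, PLAY and SONG rows, inlining the band name into the PLAY row and storing each element's position among its same-named siblings as the local order code. The element names in `RECORD_XML`, the `shred` function, and the level-by-level IID numbering are assumptions chosen so that the output lines up with the tables above; the slides do not show the actual loader.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names are assumed for illustration.
RECORD_XML = """
<RECORDLIST>
  <PLAY><BAND>Misfits</BAND><SONG>She</SONG></PLAY>
  <PLAY><BAND>Back Street Boy</BAND><SONG>Bullet</SONG><SONG>We Are 138</SONG></PLAY>
  <PLAY><BAND>Project X</BAND><SONG>SXE Revenge</SONG><SONG>Shutdown</SONG></PLAY>
</RECORDLIST>
"""


def shred(xml_text):
    """Return (recordlist_rows, play_rows, song_rows) as (IID, PID, POSITION, ...) tuples."""
    root = ET.fromstring(xml_text.strip())
    next_iid = 1
    root_iid = next_iid
    recordlist_rows = [(root_iid, 0, 1)]

    # PLAY rows: BAND is inlined as a column, POSITION is the local order code.
    play_rows, plays = [], []
    for pos, play in enumerate(root.findall("PLAY"), start=1):
        next_iid += 1
        play_rows.append((next_iid, root_iid, pos, play.findtext("BAND")))
        plays.append((next_iid, play))

    # SONG rows: PID points at the parent PLAY, POSITION restarts per parent.
    song_rows = []
    for play_iid, play in plays:
        for pos, song in enumerate(play.findall("SONG"), start=1):
            next_iid += 1
            song_rows.append((next_iid, play_iid, pos, song.text))

    return recordlist_rows, play_rows, song_rows


recordlist_rows, play_rows, song_rows = shred(RECORD_XML)
for row in play_rows + song_rows:
    print(row)
```

Because POSITION is stored explicitly, the view query can restore document order with ORDER BY on that column even though the relational tables themselves are unordered.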

Motivation
- Many loading + encoding combinations are possible ...
  - {inline, edge, ...} * {local, global, dewey, ...}
- A hybrid of multiple loading and encoding methods may occur:
  - Loading:
    - Schema is available: inline
    - Schema is not available: edge
  - Order encoding:
    - Heavy update workload: dewey
    - Query workload: global
- Multiple XML documents are loaded into RDB
- Other loading and encoding methods may emerge in the future
- Conclusion: a general approach for XQuery-to-SQL translation is needed

XSOT
- XML-to-SQL Order-sensitive Translation (XSOT):
  - Step 1: Encode XML document with explicit order code (order-exposing)
  - Step 2: Load XML to relational database (order-preserving)
  - Step 3: Extract XML view from relational database (order-restoring)
  - Step 4: Query via XML view with order predicates (order-sensitive)

XSOT Framework
[Figure: XSOT architecture. Loading side: XML Source Wrapper, schema generation, data loading and order encoding into the RDBMS (DB2, Oracle, SQL Server, Sybase), with the Mapping Manager and the default XML schema and default XML view. Query side: the order-sensitive user query passes through the XQuery Parser, XAT Generator, View Composer (composed XAT), XAT Optimizer (optimized XAT) and SQL Generator; the RDBMS returns ordered tuple streams to the XML Generator, which produces the XML result. An order code comparison function is also shown. Legend: query flow, data flow.]

Running Example
- Relational database (inline loading + local order encoding) and view query: same RECORDLIST, PLAY and SONG tables and the same view query as on the Example slide above.
- User query (find second song of each play):
    FOR $record in document("record.xml")
    RETURN $record/PLAY/SONG[2]/text()
- Result: We Are 138, Shutdown

Order-sensitive XML Algebra Tree
- XSOT methodology:
  - An algebraic approach
  - XML Algebra Tree (XAT)
  - XAT operators
    - Select, CartesianProduct, ThetaJoin, LeftOuterJoin, Distinct, GroupBy, OrderBy
    - Source, Navigate, Combine, Tagger
  - XAT order extension
    - Position()
    - Range()
  - Composition of the view and user XAT

View Query XAT
View query:
  FOR $play IN document("dxv.xml")/PLAY/ROW
  ORDER BY $play/POSITION/text()
  RETURN $play/BAND_PCDATA
    FOR $song IN document("dxv.xml")/SONG/ROW[PID/text() = $play/IID/text()]
    ORDER BY $song/POSITION/text()
    RETURN $song/SONG_PCDATA/text()
[Figure: view XAT operator tree. Operators as labeled in the figure:]
  Combine $dataPlayTag
  Tagger $dataSongTag $dataPlayTag
  GroupBy $play
  Combine $dataSongTag
  Navigate $song, SONG_PCDATA/text() $sData
  Tagger $sData $dataSongTag
  Navigate $song, POSITION/text() $sPos
  OrderBy $sPos
  GroupBy $play
  Source "dxv.xml" $S
  Navigate $S, SONG/ROW $song
  Navigate $song, PID/text() $sPID
  ThetaJoin $pIID=$sPID
  Source "dxv.xml" $P
  Navigate $P, PLAY/ROW $play
  Navigate $play, POSITION/text() $pPos
  OrderBy $pPos
  Tagger $dataPlayTag $record
  Navigate $play, IID/text() $pIID

User Query XAT
User query:
  FOR $record in document("record.xml")
  RETURN $record/PLAY/SONG[2]/text()
[Figure: user XAT operator tree. Operators as labeled in the figure:]
  GroupBy $record, $uPlay
  Combine $uDataSongTag
  Tagger $uDataSongTag $result
  Tagger $uSongData $uDataSongTag
  Navigate $uRecord, PLAY $uPlay
  Navigate $uPlay, SONG $uSong
  Navigate $uSong, text() $uSongData
  Select $uNumPos=2
  Source "record.xml" $P
  POS $uSong $uNumPos

Composed XAT
[Figure: the user XAT placed on top of the view XAT, with the operators of the two preceding slides unchanged. The two trees are connected by binding the user query source $P to the view result $record ($P=$record).]

XAT Optimization – Order Explicit
- Why?
  - Order in user XAT depends on the implicit order in the view
  - It blocks further optimization: computation push down

XAT Optimization – Order Explicit
- View XAT (construct SONG, construct PLAY):
    Tagger $sData $dataSongTag
    Combine $dataSongTag
    GroupBy $play
    Tagger $dataSongTag $dataPlayTag
- User XAT (for each PLAY, sort SONGs, pick second song):
    POS $uSong $uNumPos
    Select $uNumPos=2
    GroupBy $record, $uPlay
- The user XAT depends on the view XAT: it cannot be pushed down and cannot be translated into SQL!

XAT Optimization – Order Explicit
- Goal: convert user query order
  - FROM: implicit order in the XML view
  - TO: explicit order-code column in relational encoding
- Rewrite: POS $uSong = POS $sPos
[Figure: before the rewrite, the User Portion XAT (POS $uSong $uNumPos; GroupBy $record, $uPlay) sits over the View Portion XAT (Navigate $song, POSITION/text() $sPos; OrderBy $sPos; GroupBy $play). After the rewrite, the User Portion XAT is POS $sPos $uNumPos; GroupBy $play, over the same View Portion XAT.]

SQL-oriented XAT optimization
- Goal:
  - Optimize XAT for efficient order-sensitive SQL generation
- Rules:
  - Computation push-down
    - Push as much as possible to RDB
  - Order pull-up
    - Sort as late as possible
    - Avoid re-sorting!
  - Order-step rewrite
    - Match RDB order template

Optimized XAT
[Figure: the composed XAT annotated with the three rewrites: computation push down, order pull up, and OrderStep rewrite. The annotated operators include Select $uNumPos=2 over POS $sPos $uNumPos, GroupBy $pPos, OrderBy $sPos, $pPos, and the new operator OrderStep [$pPos], [$pPos, $sPos] $uNumPos.]

Optimized XAT
[Figure: optimized XAT operator tree. Operators as labeled in the figure:]
  Combine $uDataSongTag
  Tagger $uDataSongTag $result
  Tagger $sData $uDataSongTag
  Select $uNumPos=2
  OrderStep [$pPos], [$pPos, $sPos] $uNumPos
  OrderBy $sPos, $pPos
  ThetaJoin $pIID=$sPID
  Source "dxv.xml" $S
  Navigate $S, SONG/ROW $song
  Navigate $song, PID/text() $sPID
  Navigate $song, POSITION/text() $sPos
  Navigate $song, SONG_PCDATA/text() $sData
  Source "dxv.xml" $P
  Navigate $P, PLAY/ROW $play
  Navigate $play, POSITION/text() $pPos
  Navigate $play, IID/text() $pIID

Order Template
- SQL-99 standard
- Oracle, DB2 ...
Template grammar:
  TEMPLATE:  SELECT row_number() over ( ? ) $pos_func_binding FROM +
  PARTITION: partition by
  ORDERBY:   order by |
  TONUMBER:  to_number( )
  ELEMENT:   element name
  TABLE:     table name | TEMPLATE

Order-sensitive SQL generation
- About push-down strategies
  - In general: push as much computation as possible into the relational engine.
  - In the order scenario: there is a tradeoff
- Deep push:
  - Push OrderStep into the relational engine
  - The relational engine has to support the order template (SQL99)

SQL Q5:
  SELECT Q2.sData
  FROM (SELECT Q1.pPos, Q1.sPos, Q1.sData,
               row_number() OVER (PARTITION BY Q1.pPos ORDER BY Q1.sPos) uNumPos
        FROM (SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
              FROM PLAY P, SONG S
              WHERE P.IID = S.PID) Q1) Q2
  WHERE Q2.uNumPos = 2
  ORDER BY Q2.pPos, Q2.sPos

[Figure: XAT remaining above SQL Q5, which outputs $sData: Tagger $sData $uDataSongTag; Combine $uDataSongTag; Tagger $uDataSongTag $result]
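To see the deep-push query run end to end, the sketch below loads the example PLAY and SONG rows into an in-memory SQLite database and executes Q5 as given on the slide. SQLite is used only because it ships with Python; it is not one of the engines named in the slides, and window functions such as `row_number()` require SQLite 3.25 or later. The column types and the variable names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Example tables from the running example (column types are assumed).
conn.executescript("""
CREATE TABLE PLAY (IID INTEGER, PID INTEGER, POSITION INTEGER, BAND_PCDATA TEXT);
CREATE TABLE SONG (IID INTEGER, PID INTEGER, POSITION INTEGER, SONG_PCDATA TEXT);
INSERT INTO PLAY VALUES (2, 1, 1, 'Misfits'),
                        (3, 1, 2, 'Back Street Boy'),
                        (4, 1, 3, 'Project X');
INSERT INTO SONG VALUES (5, 2, 1, 'She'),
                        (6, 3, 1, 'Bullet'),
                        (7, 3, 2, 'We Are 138'),
                        (8, 4, 1, 'SXE Revenge'),
                        (9, 4, 2, 'Shutdown');
""")

# Deep push: join, position numbering, selection and final sort all run in SQL.
Q5 = """
SELECT Q2.sData
FROM (SELECT Q1.pPos, Q1.sPos, Q1.sData,
             row_number() OVER (PARTITION BY Q1.pPos ORDER BY Q1.sPos) AS uNumPos
      FROM (SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
            FROM PLAY P, SONG S
            WHERE P.IID = S.PID) Q1) Q2
WHERE Q2.uNumPos = 2
ORDER BY Q2.pPos, Q2.sPos
"""
deep_push_result = [row[0] for row in conn.execute(Q5)]
print(deep_push_result)
```

Only the tagging operators remain outside the database in this variant; the engine numbers the songs within each play before filtering on position 2.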

Order-sensitive SQL generation
- Shallow push (otherwise)
  - Leave OrderStep outside the RDB
  - The relational engine is not required to support the order template (SQL99)

SQL Q1:
  SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
  FROM PLAY P, SONG S
  WHERE P.IID = S.PID

[Figure: XAT remaining above SQL Q1, which outputs $sData: OrderBy $sPos, $pPos; OrderStep [$pPos], [$pPos, $sPos] $uNumPos; Select $uNumPos=2; Tagger $sData $uDataSongTag; Combine $uDataSongTag; Tagger $uDataSongTag $result]
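The shallow-push variant can be sketched the same way: only the join (Q1) runs in the database, and the sort, the per-play position numbering and the selection happen in the middleware. In the sketch below those steps are plain Python; the sort key (play position, then song position) and all variable names are assumptions, and SQLite again stands in for the relational engine.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
# Example tables from the running example (column types are assumed).
conn.executescript("""
CREATE TABLE PLAY (IID INTEGER, PID INTEGER, POSITION INTEGER, BAND_PCDATA TEXT);
CREATE TABLE SONG (IID INTEGER, PID INTEGER, POSITION INTEGER, SONG_PCDATA TEXT);
INSERT INTO PLAY VALUES (2, 1, 1, 'Misfits'),
                        (3, 1, 2, 'Back Street Boy'),
                        (4, 1, 3, 'Project X');
INSERT INTO SONG VALUES (5, 2, 1, 'She'),
                        (6, 3, 1, 'Bullet'),
                        (7, 3, 2, 'We Are 138'),
                        (8, 4, 1, 'SXE Revenge'),
                        (9, 4, 2, 'Shutdown');
""")

# Shallow push: only the join is sent to the database; no window function needed.
Q1 = """
SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
FROM PLAY P, SONG S
WHERE P.IID = S.PID
"""
rows = conn.execute(Q1).fetchall()

# OrderBy outside the database: sort once by play position, then song position.
rows.sort(key=lambda r: (r[0], r[1]))

# OrderStep outside the database: number the songs within each play.
numbered = []
for _, group in groupby(rows, key=lambda r: r[0]):
    for u_num_pos, (p_pos, s_pos, s_data) in enumerate(group, start=1):
        numbered.append((p_pos, s_pos, s_data, u_num_pos))

# Select $uNumPos = 2.
shallow_push_result = [s_data for (_, _, s_data, n) in numbered if n == 2]
print(shallow_push_result)
```

Both variants return the same songs; they differ in where the sorting and numbering work is done, which is the tradeoff the slides go on to compare.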

Deep Push vs. Shallow Push
- Low selectivity: similar
- High selectivity: shallow push is better
- Repeated sorting in deep push is expensive!

Experimental Study
[Chart: SQL execution time, global vs. local order encoding]

Discussion: Further SQL optimization
- General SQL optimization can be applied ...
  - Cost-based SQL translation (SilkRoute)
  - Any other SQL optimization ...
- When the order encoding is known ...
  - SQL statements can be simplified by avoiding re-ordering
- When the relational database schema is known ...
  - Schema-specific SQL optimization [KKN2002]

Related Work
- XQuery-to-SQL translation systems: XPERANTO, SilkRoute, ...
- [TVB2002] I. Tatarinov, S. D. Viglas, K. Beyer, J. Shanmugasundaram, E. Shekita, and C. Zhang. Storing and Querying Ordered XML Using a Relational Database System. In SIGMOD, 2002.
  - Three order encoding methods are utilized
  - Algorithms of translating ordered XPath expressions into SQL
  - But ...
- [KKN2002] R. Krishnamurthy, R. Kaushik, and J. F. Naughton. Optimizing Fixed-Schema XML to SQL Query Translation. In VLDB, 2002.

Conclusion
- Propose a general framework for order-sensitive XQuery-to-SQL translation (XSOT)
- Propose order-sensitive XML Algebra Tree (XAT)
- SQL-oriented order-sensitive XAT optimization
- Efficient order SQL statements generation and optimization techniques
- Implementation using Rainbow query engine
- Experiments to verify the generality and SQL performance

Rainbow XML Management System
- Rainbow website:
- Software download
- Thank you!