Lecture 24 15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy Lecture 24 – Part 2 XML Query Processing Phil Gibbons April.

Presentation transcript:

Lecture 24: 15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy
Lecture 24, Part 2: XML Query Processing
Phil Gibbons, April 15, 2003

XML Query Processing: Outline
- XML vs. Relational
- XML on Relational DB: Shanmugasundaram et al., "Relational Databases for Querying XML Documents: Limitations and Opportunities", VLDB'99, plus follow-on papers
- LegoDB, STORED, Edge (2 slides)
- To be continued in my next lecture…

XML vs. SQL for Sensor Databases
- IrisNet represents data in XML (semi-structured model)
  - Hierarchical documents, queries in XPath
- TinyDB represents data in the relational model
  - Tables, queries in SQL
- What are the pros and cons of each approach?
- How does it depend on the sensing context?

Why IrisNet Uses XML
- Rich, heterogeneous data
  - Hard to capture in a rigid data model
  - Self-describing tags are useful
- Schema evolution
  - XML supports on-the-fly schema changes
- Wide-area sensing => hierarchical organization
  - Good match for XML, bad for relational
- Standard data exchange format

Disadvantages of XML
- Query languages are lacking
  - Some minimal features are missing, e.g., aggregates, updates
  - Query processors are not available for XQuery
- Query processing is SLOW
- Key research question: Can we store XML in a relational DB, and use a relational database system to process queries?

Why Use Relational DB Systems?
- Highly reliable, scalable, optimized for performance, advanced functionality
  - Result of 30+ years of research and development
  - XML database systems are not "industrial strength" … and not expected to be in the foreseeable future
- Existing data and applications
  - XML applications have to interoperate with existing relational data and applications
  - Not enough incentive to move all existing business applications to XML database systems
- Lessons from object-oriented database systems?
(Adapted from slides © Jayavel Shanmugasundaram)


Storing and Querying XML Documents
[Figure: an XML Translation Layer sits on top of a Relational Database System and keeps Translation Information. It maps:]
- XML Schema -> Relational Schema
- XML Documents -> Tuples
- XML Query -> SQL Query
- Relational Result -> XML Result
(Adapted from slides © Jayavel Shanmugasundaram)

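Relational Data

PurchaseOrder
Id    Customer    Day   Month  Year
200I  Cars R Us   10    June   1999
300I  Bikes R Us  null  July   1999

Payment
Pid   Installment  Percentage
200I  ?            40%
200I  ?            60%
300I  ?            100%

Item
Pid   Name            Cost  Quantity
200I  Firestone Tire  ?     ?
200I  Goodyear Tire   ?     ?
300I  Trek Tire       ?     ?
300I  Schwinn Tire    ?     ?

(? = numeric value not recoverable from the slide; a value of 20 survives in the Trek Tire row, but its column is unclear.)
(Adapted from slides © Jayavel Shanmugasundaram)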

SQL Query
Find all the items bought by "Cars R Us" in the year 1999:

    SELECT it.name
    FROM PurchaseOrder po, Item it
    WHERE po.customer = 'Cars R Us'
      AND po.year = 1999
      AND po.id = it.pid

- Predicates: po.customer = 'Cars R Us' and po.year = 1999
- Join: po.id = it.pid
(The query runs over the PurchaseOrder, Payment, and Item tables from the Relational Data slide.)
(Adapted from slides © Jayavel Shanmugasundaram)
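The query can be tried against an in-memory SQLite database. The following is a minimal sketch, not part of the original slides: the rows follow the Relational Data slide where they survive, the Cost and Quantity columns are left NULL because their values were lost, and the helper name `items_bought` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PurchaseOrder (id TEXT PRIMARY KEY, customer TEXT,
                                day INTEGER, month TEXT, year INTEGER);
    CREATE TABLE Item (pid TEXT, name TEXT, cost REAL, quantity INTEGER);
""")
conn.executemany("INSERT INTO PurchaseOrder VALUES (?, ?, ?, ?, ?)", [
    ("200I", "Cars R Us", 10, "June", 1999),
    ("300I", "Bikes R Us", None, "July", 1999),
])
# Cost and Quantity are NULL: the slide's values did not survive extraction.
conn.executemany("INSERT INTO Item VALUES (?, ?, NULL, NULL)", [
    ("200I", "Firestone Tire"),
    ("200I", "Goodyear Tire"),
    ("300I", "Trek Tire"),
    ("300I", "Schwinn Tire"),
])


def items_bought(conn, customer, year):
    """Names of the items bought by `customer` in `year`, in insertion order."""
    rows = conn.execute(
        """
        SELECT it.name
        FROM PurchaseOrder po, Item it
        WHERE po.customer = ?      -- predicate
          AND po.year = ?          -- predicate
          AND po.id = it.pid       -- join
        ORDER BY it.rowid
        """,
        (customer, year),
    )
    return [name for (name,) in rows]


print(items_bought(conn, "Cars R Us", 1999))
```

With flat tables, the whole query needs a single join between PurchaseOrder and Item.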

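XML Document
[XML document not recovered: its markup was lost in extraction. Surviving values include 10, June, and 60%.]
Features highlighted on the slide:
- Nested structure
- Self-describing tags
- Nested sets
- Order
(Adapted from slides © Jayavel Shanmugasundaram)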
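For orientation, here is a hedged sketch of what a purchase-order document with these four features might look like. It is a hypothetical reconstruction, not the slide's document: element and attribute names are taken from the Naïve Approach and Relational Schema Generation slides, and the Item elements are omitted because their contents were lost.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names follow later slides, Items are left out.
DOC = """
<PurchaseOrder id="200I" customer="Cars R Us">
  <Date><Day>10</Day><Month>June</Month><Year>1999</Year></Date>
  <Payment>40%</Payment>
  <Payment>60%</Payment>
</PurchaseOrder>
"""

root = ET.fromstring(DOC.strip())

# Nested structure: Year sits two levels below the root.
year = root.findtext("Date/Year")

# Nested set with order: findall returns the payments in document order.
payments = [p.text for p in root.findall("Payment")]

# Self-describing tags: every value is labelled by its element or attribute name.
print(root.tag, root.attrib["customer"], year, payments)
```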

XML Schema
  PurchaseOrder -> Date (Item)* (Payment)*
  Date -> Day? Month Year
  Day -> {integer}
  Month -> {string}
  Year -> {integer}
  Item -> Quantity
  … and so on
(Adapted from slides © Jayavel Shanmugasundaram)

Schemas to Relations: Issues
- Complex schema specifications
- Two-level nature of relational schema (tuples and attributes) vs. arbitrary nesting of XML Schema
- Recursion
(Adapted from slides © Jayavel Shanmugasundaram)

Naïve Approach
Store the document tree directly (element nodes and attribute nodes):

PurchaseOrder (element)
  Id (200I) (attribute)
  Customer (Cars R Us) (attribute)
  Date (element)
    Day (10)
    Month (June)
    Year (1999)
  Item (element)
  Payment (40%)
  …
(Adapted from slides © Jayavel Shanmugasundaram)

Naïve Approach (Contd.)

Edges
Id  ParentId  Type       Name           Value      Ordinal
0   null      Element    PurchaseOrder  null       null
1   0         Attribute  Id             200I       0
2   0         Attribute  Customer       Cars R Us  1
3   0         Element    Date           null       2
4   3         Element    Day            10         0
5   3         Element    Month          June       1
6   3         Element    Year           1999       2
…   …         …          …              …          …

Problem: many joins for queries (one per hop), e.g., PurchaseOrder/Date/Year
(Adapted from slides © Jayavel Shanmugasundaram)
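The "one join per hop" cost can be seen by translating a path expression into SQL over the Edges table. This is a sketch under assumptions: the rows mirror the Edges table above, `path_query` is a hypothetical helper, and SQLite stands in for whatever relational system is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Edges (
    id INTEGER PRIMARY KEY, parentid INTEGER, type TEXT,
    name TEXT, value TEXT, ordinal INTEGER)""")
conn.executemany("INSERT INTO Edges VALUES (?, ?, ?, ?, ?, ?)", [
    (0, None, "Element", "PurchaseOrder", None, None),
    (1, 0, "Attribute", "Id", "200I", 0),
    (2, 0, "Attribute", "Customer", "Cars R Us", 1),
    (3, 0, "Element", "Date", None, 2),
    (4, 3, "Element", "Day", "10", 0),
    (5, 3, "Element", "Month", "June", 1),
    (6, 3, "Element", "Year", "1999", 2),
])


def path_query(steps):
    """Build SQL for /step1/step2/...: one alias of Edges per path step."""
    froms = [f"Edges e{i}" for i in range(len(steps))]
    conds = ["e0.parentid IS NULL"]          # the first step must be the root
    for i in range(len(steps)):
        conds.append(f"e{i}.name = ?")
        if i > 0:
            # One self-join per hop from parent to child.
            conds.append(f"e{i}.parentid = e{i - 1}.id")
    last = len(steps) - 1
    return (f"SELECT e{last}.value FROM {', '.join(froms)} "
            f"WHERE {' AND '.join(conds)}")


steps = ["PurchaseOrder", "Date", "Year"]
sql = path_query(steps)
values = [v for (v,) in conn.execute(sql, steps)]
print(sql)
print(values)
```

A path of three steps references Edges three times, so the join count grows with the path length regardless of how simple the schema is.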

Desired Properties of Generated Relational Schema R
- All XML documents conforming to the XML schema should be "mappable" to tuples in R
- All queries over XML documents should be "mappable" to SQL queries over R
- Not required: ability to re-generate the XML schema from R
(Adapted from slides © Jayavel Shanmugasundaram)

XML Schema: Further Examples
  PurchaseOrder -> Date? (Item | Payment)*
  PurchaseOrder -> (Date | Payment*) (Item (Item Item)* Payment)*
  PurchaseOrder -> Date Item (PurchaseOrder)* Payment
(Adapted from slides © Jayavel Shanmugasundaram)

Simplifying XML Schemas
- XML schemas can be "simplified" for translation purposes
- Without undermining storage and query functionality

  PurchaseOrder -> (Date | (Payment)*) (Item (Item Item)* Payment)*
simplifies to
  PurchaseOrder -> Date? (Item)* (Payment)*
(Adapted from slides © Jayavel Shanmugasundaram)

Simplification Desiderata
Simplify structure, but preserve the differences that matter in the relational model:
- Single occurrence (attribute)
- Zero or one occurrences (nullable attribute)
- Zero or more occurrences (relation)

  PurchaseOrder -> (Date | (Payment)*) (Item (Item Item)* Payment)*
simplifies to
  PurchaseOrder -> Date? (Item)* (Payment)*
(Adapted from slides © Jayavel Shanmugasundaram)

Simplification Rules
- Flattening transformations
    (e1 e2)* -> e1* e2*
    (e1 e2)? -> e1? e2?
    (e1 | e2) -> e1? e2?
- Simplification transformations
    e** -> e*
    e*? -> e*
    e?* -> e*
    e?? -> e?
- Grouping transformations
    e1* e2* e1* -> e1* e2*
    … etc.
- e+ -> e*
- What is lost?
(Adapted from slides © Jayavel Shanmugasundaram)
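The combined effect of these rules is that each sub-element ends up with one of three cardinalities: exactly one, `?`, or `*`. The sketch below computes that result directly rather than rewriting step by step. The nested-tuple encoding and the names `simplify` and `render` are assumptions made for illustration, and treating any element that occurs twice in a sequence as `*` is one reading of the "… etc." grouping rules.

```python
# Content models as nested tuples (a hypothetical encoding):
#   "Date"                 an element name
#   ("seq", m1, m2, ...)   m1 m2 ...
#   ("alt", m1, m2, ...)   m1 | m2 | ...
#   ("opt", m)             m?
#   ("star", m)            m*
#   ("plus", m)            m+


def optional(card):
    """Apply '?' to a cardinality: 1 -> ?, ? -> ?, * -> * (e*? -> e*)."""
    return "*" if card == "*" else "?"


def simplify(model):
    """Return {element name: '1' | '?' | '*'} in order of first appearance."""
    if isinstance(model, str):
        return {model: "1"}
    kind, *parts = model
    if kind in ("star", "plus"):
        # (e1 e2)* -> e1* e2*, e?* -> e*, e** -> e*, e+ -> e*
        return {name: "*" for name in simplify(parts[0])}
    if kind == "opt":
        # (e1 e2)? -> e1? e2?, e?? -> e?
        return {name: optional(card) for name, card in simplify(parts[0]).items()}
    result = {}
    for part in parts:                      # kind is "seq" or "alt"
        for name, card in simplify(part).items():
            if kind == "alt":
                card = optional(card)       # (e1 | e2) -> e1? e2?
            # Grouping: a name that shows up again collapses to name*.
            result[name] = "*" if name in result else card
    return result


def render(cards):
    """Format a cardinality map as a flat content model string."""
    return " ".join(name + ("" if card == "1" else card)
                    for name, card in cards.items())


# (Date | Payment*) (Item (Item Item)* Payment)*
model = ("seq",
         ("alt", "Date", ("star", "Payment")),
         ("star", ("seq", "Item", ("star", ("seq", "Item", "Item")), "Payment")))
print(render(simplify(model)))
```

For the example model this yields `Date? Payment* Item*`, the same cardinalities as the simplified production `Date? (Item)* (Payment)*`. The relative order of Item and Payment and the exact occurrence counts are gone, which is one answer to "What is lost?".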

Result: Translation Normal Form
An XML schema production is either of the form:
  P -> {type}
… or of the form:
  P -> a_1 … a_p a_{p+1}? … a_q? a_{q+1}* … a_r*
where a_i ≠ a_j for i ≠ j
(Adapted from slides © Jayavel Shanmugasundaram)

Simplified XML Schema
  PurchaseOrder -> Date (Item)* (Payment)*
  Date -> Day? Month Year
  Day -> {integer}
  Month -> {string}
  Year -> {integer}
  Item -> Quantity
  … and so on
(Adapted from slides © Jayavel Shanmugasundaram)

Relational Schema Generation
[Figure: schema graph with edges labeled by cardinality]
  PurchaseOrder (id, customer)
    Date (1)
      Day (?)
      Month (1)
      Year (1)
    Item (name, cost) (*)
      Quantity (1)
    Payment (*)
- Minimize: number of joins for simple path expressions (of the form /a/b/c)
- Satisfy: tables are normalized
(Adapted from slides © Jayavel Shanmugasundaram)

Generated Relational Schema and Shredded XML Document

PurchaseOrder
Id    Customer   Day  Month  Year
200I  Cars R Us  10   June   1999

Payment
Pid   Order  Value
200I  ?      40%
200I  ?      60%

Item
Pid   Order  Name            Cost  Quantity
200I  ?      Firestone Tire  ?     ?
200I  ?      Goodyear Tire   ?     ?

(? = numeric value not recoverable from the slide; the values 1 and 3 survive, but their position is unclear.)
(Adapted from slides © Jayavel Shanmugasundaram)
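Shredding a document into these three tables might be sketched as follows. Everything here is illustrative: the input document is a hypothetical one (the interleaving of Items and Payments and the Quantity values 1 and 2 are placeholders), the function name `shred` is made up, and the Order column is assumed to hold each child's position under its PurchaseOrder.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: child ordering and Quantity values are placeholders.
DOC = """
<PurchaseOrder id="200I" customer="Cars R Us">
  <Date><Day>10</Day><Month>June</Month><Year>1999</Year></Date>
  <Item name="Firestone Tire"><Quantity>1</Quantity></Item>
  <Payment>40%</Payment>
  <Item name="Goodyear Tire"><Quantity>2</Quantity></Item>
  <Payment>60%</Payment>
</PurchaseOrder>
"""


def shred(root):
    """Split one PurchaseOrder element into rows for the three tables."""
    pid = root.get("id")
    date = root.find("Date")
    day = date.findtext("Day")      # Day? is optional, so the column is nullable
    # Date, Day, Month, and Year occur at most once: inlined into PurchaseOrder.
    purchase_order = (pid, root.get("customer"),
                      int(day) if day is not None else None,
                      date.findtext("Month"), int(date.findtext("Year")))
    items, payments = [], []
    # Set-valued children get their own tables; `order` keeps document order.
    for order, child in enumerate(root):
        if child.tag == "Item":
            items.append((pid, order, child.get("name"), child.get("cost"),
                          int(child.findtext("Quantity"))))
        elif child.tag == "Payment":
            payments.append((pid, order, child.text))
    return purchase_order, items, payments


po, items, payments = shred(ET.fromstring(DOC.strip()))
print(po)
print(items)
print(payments)
```

Because Date is inlined, a path such as PurchaseOrder/Date/Year reads a single column of PurchaseOrder and needs no join. Only the set-valued Item and Payment children require a join on Pid.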

Example Schema Graph
Not just a tree
[Figure not recovered.]
(Adapted from slides © Jayavel Shanmugasundaram)

Shared Inlining Technique
- Thus far, works well for trees only
- Intuition: inline as many sub-elements as possible
  - Do not inline only if it is a shared, recursive, or set sub-element
- Technique: necessary and sufficient condition for a shared/recursive element:
  - In-degree >= 2 in the (simplified) schema graph
(Adapted from slides © Jayavel Shanmugasundaram)
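As a rough sketch of the rule as stated on this slide, the function below decides which elements get their own relation: the root (in-degree 0), any set sub-element (reached through a `*` edge), and any element with in-degree >= 2. The dictionary encoding of the schema graph, the function name `relations`, and the Book/Article/Author example are assumptions for illustration; the published algorithm has further details that are not modeled here.

```python
from collections import Counter

# Simplified schema graph: parent -> [(child, cardinality)], with
# cardinality '1', '?' or '*' (a hypothetical encoding of the graph).
SCHEMA = {
    "PurchaseOrder": [("Date", "1"), ("Item", "*"), ("Payment", "*")],
    "Date": [("Day", "?"), ("Month", "1"), ("Year", "1")],
    "Item": [("Quantity", "1")],
}


def relations(schema):
    """Elements that are not inlined, following the slide's rule."""
    edges = [(child, card) for children in schema.values()
             for child, card in children]
    indegree = Counter(child for child, _ in edges)
    set_valued = {child for child, card in edges if card == "*"}
    nodes = set(schema) | set(indegree)
    # Counter returns 0 for nodes that never appear as a child (the roots).
    return {n for n in nodes
            if indegree[n] == 0 or indegree[n] >= 2 or n in set_valued}


print(sorted(relations(SCHEMA)))

# A shared element: Author is a child of both Book and Article (made-up names).
SHARED = {"Book": [("Author", "1")], "Article": [("Author", "1")]}
print(sorted(relations(SHARED)))
```

For the purchase-order schema this selects PurchaseOrder, Item, and Payment, which are the three tables on the Generated Relational Schema slide. Everything else is inlined as a column.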

Relational Schema Generation and XML Document Shredding
- Any XML Schema X can be mapped to a relational schema R, and …
- Any XML document XD conforming to X can be converted to tuples in R
- Further, XD can be recovered from the tuples in R
- What do you think of the approach, for IrisNet?
- Exercise: What would the Parking Space Finder relational schema look like? Would there be many or few joins in queries?
(Adapted from slides © Jayavel Shanmugasundaram)

Path Expression with Length 3
[Figure not recovered.]
(Adapted from slides © Jayavel Shanmugasundaram)

Varying Path Expression Length
[Figure not recovered. Panel titles: Group 1 DTD, Group 3 DTD.]
(Adapted from slides © Jayavel Shanmugasundaram)

Storing and Querying XML Documents
[Figure repeated from earlier: the XML Translation Layer on top of a Relational Database System, mapping XML Schema, XML Documents, and XML Query to Relational Schema, Tuples, and SQL Query, and the Relational Result back to an XML Result, using Translation Information.]
(Adapted from slides © Jayavel Shanmugasundaram)

XPERANTO
- XML view over tables to reconstruct shredded XML documents
- XPERANTO: query processor for XML views of relational data
[Figure: an XML document repository built on a Relational Database System (Table 1 … Table n). Components: XML Document Shredder, Relational Schema Generator, Relational Schema Information. Repository operations and their relational counterparts:]
- Create XML Document Repository -> Create tables
- Store XML Documents -> Store rows in tables
- Query over Stored XML Documents -> Query over tables


LegoDB [Bohannon et al., ICDE'02]
- An optimization approach:
  - Automatically explores a space of possible mappings
  - Selects the mapping that has the lowest cost for a given application
- Important features:
  - Application-driven: takes into account schema, data statistics, and query workload
  - Logical/physical independence: interface is XML-based (XML Schema, XQuery, XML data statistics)
  - Leverages existing technology: XML standards; XML-specific operations for generating the space of mappings; relational optimizer for evaluating configurations
(Adapted from slides © Juliana Freire)

But What If There's No Schema?
- Revert to one row per edge [Florescu, Kossmann, Data Engineering Bulletin, 1999]
- STORED [Deutsch, Fernandez, Suciu, SIGMOD'99]
  - Looks at data, finds highly supported patterns for tables

Id  ParentId  Type       Name           Value      Ordinal
0   null      Element    PurchaseOrder  null       null
1   0         Attribute  Id             200I       0
2   0         Attribute  Customer       Cars R Us  1
3   0         Element    Date           null       2
4   3         Element    Day            10         0
5   3         Element    Month          June       1
6   3         Element    Year           1999       2
…   …         …          …              …          …
(Adapted from slides © Jayavel Shanmugasundaram)

XML Query Processing: Outline
- XML vs. Relational
- XML on Relational DB: Shanmugasundaram et al., "Relational Databases for Querying XML Documents: Limitations and Opportunities", VLDB'99, plus follow-on papers
- LegoDB, STORED, Edge (2 slides)
- To be continued… (Thurs): Updates, Native XML DBMS
- Also in next lecture: Historical queries