Relational Databases for Querying XML Documents: Limitations & Opportunities. VLDB '99. Shanmugasundaram, J., Tufte, K., He, G., Zhang, C., DeWitt, D., Naughton, J.

Presentation transcript:

Relational Databases for Querying XML Documents: Limitations & Opportunities
VLDB '99
Shanmugasundaram, J., Tufte, K., He, G., Zhang, C., DeWitt, D., Naughton, J.
Presenter: Mani
Discussion: Vyas
Ref: Paper, Wiki & older slides

Motivations
– XML is a standard for representing data on the Internet
– XML describes data, whereas HTML describes how to display data
– What is the best way to process XML?
  – Make use of existing RDBMS techniques for XML processing (Plan “A”)
  – Reinvent the wheel (seriously?): throw away 20 years of RDBMS research and develop a semi-structured query language and query evaluation technique
– We are sticking with Plan “A” for now

What's in the paper?
– What in the world XML is, what its schema looks like, and the ways to query it
– Algorithms to translate a DTD (Document Type Descriptor) into relational format
– Translating XML queries into SQL queries and converting the results back to XML (duh!)

Approach
1. Process the DTD to generate a relational schema
2. Parse XML documents (conforming to the DTD) and load them into tuples of relational tables in an RDBMS
3. Translate semi-structured queries over XML into SQL queries over the corresponding data in the RDBMS
4. Convert the results back to XML
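The four steps above can be sketched end to end with Python's standard library. This is a minimal illustration, not the paper's algorithm: the schema is written by hand rather than derived from a DTD, and the table names, column names, sample document, and the helper names `load` and `lastnames_for_title` are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, loosely modeled on the article example.
XML_DOC = """
<article>
  <title>Sample title</title>
  <author id="a1"><name><firstname>Ann</firstname><lastname>Lee</lastname></name></author>
  <author id="a2"><name><firstname>Bob</firstname><lastname>Ray</lastname></name></author>
</article>
"""

def load(conn, xml_text):
    # Step 1 (hand-written here): a relational schema that a DTD-to-schema
    # algorithm might plausibly produce for this document.
    conn.execute("CREATE TABLE article (article_id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE author (author_id TEXT PRIMARY KEY, parent_id INTEGER, "
        "firstname TEXT, lastname TEXT)"
    )
    # Step 2: parse the XML and load it into tuples.
    root = ET.fromstring(xml_text)
    cur = conn.execute("INSERT INTO article (title) VALUES (?)", (root.findtext("title"),))
    article_id = cur.lastrowid
    for author in root.findall("author"):
        conn.execute(
            "INSERT INTO author VALUES (?, ?, ?, ?)",
            (author.get("id"), article_id,
             author.findtext("name/firstname"), author.findtext("name/lastname")),
        )

def lastnames_for_title(conn, title):
    # Step 3: the path "article with this title / author / lastname" as one SQL join.
    rows = conn.execute(
        "SELECT au.lastname FROM article ar "
        "JOIN author au ON au.parent_id = ar.article_id "
        "WHERE ar.title = ? ORDER BY au.author_id",
        (title,),
    ).fetchall()
    # Step 4: convert the flat result rows back to XML.
    result = ET.Element("result")
    for (lastname,) in rows:
        ET.SubElement(result, "lastname").text = lastname
    return ET.tostring(result, encoding="unicode")

conn = sqlite3.connect(":memory:")
load(conn, XML_DOC)
print(lastnames_for_title(conn, "Sample title"))
# prints <result><lastname>Lee</lastname><lastname>Ray</lastname></result>
```

Here `firstname` and `lastname` are stored as columns of `author` rather than in tables of their own, which is the kind of choice the inlining techniques make systematically.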

Discussion Question
Duration: 5 minutes (discuss in groups of 3 to 4 members)
1. If you were to build an XML database, which approach would you prefer, and why?
   a) Start with standard relational technology and try to remove its limitations, or
   b) Start with a semi-structured system and try to add the power and sophistication of current relational databases.
2. What do you think are the possible reasons or uses for providing the output as XML?

XML - Structure
[Figure: sample XML document annotated with its root element, its sub-elements, the attribute of the “author” sub-element, and the nested sub-elements of author]

DTD
– An article has one title, one contactauthor, and author sub-elements that may occur 0 or more times
– An address can contain an arbitrary XML fragment

XML - Query Languages
– Lorel
– XML-QL

Inlining Techniques to Store XML Documents
– Basic
– Shared
– Hybrid

Basic
– Create a relation for every element
[Figure labels: DTD for article (fig. 2); DTD graph; join between article and author]
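As a rough sketch of the rule on this slide, the snippet below builds a DTD graph from an edge list and names one relation per element. The edge list is a simplified, hypothetical DTD loosely modeled on the article example, not the paper's fig. 2, and `dtd_graph` and `basic_relations` are illustrative names.

```python
# Hypothetical DTD edges: (parent element, child element, quantifier).
DTD_EDGES = [
    ("book", "booktitle", "1"),
    ("book", "author", "1"),
    ("article", "title", "1"),
    ("article", "author", "*"),
    ("author", "name", "1"),
    ("author", "address", "1"),
    ("name", "firstname", "?"),
    ("name", "lastname", "1"),
]

def dtd_graph(edges):
    """Adjacency list: element -> list of child elements."""
    graph = {}
    for parent, child, _quantifier in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])  # leaf elements are nodes too
    return graph

def basic_relations(edges):
    """Basic, as stated on the slide: one relation per element in the DTD graph."""
    return sorted(dtd_graph(edges))

relations = basic_relations(DTD_EDGES)
print(len(relations), relations)
```

Even this nine-element DTD yields nine relations, which hints at how quickly the relation count grows with the size of the DTD.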

Basic
Pros
– Good for queries such as “list all authors of books”
Cons
– Not suitable for queries such as “list all authors with first name = xxxx”, because the filter and join have to happen across 5 relations (author, name, firstname, lastname, address); ref: fig. 10, last 5 relations
– A large number of relations is created, since every element gets its own relation

Shared
Create relations only when
– Nodes have in-degree > 1
– Nodes have in-degree = 0
– Nodes are mutually recursive: find the strongly connected components to determine which node gets its own relation
Inline a node into its parent's relation when
– The node has in-degree = 1
Sample: fig. 11
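The in-degree rules above can be sketched as a small function over a DTD edge list. This is only an illustration under stated assumptions: the edge list is a hypothetical, non-recursive DTD, so the strongly-connected-components case is not handled, and `shared_plan` is an invented name.

```python
from collections import Counter

# Hypothetical, non-recursive DTD edges: (parent element, child element, quantifier).
DTD_EDGES = [
    ("book", "booktitle", "1"),
    ("book", "author", "1"),
    ("article", "title", "1"),
    ("article", "author", "*"),
    ("author", "name", "1"),
    ("author", "address", "1"),
    ("name", "firstname", "?"),
    ("name", "lastname", "1"),
]

def shared_plan(edges):
    """Split elements into (own relation, inlined) using only the in-degree rules."""
    nodes = {n for parent, child, _ in edges for n in (parent, child)}
    in_degree = Counter(child for _, child, _ in edges)  # missing keys count as 0
    relations = sorted(n for n in nodes if in_degree[n] == 0 or in_degree[n] > 1)
    inlined = sorted(n for n in nodes if in_degree[n] == 1)
    return relations, inlined

relations, inlined = shared_plan(DTD_EDGES)
print(relations)  # ['article', 'author', 'book']
print(inlined)    # ['address', 'booktitle', 'firstname', 'lastname', 'name', 'title']
```

In this example `author` keeps its own relation because both `book` and `article` point to it, while `name`, `firstname`, and `lastname` become columns of the relation they are reached from.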

Shared
Pros
– The number of relations is reduced
– Inlining reduces the number of joins
Cons
– Can perform worse than Basic for some queries

Hybrid
– Same as Shared, except that nodes with in-degree > 1 are also inlined, provided they are not recursive and are not reached through a “*” node
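A minimal sketch of the Hybrid rule, assuming the DTD graph is given as an edge list in which the “*” operator is encoded as a quantifier on the edge (an assumption of this example; the slide speaks of a “*” node). It covers only the rule stated on this slide, and the edge lists and function names are hypothetical.

```python
from collections import Counter

def reachable(graph, start):
    """All nodes reachable from start by following one or more edges."""
    seen, stack = set(), list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen

def hybrid_relations(edges):
    """Elements that still get their own relation under the Hybrid rule."""
    graph = {}
    for parent, child, _ in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])
    in_degree = Counter(child for _, child, _ in edges)
    starred = {child for _, child, quantifier in edges if quantifier == "*"}
    relations = []
    for node in sorted(graph):
        recursive = node in reachable(graph, node)  # node lies on a cycle
        if in_degree[node] == 0:
            relations.append(node)
        elif in_degree[node] > 1 and (recursive or node in starred):
            relations.append(node)  # shared node that Hybrid may not inline
    return relations

# author is shared by book and monograph, but is neither recursive nor starred.
edges_a = [("book", "author", "1"), ("monograph", "author", "1"), ("author", "name", "1")]
print(hybrid_relations(edges_a))  # ['book', 'monograph']

# Adding article -> author* makes author reachable through "*", so it keeps a relation.
edges_b = edges_a + [("article", "author", "*")]
print(hybrid_relations(edges_b))  # ['article', 'author', 'book', 'monograph']
```

In the first case `author` is copied into both `book` and `monograph`, which saves a join per path but means a query over all authors has to look in more than one relation.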

Evaluation
Efficiency of query processing:
– Basic ran out of virtual memory because of the number of relations, so it is not considered in the comparison
– For some DTDs, Hybrid eliminates a large number of joins per query (possibly because of recursion, as with the author relation)
– For some DTDs, Shared needs fewer SQL queries than Hybrid (a relation such as title can be queried directly, without joins)
– Shared produced the same number of joins per SQL query as Hybrid for path expressions (as with the author relation)

Discussion Question
Duration: 3 minutes (discuss with the person sitting next to you)
To evaluate the techniques, the paper uses the average number of SQL joins needed to process path expressions of a given length, multiplied by the average number of SQL queries per XML query. Do you think this is a valid metric? Why or why not?
Metric: total average number of joins = average number of SQL queries × average number of joins in each SQL query
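To make the metric concrete for the discussion, here is a small worked computation. The workload numbers are invented, and the way the two averages are taken (SQL queries averaged over XML queries, joins averaged over all SQL queries) is an assumption of this sketch, not necessarily the paper's exact procedure.

```python
def total_average_joins(workload):
    """workload: one list per XML query, holding the join count of each SQL
    query that the XML query translates to."""
    sql_counts = [len(sql_queries) for sql_queries in workload]
    total_joins = sum(sum(sql_queries) for sql_queries in workload)
    avg_sql_per_xml = sum(sql_counts) / len(workload)
    avg_joins_per_sql = total_joins / sum(sql_counts)
    return avg_sql_per_xml * avg_joins_per_sql

# Hypothetical workload: three XML queries translating to 2, 1, and 3 SQL queries.
workload = [[2, 3], [1], [0, 4, 2]]
print(total_average_joins(workload))  # 4.0
```

Under this way of averaging, the product is simply the total number of joins divided by the number of XML queries (12 / 3 here), so it treats many small SQL queries and one join-heavy SQL query as equally costly. That may be a useful starting point when judging whether the metric is valid.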

Conclusion
– Reuses mature relational concepts instead of reinventing the wheel
– Simple XML queries require either too many SQL queries or a few SQL queries with many joins
– Extensions to relational systems, such as support for sets and information-retrieval-style indices (which help index over ANY fields), could let them handle XML query workloads effectively