1 Lecture 12: XML Publishing, XML Storage Monday, October 24, 2005.

Slides:



Advertisements
Similar presentations
XML to Relational Database Mapping
Advertisements

XML May 3 rd, XQuery Based on Quilt (which is based on XML-QL) Check out the W3C web site for the latest. XML Query data model –Ordered !
CSE 6331 © Leonidas Fegaras XML and Relational Databases 1 XML and Relational Databases Leonidas Fegaras.
Relational Databases for Querying XML Documents: Limitations & Opportunities VLDB`99 Shanmugasundaram, J., Tufte, K., He, G., Zhang, C., DeWitt, D., Naughton,
Agenda from now on Done: SQL, views, transactions, conceptual modeling, E/R, relational algebra. Starting: XML To do: the database engine: –Storage –Query.
Querying XML (cont.). Comments on XPath? What’s good about it? What can’t it do that you want it to do? How does it compare, say, to SQL?
1 Lecture 10 XML Wednesday, October 18, XML Outline XML (4.6, 4.7) –Syntax –Semistructured data –DTDs.
1 Lecture 12: XQuery in SQL Server Monday, October 23, 2006.
1 Lecture 9: XQuery. 2 XQuery Motivation XPath expressivity insufficient –no join queries (as in SQL) –no changes to the XML structure possible –no quantifiers.
1 Managing XML and Semistructured Data Part 1: Preliminaries, Motivation and Overview Acknowledgement: Part of the materials in this set of XML slides.
1 COS 425: Database and Information Management Systems XML and information exchange.
Database Systems and XML David Wu CS 632 April 23, 2001.
Storing and Querying Ordered XML Using a Relational Database System By Khang Nguyen Based on the paper of Igor Tatarinov and Statis Viglas.
1 Lecture 4 XML/Xpath/XQuery Tuesday, January 30, 2007.
XML and Databases 198:541. XML Motivation  Huge amounts of unstructured data on the web: HTML documents  No structure information  Only format instructions.
End of SQL XML April 22 th, Null Values If x=Null then 4*(3-x)/7 is still NULL If x=Null then x=“Joe” is UNKNOWN Three boolean values: –FALSE =
1 Advanced Topics XML and Databases. 2 XML u Overview u Structure of XML Data –XML Document Type Definition DTD –Namespaces –XML Schema u Query and Transformation.
XQuery Implementation in a Relational Database System Shankar Pal Istvan Cseri, Oliver Seeliger, Michael Rys, Gideon Schaller, Wei Yu, Dragan Tomic, Adrian.
Indexing XML Data Stored in a Relational Database VLDB`2004 Shankar Pal, Istvan Cseri, Gideon Schaller, Oliver Seeliger, Leo Giakoumakis, Vasili Vasili.
Viewing relational data as XML Using Microsoft SQL Server.
4/20/2017.
8/17/20151 Querying XML Database Using Relational Database System Rucha Patel MS CS (Spring 2008) Advanced Database Systems CSc 8712 Instructor : Dr. Yingshu.
XML-to-Relational Schema Mapping Algorithm ODTDMap Speaker: Artem Chebotko* Wayne State University Joint work with Mustafa Atay,
Lecture 7 of Advanced Databases XML Querying & Transformation Instructor: Mr.Ahmed Al Astal.
Maziar Sanaii Ashtiani – SCT – EMU, Fall 2011/12.
Lecture 6 of Advanced Databases XML Querying & Transformation Instructor: Mr.Eyad Almassri.
Online Autonomous Citation Management for CiteSeer CSE598B Course Project By Huajing Li.
Sofia, Bulgaria | 9-10 October Using XQuery to Query and Manipulate XML Data Stephen Forte CTO, Corzen Inc Microsoft Regional Director NY/NJ (USA) Stephen.
CSCE 520- Relational Data Model Lecture 2. Relational Data Model The following slides are reused by the permission of the author, J. Ullman, from the.
XML as a Boxwood Data Structure Feng Zhou, John MacCormick, Lidong Zhou, Nick Murphy, Chandu Thekkath 8/20/04.
Company LOGO OODB and XML Database Management Systems – Fall 2012 Matthew Moccaro.
End of XML February 19 th, FLWR (“Flower”) Expressions FOR... LET... WHERE... RETURN... FOR... LET... WHERE... RETURN...
Chapter 27 The World Wide Web and XML. Copyright © 2004 Pearson Addison-Wesley. All rights reserved.27-2 Topics in this Chapter The Web and the Internet.
Physical DB Issues, Indexes, Query Optimisation Database Systems Lecture 13 Natasha Alechina.
Chapter 2 Adapted from Silberschatz, et al. CHECK SLIDE 16.
PHP and MySQL CS How Web Site Architectures Work  User’s browser sends HTTP request.  The request may be a form where the action is to call PHP.
1 What Is XML? eXtensible Markup Language for data –Standard for publishing and interchange –“Cleaner” SGML for the Internet Applications: –Data exchange.
Module 18 Querying XML Data in SQL Server® 2008 R2.
5/2/20051 XML Data Management Yaw-Huei Chen Department of Computer Science and Information Engineering National Chiayi University.
Lecture 5: XML Tuesday, January 16, Outline XML, DTDs (Data on the Web, 3.1) Semistructured data in XML (3.2) Exporting Relational Data in XML (8.3.1)
Lecture A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy Lecture 24 – Part 2 XML Query Processing Phil Gibbons April.
Chapter 27 The World Wide Web and XML. Copyright © 2004 Pearson Addison-Wesley. All rights reserved.27-2 Topics in this Chapter The Web and the Internet.
More XML: semantics, DTDs, XPATH February 18, 2004.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Mapping RDB Schema to.
XML e X tensible M arkup L anguage (XML) By: Albert Beng Kiat Tan Ayzer Mungan Edwin Hendriadi.
Submitted To: Ms. Poonam Saini, Asst. Prof., NITTTR Submitted By: Rohit Handa ME (Modular) CSE 2011 Batch.
CSCE 520- Relational Data Model Lecture 2. Oracle login Login from the linux lab or ssh to one of the linux servers using your cse username and password.
+ 1 XML eXtensible Markup Language. + 2 XML Lecture Adapted from the work of Dr. Praveen Madiraju of Marquette University.
Computing & Information Sciences Kansas State University Friday, 20 Oct 2006CIS 560: Database System Concepts Lecture 24 of 42 Friday, 20 October 2006.
XML May 6th, Instructor AnHai Doan Brief bio –high school in Vietnam & undergrad in Hungary –M.S. at Wisconsin –Ph.D. at Washington under Alon &
1 CSE544 XML, XQuery Wednesday, April 5, Announcements Project Ideas are posted (to discuss in class) Groups due today Proposals due Monday Two.
Module 3: Using XML. Overview Retrieving XML by Using FOR XML Shredding XML by Using OPENXML Introducing XQuery Using the xml Data Type.
1 Storing and Maintaining Semistructured Data Efficiently in an Object- Relational Database Mo Yuanying and Ling Tok Wang.
1 CSE544 XML, XQuery Monday+Wednesday, April 11+13, 2004.
XML to Relational Database Mapping
XML: Extensible Markup Language
Lecture 15: Midterm Review
Lecture 11 XML Wednesday, Oct. 24, 2001.
Introduction to Database Systems CSE 444 Lecture 12 More Xquery and Xquery in SQL Server April 25, 2008.
2/18/2019.
Lecture 9: XML Monday, October 17, 2005.
Wednesday, May 29, 2002 XML Storage Final Review
Lecture 8: XML Data Wednesday, October
Wednesday, May 22, 2002 XML Publishing, Storage
Introduction to Database Systems CSE 444 Lecture 10 XML
Lecture 12: XQuery in SQL Server
Introduction to Database Systems CSE 444 Lecture 12 Xquery in SQL Server October 22, 2007.
Lecture 11: XML and Semistructured Data
Lecture 14: XML Publishing & Storage Midterm Review
Lecture 13: XQuery XML Publishing, XML Storage
Presentation transcript:

1 Lecture 12: XML Publishing, XML Storage Monday, October 24, 2005

2 XML from/to Relational Data XML publishing: –relational data  XML XML storage: –XML  relational data

3 Client/server DB Apps Relational Database Application Network Tuple streams SQL

4 XML Publishing Relational Database Application Web XML publishing Tuple streams XML SQL Xpath/ XQuery

5 XML Publishing Relational schema: Student(sid, name, address) Course(cid, title, room) Enroll(sid, cid, grade) StudentCourse Enroll

6 XML Publishing Operating Systems MGH084 John Seattle 3.8 … Database EE045 Mary Shoreline 3.9 … … Operating Systems MGH084 John Seattle 3.8 … Database EE045 Mary Shoreline 3.9 … … Other representations possible too Group by courses: redundant representation of students

7 XML Publishing First thing to do: design the DTD:

8 { FOR $x IN /db/Course/row RETURN { $x/title/text() } { $x/room/text() } { FOR $y IN /db/Enroll/row[cid/text() = $x/cid/text()] $z IN /db/Student/row[sid/text() = $y/sid/text()] RETURN { $z/name/text() } { $z/address/text() } { $y/grade/text() } } { FOR $x IN /db/Course/row RETURN { $x/title/text() } { $x/room/text() } { FOR $y IN /db/Enroll/row[cid/text() = $x/cid/text()] $z IN /db/Student/row[sid/text() = $y/sid/text()] RETURN { $z/name/text() } { $z/address/text() } { $y/grade/text() } } Now we write an XQuery to export relational data  XML Note: result is is the right DTD

9 XML Publishing Query: find Mary’s grade in Operating Systems FOR $x IN /xmlview/course[title/text()=“Operating Systems”], $y IN $x/student/[name/text()=“Mary”] RETURN { $y/grade/text() } FOR $x IN /xmlview/course[title/text()=“Operating Systems”], $y IN $x/student/[name/text()=“Mary”] RETURN { $y/grade/text() } XQuery SELECT Enroll.grade FROM Student, Enroll, Course WHERE Student.name=“Mary” and Course.title=“OS” and Student.sid = Enroll.sid and Enroll.cid = Course.cid SQL Can be done automatically

10 XML Publishing How do we choose the output structure ? Determined by agreement with partners/users Or dictated by committees –XML dialects (called applications) = DTDs XML Data is often nested, irregular, etc No normal forms for XML

11 XML Publishing in MS SQL FOR XML RAW FOR XML AUTO FOR XML EXPLICIT (won’t show in class)

12 FOR XML RAW SELECT Student.name, Student.grade FROM Student FOR XML RAW SELECT Student.name, Student.grade FROM Student FOR XML RAW......

13 FOR XML RAW SELECT ‘<row name=“’ as a, Student.name, ‘”, grade=“’ b, Student.grade, ‘/> as c FROM Student SELECT ‘<row name=“’ as a, Student.name, ‘”, grade=“’ b, Student.grade, ‘/> as c FROM Student Almost the same thing:

14 FOR XML AUTO SELECT Student.name, Course.name FROM Student, Enroll, Course WHERE Student.sid = Enroll.sid and Enroll.cid = Course.cid FOR XML AUTO SELECT Student.name, Course.name FROM Student, Enroll, Course WHERE Student.sid = Enroll.sid and Enroll.cid = Course.cid FOR XML AUTO Generates hierarchy, based on the order in SELECT

15 …......

16 FOR XML AUTO SELECT Course.name, Student.name FROM Student, Enroll, Course WHERE Student.sid = Enroll.sid and Enroll.cid = Course.cid FOR XML AUTO SELECT Course.name, Student.name FROM Student, Enroll, Course WHERE Student.sid = Enroll.sid and Enroll.cid = Course.cid FOR XML AUTO

17 … See SQL Server documentation; try it.

18 XML Storage Most often the XML data is small –E.g. a SOAP message –Parsed directly into the application (DOM API) Sometimes XML data is large –need to store/process it in a database The XML storage problem: –How do we choose the schema of the database ?

19 XML Storage Three solutions: Schema derived from DTD Storing XML as a graph: “Edge relation” Store it as a BLOB –Simple, boring, inefficient –Won’t discuss in class

20 Designing a Schema from DTD Design a relational schema for: <!DOCTYPE company [ ]> <!DOCTYPE company [ ]>

21 Designing a Schema from DTD First, construct the DTD graph: company personproduct ssnname officephonepidpriceavail.descr. * * * We ignore the order

22 Designing a Schema from DTD Next, design the relational schema, using common sense. company personproduct ssnname officephonepidpriceavail.descr. * * * Person(ssn, name, office) Phone(ssn, phone) Product(pid, name, price, avail., descr.) Which attributes may be NULL ? (Look at the DTD)

23 Designing a Schema from DTD What happens to queries: FOR $x IN /company/product[description] RETURN { $x/name, $x/description } FOR $x IN /company/product[description] RETURN { $x/name, $x/description } SELECT Product.name, Product.description FROM Product WHERE Product.description IS NOT NULL SELECT Product.name, Product.description FROM Product WHERE Product.description IS NOT NULL

24 Storing XML as a Graph Sometimes we don’t have a DTD: How can we store the XML data ? Every XML instance is a tree Store the edges in an Edge table Store the #PCDATA in a Value table

25 Storing XML as a Graph db book publisher titleauthor titleauthor titlestate “Complete Guide to DB2” “Chamberlin”“Transaction Processing” “Bernstein”“Newcomer” “Morgan Kaufman” “CA” SourceTagDest 0db1 1book2 2title3 2author4 1book5 5title6 5author7... SourceVal 3Complete guide... 4Chamberlin 6... Edge Value Can be ANY XML data (don’t know DTD)

26 Storing XML as a Graph What happens to queries: FOR $x IN /db/book[author/text()=“Chamberlin”] RETURN $x/title FOR $x IN /db/book[author/text()=“Chamberlin”] RETURN $x/title db book author title “Chamberlin”Return value xdb xbook xauthorxtitle vauthor vtitle

27 Storing XML as a Graph What happens to queries: SELECT vtitle.value FROM Edge xdb, Edge xbook, Edge xauthor, Edge xtitle, Value vauthor, Value vtitle WHERE xdb.source = 0 and xdb.tag = ‘db’ and xdb.dest = xbook.source and xbook.tag = ‘book’ and xbook.dest = xauthor.source and xauthor.tag = ‘author’ and xbook.dest = xtitle.source and xtitle.tag = ‘title’ and xauthor.dest = vauthor.source and vauthor.value = ‘Chamberlin” and xtitle.dest = vtitle.source SELECT vtitle.value FROM Edge xdb, Edge xbook, Edge xauthor, Edge xtitle, Value vauthor, Value vtitle WHERE xdb.source = 0 and xdb.tag = ‘db’ and xdb.dest = xbook.source and xbook.tag = ‘book’ and xbook.dest = xauthor.source and xauthor.tag = ‘author’ and xbook.dest = xtitle.source and xtitle.tag = ‘title’ and xauthor.dest = vauthor.source and vauthor.value = ‘Chamberlin” and xtitle.dest = vtitle.source A 6-way join !!!

28 Storing XML as a Graph Edge relation summary: Same relational schema for every XML document: Edge(Source, Tag, Dest) Value(Source, Val) Generic: works for every XML instance But inefficient: –Repeat tags multiple times –Need many joins to reconstruct data

29 XML in SQL Srv 2005 Create tables with attributes of type XML Use Xquery in SQL queries Rest of the slides are from: Shankar Pal et al., Indexing XML data stored in a relational database, VLDB’2004 (recommended reading)

30 CREATE TABLE DOCS ( ID int primary key, XDOC xml) SELECT ID, XDOC.query(’ for $s in “ ”]//SECTION return {data($s/TITLE)} ') FROM DOCS SELECT ID, XDOC.query(’ for $s in “ ”]//SECTION return {data($s/TITLE)} ') FROM DOCS

31 XML Methods in SQL Query() = returns XML data type Value() = extracts scalar values Exist() = checks conditions on XML nodes Nodes() = returns a rowset of XML nodes that the Xquery expression evaluates to

32 Internal Storage XML is “shredded” as a table A few important ideas: –Dewey decimal numbering of nodes; store in clustered B-tree indes –Use only odd numbers to allow insertions –Reverse PATH-ID encoding, for efficient processing of postfix expressions like //a/b/c –Add more indexes, e.g. on data values

33 Bad Bugs Nobody loves bad bugs. Tree Frogs All right-thinking people love tree frogs.

34

35 Infoset Table

36 = “ ”]/SECTION SELECT SerializeXML (N2.ID, N2.ORDPATH) FROM infosettab N1 JOIN infosettab N2 ON (N1.ID = N2.ID) WHERE N1.PATH_ID = AND N1.VALUE = ' ' AND N2.PATH_ID = PATH_ID(BOOK/SECTION) AND Parent (N1.ORDPATH) = Parent (N2.ORDPATH) SELECT SerializeXML (N2.ID, N2.ORDPATH) FROM infosettab N1 JOIN infosettab N2 ON (N1.ID = N2.ID) WHERE N1.PATH_ID = AND N1.VALUE = ' ' AND N2.PATH_ID = PATH_ID(BOOK/SECTION) AND Parent (N1.ORDPATH) = Parent (N2.ORDPATH)