Wednesday, May 29, 2002 XML Storage Final Review

Lecture 17: XML Storage, Final Review. Wednesday, May 29, 2002

XML Storage in a Relational DB: Use a generic schema [Florescu, Kossmann 1999]. Use the DTD to derive the schema [Shanmugasundaram, et al. 1999]. Use data mining to derive the schema [Deutsch, Fernandez, Suciu 1999]. Use the Path table [T. Amagasa, T. Shimura, S. Uemura 2001].

XML Storage: Ternary Relation [Florescu, Kossmann 1999] Use a generic relational schema (independent of the XML schema): Ref(source, label, dest), Val(node, value)

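XML Storage: Ternary Relation [Florescu, Kossmann 1999] [Figure: example tree and its Ref and Val tables. Ref holds a paper edge from &o1 to &o2 and the edges labeled title, author, author, year below &o2; Val holds the leaf values of &o3 … &o6: “The Calculus”, “…”, “…”, “1986”.]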

XML Storage: Ternary Relation. XPath to SQL translation: XPath: /paper[year=“1986”]/author SQL: Select . . . From . . . Where . . .
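One possible way to fill in the translation for the generic Ref/Val schema, shown as a minimal sketch rather than the slide's official answer. It uses Python's built-in sqlite3 module so the SQL can be run. The sample rows, the placeholder author values, and the assumption that &o1 is the document root are all illustrative and not taken from the lecture.

```python
import sqlite3

# Hypothetical sample data, loosely following the slide's example tree:
# &o1 -paper-> &o2, and &o2 has title, author, author, year children.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ref(source TEXT, label TEXT, dest TEXT);
    CREATE TABLE Val(node TEXT, value TEXT);
    INSERT INTO Ref VALUES
        ('&o1', 'paper',  '&o2'),
        ('&o2', 'title',  '&o3'),
        ('&o2', 'author', '&o4'),
        ('&o2', 'author', '&o5'),
        ('&o2', 'year',   '&o6');
    INSERT INTO Val VALUES
        ('&o3', 'The Calculus'),
        ('&o4', 'author A'),   -- placeholder value
        ('&o5', 'author B'),   -- placeholder value
        ('&o6', '1986');
""")

# /paper[year="1986"]/author: one self-join of Ref per path step,
# plus a join with Val for the predicate on year.
TERNARY_SQL = """
    SELECT a.dest
    FROM Ref p, Ref y, Val v, Ref a
    WHERE p.source = '&o1' AND p.label = 'paper'      -- /paper (assumes &o1 is the root)
      AND y.source = p.dest AND y.label = 'year'      -- [year = ...]
      AND v.node = y.dest AND v.value = '1986'        -- ... "1986"
      AND a.source = p.dest AND a.label = 'author'    -- /author
    ORDER BY a.dest
"""
authors = [row[0] for row in conn.execute(TERNARY_SQL)]
print(authors)
```

Each XPath step costs one more self-join on Ref, which is the main price of a schema that is independent of the XML schema.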

XML Storage: Ternary Relation. In practice, more tables may be needed: RefTag1(source, dest), RefTag2(source, dest), …, IntVal(node, intVal), RealVal(node, realVal)

XML Storage: DTD to Schema [Christophides, Abiteboul, Cluet, Scholl 1994] [Shanmugasundaram, Tufte, He, Zhang, DeWitt, Naughton 1999] Idea: use the XML schema to derive the relational schema

XML Storage: DTD to Schema. DTD: <!ELEMENT paper (title, author*, year?)> <!ELEMENT author (firstName, lastName)> Relational schema: Paper(pid, title, year), Author(aid, pid, firstName, lastName)

XML Storage: DTD to Schema. XPath to SQL translation: XPath: /paper[year=“1986”]/author SQL: Select . . . From . . . Where . . .
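For the DTD-derived schema Paper(pid, title, year), Author(aid, pid, firstName, lastName), the same XPath can plausibly be translated with a single join. The sketch below is one such translation, again run through sqlite3; the rows and names in the tables are made up for illustration, and year is stored as text only to match the quoted “1986” in the XPath.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- year is nullable because the DTD declares it optional (year?)
    CREATE TABLE Paper(pid INTEGER PRIMARY KEY, title TEXT, year TEXT);
    CREATE TABLE Author(aid INTEGER PRIMARY KEY,
                        pid INTEGER REFERENCES Paper(pid),
                        firstName TEXT, lastName TEXT);
    -- Hypothetical sample rows
    INSERT INTO Paper VALUES (1, 'The Calculus', '1986'),
                             (2, 'Another Paper', NULL);
    INSERT INTO Author VALUES (10, 1, 'Ann', 'Example'),
                              (11, 1, 'Bob', 'Sample'),
                              (12, 2, 'Cy',  'Other');
""")

# /paper[year="1986"]/author: the parent/child step becomes a
# foreign-key join, the predicate becomes a plain column filter.
DTD_SQL = """
    SELECT a.firstName, a.lastName
    FROM Paper p, Author a
    WHERE a.pid = p.pid
      AND p.year = '1986'
    ORDER BY a.aid
"""
result = conn.execute(DTD_SQL).fetchall()
print(result)
```

Compared with the ternary relation, the query needs fewer joins because the DTD already told us that title and year can be inlined into Paper.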

XML Storage: Data Mining to Schema [Deutsch, Fernandez, Suciu 1999] Given: one large XML data instance, no schema/DTD, and a query workload. Problem: find a “good” relational schema for it. Notice: even when a DTD is present, it may be imprecise. E.g. when a person may have 1-3 phones, the DTD only says phone*.

XML Storage: Data Mining to Schema [Deutsch, Fernandez, Suciu 1999] [Figure: XML tree with paper, author, title, year, fn, ln nodes, mapped to the relations Paper1 and Paper2]

XML Storage: Data Mining to Schema. XPath to SQL translation: XPath: /paper[year=“1986”]/author SQL:

XML Storage: the Path Relation Method [T. Amagasa, T. Shimura, S. Uemura 2001] Store paths as strings. XPath expressions become patterns for the SQL LIKE operator. Additional information is stored for the parent/child and ancestor/descendant relationships.

XML Storage: the Path Relation Method
Path table: one entry for every path in the database. Relatively small.
pathID  Pathexpr
1       #/bib
2       #/bib#/paper
3       #/bib#/paper#/author
4       #/bib#/paper#/title
5       #/bib#/paper#/year
6       #/bib#/book#/author
7       #/bib#/book#/title
8       #/bib#/book#/publisher

XML Storage: the Path Relation Method. Element(NodeID, pathID, Start, End, ParentID): one entry for every element in the database. Relatively large. The sample rows carry nested Start/End ranges such as 5-200, 8-20, 21-30, 31-100, 101-150, 151-180, and 300-500.

XML Storage: the Path Relation Method. Val(NodeID, Val): one entry for every leaf in the database. Relatively large. Sample rows: (3, Smith), (4, Vance), (5, Tim), (6, Wallace), (7, The Best Cooking Book Ever), (8, 2), . . .

XML Storage: the Path Relation Method. XPath to SQL translation: XPath: /bib/paper[year=“1986”]//figure SQL: Select . . . From . . . Where . . .
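A hedged sketch of how this XPath might be translated under the Path/Element/Val tables: the descendant step // turns into a LIKE pattern over the stored path strings, the year predicate uses ParentID, and ancestor/descendant containment uses the Start/End ranges. The sample document, the section and figure paths, and all node numbers are invented for illustration; the columns are renamed startPos/endPos here only because END is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Path(pathID INTEGER PRIMARY KEY, pathexpr TEXT);
    CREATE TABLE Element(nodeID INTEGER PRIMARY KEY, pathID INTEGER,
                         startPos INTEGER, endPos INTEGER, parentID INTEGER);
    CREATE TABLE Val(nodeID INTEGER PRIMARY KEY, val TEXT);

    -- Hypothetical document: two papers, each with a year and a figure.
    INSERT INTO Path VALUES
        (1, '#/bib'),
        (2, '#/bib#/paper'),
        (3, '#/bib#/paper#/year'),
        (4, '#/bib#/paper#/section'),
        (5, '#/bib#/paper#/section#/figure'),
        (6, '#/bib#/paper#/figure');
    INSERT INTO Element VALUES
        (1, 1,   0, 1000, NULL),  -- bib
        (2, 2,   5,  200, 1),     -- paper (1986)
        (3, 3,   8,   20, 2),     --   year
        (4, 4,  21,  100, 2),     --   section
        (5, 5,  30,   60, 4),     --     figure
        (6, 2, 300,  500, 1),     -- paper (1990)
        (7, 3, 305,  320, 6),     --   year
        (8, 6, 330,  400, 6);     --   figure
    INSERT INTO Val VALUES (3, '1986'), (7, '1990');
""")

PATH_SQL = """
    SELECT f.nodeID
    FROM Path pp, Element p, Path yp, Element y, Val v, Path fp, Element f
    WHERE pp.pathexpr = '#/bib#/paper' AND p.pathID = pp.pathID
      AND yp.pathexpr = '#/bib#/paper#/year' AND y.pathID = yp.pathID
      AND y.parentID = p.nodeID                        -- year is a child of this paper
      AND v.nodeID = y.nodeID AND v.val = '1986'
      AND fp.pathexpr LIKE '#/bib#/paper#%/figure'     -- paper//figure
      AND f.pathID = fp.pathID
      AND f.startPos > p.startPos AND f.endPos < p.endPos  -- figure lies inside this paper
    ORDER BY f.nodeID
"""
figures = [row[0] for row in conn.execute(PATH_SQL)]
print(figures)
```

The '#' in front of every '/' is what makes the LIKE pattern safe: '#/bib#/paper#%/figure' matches a figure at any depth below paper but cannot match a sibling tag that merely starts with "paper".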

The Project. What to do: a website; a short printed description (could be a printout of the website); a presentation (this Friday). Due dates: the soft deadline is Friday, 5/31 (for most of the project); the hard deadline is Friday, 6/7 (for selected remaining experiments).

The Project. What to address: What problem are you trying to solve? Why is it interesting? How did you approach it? What did you achieve? What did you implement, evaluate, learn? Who did what in the project?

The Project The Presentations: Friday, 1:30-2:20, Low 105 Following order: 1. 2. 3. 4.

The Final: Monday, June 10, 2:30-4:30, Lowe 102 (this room). Open book exam!

The Final covers: SQL; XPath/XQuery; Theory; Database implementation; XML processing

1. SQL: Select-from-where. Group-by, having. Insert, delete, modify tables. Create tables. Need to understand E/R diagrams. Excluded: constraints, triggers.

2. XQuery: Basic FLWR expressions. Nested queries. Joins. Aggregates. Please use correct syntax (slides often don’t do that); see XQuery’s use cases, www.w3.org/TR/xmlquery-use-cases. Should be simpler than SQL.

3. Theory: First Order Logic. Domain independence. Expressive power. Query complexity. Conjunctive queries. Containment. Semijoin reduction.

4. Database Implementation: Data storage. Indexing: B+ trees, hash tables. Execution: various algorithms and their complexity. Optimization: know basic algebraic laws, dynamic programming.

5. XML Processing: Basic syntax (well-formed XML documents): elements, attributes. XML and semistructured data. Schemas (DTDs). Publishing: define an XML view in XQuery, translate XQuery to SQL. Storing XML in relational databases.

Grading. Breakdown: Homework: 35%, Project: 35%, Final: 25%, Intangibles: 5%. Compared to the syllabus: more weight on the project, less on the final.

...and finally! Enjoy taking the final! I enjoyed teaching this class.