Managing XML and Semistructured Data Lecture 14: Constraints and Keys Prof. Dan Suciu Spring 2001.



In this lecture
Constraints and Keys
– Path constraints on semistructured data
– Relative path constraints
– Proposals for keys in XML
– Keys and schemas
Resources
– Keys for XML, by Buneman, Davidson, Fan, Hara, Tan, in WWW10, 2001
– Data on the Web, by Abiteboul, Buneman, Suciu: Section 7.7

Path Constraints in Semistructured Data
Regular Path Queries with Constraints, Abiteboul and Vianu, PODS’97
Problem: given a set of path constraints, optimize regular path expressions
Especially useful for DAGs, less clear for trees

Path Constraints
Data instance I = rooted, edge-labeled graph
Regular path query q = regular expression over edge labels
Evaluation: q(I) = the set of nodes reachable from the root along a path whose labels match q

Path Constraints
Path constraints: p = p’ or p ⊆ p’
A data instance I satisfies p = p’ if p(I) = p’(I)
A data instance I satisfies p ⊆ p’ if p(I) ⊆ p’(I)
Notation: I |= p = p’ or I |= p ⊆ p’

Path Constraints
Examples
(_)*.home = ε
– Says: home points back to the root
person.person ⊆ person
– Says: persons may have other person links, but they only point to other persons
person.(_)*.(name.lastname?) = cache46932
– Says that the path is stored in the cache
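The definitions above can be tried out on a small instance. The following is a minimal sketch, not taken from the lecture: path expressions are encoded as nested tuples (an assumed encoding), the graph is a hypothetical instance with two persons whose home edges point back to the root, and `evaluate` computes q(I) by following edges set-at-a-time.

```python
def step(graph, nodes, label=None):
    """Follow one edge out of every node; label=None matches any label."""
    return {target
            for node in nodes
            for (edge_label, target) in graph.get(node, ())
            if label is None or edge_label == label}


def evaluate(graph, expr, nodes):
    """Return the nodes reachable from `nodes` along paths matching `expr`."""
    kind = expr[0]
    if kind == "eps":                      # empty path
        return set(nodes)
    if kind == "label":                    # a single edge label
        return step(graph, nodes, expr[1])
    if kind == "any":                      # the wildcard _
        return step(graph, nodes)
    if kind == "seq":                      # e1.e2. ... .en
        for part in expr[1:]:
            nodes = evaluate(graph, part, nodes)
        return set(nodes)
    if kind == "alt":                      # e1 | e2 | ...
        return set().union(*(evaluate(graph, e, nodes) for e in expr[1:]))
    if kind == "star":                     # e*: iterate until no new node appears
        reached, frontier = set(nodes), set(nodes)
        while frontier:
            frontier = evaluate(graph, expr[1], frontier) - reached
            reached |= frontier
        return reached
    raise ValueError(f"unknown expression kind: {kind!r}")


def satisfies(graph, root, p, op, p2):
    """Check I |= p = p2 (op "=") or I |= p ⊆ p2 (any other op)."""
    left = evaluate(graph, p, {root})
    right = evaluate(graph, p2, {root})
    return left == right if op == "=" else left <= right


# Hypothetical instance: node -> list of (label, target) edges.
graph = {
    "r": [("person", "p1"), ("person", "p2")],
    "p1": [("person", "p2"), ("home", "r")],
    "p2": [("home", "r")],
}
any_star_home = ("seq", ("star", ("any",)), ("label", "home"))
person = ("label", "person")
person_person = ("seq", person, person)

home_is_root = satisfies(graph, "r", any_star_home, "=", ("eps",))
links_stay_in_person = satisfies(graph, "r", person_person, "<=", person)
converse = satisfies(graph, "r", person, "<=", person_person)
```

On this instance the first two constraints hold, while the converse inclusion person ⊆ person.person fails because p1 is not reachable by person.person.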

Path Constraints
Problem: Given a set of path constraints E (where =/⊆ stands for either = or ⊆):
– p_1 =/⊆ p_1’
– …
– p_k =/⊆ p_k’
and given queries q, q’, decide whether E implies q =/⊆ q’
– Formally: for every I, if I |= E, then I |= q =/⊆ q’
Notation: E |= q =/⊆ q’

Path Constraints
Examples
(_)*.home = ε |= q = q’ where:
– q = (home.person | home.company)*.address
– q’ = (person | company).address
Notice that q’ is much simpler!
person.(_)*.(name.lastname?) = cache46932 |= q = q’ where:
– q = person.(_)*.(name.lastname?).address
– q’ = cache46932.address

Path Constraints
Solving the implication problem: four cases, along two dimensions
The set of constraints E consists of:
– Word constraints only (i.e. no regular expressions)
– Arbitrary regular path expressions
The queries q, q’ are:
– Words only (i.e. no regular path expressions)
– Arbitrary regular path expressions

Path Constraints
Given E, a set of path constraints
Rewrite system:
– If p =/⊆ p’ is in E, then p.r → p’.r, for any r
The rewrite system is sound (WHY ??)
Notice: if p =/⊆ p’ is in E, then r.p → r.p’ is not necessarily sound (WHY ???)
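One way to approach the two WHY questions is to test both rules on a small instance. The sketch below is an illustration under assumptions, not part of the lecture: it handles word paths only, and the graph is a hypothetical instance chosen so that a ⊆ b holds at the root. Appending a suffix preserves the inclusion, because (p.r)(I) is obtained by following r from the nodes in p(I), and a smaller start set can only give a smaller result. Adding a prefix moves the evaluation of p away from the root, where the constraint says nothing.

```python
def eval_word(graph, root, word):
    """Nodes reachable from root by following the labels in `word` in order."""
    nodes = {root}
    for label in word:
        nodes = {target
                 for node in nodes
                 for (edge_label, target) in graph.get(node, ())
                 if edge_label == label}
    return nodes


# Hypothetical instance satisfying the root constraint a ⊆ b.
graph = {
    "root": [("a", "n1"), ("b", "n1"), ("c", "n2")],
    "n1": [("d", "n4")],
    "n2": [("a", "n3")],   # n2 has an a-edge but no b-edge
}
p, p_prime = ["a"], ["b"]

constraint_holds = eval_word(graph, "root", p) <= eval_word(graph, "root", p_prime)

# Suffix rule p.r -> p'.r with r = d: the inclusion is preserved.
suffix_holds = (eval_word(graph, "root", p + ["d"])
                <= eval_word(graph, "root", p_prime + ["d"]))

# Prefix rule r.p -> r.p' with r = c: the inclusion is lost.
prefix_holds = (eval_word(graph, "root", ["c"] + p)
                <= eval_word(graph, "root", ["c"] + p_prime))
```

Here `constraint_holds` and `suffix_holds` are true while `prefix_holds` is false, since c.a reaches n3 but c.b reaches nothing.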

Path Constraints
Theorem. If E consists of word constraints only, then → is complete
Moreover:
– If q, q’ are path expressions, can check in PTIME
– Otherwise, can check in PSPACE
None of this is obvious…
Theorem. In general, can check E |= q = q’ in EXPSPACE

Relative Path Constraints
Path constraints on semistructured and structured data, Buneman, Fan, Weinstein, PODS’98
Idea:
– Path constraints always start from the root
– Hence very limited
– Generalize to start at some arbitrary node
Note: paper uses slightly different notation…

Relative Path Constraints
[Figure: data graph with root r, nodes s1, s2, c1, c2, values “Smith”, “Jones”, “Chem3”, “Phil4”, and edge labels Students, Courses, Taking, Enrolled]

Relative Path Constraints
ε: Students.Taking ⊆ Courses
ε: Courses.Enrolled ⊆ Students
Students: Taking ⊆ Enrolled^-1
Courses: Enrolled ⊆ Taking^-1
Definition. Relative path constraint: a: b ⊆ c or a: b ⊆ c^-1
∀x,y (a(root,x) ∧ b(x,y) → c(x,y)) or ∀x,y (a(root,x) ∧ b(x,y) → c(y,x))
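The two forms of the definition can be checked directly by enumerating x and y. This is a minimal sketch for word paths only; the graph is a hypothetical students-and-courses instance (the lecture's figure is not reproduced here), and the function names are illustrative.

```python
def eval_word(graph, nodes, word):
    """Nodes reachable from any node in `nodes` by following `word`."""
    for label in word:
        nodes = {target
                 for node in nodes
                 for (edge_label, target) in graph.get(node, ())
                 if edge_label == label}
    return set(nodes)


def holds(graph, root, a, b, c, inverse=False):
    """Check a: b ⊆ c, or a: b ⊆ c^-1 when inverse=True."""
    for x in eval_word(graph, {root}, a):          # a(root, x)
        for y in eval_word(graph, {x}, b):         # b(x, y)
            if inverse:
                ok = x in eval_word(graph, {y}, c)  # c(y, x)
            else:
                ok = y in eval_word(graph, {x}, c)  # c(x, y)
            if not ok:
                return False
    return True


# Hypothetical instance: node -> list of (label, target) edges.
graph = {
    "r": [("Students", "s1"), ("Students", "s2"),
          ("Courses", "c1"), ("Courses", "c2")],
    "s1": [("Taking", "c1")],
    "s2": [("Taking", "c1"), ("Taking", "c2")],
    "c1": [("Enrolled", "s1"), ("Enrolled", "s2")],
    "c2": [("Enrolled", "s2")],
}

extent_ok = holds(graph, "r", [], ["Students", "Taking"], ["Courses"])
inverse_ok = holds(graph, "r", ["Students"], ["Taking"], ["Enrolled"], inverse=True)

# Dropping c2's Enrolled edge breaks the inverse constraint for s2.
broken = dict(graph, c2=[])
inverse_broken = holds(broken, "r", ["Students"], ["Taking"], ["Enrolled"], inverse=True)
```

The empty list plays the role of the prefix ε, so the first check is an ordinary root constraint.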

Relative Path Constraints
Implication problem:
– Given a set of relative path constraints E
– Given a path constraint a: b ⊆ c
– Check whether E |= a: b ⊆ c
Notice: here we restrict to word problems (which are hard enough)

Relative Path Constraints
Bad news: the implication problem is, in general, undecidable
Still, it is decidable in particular cases, such as:
– When all a’s in a: b ⊆ c have the same length
  This includes the word path constraints, where all a’s are equal to ε
– When all b’s have |b| ≤ 1

Keys in XML Schema
[Example: an XML fragment listing the items Lawnmower, Baby Monitor, Lapis Necklace, Sturdy Shelves, shown next to its XML Schema; the markup was not preserved in this transcript]

Keys in XML Schema
In general, two flavors:
Note: all XPath expressions “start” at the element currently being defined
The fields must identify a single node

Keys in XML Schema
Unique = guarantees uniqueness
Key = guarantees uniqueness and existence
All XPath expressions are “restricted”:
– /a/b | /a/c OK for selector
– //a/b/*/c OK for field
– To “help the implementors” (???)
Note: better than DTD’s ID mechanism
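The difference between the two flavors can be sketched with Python's standard `xml.etree.ElementTree`. This is an illustration of the semantics only, not a schema validator: the element name `part`, the attribute `number`, and the part numbers are hypothetical, and only a small subset of XPath is handled (a relative selector path, plus fields that are either `@attribute` or a child path).

```python
import xml.etree.ElementTree as ET


def field_value(elem, field):
    """Return the single value selected by `field`, or None if it is absent."""
    if field.startswith("@"):
        return elem.get(field[1:])
    matches = elem.findall(field)
    if len(matches) > 1:
        raise ValueError("a field must identify a single node")
    return matches[0].text if matches else None


def check(scope, selector, fields, require_all):
    """require_all=False models 'unique'; require_all=True models 'key'."""
    seen = set()
    for elem in scope.findall(selector):
        values = tuple(field_value(elem, f) for f in fields)
        if None in values:
            if require_all:
                return False      # key: every field must exist
            continue              # unique: incomplete tuples are skipped
        if values in seen:
            return False          # both flavors: no duplicate tuples
        seen.add(values)
    return True


# Hypothetical document: the third part has no number attribute.
parts = ET.fromstring(
    '<parts>'
    '<part number="872-AA">Lawnmower</part>'
    '<part number="926-AA">Baby Monitor</part>'
    '<part>Lapis Necklace</part>'
    '</parts>'
)
is_unique = check(parts, "part", ["@number"], require_all=False)
is_key = check(parts, "part", ["@number"], require_all=True)
```

On this document the unique check passes and the key check fails, because one part has no number at all.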

Keys in XML Schema
Examples
Recall: must have a single forename and a single surname

Foreign Keys in XML Schema Examples

Another Proposal for Keys
Keys for XML, Buneman, Davidson, Fan, Hara, Tan, in WWW10, May 2001
– Cleaner definition
– Extends with relative keys
– Addresses satisfiability problem

Another Proposal for Keys
A key is q → {p_1, …, p_k}
An instance I satisfies the key if:
– ∀x_1, x_2 ∈ q(root). ((∃z_1 ∈ p_1(x_1). ∃z_2 ∈ p_1(x_2). z_1 = z_2) ∧ … ∧ (∃z_1 ∈ p_k(x_1). ∃z_2 ∈ p_k(x_2). z_1 = z_2)) → x_1 = x_2
Here z_1 = z_2 is value equality and x_1 = x_2 is node equality

Another Proposal for Keys
Examples:
//person → {name}
– Persons without a name are OK
//person → {firstname, lastname}
– What happens with multiple names?
//person → {ε}
//person → {}
– What is the difference between these two?
– {ε}: no two distinct persons have the same value
– {}: at most one person
//* → {id}
– What happens if an id doesn’t have an id child?
– It’s okay, because id elements can have an empty id

Another Proposal for Keys
Intuition for q → {p_1, …, p_k}:
If I have k values z_1, …, z_k, then there exists at most one x ∈ q(root) s.t. z_1 ∈ p_1(x), …, z_k ∈ p_k(x)
Think of retrieving x from z_1, …, z_k using a hash table
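The key condition can be checked pairwise with `xml.etree.ElementTree`. The sketch below makes two assumptions that are not from the lecture: value equality is modeled as equality of a node's tag, attributes, text and child values, and paths are limited to what `findall` accepts (so //person is written `.//person` and ε is written `.`). The document is hypothetical.

```python
import xml.etree.ElementTree as ET


def value(node):
    """Tree value of a node (an assumed notion of value equality)."""
    return (node.tag,
            tuple(sorted(node.attrib.items())),
            (node.text or "").strip(),
            tuple(value(child) for child in node))


def values(node, path):
    """The set of values reached from `node` by following `path`."""
    return {value(z) for z in node.findall(path)}


def satisfies_key(root, q, paths):
    """Check q -> {p_1, ..., p_k}: no two distinct target nodes share a value on every path."""
    targets = root.findall(q)
    for i, x1 in enumerate(targets):
        for x2 in targets[i + 1:]:
            # Violation: for every p the two nodes have some value in common.
            if all(values(x1, p) & values(x2, p) for p in paths):
                return False
    return True


# Hypothetical document: two named persons and one person without a name.
doc = ET.fromstring(
    "<db>"
    "<person><name>Ann</name></person>"
    "<person><name>Bob</name></person>"
    "<person/>"
    "</db>"
)
name_key = satisfies_key(doc, ".//person", ["name"])
empty_key = satisfies_key(doc, ".//person", [])
epsilon_key = satisfies_key(doc, ".//person", ["."])
```

Here `name_key` is true even though one person has no name, `empty_key` is false because an empty set of paths allows at most one person, and `epsilon_key` is true because the three persons have pairwise different values. The pairwise loop is quadratic; the hash-table intuition suggests indexing targets by their values instead.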

Another Proposal for Keys
Some inference rules for keys
– q → {p_1, …, p_k} is a key ⇒ q → {p_1, …, p_n} is a key, for k ≤ n (a superset of a key is always a key)
– q.q’ → {p} is a key ⇒ q → {q’.p} is a key (property of trees)

Another Proposal for Keys
Relative key: q: q’ → {p_1, …, p_k}
An instance I satisfies the relative key if, ∀x ∈ q(I), q’ → {p_1, …, p_k} is a key for the instance rooted at x

Another Proposal for Keys
Examples
/bible/book/chapter: verse → {number}
/bible/book: chapter → {number}
/bible: book → {name}
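A relative key is an ordinary key checked separately inside each context node, which the following sketch makes concrete. It is an illustration under assumptions: values are compared by element text only, the document is a hypothetical fragment, and because `ElementTree` does not accept absolute paths on an element, /bible/book/chapter is written `book/chapter` relative to the `bible` root.

```python
import xml.etree.ElementTree as ET


def texts(node, path):
    """Text values reached from `node` by `path` (simplified value equality)."""
    return {(z.text or "").strip() for z in node.findall(path)}


def is_key(context, target, paths):
    """Check target -> {paths} on the subtree rooted at `context`."""
    nodes = context.findall(target)
    return not any(
        all(texts(x1, p) & texts(x2, p) for p in paths)
        for i, x1 in enumerate(nodes)
        for x2 in nodes[i + 1:]
    )


def is_relative_key(root, context_path, target, paths):
    """Check context_path: target -> {paths} by testing the key under every context node."""
    return all(is_key(x, target, paths) for x in root.findall(context_path))


# Hypothetical document; verse and chapter numbers restart in each context.
bible = ET.fromstring(
    "<bible>"
    "<book><name>Genesis</name>"
    "<chapter><number>1</number>"
    "<verse><number>1</number></verse><verse><number>2</number></verse>"
    "</chapter>"
    "<chapter><number>2</number><verse><number>1</number></verse></chapter>"
    "</book>"
    "<book><name>Exodus</name>"
    "<chapter><number>1</number><verse><number>1</number></verse></chapter>"
    "</book>"
    "</bible>"
)
verse_in_chapter = is_relative_key(bible, "book/chapter", "verse", ["number"])
chapter_in_book = is_relative_key(bible, "book", "chapter", ["number"])
book_in_bible = is_relative_key(bible, ".", "book", ["name"])
verse_global = is_key(bible, ".//verse", ["number"])
```

All three relative keys hold on this document, while the absolute key `.//verse` → {number} fails, since verse number 1 occurs in several chapters.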

Another Proposal for Keys
No relative keys in XML-Schema
But could work around:

Combining Keys and Schemas
On XML Integrity Constraints in the Presence of DTDs, Fan and Libkin, PODS’2001
Keys + DTDs sometimes imply unexpected facts
Main story: implication is undecidable

Combining Keys and Schemas
[Example: XML data listing the subjects DB, Graphics, AI, OS, ...; the markup was not preserved in this transcript]

Combining Keys and Schemas
Keys and foreign keys:
Keys:
– //teacher
– //subject
Foreign keys:
But this is impossible!
In general: it is undecidable to check whether a DTD together with a set of keys and foreign keys can be satisfied at all