1 Lecture 5: XML and XQuery

2 Semistructured Data
- Another data model, based on trees.
- Motivation: flexible representation of data.
  - Often, data comes from multiple sources with differences in notation, meaning, etc.
- Motivation: sharing of documents among systems and databases.

3 Graphs of Semistructured Data
- Nodes = objects.
- Labels on arcs (attributes, relationships).
- Atomic values at leaf nodes (nodes with no arcs out).
- Flexibility: no restriction on:
  - Labels out of a node.
  - Number of successors with a given label.

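4 Example: Data Graph
[Figure: a data graph hanging from a root node. Arc labels: beer, bar, manf, servedAt, name, addr, prize, year, award. Leaf values: Bud, A.B., Gold, 1995, Maple, Joe’s, M’lob. Callouts mark the bar object for Joe’s Bar and the beer object for Bud.]
Notice a new kind of data.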

5 XML
- XML = Extensible Markup Language.
- While HTML uses tags for formatting (e.g., “italic”), XML uses tags for semantics (e.g., “this is an address”).
- Key idea: create tag sets for a domain (e.g., genomics), and translate all data into properly tagged XML documents.

6 Well-Formed and Valid XML
- Well-Formed XML allows you to invent your own tags.
  - Similar to labels in semistructured data.
- Valid XML involves a DTD (Document Type Definition), a grammar for tags.

7 Well-Formed XML
- Start the document with a declaration, surrounded by <?xml … ?>.
- Normal declaration is:
  <?xml version = "1.0" standalone = "yes" ?>
  - “Standalone” = “no DTD provided.”
- Balance of document is a root tag surrounding nested tags.

8 Tags
- Tags, as in HTML, are normally matched pairs, as <FOO> … </FOO>.
- Tags may be nested arbitrarily.
- XML tags are case sensitive.

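9 Example: Well-Formed XML
<?xml version = "1.0" standalone = "yes" ?>
<BARS>
  <BAR>
    <NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME><PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
- <NAME>Joe’s Bar</NAME> is a NAME subobject.
- Each <BEER> … </BEER> is a BEER subobject.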
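
The matched-pair, nesting, and case-sensitivity rules can be checked mechanically by any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the helper name is_well_formed and the three sample strings are illustrative assumptions, not part of the lecture.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML (matched, properly nested tags)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical test strings modelled on the BARS example.
good = "<BARS><BAR><NAME>Joe's Bar</NAME></BAR></BARS>"
bad_nesting = "<BARS><BAR><NAME>Joe's Bar</BAR></NAME></BARS>"  # tags cross
bad_case = "<BARS><bar></BAR></BARS>"  # <bar> and </BAR> differ: case sensitive

for sample in (good, bad_nesting, bad_case):
    print(is_well_formed(sample))
```

Note that this only tests well-formedness; checking validity against a DTD needs a validating parser, which the standard library does not provide.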

10 XML and Semistructured Data
- Well-Formed XML with nested tags is exactly the same idea as trees of semistructured data.
- We shall see that XML also enables nontree structures, as does the semistructured data model.

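11 Example
- The XML document is:
[Figure: the document drawn as a tree. The root BARS has BAR children; the first BAR has a NAME child (Joe’s Bar) and two BEER children, each with a NAME (Bud, Miller) and a PRICE (2.50, 3.00).]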

12 DTD Structure
<!DOCTYPE <root tag> [
  <!ELEMENT <name> ( <components> )>
  ... more elements ...
]>

13 DTD Elements
- The description of an element consists of its name (tag), and a parenthesized description of any nested tags.
  - Includes order of subtags and their multiplicity.
- Leaves (text elements) have #PCDATA (Parsed Character DATA) in place of nested tags.

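14 Example: DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*)>
  <!ELEMENT BAR (NAME, BEER+)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT BEER (NAME, PRICE)>
  <!ELEMENT PRICE (#PCDATA)>
]>
- A BARS object has zero or more BAR’s nested within.
- A BAR has one NAME and one or more BEER subobjects.
- A BEER has a NAME and a PRICE.
- NAME and PRICE are text.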

15 Element Descriptions
- Subtags must appear in order shown.
- A tag may be followed by a symbol to indicate its multiplicity.
  - * = zero or more.
  - + = one or more.
  - ? = zero or one.
- Symbol | can connect alternative sequences of tags.
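
Because element descriptions use order plus the *, +, ? and | symbols, each one behaves like a regular expression over the sequence of child tags. The sketch below makes that concrete for a BARS-style DTD (BARS holds BAR*, BAR holds NAME followed by BEER+, BEER holds NAME then PRICE). The regex encoding, the CONTENT_MODELS table and the conforms helper are assumptions made for illustration; this is not a DTD parser or a real validator.

```python
import re
import xml.etree.ElementTree as ET

# Content models hand-translated to regexes over child tag names,
# each name followed by one space (an illustrative encoding only).
CONTENT_MODELS = {
    "BARS": r"(BAR )*",        # <!ELEMENT BARS (BAR*)>
    "BAR": r"NAME (BEER )+",   # <!ELEMENT BAR (NAME, BEER+)>
    "BEER": r"NAME PRICE ",    # <!ELEMENT BEER (NAME, PRICE)>
}

def conforms(elem):
    """Check an element and its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(elem.tag)
    if model is None:
        # Tags without a model (NAME, PRICE) are treated as #PCDATA leaves.
        return len(elem) == 0
    child_tags = "".join(child.tag + " " for child in elem)
    if re.fullmatch(model, child_tags) is None:
        return False
    return all(conforms(child) for child in elem)

valid = ET.fromstring(
    "<BARS><BAR><NAME>Joe's Bar</NAME>"
    "<BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER></BAR></BARS>")
no_beer = ET.fromstring("<BARS><BAR><NAME>Joe's Bar</NAME></BAR></BARS>")
wrong_order = ET.fromstring(
    "<BARS><BAR><NAME>Joe's Bar</NAME>"
    "<BEER><PRICE>2.50</PRICE><NAME>Bud</NAME></BEER></BAR></BARS>")
```

Here conforms(valid) holds, while no_beer fails the BEER+ requirement and wrong_order fails the "subtags must appear in order shown" rule.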

16 Example: Element Description
- A name is an optional title (e.g., “Prof.”), a first name, and a last name, in that order, or it is an IP address:
  <!ELEMENT NAME ( (TITLE?, FIRST, LAST) | IPADDR )>

17 Use of DTD’s
1. Set standalone = “no”.
2. Either:
   a) Include the DTD as a preamble of the XML document, or
   b) Follow DOCTYPE and the <root tag> by SYSTEM and a path to the file where the DTD can be found.

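18 Example (a)
<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*)>
  <!ELEMENT BAR (NAME, BEER+)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT BEER (NAME, PRICE)>
  <!ELEMENT PRICE (#PCDATA)>
]>
<BARS>
  <BAR>
    <NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME><PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
- The <!DOCTYPE BARS [ … ]> block is the DTD.
- The <BARS> element is the document.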

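19 Example (b)
- Assume the BARS DTD is in file bar.dtd.
<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE BARS SYSTEM "bar.dtd">
<BARS>
  <BAR>
    <NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME><PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
- The DOCTYPE line says: get the DTD from the file bar.dtd.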

20 Attributes
- Opening tags in XML can have attributes.
- In a DTD, <!ATTLIST E … > declares an attribute for element E, along with its datatype.

21 Example: Attributes
- Bars can have an attribute kind, a character string describing the bar.
  <!ATTLIST BAR kind CDATA #IMPLIED>
  - CDATA: character string type; no tags.
  - #IMPLIED: the attribute is optional; the opposite is #REQUIRED.

22 Example: Attribute Use
- In a document that allows BAR tags, we might see:
  <BAR kind = "…">
    <NAME>Akasaka</NAME>
    <BEER><NAME>Sapporo</NAME> … </BEER>
    …
  </BAR>
- Note attribute values are quoted.

23 ID’s and IDREF’s
- Attributes can be pointers from one object to another.
  - Compare to HTML’s NAME = “foo” and HREF = “#foo”.
- Allows the structure of an XML document to be a general graph, rather than just a tree.

24 Creating ID’s
- Give an element E an attribute A of type ID.
- When using tag <E> in an XML document, give its attribute A a unique value.
- Example: <E A = "…">

25 Creating IDREF’s
- To allow objects of type F to refer to another object with an ID attribute, give F an attribute of type IDREF.
- Or, let the attribute have type IDREFS, so the F-object can refer to any number of other objects.

26 Example: ID’s and IDREF’s
- Let’s redesign our BARS DTD to include both BAR and BEER subelements.
  - Both bars and beers will have ID attributes called name.
  - Bars have SELLS subobjects, consisting of a number (the price of one beer) and an IDREF theBeer leading to that beer.
  - Beers have attribute soldBy, which is an IDREFS leading to all the bars that sell it.

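27 The DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*, BEER*)>
  <!ELEMENT BAR (SELLS+)>
    <!ATTLIST BAR name ID #REQUIRED>
  <!ELEMENT SELLS (#PCDATA)>
    <!ATTLIST SELLS theBeer IDREF #REQUIRED>
  <!ELEMENT BEER EMPTY>
    <!ATTLIST BEER name ID #REQUIRED>
    <!ATTLIST BEER soldBy IDREFS #IMPLIED>
]>
- Bar elements have name as an ID attribute and have one or more SELLS subelements.
- SELLS elements have a number (the price) and one reference to a beer.
- Beer elements have an ID attribute called name, and a soldBy attribute that is a set of Bar names.
- EMPTY is explained next.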

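28 Example Document
<BARS>
  <BAR name = "JoesBar">
    <SELLS theBeer = "Bud">2.50</SELLS>
    <SELLS theBeer = "Miller">3.00</SELLS>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>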
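
ID and IDREF attributes turn the tree into a graph, but a plain parser does not follow the pointers for you. The sketch below shows one way to resolve them by hand with Python's standard library. The SuesBar element, its 2.75 price, and the helper names by_id, bars_selling and beer_of are assumptions added so the example is complete; they are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Small document in the style of the BARS example (SuesBar data is hypothetical).
doc = ET.fromstring("""
<BARS>
  <BAR name="JoesBar">
    <SELLS theBeer="Bud">2.50</SELLS>
    <SELLS theBeer="Miller">3.00</SELLS>
  </BAR>
  <BAR name="SuesBar">
    <SELLS theBeer="Bud">2.75</SELLS>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar SuesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>
""")

# Index every element that carries the ID attribute `name`.
by_id = {e.get("name"): e for e in doc.iter() if e.get("name") is not None}

def bars_selling(beer_name):
    """Follow the IDREFS in soldBy from a BEER to the BAR elements it names."""
    beer = by_id[beer_name]
    return [by_id[ref] for ref in beer.get("soldBy", "").split()]

def beer_of(sells):
    """Follow the single IDREF in theBeer from a SELLS element to its BEER."""
    return by_id[sells.get("theBeer")]

print([bar.get("name") for bar in bars_selling("Bud")])
```

An IDREFS value is just a whitespace-separated list of IDs, which is why split() is enough to enumerate the references.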

29 Empty Elements
- We can do all the work of an element in its attributes.
  - Like BEER in previous example.
- Another example: SELLS elements could have attribute price rather than a value that is a price.

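30 Example: Empty Element
- In the DTD, declare:
  <!ELEMENT SELLS EMPTY>
  <!ATTLIST SELLS theBeer IDREF #REQUIRED>
  <!ATTLIST SELLS price CDATA #REQUIRED>
- Example use:
  <SELLS theBeer = "Bud" price = "2.50"/>
- Note the exception to the “matching tags” rule: the tag closes itself with />.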

31 XPath
Path Expressions
Conditions

32 Paths in XML Documents
- XPath is a language for describing paths in XML documents.
- Really think of the semistructured data graph and its paths.

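33 Example DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*, BEER*)>
  <!ELEMENT BAR (PRICE+)>
    <!ATTLIST BAR name ID #REQUIRED>
  <!ELEMENT PRICE (#PCDATA)>
    <!ATTLIST PRICE theBeer IDREF #REQUIRED>
  <!ELEMENT BEER EMPTY>
    <!ATTLIST BEER name ID #REQUIRED>
    <!ATTLIST BEER soldBy IDREFS #IMPLIED>
]>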

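34 Example Document
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>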

35 Path Descriptors
- Simple path descriptors are sequences of tags separated by slashes (/).
- If the descriptor begins with /, then the path starts at the root and has those tags, in order.
- If the descriptor begins with //, then the path can start anywhere.
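
Path descriptors can be tried out with the limited XPath subset in Python's xml.etree.ElementTree. This is a rough sketch, not a full XPath engine: ElementTree paths are written relative to an element, so the root-anchored descriptor /BARS/BAR/PRICE is approximated by ./BAR/PRICE evaluated at the BARS element. The document text and variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# A small document in the style of the BARS example.
bars = ET.fromstring("""
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>
""")

# `bars` is the BARS root element; the paths below are relative to it.
prices_from_root = bars.findall("./BAR/PRICE")  # plays the role of /BARS/BAR/PRICE
prices_anywhere = bars.findall(".//PRICE")      # plays the role of //PRICE
children_of_bars = bars.findall("./*")          # plays the role of /BARS/*

print([p.text for p in prices_from_root])
print([c.tag for c in children_of_bars])
```

In this document both PRICE queries return the same elements, because every PRICE happens to sit inside a BAR inside BARS.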

36 Value of a Path Descriptor
- Each path descriptor, applied to a document, has a value that is a sequence of items.
- An item is an atomic value or a node.
- A node is matching tags and everything in between.
  - I.e., a node of the semistructured graph.

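37 Example: /BARS/BAR/PRICE
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
- /BARS/BAR/PRICE describes the set with these two PRICE elements as well as the PRICE elements for any other bars.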

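38 Example: //PRICE
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
- //PRICE describes the same PRICE elements, but only because the DTD forces every PRICE to appear within a BARS and a BAR.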

39 Wild-Card *
- A star (*) in place of a tag represents any one tag.
- Example: /*/*/PRICE represents all price objects at the third level of nesting.

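40 Example: /BARS/*
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
- /BARS/* captures all BAR and BEER elements, such as the ones shown here.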

41 Attributes
- In XPath, we refer to attributes by prepending @ to their name.
- Attributes of a tag may appear in paths as if they were nested within that tag.

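42 Example: /BARS/*/@name
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>
- /BARS/*/@name selects all name attributes of immediate subelements of the BARS element.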

43 Selection Conditions
- A condition inside […] may follow a tag.
- If so, then only paths that have that tag and also satisfy the condition are included in the result of a path expression.

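44 Example: Selection Condition
- /BARS/BAR/PRICE[. < 2.75]
  - Inside the condition, “.” denotes the current PRICE element.
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
- The condition that the PRICE be < $2.75 makes the Bud price, but not the Miller price, satisfy the path descriptor.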

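45 Example: Attribute in Selection
- /BARS/BAR/PRICE[@theBeer = "Miller"]
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
- Now, the Miller PRICE element is selected, along with any other prices for Miller.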
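
Both kinds of selection condition can be imitated in Python. ElementTree's path language supports attribute tests such as [@theBeer='Miller'] directly, but it has no numeric comparisons, so in this sketch the "price below 2.75" condition is applied as an ordinary Python filter. The document text and variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

bars = ET.fromstring("""
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
</BARS>
""")

# Value condition: keep only the PRICE elements whose own text is < 2.75.
cheap = [p for p in bars.findall("./BAR/PRICE") if float(p.text) < 2.75]

# Attribute condition: ElementTree understands this predicate natively.
miller = bars.findall("./BAR/PRICE[@theBeer='Miller']")

print([p.get("theBeer") for p in cheap])
print([p.text for p in miller])
```

The condition is tested per PRICE element, so the Bud price passes while the Miller price in the same bar does not.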

46 Axes
- In general, path expressions allow us to start at the root and execute steps to find a sequence of nodes at each step.
- At each step, we may follow any one of several axes.
- The default axis is child:: (go to all the children of the current set of nodes).

47 Example: Axes
- /BARS/BEER is really shorthand for /BARS/child::BEER.
- @ is really shorthand for the attribute:: axis.
  - Thus, /BARS/BEER[@name = "Bud"] is shorthand for /BARS/BEER[attribute::name = "Bud"].

48 More Axes
- Some other useful axes are:
  1. parent:: = parent(s) of the current node(s).
  2. descendant-or-self:: = the current node(s) and all descendants.
     - Note: // is really shorthand for this axis.
  3. ancestor::, ancestor-or-self::, etc.

49 XQuery
Values
FLWR Expressions
Other Expressions

50 XQuery
- XQuery extends XPath to a query language that has power similar to SQL.
- XQuery is an expression language.
  - Like relational algebra: any XQuery expression can be an argument of any other XQuery expression.
  - Unlike RA, with the relation as the sole datatype, XQuery has a subtle type system.

51 The XQuery Type System
1. Atomic values: strings, integers, etc.
   - Also, certain constructed values like true(), date(“ ”).
2. Nodes.
   - Seven kinds.
   - We’ll only worry about four, on next slide.

52 Some Node Types
1. Element Nodes are like nodes of semistructured data.
   - Described by !ELEMENT declarations in DTD’s.
2. Attribute Nodes are attributes, described by !ATTLIST declarations in DTD’s.
3. Text Nodes = #PCDATA.
4. Document Nodes represent files.

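53 Example Document
<BARS>
  <BAR name = "JoesBar">
    <PRICE theBeer = "Bud">2.50</PRICE>
    <PRICE theBeer = "Miller">3.00</PRICE>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …"/> …
</BARS>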

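54 Example Nodes
[Figure: the node tree for the example document. Element nodes (green): BARS, BAR, PRICE, BEER. Attribute nodes (gold): name = “JoesBar”, theBeer = “Miller”, theBeer = “Bud”, soldBy = “…”, name = “Bud”. Text nodes (purple): the contents of the PRICE elements.]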

55 Document Nodes
- Form: document("<file name>").
- Establishes a document to which a query applies.
- Example: document("/usr/ullman/bars.xml")

56 FLWR Expressions
1. One or more for and/or let clauses.
2. Then an optional where clause.
3. A return clause.

57 Semantics of FLWR Expressions
- Each for creates a loop.
  - let produces only a local definition.
- At each iteration of the nested loops, if any, evaluate the where clause.
- If the where clause returns TRUE, invoke the return clause, and append its value to the output.

58 FOR Clauses
for <variable> in <expression>, ...
- Variables begin with $.
- A for-variable takes on each item in the sequence denoted by the expression, in turn.
- Whatever follows this for is executed once for each value of the variable.

59 Example: FOR
for $beer in document("bars.xml")/BARS/BEER/@name
return <BEERNAME> {$beer} </BEERNAME>
- $beer ranges over the name attributes of all beers in our example document.
- Result is a list of tagged names, like <BEERNAME>Bud</BEERNAME> <BEERNAME>Miller</BEERNAME> ...
- The braces say: “Expand the enclosed string by replacing variables and path exps. by their values.”

60 LET Clauses
let <variable> := <expression>, ...
- Value of the variable becomes the sequence of items defined by the expression.
- Note let does not cause iteration; for does.

61 Example: LET
let $d := document("bars.xml")
let $beers := $d/BARS/BEER/@name
return <BEERNAMES> {$beers} </BEERNAMES>
- Returns one element with all the names of the beers, like: <BEERNAMES>Bud Miller …</BEERNAMES>

62 Following IDREF’s
- XQuery (but not XPath) allows us to use paths that follow attributes that are IDREF’s.
- If x denotes a sequence of one or more IDREF’s, then x=>y denotes all the elements with tag y whose ID’s are one of these IDREF’s.

63 Example
- Find all the beer elements where the beer is sold by Joe’s Bar for less than 3.00.
- Strategy:
  1. $beer will for-loop over all beer elements.
  2. For each $beer, let $joe be either the Joe’s Bar element, if Joe sells the beer, or the empty sequence if not.
  3. Test whether $joe sells the beer for < 3.00.

64 Example: The Query
let $d := document("bars.xml")
for $beer in $d/BARS/BEER
let $joe := $beer/@soldBy=>BAR[@name = "JoesBar"]
let $joePrice := $joe/PRICE[@theBeer = $beer/@name]
where $joePrice < 3.00
return <CHEAPBEER> {$beer} </CHEAPBEER>
- $joe: attribute soldBy is of type IDREFS. Follow each ref to a BAR and check if its name is Joe’s Bar.
- $joePrice: find that PRICE subelement of the Joe’s Bar element that represents whatever beer is currently $beer.
- where: only pass the values of $beer, $joe, $joePrice to the RETURN clause if the string inside the PRICE element $joePrice is < 3.00.
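
The for / let / where / return semantics of this query can be mirrored step by step in ordinary Python, which may help in seeing that for iterates while let only binds a name. This is a hedged sketch, not XQuery: the document text, the function name cheap_beers and its parameters are assumptions, and the IDREFS in soldBy are followed by hand through a dictionary.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<BARS>
  <BAR name="JoesBar">
    <PRICE theBeer="Bud">2.50</PRICE>
    <PRICE theBeer="Miller">3.00</PRICE>
  </BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>
""")

def cheap_beers(doc, bar_name="JoesBar", limit=3.00):
    """Beers sold by `bar_name` for less than `limit` (FLWR-style loop)."""
    bars_by_name = {bar.get("name"): bar for bar in doc.findall("./BAR")}
    output = []
    for beer in doc.findall("./BEER"):                  # for $beer in ...
        refs = beer.get("soldBy", "").split()           # the IDREFS values
        joe = [bars_by_name[r] for r in refs            # let $joe := ...
               if r == bar_name and r in bars_by_name]
        joe_price = [p for bar in joe                   # let $joePrice := ...
                     for p in bar.findall("./PRICE")
                     if p.get("theBeer") == beer.get("name")]
        # where: true if at least one item in $joePrice is below the limit
        if any(float(p.text) < limit for p in joe_price):
            output.append(beer)                         # return: append to output
    return output

print([b.get("name") for b in cheap_beers(doc)])
```

If Joe does not sell a beer, joe is the empty list, joe_price is empty too, and the where test fails, which matches the "empty sequence" case in the strategy.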

65 Order-By Clauses
- FLWR is really FLWOR: an order-by clause can precede the return.
- Form: order by <expression>
  - With optional ascending or descending.
- The expression is evaluated for each output element.
- Determines placement in output sequence.

66 Example: Order-By
- List all prices for Bud, lowest first.
let $d := document("bars.xml")
for $p in $d/BARS/BAR/PRICE[@theBeer = "Bud"]
order by $p
return { $p }

67 Predicates
- Normally, conditions imply existential quantification.
- Example: /BARS/BAR[@name] means “all the bars that have a name.”
- Example: /BARS/BAR[@name = "JoesBar"]/PRICE = /BARS/BAR[@name = "SuesBar"]/PRICE means “Joe and Sue have at least one price in common.”

68 Path Expression Examples
[Figure: a data graph rooted at Bib (&o1) with paper and book objects. Arc labels include author, title, year, publisher, references, http, page, firstname, lastname, first, last. Leaf values include “Serge”, “Abiteboul”, 1997, “Victor”, “Vianu”.]
Path expressions evaluated over this graph: Bib/paper, Bib/book/publisher, Bib/paper/author/lastname.
Note that order of elements matters!

69 FOR vs. LET: Example
FOR $x IN document("bib.xml")/bib/book
RETURN $x
- FOR binds $x to each book in turn, so the RETURN clause is evaluated once per book.
LET $x := document("bib.xml")/bib/book
RETURN $x
- LET binds $x to the whole sequence of books, so the RETURN clause is evaluated only once.

70 XQuery Example 1
Find all book titles published after 1995:
FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title
Result:
<title> abc </title>
<title> def </title>
<title> ghi </title>

71 XQuery Example 2
For each author of a book by Morgan Kaufmann, list all books she published:
FOR $a IN distinct(document("bib.xml")/bib/book[publisher = "Morgan Kaufmann"]/author)
RETURN <result>
         $a,
         FOR $t IN /bib/book[author = $a]/title
         RETURN $t
       </result>
distinct = a function that eliminates duplicates (after converting inputs to atomic values)

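72 Results for Example 2
<result>
  <author>Jones</author>
  <title> abc </title>
  <title> def </title>
</result>
<result>
  <author>Smith</author>
  <title> ghi </title>
</result>
Observe how nested structure of result elements is determined by the nested structure of the query.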

73 XQuery Example 3
count = (aggregate) function that returns the number of elements
FOR $p IN distinct(document("bib.xml")//publisher)
LET $b := document("bib.xml")//book[publisher = $p]
WHERE count($b) > 100
RETURN $p
- For each publisher p:
  - Let the list of books published by p be b.
  - Count the number of books in b, and return p if the count is > 100.

74 XQuery Example 4
Find books whose price is larger than average:
LET $a := avg(document("bib.xml")/bib/book/price)
FOR $b IN document("bib.xml")/bib/book
WHERE $b/price > $a
RETURN $b
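
To see why LET comes before FOR here, it may help to write the same computation in Python: the average is computed once over the whole price sequence, then each book is tested against it. The bibliography content below (three books priced 30, 50 and 70) is made-up sample data, and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical stand-in for bib.xml.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><price>30</price></book>
  <book><title>def</title><price>50</price></book>
  <book><title>ghi</title><price>70</price></book>
</bib>
""")

# LET $a := avg(.../bib/book/price): one value for the whole sequence.
a = mean(float(p.text) for p in bib.findall("./book/price"))

# FOR $b IN .../bib/book WHERE $b/price > $a RETURN $b
above_average = [b for b in bib.findall("./book")
                 if float(b.findtext("price")) > a]

print(a, [b.findtext("title") for b in above_average])
```

With this sample data the average is 50, so only the book priced 70 qualifies; a book priced exactly at the average is not "larger than average".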

75 Collections in XQuery
- Ordered and unordered collections
  - /bib/book/author = an ordered collection
  - distinct(/bib/book/author) = an unordered collection
- Examples:
  - LET $a := /bib/book → $a is a collection; the statement iterates over all books in the collection
  - $b/author → also a collection (several authors...)
- However: RETURN $b/author returns a single collection!

76 Collections in XQuery
What about collections in expressions?
- $b/price → list of n prices
- $b/price * 0.7 → list of n numbers??
- $b/price * $b/quantity → list of n x m numbers??
  - Valid only if the two sequences have at most one element
  - Atomization
- $book1/author eq "Kennedy" is a value comparison
- $book1/author = "Kennedy" is a general comparison
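
The difference between the two comparison styles can be sketched in Python, modelling sequences as lists of already-atomized values. General comparison (=) is existential over both operands; value comparison (eq) expects at most one item on each side, yields the empty sequence if an operand is empty, and raises a type error for longer sequences. The function names general_eq and value_eq, and the use of None for the empty sequence, are assumptions of this sketch.

```python
def general_eq(xs, ys):
    """General comparison `=`: true if SOME pair of items is equal."""
    return any(x == y for x in xs for y in ys)

def value_eq(xs, ys):
    """Value comparison `eq`: each operand must hold at most one item."""
    if len(xs) > 1 or len(ys) > 1:
        raise TypeError("eq requires operands with at most one item")
    if not xs or not ys:
        return None  # stands in for the empty sequence
    return xs[0] == ys[0]

authors = ["Kennedy", "Smith"]               # a book with two authors
print(general_eq(authors, ["Kennedy"]))      # some author is Kennedy
print(value_eq(["Kennedy"], ["Kennedy"]))    # exactly one author, compared directly
```

So $book1/author = "Kennedy" succeeds for a multi-author book as long as one author matches, whereas $book1/author eq "Kennedy" is only meaningful when the book has a single author.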

77 Sorting in XQuery
FOR $p IN distinct(document("bib.xml")//publisher)
ORDER BY $p
RETURN $p/text(),
       FOR $b IN document("bib.xml")//book[publisher = $p]
       ORDER BY $b/price DESCENDING
       RETURN $b/title, $b/price

78 Conditional Expressions: If-Then-Else
FOR $h IN //holding
ORDER BY $h/title
RETURN $h/title,
       IF $h/@type = "Journal"
       THEN $h/editor
       ELSE $h/author

79 Existential Quantifiers
FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
      contains($p, "sailing") AND contains($p, "windsurfing")
RETURN $b/title

80 Universal Quantifiers
FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
      contains($p, "sailing")
RETURN $b/title

81 Other Stuff in XQuery
- Before and After
  - for dealing with order in the input
- Filter
  - deletes some edges in the result tree
- Recursive functions
- Namespaces
- References, links …
- Lots more stuff …

82 Appendix: XML Schema and XQuery Data Model

83 XML Schema
- Includes primitive data types (integers, strings, dates, etc.)
- Supports value-based constraints (integers > 100)
- User-definable structured types
- Inheritance (extension or restriction)
- Foreign keys
- Element-type reference constraints

84 Sample XML Schema …

85 XML-Query Data Model
- Describes XML data as a tree
- Node ::= DocNode | ElemNode | ValueNode | AttrNode | NSNode | PINode | CommentNode | InfoItemNode | RefNode

86 XML-Query Data Model
Element node (simplified definition):
- elemNode : (QNameValue, {AttrNode}, [ElemNode | ValueNode]) → ElemNode
- QNameValue means “a tag name”
Reads: “Give me a tag, a set of attributes, a list of elements/values, and I will return an element”

87 XML Query Data Model
Example:
<book price = "55" currency = "USD">
  <title> Foundations … </title>
  <author> Abiteboul </author>
  <author> Hull </author>
  <author> Vianu </author>
  <year> 1995 </year>
</book>
In the data model:
book1 = elemNode(book, {price2, currency3}, [title4, author5, author6, author7, year8])
price2 = attrNode(…) /* next */
currency3 = attrNode(…)
title4 = elemNode(title, string9)
…

89 XQuery Values
- Item = node or atomic value.
- Value = ordered sequence of zero or more items.
- Examples:
  1. () = empty sequence.
  2. (“Hello”, “World”)
  3. (“Hello”, 2.50, 10)

90 Nesting of Sequences Ignored
- A value can, in principle, be an item of another value.
- But nested list structures are expanded.
- Example: ((1,2),(),(3,(4,5))) = (1,2,3,4,5) = 1,2,3,4,5.
- Important when values are computed by concatenating other values.
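
The flattening rule can be imitated with a short recursive function, here modelling XQuery sequences as Python tuples. The function name flatten is an assumption of this sketch; XQuery itself performs this expansion automatically.

```python
def flatten(value):
    """Expand nested tuples into one flat tuple, as XQuery does for sequences."""
    if not isinstance(value, tuple):
        return (value,)          # a single item is a sequence of length one
    flat = ()
    for item in value:
        flat += flatten(item)    # concatenation never creates nesting
    return flat

print(flatten(((1, 2), (), (3, (4, 5)))))   # → (1, 2, 3, 4, 5)
```

Empty sequences simply disappear when concatenated, which is why () contributes nothing to the result.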

91 Effective Boolean Values
- The effective boolean value (EBV) of an expression is:
  1. The actual value if the expression is of type boolean.
  2. FALSE if the expression evaluates to 0, “” [the empty string], or () [the empty sequence].
  3. TRUE otherwise.
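
The three rules translate almost line for line into code. The sketch below follows only the simplified rules stated on this slide (the full XQuery specification has further cases); sequences are modelled as Python lists, and the function name ebv is an assumption.

```python
def ebv(value):
    """Effective boolean value under the slide's three simplified rules."""
    if isinstance(value, bool):
        return value                       # rule 1: booleans stand for themselves
    if value == 0 or value == "" or value == []:
        return False                       # rule 2: 0, "", and () are FALSE
    return True                            # rule 3: everything else is TRUE

for sample in (False, 0, "", [], "JoesBar", ["some node"]):
    print(repr(sample), ebv(sample))
```

The boolean test comes first on purpose: in Python, True == 1 and False == 0, so checking for 0 before checking the type would blur rules 1 and 2.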

92 EBV Examples
- @name = "JoesBar" has EBV TRUE or FALSE, depending on whether the name attribute is "JoesBar".
- /BARS/BAR[@name = "GoldenRail"] has EBV TRUE if some bar is named the Golden Rail, and FALSE if there is no such bar.

93 Boolean Operators
- E1 and E2, E1 or E2, not(E), if (E1) then E2 else E3 apply to any expressions.
- Take EBV’s of the expressions first.
- Example: not(3 eq 5 or 0) has value TRUE.
- Also: true() and false() are functions that return values TRUE and FALSE.

94 Quantifier Expressions
some $x in E1 satisfies E2
1. Evaluate the sequence E1.
2. Let $x (any variable) be each item in the sequence, and evaluate E2.
3. Return TRUE if E2 has EBV TRUE for at least one $x.
- Analogously: every $x in E1 satisfies E2
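
The two quantifiers correspond closely to Python's built-in any and all. In this sketch, E1 is an ordinary Python sequence and E2 is a predicate function; the helper names some and every and the sample paragraphs are assumptions for illustration.

```python
def some(sequence, condition):
    """some $x in sequence satisfies condition($x)"""
    return any(condition(x) for x in sequence)

def every(sequence, condition):
    """every $x in sequence satisfies condition($x)"""
    return all(condition(x) for x in sequence)

# Hypothetical paragraphs of one book.
paras = ["sailing and windsurfing at the lake", "notes on sailing knots"]

both_sports = some(paras, lambda p: "sailing" in p and "windsurfing" in p)
all_sailing = every(paras, lambda p: "sailing" in p)
all_windsurfing = every(paras, lambda p: "windsurfing" in p)
print(both_sports, all_sailing, all_windsurfing)
```

Over an empty sequence, some is FALSE and every is TRUE, so a universally quantified condition also accepts books that have no paragraphs at all.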

95 Document Order
- Comparison by document order: << and >>.
- Example: $d/BARS/BEER[@name = "Bud"] << $d/BARS/BEER[@name = "Miller"] is true iff the Bud element appears before the Miller element in the document $d.

96 Set Operators
- union, intersect, except operate on sequences of nodes.
  - Meanings analogous to SQL.
  - Result eliminates duplicates.
  - Result appears in document order.
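
The two properties of the result (no duplicates, document order) can be sketched in Python by treating nodes as ElementTree elements and recording each node's position in a pre-order traversal. The helper names union, intersect and except_ (with a trailing underscore, since except is a Python keyword) and the sample document are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<BARS>
  <BAR name="JoesBar"><PRICE theBeer="Bud">2.50</PRICE></BAR>
  <BEER name="Bud" soldBy="JoesBar"/>
  <BEER name="Miller" soldBy="JoesBar"/>
</BARS>
""")

# iter() walks the tree in document order, so the index is the node's position.
position = {node: i for i, node in enumerate(doc.iter())}

def in_document_order(nodes):
    return sorted(nodes, key=position.__getitem__)

def union(a, b):
    return in_document_order(set(a) | set(b))      # set() removes duplicates

def intersect(a, b):
    return in_document_order(set(a) & set(b))

def except_(a, b):
    return in_document_order(set(a) - set(b))

bars = doc.findall("./BAR")
beers = doc.findall("./BEER")
print([n.tag for n in union(beers, bars)])   # BAR comes first: document order
```

Note that the operand order does not affect the result order: union(beers, bars) and union(bars, beers) list the same nodes in the same sequence.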

97 Other Operators
- Use Fortran comparison operators to compare atomic values only.
  - eq, ne, gt, ge, lt, le.
- Arithmetic operators: +, -, *, div, idiv, mod.
  - Apply to any expressions that yield arithmetic or date/time values.