Friday, September 4th, 2009
The Systems Group at ETH Zurich
XML and Databases
Exercise Session 5
courtesy of Ghislain Fourny/ETH
© Department of Computer Science | ETH Zürich

PUL Algebra Theory

Good to know: XML and Data Models
- Physical view (syntax): the document as serialized text, here with the content "This is text."
- Logical view (data model): the same document as a tree
[Figure: tree of the sample document, with the labels a, b, c, d, e:f and the text "This is text."]

Data models around
- Information Set (Infoset)
- Post Schema-Validation Infoset (PSVI)
- XQuery 1.0 and XPath 2.0 Data Model (XDM)

Exercise 1: XQuery Data Model
Main features of the XDM:
- Instances are sequences of items.
- A single item is the same as the sequence containing only that item, and nested sequences are flattened: (item) = item and ((item, item), item) = (item, item, item).
- An item is an atomic value or a node among seven kinds:
  - Document node
  - Element node
  - Attribute node
  - Text node
  - Comment node
  - Processing instruction node
  - Namespace node
With XPath 2.0, the namespace axis is deprecated, and it is not available at all in XQuery 1.0, so namespace nodes cannot be navigated to directly.
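
Because XDM instances are sequences of items, a sequence never contains another sequence: nesting is flattened, and a single item behaves like a sequence of length one. The following is only an illustrative sketch of that rule, assuming Python tuples as a stand-in for XDM sequences; the helper name `xdm_sequence` is made up for this example and is not part of any XQuery processor.

```python
def xdm_sequence(*values):
    """Build a flat tuple from items and (possibly nested) tuples of items."""
    flat = []
    for value in values:
        if isinstance(value, tuple):
            # A nested sequence contributes its items, not itself.
            flat.extend(xdm_sequence(*value))
        else:
            flat.append(value)
    return tuple(flat)


# ((1, 2), 3) flattens to (1, 2, 3).
nested = xdm_sequence((1, 2), 3)

# A single item and the singleton sequence holding it are the same thing.
singleton_matches_item = xdm_sequence("a") == xdm_sequence(("a",))

# The empty sequence disappears when it is nested inside another one.
empty_inside = xdm_sequence(1, (), 2)
```

Real XDM sequences hold typed atomic values and nodes; the sketch only mirrors the flattening behaviour.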

Exercise 1: XQuery Data Model
Why sequences of items?
- They allow expressions to be decomposed and reused.
- XML is ordered!
In XQuery, anything that goes into or comes out of an expression is a sequence of items!
[Figure: an "Expression" box surrounded by keywords and operators: for, if, then, else, where, order by, while, any, every, let, return, exit with, =, +]

Exercise 1: XQuery Data Model
Differences with Infoset:
- Sequences of items (again!)
- Simple and complex types, whereas the Infoset has (almost) no types: the only types in the Infoset are attribute types from the DTD.
- Without validation, elements are typed xs:untyped (and attributes xs:untypedAtomic).

Exercise 2: Infoset
The 11 Information Items:
- Document
- Element
- Attribute
- Processing Instruction (PI)
- Namespace
- Character
- Comment
- Unexpanded Entity Reference
- DTD
- Unparsed Entity
- Notation
The first seven are easy to remember, right?

Exercise 2: Document Information Items
Document Information Item doc
- [children] Element Information Item metadata
- [document element] Element Information Item metadata
- [notations] empty
- [unparsed entities] empty
- [base URI] .../info.xml
- [character encoding scheme] UTF-8
- [standalone]
- [version] 1.0

Exercise 2: Element Information Items
Element Information Item metadata
- [namespace name]
- [local name] metadata
- [prefix] dc
- [children] Element Information Items title, publisher
- [attributes] empty
- [namespace attributes] Attribute Information Item namespacedc
- [in-scope namespaces] Namespace Information Items namespacedcsystems, xml
- [base URI] .../info.xml
- [parent] Document Information Item
Element Information Item title
- [namespace name]
- [local name] title
- [prefix] dc
- [children] Character Information Items, as many as there are characters in "Systems Group"
- [attributes] Attribute Information Items lang, year
- [namespace attributes] empty
- [in-scope namespaces] Namespace Information Items namespacedcsystems, xml
- [base URI] .../info.xml
- [parent] Element Information Item metadata

Exercise 2: Attribute Information Items
Attribute Information Item namespacedc
- [namespace name]
- [local name] dc
- [prefix] xmlns
- [normalized value]
- [specified] true
- [attribute type] (no DTD!)
- [references] unknown
- [owner element] Element Information Item metadata
Attribute Information Item lang
- [namespace name]
- [local name] lang
- [prefix] xml
- [normalized value] "en"
- [specified] true
- [attribute type] (no DTD!)
- [references] unknown
- [owner element] Element Information Item title
Attribute Information Item year
- [namespace name] empty
- [local name] year
- [prefix] empty
- [normalized value] "2008"
- [specified] true
- [attribute type] (no DTD!)
- [references] unknown
- [owner element] Element Information Item title

Exercise 2: Namespace Information Items
Namespace Information Item namespacedcsystems
- [prefix] dc
- [namespace name]
Namespace Information Item xml
- [prefix] xml
- [namespace name]

Exercise 2: Infoset - the tree
[Figure: the Infoset tree. doc has the child metadata; metadata has the namespace attribute namespacedc and the children title and publisher; title has the attributes lang and year and the text "Systems Group"; publisher has the text "ETH Zurich"; metadata, title and publisher each carry the in-scope namespaces namespacedcsystems and xml.]
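
Several of the Infoset properties listed for this exercise can be observed with an ordinary XML parser. The sketch below is only an approximation under stated assumptions: the document is a hypothetical reconstruction of info.xml from the property values given above, the `dc` namespace URI `http://example.com/dc` is a placeholder (the real one is not given here), and Python's `xml.etree.ElementTree` is not an Infoset implementation. It exposes [namespace name], [local name], [children] and [attributes], but it discards prefixes and namespace declarations.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://example.com/dc"  # placeholder namespace URI (assumption)
XML_NS = "http://www.w3.org/XML/1998/namespace"  # fixed by the XML Namespaces spec

# Hypothetical reconstruction of info.xml, written without extra whitespace
# so that metadata has exactly two children.
INFO_XML = (
    f'<dc:metadata xmlns:dc="{DC_NS}">'
    '<dc:title xml:lang="en" year="2008">Systems Group</dc:title>'
    '<dc:publisher>ETH Zurich</dc:publisher>'
    '</dc:metadata>'
)


def split_name(qualified):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if qualified.startswith("{"):
        namespace, local = qualified[1:].split("}", 1)
        return namespace, local
    return None, qualified


root = ET.fromstring(INFO_XML)
title = root[0]

# [namespace name] and [local name] of the document element.
root_name = split_name(root.tag)

# [children] of metadata, by local name.
child_names = [split_name(child.tag)[1] for child in root]

# [attributes] of title, keyed by (namespace name, local name).
attributes = {split_name(name): value for name, value in title.attrib.items()}

# One Character Information Item per character of the text content.
character_count = len(title.text)
```

Note that the `xmlns:dc` declaration does not show up in `root.attrib`: the parser consumes it, whereas the Infoset reports it under [namespace attributes].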

Exercise 3: PSVI
What does the PSVI have that the Infoset does not?
Schema type information!

Exercise 3: A schema
Target namespace: this is new. We have a target namespace for the schema. All global elements and types will be in this namespace.
And with this attribute, even nested elements will be in the target namespace.
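
To make the two remarks about the target namespace concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the schema text is a hypothetical reconstruction consistent with the type information given for this exercise, the namespace URI `http://example.com/dc` is a placeholder, and the attribute that pulls nested elements into the target namespace is taken to be `elementFormDefault="qualified"`. The Python code only reads the schema as XML; it does not validate anything.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
TARGET_NS = "http://example.com/dc"  # placeholder target namespace (assumption)

# Hypothetical schema sketch, not the schema from the original slides.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS}"
           xmlns:dc="{TARGET_NS}"
           targetNamespace="{TARGET_NS}"
           elementFormDefault="qualified">
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>
  <xs:element name="metadata" type="dc:metadata-type"/>
  <xs:complexType name="metadata-type">
    <xs:sequence>
      <xs:element name="title">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute ref="xml:lang"/>
              <xs:attribute name="year" type="xs:integer"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

schema = ET.fromstring(SCHEMA.strip())
target_namespace = schema.get("targetNamespace")
form_default = schema.get("elementFormDefault", "unqualified")

# Global declarations are direct children of xs:schema.
global_elements = [e.get("name") for e in schema.findall(f"{{{XS}}}element")]

# Local (nested) declarations are all other xs:element declarations.
local_elements = [
    e.get("name")
    for e in schema.iter(f"{{{XS}}}element")
    if e.get("name") not in global_elements
]

# Global elements are always in the target namespace; local ones only
# when elementFormDefault is "qualified".
qualified = set(global_elements)
if form_default == "qualified":
    qualified.update(local_elements)
```

With `elementFormDefault` left at its default of `unqualified`, only `metadata` would be in the target namespace and `title` and `publisher` would have to appear without a namespace in instance documents.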

OXYGEN TUTORIAL

Exercise 3: PSVI
This is how you associate the schema with the document if there is a target namespace.
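
A common way to make this association is the `xsi:schemaLocation` attribute, whose value is a whitespace-separated list of namespace and location pairs (`xsi:noNamespaceSchemaLocation` is the counterpart for schemas without a target namespace). The sketch below assumes that mechanism; the namespace URI `http://example.com/dc` and the file name `info.xsd` are placeholders, and the code only reads the hint, since Python's standard library does not perform XML Schema validation.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
DC_NS = "http://example.com/dc"  # placeholder namespace URI (assumption)

# Hypothetical instance document; "info.xsd" is a made-up schema file name.
INSTANCE = (
    f'<dc:metadata xmlns:dc="{DC_NS}" xmlns:xsi="{XSI}" '
    f'xsi:schemaLocation="{DC_NS} info.xsd">'
    '<dc:title xml:lang="en" year="2008">Systems Group</dc:title>'
    '<dc:publisher>ETH Zurich</dc:publisher>'
    '</dc:metadata>'
)

root = ET.fromstring(INSTANCE)

# The attribute value alternates between a namespace and a schema location.
tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
schema_locations = dict(zip(tokens[0::2], tokens[1::2]))
```

A validating processor may use `schema_locations` as a hint for where to find the schema of each namespace; it is not obliged to follow it.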

Exercise 3: PSVI
For metadata:
- [type definition type] complex
- [type definition namespace]
- [type definition anonymous] false
- [type definition name] metadata-type
For title:
- [type definition type] complex
- [type definition namespace]
- [type definition anonymous] true
- [type definition name]
For publisher:
- [type definition type] simple
- [type definition namespace]
- [type definition anonymous] false
- [type definition name] string
For xml:lang (from xml.xsd):
- [type definition type] simple
- [type definition namespace]
- [type definition anonymous] true
- [type definition name]
For year:
- [type definition type] simple
- [type definition namespace]
- [type definition anonymous] false
- [type definition name] integer
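
The pattern behind these answers is that a declaration with a `type` attribute refers to a named type ([type definition anonymous] false), while a declaration with an inline type definition has an anonymous type with no name. The sketch below derives three of the four properties from a schema under several assumptions: the schema is a hypothetical reconstruction with a placeholder namespace URI, the helper `type_properties` is made up for this example, referenced types that are not named complex types in the schema are treated as simple, and declarations made by reference (such as `xml:lang`, defined in xml.xsd) are skipped. A real PSVI is produced by a validating schema processor, not by reading the schema like this.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
TARGET_NS = "http://example.com/dc"  # placeholder target namespace (assumption)

# Hypothetical schema sketch, not the schema from the original slides.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS}" xmlns:dc="{TARGET_NS}"
           targetNamespace="{TARGET_NS}" elementFormDefault="qualified">
  <xs:element name="metadata" type="dc:metadata-type"/>
  <xs:complexType name="metadata-type">
    <xs:sequence>
      <xs:element name="title">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute ref="xml:lang"/>
              <xs:attribute name="year" type="xs:integer"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def type_properties(declaration, schema):
    """Return (type kind, anonymous, type name) for a named declaration."""
    type_ref = declaration.get("type")
    if type_ref is None:
        # Inline type definition: anonymous, so there is no type name.
        inline_complex = declaration.find(f"{{{XS}}}complexType")
        kind = "complex" if inline_complex is not None else "simple"
        return kind, True, None
    local = type_ref.split(":")[-1]
    # Simplification: a referenced type is complex only if the schema
    # itself defines a complex type with that name.
    named_complex = schema.find(f"{{{XS}}}complexType[@name='{local}']")
    kind = "complex" if named_complex is not None else "simple"
    return kind, False, local


schema = ET.fromstring(SCHEMA.strip())
declaration_tags = (f"{{{XS}}}element", f"{{{XS}}}attribute")

# Only declarations with a name are considered; ref="xml:lang" is skipped.
psvi = {
    node.get("name"): type_properties(node, schema)
    for node in schema.iter()
    if node.tag in declaration_tags and node.get("name") is not None
}
```

Under these assumptions the resulting dictionary agrees with the answers given for metadata, title, publisher and year.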

What you should remember today
Data models interpret XML data at a logical level:
- Infoset (with or without a DTD)
- PSVI (after an XML Schema validation)
- XDM (before or after XML Schema validation)
Now that we have data, we can process it!

Hope to see you next week!