XML, XML Schema, XPath and XQuery Query Languages

XML, XML Schema, XPath and XQuery Query Languages CS561 Slides collated from several sources, including D. Suciu at Univ. of Washington

XML Data

XML
W3C standard to complement HTML
Origins: structured text (SGML)
Motivation:
  HTML describes presentation
  XML describes content
HTML ∈ XML ⊂ SGML
CS561 - Spring 2007.

From HTML to XML
HTML describes the presentation

HTML
  <h1> Bibliography </h1>
  <p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu
  <br> Addison Wesley, 1995
  <p> <i> Data on the Web </i> Abiteboul, Buneman, Suciu
  <br> Morgan Kaufmann, 1999

XML
XML describes the content
  <bibliography>
    <book>
      <title> Foundations… </title>
      <author> Abiteboul </author>
      <author> Hull </author>
      <author> Vianu </author>
      <publisher> Addison Wesley </publisher>
      <year> 1995 </year>
    </book>
    …
  </bibliography>

XML Terminology
tags: book, title, author, …
start tag: <book>, end tag: </book>
elements: <book>…</book>, <author>…</author>
elements are nested
empty element: <red></red>, abbreviated <red/>
an XML document has a single root element
well-formed XML document: one whose start and end tags match and nest properly

XML: Attributes
  <book price="55" currency="USD">
    <title> Foundations of Databases </title>
    <author> Abiteboul </author>
    …
    <year> 1995 </year>
  </book>
Attributes are an alternative way to represent data

More XML: Oids and References
  <person id="o555">
    <name> Jane </name>
  </person>
  <person id="o456">
    <name> Mary </name>
    <children idref="o123 o555"/>
  </person>
  <person id="o123" mother="o456">
    <name> John </name>
  </person>
Oids and references in XML are just syntax
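Because oids and references are "just syntax", the application has to follow them itself. A minimal sketch of that, assuming Python's standard xml.etree.ElementTree parser; the <people> wrapper element and the helper name children_of are illustrative assumptions, not part of the slides:

```python
import xml.etree.ElementTree as ET

# Sample data modeled on the slide. The <people> wrapper is an assumption,
# added so that the document has a single root element.
DOC = """
<people>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name><children idref="o123 o555"/></person>
  <person id="o123" mother="o456"><name>John</name></person>
</people>
"""

root = ET.fromstring(DOC)

# Build an id -> element index; the parser does not resolve references for us.
by_id = {p.get("id"): p for p in root.iter("person")}

def children_of(person_id):
    """Return the names of the persons listed in the idref attribute."""
    refs = by_id[person_id].find("children")
    if refs is None:
        return []
    return [by_id[r].findtext("name") for r in refs.get("idref").split()]

# Follow the mother reference of John (id o123) by hand.
mother_of_john = by_id[by_id["o123"].get("mother")].findtext("name")
```

Nothing in the parser checks that a referenced id exists; a dangling reference would surface here only as a KeyError.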

So Far
Differences between “XML data” and “relational data”?
  Data model?
  Typed?
  Homogeneity?
  Correctness?
  Usage/purpose?

“XML Data Model”
Numerous competing models:
  Document Object Model (DOM):
    class hierarchy (node, element, attribute, …)
    defines API to inspect/modify the document
  XML query data model (formal)

XML Namespaces
http://www.w3.org/TR/REC-xml-names
name ::= [prefix:]localpart
  <book xmlns:isbn="www.isbn-org.org/def">
    <title> … </title>
    <number> 15 </number>
    <isbn:number> …. </isbn:number>
  </book>

XML Namespaces
syntactic: <number>, <isbn:number>
semantic: provide URL for “shared” schema
  <tag xmlns:mystyle="http://…">        (defined here)
    …
    <mystyle:title> … </mystyle:title>
    <mystyle:number> … </mystyle:number>
  </tag>
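How a prefix is resolved can be sketched with Python's standard xml.etree.ElementTree, which expands each qualified name to {namespace-URI}localpart. The element values below are placeholders chosen for illustration, since the slides elide them:

```python
import xml.etree.ElementTree as ET

# Values inside the elements are placeholders (assumptions), not slide data.
DOC = """
<book xmlns:isbn="www.isbn-org.org/def">
  <title>placeholder title</title>
  <number>15</number>
  <isbn:number>placeholder</isbn:number>
</book>
"""

root = ET.fromstring(DOC)

# The prefix used in queries is local to this mapping; only the URI matters.
ns = {"isbn": "www.isbn-org.org/def"}

plain = root.find("number").text               # the unqualified <number>
qualified = root.find("isbn:number", ns).text  # the <isbn:number> element
expanded_tag = root.find("isbn:number", ns).tag
```

The two number elements do not clash because the parser stores the second one under its expanded name.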

So Far
What are “namespaces” good for?
Are they typically available for relational databases?

Schemas for XML

DTD - Element Type Definitions
  <!ELEMENT paper (title, author*, year, (journal|conference))>

XML Schemas
generalizes DTDs (an SGML derivative)
uses XML syntax instead of DTD syntax
two main documents: structure and data types
XML Schema is more powerful but more complex

DTD:
  <!ELEMENT paper (title, author*, year, (journal|conference))>

XML Schema:
  <xsd:element name="paper" type="papertype"/>
  <xsd:complexType name="papertype">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element name="author" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="year"/>
      <xsd:choice>
        <xsd:element name="journal"/>
        <xsd:element name="conference"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>

So Far
Differences between “XML schema” and “relational schema”?
  Purpose? Do we need it?
  Definition time?
  Strictness of typing?
  Underlying model?

Elements versus Types in XML Schema
DTD:
  <!ELEMENT person (name, address)>

XML Schema, with the type defined inside the element:
  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="address" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

XML Schema, with a separately named type:
  <xsd:element name="person" type="ttt"/>
  <xsd:complexType name="ttt">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="address" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

Elements versus Types in XML Schema
Simple types (integers, strings, ...)
Complex types (regular expressions, like in DTDs)
Element-type-element alternation:
  Root element has a complex type
  Complex type is a regular expression of elements
  Those elements have their complex types
  ...
  Leaves have simple types

Local and Global Types in XML Schema
Local type:
  <xsd:element name="person">
    [define locally the person’s type]
  </xsd:element>
Global type:
  <xsd:element name="person" type="ttt"/>
  <xsd:complexType name="ttt">
    [define here the type ttt]
  </xsd:complexType>
Global types can be reused in other elements

Local vs. Global Elements in XML Schema
Local element:
  <xsd:complexType name="ttt">
    <xsd:sequence>
      <xsd:element name="address" type="..."/>
      ...
    </xsd:sequence>
  </xsd:complexType>
Global element:
  <xsd:element name="address" type="..."/>
  <xsd:complexType name="ttt">
    <xsd:sequence>
      <xsd:element ref="address"/>
      ...
    </xsd:sequence>
  </xsd:complexType>
Global elements: like in DTDs

Regular Expressions in XML Schema
Recall the element-type-element alternation:
  <xsd:complexType name="....">
    [regular expression on elements]
  </xsd:complexType>
Regular expressions:
  <xsd:sequence> A B C </...>
  <xsd:choice> A B C </...>
  <xsd:group> A B C </...>
  <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...>
  <xsd:... minOccurs="0" maxOccurs="1"> .. </...>

Regular Expressions in XML Schema
  <xsd:sequence> A B C </...>                               =  A B C
  <xsd:choice> A B C </...>                                 =  A | B | C
  <xsd:group> A B C </...>                                  =  (A B C)
  <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...>   =  (...)*
  <xsd:... minOccurs="0" maxOccurs="1"> .. </...>           =  (...)?
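The view of a content model as a regular expression over child element names can be made concrete. The sketch below, in Python, encodes the paper rule from the DTD slide (title, author*, year, journal or conference) as an ordinary regular expression; the encoding of a child sequence as space-terminated names and the function name matches_paper_model are assumptions made for illustration, not how a schema validator is implemented:

```python
import re

# Content model of the DTD rule: title, author*, year, (journal|conference).
# Each child element name is followed by one space in the encoded string.
PAPER_MODEL = re.compile(r"title (author )*year (journal|conference) ")

def matches_paper_model(child_tags):
    """Check a sequence of child element names against the content model."""
    encoded = "".join(tag + " " for tag in child_tags)
    return PAPER_MODEL.fullmatch(encoded) is not None

valid = matches_paper_model(["title", "author", "author", "year", "journal"])
no_authors = matches_paper_model(["title", "year", "conference"])
missing_venue = matches_paper_model(["title", "year"])
```

Sequence, choice and repetition in the schema map directly onto concatenation, | and * in the pattern.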

Derived Types by Extensions
  <complexType name="Address">
    <sequence>
      <element name="street" type="string"/>
      <element name="city" type="string"/>
    </sequence>
  </complexType>

  <complexType name="USAddress">
    <complexContent>
      <extension base="ipo:Address">
        <sequence>
          <element name="state" type="ipo:USState"/>
          <element name="zip" type="positiveInteger"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
Corresponds to inheritance

Key Constraints in XML

Keys in XML Schema
XML:
  <purchaseReport>
    <regions>
      <zip code="95819">
        <part number="872-AA" quantity="1"/>
        <part number="926-AA" quantity="1"/>
        <part number="833-AA" quantity="1"/>
        <part number="455-BX" quantity="1"/>
      </zip>
      <zip code="63143">
        <part number="455-BX" quantity="4"/>
      </zip>
    </regions>
    <parts>
      <part number="872-AA">Lawnmower</part>
      <part number="926-AA">Baby Monitor</part>
      <part number="833-AA">Lapis Necklace</part>
      <part number="455-BX">Sturdy Shelves</part>
    </parts>
  </purchaseReport>

XML Schema for Key:
  <key name="NumKey">
    <selector xpath="parts/part"/>
    <field xpath="@number"/>
  </key>
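What the NumKey constraint demands (every selected part has a number, and no number repeats) can be checked by hand. A rough sketch in Python with the standard xml.etree.ElementTree; the function key_violations is a hypothetical helper that handles only a single attribute field, not a schema validator:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Only the <parts> section of the slide's document is reproduced here.
REPORT = """
<purchaseReport>
  <parts>
    <part number="872-AA">Lawnmower</part>
    <part number="926-AA">Baby Monitor</part>
    <part number="833-AA">Lapis Necklace</part>
    <part number="455-BX">Sturdy Shelves</part>
  </parts>
</purchaseReport>
"""

def key_violations(root, selector, attribute):
    """List problems for the nodes chosen by selector, keyed on one attribute."""
    values = [node.get(attribute) for node in root.findall(selector)]
    # "Existence": every selected node must carry the attribute.
    problems = ["missing @" + attribute for v in values if v is None]
    # "Uniqueness": no value may occur more than once.
    counts = Counter(v for v in values if v is not None)
    problems += ["duplicate " + v for v, n in counts.items() if n > 1]
    return problems

report = ET.fromstring(REPORT)
violations = key_violations(report, "parts/part", "number")
```

The selector and field arguments play the roles of the selector and field XPath expressions in the key declaration.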

Keys in XML Schema
In general, the syntax is:
  <key name="someDummyNameHere">
    <selector xpath="p"/>
    <field xpath="p1"/>
    <field xpath="p2"/>
    . . .
    <field xpath="pk"/>
  </key>
Notes:
  All XPath expressions “start” at the element currently being defined
  Each field must identify a single “node”

Keys in XML Schema
Unique = guarantees uniqueness
Key = guarantees uniqueness and existence
All XPath expressions are “restricted”:
  /a/b | /a/c     OK for selector
  //a/b/*/c       OK for field
Note: better than DTD’s ID mechanism

Examples of Keys in XML Schema
  <key name="fullName">
    <selector xpath=".//person"/>
    <field xpath="firstname"/>
    <field xpath="surname"/>
  </key>
  <unique name="nearlyID">
    <selector xpath=".//*"/>
    <field xpath="@id"/>
  </unique>
Note: each person must have a single firstname and a single surname

Foreign Keys in XML Schema
Example:
  <keyref name="personRef" refer="fullName">
    <selector xpath=".//personPointer"/>
    <field xpath="@first"/>
    <field xpath="@last"/>
  </keyref>
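The personRef constraint says that the (first, last) pair on every personPointer must match the (firstname, surname) key of some person. A small Python sketch of that check, using the standard xml.etree.ElementTree; the sample document, including the <staff> root and all names in it, is invented for illustration:

```python
import xml.etree.ElementTree as ET

# Invented sample data: two persons and two pointers, one of them dangling.
DOC = """
<staff>
  <person><firstname>Jane</firstname><surname>Doe</surname></person>
  <person><firstname>John</firstname><surname>Roe</surname></person>
  <personPointer first="Jane" last="Doe"/>
  <personPointer first="Ann" last="Poe"/>
</staff>
"""

root = ET.fromstring(DOC)

# The referenced key "fullName": one (firstname, surname) tuple per person.
keys = {(p.findtext("firstname"), p.findtext("surname"))
        for p in root.findall(".//person")}

# The keyref: every (first, last) pair must appear among the keys.
dangling = [(r.get("first"), r.get("last"))
            for r in root.findall(".//personPointer")
            if (r.get("first"), r.get("last")) not in keys]
```

This mirrors a relational foreign-key check, except that the referencing and referenced values are located by path expressions rather than by column names.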

So Far
Differences between keys/foreign keys in XML and in the relational model?
  Purpose?
  Underlying model?

XPath: “The Basic Building Block”

XPath
Goal: permit access to some nodes of a document
Main construct: axis navigation
Navigation step: axis + node-test + predicates
Examples:
  descendant::node()
  child::author
  attribute::booktitle = "XML"

XPath
An XPath path consists of one or more navigation steps, separated by “/”
Navigation step: axis + node-test + predicates
Examples:
  /descendant::node()/child::author
  /descendant::node()/child::author[parent/attribute::booktitle = "XML"][2]
XPath offers shortcuts:
  no axis means child
  //  ≡  /descendant-or-self::node()/

XPath - Child Axis Navigation
author is shorthand for child::author.
Examples:
  aaa      -- all the child nodes labeled aaa
  aaa/bbb  -- all the bbb grandchildren under aaa children
  */bbb    -- all the bbb grandchildren under any child
Notes:
  .  -- the context node
  /  -- the root node
[Figure: example tree with nodes labeled aaa, bbb, ccc (numbered 1 to 7) and a marked context node]

XPath - Child Axis Navigation
  /doc      -- all doc children of the root
  ./aaa     -- all aaa children of the context node (equivalent to aaa)
  text()    -- all text children of the context node
  node()    -- all children of the context node (includes text nodes; attribute nodes are not children and are reached through the attribute axis)
  ..        -- parent of the context node
  .//       -- the context node and all its descendants
  //        -- the root node and all its descendants
  //text()  -- all the text nodes in the document
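Several of these abbreviated paths can be tried with Python's standard xml.etree.ElementTree, which implements a limited subset of XPath (no text() or node() tests, for instance). The sample tree is an assumption; the <doc> element is used as the context node:

```python
import xml.etree.ElementTree as ET

# Assumed sample tree; <doc> plays the role of the context node.
DOC = ("<doc>"
       "<aaa><bbb>x</bbb><ccc/></aaa>"
       "<aaa><bbb>y</bbb></aaa>"
       "<zzz><bbb>z</bbb></zzz>"
       "</doc>")
context = ET.fromstring(DOC)

aaa_children = context.findall("aaa")                           # aaa
bbb_under_aaa = [e.text for e in context.findall("aaa/bbb")]    # aaa/bbb
bbb_grandchildren = [e.text for e in context.findall("*/bbb")]  # */bbb
all_bbb = [e.text for e in context.findall(".//bbb")]           # .//bbb
parents_of_ccc = [e.tag for e in context.findall(".//ccc/..")]  # .. step
```

Results come back in document order, which matches the ordered data model the slides describe.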

Predicates
  [2]         -- the second child node of the context node
  chapter[5]  -- the fifth chapter child of the context node
  [last()]    -- the last child node of the context node
  chapter[title="introduction"]  -- the chapter children of the context node that have one or more title children whose string-value is “introduction” (the string-value is the concatenation of all text in descendant text nodes)
  person[.//firstname = "joe"]   -- the person children of the context node that have among their descendants a firstname element with string-value “joe”
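Positional and value predicates behave as follows in a small experiment. This is a sketch with Python's standard xml.etree.ElementTree, which supports these predicate forms but not a descendant path inside a predicate such as .//firstname; the chapter titles are made up:

```python
import xml.etree.ElementTree as ET

# Made-up sample: a book with three chapters.
DOC = """
<book>
  <chapter><title>introduction</title></chapter>
  <chapter><title>syntax</title></chapter>
  <chapter><title>queries</title></chapter>
</book>
"""
book = ET.fromstring(DOC)

def titles(path):
    """Evaluate path from the book element and return the chapter titles."""
    return [chapter.findtext("title") for chapter in book.findall(path)]

second = titles("chapter[2]")                    # positions start at 1
last = titles("chapter[last()]")
intro = titles("chapter[title='introduction']")  # compares the string-value
```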

Axis navigation
So far, our expressions have moved us down the tree, to child nodes. Exceptions are:
  .    stay where you are
  /    go to the root
  //   all descendants of the root
  .//  all descendants of the context node

Axis navigation
XPath has several axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self
Some of these describe single nodes: self, parent
Some describe sequences of nodes: all others

XPath Navigation Axes
[Figure: tree diagram illustrating the axes ancestor, preceding-sibling, following-sibling, self, child, attribute, preceding, following, namespace, descendant]

XPath Abbreviated Syntax
  (nothing)   child::
  @           attribute::
  //          /descendant-or-self::node()/
  .           self::node()
  .//         descendant-or-self::node()
  ..          parent::node()
  /           (document root)

So Far
Differences between SQL and XPath?
  What are similar query capabilities?
  What features does SQL have, but not XPath?
  What features does XPath support, but not SQL?
Is XPath a full-fledged query language?

Query Languages - XQuery

Summary of XQuery
FLWR expressions
FOR and LET expressions
Collections and sorting
Resources:
  XQuery: A Query Language for XML. Chamberlin, Florescu, et al.
  W3C recommendation: www.w3.org/TR/xquery/

XQuery
Based on Quilt (which is based on XML-QL)
http://www.w3.org/TR/xquery/2/2001
XML Query data model (ordered)

FLWR (“Flower”) Expressions
  FOR ... LET ... FOR ... LET ...
  WHERE ...
  RETURN ...

XQuery
Find the titles of all books published after 1995:
  FOR $x IN document("bib.xml")/bib/book
  WHERE $x/year > 1995
  RETURN $x/title
What does the result look like?

XQuery
Find the titles of all books published after 1995:
  FOR $x IN document("bib.xml")/bib/book
  WHERE $x/year > 1995
  RETURN $x/title
Result:
  <title> abc </title>
  <title> def </title>
  <title> ghi </title>
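The FOR / WHERE / RETURN pipeline reads much like a list comprehension. The following Python sketch imitates the query with the standard xml.etree.ElementTree; it illustrates the semantics only and is not an XQuery engine, and the bib.xml contents are invented sample data:

```python
import xml.etree.ElementTree as ET

# Invented stand-in for bib.xml.
BIB = """
<bib>
  <book><title>abc</title><year>1999</year></book>
  <book><title>old</title><year>1990</year></book>
  <book><title>def</title><year>2001</year></book>
</bib>
"""
bib = ET.fromstring(BIB)

titles = [
    ET.tostring(x.find("title"), encoding="unicode")  # RETURN $x/title
    for x in bib.findall("book")                      # FOR $x IN .../bib/book
    if int(x.findtext("year")) > 1995                 # WHERE $x/year > 1995
]
```

Each binding of $x that survives the WHERE clause contributes one title element to the result, in document order.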

XQuery Example
  FOR $a IN (document("bib.xml")/bib/book[publisher="Morgan Kaufmann"]/author)
  RETURN <result>
           $a,
           FOR $t IN /bib/book[author=$a]/title
           RETURN $t
         </result>

XQuery Example
For each author of a book by Morgan Kaufmann, list all books she published:
  FOR $a IN (document("bib.xml")/bib/book[publisher="Morgan Kaufmann"]/author)
  RETURN <result>
           $a,
           FOR $t IN /bib/book[author=$a]/title
           RETURN $t
         </result>
What is the query result?

XQuery
Result:
  <result>
    <author> Jones </author>
    <title> abc </title>
    <title> def </title>
  </result>
  <result>
    <author> Smith </author>
    <title> ghi </title>
  </result>

XQuery Example: Duplicates
For each author of a book by Morgan Kaufmann, list all books she published:
  FOR $a IN distinct(document("bib.xml")/bib/book[publisher="Morgan Kaufmann"]/author)
  RETURN <result>
           $a,
           FOR $t IN /bib/book[author=$a]/title
           RETURN $t
         </result>
distinct = a function that eliminates duplicates

Example XQuery Result
Result:
  <result>
    <author> Jones </author>
    <title> abc </title>
    <title> def </title>
  </result>
  <result>
    <author> Smith </author>
    <title> ghi </title>
  </result>

XQuery
FOR $x IN expr
  binds $x to each element in the list expr
  useful for iteration over some input list
LET $x := expr
  binds $x to the entire list expr
  useful for common subexpressions and for grouping and aggregations

XQuery with LET Clause
  <big_publishers>
    FOR $p IN distinct(document("bib.xml")//publisher)
    LET $b := document("bib.xml")//book[publisher = $p]
    WHERE count($b) > 100
    RETURN $p
  </big_publishers>
count = an (aggregate) function that returns the number of elements

XQuery
Find books whose price is larger than average:
  LET $a := avg(document("bib.xml")/bib/book/@price)
  FOR $b IN document("bib.xml")/bib/book
  WHERE $b/@price > $a
  RETURN $b
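The role of LET as a single binding for a whole collection shows up clearly when the query is imitated in Python. A sketch using the standard library only; the prices and titles are invented sample data:

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Invented stand-in for bib.xml, with price as an attribute as on the slide.
BIB = """
<bib>
  <book price="30"><title>abc</title></book>
  <book price="50"><title>def</title></book>
  <book price="100"><title>ghi</title></book>
</bib>
"""
bib = ET.fromstring(BIB)
books = bib.findall("book")

# LET $a := avg(.../bib/book/@price): one value computed over the whole list.
a = mean(float(b.get("price")) for b in books)

# FOR $b IN .../bib/book WHERE $b/@price > $a RETURN $b (titles shown here).
above_average = [b.findtext("title") for b in books
                 if float(b.get("price")) > a]
```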

FOR versus LET
FOR: binds node variables → iteration
LET: binds collection variables → one value

FOR vs. LET
  FOR $x IN document("bib.xml")/bib/book
  RETURN <result> $x </result>
Returns:
  <result> <book>...</book> </result>
  ...

  LET $x := document("bib.xml")/bib/book
  RETURN <result> $x </result>
Returns:
  <result> <book>...</book> <book>...</book> ... </result>
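The difference between the two results can be reproduced in a few lines. This is a Python sketch with the standard xml.etree.ElementTree; the helper name wrap and the two-book sample are illustrative assumptions:

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring("<bib><book>A</book><book>B</book></bib>")

def wrap(items):
    """Build a <result> element around the given elements and serialize it."""
    result = ET.Element("result")
    for item in items:
        result.append(item)
    return ET.tostring(result, encoding="unicode")

# FOR: the RETURN clause runs once per book, giving one <result> per binding.
for_results = [wrap([x]) for x in bib.findall("book")]

# LET: $x is bound once to the whole list, giving a single <result>.
let_result = wrap(bib.findall("book"))
```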

Collections in XQuery
Ordered and unordered collections:
  /bib/book/author            = an ordered collection
  distinct(/bib/book/author)  = an unordered collection
LET $a := /bib/book  →  $a is a collection
$b/author  →  a collection (several authors...)
  RETURN <result> $b/author </result>
Returns:
  <result> <author>...</author> <author>...</author> ... </result>

XQuery Summary
FOR-LET-WHERE-RETURN = FLWR
  FOR/LET clauses  →  list of tuples
  WHERE clause     →  list of tuples
  RETURN clause    →  instances of the XQuery data model

XQuery
Some more query features

Sorting in XQuery
  <publisher_list>
    FOR $p IN distinct(document("bib.xml")//publisher)
    RETURN <publisher>
             <name> $p/text() </name> ,
             FOR $b IN document("bib.xml")//book[publisher = $p]
             RETURN <book> $b/title , $b/@price </book>
                    SORTBY (price DESCENDING)
           </publisher>
           SORTBY (name)
  </publisher_list>

Sorting in XQuery
Sorting arguments refer to the name space of the RETURN clause, not of the FOR clause.
TIP: To sort on an element you don’t want to display, first return it, then remove it with an additional query.

If-Then-Else
  FOR $h IN //holding
  RETURN <holding>
           $h/title,
           IF $h/@type = "Journal"
             THEN $h/editor
             ELSE $h/author
         </holding>
         SORTBY (title)

Existential Quantifiers
  FOR $b IN //book
  WHERE SOME $p IN $b//para SATISFIES
        contains($p, "sailing") AND contains($p, "windsurfing")
  RETURN $b/title

Universal Quantifiers
  FOR $b IN //book
  WHERE EVERY $p IN $b//para SATISFIES
        contains($p, "sailing")
  RETURN $b/title
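SOME and EVERY correspond closely to Python's any() and all(). The sketch below imitates both quantified queries with the standard xml.etree.ElementTree; the <library> document and its paragraph texts are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Invented sample: book A mentions both sports in one paragraph,
# book B mentions sailing in every paragraph.
DOC = """
<library>
  <book><title>A</title>
    <para>sailing and windsurfing</para><para>cooking</para></book>
  <book><title>B</title>
    <para>sailing</para><para>more sailing</para></book>
</library>
"""
books = ET.fromstring(DOC).findall(".//book")

def paras(book):
    """String-values of all para descendants of a book ($b//para)."""
    return ["".join(p.itertext()) for p in book.iter("para")]

# WHERE SOME $p IN $b//para SATISFIES contains(...) AND contains(...)
some = [b.findtext("title") for b in books
        if any("sailing" in p and "windsurfing" in p for p in paras(b))]

# WHERE EVERY $p IN $b//para SATISFIES contains($p, "sailing")
every = [b.findtext("title") for b in books
         if all("sailing" in p for p in paras(b))]
```

Note that all() over an empty list is true, so a book without any para elements would satisfy the universal query.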

So Far
Similarities between SQL and XQuery?
Differences between SQL and XQuery?

XML, XML Data Model, XML Schema, XPath, XQuery