CIS 550 Handout 7, Fall 2001 -- XPath and XQuery


1 CIS 550 Handout 7 -- XPath and XQuery

2 URLs -- XPath
http://www.w3.org/TR/xpath
This is the “recommendation”. It is dense and has few examples, so it is difficult to extract the “big picture” from the morass of detail.
http://www.zvon.org/xxl/XPathTutorial/General/examples.html
A tutorial with some simple examples, maybe too simple. There are lots of tutorials on the web.

3 URLs -- XQuery
http://www.w3.org/TR/xquery/
The basic recommendation. Plenty of examples, so work through these first.
http://www.w3.org/TR/query-semantics/
A formal semantics for XQuery. Despite its forbidding title, it is remarkably readable. It also discusses a type system for XQuery.
http://www.w3.org/TR/xmlquery-use-cases
A bunch of example queries and their solutions in XQuery (not surprising, since XQuery is Turing-complete!)

4 How to Identify Nodes in a Tree -- Regular Path Expressions
[Figure: a tree rooted at db, with edges labeled emps, depts, dept, mgr, emp and name, and leaf values “Mary”, “John” and “Bill”]
In the normal syntax of regular expressions:
–db.emps.emp
–db.(depts.dept.mgr | emps.emp)
–db._*.name
N.B. Regular path expressions have nothing to do with regular expressions in DTDs.
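One way to make the regular path expressions above concrete is to treat each root-to-node label path as a string and match it with an ordinary regular expression. The sketch below is only an illustration under assumptions: the XML document is a hypothetical encoding of the slide's tree, and `label_paths`, `to_regex` and `select` are made-up helper names.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML encoding of the tree on the slide (structure assumed).
DOC = """
<db>
  <emps>
    <emp><name>Mary</name></emp>
    <emp><name>John</name></emp>
  </emps>
  <depts>
    <dept><mgr><name>Bill</name></mgr></dept>
  </depts>
</db>
"""

def label_paths(elem, prefix=()):
    """Yield (label path, element) pairs; every label is followed by '/'."""
    path = prefix + (elem.tag,)
    yield "/".join(path) + "/", elem
    for child in elem:
        yield from label_paths(child, path)

def to_regex(regular_path):
    """Translate the slide's syntax (labels, '.', '_', '|', '*') to a Python regex."""
    def translate(match):
        token = match.group(0)
        if token == "_":
            return "(?:[^/]+/)"          # '_' matches any single label
        if token == "." or token.isspace():
            return ""                    # concatenation is implicit in a regex
        return "(?:" + re.escape(token) + "/)"
    return re.sub(r"[A-Za-z_][\w-]*|\.|\s+", translate, regular_path)

def select(root, regular_path):
    """Return, in document order, the elements whose label path matches."""
    pattern = re.compile(to_regex(regular_path))
    return [elem for path, elem in label_paths(root) if pattern.fullmatch(path)]

root = ET.fromstring(DOC)
names = [e.text for e in select(root, "db._*.name")]
```

With this document, `select(root, "db.emps.emp")` yields the two emp elements, and `names` holds the text of every name node at any depth.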

5 More examples
With the DTD: … the regular path expression (PERSON.MOTHER)* identifies matrilineal ancestry.
XPath is a “superset of a subset” of regular path expressions. (It cannot express this set of nodes.) However, it is not limited to moving “down” the tree.

6 XPath
Primary goal: to give access to selected nodes of a given document.
XPath's main construct is axis navigation.
An XPath path consists of one or more navigation steps, separated by /.
A navigation step is a triplet: axis + node-test + list of predicates.
Examples:
–/descendant::node()/child::author
–/descendant::node()/child::author[parent/attribute::booktitle = "XML"][2]
XPath also offers some shortcuts:
–no axis means child
–// means /descendant-or-self::node()/

7 XPath -- child axis navigation
author is shorthand for child::author. Examples (node numbers refer to the figure):
–aaa -- all the child nodes labeled aaa (1,3)
–aaa/bbb -- all the bbb grandchildren reached through aaa children (4)
–*/bbb -- all the bbb grandchildren of any child (4,6)
–. -- the context node
–/ -- the root node
[Figure: a tree below the context node whose nodes, numbered 1 to 7, are labeled aaa, bbb or ccc]

8 XPath -- child axis navigation (cont)
–/doc -- all the doc children of the root
–./aaa -- all the aaa children of the context node (equivalent to aaa)
–text() -- all the text children of the context node
–node() -- all the children of the context node (includes text nodes, but not attribute nodes, which are reached through the attribute axis)
–.. -- parent of the context node
–.// -- the context node and all its descendants
–// -- the root node and all its descendants
–//para -- all the para nodes in the document
–//text() -- all the text nodes in the document
–@font -- the font attribute node of the context node
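The child-axis shorthand can be tried out with Python's `xml.etree.ElementTree`, which supports a limited subset of XPath. This is a sketch under assumptions: the tree below is hypothetical and only chosen so that the results quoted for aaa, aaa/bbb and */bbb come out as (1,3), (4) and (4,6); the `id` attributes stand in for the node numbers, and `ids` is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree; the exact shape of the slide's figure is an assumption.
DOC = """
<ctx>
  <aaa id="1"><bbb id="4"/><ccc id="5"/></aaa>
  <bbb id="2"><bbb id="6"/></bbb>
  <aaa id="3"><ccc id="7"/></aaa>
</ctx>
"""
context = ET.fromstring(DOC)   # plays the role of the context node

def ids(path):
    """Evaluate a path from the context node and return the matched node ids."""
    return [elem.get("id") for elem in context.findall(path)]

children_aaa = ids("aaa")        # child::aaa
via_aaa = ids("aaa/bbb")         # bbb children of aaa children
via_any = ids("*/bbb")           # bbb children of any child
all_bbb = ids(".//bbb")          # bbb descendants of the context node

# ElementTree has no attribute nodes; Element.get() stands in for @id here.
first_id = context.find("aaa").get("id")
```

Note that ElementTree does not implement `text()` or `node()` as path steps, so those two entries from the list have no direct counterpart in this sketch.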

9 Predicates
–[2] -- the second child node of the context node
–chapter[5] -- the fifth chapter child of the context node
–[last()] -- the last child node of the context node
–chapter[title="introduction"] -- the chapter children of the context node that have one or more title children whose string-value is “introduction” (the string-value is the concatenation of the text of all descendant text nodes)
–person[.//firstname = "joe"] -- the person children of the context node that have among their descendants a firstname element with string-value “joe”
–From the XPath specification: NOTE: If $x is bound to a node-set, then $x = "foo" does not mean the same as not($x != "foo"): the former is true if some node in $x has the string-value foo, the latter only if all nodes in $x do.
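The note about `$x = "foo"` follows from comparisons on node-sets being existential. A minimal Python sketch of that reading, with hypothetical helper names `string_value`, `eq` and `ne` (this models the rule, it is not an XPath engine):

```python
import xml.etree.ElementTree as ET

def string_value(elem):
    """String-value of an element: the text of all descendant text nodes, concatenated."""
    return "".join(elem.itertext())

def eq(node_set, literal):
    """node-set = literal: true if SOME node has that string-value."""
    return any(string_value(n) == literal for n in node_set)

def ne(node_set, literal):
    """node-set != literal: true if SOME node has a different string-value."""
    return any(string_value(n) != literal for n in node_set)

doc = ET.fromstring("<r><x>foo</x><x>bar</x></r>")
xs = doc.findall("x")                    # a node-set with two members

some_is_foo = eq(xs, "foo")              # True: one x has string-value foo
all_are_foo = not ne(xs, "foo")          # False: the other x is bar

# On an empty node-set the two forms disagree the other way round.
empty_eq = eq([], "foo")                 # False
empty_not_ne = not ne([], "foo")         # True
```

The same existential rule is why a predicate such as chapter[title="introduction"] succeeds as soon as one title child matches.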

10 Unions of Path Expressions
employee | consultant -- the union of the employee and consultant nodes that are children of the context node
For some reason person/(employee|consultant) -- as in regular path expressions -- is not allowed.
However, person/node()[boolean(employee|consultant)] is allowed!
From the XPath specification:
–The boolean function converts its argument to a boolean as follows:
  a number is true if and only if it is neither positive or negative zero nor NaN
  a node-set is true if and only if it is non-empty
  a string is true if and only if its length is non-zero
  an object of a type other than the four basic types is converted to a boolean in a way that is dependent on that type
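The conversion rules quoted from the specification can be written down directly. The function below is a sketch that assumes Python stand-ins for the XPath types (float or int for number, str for string, a list or tuple for node-set); `xpath_boolean` is a made-up name.

```python
import math

def xpath_boolean(value):
    """Convert a value to a boolean following the rules of XPath's boolean()."""
    if isinstance(value, bool):
        return value                      # already a boolean
    if isinstance(value, (int, float)):
        # true iff neither positive or negative zero nor NaN
        return value != 0 and not math.isnan(value)
    if isinstance(value, str):
        return len(value) > 0             # true iff the length is non-zero
    if isinstance(value, (list, tuple)):
        return len(value) > 0             # node-set: true iff non-empty
    raise TypeError("no XPath conversion assumed for " + type(value).__name__)

surprising = xpath_boolean("false")       # a non-empty string, hence true
```

The last line shows a common trap: the string "false" converts to true, because only the length of a string matters.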

11 Axis navigation
So far, nearly all our expressions have moved us down the tree by moving to child nodes. Exceptions were:
–. -- stay where you are
–/ -- go to the root
–// -- all descendants of the root
–.// -- all descendants of the context node
All other expressions have been abbreviations for child::…, e.g. child::para. child is an example of an axis.
XPath has several axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self
–Some of these (self, parent) describe single nodes, others describe sequences of nodes.
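A few of the axes that do not move down the tree can be sketched in Python. ElementTree elements carry no parent pointer, so the sketch builds a child-to-parent map first; the document and the function names are hypothetical, and the document node above the root element is not modelled.

```python
import xml.etree.ElementTree as ET

DOC = "<doc><a><b/><c/><d/></a><e/></doc>"   # hypothetical document
root = ET.fromstring(DOC)

# Map every element to its parent (the root element has no entry).
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """ancestor axis: parent, grandparent, ... up to the root element."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

def following_siblings(node):
    """following-sibling axis: later children of the same parent."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """preceding-sibling axis, listed here in document order."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)]

c = root.find("a/c")
up = [e.tag for e in ancestors(c)]
after = [e.tag for e in following_siblings(c)]
before = [e.tag for e in preceding_siblings(c)]
```

For node c this gives its ancestors a and doc, the following sibling d, and the preceding sibling b.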

12 XPath Navigation Axes (merci, Arnaud Sahuguet)
[Figure: diagram of the axes ancestor, descendant, following, preceding, following-sibling, preceding-sibling, child, attribute, namespace and self]

13 XPath abbreviated syntax
Abbreviation    Expanded form
(nothing)       child::
@               attribute::
//              /descendant-or-self::node()/
.               self::node()
.//             self::node()/descendant-or-self::node()/
..              parent::node()
/               (document root)
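The abbreviations can be expanded mechanically, one step at a time. The sketch below is a simplification under a stated assumption: it splits the path on `/`, so predicates that themselves contain `/` are not handled. `expand` and `expand_step` are made-up names.

```python
def expand_step(step):
    """Expand a single abbreviated step into axis::node-test form."""
    if step == "" or "::" in step:
        return step                        # empty (around a '/') or already explicit
    if step == ".":
        return "self::node()"
    if step == "..":
        return "parent::node()"
    if step.startswith("@"):
        return "attribute::" + step[1:]
    return "child::" + step                # no axis means child

def expand(path):
    """Expand the abbreviations in a path (predicates containing '/' unsupported)."""
    steps = path.replace("//", "/descendant-or-self::node()/").split("/")
    return "/".join(expand_step(step) for step in steps)

expanded = expand("//para")
```

For example, `expand("../@font")` spells out the parent and attribute axes that the short form hides.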

14 XPath
Reasonably widely adopted -- in XML-Schema and query languages.
Neither more expressive nor less expressive than regular path expressions (can’t do (ab)*).
Particularly messy in some areas:
–defining order of results
–overloading of operations, e.g. [chapter/title = "Introduction"]; why not ["Introduction" IN chapter/title]?

15 XQuery
Proposed by Chamberlin, Robie and Florescu. Goals (from the authors’ slides):
–Leverage the most effective features of several existing and proposed query languages
–Design a small, clean, implementable language
–Cover the functionality required by all the XML Query use cases in a single language
–Write queries that fit on a slide

16 XQuery = XPath + “comprehension” syntax
XML-QL:  where … in … in …      (bind variables)
         construct …            (use variables)
Quilt:   for x in … y in …      (bind variables)
         where … return …       (use variables)

17 Examples from XQuery
List the titles of books published by Morgan Kaufmann in 1998.
    FOR $b IN document("bib.xml")//book
    WHERE $b/publisher = "Morgan Kaufmann"
      AND $b/year = "1998"
    RETURN $b/title
(On the original slide the XPath expressions are shown in orange.)
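The FOR / WHERE / RETURN shape of this query lines up closely with a list comprehension. The Python sketch below is an analogy, not an XQuery implementation: the inline document is a hypothetical stand-in for bib.xml, it returns title strings rather than title elements, and it assumes each book has a single publisher and year.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml; element names follow the query.
BIB = """
<bib>
  <book><title>Book A</title><publisher>Morgan Kaufmann</publisher><year>1998</year></book>
  <book><title>Book B</title><publisher>Morgan Kaufmann</publisher><year>1999</year></book>
  <book><title>Book C</title><publisher>Other Press</publisher><year>1998</year></book>
</bib>
"""
bib = ET.fromstring(BIB)

titles = [
    b.findtext("title")                                  # RETURN $b/title
    for b in bib.iter("book")                            # FOR $b IN ...//book
    if b.findtext("publisher") == "Morgan Kaufmann"      # WHERE $b/publisher = ...
    and b.findtext("year") == "1998"                     #   AND $b/year = "1998"
]
```

Only the first book satisfies both conditions, so `titles` holds a single entry.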

18 Examples from XQuery (cont)
List each publisher and the average price of its books.
    FOR $p IN distinct(document("bib.xml")//publisher)
    LET $a := avg(document("bib.xml")//book[publisher = $p]/price)
    RETURN {$p/text()} {$a}
LET binds a variable to a value. It does not cause an iteration.
Does this create a (well-formed) XML document?
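The difference between FOR and LET shows up clearly in a Python analogue: the loop iterates over the distinct publishers, while the assignment inside it binds one value per iteration. This is a sketch over a hypothetical stand-in for bib.xml; it produces (publisher, average) pairs rather than XML.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical stand-in for bib.xml.
BIB = """
<bib>
  <book><publisher>P1</publisher><price>10</price></book>
  <book><publisher>P2</publisher><price>30</price></book>
  <book><publisher>P1</publisher><price>20</price></book>
</bib>
"""
bib = ET.fromstring(BIB)

# distinct(...//publisher): distinct publisher names in first-seen order
publishers = list(dict.fromkeys(p.text for p in bib.iter("publisher")))

averages = []
for p in publishers:                                  # FOR $p IN distinct(...)
    prices = [float(b.findtext("price"))              # ...//book[publisher = $p]/price
              for b in bib.iter("book")
              if b.findtext("publisher") == p]
    a = mean(prices)                                  # LET $a := avg(...), no iteration
    averages.append((p, a))                           # RETURN
```

Here P1 averages its two prices and P2 keeps its single price.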

19 Examples from XQuery (cont)
List the publishers who have published more than 100 books.
    {
    FOR $p IN distinct(document("bib.xml")//publisher)
    LET $b := document("bib.xml")//book[publisher = $p]
    WHERE count($b) > 100
    RETURN $p
    }
What about efficiency?
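On the efficiency question: read literally, the query scans all books once per distinct publisher. One possible alternative, sketched here in Python and not something the slides prescribe, is a single grouping pass. `big_publishers` and its `threshold` parameter are hypothetical, and the tiny document is a stand-in for bib.xml.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB = """
<bib>
  <book><publisher>P1</publisher></book>
  <book><publisher>P2</publisher></book>
  <book><publisher>P1</publisher></book>
</bib>
"""

def big_publishers(bib, threshold=100):
    """Publishers with more than `threshold` books, counted in one pass."""
    # Assumes one publisher per book; a book without one is counted under None.
    counts = Counter(b.findtext("publisher") for b in bib.iter("book"))
    return [p for p, n in counts.items() if n > threshold]

result = big_publishers(ET.fromstring(BIB), threshold=1)
```

The threshold is lowered to 1 in the usage line only so that the small sample document produces a result.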

20 Examples from XQuery (cont)
Invert the structure of the input document so that each distinct author element contains a sequence of book-titles.
    {
    FOR $a IN distinct(document("bib.xml")//author)
    RETURN {$a/text()}
      {
      FOR $b IN document("bib.xml")//book[author = $a]
      RETURN $b/title
      }
    }
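The inversion can be sketched in Python as a nested loop that builds a new tree. Everything concrete here is an assumption: the input is a hypothetical stand-in for bib.xml, and the output element names (authlist, author, name, title) are invented for the illustration, since the slide's element constructors did not survive.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB = """
<bib>
  <book><title>T1</title><author>Ann</author><author>Bob</author></book>
  <book><title>T2</title><author>Ann</author></book>
</bib>
"""
bib = ET.fromstring(BIB)

result = ET.Element("authlist")                       # assumed wrapper element
for name in dict.fromkeys(a.text for a in bib.iter("author")):   # distinct authors
    entry = ET.SubElement(result, "author")
    ET.SubElement(entry, "name").text = name
    for b in bib.iter("book"):
        # book[author = $a]: true if SOME author child of the book matches
        if any(a.text == name for a in b.findall("author")):
            ET.SubElement(entry, "title").text = b.findtext("title")

inverted = [(e.findtext("name"), [t.text for t in e.findall("title")])
            for e in result]
```

`ET.tostring(result)` would serialize the inverted document; `inverted` shows the same grouping as plain Python data.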

21 More Examples (Quilt)
(from http://db.cis.upenn.edu/Kweelt/useCases/R/Q1.qlt )
Relational data -- two DTDs:
    <!DOCTYPE items [
      <!ELEMENT item_tuple (itemno, description, offered_by, start_date?, end_date?, reserve_price? )>
    ]>
    <!DOCTYPE bids [ ]>

22 The data
Items (itemno, description, offered_by, start_date, end_date, reserve_price):
    1001  Red Bicycle  U01  1999-01-05  1999-01-20  40
    1002  Motorcycle   U02  1999-02-11  1999-03-15  500
    …
Bids:
    U02  1001  35  99-01-07
    U04  1001  40  99-01-08
    …

23 Query 1
    FUNCTION date() { "1999-02-01" }
    (
    FOR $i IN document("items.xml")//item_tuple
    WHERE $i/start_date LEQ date()
      AND $i/end_date GEQ date()
      AND contains($i/description, "Bicycle")
    RETURN $i/itemno, $i/description
    SORTBY (itemno)
    )
–simple function definitions
–dates are formatted so that lexicographic ordering gives the right result
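The remark about lexicographic ordering can be checked with a short Python sketch: for zero-padded YYYY-MM-DD strings, string comparison agrees with date comparison. The function names and sample dates are illustrative assumptions.

```python
from datetime import date

def active_on(start, end, day):
    """True if start <= day <= end, comparing the YYYY-MM-DD strings directly."""
    return start <= day <= end

def active_on_dates(start, end, day):
    """The same test on parsed dates, for comparison."""
    s, e, d = (date.fromisoformat(x) for x in (start, end, day))
    return s <= d <= e

today = "1999-02-01"                      # the value returned by date() in the query
by_string = active_on("1999-01-05", "1999-02-20", today)
by_date = active_on_dates("1999-01-05", "1999-02-20", today)

# Without zero padding the string order breaks: February sorts after October.
unpadded_ok = "1999-2-11" < "1999-10-01"
```

The last line shows why the fixed-width format matters: `unpadded_ok` is false even though 11 February precedes 1 October.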

24 Output from Q1
    1003  Old Bicycle
    1007  Racing Bicycle

25 Query Q2
For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.
    (
    FOR $i IN document("items.xml")//item_tuple
    LET $b := document("bids.xml")//bid_tuple[itemno = $i/itemno]
    WHERE contains($i/description, "Bicycle")
    RETURN $i/itemno, $i/description,
      IF ($b) THEN NumFormat("#####.##", max(-1, $b/bid)) ELSE ""
    SORTBY (itemno)
    )
–lots of coercion
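Q2 is in effect a left outer join followed by an aggregate: every bicycle is kept, and the highest bid is attached only where bids exist. A Python sketch over hypothetical stand-ins for items.xml and bids.xml (the data values are invented; None plays the role of the empty string in the ELSE branch):

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for items.xml and bids.xml.
ITEMS = """
<items>
  <item_tuple><itemno>1008</itemno><description>Broken Bicycle</description></item_tuple>
  <item_tuple><itemno>1002</itemno><description>Motorcycle</description></item_tuple>
  <item_tuple><itemno>1001</itemno><description>Red Bicycle</description></item_tuple>
</items>
"""
BIDS = """
<bids>
  <bid_tuple><itemno>1001</itemno><bid>35</bid></bid_tuple>
  <bid_tuple><itemno>1001</itemno><bid>55</bid></bid_tuple>
  <bid_tuple><itemno>1002</itemno><bid>400</bid></bid_tuple>
</bids>
"""
items = ET.fromstring(ITEMS)
bids = ET.fromstring(BIDS)

rows = []
for i in items.iter("item_tuple"):                       # FOR $i IN ...//item_tuple
    if "Bicycle" not in i.findtext("description"):       # WHERE contains(...)
        continue
    itemno = i.findtext("itemno")
    amounts = [float(b.findtext("bid"))                  # LET $b := ...[itemno = $i/itemno]
               for b in bids.iter("bid_tuple")
               if b.findtext("itemno") == itemno]
    highest = max(amounts) if amounts else None          # IF ($b) THEN max(...) ELSE ""
    rows.append((itemno, i.findtext("description"), highest))

rows.sort(key=lambda row: row[0])                        # SORTBY (itemno), as strings
```

The explicit `float(...)` conversion is the coercion that the query leaves implicit.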

26 Output from Q2
    1001  Red Bicycle     55
    1003  Old Bicycle     20
    1007  Racing Bicycle  225
    1008  Broken Bicycle

27 Query Q3
Find cases where a user with a rating worse than "C" (that is, alphabetically greater) offers an item with a reserve price of more than 1000.
    (
    FOR $u IN document("users.xml")//user_tuple,
        $i IN document("items.xml")//item_tuple
    WHERE $u/rating GT 'C'
      AND $i/reserve_price GT 1000
      AND $i/offered_by = $u/userid
    RETURN $u/name/text(), $u/rating/text(), $i/description/text(), $i/reserve_price
    )
–Comparing sets with singletons: same rules as in XPath?
–In this case the DTD gives uniqueness

