CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 357 Database Systems I Query Languages for XML.


1 CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 357 Database Systems I Query Languages for XML

2 Query Languages for XML
XPath is a simple query language based on describing similar paths in XML documents. XQuery extends XPath in a style similar to SQL, introducing iteration, subqueries, etc. XPath and XQuery expressions are applied to an XML document and return a sequence of qualifying items. Items can be primitive values or nodes (elements, attributes, documents). The items returned do not need to be of the same type.

3 XPath
A path expression returns the sequence of all qualifying items that are reachable from the input item following the specified path. A path expression is a sequence of tags or attributes separated by special characters such as slashes ("/"). Absolute path expressions are applied to an XML document and return all elements that are reachable from the document's root element following the specified path. Relative path expressions are applied to an arbitrary node.

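4 XPath
    <bibliography>
      <book bookID="b100">
        <title>Foundations…</title>
        <author>Abiteboul</author>
        <author>Hull</author>
        <author>Vianu</author>
        <publisher>Addison Wesley</publisher>
        <year>1995</year>
      </book>
      …
    </bibliography>
Applied to the above document, the XPath expression /bibliography/book/author returns the sequence
    <author>Abiteboul</author> <author>Hull</author> <author>Vianu</author> ...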
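
The path expression above can be tried out with Python's standard library. This is only a sketch: `xml.etree.ElementTree` supports a limited subset of XPath, its paths are written relative to the root element (so `/bibliography/book/author` becomes `./book/author`), and the sample document is a hypothetical stand-in for the one on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names follow the slide's example.
BIB_XML = """
<bibliography>
  <book bookID="b100">
    <title>Foundations</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)  # root is the <bibliography> element

# Equivalent of /bibliography/book/author, relative to the root element.
authors = [a.text for a in root.findall("./book/author")]
print(authors)
```

The result is a list in document order, which mirrors the "sequence of qualifying items" that a path expression returns.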

5 Attributes
If we do not want to return the qualifying elements but the value of one of their attributes, we end the path expression with @attribute. Applied to the above document, the XPath expression /bibliography/book/@bookID returns the sequence "b100" ...

6 Axes
XPath provides a variety of axes, i.e. modes of navigation through semistructured data. At each step of a path expression, we can prefix a tag or attribute name by an axis name and a double colon. For example, the path expression /child::bibliography/child::book/attribute::bookID is equivalent to /bibliography/book/@bookID. Descendants are all direct and indirect children of a node.

7 Axes
Axes include parent, ancestor, descendant, following-sibling, preceding-sibling, self, and descendant-or-self. XPath has the following shorthands for axes:
    /    child
    //   descendant-or-self
    @    attribute
    .    self
    ..   parent

8 Axes
Foundations… Abiteboul Hull ... Codd A relational database model ...
Applied to the above document, the path expression /bibliography//author returns the sequence Abiteboul Hull Codd.

9 Wildcards
We can use wildcards instead of actual tags and attributes: * means any tag, and @* means any attribute.
Examples
    /bibliography/*/author returns the sequence Abiteboul Hull.
    /bibliography//author/@* returns the sequence "IBM" "a739".
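
The difference between `//` and `*` can be illustrated with Python's `xml.etree.ElementTree`, which supports both in its XPath subset. The document below is an assumption: the slide's markup is not available, so the nesting under a hypothetical `<papers>` element is chosen only to place Codd's `<author>` deeper than the book authors.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the <papers>/<paper> nesting is an assumption.
BIB_XML = """
<bibliography>
  <book bookID="b100">
    <title>Foundations</title>
    <author>Abiteboul</author>
    <author>Hull</author>
  </book>
  <papers>
    <paper>
      <author>Codd</author>
      <title>A relational database model</title>
    </paper>
  </papers>
</bibliography>
"""

root = ET.fromstring(BIB_XML)

# /bibliography//author: author elements at any depth below the root.
all_authors = [a.text for a in root.findall(".//author")]

# /bibliography/*/author: author elements exactly two levels below the root.
grandchild_authors = [a.text for a in root.findall("./*/author")]

print(all_authors)
print(grandchild_authors)
```

In this sketch the wildcard step matches exactly one level, so Codd (three levels down) is found only by the descendant search.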

10 Conditions
We can restrict the qualifying paths to those that satisfy a given condition, surrounded by square brackets. Conditions can be anything returning a boolean value. In particular, conditions can be:
    [path = value]   there exists a subpath with the specified value
    [i]              the element is the i-th element of the specified type
Example
    /bibliography/book[title="Foundations…"]/author[2]
returns Hull. Note that the path inside the condition is relative to the book element, so it has no leading slash.
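
Both kinds of condition are available in the XPath subset of Python's `xml.etree.ElementTree`, so the example can be sketched as follows. The second book and the shortened title are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two books, so that the condition filters something.
BIB_XML = """
<bibliography>
  <book>
    <title>Foundations</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
  </book>
  <book>
    <title>Another Book</title>
    <author>Jones</author>
    <author>Smith</author>
  </book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)

# book[title='Foundations'] keeps books with a matching <title> child;
# author[2] keeps the second <author> child (positions start at 1).
second_authors = [
    a.text for a in root.findall("./book[title='Foundations']/author[2]")
]
print(second_authors)
```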

11 XQuery
XQuery extends XPath, i.e. every XPath expression is an XQuery expression. Beyond XPath expressions, XQuery introduces FLWOR expressions.
Format: for, let, where, order by, return
    for/let clauses -> where clause -> order-by/return clause -> sequence of items

12 XQuery
FLWOR expressions are similar to SQL select ... from ... where ... queries. A FLWOR expression has one or more for and let clauses. The where clause is optional. There is at most one order by clause, which is also optional. Finally, there is exactly one return clause. XQuery is case-sensitive. XQuery and XPath are W3C standards.

13 XQuery
XQuery is a functional language: any XQuery expression can be used in any place where an expression is expected. SQL also allows subqueries in many places, but not everywhere. For example, SQL does not allow an arbitrary subquery as an operand of a comparison in a WHERE clause. Because any expression can appear as an operand, every XQuery operator must be defined for operands that are sequences of items, not just for individual items.

14 XQuery Clauses
    for $x in expr
Defines node variable $x. The expression expr evaluates to a sequence of items. The variable $x is assigned to each item, in turn, and the body of the for clause is executed once for each assignment.
    let $x := expr
Defines collection variable $x. The expression expr evaluates to a sequence of items. The variable is bound to the entire sequence of items. Useful for common subexpressions and for aggregations.

15 XQuery Clauses
    where condition
The condition is a boolean expression. The clause is applied to some item. If and only if the condition evaluates to true, the following return clause is executed for that item.
    return expression
The result of a FLWOR expression is a sequence of items. Expression defines the result format for the current (qualifying) item. The sequence of items produced by expression is appended to the sequence of items produced so far.

16 Document Nodes
The context for a for or let clause is often provided by a document node. Typically, the document comes from a file. The doc function constructs a document node from a file with a given name.
Examples
    doc("bib.xml")
    doc("infolab.stanford.edu/~hector/movies.xml")

17 Interpretation as XQuery Expression
XQuery expressions can be used wherever an XML expression of any kind is permitted. Any text string is acceptable as content of a tag or value of an attribute. If a string contains an XQuery expression that should be evaluated, this substring must be surrounded by curly brackets {}.
Example
    for $b in doc("bib.xml")/bibliography/book
    return {$b/title}

18 XQuery Examples: for vs. let
Find all books.
    for $x in doc("bib.xml")/bibliography/book
    return {$x}
Returns: ...
    let $x := doc("bib.xml")/bibliography/book
    return {$x}
Returns: ...
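
The difference between the two clauses can be mimicked in plain Python. This is an analogy rather than XQuery: the strings stand in for book elements, and the `"result"` wrapper is a hypothetical stand-in for whatever element the return clause constructs.

```python
# Stand-ins for the <book> elements selected by the path expression.
books = ["book1", "book2", "book3"]

# for $x in books return <result>{$x}</result>
# $x is bound to each item in turn, so the return clause runs once per book.
for_results = [{"result": [x]} for x in books]

# let $x := books return <result>{$x}</result>
# $x is bound to the whole sequence, so the return clause runs exactly once.
let_results = [{"result": books}]

print(len(for_results), len(let_results))
```

Under these assumptions the for version yields three wrapped results and the let version yields a single result wrapping all three books.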

19 XQuery Examples
Find all titles of books published after 1995.
    for $x in doc("bib.xml")/bibliography/book
    where $x/year > 1995
    return $x/title
Result: abc def ghi
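
A rough Python counterpart of this for/where/return query, using `xml.etree.ElementTree` on a hypothetical `bib.xml` held in a string. The titles and years are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB_XML = """
<bibliography>
  <book><title>Foundations</title><year>1995</year></book>
  <book><title>abc</title><year>1998</year></book>
  <book><title>def</title><year>2003</year></book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)

titles = [
    book.findtext("title")                   # return $x/title
    for book in root.findall("./book")       # for $x in .../bibliography/book
    if int(book.findtext("year")) > 1995     # where $x/year > 1995
]
print(titles)
```

Unlike XQuery, the comparison here needs an explicit conversion of the year text to a number.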

20 Ordering the Query Result
The order by clause allows you to order the results of an XQuery expression.
    order by list of expressions
The sort order is based on the value of the first expression. Ties are broken based on the value of the second (if necessary third, etc.) expression. By default, the order is ascending. A descending sort order can be specified using descending.

21 Elimination of Duplicates
The built-in function distinct-values eliminates duplicates from a sequence of result items. In principle, it applies only to primitive (atomic) types. It can also be applied to elements, but then it will remove their tags and return their values as strings in quotes "".
Example
If return $b/title produces aaa bbb aaa, then applying distinct-values to that sequence produces "aaa" "bbb".

22 XQuery Examples
Find all books published by Morgan Kaufmann and list them in descending order of their prices. Uses order by with option descending.
    for $b in doc("bib.xml")/bibliography/book[publisher="Morgan Kaufmann"]
    order by $b/price descending
    return $b
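
The same filter-then-sort logic can be sketched in Python. The document contents are hypothetical, and the price is converted to a number explicitly so that the sort is numeric rather than alphabetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB_XML = """
<bibliography>
  <book><title>abc</title><publisher>Morgan Kaufmann</publisher><price>30</price></book>
  <book><title>def</title><publisher>Addison Wesley</publisher><price>50</price></book>
  <book><title>ghi</title><publisher>Morgan Kaufmann</publisher><price>45</price></book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)

# for $b in .../book[publisher="Morgan Kaufmann"]
selected = root.findall("./book[publisher='Morgan Kaufmann']")

# order by $b/price descending
ordered = sorted(selected, key=lambda b: float(b.findtext("price")), reverse=True)

titles_by_price = [b.findtext("title") for b in ordered]
print(titles_by_price)
```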

23 XQuery Examples
For each author of a book published by Morgan Kaufmann, list the author and the titles of all books she published. Uses a nested subquery and the function distinct-values.
    for $a in distinct-values(doc("bib.xml")/bibliography/book[publisher="Morgan Kaufmann"]/author)
    return {$a} {for $t in doc("bib.xml")/bibliography/book[author=$a]/title return $t}

24 XQuery Examples
Result: Jones abc def Smith ghi
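
The nested query can be approximated in Python as an outer loop over distinct author names and an inner lookup over all books. The data is hypothetical, chosen so that the output matches the result shown above; `dict.fromkeys` is used here as one way to drop duplicates while keeping document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB_XML = """
<bibliography>
  <book><title>abc</title><author>Jones</author><publisher>Morgan Kaufmann</publisher></book>
  <book><title>def</title><author>Jones</author><publisher>Addison Wesley</publisher></book>
  <book><title>ghi</title><author>Smith</author><publisher>Morgan Kaufmann</publisher></book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)

# distinct-values(.../book[publisher="Morgan Kaufmann"]/author)
mk_authors = [
    a.text for a in root.findall("./book[publisher='Morgan Kaufmann']/author")
]
distinct_authors = list(dict.fromkeys(mk_authors))

# Inner query: titles of all books (any publisher) written by that author.
titles_by_author = {
    name: [t.text for t in root.findall(f"./book[author='{name}']/title")]
    for name in distinct_authors
}
print(titles_by_author)
```

Note that the inner lookup is not restricted to Morgan Kaufmann, which is why Jones is listed with both "abc" and "def".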

25 Joins
We can join two or more documents by using one variable for each of the documents. We let a variable range over the elements of the corresponding document, within a for clause. We need to be careful when comparing elements for equality, since their equality is by element identity, not by element content. Typically, we want to compare the element content. The built-in function data(E) returns the content of an element E.

26 Example
Find all pairs of titles of books from the same year. Uses two variables ranging over books and the data function applied to their year elements.
    let $books := doc("bib.xml")
    for $b1 in $books/bibliography/book,
        $b2 in $books/bibliography/book
    where data($b1/year) = data($b2/year)
    return {$b1/title} {$b2/title}
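
A self-join like this corresponds to iterating over all ordered pairs of books. The sketch below uses `itertools.product` on hypothetical data and compares the text content of the year elements, which is the role `data()` plays in the query.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Hypothetical contents of bib.xml.
BIB_XML = """
<bibliography>
  <book><title>abc</title><year>1995</year></book>
  <book><title>def</title><year>1995</year></book>
  <book><title>ghi</title><year>2001</year></book>
</bibliography>
"""

books = ET.fromstring(BIB_XML).findall("./book")

# for $b1 in books, $b2 in books where data($b1/year) = data($b2/year)
same_year_pairs = [
    (b1.findtext("title"), b2.findtext("title"))
    for b1, b2 in product(books, repeat=2)
    if b1.findtext("year") == b2.findtext("year")
]
print(same_year_pairs)
```

As written, both the query and this sketch also pair each book with itself and report every pair in both orders; an extra condition would be needed to suppress those.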

27 Comparison Operators
XQuery supports the standard comparison operators such as <, >, =. Comparison operators are applied to a sequence of items. Comparisons have an existential nature, i.e. they return true if and only if at least one of the items satisfies the condition of the comparison.
    for $b in doc("bib.xml")/bibliography/book
    where $b/author/firstname = "A" and $b/author/lastname = "B"
    return $b
Books returned can have one author with firstname A and another author with lastname B.
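
The existential reading of the two comparisons can be spelled out with Python's `any`. The book below is hypothetical: no single author is named "A B", yet the condition from the query still holds.

```python
# Hypothetical book with two authors: "A X" and "Y B".
authors = [
    {"firstname": "A", "lastname": "X"},
    {"firstname": "Y", "lastname": "B"},
]

# $b/author/firstname = "A" and $b/author/lastname = "B"
# Each comparison is satisfied if at least one author matches it.
existential_match = (
    any(a["firstname"] == "A" for a in authors)
    and any(a["lastname"] == "B" for a in authors)
)

# Stricter condition: one and the same author has both names.
same_author_match = any(
    a["firstname"] == "A" and a["lastname"] == "B" for a in authors
)

print(existential_match, same_author_match)
```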

28 Comparison Operators
XQuery also supports special comparison operators that only compare sequences consisting of a single item: eq, ne, lt, le, gt, ge. These comparisons fail if one of the operands contains more than one item. XQuery also provides built-in functions for string matching, in particular contains($p, "windsurfing"), which tests whether $p contains the substring "windsurfing".

29 Quantification
XQuery supports the existential and the universal quantifier.
Universal quantifier
    every $v in expression1 satisfies expression2
Existential quantifier
    some $v in expression1 satisfies expression2
Expression1 evaluates to a sequence of items, expression2 is a boolean expression.

30 Aggregation
XQuery provides built-in functions for the standard aggregations such as sum, min, count and avg (SUM, MIN, COUNT and AVG in SQL). They can be applied to any XQuery expression, i.e. to any sequence of items.
Example
    avg(doc("bib.xml")/bibliography/book/price)
    count(doc("bib.xml")/bibliography/book/price)
Computes the average book price and the number of book prices, respectively.

31 XQuery Examples
Find books whose price is larger than the average price. Uses aggregate operator (avg), applied to the result of a path expression.
    let $a := avg(doc("bib.xml")/bibliography/book/price)
    for $b in doc("bib.xml")/bibliography/book
    where $b/price > $a
    return $b
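
In Python terms, the let clause computes the average once and the for/where clauses then filter against it. The prices below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB_XML = """
<bibliography>
  <book><title>abc</title><price>30</price></book>
  <book><title>def</title><price>50</price></book>
  <book><title>ghi</title><price>40</price></book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)

# let $a := avg(.../book/price)
prices = [float(p.text) for p in root.findall("./book/price")]
average_price = sum(prices) / len(prices)

# for $b in .../book where $b/price > $a return $b (titles shown for brevity)
above_average = [
    b.findtext("title")
    for b in root.findall("./book")
    if float(b.findtext("price")) > average_price
]
print(average_price, above_average)
```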

32 XQuery Examples
Find the titles of books with a paragraph containing the terms "sailing" and "windsurfing". Uses existential quantifier (some) and string matching (contains).
    for $b in doc("bib.xml")//book
    where some $p in $b//para satisfies
          contains($p, "sailing") and contains($p, "windsurfing")
    return $b/title

33 XQuery Examples
Find the titles of books where every paragraph contains the term "sailing". Uses universal quantifier (every) and string matching (contains).
    for $b in doc("bib.xml")//book
    where every $p in $b//para satisfies contains($p, "sailing")
    return $b/title
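
The two quantifiers behave like Python's `any` and `all`. The sketch below uses a hypothetical mapping from book titles to paragraph texts instead of parsing XML, and substring tests in place of `contains`.

```python
# Hypothetical books: title -> list of paragraph texts.
books = {
    "abc": ["sailing and windsurfing", "more about sailing"],
    "def": ["windsurfing only"],
    "ghi": [],  # a book without any paragraphs
}

# some $p in $b//para satisfies contains($p, "sailing") and contains($p, "windsurfing")
some_titles = [
    title for title, paras in books.items()
    if any("sailing" in p and "windsurfing" in p for p in paras)
]

# every $p in $b//para satisfies contains($p, "sailing")
every_titles = [
    title for title, paras in books.items()
    if all("sailing" in p for p in paras)
]

print(some_titles, every_titles)
```

A book with no paragraphs satisfies the universal condition vacuously, in Python's `all` as in XQuery's `every`, so "ghi" appears in the second result.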

34 Summary
XQuery is the standard XML query language. It is a functional language, i.e. any XQuery expression can be used in any place where an expression is expected. An XQuery expression consists of for, let, where, order by and return clauses, of which some are optional. The main new concept compared to SQL is the path expression, which returns the sequence of elements reachable via the given path. Path expressions are defined in XPath, a sublanguage of XQuery. In addition, XQuery has equivalent constructs for most of the main SQL constructs, in particular quantifiers and aggregate functions.

