Part 3: Query Languages (Managing XML and Semistructured Data)

Presentation transcript:

1 Part 3: Query Languages Managing XML and Semistructured Data

2 In this section…
- Lorel (a Lightweight Object REpository Language, developed at Stanford)
- XPath: specification, data model, examples (xpath, axis), syntax
- XQuery: FLWR expressions; FOR and LET expressions; collections and sorting
- (XML-QL, the earlier version from AT&T Labs)
Resources:
- The Lorel Query Language for Semistructured Data, by Abiteboul, Quass, McHugh, Widom, Wiener, in International Journal on Digital Libraries
- A formal semantics of patterns in XSLT, by Phil Wadler
- XML Path Language (XPath)
- XQuery: A Query Language for XML, by Chamberlin, Florescu, et al.
- W3C recommendation:

3 Querying XML Data
- A core query language (extracting + restructuring)
- XPath (core expressions): simple navigation through the tree
- XQuery: the SQL of XML
- XSLT (Extensible Stylesheet Language Transformations): recursive traversal based on pattern matching; will not be discussed here

4 Sample Data for Queries
    <biblio>
      …
      <book>
        <author>Smith</author>
        <date>1999</date>
        <title>Database Systems</title>
      </book>
      <book>
        <author>Roux</author>
        <author>Combalusier</author>
        <date>1976</date>
        <title>Database Systems</title>
      </book>
      …
    </biblio>

5 A Core Query Language
A SQL-like language for querying semistructured data.
Will illustrate with: XML DB =
[Figure: the database drawn as an edge-labeled graph. Nodes carry object ids (&o1, &o12, &o24, &o29, ...). The root has a biblio edge; below it are paper and book nodes, each with author, date and title edges leading to leaf values (Smith, 1999, Database Systems; Roux, Combalusier, 1976, Database Systems).]

6 Query 1:
    SELECT author: X
    FROM biblio.book.author X
Answer = {author: “Smith”, author: “Roux”, author: “Combalusier”}
[Figure: the database graph from slide 5, with an answer node whose author edges point to the selected values.]
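A minimal sketch of how a path expression such as biblio.book.author can be evaluated over an edge-labeled tree. The encoding (a node is a list of (label, child) pairs) and the helper names follow and eval_path are assumptions made for illustration; they are not part of Lorel or of the slides.

```python
# Edge-labeled tree: a complex node is a list of (label, child) pairs,
# a leaf is an atomic value. This encoding is an assumption for the sketch.
biblio_db = [
    ("biblio", [
        ("book", [("author", "Smith"), ("date", 1999),
                  ("title", "Database Systems")]),
        ("book", [("author", "Roux"), ("author", "Combalusier"),
                  ("date", 1976), ("title", "Database Systems")]),
    ]),
]


def follow(nodes, label):
    """Return every child reached from a node in `nodes` via an edge named `label`."""
    out = []
    for node in nodes:
        if isinstance(node, list):  # only complex nodes have outgoing edges
            out.extend(child for (lbl, child) in node if lbl == label)
    return out


def eval_path(root, path):
    """Evaluate a dotted path such as 'biblio.book.author' starting at the root."""
    nodes = [root]
    for label in path.split("."):
        nodes = follow(nodes, label)
    return nodes


# SELECT author: X FROM biblio.book.author X
answer = [("author", x) for x in eval_path(biblio_db, "biblio.book.author")]
print(answer)
```

Each step of the path maps a set of nodes to the set of their children with the given label, which is why a book with two authors contributes two answers.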

7 Query 2:
    SELECT row: X
    FROM biblio._ X
    WHERE “Smith” in X.author
Answer = {row: {author: “Smith”, date: 1999, title: “Database…”}, row: … }
[Figure: the database graph from slide 5, with an answer node whose row edges point to the matching entries.]

8 Query 3:
    SELECT row: ( SELECT author: Y
                  FROM X.author Y)
    FROM biblio.book X
Answer = {row: {author: “Smith”}, row: {author: “Roux”, author: “Combalusier”}}
[Figure: the database graph from slide 5, with an answer node whose row edges lead to new nodes &a1 and &a2, each with author edges.]

9 Query 4:
    SELECT ( SELECT row: {author: Y, title: T}
             FROM X.author Y, X.title T)
    FROM biblio.book X
    WHERE “Roux” in X.author
Answer = {row: {author: “Roux”, title: “Database…”}, row: {author: “Combalusier”, title: “Database…”}}
[Figure: the database graph from slide 5, with an answer node whose row edges lead to new nodes &a1 and &a2, each with an author edge and a title edge.]

10 Lorel
- Minor syntactic differences in regular path expressions (% instead of _, # instead of _*)
- Common path convention:
      SELECT biblio.book.author
      FROM biblio.book
      WHERE biblio.book.year = 1999
  becomes:
      SELECT X.author
      FROM biblio.book X
      WHERE X.year = 1999

11 Lorel
- Existential variables: what happens with books having multiple authors?
      SELECT biblio.book.year
      FROM biblio.book
      WHERE biblio.book.author = “Roux”
  Author is existentially quantified:
      SELECT X.year
      FROM biblio.book X, X.author Y
      WHERE Y = “Roux”
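The existential reading can be sketched in a few lines: a book qualifies as soon as some one of its authors matches. The dictionary layout and the function name years_by_author are hypothetical, chosen only to mirror the query above.

```python
# Each book keeps a list of authors, since a book may have several.
# The field names mirror the query (author, year); the layout is an assumption.
books = [
    {"author": ["Smith"], "year": 1999},
    {"author": ["Roux", "Combalusier"], "year": 1976},
]


def years_by_author(books, name):
    """SELECT X.year FROM biblio.book X, X.author Y WHERE Y = name."""
    # any(...) is the existential quantifier over the authors Y of book X.
    return [b["year"] for b in books if any(a == name for a in b["author"])]


print(years_by_author(books, "Roux"))
```

With this reading, a book with authors Roux and Combalusier is returned for either name, even though neither is its only author.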

12 Lorel
- Path in: FROM X
  What happens on graphs with cycles?
- Constructing new results: several default rules
- Casting between datatypes: very useful in practice

13 XPath
- (11/99)
- Building block for other W3C standards: XSL Transformations (XSLT), XML Link (XLink), XML Pointer (XPointer), XML Query
- Was originally part of XSL

14 XPath: Summary
    bib               matches a bib element
    *                 matches any element
    /                 matches the root element
    /bib              matches a bib element under root
    bib/paper         matches a paper in bib
    bib//paper        matches a paper in bib, at any depth
    //paper           matches a paper at any depth
    paper|book        matches a paper or a book
    @price            matches a price attribute
    bib/book/@price   matches a price attribute in book, in bib
    …                 matches…

15 Example for XPath Queries
    <bib>
      <book>
        <publisher>Addison-Wesley</publisher>
        <author>Serge Abiteboul</author>
        <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
        <author>Victor Vianu</author>
        <title>Foundations of Databases</title>
        <year>1995</year>
      </book>
      <book>
        <publisher>Freeman</publisher>
        <author>Jeffrey D. Ullman</author>
        <title>Principles of Database and Knowledge Base Systems</title>
        <year>1998</year>
      </book>
    </bib>

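16 Data Model for XPath
[Figure: the document as a tree. "The root" sits above "the root element" bib; bib has book children; each book has publisher, author, ... children with text leaves such as Addison-Wesley and Serge Abiteboul.]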

17 XPath: Simple Expressions
    /bib/book/year     Result: the year elements of both books (1995 and 1998)
    /bib/paper/year    Result: empty (there were no papers)

18 XPath: Restricted Kleene Closure
    //author            Result: Serge Abiteboul; Rick Hull; Victor Vianu; Jeffrey D. Ullman
    /bib//first-name    Result: Rick
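The simple and descendant path expressions above can be tried with Python's standard xml.etree.ElementTree, which supports a limited XPath subset. This is only a sketch: the element layout of the sample document is an assumed reconstruction of bib.xml, and because ElementTree paths are relative to an element, ./book stands in for /bib/book and .//author for //author.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the bib.xml sample used in the slides.
BIB_XML = """
<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>
"""

bib = ET.fromstring(BIB_XML.strip())  # `bib` is the root element

# /bib/book/year: child steps only
years = [y.text for y in bib.findall("./book/year")]

# /bib/paper/year: no paper elements, so the result is empty
paper_years = bib.findall("./paper/year")

# /bib//first-name: descendants at any depth
first_names = [f.text for f in bib.findall(".//first-name")]

# //author: every author element in the document
author_count = len(bib.findall(".//author"))

print(years, paper_years, first_names, author_count)
```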

19 XPath: Text Nodes
    /bib/book/author/text()
Result: Serge Abiteboul; Victor Vianu; Jeffrey D. Ullman
Rick Hull doesn’t appear because his author element has first-name and last-name children instead of a text value.
Functions in XPath:
    text() = matches the text value
    node() = matches any node (= * or text())
    name() = returns the name of the current tag

20 XPath: Wildcard
    //author/*
Result: Rick Hull
* matches any element

21 XPath: Attribute Nodes
    /bib/book/@price
Result:
@price means that price has to be an attribute

22 XPath: Predicates
    /bib/book/author[firstname]
Result: Rick Hull
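Text nodes, the * wildcard, and an existence predicate can be sketched with ElementTree as well. The sample fragment is an assumption (one book with three authors, using first-name and last-name as element names), and text() is emulated by checking each element's .text, since ElementTree has no text() step.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one author is structured, the other two are plain text.
book = ET.fromstring(
    "<book>"
    "<author>Serge Abiteboul</author>"
    "<author><first-name>Rick</first-name><last-name>Hull</last-name></author>"
    "<author>Victor Vianu</author>"
    "</book>"
)

# author/text(): only authors that carry a direct text value
text_authors = [a.text for a in book.findall("./author") if (a.text or "").strip()]

# author/*: any element child of an author
child_tags = [c.tag for c in book.findall("./author/*")]

# author[first-name]: authors that have a first-name child
structured = book.findall("./author[first-name]")
full_names = [" ".join(child.text for child in a) for a in structured]

print(text_authors, child_tags, full_names)
```

The structured author shows up under the wildcard and the predicate but not under the text-node query, which matches the explanation given for Rick Hull.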

23 XPath: More Predicates
    /bib/book/author[firstname][address[//zip][city]]/lastname
Result: … …

24 XPath: More Predicates
    < “60”]
    < “25”]
    /bib/book[author/text()]

25 XQuery
- Based on Quilt (which is based on XML-QL)
- TR/xquery/
- XML Query data model: ordered!
FLWOR (“flower”) expressions:
    FOR ... LET ... WHERE ... ORDER BY ... RETURN ...

26 XQuery
Query: Find all book titles published after 1995:
    FOR $x IN document("bib.xml")/bib/book
    WHERE $x/year > 1995
    RETURN $x/title
Result: Principles of Database…
* bib.xml is shown on slide 15
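One way to read the FOR / WHERE / RETURN query is as a list comprehension: iterate, filter, project. The sketch below assumes a cut-down bib document with only title and year, and the function name titles_after is hypothetical.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for bib.xml; only the elements the query touches.
bib = ET.fromstring(
    "<bib>"
    "<book><title>Foundations of Databases</title><year>1995</year></book>"
    "<book><title>Principles of Database and Knowledge Base Systems</title>"
    "<year>1998</year></book>"
    "</bib>"
)


def titles_after(bib, year):
    """Titles of books whose year is strictly greater than `year`."""
    return [
        book.findtext("title")                  # RETURN $x/title
        for book in bib.findall("./book")       # FOR $x IN .../bib/book
        if int(book.findtext("year")) > year    # WHERE $x/year > 1995
    ]


print(titles_after(bib, 1995))
```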

27 XQuery
Query: Find book titles by the coauthors of “Foundations of Databases”:
    FOR $x IN bib/book[title/text() = “Foundations …”]/author,
        $y IN bib/book[author/text() = $x/text()]/title
    RETURN $y/text()
Result: Foundations …
The answer will contain duplicates!

28 XQuery
Same as before, but eliminate duplicates:
    FOR $x IN bib/book[title/text() = “Foundations …”]/author,
        $y IN distinct(bib/book[author/text() = $x/text()]/title)
    RETURN $y/text()
Result: Foundations …
distinct = a function that eliminates duplicates

29 SQL and XQuery Side-by-side
Product(pid, name, maker)
Company(cid, name, city)
Query: Find all products made in Seattle
SQL:
    SELECT x.name
    FROM Product x, Company y
    WHERE x.maker = y.cid and y.city = “Seattle”
XQuery:
    FOR $x IN /db/Product/row,
        $y IN /db/Company/row
    WHERE $x/maker/text() = $y/cid/text() and $y/city/text() = “Seattle”
    RETURN $x/name
Cool XQuery:
    FOR $y IN /db/Company/row[city/text() = “Seattle”],
        $x IN /db/Product/row[maker/text() = $y/cid/text()]
    RETURN $x/name
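The two XQuery formulations differ only in where the conditions are applied: after pairing up all rows, or inside the path expressions. A small sketch with made-up rows (the product and company names are hypothetical) shows that both shapes return the same product names on this data.

```python
# Hypothetical rows for Product(pid, name, maker) and Company(cid, name, city).
products = [
    {"pid": "p1", "name": "Gizmo", "maker": "c1"},
    {"pid": "p2", "name": "Widget", "maker": "c2"},
    {"pid": "p3", "name": "Gadget", "maker": "c1"},
]
companies = [
    {"cid": "c1", "name": "Acme", "city": "Seattle"},
    {"cid": "c2", "name": "Globex", "city": "Portland"},
]

# First form: range over all (product, company) pairs, then apply WHERE.
join_then_filter = [
    x["name"]
    for x in products
    for y in companies
    if x["maker"] == y["cid"] and y["city"] == "Seattle"
]

# Second form: the predicates sit in the ranges themselves,
# so only Seattle companies and their own products are visited.
filter_then_join = [
    x["name"]
    for y in companies if y["city"] == "Seattle"
    for x in products if x["maker"] == y["cid"]
]

print(join_then_filter, filter_then_join)
```

In general the two forms can list results in a different order, because the outer loop differs; on this sample the order happens to agree.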

30 XQuery: Nesting
Query: For each author of a book by Morgan Kaufmann, list all books s/he published:
    FOR $a IN distinct(document("bib.xml")/bib/book[publisher = “Morgan Kaufmann”]/author)
    RETURN { $a,
             FOR $t IN /bib/book[author = $a]/title
             RETURN $t }
Result:
    Jones  abc  def
    Smith  ghi

31 XQuery
- FOR $x IN expr: binds $x to each value in the list expr
- LET $x := expr: binds $x to the entire list expr; useful for common subexpressions and for aggregations
    FOR $p IN distinct(document("bib.xml")//publisher)
    LET $b := document("bib.xml")//book[publisher = $p]
    WHERE count($b) > 100
    RETURN $p
count = an (aggregate) function that returns the number of elements

32 XQuery
Query: Find books whose price is larger than average:
    FOR $a IN /bib/book
    LET $b := avg(/bib/book/price/text())
    WHERE $a/price/text() > $b
    RETURN $a
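The average-price query computes one aggregate over the whole collection and then compares each book against it. A sketch with hypothetical titles and prices:

```python
from statistics import mean

# Hypothetical books; only title and price matter for this query.
books = [
    {"title": "A", "price": 30.0},
    {"title": "B", "price": 50.0},
    {"title": "C", "price": 70.0},
]

# LET $b := avg(/bib/book/price/text()): one value for the whole collection.
# It does not depend on $a, so it is computed once, outside the loop.
avg_price = mean([b["price"] for b in books])

# FOR $a IN /bib/book WHERE $a/price > $b RETURN $a
above_average = [b["title"] for b in books if b["price"] > avg_price]

print(avg_price, above_average)
```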

33 XQuery
Query: Find all publishers that published more than 100 books:
    { FOR $p IN distinct(//publisher/text())
      LET $b := document("bib.xml")//book[publisher/text() = $p]
      WHERE count($b) > 100
      RETURN $p }
$b is a collection of elements, not a single element
count = an (aggregate) function that returns the number of elements
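The FOR / LET / WHERE / RETURN shape of this query can be mirrored step by step. In the sketch below the sample data, the function name prolific_publishers, and the threshold parameter are assumptions; sorting the distinct publishers is only there to make the output deterministic.

```python
# Hypothetical sample; the slide's threshold of 100 is a parameter here.
books = [
    {"title": "t1", "publisher": "Addison-Wesley"},
    {"title": "t2", "publisher": "Freeman"},
    {"title": "t3", "publisher": "Addison-Wesley"},
]


def prolific_publishers(books, threshold):
    """Publishers with more than `threshold` books."""
    result = []
    publishers = sorted({b["publisher"] for b in books})    # distinct(//publisher)
    for p in publishers:                                     # FOR $p IN ...
        b = [bk for bk in books if bk["publisher"] == p]    # LET $b := books of $p
        if len(b) > threshold:                               # WHERE count($b) > threshold
            result.append(p)                                 # RETURN $p
    return result


print(prolific_publishers(books, 1))
```

Here b plays the role of $b: it is bound to the whole collection of a publisher's books, which is what makes count($b) meaningful.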

34 FOR vs. LET
FOR: binds node variables -> iteration
LET: binds collection variables -> one value
Examples:
    FOR $x IN document("bib.xml")/bib/book
    RETURN $x
  Returns: ...
    LET $x := document("bib.xml")/bib/book
    RETURN $x
  Returns: ...
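The difference between the two bindings shows up in how many times the RETURN clause is evaluated. In this sketch each evaluation of RETURN is modeled as one Python list; that wrapper and the book placeholders are assumptions made so the two shapes can be compared.

```python
# Placeholder values standing in for the book elements of bib.xml.
books = ["book1", "book2", "book3"]

# FOR $x IN books RETURN $x
# $x is bound to one book at a time, so RETURN runs once per book.
for_results = [[x] for x in books]

# LET $x := books RETURN $x
# $x is bound to the whole collection, so RETURN runs exactly once.
let_results = [books]

print(len(for_results), len(let_results))
```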

35 Sorting in XQuery
    FOR $p IN distinct(document("bib.xml")//publisher)
    RETURN { $p/text(),
             FOR $b IN document("bib.xml")//book[publisher = $p]
             RETURN {$b/title, SORTBY(price DESCENDING) } SORTBY(name)

36 Sorting in XQuery
- Sorting arguments refer to the name space of the RETURN clause, not the FOR clause
- To sort on an element you don’t want to display, first return it, then remove it with an additional query.
    FOR $p IN distinct(document("bib.xml")//publisher)
    RETURN { $p/text(),
             FOR $b IN document("bib.xml")//book[publisher = $p]
             RETURN { $b/title, $b/price }
             ORDER BY price DESCENDING }
    ORDER BY name
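The nested sort can be sketched as two sort steps: publishers by name on the outside, each publisher's books by descending price on the inside. The sample rows and the function name report are hypothetical.

```python
# Hypothetical books with the three fields the query uses.
books = [
    {"publisher": "Freeman", "title": "Principles", "price": 40},
    {"publisher": "Addison-Wesley", "title": "Foundations", "price": 55},
    {"publisher": "Addison-Wesley", "title": "Compilers", "price": 70},
]


def report(books):
    """One entry per publisher (sorted by name), books sorted by price descending."""
    out = []
    for p in sorted({b["publisher"] for b in books}):           # outer ORDER BY name
        theirs = [b for b in books if b["publisher"] == p]
        theirs.sort(key=lambda b: b["price"], reverse=True)     # inner ORDER BY price DESCENDING
        # The price is returned alongside the title, so the sort key stays visible.
        out.append((p, [(b["title"], b["price"]) for b in theirs]))
    return out


for publisher, titles in report(books):
    print(publisher, titles)
```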

37 Collections in XQuery
- Ordered and unordered collections
    /bib/book/author = an ordered collection
    Distinct(/bib/book/author) = an unordered collection
- LET $b := /bib/book  ->  $b is a collection
- $b/author  ->  a collection (several authors...)
    RETURN $b/author
  Returns: ...

38 If-Then-Else
    FOR $h IN //holding
    RETURN { $h/title,
             IF = "Journal"
             THEN $h/editor
             ELSE $h/author }
    ORDER BY title

39 Quantifiers
Existential quantifiers:
    FOR $b IN //book
    WHERE SOME $p IN $b//para SATISFIES contains($p, "sailing") AND contains($p, "windsurfing")
    RETURN $b/title
Universal quantifiers:
    FOR $b IN //book
    WHERE EVERY $p IN $b//para SATISFIES contains($p, "sailing")
    RETURN $b/title
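SOME and EVERY correspond closely to Python's any and all. The sketch below uses made-up books whose paragraphs are plain strings; substring tests stand in for contains().

```python
# Hypothetical books; `paras` stands in for the para descendants of a book.
books = [
    {"title": "Water Sports", "paras": ["sailing and windsurfing tips", "sailing knots"]},
    {"title": "Sailing Only", "paras": ["sailing basics", "sailing races"]},
    {"title": "Cooking", "paras": ["soups"]},
    {"title": "Empty", "paras": []},
]

# WHERE SOME $p IN $b//para SATISFIES contains(...) AND contains(...)
some_titles = [
    b["title"] for b in books
    if any("sailing" in p and "windsurfing" in p for p in b["paras"])
]

# WHERE EVERY $p IN $b//para SATISFIES contains($p, "sailing")
every_titles = [
    b["title"] for b in books
    if all("sailing" in p for p in b["paras"])
]

print(some_titles, every_titles)
```

Note that a book with no paragraphs satisfies the universal condition vacuously, while it can never satisfy the existential one.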

40 Other Stuff in XQuery
- BEFORE and AFTER: for dealing with order in the input
- FILTER: deletes some edges in the result tree
- Recursive functions: currently arbitrary recursion; perhaps more restrictions in the future?

41 Group-By in XQuery ??
- No GROUPBY currently in XQuery
- A recent proposal (next): what do YOU think?

42 Group-By in XQuery ??
With GROUPBY:
    FOR $b IN document(" $y IN
    WHERE $b/publisher="Morgan Kaufmann"
    RETURN GROUPBY $y WHERE count($b) > 10 IN $y
Equivalent SQL:
    SELECT year
    FROM Bib
    WHERE Bib.publisher="Morgan Kaufmann"
    GROUP BY year
    HAVING count(*) > 10
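The SQL version (filter, group by year, keep groups above a count) can be sketched with collections.Counter. The rows, the function name busy_years, and the min_count parameter are assumptions; the slide's threshold is 10.

```python
from collections import Counter

# Hypothetical Bib rows with only the columns the query uses.
bib = [
    {"publisher": "Morgan Kaufmann", "year": 1998},
    {"publisher": "Morgan Kaufmann", "year": 1998},
    {"publisher": "Morgan Kaufmann", "year": 1999},
    {"publisher": "Freeman", "year": 1998},
]


def busy_years(bib, publisher, min_count):
    """Years in which `publisher` has more than `min_count` entries."""
    # WHERE publisher = ... followed by GROUP BY year (Counter does the grouping).
    counts = Counter(b["year"] for b in bib if b["publisher"] == publisher)
    # HAVING count(*) > min_count
    return sorted(year for year, n in counts.items() if n > min_count)


print(busy_years(bib, "Morgan Kaufmann", 1))
```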

43 Group-By in XQuery ??
With GROUPBY:
    FOR $b IN document(" $a IN $b/author, $y IN
    RETURN GROUPBY $a, $y IN $a, $y, count($b)
Without GROUPBY:
    FOR $Tup IN distinct(FOR $b IN document(" $a IN $b/author, $y IN
                         RETURN $a $y ),
        $a IN $Tup/a/node(),
        $y IN $Tup/y/node()
    LET $b =
    RETURN $a, $y, count($b)

44 Group-By in XQuery ??
Nested GROUPBYs:
    FOR $b IN document(" $a IN $b/author, $y IN $t IN $b/title, $p IN $b/publisher
    RETURN GROUPBY $p, $y IN
             $p, $y,
             GROUPBY $a IN
               $a,
               GROUPBY $t IN $t

45 XQuery Summary
FOR-LET-WHERE-RETURN = FLWR
    FOR/LET clauses  ->  list of tuples of bound variables
    WHERE clause     ->  list of pruned tuples of bound variables
    RETURN clause    ->  instance of the XQuery data model