Querying XML and Semistructured Data

Querying XML and Semistructured Data CSE 350 Fall 2003

Query 1

    SELECT row: X
    FROM biblio.X
    WHERE "Smith" in X.author

Answer = {row: {author: "Smith", date: 1999, title: "Database…"}, …}

[Figure: the biblio data graph. Book and paper objects (&o12, &o24, &o29) carry author, title, and date edges leading to values such as "Roux", "Combalusier", "Smith", 1976, 1999, and "Database Systems".]

Query 2

    SELECT author: X
    FROM biblio.book.author X

Answer = {author: "Smith", author: "Roux", author: "Combalusier"}

[Figure: the biblio data graph, with book and paper objects &o12, &o24, &o29 under the root &o1.]

Query 3

    SELECT row: ( SELECT author: Y
                  FROM X.author Y )
    FROM biblio.book X

Answer = {row: {author: "Smith"}, row: {author: "Roux", author: "Combalusier"}, …}

[Figure: the biblio data graph, with book and paper objects &o12, &o24, &o29 under the root &o1.]

Query 4

    SELECT ( SELECT row: {author: Y, title: T}
             FROM X.author Y, X.title T )
    FROM biblio.book X
    WHERE "Roux" in X.author

Answer = {row: {author: "Roux", title: "Database…"}, row: {author: "Combalusier", title: "Database…"}, …}

[Figure: the biblio data graph, with book and paper objects &o12, &o24, &o29 under the root &o1.]

Semantics

Given query Q and database DB:

    SELECT E[X1, …, Xn]
    FROM F
    WHERE C

Answer(Q, DB) is defined in two steps.

Step 1: compute all bindings of the variables:

    X1   X2   …   Xn
    Ci1  Ci2  …   Cin

Each Cij is a node oid or an atomic value, and each row of bindings must satisfy the paths in F and the conditions in C.

Step 2: the answer is E[C11, …, C1n] U … U E[Cm1, …, Cmn].

For nested subqueries, apply the semantics recursively.
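The two-step semantics can be sketched in Python. This is a minimal illustration, not the implementation of any real system: the graph encoding, the "&root" node, the "ROOT" marker, the helper names, and the data (including the paper's title) are all assumptions made for the example.

```python
# Toy edge-labeled graph (hypothetical data): oid -> list of (label, target).
# A target is either another oid or an atomic value.
DB = {
    "&root": [("biblio", "&o1")],
    "&o1": [("book", "&o12"), ("paper", "&o24"), ("book", "&o29")],
    "&o12": [("author", "Roux"), ("author", "Combalusier"),
             ("title", "Database Systems"), ("date", 1976)],
    "&o24": [("author", "Smith"), ("title", "A Paper")],
    "&o29": [("author", "Smith"), ("title", "Database Systems"), ("date", 1999)],
}


def eval_path(db, start, path):
    """All targets reachable from `start` along a dotted label path."""
    frontier = [start]
    for label in path.split("."):
        frontier = [t for n in frontier for (l, t) in db.get(n, []) if l == label]
    return frontier


def bindings(db, root, from_clauses, where):
    """Step 1: every variable binding that satisfies the paths in F and C.

    from_clauses is a list of (source, path, var); source is the marker
    "ROOT" or the name of a variable bound by an earlier clause.
    """
    envs = [{}]
    for source, path, var in from_clauses:
        extended = []
        for env in envs:
            start = root if source == "ROOT" else env[source]
            for target in eval_path(db, start, path):
                extended.append({**env, var: target})
        envs = extended
    return [env for env in envs if where(env)]


def answer(db, root, from_clauses, where, select):
    """Step 2: union of E[...] over all bindings, kept as a list of pairs."""
    result = []
    for env in bindings(db, root, from_clauses, where):
        result.extend(select(env))
    return result


# Query 2: SELECT author: X FROM biblio.book.author X
q2 = answer(DB, "&root", [("ROOT", "biblio.book.author", "X")],
            lambda env: True,
            lambda env: [("author", env["X"])])

# Query 4, with the nested SELECT flattened into one FROM list.
q4 = answer(DB, "&root",
            [("ROOT", "biblio.book", "X"), ("X", "author", "Y"), ("X", "title", "T")],
            lambda env: "Roux" in eval_path(DB, env["X"], "author"),
            lambda env: [("row", {"author": env["Y"], "title": env["T"]})])
```

Each `env` dictionary plays the role of one row (Ci1, …, Cin) of the binding table, and `answer` concatenates the per-row results in place of the union.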

XQuery

http://www.w3.org/TR/xquery (12/2001)

XML Query data model: ordered

FLWR ("Flower") expressions:

    FOR ... LET ... WHERE ... RETURN ...

XQuery

Find all book titles published after 1995:

    FOR $x IN document("bib.xml")/bib/book
    WHERE $x/year > 1995
    RETURN $x/title

Result:

    <title> abc </title>
    <title> def </title>
    <title> ghi </title>
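For readers more used to general-purpose code, the same FOR / WHERE / RETURN pipeline can be approximated in Python with the standard library's `xml.etree.ElementTree`. This is only a sketch: the sample document and the function name `titles_after` are assumptions, and ElementTree supports just a small XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB = """<bib>
  <book><title>abc</title><year>1996</year></book>
  <book><title>old</title><year>1990</year></book>
  <book><title>def</title><year>2001</year></book>
</bib>"""


def titles_after(bib_xml, year):
    root = ET.fromstring(bib_xml)          # root is the <bib> element
    result = []
    for book in root.findall("book"):      # FOR $x IN .../bib/book
        # WHERE $x/year > 1995: true if some <year> child exceeds the bound
        if any(int(y.text) > year for y in book.findall("year")):
            # RETURN $x/title
            result.extend(t.text for t in book.findall("title"))
    return result
```

With the sample data, `titles_after(BIB, 1995)` keeps the 1996 and 2001 books and drops the 1990 one.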

XQuery

For each author of a book published by Morgan Kaufmann, list all books s/he published:

    FOR $a IN distinct(document("bib.xml")/bib/book[publisher="Morgan Kaufmann"]/author)
    RETURN <result>
             $a,
             FOR $t IN /bib/book[author=$a]/title
             RETURN $t
           </result>

distinct = a function that eliminates duplicates

XQuery

Result:

    <result>
      <author> Jones </author>
      <title> abc </title>
      <title> def </title>
    </result>
    <result>
      <author> Smith </author>
      <title> ghi </title>
    </result>
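The nested query and its result can be mimicked in Python as a rough sketch. The sample data and the name `books_by_publisher_authors` are assumptions, and duplicates are eliminated by comparing author text, which is only one possible reading of `distinct`.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB = """<bib>
  <book><publisher>Morgan Kaufmann</publisher><author>Jones</author><title>abc</title></book>
  <book><publisher>Morgan Kaufmann</publisher><author>Smith</author><title>ghi</title></book>
  <book><publisher>Other Press</publisher><author>Jones</author><title>def</title></book>
</bib>"""


def books_by_publisher_authors(bib_xml, publisher="Morgan Kaufmann"):
    books = ET.fromstring(bib_xml).findall("book")

    # Outer FOR over distinct(...): authors of that publisher's books,
    # kept in first-seen order with duplicates dropped.
    authors = []
    for book in books:
        if any(p.text == publisher for p in book.findall("publisher")):
            for a in book.findall("author"):
                if a.text not in authors:
                    authors.append(a.text)

    # Inner FOR: every title by that author, from any publisher.
    results = []
    for name in authors:
        titles = [t.text
                  for b in books
                  if any(a.text == name for a in b.findall("author"))
                  for t in b.findall("title")]
        results.append((name, titles))
    return results
```

Each returned pair corresponds to one `<result>` element: the author followed by that author's titles. Note that Jones's second book comes from a different publisher, because the inner loop does not repeat the publisher filter.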

XQuery

Summary of the FLWR structure:

FOR: binds node variables -> iteration
LET: binds collection variables -> one value

    FOR/LET clauses
        -> list of tuples
    WHERE clause
        -> list of tuples
    RETURN clause
        -> instance of the XQuery data model

XQuery

FOR $x IN expr -- binds $x to each value in the list expr

LET $x := expr -- binds $x to the entire list expr
    Useful for common subexpressions and for aggregations

FOR vs. LET

    FOR $x IN document("bib.xml")/bib/book
    RETURN <result> $x </result>

Returns:

    <result> <book>...</book> </result>
    ...

    LET $x := document("bib.xml")/bib/book
    RETURN <result> $x </result>

Returns:

    <result> <book>...</book>
             <book>...</book>
             ...
    </result>
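The difference is easy to see with plain Python lists. In this sketch, strings stand in for `<book>` elements and each inner list stands in for one `<result>` element; the function names are illustrative only.

```python
def for_results(books):
    # FOR: the variable is bound to one book at a time,
    # so RETURN runs once per book and yields one result each.
    return [[book] for book in books]


def let_results(books):
    # LET: the variable is bound to the whole list once,
    # so RETURN runs a single time and yields a single result.
    return [list(books)]


BOOKS = ["book1", "book2", "book3"]
```

Here `for_results(BOOKS)` produces three one-book results, while `let_results(BOOKS)` produces one result holding all three books.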

XQuery

Example: find books whose price is larger than the average:

    LET $a := avg(document("bib.xml")/bib/book/price)
    FOR $b IN document("bib.xml")/bib/book
    WHERE $b/price > $a
    RETURN $b

Example: using distinct and count:

    <big_publishers>
      FOR $p IN distinct(document("bib.xml")//publisher)
      LET $b := document("bib.xml")//book[publisher = $p]
      WHERE count($b) > 100
      RETURN $p
    </big_publishers>
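Both aggregation examples can be sketched in Python. The sample data, the function names, and the `threshold` parameter are assumptions. For readability the first function returns titles rather than whole book elements, and the second counts books per publisher in a single pass with `collections.Counter` instead of re-scanning the document for each publisher as the query does.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from statistics import mean

# Hypothetical stand-in for bib.xml.
BIB = """<bib>
  <book><title>a</title><publisher>P1</publisher><price>10</price></book>
  <book><title>b</title><publisher>P1</publisher><price>20</price></book>
  <book><title>c</title><publisher>P2</publisher><price>60</price></book>
</bib>"""


def pricier_than_average(bib_xml):
    books = ET.fromstring(bib_xml).findall("book")
    # LET $a := avg(.../bib/book/price): computed once over all prices
    avg = mean(float(p.text) for b in books for p in b.findall("price"))
    # FOR $b ... WHERE $b/price > $a
    return [b.findtext("title") for b in books
            if any(float(p.text) > avg for p in b.findall("price"))]


def big_publishers(bib_xml, threshold):
    counts = Counter()
    for book in ET.fromstring(bib_xml).iter("book"):
        # A set, so a book that lists a publisher twice is counted once.
        for name in {p.text for p in book.findall("publisher")}:
            counts[name] += 1
    # WHERE count($b) > threshold
    return sorted(name for name, n in counts.items() if n > threshold)
```

With the sample prices 10, 20, and 60 the average is 30, so only book "c" qualifies; with a threshold of 1, only publisher "P1" has enough books.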

Collections in XQuery

Ordered and unordered collections:

    /bib/book/author = an ordered collection
    distinct(/bib/book/author) = an unordered collection

LET $a := /bib/book -> $a is a collection

$b/author -> a collection (several authors...)

    RETURN <result> $b/author </result>

Returns:

    <result> <author>...</author>
             <author>...</author>
             ...
    </result>

$b/price -> list of n prices
$b/price * 0.7 -> list of n numbers

Sorting in XQuery

    <publisher_list>
      FOR $p IN distinct(document("bib.xml")//publisher)
      RETURN <publisher>
               <name> $p/text() </name> ,
               FOR $b IN document("bib.xml")//book[publisher = $p]
               RETURN <book> $b/title , $b/price </book>
                      SORTBY(price DESCENDING)
             </publisher>
             SORTBY(name)
    </publisher_list>

Sorting arguments: refer to the name space of the RETURN clause, not the FOR clause
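The two sort levels can be illustrated with a small Python sketch. Books are plain dictionaries here, and the data and the name `publisher_list` are assumptions. Unlike SORTBY, which orders the constructed elements by their own children, this sketch sorts the inputs before building the output, which gives the same order for this simple case.

```python
def publisher_list(books):
    """books: list of dicts with 'publisher', 'title', and 'price' keys."""
    # distinct(...) plus the outer SORTBY(name): unique names, ascending.
    names = sorted({b["publisher"] for b in books})
    out = []
    for name in names:
        own = [b for b in books if b["publisher"] == name]
        # Inner SORTBY(price DESCENDING).
        own.sort(key=lambda b: b["price"], reverse=True)
        out.append((name, [(b["title"], b["price"]) for b in own]))
    return out


BOOKS = [
    {"publisher": "B", "title": "t1", "price": 10},
    {"publisher": "A", "title": "t2", "price": 5},
    {"publisher": "A", "title": "t3", "price": 50},
]
```

For the sample data, publisher "A" comes first with its books ordered by falling price, followed by publisher "B".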

More XQuery: If-Then-Else

    FOR $h IN //holding
    RETURN <holding>
             $h/title,
             IF $h/@type = "Journal"
             THEN $h/editor
             ELSE $h/author
           </holding>
           SORTBY (title)

More XQuery: existential & universal quantifiers

    FOR $b IN //book
    WHERE SOME $p IN $b//para SATISFIES
          contains($p, "sailing") AND contains($p, "windsurfing")
    RETURN $b/title

    FOR $b IN //book
    WHERE EVERY $p IN $b//para SATISFIES
          contains($p, "sailing")
    RETURN $b/title
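SOME and EVERY correspond closely to Python's built-in `any` and `all`. The sketch below uses a hypothetical mapping from book title to paragraph texts, and substring tests stand in for `contains`.

```python
# Hypothetical data: title -> list of paragraph texts.
BOOKS = {
    "Water Sports": ["sailing and windsurfing tips", "sailing knots"],
    "Sailing Only": ["sailing basics"],
    "Cooking": ["pasta recipes"],
    "Empty": [],
}


def titles_some(books):
    # SOME $p ... SATISFIES: at least one paragraph mentions both words.
    return [title for title, paras in books.items()
            if any("sailing" in p and "windsurfing" in p for p in paras)]


def titles_every(books):
    # EVERY $p ... SATISFIES: all paragraphs mention "sailing".
    # A book with no paragraphs qualifies vacuously, as all([]) is True.
    return [title for title, paras in books.items()
            if all("sailing" in p for p in paras)]
```

The "Empty" entry shows a consequence of universal quantification that is easy to overlook: a book without paragraphs satisfies the EVERY condition.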

Other Stuff in XQuery

BEFORE and AFTER
    for dealing with order in the input
FILTER
    deletes some edges in the result tree
No GROUPBY currently in XQuery
    Some recent proposals exist
Recursive functions
    Currently: arbitrary recursion
    Perhaps more restrictions in the future?

Lorel

Minor syntactic differences in regular path expressions (% instead of _, # instead of _*)

Existential variables:

    SELECT biblio.book.year
    FROM biblio.book
    WHERE biblio.book.author = "Roux"

What happens with books having multiple authors? Author is existentially quantified:

    SELECT biblio.book.year
    FROM biblio.book X, X.author Y
    WHERE Y = "Roux"
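The existential reading can be checked with a short Python sketch. The data and function names are assumptions, and results are collected as sets. The first function treats the author comparison as "some author equals the name"; the second spells out the explicit variables X and Y.

```python
# Hypothetical data: each book has a year and a list of authors.
BOOKS = [
    {"year": 1976, "authors": ["Roux", "Combalusier"]},
    {"year": 1999, "authors": ["Smith"]},
    {"year": 1980, "authors": ["Roux"]},
]


def years_implicit(books, name):
    # WHERE biblio.book.author = name, read existentially:
    # the book qualifies if at least one of its authors matches.
    return {b["year"] for b in books if any(a == name for a in b["authors"])}


def years_explicit(books, name):
    # FROM biblio.book X, X.author Y WHERE Y = name
    result = set()
    for x in books:
        for y in x["authors"]:
            if y == name:
                result.add(x["year"])
    return result
```

Both functions return the same set of years, so a book with several authors is selected as soon as one of them matches.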

UnQL

Patterns:

    SELECT row: X
    WHERE {biblio.book: {author: "Roux", title: X}} in DB

Equivalent to:

    SELECT row: X
    FROM biblio.book Y, Y.author Z, Y.title X
    WHERE Z = "Roux"

UnQL

Label variables: "find all publication types and their titles where Roux is an author"

    SELECT row: {type: L, title: X}
    WHERE {biblio.L: {author: "Roux", title: X}} in DB
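A label variable ranges over edge labels rather than over nodes. The Python sketch below shows that idea on a toy edge-labeled graph; the graph encoding, the data, and the name `publications_by` are all assumptions made for illustration.

```python
# Toy edge-labeled graph (hypothetical data): oid -> list of (label, target).
DB = {
    "&o1": [("book", "&o12"), ("paper", "&o24"), ("book", "&o29")],
    "&o12": [("author", "Roux"), ("author", "Combalusier"),
             ("title", "Database Systems")],
    "&o24": [("author", "Roux"), ("title", "A Paper")],
    "&o29": [("author", "Smith"), ("title", "Database Systems")],
}


def publications_by(db, biblio, author):
    rows = []
    # L ranges over the label of every edge leaving the biblio node.
    for label, obj in db.get(biblio, []):
        edges = db.get(obj, [])
        if ("author", author) in edges:
            # One row per title of a matching publication.
            rows.extend((label, target) for (l, target) in edges if l == "title")
    return rows
```

Calling `publications_by(DB, "&o1", "Roux")` reports one book and one paper, each paired with the label of the edge that led to it.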

UnQL

Unrestricted use of label variables creates problems:

    SELECT row: {type: L, title: X}
    WHERE {biblio.(book|L).title: X} in DB

    SELECT row: {type: L, title: X}
    WHERE {biblio.(L)*.title: X} in DB

So, in UnQL regular path expressions cannot contain label variables:

    Pat ::= Var | Const | {L1:Pat1, …, Ln:Patn}
    L ::= RegularPathExpression | LabelVariable