PROCESSING AND QUERYING XML 1. ROADMAP Models for Parsing XML Documents XPath Language XQuery Language XML inside DBMSs 2.

Slides:



Advertisements
Similar presentations
Web Data Management XQuery 1. In this lecture Summary of XQuery FLWOR expressions – For, Let, Where, Order by, Return FOR and LET expressions Collections.
Advertisements

XML May 3 rd, XQuery Based on Quilt (which is based on XML-QL) Check out the W3C web site for the latest. XML Query data model –Ordered !
XML, XML Schema, Xpath and XQuery Slides collated from various sources, many from Dan Suciu at Univ. of Washington.
Database Management Systems, R. Ramakrishnan1 Introduction to Semistructured Data and XML Chapter 27, Part D Based on slides by Dan Suciu University of.
Introduction to XML, XPath, & XQuery CS186, Fall 2005 R &G - Chapters 7-27 Bill Gates, The Revolution, and a Network of Trees ( based on a true story)
1 Part 3: Query Languages Managing XML and Semistructured Data.
Querying XML (cont.). Comments on XPath? What’s good about it? What can’t it do that you want it to do? How does it compare, say, to SQL?
Introduction to XML Algebra
QSX (LN 3)1 Query Languages for XML XPath XQuery XSLT (not being covered today!) (Slides courtesy Wenfei Fan, Univ Edinburgh and Bell Labs)
1 Lecture 12: XQuery in SQL Server Monday, October 23, 2006.
1 Lecture 9: XQuery. 2 XQuery Motivation XPath expressivity insufficient –no join queries (as in SQL) –no changes to the XML structure possible –no quantifiers.
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 357 Database Systems I Query Languages for XML.
XQuery language Presented by: Tayeb sbihi supervised by: Dr. H. Haddouti.
Query Languages - XQuery Slides partially from Dan Suciu.
CSC056-Z1 – Database Management Systems – Vinnie Costa – Hofstra University1 Database Management Systems Session 10 Instructor: Vinnie Costa
1 Introduction To XML Algebra Wan Liu Bintou Kane Advanced Database Instructor: Elka 2/11/
XML May 1 st, XML for Representing Data John 3634 Sue 6343 Dick 6363 John 3634 Sue 6343 Dick 6363 row name phone “John”3634“Sue”“Dick” persons.
1 Introduction to Database Systems CSE 444 Lecture 11 Xpath/XQuery April 23, 2008.
1 Lecture 11: Xpath/XQuery Friday, October 20, 2006.
SDPL 2001Notes 8.2: XQuery1 8.2 W3C XML Query Language –Thanks for Helena Ahonen-Myka (University of Helsinki) for borrowing her slide originals for this.
XML, XML Schema, Xpath and Xquery Slides collated from various sources, many from Dan Suciu at Univ. of Washington.
Lecture 12. Default Processing in XSLT The default processing in XSLT is to process the XPath root node The default processing for various node types.
1 Introduction to XML Algebra Based on talk prepared for CS561 by Wan Liu and Bintou Kane.
XML, XML Schema, XPath and XQuery Query Languages CS561 Slides collated from several sources, including D. Suciu at Univ. of Washington.
Xpath to XQuery February 23rd, Other Stuff HW 3 is out. Instructions for Phase 3 are out. Today: finish Xpath, start and finish Xquery. From Wednesday:
1 Lecture 16: Querying XML Data: XPath, XQuery Friday, February 11, 2005.
Querying XML February 12 th, Querying XML Data XPath = simple navigation through the tree XQuery = the SQL of XML XSLT = recursive traversal –will.
Processing of structured documents Spring 2003, Part 8 Helena Ahonen-Myka.
Xquery. Summary of XQuery FLWR expressions FOR and LET expressions Collections and sorting Resource W3C recommendation:
Lecture 7 of Advanced Databases XML Querying & Transformation Instructor: Mr.Ahmed Al Astal.
Introduction to XQuery Resources: Official URL: Short intros:
1 XQuery Slides From Dr. Suciu. 2 FLWR (“Flower”) Expressions FOR... LET... WHERE... RETURN... FOR... LET... WHERE... RETURN...
XML by Dan Suciu 1 Introduction to Semistructured Data and XML Based on slides by Dan Suciu University of Washington.
1/17 ITApplications XML Module Session 7: Introduction to XPath.
Introduction to XPath Web Engineering, SS 2007 Tomáš Pitner.
Lecture 6 of Advanced Databases XML Querying & Transformation Instructor: Mr.Eyad Almassri.
End of XML February 19 th, FLWR (“Flower”) Expressions FOR... LET... WHERE... RETURN... FOR... LET... WHERE... RETURN...
CIS550 Handout 7 Fall CIS 550 Handout 7 -- XPATH and XQuery.
Processing of structured documents Spring 2003, Part 7 Helena Ahonen-Myka.
XPath. Why XPath? Common syntax, semantics for [XSLT] [XPointer][XSLT] [XPointer] Used to address parts of an XML document Provides basic facilities for.
Computing & Information Sciences Kansas State University Thursday, 15 Mar 2007CIS 560: Database System Concepts Lecture 24 of 42 Thursday, 15 March 2007.
Database Systems Part VII: XML Querying Software School of Hunan University
SDPL 2002Notes 9: XQuery1 9 Querying XML Data and Documents n XQuery, W3C XML Query Language –"work in progress", Working Draft, 30 April 2002 –joint work.
WPI, MOHAMED ELTABAKH PROCESSING AND QUERYING XML 1.
XML query. introduction An XML document can represent almost anything, and users of an XML query language expect it to perform useful queries on whatever.
CS 157B: Database Management Systems II February 13 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron.
1 XQuery Slides From Dr. Suciu. 2 XQuery Based on Quilt, which is based on XML-QL Uses XPath to express more complex queries.
XML May 6th, Instructor AnHai Doan Brief bio –high school in Vietnam & undergrad in Hungary –M.S. at Wisconsin –Ph.D. at Washington under Alon &
1 Lecture 13: XQuery XML Publishing, XML Storage Monday, October 28, 2002.
CSE 6331 © Leonidas Fegaras XQuery 1 XQuery Leonidas Fegaras.
1 Lecture 5: Relational Algebra and XML Monday, April 26th, 2004.
XQuery 1. In this lecture Summary of XQuery FLWOR expressions – For, Let, Where, Order by, Return FOR and LET expressions Collections and sorting 2.
Lecture 17: XPath and XQuery Wednesday, Nov. 7, 2001.
1 CSE544: Lecture 7 XQuery, Relational Algebra Monday, 4/22/02.
1 Lecture 12: XML, XPath, XQuery Friday, October 24, 2003.
1 The XPath Language. 2 XPath Expressions Flexible notation for navigating around trees A basic technology that is widely used uniqueness and scope in.
SDPL 2005Notes 7: XQuery1 7 Querying XML n How to access different sources (DBs, docs) as XML? n XQuery, W3C XML Query Language –"work in progress", (last.
Lecture 11: Xpath/XQuery
End of XQuery DBMS Internals
Querying XML and Semistructured Data
XML and Databases.
XML: Schemas, Queries Wednesday, 4/17/2002
Lecture 12: XML, XPath, XQuery
Introduction to Database Systems CSE 444 Lecture 12 More Xquery and Xquery in SQL Server April 25, 2008.
Lecture 12: XQuery in SQL Server
Introduction to Database Systems CSE 444 Lecture 12 Xquery in SQL Server October 22, 2007.
CS561-Spring 2012 WPI, Mohamed eltabakh
Processing and Querying XML
Siddhesh Bhobe Persistent eBusiness Solutions
XML, XML Schema, XPath and XQuery Query Languages
Presentation transcript:

PROCESSING AND QUERYING XML 1

ROADMAP Models for Parsing XML Documents XPath Language XQuery Language XML inside DBMSs 2

PROCESSING XML 3

XML PARSING 4 DOM SAX

DOM STRUCTURE MODEL 5

EXAMPLE OF DOM TREE 6

SAX: SIMPLE API FOR XML 7

DOM SUMMARY 8 Object-Oriented approach to traverse the XML node tree Automatic processing of XML docs Operations for manipulating XML tree Memory-intensive

SAX SUMMARY 9 Pros: The whole file doesn’t need to be loaded into memory XML stream processing Simple and fast Allows you to ignore less interesting data Cons: limited expressive power (query/update) when working on streams => application needs to build (some) parse-tree when necessary

DOM VS. SAX 10

ROADMAP Models for Parsing XML Documents XPath Language XQuery Language XML inside DBMSs 11

XPATH Goal = Permit access some nodes from document XPath main construct : Axis navigation Navigation step : axis + node-test + predicates Examples descendant::book  //book child::author ./author or author attribute::booktitle =“XML” = “XML” 12

CS561 - Spring XPATH- CHILD AXIS NAVIGATION / doc -- all doc children of the root./ aaa -- all aaa children of the context node (equivalent to aaa ) text() -- all text children of context node node() -- all children of the context node (includes text and attribute nodes).. -- parent of the context node.// -- the context node and all its descendants // -- the root node and all its descendants * - all children of content node //text() -- all the text nodes in the document

XPATH EXAMPLE //aaa Nodes 1, 3, 5./aaa or aaa Nodes 1, 3 14 bbb None //aaa/ccc Node 7

PREDICATES 15 [2] or./[2] -- the second child node of the context node chapter[5] -- the fifth chapter child of context node [last()] -- the last child node of the context node chapter[title=“introduction”] -- the chapter children of the context node that have one or more title children whose string-value is “introduction” (string-value is concatenation of all text on descendant text nodes) Person[//firstname = “joe”] -- the person children of the context node that have in their descendants a firstname element with string-value “Joe”

MORE EXAMPLES Find all books written by author ‘Bob’ //book[author/first-name= ‘Bob’] Given a content bookstore node, find all books with price > $50 > 50] Find all books where the category equals the bookstore specialty 16 Bob … ….

17 AXIS NAVIGATION XPath has several axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self ancestor descendant followingpreceding following-siblingpreceding-sibling child attribute namespace self

ROADMAP Models for Parsing XML Documents XPath Language XQuery Language XML inside DBMSs 18

CS561 - Spring FLWR ( “ Flower ” ) Expressions FOR... LET... WHERE... RETURN... XQuery is more powerful and expressive that XPath

CS561 - Spring XQUERY Find the titles of all books published after 1995: FOR $x IN document("bib.xml") /bib/book WHERE $x/year > 1995 RETURN $x/title FOR $x IN document("bib.xml") /bib/book WHERE $x/year > 1995 RETURN $x/title How does result look like?

CS561 - Spring XQUERY Find the titles of all books published after 1995: FOR $x IN document("bib.xml") /bib/book WHERE $x/year > 1995 RETURN $x/title FOR $x IN document("bib.xml") /bib/book WHERE $x/year > 1995 RETURN $x/title Result: abc def ghi

CS561 - Spring XQUERY EXAMPLE FOR $a IN ( document("bib.xml") /bib/book[publisher= “ Morgan Kaufmann ” ]/author) RETURN $a, FOR $t IN /bib/book[author=$a]/title RETURN $t FOR $a IN ( document("bib.xml") /bib/book[publisher= “ Morgan Kaufmann ” ]/author) RETURN $a, FOR $t IN /bib/book[author=$a]/title RETURN $t

CS561 - Spring XQUERY EXAMPLE FOR $a IN ( document("bib.xml") /bib/book[publisher= “ Morgan Kaufmann ” ]/author) RETURN $a, FOR $t IN /bib/book[author=$a]/title RETURN $t FOR $a IN ( document("bib.xml") /bib/book[publisher= “ Morgan Kaufmann ” ]/author) RETURN $a, FOR $t IN /bib/book[author=$a]/title RETURN $t For each author of a book by Morgan Kaufmann, list all books she published: What is query result ?

CS561 - Spring XQUERY Result: Jones abc def Jones abc def Smith ghi How to eliminate duplicates?

CS561 - Spring XQUERY EXAMPLE: DUPLICATES For each author of a book by Morgan Kaufmann, list all books she published: FOR $a IN distinct ( document("bib.xml") /bib/book[publisher= “ Morgan Kaufmann ” ]/author) RETURN $a, FOR $t IN /bib/book[author=$a]/title RETURN $t FOR $a IN distinct ( document("bib.xml") /bib/book[publisher= “ Morgan Kaufmann ” ]/author) RETURN $a, FOR $t IN /bib/book[author=$a]/title RETURN $t distinct = a function that eliminates duplicates

CS561 - Spring EXAMPLE XQUERY RESULT Result: Jones abc def Smith ghi

CS561 - Spring XQUERY FOR $x in expr binds $x to each element in the list expr Useful for iteration over some input list LET $x = expr binds $x to the entire list expr Useful for common subexpressions and for grouping and aggregations

CS561 - Spring XQUERY WITH LET CLAUSE count = a (aggregate) function that returns number of elements FOR $p IN distinct(document("bib.xml")//publisher) LET $b := document("bib.xml")/book[publisher = $p] WHERE count($b) > 100 RETURN $p FOR $p IN distinct(document("bib.xml")//publisher) LET $b := document("bib.xml")/book[publisher = $p] WHERE count($b) > 100 RETURN $p

CS561 - Spring XQUERY Find books whose price is larger than average: LET $a = avg( document("bib.xml") FOR $b in document("bib.xml") /bib/book WHERE > $a RETURN $b LET $a = avg( document("bib.xml") FOR $b in document("bib.xml") /bib/book WHERE > $a RETURN $b

CS561 - Spring FOR vs. LET FOR Binds node variables  iteration LET Binds collection variables  one value

CS561 - Spring FOR vs. LET FOR $x IN document("bib.xml") /bib/book RETURN $x FOR $x IN document("bib.xml") /bib/book RETURN $x Returns:... LET $x := document("bib.xml") /bib/book RETURN $x LET $x := document("bib.xml") /bib/book RETURN $x Returns:...

CS561 - Spring COLLECTIONS IN XQUERY Ordered and unordered collections /bib/book/author = an ordered collection distinct(/bib/book/author) = an unordered collection LET $a = /bib/book  $a is a collection $b/author  a collection (several authors...) RETURN $b/author Returns:...

CS561 - Spring XQUERY SUMMARY FOR-LET-WHERE-RETURN = FLWR FOR/LET Clauses WHERE Clause RETURN Clause List of tuples Instances of XQuery data model

CS561 - Spring SORTING IN XQUERY FOR $p IN distinct(document("bib.xml")//publisher) RETURN $p/text(), FOR $b IN document("bib.xml")//book[publisher = $p] RETURN $b/title, SORTBY (price DESCENDING) SORTBY (name) FOR $p IN distinct(document("bib.xml")//publisher) RETURN $p/text(), FOR $b IN document("bib.xml")//book[publisher = $p] RETURN $b/title, SORTBY (price DESCENDING) SORTBY (name)

CS561 - Spring IF-THEN-ELSE FOR $h IN //holding RETURN $h/title, IF = "Journal" THEN $h/editor ELSE $h/author SORTBY (title) FOR $h IN //holding RETURN $h/title, IF = "Journal" THEN $h/editor ELSE $h/author SORTBY (title)

CS561 - Spring EXISTENTIAL QUANTIFIERS FOR $b IN //book WHERE SOME $p IN $b//para SATISFIES contains($p, "sailing") AND contains($p, "windsurfing") RETURN $b/title FOR $b IN //book WHERE SOME $p IN $b//para SATISFIES contains($p, "sailing") AND contains($p, "windsurfing") RETURN $b/title

CS561 - Spring UNIVERSAL QUANTIFIERS FOR $b IN //book WHERE EVERY $p IN $b//para SATISFIES contains($p, "sailing") RETURN $b/title FOR $b IN //book WHERE EVERY $p IN $b//para SATISFIES contains($p, "sailing") RETURN $b/title

38 EXAMPLE: XML SOURCE DOCUMENTS Invoice.xml 2 AT&T $ Sprint $ $0.75 maria Customer.xml 1 Tom 2 George

39 XQUERY EXAMPLE List account number, customer name, and invoice total for all invoices that have carrier = “ Sprint ”. FOR $i in (invoices.xml)//invoice, $c in (customers.xml)//customer WHERE $i/carrier = “ Sprint ” and $i/account_number= $c/account RETURN $i/account_number, $c/name, $i/total

40 EXAMPLE: XQUERY OUTPUT 1 Tom $1.20

41 ALGEBRA TREE EXECUTION customer (2)customer(1)Invoice (1)invoice (2)invoice (3) Source (Invoices.xml)Source (cutomers.xml) Follow (*.invoice)Follow (*.customer) Select (carrier= “Sprint” ) invoice (2) Join (*.invoice.account_number=*.customer.account) invoice(2) customer(1) Expose (*.account_number, *.name, *.total ) Account_number name total

ROADMAP Models for Parsing XML Documents XPath Language XQuery Language XML inside DBMSs 42

XML vs. TABLES 43 Before providing a native support for XML, documents is translated back and forth to relational tables

GENERATING XML FROM RELATIONAL DATA 44 Step 1 : Set up the database connection // Create an instance of the JDBC driver so that it has // a chance to register itself Class.forName( sun.jdbc.odbc.JdbcOdbcDriver ).newInstance(); // Create a new database connection. Connection con = DriverManager.getConnection( jdbc:odbc:myData, “”, “” ); // Create a statement object that we can execute queries with Statement stmt = con.createStatement();

Generating XML (cont’d) 45 Step 2 : Execute the JDBC query String query = “Select Name, Age from Customers”; ResultSet rs = stmt.executeQuery(query);

Generating XML (cont’d) 46 Step 3 : Create the XML! StringBuffer xml = “ ” ; while (rs.next()) { xml.append(“ ”); xml.append(rs.getString(“Name”)); xml.append(“ ”); xml.append(rs.getInt(“Age”)); xml.append(“ ”); } xml.append(“ ”);

STORING XML IN RELATIONAL TABLES 47 Step 1 : Set up the parser StringReader stringReader = new StringRead(xmlString); InputSource inputSource = new InputSource(stringReader); DOMParser domParser = new DOMParser(); domParser.parse(inputSource); Document document = domParser.getDocument();

Storing XML (Cont’d) 48 Step 2 : Read values from parsed XML document NodeList nameList = doc.getElementsByTagName(“custName”); NodeList ageList = doc.getElementsByTagName(“custAge”);

Storing XML (Cont’d) 49 Step 3 : Set up database connection Class.forName(sun.jdbc.odbc.JdbcOdbcDriver).newInstance(); Connection con = DriverManager.getConnection (jdbc:odbc:myDataBase, “”, “”); Statement stmt = con.createStatement();

Storing XML (Cont’d) 50 Step 4 : Insert data using appropriate JDBC update query String sql = “INSERT INTO Customers (Name, Age) VALUES (?,?)”; PreparedStatement pstmt = conn.prepareStatement(sql); int size = nameList.getLength(); for (int i = 0; i < size; i++) { pstmt. setString(1, nameList.item(i).getFirstChild().getNodeValue()); pstmt.setInt(2, ageList.item(i).getFirstChild().getNodeValue()); pstmt.executeUpdate(sql); }

USE CASE: ORACLE 10g New Data type “ XMLType ” to store XML documents Before XMLType Either parse the documents and store it in relational tables Or, store the documents in CLOB, BLOB 51 Creating a Table with an XMLType Column CREATE TABLE mytable1 (key_column VARCHAR2(10) PRIMARY KEY, xml_column XMLType); Creating a Table of XMLType CREATE TABLE mytable2 OF XMLType; Creating a Table of XMLType with schema CREATE TABLE mytable3 OF XMLType XMLSCHEMA “....”;

INSERTING XML INTO ORACLE 52 Inserting XML Content into an XMLType Table using SQL CREATE DIRECTORY xmldir AS path_to_folder_containing_XML_file'; INSERT INTO mytable2 VALUES (XMLType(bfilename('XMLDIR', 'purchaseOrder.xml'), nls_charset_id('AL32UTF8'))); Inserting XML Content into an XMLType Table using Java/DOM public void doInsert(Connection conn, Document doc) throws Exception { String SQLTEXT = "INSERT INTO purchaseorder VALUES (?)"; XMLType xml = null; xml = XMLType.createXML(conn,doc); OraclePreparedStatement sqlStatement = null; sqlStatement = (OraclePreparedStatement) conn.prepareStatement(SQLTEXT); sqlStatement.setObject(1,xml); sqlStatement.execute(); } Many ways…

ORACLE XPATH OPERATIONS/FUNCTIONS 53 Update warehouse SET warehouse_spec = AppendChildXML(warehouse_spec, 'Warehouse/Building', XMLType(' Grandco ’) Where ExtractValue(warehouse_spec, ‘//Warehouse/Building') = 'Rented'; SELECT warehouse_id, warehouse_name, ExtractValue(warehouse_spec, '/Warehouse/Building/Owner') "Prop.Owner" FROM warehouses WHERE ExistsNode(warehouse_spec, '/Warehouse/Building/Owner') = 1; UPDATE warehouses SET warehouse_spec = InsertChildXML(warehouse_spec, '/Warehouse/Building', 'Owner', XMLType(' LesserCo ')) WHERE warehouse_id = 3;

MORE XPATH EXAMPLES 54

ORACLE XQUERY 55

WHAT WE COVERED Models for Parsing XML Documents XPath Language XQuery Language XML inside DBMSs 56