Introduction to XML and XQuery Guangjun (Kevin) Xie.

Nov 28, 2005, York University

Road Map
- XML data model
- XML data vs. relational data
- XPath 2.0
- XQuery
- Processing XQuery

XML Data Model: XML Information Set (Infoset)
- An infoset is an abstract data set containing all the information in an XML document.
- It provides a consistent set of definitions for referring to the information in a well-formed XML document.
- Infosets usually result from parsing XML documents, but they can also be synthetic:
  - built through an API, such as DOM
  - produced by transforming an existing infoset
- An infoset consists of a number of information items.

XML Data Model: XML Infoset
- "Information set" and "information item" are similar in meaning to the generic terms "tree" and "node".
- An information item is an abstract description of some part of an XML document.
- Each information item has a set of associated named properties, written as [property name].

XML Data Model: Information Items
There are 11 types of information items:
1. Document Information Item
2. Element Information Items
3. Attribute Information Items
4. Character Information Items
5. Processing Instruction Information Items
6. Unexpanded Entity Reference Information Items
7. Comment Information Items
8. The Document Type Declaration Information Item
9. Unparsed Entity Information Items
10. Notation Information Items
11. Namespace Information Items
We will discuss the first three today.

XML Data Model: Document Information Item
- There is exactly one document information item in an infoset.
- Other information is accessible through its properties:
  - [children]: processing instructions, comments, etc.
  - [document element]: the element item corresponding to the document element
  - [version]: the XML version of the document
  - etc.

XML Data Model: Element Information Items
- There is one element item for each element in the XML document.
- The "root" element item is the [document element] property of the document information item.
- Properties:
  - [namespace name]: the namespace name (URI) that the tag's prefix maps to
  - [local name]: the local part of the tag name
  - [children]: all other information items inside the element
  - [attributes]: the attribute items of this element
  - [parent]: the information item containing this item
  - etc.

XML Data Model: Attribute Information Items
- There is one attribute item for each attribute of an XML element.
- Properties:
  - [namespace name]: the namespace name (URI) that the attribute's prefix maps to
  - [local name]: the local part of the attribute name
  - [attribute type]: the data type of this attribute
  - [owner element]: the element information item containing this attribute
  - etc.

XML Data Model: Infoset Example

    <msg:message doc:date=" " xmlns:doc=" " xmlns:msg=" ">Phone home!</msg:message>

The information set contains:
- A document information item.
- An element information item with namespace name " ", local part "message", and prefix "msg".
- An attribute information item with namespace name " ", local part "date", prefix "doc", and normalized value " ".
- Three namespace information items for the namespaces in scope.
- Two attribute information items for the namespace attributes.
- Eleven character information items for the character data.
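The properties listed above can be inspected with an ordinary XML parser. The following is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`, not a full Infoset implementation. The namespace URIs and the date value are hypothetical placeholders, because the slide's originals are elided. The helper `split_qname` is introduced only for illustration. ElementTree does not expose the namespace declarations as attributes, so the two namespace attribute items do not appear here.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs and date value: the originals are elided in the slide.
DOC_NS = "http://example.org/doc"
MSG_NS = "http://example.org/msg"

xml_text = (
    f'<msg:message doc:date="2005-11-28" '
    f'xmlns:doc="{DOC_NS}" xmlns:msg="{MSG_NS}">Phone home!</msg:message>'
)

root = ET.fromstring(xml_text)


def split_qname(qname):
    """Split ElementTree's '{namespace}local' form into (namespace name, local name)."""
    if qname.startswith("{"):
        namespace, local = qname[1:].split("}", 1)
        return namespace, local
    return None, qname


# [namespace name] and [local name] of the element information item.
element_ns, element_local = split_qname(root.tag)

# Attribute items keyed by (namespace name, local name). The xmlns:* declarations
# are consumed by the parser and are not reported in root.attrib.
attributes = {split_qname(name): value for name, value in root.attrib.items()}

# One character information item per character of the element's text.
character_items = list(root.text)

print(element_ns, element_local)
print(attributes)
print(len(character_items))
```

The count of character items matches the eleven characters of "Phone home!" mentioned on the slide.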

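XML Data Model: Infoset Example
[Figure: tree diagram of the example infoset. A document information item (version 1.0) has the element information item msg:message as its child. The element carries the attribute information items xmlns:msg, xmlns:doc, and doc:date, and the character information items spelling "Phone home!". Legend: document, element, attribute, and character information items.]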

Road Map
- XML data model
- XML data vs. relational data
- XPath 2.0
- XQuery
- Processing XQuery

XML Data vs. Relational Data
- Relational databases stem from commercial data processing, where information usually has a regular structure.
- XML has its roots in text document processing, where information often has an irregular structure.
- Both are general models, capable of representing all forms of information.
- Their different heritages cause them to be optimized for different types of applications.

XML Data vs. Relational Data: Nesting
- XML model
  - Deeply nested structure
  - Flexible (not predefined)
  - Queries over nested data are easily handled by the "descendant" axis in XPath 2.0
- Relational model
  - Flat table structure
  - Primary and foreign keys represent nesting relationships
  - Complex and flexible nesting may result in awkward queries
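To illustrate the point about the descendant axis, here is a small sketch in Python using `xml.etree.ElementTree`, which supports a limited subset of XPath. The document and its section titles are made up for the example. A relational encoding of the same data would need one self-join per nesting level, or a recursive query, to produce the first result.

```python
import xml.etree.ElementTree as ET

# Made-up document with sections nested to an arbitrary depth.
doc = ET.fromstring("""
<book>
  <title>Book</title>
  <section><title>S1</title>
    <section><title>S1.1</title>
      <section><title>S1.1.1</title></section>
    </section>
  </section>
  <section><title>S2</title></section>
</book>
""")

# ".//section/title" follows the descendant axis: sections at any depth match.
section_titles = [t.text for t in doc.findall(".//section/title")]

# "./section/title" follows only the child axis: top-level sections match.
top_level_titles = [t.text for t in doc.findall("./section/title")]

print(section_titles)
print(top_level_titles)
```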

XML Data vs. Relational Data: Metadata
- XML model
  - Metadata is mixed with ordinary data
  - High ratio of metadata to ordinary data
- Relational model
  - Metadata is easily factored out
  - Queries that involve metadata are difficult
  - Example: find the names of the columns containing the value "red"

XML Data vs. Relational Data: Ordering
- XML model
  - Intrinsic ordering that cannot be derived from values
  - Example: the order of the sentences in a book is essential
  - Poses a challenge for the query language
- Relational model
  - Ordering depends on values
  - Rows are not considered to have an ordering

XML Data vs. Relational Data: Null Values
- XML model
  - A missing value is represented by the absence of an element
  - Retrieving a missing value results in an empty list
  - Rules are needed for how operators handle an empty list
- Relational model
  - A "null" value represents a missing value
  - Rules define how operators behave in the presence of null
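The two treatments of missing values can be contrasted in a short Python sketch using the standard `xml.etree.ElementTree` and `sqlite3` modules. The table and element names are assumptions made for the example. This shows the general behaviour only; XQuery's own rules for empty sequences are defined by the language specification.

```python
import sqlite3
import xml.etree.ElementTree as ET

# XML side: the price is simply absent from the document.
book = ET.fromstring("<book><title>Data on the Web</title></book>")
prices = book.findall("price")          # absent element -> empty list
price_text = book.findtext("price")     # absent element -> None

# Relational side: the price column holds an explicit NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (title TEXT, price REAL)")
conn.execute("INSERT INTO book VALUES ('Data on the Web', NULL)")

# A comparison with NULL is neither true nor false, so the row is filtered out.
rows = conn.execute("SELECT title FROM book WHERE price > 30").fetchall()
# The comparison itself evaluates to NULL, which sqlite3 returns as None.
unknown = conn.execute("SELECT NULL > 30").fetchone()[0]

print(prices, price_text, rows, unknown)
```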

XML Data vs. Relational Data: Structural Transformations
- XML model
  - Queries operate on XML documents and generate new XML documents
  - XPath 2.0: navigating inside a document
  - XQuery: joining elements, constructing new elements and structures
- Relational model
  - Queries operate on tables and generate new tables

XML Data vs. Relational Data: Data Definition
- XML model
  - Mixture of primitive data and nested elements
  - Elements may be optional
  - Constraints on cardinality and order
  - Poses challenges for type inference
  - Example: how do we prove that a query's output satisfies a given schema?
- Relational model
  - Specifies the properties of columns
  - All rows have the same columns
  - Relatively simple

Road Map
- XML data model
- XML data vs. relational data
- XPath 2.0
- XQuery
- Processing XQuery

XPath 2.0: What's XPath?
- XPath is a language for addressing parts of an XML document.
  - XPath 2.0 provides a way to locate an individual node or a set of nodes in the XML data model.
- XPath 2.0 is closely related to XQuery.
  - Both use the same data model, which is based on the XML Infoset.
  - XQuery uses XPath to refer to information in the data model.
- XPath 2.0 uses path expressions to navigate XML documents and select nodes.
- A path expression evaluates to a sequence of nodes.
- Path expressions look very much like the paths used in a traditional computer file system.
- XPath 2.0 is a W3C Recommendation.

XPath 2.0: Data Model
- Represents the various values involved in a query, including:
  - the input and the output of the query
  - all values of expressions used during intermediate calculations
- Based on the XML Infoset data model
- Shared with XQuery
- Models XML data as trees
- Sequence-based:
  - Sequences represent sets of trees or tree fragments.
  - Everything is a sequence.
  - Sequences never contain other sequences.

XPath 2.0: Data Model
- A tree whose root node is a document node is referred to as a document.
- A tree whose root node is not a document node is referred to as a fragment.

XPath 2.0: Data Model
- Every instance of the data model is a sequence.
- A sequence is an ordered collection of zero or more items.
- An item is either a node or an atomic value.
- A sequence may contain nodes, atomic values, or any mixture of the two.
- A single item appearing on its own is modeled as a sequence containing one item.
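The rule that sequences never nest can be mimicked in a few lines of Python. This is only an analogy, using Python lists to stand in for sequences; the helper name `seq` is made up for the sketch and is not part of any XPath library.

```python
def seq(*items):
    """Build a flat sequence: nested sequences are spliced in, never stored as members."""
    flat = []
    for item in items:
        if isinstance(item, list):
            # A sequence inside a sequence contributes its items, not itself.
            flat.extend(seq(*item))
        else:
            flat.append(item)
    return flat


# Analogous to the XPath 2.0 expression (1, (2, 3), (), ((4))).
flattened = seq(1, [2, 3], [], [[4]])
# A single item is modeled as a sequence of length one.
singleton = seq(5)

print(flattened)
print(singleton)
```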

XPath 2.0: Data Model
There are seven kinds of nodes in the data model:
- Document node
- Element node
- Attribute node
- Text node
- Namespace node
- Processing instruction node
- Comment node

XPath 2.0: Sample XML Document (Books.xml)
- Everyday Italian, by Giada De Laurentiis
- Harry Potter, by J K. Rowling
- XQuery Kick Start, by James McGovern, Per Bothner, Kurt Cagle, James Linn, Vaidyanathan Nagarajan
- Learning XML, by Erik T. Ray

XPath 2.0 Example (on Books.xml)
- /bookstore/book evaluates to a sequence of nodes, each node corresponding to a book element.
- //book evaluates to the same result.

XPath 2.0 Example
… evaluates to a sequence containing 2 book element nodes:
- XQuery Kick Start, by James McGovern, Per Bothner, Kurt Cagle, James Linn, Vaidyanathan Nagarajan
- Learning XML, by Erik T. Ray

XPath 2.0 Example
- some $x in //book satisfies $x/price > 49
  evaluates to a sequence containing the atomic value true.
- every $x in //book satisfies $x/price > 49
  evaluates to a sequence containing the atomic value false.

XPath 2.0 Example
/bookstore/book[position()=1] evaluates to a sequence containing one element node: the book Everyday Italian, by Giada De Laurentiis.
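The path and quantified expressions from the XPath examples can be approximated in Python with `xml.etree.ElementTree`. This is a sketch under assumptions: the markup of Books.xml is not preserved in the slides, so the element names below follow the path expressions used on the slides, only the first author of each book is included, and the price values are illustrative. ElementTree implements only a subset of XPath 1.0, so the quantified expressions are written with `any` and `all`.

```python
import xml.etree.ElementTree as ET

# Assumed markup and illustrative prices; the slide's sample lost its tags.
BOOKS_XML = """
<bookstore>
  <book><title>Everyday Italian</title><author>Giada De Laurentiis</author><price>30.00</price></book>
  <book><title>Harry Potter</title><author>J K. Rowling</author><price>29.99</price></book>
  <book><title>XQuery Kick Start</title><author>James McGovern</author><price>49.99</price></book>
  <book><title>Learning XML</title><author>Erik T. Ray</author><price>39.95</price></book>
</bookstore>
"""

bookstore = ET.fromstring(BOOKS_XML)  # the <bookstore> document element

# /bookstore/book: book children of the document element.
child_books = bookstore.findall("book")
# //book: book elements at any depth.
all_books = bookstore.findall(".//book")

prices = [float(book.findtext("price")) for book in all_books]
# some $x in //book satisfies $x/price > 49
some_over_49 = any(price > 49 for price in prices)
# every $x in //book satisfies $x/price > 49
every_over_49 = all(price > 49 for price in prices)

# /bookstore/book[position()=1]: ElementTree spells this as book[1].
first_book = bookstore.findall("book[1]")

print([b.findtext("title") for b in child_books])
print(some_over_49, every_over_49)
print([b.findtext("title") for b in first_book])
```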

Road Map
- XML data model
- XML data vs. relational data
- XPath 2.0
- XQuery
- Processing XQuery

XQuery: What's XQuery?
- The language for querying XML data
  - XQuery is a language for finding and extracting elements and attributes from XML documents.
- XQuery is to XML what SQL is to relational databases.
  - Many of the concepts and techniques used in SQL processing and optimization can be applied to XQuery processing and optimization.

XQuery: What's XQuery?
- XQuery is built on XPath 2.0 expressions.
  - XQuery 1.0 and XPath 2.0 share the same data model.
  - They support the same functions and operators.
  - Understanding XPath 2.0 is essential to understanding XQuery.
- Supported by all the major database vendors:
  - IBM
  - Oracle
  - Microsoft
  - etc.

XQuery: What's XQuery?
- Closed with respect to its data model
  - The value of every expression in the language is guaranteed to be in the data model.
  - XPath 2.0 is also closed.
- Designed to be a functional language
  - No side effects
  - Processes and produces sequences
- XQuery is becoming a W3C standard
  - The current draft version is XQuery 1.0.
  - It is not yet a W3C Recommendation (XQuery is a Working Draft).

XQuery: FLWOR Expressions
- For: binds a variable to each item of a sequence in turn
- Let: binds a variable to a whole sequence
- Where: applies conditions to the bindings produced by the For clause
- Order By: sorts the output of the For clause
- Return: returns a sequence
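The five FLWOR clauses map naturally onto a loop, a local variable, a filter, a sort, and a result list. The sketch below shows that correspondence in Python with `xml.etree.ElementTree`; it is an analogy, not how an XQuery engine evaluates the expression. The markup of the input and the `year` values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed markup and year values for illustration.
bib = ET.fromstring("""
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><publisher>Addison-Wesley</publisher></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title><publisher>Addison-Wesley</publisher></book>
  <book year="2000"><title>Data on the Web</title><publisher>Morgan Kaufmann Publishers</publisher></book>
</bib>
""")

bindings = []
for b in bib.findall("book"):                      # for: one binding per book
    year = int(b.get("year"))                      # let: bind a value once per iteration
    if b.findtext("publisher") == "Addison-Wesley" and year > 1991:   # where
        bindings.append((b.findtext("title"), year))

bindings.sort(key=lambda pair: pair[0])            # order by: sort on title

results = [f"{title} ({year})" for title, year in bindings]           # return

print(results)
```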

XQuery: Sample XML Document (bib.xml)
- TCP/IP Illustrated; Stevens, W.; Addison-Wesley
- Advanced Programming in the Unix environment; Stevens, W.; Addison-Wesley
- Data on the Web; Abiteboul, Serge; Buneman, Peter; Suciu, Dan; Morgan Kaufmann Publishers
- The Economics of Technology and Content for Digital TV; Gerbarg, Darcy (CITI); Kluwer Academic Publishers

XQuery: Sample XML Document (reviews.xml)
- Data on the Web: A very good discussion of semi-structured database systems and XML.
- Advanced Programming in the Unix environment: A clear and detailed discussion of UNIX programming.
- TCP/IP Illustrated: One of the best books on TCP/IP.

XQuery: Sample XML Document (prices.xml)
- Advanced Programming in the Unix environment, bstore2.example.com
- Advanced Programming in the Unix environment, bstore1.example.com
- TCP/IP Illustrated, bstore2.example.com
- TCP/IP Illustrated, bstore1.example.com
- Data on the Web, bstore2.example.com
- Data on the Web, bstore1.example.com, 39.95

XQuery Example 1
List books published by Addison-Wesley after 1991, including their year and title.

Solution in XQuery:

    <bib>
    {
      for $b in doc("bib.xml")/bib/book
      where $b/publisher = "Addison-Wesley" and $b/@year > 1991
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>

Result: TCP/IP Illustrated and Advanced Programming in the Unix environment.

XQuery Example 2
Create a flat list of all the title-author pairs.

Solution in XQuery:

    for $b in doc("bib.xml")/bib/book,
        $t in $b/title,
        $a in $b/author
    return
      <result>
        { $t }
        { $a }
      </result>

Result:
- TCP/IP Illustrated, Stevens W.
- Advanced Programming in the Unix environment, Stevens W.
- Data on the Web, Abiteboul Serge
- Data on the Web, Buneman Peter
- Data on the Web, Suciu Dan

XQuery Example 3
For each book in the bibliography, list the title and authors.

Solution in XQuery:

    for $b in doc("bib.xml")/bib/book
    return
      <result>
        { $b/title }
        { $b/author }
      </result>

Result:
- TCP/IP Illustrated: Stevens W.
- Advanced Programming in the Unix environment: Stevens W.
- Data on the Web: Abiteboul Serge, Buneman Peter, Suciu Dan
- The Economics of Technology and Content for Digital TV: no authors

XQuery Example 4
For each book found in both bib.xml and reviews.xml, list the title of the book and its price from each source.

Solution in XQuery:

    <books-with-prices>
    {
      for $b in doc("bib.xml")//book,
          $a in doc("reviews.xml")//entry
      where $b/title = $a/title
      return
        <book-with-prices>
          { $b/title }
          <price-review>{ $a/price/text() }</price-review>
          <price-bib>{ $b/price/text() }</price-bib>
        </book-with-prices>
    }
    </books-with-prices>

Result: entries for TCP/IP Illustrated, Advanced Programming in the Unix environment, and Data on the Web.
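The query in Example 4 is a join between two documents on the book title. A rough Python equivalent, written as a nested-loop join over `xml.etree.ElementTree` trees, is sketched below. The markup and the price values are assumptions, since the sample documents in the slides lost both; an actual XQuery processor is free to choose a different join strategy.

```python
import xml.etree.ElementTree as ET

# Assumed markup and illustrative prices for both documents.
bib = ET.fromstring("""
<bib>
  <book><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book><title>Data on the Web</title><price>39.95</price></book>
  <book><title>The Economics of Technology and Content for Digital TV</title><price>129.95</price></book>
</bib>
""")
reviews = ET.fromstring("""
<reviews>
  <entry><title>Data on the Web</title><price>34.95</price></entry>
  <entry><title>TCP/IP Illustrated</title><price>65.95</price></entry>
</reviews>
""")

# Nested-loop join on title, mirroring
# "for $b in ..., $a in ... where $b/title = $a/title".
joined = [
    (b.findtext("title"), a.findtext("price"), b.findtext("price"))
    for b in bib.iter("book")
    for a in reviews.iter("entry")
    if b.findtext("title") == a.findtext("title")
]

for title, review_price, bib_price in joined:
    print(title, review_price, bib_price)
```

The book that appears only in the bibliography drops out of the result, as it would in a relational inner join.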

XQuery Example 5
List the titles and years of all books published by Addison-Wesley after 1991, in alphabetic order.

Solution in XQuery:

    <bib>
    {
      for $b in doc("bib.xml")//book
      where $b/publisher = "Addison-Wesley" and $b/@year > 1991
      order by $b/title
      return
        <book>
          { $b/@year }
          { $b/title }
        </book>
    }
    </bib>

Result: Advanced Programming in the Unix environment, followed by TCP/IP Illustrated.

XQuery Example 6
In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element with the book title as its title attribute.

Solution in XQuery:

    <results>
    {
      let $doc := doc("prices.xml")
      for $t in distinct-values($doc//book/title)
      let $p := $doc//book[title = $t]/price
      return
        <minprice title="{ $t }">
          <price>{ min($p) }</price>
        </minprice>
    }
    </results>
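Example 6 groups the price entries by title and takes the minimum of each group. The same computation can be sketched in Python with a dictionary keyed by title. The markup and all prices except the 39.95 entry are assumptions, because the sample prices.xml in the slides lost its tags and most of its values.

```python
import xml.etree.ElementTree as ET

# Assumed markup; only the 39.95 price survives in the slide's sample data.
prices = ET.fromstring("""
<prices>
  <book><title>TCP/IP Illustrated</title><source>bstore2.example.com</source><price>65.95</price></book>
  <book><title>TCP/IP Illustrated</title><source>bstore1.example.com</source><price>63.50</price></book>
  <book><title>Data on the Web</title><source>bstore2.example.com</source><price>34.95</price></book>
  <book><title>Data on the Web</title><source>bstore1.example.com</source><price>39.95</price></book>
</prices>
""")

# One pass over the entries, keeping the smallest price seen per title.
# The dictionary keys play the role of distinct-values($doc//book/title).
min_price = {}
for book in prices.iter("book"):
    title = book.findtext("title")
    price = float(book.findtext("price"))
    min_price[title] = min(price, min_price.get(title, price))

for title, price in min_price.items():
    print(f"{title}: {price:.2f}")
```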

XQuery: Sample XML Document (book.xml)
Data on the Web, by Serge Abiteboul, Peter Buneman, Dan Suciu
- Introduction: Text...
- Audience: Text...
- Web Data and the Two Cultures: Text...
- Traditional client/server architecture
- Text...
- A Syntax For Data: Text...
- Graph representations of structures
- Text...
- Base Types: Text...
- Representing Relational Databases: Text...
- Examples of Relations
- Representing Object Databases: Text...

XQuery Example 7
Prepare a (nested) table of contents, listing all sections and their titles. Preserve the original attributes of each element, if any.

Solution in XQuery:

    declare function local:toc($book-or-section as element()) as element()*
    {
      for $section in $book-or-section/section
      return
        <section>
          { $section/@*, $section/title, local:toc($section) }
        </section>
    };

    <toc>
    {
      for $s in doc("book.xml")/book
      return local:toc($s)
    }
    </toc>

Result:
- Introduction
  - Audience
  - Web Data and the Two Cultures
- A Syntax For Data
  - Base Types
  - Representing Relational Databases
  - Representing Object Databases
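The recursive function in Example 7 can be mirrored in Python to show how the recursion unfolds. This is a sketch using `xml.etree.ElementTree`; the input markup, the `id` attributes, and the shortened section list are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed markup with made-up id attributes and a shortened section list.
book = ET.fromstring("""
<book>
  <title>Data on the Web</title>
  <section id="intro">
    <title>Introduction</title>
    <p>Text...</p>
    <section><title>Audience</title><p>Text...</p></section>
    <section><title>Web Data and the Two Cultures</title><p>Text...</p></section>
  </section>
  <section id="syntax">
    <title>A Syntax For Data</title>
    <section><title>Base Types</title><p>Text...</p></section>
  </section>
</book>
""")


def toc(book_or_section):
    """Return new <section> elements holding only attributes, title, and nested entries."""
    entries = []
    for section in book_or_section.findall("section"):
        entry = ET.Element("section", dict(section.attrib))  # keep original attributes
        title = section.find("title")
        if title is not None:
            copied = ET.SubElement(entry, "title")
            copied.text = title.text
        entry.extend(toc(section))  # recurse into subsections
        entries.append(entry)
    return entries


toc_root = ET.Element("toc")
toc_root.extend(toc(book))

print(ET.tostring(toc_root, encoding="unicode"))
```

Body text such as the `p` elements is dropped, while attributes and the nesting of sections are kept.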

Road Map
- XML data model
- XML data vs. relational data
- XPath 2.0
- XQuery
- Processing XQuery

Processing XQuery: Approaches for Querying XML Data
- Mapping XML data into relational data
  - Query with SQL
  - May produce too many relations
  - Information may be lost, for example ordering and the explicit hierarchical relationships between elements
- Using XML-specific query languages
  - Usually integrated with SQL and relational data management
  - SQL/XML or XQuery
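The first approach can be illustrated with a generic "edge table" shredding, sketched below in Python with the standard `sqlite3` module. This is not the mapping used by any of the products discussed in this talk; the table layout and column names are assumptions. The `parent` and `ordinal` columns are exactly the extra bookkeeping needed to avoid losing hierarchy and ordering.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<book><title>Data on the Web</title>"
    "<author>Abiteboul</author><author>Buneman</author><author>Suciu</author></book>"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    "id INTEGER PRIMARY KEY, parent INTEGER, ordinal INTEGER, tag TEXT, text TEXT)"
)


def shred(element, parent_id=None, ordinal=0):
    """Insert one row per element; parent and ordinal keep hierarchy and sibling order."""
    cursor = conn.execute(
        "INSERT INTO node (parent, ordinal, tag, text) VALUES (?, ?, ?, ?)",
        (parent_id, ordinal, element.tag, (element.text or "").strip()),
    )
    node_id = cursor.lastrowid
    for position, child in enumerate(element):
        shred(child, node_id, position)


shred(doc)

# The path /book/author becomes a self-join; document order needs an explicit ORDER BY.
authors = [
    row[0]
    for row in conn.execute(
        "SELECT c.text FROM node AS c JOIN node AS p ON c.parent = p.id "
        "WHERE p.tag = 'book' AND c.tag = 'author' ORDER BY c.ordinal"
    )
]
row_count = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]

print(authors, row_count)
```

Each additional step in a path adds another self-join, which is one reason deeply nested queries become awkward in this encoding.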

Processing XQuery: IBM System RX
- SQL/XQuery compiler
- A new XQuery parser is added to the existing relational query processor.
- All components are extended to process XQuery.

Processing XQuery: Oracle XQuery Compilation Engine
- The parser converts XQuery into XQueryX.
- XQueryX is an XML representation of XQuery (another W3C candidate recommendation).
- An XML parser constructs a DOM tree from the XQueryX.
- Later stages work on the DOM tree.
- The corresponding components are extended for XQuery too.

Processing XQuery: Microsoft XQuery Compilation
- XQuery is compiled into an XML algebra tree, which is an internal representation.
- A mapper traverses the algebra tree, converting each XML operator into a relational operator sub-tree.
- The algebra tree can then be optimized and executed by the relational query processor.
- Optimizations are rule-based.
