Query Languages for XML


Query Languages for XML XPath

The XPath Data Model Given an XML document, XPath operates on it and produces values that are sequences of items. An item is either: A value of primitive type, e.g., integer or string; A node (defined next).

Principal Kinds of Nodes Document nodes represent entire XML documents, identified by a local path name or a URL. Elements are pieces of a document consisting of some opening tag, its matching closing tag (if any), and everything in between. Attributes are names that are given values inside opening tags.

Document Nodes Formed by doc(filename) filename can be a local name or a URL. Example: doc(“univ.xml”) or doc(“/mydir/univ.xml”) All XPath queries refer to a doc node, either explicitly or implicitly. Example: key definitions in XML Schema have XPath expressions that refer to the document described by the schema.

Running Example
<!DOCTYPE UNIV [
  <!ELEMENT UNIV (STUDENT*, COURSE*)>
  <!ELEMENT STUDENT (NAME, MARK+)>
  <!ATTLIST STUDENT sno ID #REQUIRED>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT MARK (#PCDATA)>
  <!ATTLIST MARK theCNO IDREF #REQUIRED>
  <!ELEMENT COURSE (#PCDATA)>
  <!ATTLIST COURSE cno ID #REQUIRED>
]>

Example Document
<UNIV>
  <STUDENT sno = "007">
    <NAME>James Bond</NAME>
    <MARK theCNO = "CS123">80</MARK>
    <MARK theCNO = "CS456">98</MARK>
  </STUDENT>
  …
  <COURSE cno = "CS123">DB</COURSE>
  …
</UNIV>
Each tagged piece, such as <NAME>James Bond</NAME>, is an element node. Each name-value pair inside an opening tag, such as sno = "007", is an attribute node. The document node is all of this, plus the header (<?xml version… ?>).

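Nodes as Semistructured Data [Figure: the example document drawn as a tree. The document node univ.xml has the single child UNIV. UNIV has a STUDENT child (sno = "007") and a COURSE child (cno = "CS123", content DB). STUDENT has a NAME child (James Bond) and two MARK children (theCNO = "CS123", content 80; theCNO = "CS456", content 98).]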

Paths in XML Documents XPath is a language for describing paths in XML documents. The result of the described path is a sequence of items.

Path Expressions Simple path expressions are sequences of slashes (/) and tags, starting with /. Example: /UNIV/STUDENT/MARK Construct the result by starting with just the doc node and processing each tag in turn from the left.

Evaluating a Path Expression Assume the first tag is the root. Processing the doc node by this tag results in a sequence consisting of only the root element. e.g., /UNIV Suppose we have obtained a sequence of items from processing the previous tags, and the next tag is X. For each item that is an element node, replace the element by all its subelements with tag X.
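The evaluation rule above can be sketched in a few lines of Python. This is only an illustration built on the standard library's ElementTree, not how an XPath processor is actually implemented; the function name eval_simple_path, the second student and the second course are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# The example document, plus one hypothetical extra student and course
# standing in for the parts the slides elide with "…".
UNIV_XML = """
<UNIV>
  <STUDENT sno="007">
    <NAME>James Bond</NAME>
    <MARK theCNO="CS123">80</MARK>
    <MARK theCNO="CS456">98</MARK>
  </STUDENT>
  <STUDENT sno="008">
    <NAME>Jane Doe</NAME>
    <MARK theCNO="CS123">91</MARK>
  </STUDENT>
  <COURSE cno="CS123">DB</COURSE>
  <COURSE cno="CS456">OS</COURSE>
</UNIV>
"""


def eval_simple_path(root, path):
    """Evaluate a simple absolute path such as /UNIV/STUDENT/MARK."""
    tags = path.strip("/").split("/")
    # Processing the doc node by the first tag yields a sequence holding
    # only the root element (or nothing, if the tag does not match).
    items = [root] if root.tag == tags[0] else []
    for tag in tags[1:]:
        # Replace each element by all of its subelements with this tag.
        items = [child for item in items for child in item if child.tag == tag]
    return items


root = ET.fromstring(UNIV_XML)
marks = eval_simple_path(root, "/UNIV/STUDENT/MARK")
print([m.text for m in marks])  # ['80', '98', '91']
```

Each pass through the loop corresponds to one tag of the path, so the intermediate values of items are exactly the results of /UNIV, /UNIV/STUDENT and /UNIV/STUDENT/MARK.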

Example: /UNIV Applied to the example document, the result is one item, the UNIV element.

Example: /UNIV/STUDENT The result is the STUDENT element with sno = "007", followed by all the other STUDENT elements.

Example: /UNIV/STUDENT/MARK The result is the two MARK elements of student 007 (marks 80 and 98), followed by the MARK elements of all the other STUDENTs.

Relative Path We can use XPath expressions that are relative to the current node or sequence of nodes. Do not start with /. Example If we have arrived at node /UNIV, then we can use relative path STUDENT/NAME or COURSE to describe its subelements. Lu Chaojun, SJTU

Attributes in Paths Instead of going to subelements with a given tag, you can go to an attribute of the elements you already have. An attribute is indicated by putting @ in front of its name.

Evaluating Attributes When a path expression ends in an attribute, the result is typically a sequence of values of primitive type.

Example: /UNIV/STUDENT/MARK/@theCNO Applied to the example document, the theCNO attributes of student 007's MARK elements contribute "CS123" "CS456" to the result, followed by other theCNO values.
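As a rough Python analogue of the attribute step, one can first collect the MARK elements and then read the attribute from each. ElementTree's path syntax has no trailing @name step, so the attribute lookup is done with get(); the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<UNIV>"
    '<STUDENT sno="007"><NAME>James Bond</NAME>'
    '<MARK theCNO="CS123">80</MARK><MARK theCNO="CS456">98</MARK></STUDENT>'
    '<COURSE cno="CS123">DB</COURSE>'
    "</UNIV>"
)

# /UNIV/STUDENT/MARK: ElementTree paths are relative to the element they
# are called on, so "./STUDENT/MARK" is evaluated from the UNIV root.
marks = root.findall("./STUDENT/MARK")

# /@theCNO: step from each MARK element to its theCNO attribute, skipping
# elements that do not carry the attribute.
course_numbers = [m.get("theCNO") for m in marks if m.get("theCNO") is not None]
print(course_numbers)  # ['CS123', 'CS456']
```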

Paths that Begin Anywhere If the path begins with //X, then the first step can begin at the root or any subelement of the root, as long as the tag is X. In fact, //X can appear anywhere in a path. e.g., /UNIV//NAME // is the shorthand of a kind of axis. (see next slide)

Axes: Modes of Navigation In general, path expressions allow us to start at the root and execute steps to find a sequence of nodes. At each step, we may follow any one of several axes. The default axis is child:: --- go to all the children of the current set of nodes. Shorthand: /

Example: Axes /UNIV/STUDENT is really shorthand for /child::UNIV/child::STUDENT. @ is really shorthand for the attribute:: axis. Thus, /UNIV/STUDENT/@sno is shorthand for /child::UNIV/child::STUDENT/attribute::sno

More Axes Some other useful axes are:
parent:: = parent(s) of the current node(s). Shorthand: ..
self:: = the current node(s). Shorthand: . (the dot)
descendant-or-self:: = the current node(s) and all descendants. Shorthand: // (strictly, // abbreviates /descendant-or-self::node()/)
ancestor::, ancestor-or-self::, following-sibling::, etc.
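To make the axes concrete, here is a small Python sketch of descendant-or-self and parent over an ElementTree tree. ElementTree stores no parent pointers, so the parent axis is emulated with a lookup table; the helper names descendant_or_self and parent_of are assumptions for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<UNIV>"
    '<STUDENT sno="007"><NAME>James Bond</NAME>'
    '<MARK theCNO="CS123">80</MARK><MARK theCNO="CS456">98</MARK></STUDENT>'
    '<COURSE cno="CS123">DB</COURSE>'
    "</UNIV>"
)


def descendant_or_self(node):
    """The node itself followed by all of its descendants, in document order."""
    return list(node.iter())


# parent:: axis, emulated by mapping every child element to its parent.
parent_of = {child: parent for parent in root.iter() for child in parent}

# /UNIV//NAME: the NAME children of UNIV or of any of its descendants.
names = [c for n in descendant_or_self(root) for c in n if c.tag == "NAME"]

# //MARK/.. : the parents of all MARK elements, each listed once.
mark_parents = []
for mark in root.iter("MARK"):
    parent = parent_of[mark]
    if parent not in mark_parents:
        mark_parents.append(parent)

print([n.text for n in names])                # ['James Bond']
print([p.get("sno") for p in mark_parents])   # ['007']
```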

Wildcard * A star (*) in place of a tag represents “any tag”. Example: /*/*/NAME represents all NAME elements at the third level of nesting. @* represents “any attribute”. Example: /UNIV/STUDENT/@*

Example: /UNIV/* Applied to the example document, the result is the STUDENT element with sno = "007", all other STUDENT elements, the COURSE element with cno = "CS123", and all other COURSE elements.

Selection Conditions A condition inside […] may follow a tag. If so, then only paths that have that tag and also satisfy the condition are included in the result of a path expression. Sequence comparisons have an implied “there exists” sense: two sequences are related if any pair of items, one from each sequence, are related by the given comparison operator.

Example: Selection Condition /UNIV/STUDENT/MARK[. < 90] Applied to the example document, the condition that the MARK be < 90 makes the CS123 mark (80) but not the CS456 mark (98) part of the result.

Example: Attribute in Selection /UNIV/STUDENT/MARK[@theCNO = "CS123"] Now the MARK element of student 007 with theCNO = "CS123" (mark 80) is selected, along with any other marks for CS123.
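The two selection conditions above can be mimicked in Python as follows. ElementTree understands the attribute test [@theCNO='CS123'] directly, but not numeric comparisons such as [. < 90], so that one is written as an ordinary filter. The second student is a hypothetical addition to the example document.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<UNIV>"
    '<STUDENT sno="007"><NAME>James Bond</NAME>'
    '<MARK theCNO="CS123">80</MARK><MARK theCNO="CS456">98</MARK></STUDENT>'
    # Hypothetical second student, not part of the original slides.
    '<STUDENT sno="008"><NAME>Jane Doe</NAME>'
    '<MARK theCNO="CS123">91</MARK></STUDENT>'
    "</UNIV>"
)

marks = root.findall("./STUDENT/MARK")

# /UNIV/STUDENT/MARK[. < 90]: keep the MARK elements whose value is below 90.
below_90 = [m for m in marks if float(m.text) < 90]

# /UNIV/STUDENT/MARK[@theCNO = "CS123"]: supported by ElementTree as is.
cs123 = root.findall("./STUDENT/MARK[@theCNO='CS123']")

print([m.text for m in below_90])  # ['80']
print([m.text for m in cs123])     # ['80', '91']
```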

Other Forms of Conditions Here are some useful forms of conditions: X[i] = true for the ith X child of its parent. X[T] = true for X having a subelement with tag T. X[@A] = true for X having attribute A.

Query Languages for XML XQuery

XQuery XQuery extends XPath to a query language that has power similar to SQL. Uses the same sequence-of-items data model. XQuery is an expression/functional language. Any XQuery expression can be an argument of any other XQuery expression. Like relational algebra

More About Item Sequences XQuery will sometimes form sequences of sequences. All sequences are flattened. Example: (1, 2, (), (3, 4)) = (1, 2, 3, 4), where () is the empty sequence.
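Flattening is easy to picture with nested Python tuples standing in for XQuery sequences. The function flatten below is a hypothetical helper written for this illustration; XQuery itself flattens automatically and never exposes nested sequences.

```python
def flatten(seq):
    """Flatten arbitrarily nested tuples into one flat list of items."""
    flat = []
    for item in seq:
        if isinstance(item, tuple):
            # A nested sequence contributes its items, not itself.
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


print(flatten((1, 2, (), (3, 4))))  # [1, 2, 3, 4]
```

Note that the empty sequence simply disappears, which is why it cannot be stored as an item inside another sequence.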

FLWR Expressions One or more for and/or let clauses. Then an optional where clause. Exactly one return clause.

Semantics of FLWR Expressions Each for creates a loop. let produces only a local definition. At each iteration of the nested loops, if any, evaluate the where clause. If the where clause returns TRUE, invoke the return clause, and append its value to the output.

for Clauses for var in exp, ... Variables begin with $. A for-variable takes on each item in the sequence denoted by the expression, in turn. Whatever follows this for is executed once for each value of the variable.

Example: for
for $c in doc("univ.xml")/UNIV/COURSE/@cno
return <CNO> {data($c)} </CNO>
The braces mean: "Expand the enclosed string by replacing variables and path expressions by their values." $c ranges over the cno attributes of all courses in our example document. data($c) extracts the value of each attribute; a bare {$c} would instead attach the attribute node itself to the new element. Result is a sequence of CNO elements: <CNO>CS123</CNO> <CNO>CS456</CNO> . . .

Use of Braces When a variable name like $x, or an expression, could be text, we need to surround it by braces to avoid having it interpreted literally. Example: <A>$x</A> is an A-element with value ”$x”. <A>{$x}</A> is correct. But return $x is unambiguous. You cannot return an untagged string without quoting it, as return ”$x”.

let Clauses let var := exp, ... Value of the variable becomes the sequence of items defined by the expression. Note let does not cause iteration; for does.

Example: let
let $d := doc("univ.xml")
let $c := $d/UNIV/COURSE/@cno
return <CNO> {data($c)} </CNO>
Returns one element with all the course numbers, like: <CNO>CS123 CS456 …</CNO>
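The difference between for and let can be sketched in Python: for behaves like a loop that produces one result per item, while let binds the whole sequence once. The strings below only imitate the constructed CNO elements, and the second course is a hypothetical addition to the example document.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<UNIV>"
    '<COURSE cno="CS123">DB</COURSE>'
    # Hypothetical second course, standing in for the elided "…".
    '<COURSE cno="CS456">OS</COURSE>'
    "</UNIV>"
)

# The sequence of course numbers that the path expression denotes.
cnos = [c.get("cno") for c in root.findall("./COURSE")]

# for: the return clause runs once per item, giving one CNO element each.
for_result = [f"<CNO>{c}</CNO>" for c in cnos]

# let: the variable is bound to the whole sequence, and the return clause
# runs once, giving a single CNO element holding all the values.
let_result = [f"<CNO>{' '.join(cnos)}</CNO>"]

print(for_result)  # ['<CNO>CS123</CNO>', '<CNO>CS456</CNO>']
print(let_result)  # ['<CNO>CS123 CS456</CNO>']
```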

order by Clauses FLWR is really FLWOR: an order-by clause can precede the return. Form: order by <expression> With optional ascending or descending. The expression is evaluated for each assignment to variables. Determines placement in output sequence.

Example: order by List all marks for CS123, lowest first.
let $d := doc("univ.xml")
for $p in $d/UNIV/STUDENT/MARK[@theCNO="CS123"]
order by $p
return $p
Generates bindings for $p to MARK elements. Orders those bindings by the values inside the elements (automatic coercion). Each binding is evaluated for the output. The result is a sequence of MARK elements.

Predicates Normally, conditions imply existential quantification. e.g., for two sequences of items to be equal, we have only to find any pair of items, one from each side, that equate.

Strict Comparisons To require that the things being compared are sequences of only one item, use the value comparison operators: eq, ne, lt, le, gt, ge. Example: $x/NAME eq "James Bond" is true if $x/NAME is a single NAME element whose value is "James Bond".
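The contrast between the existential comparison (=) and the strict comparison (eq) can be sketched in Python with lists standing in for sequences. The function names general_eq and value_eq are assumptions, and raising an error for every operand that is not a single item is a simplification of what a real XQuery processor does.

```python
def general_eq(left, right):
    """'=': true if some pair of items, one from each side, is equal."""
    return any(a == b for a in left for b in right)


def value_eq(left, right):
    """'eq': both operands must be sequences of exactly one item."""
    if len(left) != 1 or len(right) != 1:
        raise TypeError("eq requires single-item operands")
    return left[0] == right[0]


marks = ["80", "98"]
print(general_eq(marks, ["98"]))                  # True
print(general_eq(marks, ["70"]))                  # False
print(value_eq(["James Bond"], ["James Bond"]))   # True
```

Calling value_eq(marks, ["98"]) raises TypeError in this sketch, because the left operand holds two items.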

Boolean Values in XQuery The effective boolean value (EBV) of an expression is: The actual value if the expression is of type boolean. FALSE if the expression evaluates to 0, ”” [the empty string], or () [the empty sequence]. TRUE otherwise.
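The rule above translates almost line for line into Python, again with lists standing in for sequences. The function name ebv is an assumption, and the sketch follows the simplified rule as stated here: real processors are stricter, for example rejecting a sequence of several atomic values instead of treating it as true.

```python
def ebv(seq):
    """Effective boolean value of a sequence, per the simplified rule."""
    if len(seq) == 0:
        return False              # () is false
    if len(seq) == 1:
        item = seq[0]
        if isinstance(item, bool):
            return item           # a boolean is its own value
        if item == 0 or item == "":
            return False          # 0 and the empty string are false
    return True                   # everything else counts as true


print(ebv([]))       # False
print(ebv([0]))      # False
print(ebv([""]))     # False
print(ebv([False]))  # False
print(ebv(["x"]))    # True
```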

Comparison of Elements and Values When an element is compared to a primitive value, the element is treated as its value, if that value is atomic. Example: /UNIV/STUDENT[@sno="007"]/MARK[@theCNO="CS123"] eq "80" is true if student 007 gets 80 for CS123.

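Comparison of Two Elements It is insufficient that two elements look alike for them to count as the same element. Example: /A[@C="C1"]/B is /A[@C="C2"]/B is false, even if the two B elements have identical content. For elements to be the same under the node comparison operator is, they must be the same, physically, in the implied document. (The value comparison eq first reduces each element to its value, so it compares content rather than identity.)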

Getting Data From Elements Suppose we want to compare the values of elements, rather than their location in documents. To extract just the value (e.g., the mark itself) from an element E, use data(E).

Eliminating Duplicates Use function distinct-values applied to a sequence. Subtlety: this function strips tags away from elements and compares the string values. But it doesn't restore the tags in the result. Example:
return distinct-values(
  let $d := doc("univ.xml")
  return $d/UNIV/STUDENT/MARK)
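A Python sketch of the same idea: take the string value of each MARK element and drop repeats. The second student is a hypothetical addition so that a duplicate actually occurs. The sketch keeps first-occurrence order, whereas XQuery leaves the order of distinct-values results up to the implementation.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<UNIV>"
    '<STUDENT sno="007"><NAME>James Bond</NAME>'
    '<MARK theCNO="CS123">80</MARK><MARK theCNO="CS456">98</MARK></STUDENT>'
    # Hypothetical second student whose mark duplicates an existing value.
    '<STUDENT sno="008"><NAME>Jane Doe</NAME>'
    '<MARK theCNO="CS123">80</MARK></STUDENT>'
    "</UNIV>"
)

marks = root.findall("./STUDENT/MARK")

# Strip the tags (keep only the text) and remove duplicate values.
# dict.fromkeys preserves the order in which values first appear.
distinct = list(dict.fromkeys(m.text for m in marks))

print(distinct)  # ['80', '98']
```

As the slide notes, the result is a list of plain values: the MARK tags and their theCNO attributes are gone.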

Quantifier Expressions some $x in E1 satisfies E2 Evaluate the sequence E1. Let $x (any variable) be each item in the sequence, and evaluate E2. Return TRUE if E2 has EBV TRUE for at least one $x. Analogously: every $x in E1 satisfies E2

Aggregations Take a sequence as argument, and return count, sum, max, etc. Example:
let $d := doc("univ.xml")
for $s in $d/UNIV/STUDENT
where count($s/MARK) > 100
return $s
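Quantifiers and aggregations map naturally onto Python's any, all and len. The sketch below filters students in the style of a for/where/return query; the thresholds and the second student are assumptions chosen so that the small example produces visible results.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<UNIV>"
    '<STUDENT sno="007"><NAME>James Bond</NAME>'
    '<MARK theCNO="CS123">80</MARK><MARK theCNO="CS456">98</MARK></STUDENT>'
    # Hypothetical second student, not part of the original slides.
    '<STUDENT sno="008"><NAME>Jane Doe</NAME>'
    '<MARK theCNO="CS123">91</MARK></STUDENT>'
    "</UNIV>"
)

students = root.findall("./STUDENT")


def mark_values(student):
    """Numeric values of a student's MARK subelements."""
    return [float(m.text) for m in student.findall("MARK")]


# for $s in ... where count($s/MARK) > 1 return $s
several_marks = [s.get("sno") for s in students if len(mark_values(s)) > 1]

# where some $m in $s/MARK satisfies $m > 90
some_high = [s.get("sno") for s in students if any(v > 90 for v in mark_values(s))]

# where every $m in $s/MARK satisfies $m >= 85
every_high = [s.get("sno") for s in students if all(v >= 85 for v in mark_values(s))]

print(several_marks)  # ['007']
print(some_high)      # ['007', '008']
print(every_high)     # ['008']
```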

Branching Expressions if (E1) then E2 else E3 is evaluated by: Compute the EBV of E1. If true, the result is E2; else the result is E3. Example if($x/@sno eq ”007”) then $x/NAME else ()

Query Languages for XML XSLT

XSLT XSLT (Extensible Stylesheet Language Transformations) is another language for processing XML documents. Originally intended as a presentation language: transform XML into an HTML page that could be displayed. It can also extract data from XML or transform XML into XML, thus serving as a query language.

XSLT Programs Like XML Schema, an XSLT program is itself an XML document. Usually called a stylesheet. XSLT has a special namespace of tags, usually indicated by xsl:.
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  …
</xsl:stylesheet>

Templates A stylesheet has one or more templates. A template describes a set of elements (of the document being processed) and what should be done with them. The form: <xsl:template match = path > … </xsl:template> Attribute match gives an XPath expression describing how to find the nodes to which the template applies.

Example
<xsl:template match = "/">
  <HTML><BODY>
    <B>This is a document.</B>
  </BODY></HTML>
</xsl:template>
The template matches only the root. Its output is an HTML page.

Obtaining Values from XML The output usually depends on the data of the input XML. Use xsl:value-of to extract data. Example:
<xsl:template match = "/UNIV/STUDENT">
  <xsl:value-of select = "@sno" /> <BR/>
</xsl:template>

Recursive Use of Templates An XSLT document usually contains many templates. Start by finding the first one that applies to the root. Any template can have within it <xsl:apply-templates/>, which causes the template-matching to apply recursively from the current node.

Apply-Templates Attribute select gives an XPath expression describing the subelements to which we apply templates. Example: <xsl:apply-templates select = ”A/B” /> says to follow all paths tagged A, B from the current node and apply all templates there.

Example: Apply-Templates
<xsl:template match = "/">
  <Students> <xsl:apply-templates /> </Students>
</xsl:template>
<xsl:template match = "UNIV/STUDENT">
  <Stu> <xsl:value-of select = "NAME"/> </Stu>
</xsl:template>
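The recursive matching behind apply-templates can be imitated with a small dispatch table in Python. This is a loose sketch of the mechanism, not an XSLT processor: the function names are assumptions, templates are keyed by tag only, and the default rule just recurses into child elements, whereas XSLT's built-in rules also copy text, so a real run of the stylesheet would additionally emit the text of the COURSE elements.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<UNIV>"
    '<STUDENT sno="007"><NAME>James Bond</NAME>'
    '<MARK theCNO="CS123">80</MARK><MARK theCNO="CS456">98</MARK></STUDENT>'
    '<COURSE cno="CS123">DB</COURSE>'
    "</UNIV>"
)


def default_rule(node, templates):
    # Simplified stand-in for the built-in rule: recurse into the children.
    return "".join(apply_templates(child, templates) for child in node)


def apply_templates(node, templates):
    """Apply the template registered for the node's tag, else the default."""
    rule = templates.get(node.tag, default_rule)
    return rule(node, templates)


def student_template(node, templates):
    # Plays the role of <Stu><xsl:value-of select="NAME"/></Stu>.
    return f"<Stu>{node.findtext('NAME')}</Stu>"


def transform(document_root, templates):
    # Plays the role of the template that matches "/".
    return "<Students>" + apply_templates(document_root, templates) + "</Students>"


templates = {"STUDENT": student_template}
result = transform(root, templates)
print(result)  # <Students><Stu>James Bond</Stu></Students>
```

UNIV has no template of its own, so the default rule descends into its children, where the STUDENT template fires once per student.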

Iteration Loop within a template: <xsl:for-each select = "XPath expression"> … </xsl:for-each> executes the body of the for-each once for each node reached by the path from the current node.

Conditionals Branching: <xsl:if test = "boolean expression"> … </xsl:if> executes the body if and only if the boolean expression is true.

End