Web Data Management XPath.


In this lecture: review of the XPath specification (data model, examples, syntax). Resources: A formal semantics of patterns in XSLT, by Phil Wadler; XML Path Language (XPath), www.w3.org/TR/xpath

XPath http://www.w3.org/TR/xpath (11/99) Building block for other W3C standards: XSL Transformations (XSLT) XML Link (XLink) XML Pointer (XPointer) XML Query Was originally part of XSL

XPath An expression language to be used in another host language (e.g., XSLT, XQuery). Allows the description of paths in an XML tree, and the retrieval of nodes that match these paths. Can also be used for performing some (limited) operations on XML data.

Example for XPath Queries
<bib>
<book> <publisher> Addison-Wesley </publisher> <author> Serge Abiteboul </author> <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author> <author> Victor Vianu </author> <title> Foundations of Databases </title> <year> 1995 </year> </book>
<book price="55"> <publisher> Freeman </publisher> <author> Jeffrey D. Ullman </author> <title> Principles of Database and Knowledge Base Systems </title> <year> 1998 </year> </book>
</bib>

Data Model for XPath XPath expressions operate over XML trees, which consist of the following node types: Document: the root node of the XML document; Element: element nodes; Attribute: attribute nodes, attached to an Element node (their parent) but not counted among its children; Text: text nodes, i.e., leaves of the XML tree.

Data Model for XPath [Figure: XPath tree for the bib example. The root has processing-instruction and comment children and the root element bib; each book element carries an attribute (Attr="1") and element children such as publisher and author, whose text leaves are "Addison-Wesley" and "Serge Abiteboul".] Much like the XQuery data model.

Data Model for XPath The root node of an XML tree is the (unique) Document node. The root element is the (unique) Element child of the root node. A node has a name, or a value, or both: an Element node has a name, but no value; a Text node has a value (a character string), but no name; an Attribute node has both a name and a value. Attributes are special! Attributes are not considered first-class nodes in an XML tree. They must be addressed specifically, when needed.

XPath: Simple Expressions /bib/book/year Result: <year> 1995 </year> <year> 1998 </year> /bib/paper/year Result: empty (there were no papers)
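The two queries above can be tried with Python's standard xml.etree.ElementTree module, which implements only a subset of XPath. This is a minimal sketch, not a full XPath engine: ElementTree paths are relative to the element they are called on, so /bib/book/year is written as ./book/year and evaluated at the bib element. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)  # the <bib> document element

# /bib/book/year, rewritten relative to <bib>
years = [year.text for year in bib.findall("./book/year")]

# /bib/paper/year: there are no paper elements, so the result is empty
papers = bib.findall("./paper/year")

print(years)   # ['1995', '1998']
print(papers)  # []
```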

XPath Tree Nodes Seven node types: root, element, attribute, text, comment, processing instruction and namespace. Namespace and attribute nodes have parent nodes, but are not children of those parent nodes. The relationship between a parent node and a child node is containment. Attribute nodes and namespace nodes describe their parent nodes.

Xpath Tree Nodes <?xml version = "1.0"?> <!-- Fig. 11.3 : simple2.xml --> <!-- Processing instructions and namespacess --> <html xmlns = "http://www.w3.org/TR/REC-html40"> <head> <title>Processing Instruction and Namespace Nodes </title> </head> <?deitelprocessor example = "fig11_03.xml"?> <body> <deitel:book deitel:edition = "1" xmlns:deitel = "http://www.deitel.com/xmlhtp1"> <deitel:title>XML How to Program</deitel:title> </deitel:book> </body> </html>

XPath Tree Nodes String-value: Each XPath tree node has a string representation that XPath uses to compare nodes. The string-value of a text node consists of the character data contained in the node. Document order: Nodes in an XPath tree have an ordering that is determined by the order in which the nodes appear in the original XML document. The reverse document order is the reverse ordering of the nodes in a document. The string-value for the html element node is determined by concatenating the string-values for all of its descendant text nodes in document order. The string-value for element node html is Processing Instruction and Namespace NodesXML How to Program Because all whitespace is removed when the text nodes are normalized, there is no space in the concatenation.

XPath Tree Nodes For processing instructions, the string-value consists of the remainder of the processing instruction after the target, including whitespace, but excluding the ending ?>. The string-value for the processing instruction is example = "fig11_03.xml". Namespace-node string-values consist of the URI for the namespace. The string-value for the deitel namespace declaration is http://www.deitel.com/xmlhtp1

XPath Tree Nodes For the root node of the document, the string-value is also determined by concatenating the string-values of its text-node descendants in document order. The string-value of the root node is therefore identical to the string-value calculated for the html element node. The string-value for the edition attribute node consists of its value, which is 1. The string-value for a comment node consists only of the comment's text, excluding <!-- and -->. The string-value for the second comment node is therefore: Processing instructions and namespacess.

XPath Tree Nodes Expanded-name: Certain nodes (i.e., element, attribute, processing instruction and namespace) also have an expanded-name that can be used to locate specific nodes in the XPath tree. Expanded-names consist of both a local part and a namespace URI. The local part for the element node html is therefore html. If there is a prefix for the element node, the namespace URI of the expanded-name is the URI to which the prefix is bound. If there is no prefix for the element node, the namespace URI of the expanded name is the URI for the default namespace.

XPath Tree Nodes The local part of the expanded name for a processing instruction node corresponds to the target of the processing instruction in the XML document. For processing instructions, the namespace URI of the expanded-name is null The local part of the expanded-name for a namespace node corresponds to the prefix for the namespace, if one exists; or, if it is a default namespace, the local part is empty (i.e., the empty string). The namespace URI of the expanded-name for a namespace node is always null.
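As an illustration of expanded-names, Python's xml.etree.ElementTree stores element and attribute names in the form {namespace-URI}local-part, which is one concrete encoding of the (namespace URI, local part) pair described above. The sketch below uses a trimmed copy of the simple2.xml example; the helper expanded_name is a hypothetical name introduced here, not part of any library.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<html xmlns="http://www.w3.org/TR/REC-html40"><body>'
    '<deitel:book deitel:edition="1" '
    'xmlns:deitel="http://www.deitel.com/xmlhtp1">'
    '<deitel:title>XML How to Program</deitel:title>'
    '</deitel:book></body></html>'
)

def expanded_name(tag):
    """Split ElementTree's '{uri}local' form into (namespace URI, local part)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag  # no namespace at all

html = ET.fromstring(DOC)
book = html.find(".//{http://www.deitel.com/xmlhtp1}book")

html_name = expanded_name(html.tag)    # unprefixed element: default namespace URI
book_name = expanded_name(book.tag)    # prefixed element: URI bound to the prefix
attr_names = [expanded_name(key) for key in book.attrib]

print(html_name)
print(book_name)
print(attr_names)
```

Note that the xmlns declarations themselves do not show up in book.attrib: ElementTree consumes them while resolving names.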

XPath Tree Nodes Summary of node types (string-value; expanded-name; description):
root: string-value determined by concatenating the string-values of all text-node descendants in document order; expanded-name: none; represents the root of an XML document. This node exists only at the top of the tree and may contain element, comment or processing-instruction children.
element: string-value determined in the same way, from the element's text-node descendants; expanded-name: the element tag, including the namespace prefix (if applicable); represents an XML element and may contain element, text, comment or processing-instruction children.
attribute: string-value is the normalized value of the attribute; expanded-name: the name of the attribute, including the namespace prefix (if applicable); represents an attribute of an element.
text: string-value is the character data contained in the text node; expanded-name: none; represents the character data content of an element.
comment: string-value is the content of the comment (not including <!-- and -->); expanded-name: none; represents an XML comment.
processing instruction: string-value is the part of the processing instruction that follows the target and any whitespace; expanded-name: the target of the processing instruction; represents an XML processing instruction.
namespace: string-value is the URI of the namespace; expanded-name: the namespace prefix; represents an XML namespace.

XPath: Axes A location path is an expression that specifies how to navigate an XPath tree from one node to another. A location path is composed of location steps, each of which is composed of an "axis," a "node test" and an optional "predicate." Searching through an XML document begins at a context node in the XPath tree. Searches through the XPath tree are made relative to this context node. An axis indicates which nodes, relative to the context node, should be included in the search. The axis also dictates the ordering of the nodes in the set. Axes that select nodes that follow the context node in document order are called forward axes. Axes that select nodes that precede the context node in document order are called reverse axes.

XPath Context A step is evaluated in a specific context [<N1, N2, ..., Nn>, Nc] which consists of: a context list <N1, N2, ..., Nn> of nodes from the XML tree; a context node Nc belonging to the context list. The context length n is a positive integer indicating the size of the context list of nodes; it can be obtained with the function last(). The context node position c ∈ [1,n] is a positive integer indicating the position of the context node in the context list of nodes; it can be obtained with the function position().

XPath Steps The basic components of XPath expressions are steps, of the form: axis::node-test[P1][P2]. . . [Pn] axis is an axis name indicating the direction of the step in the XML tree (child is the default). node-test is a node test, indicating the kind of nodes to select. Pi is a predicate, that is, any XPath expression, evaluated as a boolean, indicating an additional condition. There may be no predicates at all. A step is evaluated with respect to a context, and returns a node list.

Path Expressions A path expression is of the form: [/]step1/step2/. . . /stepn A path that begins with / is an absolute path expression; A path that does not begin with / is a relative path expression. Examples /A/B is an absolute path expression denoting the Element nodes with name B, children of the root named A; ./B/descendant::text() is a relative path expression which denotes all the Text nodes descendant of an Element B, itself child of the context node; /A/B/@att1[.> 2] denotes all the Attribute nodes @att1 whose value is greater than 2.

Evaluation of Path Expressions Each step_i is interpreted with respect to a context; its result is a node list. A step step_i is evaluated with respect to the context of step_(i-1). More precisely: For i = 1 (first step): if the path is absolute, the context is a singleton, the root of the XML tree; else (relative paths) the context is defined by the environment. For i > 1: if N = <N1, N2, ..., Nn> is the result of step step_(i-1), step_i is successively evaluated with respect to the context [N, Nj], for each j ∈ [1,n]. The result of the path expression is the node set obtained after evaluating the last step.

Evaluation of Path Expressions Evaluation of /A/B/@att1 The path expression is absolute: the context consists of the root node of the tree. The first step, A, is evaluated with respect to this context.

Evaluation of /A/B/@att1 The result is A, the root element. A is the context for the evaluation of the second step, B.

Evaluation of /A/B/@att1 The result is a node list with two nodes B[1], B[2]. @att1 is first evaluated with the context node B[1].

Evaluation of /A/B/@att1 The result is the attribute node of B[1].

Evaluation of /A/B/@att1 @att1 is also evaluated with the context node B[2].

Evaluation of /A/B/@att1 The result is the attribute node of B[2].

Evaluation of /A/B/@att1 Final result: the node set union of all the results of the last step, @att1.
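The step-by-step evaluation of /A/B/@att1 can be mimicked by a small loop. This is a simplified sketch under several assumptions: only child steps and a final @name step are handled, the sample document and its attribute values are made up for illustration, a wrapper element named #document stands in for the root node (which ElementTree does not expose), and attribute nodes are represented as (owner tag, name, value) tuples.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: A has two B children and one C child.
DOC = '<A><B att1="1"><D/></B><B att1="3"/><C att1="9"/></A>'

def evaluate(document_element, path):
    """Evaluate an absolute path of child steps, optionally ending in @name."""
    steps = path.strip("/").split("/")
    root = ET.Element("#document")      # stand-in for the Document (root) node
    root.append(document_element)
    context = [root]                    # absolute path: the context is the root
    for step in steps:
        result = []
        for node in context:            # the step is evaluated once per context node
            if step.startswith("@"):    # attribute step (assumed to be the last one)
                name = step[1:]
                if name in node.attrib:
                    result.append((node.tag, name, node.attrib[name]))
            else:                       # child step with a name test
                result.extend(child for child in node if child.tag == step)
        context = result                # union of the results = next context list
    return context

attrs = evaluate(ET.fromstring(DOC), "/A/B/@att1")
print(attrs)  # [('B', 'att1', '1'), ('B', 'att1', '3')]
```

The intermediate values of context correspond to the slides: first the root, then A, then the two B nodes, then their att1 attributes.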

XPath: Axes Axis (ordering): description
self (none): the context node itself.
parent (reverse): the context node's parent, if one exists.
child (forward): the context node's children, if they exist.
ancestor (reverse): the context node's ancestors, if they exist.
ancestor-or-self (reverse): the context node's ancestors and also itself.
descendant (forward): the context node's descendants.
descendant-or-self (forward): the context node's descendants and also itself.
following (forward): the nodes in the XML document following the context node, not including descendants.
following-sibling (forward): the sibling nodes following the context node.
preceding (reverse): the nodes in the XML document preceding the context node, not including ancestors.
preceding-sibling (reverse): the sibling nodes preceding the context node.
attribute (forward): the attribute nodes of the context node.
namespace (forward): the namespace nodes of the context node.

XPath: Axes An axis has a principal node type that corresponds to the type of node the axis may select. For attribute axes, the principal node type is attribute. For namespace axes, the principal node type is namespace. All other axes have an element principal node type.

XPath: Axes Child axis: denotes the Element or Text children of the context node. Important: An Attribute node has a parent (the element on which it is located), but an attribute node is not one of the children of its parent. Example: child::D

XPath: Axes Parent axis: denotes the parent of the context node. The node test is either an element name, or * (which matches all names), or node() (which matches all node types). The result is always an Element or Document node, or an empty node-set (if the parent does not match the node test or does not satisfy a predicate). .. is an abbreviation for parent::node(): the parent of the context node. Example: parent::node()

XPath: Axes Attribute axis: denotes the attributes of the context node. The node test is either the attribute name, or * which matches all the names. Example: attribute::*

XPath: Axes Descendant axis: all the descendant nodes, except the Attribute nodes. The node test is either the node name (for Element nodes), or * (any Element node) or text() (any Text node) or node() (all nodes). The context node does not belong to the result: use descendant-or-self instead. Example: descendant::node()

XPath: Axes Example: descendant::*

XPath: Axes Ancestor axis: all the ancestor nodes. The node test is either the node name (for Element nodes), or node() (any Element node, and the Document root node). The context node does not belong to the result: use ancestor-or-self instead. Example: ancestor::node()

XPath: Axes Following axis: all the nodes that follow the context node in document order, excluding its descendants. Attribute nodes are not selected. The node test is either the node name, *, text() or node(). The axis preceding denotes all the nodes that precede the context node, excluding its ancestors. Example: following::node()

XPath: Axes Following sibling axis: all the nodes that follow the context node and share the same parent node. Same node tests as descendant or following. The axis preceding-sibling denotes all the nodes that precede the context node and share the same parent node. Example: following-sibling::node()
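Several of these axes can be sketched as plain Python functions over an ElementTree tree. This is an illustration restricted to element nodes (text, attribute and document nodes are ignored), on a made-up document; the function names are hypothetical. Reverse axes return the nearest node first, forward axes return nodes in document order.

```python
import xml.etree.ElementTree as ET

DOC = "<A><B><D/><E/></B><C/><B><F/></B></A>"
root = ET.fromstring(DOC)

# ElementTree elements do not store their parent, so build the map once.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """ancestor::* (reverse axis: nearest ancestor first)."""
    out = []
    while node in parent_of:
        node = parent_of[node]
        out.append(node)
    return out

def descendants(node):
    """descendant::* in document order; the context node is excluded."""
    return [n for n in node.iter() if n is not node]

def following_siblings(node):
    """following-sibling::* (forward axis)."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """preceding-sibling::* (reverse axis: nearest sibling first)."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)][::-1]

def tags(nodes):
    return [n.tag for n in nodes]

first_b, c, second_b = root[0], root[1], root[2]
d = first_b[0]

print(tags(ancestors(d)))                  # ['B', 'A']
print(tags(descendants(root)))             # ['B', 'D', 'E', 'C', 'B', 'F']
print(tags(following_siblings(first_b)))   # ['C', 'B']
print(tags(preceding_siblings(second_b)))  # ['C', 'B']
```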

Location Path Abbreviations
child:: is used by default if no axis is supplied and may therefore be omitted
attribute:: is abbreviated to @
/descendant-or-self::node()/ is abbreviated to //
self::node() is abbreviated to .
parent::node() is abbreviated to ..

XPath: Node Tests The set of selected nodes is refined with node tests. Node tests rely upon the principal node type of an axis for selecting nodes in a location path.
*: selects all nodes of the same principal node type.
node(): selects all nodes, regardless of their type.
text(): selects all text nodes.
comment(): selects all comment nodes.
processing-instruction(): selects all processing-instruction nodes.
node name: selects all nodes with the specified node name.

XPath: Axes Location Paths Using Axes and Node Tests child::* Location paths are composed of sequences of location steps. A location step contains an axis and a node test separated by a double-colon (::) and, optionally, a "predicate" enclosed in square brackets ([ ]). child::* The above location path selects all element-node children of the context node, because the principal node type for the child axis is element.

XPath: Wildcard //author/child::* or //author/* Result: <first-name> Rick </first-name> <last-name> Hull </last-name> * Matches any element

XPath: Axes and Node Tests child::text() selects all text-node children of the context node Combining two location steps to form the location path child::*/child::text() selects all text-node grandchildren of the context node

XPath: Node Tests /bib/book/author/text() Result: Serge Abiteboul Victor Vianu Jeffrey D. Ullman Rick Hull doesn’t appear because he has firstname, lastname /bib/book/author/*/text() Result: Rick Hull

XPath: Restricted Kleene Closure Select all author element nodes in an entire document: /descendant-or-self::node()/child::author Instead use the abbreviation: //author Result: <author> Serge Abiteboul </author> <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author> <author> Victor Vianu </author> <author> Jeffrey D. Ullman </author> /bib//first-name Result: <first-name> Rick </first-name>
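The // abbreviation is also part of the XPath subset supported by Python's xml.etree.ElementTree, written .// because paths are relative to the element they are called on. A minimal sketch on the bib example (whitespace inside elements trimmed; the helper text_of is a hypothetical name):

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

def text_of(element):
    """All text below an element, with whitespace collapsed."""
    return " ".join(" ".join(element.itertext()).split())

bib = ET.fromstring(BIB)

# //author: author elements at any depth
authors = [text_of(a) for a in bib.findall(".//author")]

# /bib//first-name: first-name elements at any depth below bib
first_names = [e.text for e in bib.findall(".//first-name")]

print(authors)      # ['Serge Abiteboul', 'Rick Hull', 'Victor Vianu', 'Jeffrey D. Ullman']
print(first_names)  # ['Rick']
```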

XPath: Attribute Nodes /bib/book/@price Result: "55" @price means that price has to be an attribute

XPath: Predicates Boolean expression, built with tests and the Boolean connectors and/or (negation is expressed with the not() function); a test is either an XPath expression, whose result is converted to a Boolean; a comparison or a call to a Boolean function. Important: predicate evaluation requires several rules for converting nodes and node sets to the appropriate type.

Predicate Evaluation A step is of the form axis::node-test[P] First axis::node-test is evaluated: one obtains an intermediate result I Second, for each node in I, P is evaluated: the step result consists of those nodes in I for which P is true. /A/B/descendant::text()[1]

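Predicate Evaluation Beware: an XPath step is always evaluated with respect to the context of the previous step. In /A/B/descendant::text()[1], the result consists of those Text nodes that are the first Text descendant (in document order) of a node B. The abbreviated form /A/B//text()[1] is not equivalent: it expands to /A/B/descendant-or-self::node()/child::text()[1], so [1] keeps the first Text child of each node at or below B.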

XPath: Predicates
<?xml version = "1.0"?> <!-- Fig. 11.9 : books.xml --> <!-- XML book list -->
<books>
<book> <title>Java How to Program</title> <translation edition="1">Spanish</translation> <translation edition="1">Chinese</translation> <translation edition="1">Japanese</translation> <translation edition="2">French</translation> <translation edition="2">Japanese</translation> </book>
<book> <title>C++ How to Program</title> <translation edition="1">Korean</translation> <translation edition="2">French</translation> <translation edition="2">Spanish</translation> <translation edition="3">Italian</translation> <translation edition="3">Japanese</translation> </book>
</books>

XPath: Predicates Select the title element node for each book that has a Japanese translation /books/book/translation[. = 'Japanese']/../title A predicate is a Boolean expression used as part of a location path to filter nodes from the search. Select the edition attribute node for books with Japanese translations /books/book/translation[. = 'Japanese']/@edition
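Both predicate queries fall inside the XPath subset of Python's xml.etree.ElementTree (the [.='text'] form needs Python 3.7 or later). The sketch below assumes a well-formed books.xml with both book elements closed, and rewrites the absolute paths relative to the books element. The attribute values are read with get(), since ElementTree cannot return attribute nodes.

```python
import xml.etree.ElementTree as ET

BOOKS = """<books>
  <book>
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <translation edition="2">Japanese</translation>
  </book>
  <book>
    <title>C++ How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="2">French</translation>
    <translation edition="2">Spanish</translation>
    <translation edition="3">Italian</translation>
    <translation edition="3">Japanese</translation>
  </book>
</books>"""

books = ET.fromstring(BOOKS)

# /books/book/translation[. = 'Japanese']/../title
titles = [t.text for t in
          books.findall("./book/translation[.='Japanese']/../title")]

# Equivalent formulation with the predicate on book itself
titles_alt = [t.text for t in
              books.findall("./book[translation='Japanese']/title")]

# /books/book/translation[. = 'Japanese']/@edition
editions = [t.get("edition") for t in
            books.findall("./book/translation[.='Japanese']")]

print(titles)    # ['Java How to Program', 'C++ How to Program']
print(editions)  # ['1', '2', '3']
```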

XPath 1.0 Type System Four primitive types:
Boolean: Boolean values; no literals; examples: true(), not($a=3)
Number: floating-point numbers; literals: 12, 12.5; example: 1 div 33
String: character strings; literals: "to", 'ti'; example: concat('Hello','!')
Nodeset: node set; no literals; example: /a/b[c=1 or @e]/d
The boolean(), number(), string() functions convert types into each other (no conversion to nodesets is defined), but this conversion is done in an implicit way most of the time. Rules for converting to a Boolean: A number is true if it is neither 0 nor NaN. A string is true if its length is not 0. A nodeset is true if it is not empty.

XPath 1.0 Type System Rules for converting a nodeset to a string: The string value of a nodeset is the string value of its first item in document order. The string value of an element or document node is the concatenation of the character data in all text nodes below. The string value of a text node is its character data. The string value of an attribute node is the attribute value. Examples (whitespace-only text nodes removed): <a toto="3"> <b titi='tutu'><c /></b> <d>tata</d> </a>
string(/) returns "tata"
string(/a/@toto) returns "3"
boolean(/a/b) returns true()
boolean(/a/e) returns false()
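The conversion rules can be written down as two small Python functions. This is a sketch under assumptions: node sets are modelled as Python lists of ElementTree elements, attribute values are read directly with get(), and the function names to_string and to_boolean are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

a = ET.fromstring("<a toto=\"3\"><b titi='tutu'><c/></b><d>tata</d></a>")

def string_value(element):
    """Concatenation of the character data of all text nodes below the element."""
    return "".join(element.itertext())

def to_string(nodeset):
    """string(): string-value of the first node, or '' for an empty node set."""
    return string_value(nodeset[0]) if nodeset else ""

def to_boolean(value):
    """boolean(): conversion rules for Booleans, numbers, strings and node sets."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return not (value == 0 or math.isnan(value))
    if isinstance(value, str):
        return len(value) > 0
    return len(value) > 0  # node set modelled as a list

print(to_string([a]))              # string(/)        -> 'tata'
print(a.get("toto"))               # string(/a/@toto) -> '3'
print(to_boolean(a.findall("b")))  # boolean(/a/b)    -> True
print(to_boolean(a.findall("e")))  # boolean(/a/e)    -> False
```

Here string(/) coincides with the string-value of the a element because the document node has no other text below it.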

Operators Node-set operators manipulate node sets to form other node sets; the remaining operators work on numbers and Booleans.
pipe (|): union of node-sets (example: node()|@*)
slash (/): separates location steps
double-slash (//): abbreviation for the location path /descendant-or-self::node()/
+, -, *, div, mod: standard arithmetic operators
or, and: Boolean operators (example: @a and c=3)
<, <=, >=, >: relational operators (example: ($a<2) and ($a>0))

Node-set Functions Node-set functions perform an action on a node-set returned by a location path.
last(): returns a number equal to the context size from the expression evaluation context.
position(): returns the position number of the context node in the node-set being tested.
count( node-set ): returns the number of nodes in node-set.
id( string ): returns the element node whose ID attribute matches the value specified by argument string.
local-name( node-set ): returns the local part of the expanded-name for the first node in node-set.
namespace-uri( node-set ): returns the namespace URI of the expanded-name for the first node in node-set.
name( node-set ): returns the qualified name for the first node in node-set.

Node-set Functions
//book/author[last()] selects the last author child of each book node; in the bib example: Victor Vianu and Jeffrey D. Ullman
//book/author[position() = 3] or //book/author[3] selects the third author element of a book node
count(/bib/book[1]/*) returns the total number of element-node children of the first book node (used as a predicate, book[count(*)] would instead compare that count with each book's position)
//book selects all book element nodes in the document
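Positional predicates such as [last()] and [3] are supported by Python's xml.etree.ElementTree, while count() is not and is replaced here by len(). A minimal sketch on the bib example, with paths written relative to the bib element and text_of as a hypothetical helper:

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

def text_of(element):
    """All text below an element, with whitespace collapsed."""
    return " ".join(" ".join(element.itertext()).split())

bib = ET.fromstring(BIB)

# //book/author[last()]: the last author of each book
last_authors = [text_of(a) for a in bib.findall("./book/author[last()]")]

# //book/author[3]: the third author of each book that has one
third_authors = [text_of(a) for a in bib.findall("./book/author[3]")]

# count(*) evaluated at each book: number of element children
child_counts = [len(list(book)) for book in bib.findall("./book")]

print(last_authors)   # ['Victor Vianu', 'Jeffrey D. Ullman']
print(third_authors)  # ['Victor Vianu']
print(child_counts)   # [6, 4]
```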

String Functions
concat($s1,...,$sn): concatenates the strings $s1, . . . , $sn
starts-with($a,$b): returns true() if the string $a starts with $b
contains($a,$b): returns true() if the string $a contains $b
substring-before($a,$b): returns the substring of $a before the first occurrence of $b
substring-after($a,$b): returns the substring of $a after the first occurrence of $b
substring($a,$n,$l): returns the substring of $a of length $l starting at index $n (indexes start from 1); $l may be omitted
string-length($a): returns the length of the string $a
normalize-space($a): removes all leading and trailing whitespace from $a, and collapses each internal run of whitespace to a single space
translate($a,$b,$c): returns the string $a, where each occurrence of a character from $b has been replaced by the character at the same position in $c

Boolean and Number Functions
not($b): returns the logical negation of the boolean $b
sum($s): returns the sum of the values of the nodes in the nodeset $s
floor($n): rounds the number $n down to the nearest integer
ceiling($n): rounds the number $n up to the nearest integer
round($n): rounds the number $n to the closest integer
Examples:
count(//*) returns the number of elements in the document
normalize-space(' titi toto ') returns the string "titi toto"
translate('baba','abcdef','ABCDEF') returns the string "BABA"
round(3.457) returns the number 3
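To make the behaviour of some of these functions concrete, here are hedged Python equivalents. They are simplified sketches, not a conforming implementation: substring assumes integer arguments with a start index of at least 1, xpath_round ignores NaN, infinities and floating-point edge cases, and an empty search string is assumed to occur at the start of the string. Two details not stated on the slides but part of XPath 1.0 are included: translate removes characters of $b that have no counterpart in $c, and round resolves ties toward positive infinity.

```python
import math
import re

def substring_before(a, b):
    """Text of a before the first occurrence of b, or '' if b does not occur."""
    if b == "":
        return ""
    head, sep, _ = a.partition(b)
    return head if sep else ""

def substring_after(a, b):
    """Text of a after the first occurrence of b, or '' if b does not occur."""
    if b == "":
        return a
    _, sep, tail = a.partition(b)
    return tail if sep else ""

def substring(a, n, length=None):
    """Substring starting at 1-based index n (integer n >= 1 assumed)."""
    start = n - 1
    return a[start:] if length is None else a[start:start + length]

def normalize_space(a):
    """Strip leading/trailing whitespace and collapse internal runs to one space."""
    return re.sub(r"[ \t\r\n]+", " ", a).strip(" ")

def translate(a, b, c):
    """Replace each character of b by the character at the same position in c."""
    table = {}
    for i, ch in enumerate(b):
        if ch not in table:                      # the first occurrence in b wins
            table[ch] = c[i] if i < len(c) else None
    out = []
    for ch in a:
        if ch not in table:
            out.append(ch)                       # untouched character
        elif table[ch] is not None:
            out.append(table[ch])                # replaced character
        # else: no counterpart in c, so the character is removed
    return "".join(out)

def xpath_round(n):
    """Closest integer, with ties rounded toward positive infinity."""
    return math.floor(n + 0.5)

print(normalize_space(" titi  toto "))        # titi toto
print(translate("baba", "abcdef", "ABCDEF"))  # BABA
print(xpath_round(3.457))                     # 3
```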

XPath String functions
<?xml version = "1.0"?> <!-- Fig. 11.14 : stocks.xsl --> <!-- string function usage --> <xsl:stylesheet version = "1.0" xmlns:xsl = "http://www.w3.org/1999/XSL/Transform"> <xsl:template match = "/stocks"> <html> <body> <ul> <xsl:for-each select = "stock"> <xsl:if test = "starts-with(@symbol, 'C')"> <li> <xsl:value-of select = "concat(@symbol,' - ',name)"/> </li> </xsl:if> </xsl:for-each> </ul> </body> </html> </xsl:template> </xsl:stylesheet>
<?xml version = "1.0"?> <!-- Fig. 11.13 : stocks.xml --> <!-- Stock list --> <stocks> <stock symbol = "INTC"> <name>Intel Corporation</name> </stock> <stock symbol = "CSCO"> <name>Cisco Systems, Inc.</name> </stock> <stock symbol = "DELL"> <name>Dell Computer Corporation</name> </stock> <stock symbol = "MSFT"> <name>Microsoft Corporation</name> </stock> <stock symbol = "SUNW"> <name>Sun Microsystems, Inc.</name> </stock> <stock symbol = "CMGI"> <name>CMGI, Inc.</name> </stock> </stocks>
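The standard library has no XSLT processor, but the selection logic of the stylesheet (for-each over stock, a starts-with test on @symbol, then concat) can be imitated directly in Python. A minimal sketch that produces the text of the list items rather than HTML:

```python
import xml.etree.ElementTree as ET

STOCKS = """<stocks>
  <stock symbol="INTC"><name>Intel Corporation</name></stock>
  <stock symbol="CSCO"><name>Cisco Systems, Inc.</name></stock>
  <stock symbol="DELL"><name>Dell Computer Corporation</name></stock>
  <stock symbol="MSFT"><name>Microsoft Corporation</name></stock>
  <stock symbol="SUNW"><name>Sun Microsystems, Inc.</name></stock>
  <stock symbol="CMGI"><name>CMGI, Inc.</name></stock>
</stocks>"""

stocks = ET.fromstring(STOCKS)

items = []
for stock in stocks.findall("stock"):            # xsl:for-each select="stock"
    symbol = stock.get("symbol", "")
    if symbol.startswith("C"):                   # starts-with(@symbol, 'C')
        name = stock.findtext("name")
        items.append(symbol + " - " + name)      # concat(@symbol, ' - ', name)

print(items)  # ['CSCO - Cisco Systems, Inc.', 'CMGI - CMGI, Inc.']
```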

XPath: Qualifiers
/bib/book/author[first-name] Result: <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author>
/bib/book[@price < "60"]
/bib/book/author[@age < "25"]
/bib/book/author[text()]
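Python's xml.etree.ElementTree supports existence qualifiers such as [first-name] and [@price], but not comparisons such as @price < "60". In the sketch below the comparison is therefore done in Python after selecting the books that have a price attribute; this matches XPath 1.0, where both operands of < are converted to numbers and a missing attribute yields no match.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)

# /bib/book/author[first-name]: authors that have a first-name child
with_first_name = bib.findall("./book/author[first-name]")
first_names = [a.findtext("first-name") for a in with_first_name]

# /bib/book[@price < "60"]: existence test via the path, comparison in Python
cheap_titles = [book.findtext("title")
                for book in bib.findall("./book[@price]")
                if float(book.get("price")) < 60]

print([a.tag for a in with_first_name])  # ['author']
print(first_names)                       # ['Rick']
print(cheap_titles)
```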

XPath Examples child::A/descendant::B : B elements, descendant of an A element, itself child of the context node; Can be abbreviated to A//B. child::*/child::B : all the B grand-children of the context node descendant-or-self::B : elements B descendants of the context node, plus the context node itself if its name is B. child::B[position()=last()] : the last child named B of the context node. Abbreviated to B[last()]. following-sibling::B[1] : the first sibling of type B (in the document order) of the context node

XPath Examples /descendant::B[10] the tenth element of type B in the document. Not: the tenth element of the document, if its type is B! child::B[child::C] : child elements B that have a child element C. Abbreviated to B[C]. /descendant::B[@att1 or @att2] : elements B that have an attribute att1 or an attribute att2; Abbreviated to //B[@att1 or @att2] *[self::B or self::C] : children elements named B or C

XPath: Summary
bib matches a bib element
* matches any element
/ matches the root
/bib matches a bib element under root
bib/paper matches a paper in bib
bib//paper matches a paper in bib, at any depth
//paper matches a paper at any depth
paper|book matches a paper or a book
@price matches a price attribute
bib/book/@price matches price attribute in book, in bib
bib/book[@price<"55"]/author/lastname matches…

The Root and the Root <bib> <paper> 1 </paper> <paper> 2 </paper> </bib> bib is the “document element” The “root” is above bib /bib = returns the document element / = returns the root Why ? Because we may have comments before and after <bib>; they become siblings of <bib>
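The distinction between the root and the document element can be observed with Python's xml.dom.minidom, whose Document object plays the role of the XPath root. A small sketch with a comment added before and after bib (the comments are made up for illustration):

```python
from xml.dom import minidom, Node

doc = minidom.parseString(
    "<!-- before --><bib><paper>1</paper><paper>2</paper></bib><!-- after -->"
)

# Children of the root: the comments are siblings of <bib>
kinds = [child.nodeType for child in doc.childNodes]

# The document element is the single element child of the root
document_element = doc.documentElement.tagName
papers = doc.documentElement.getElementsByTagName("paper")

print(kinds == [Node.COMMENT_NODE, Node.ELEMENT_NODE, Node.COMMENT_NODE])  # True
print(document_element)  # bib
print(len(papers))       # 2
```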

XPath: More Details Examples:
child::author/child::lastname = author/lastname
child::author/descendant::zip = author//zip
child::author/parent::* = author/..
child::author/attribute::age = author/@age
What does this mean?
paper/publisher/parent::*/author
/bib//address[ancestor::book]
/bib//author/ancestor::*//zip

XPath: Even More Details name() = the name of the current node. /bib//*[name()='book'] is the same as /bib//book. What does this mean? /bib//*[ancestor::*[name()!='book']] In a different notation: bib.[^book]*._ Navigation axes give us strictly more power!

XPath 2.0 An extension of XPath 1.0, backward compatible with XPath 1.0. Main differences:
Improved data model, tightly associated with XML Schema: a new sequence type, representing ordered sequences of nodes and/or values, with duplicates allowed; XSD types can be used for node tests.
More powerful: new operators (loops) and better control of the output (limited tree restructuring capabilities).
Extensible: many new built-in functions; possibility to add user-defined functions.
XPath 2.0 is also a subset of XQuery 1.0.

Path expressions in XPath 2.0 New node tests in XPath 2.0:
item(): any node or atomic value
element(): any element (eq. to child::* in XPath 1.0)
element(author): any element named author
element(*, xs:person): any element of type xs:person
attribute(): any attribute
Nested path expressions: any expression that returns a sequence of nodes can be used as a step. Example: /book/(author | editor)/name

XPath 1.0 Implementations
libxml2: free C library for parsing XML documents, supporting XPath.
javax.xml.xpath: Java package, included with JDK versions starting from 1.5.
System.Xml.XPath: .NET classes for XPath.
XML::XPath: free Perl module, includes a command-line tool.
DOMXPath: PHP class for XPath, included in PHP5.
PyXML: free Python library for parsing XML documents, supporting XPath.