1
XPath
2
What is XPath?
- A language designed to be used by XSL Transformations (XSLT), XLink, XPointer and XML Query.
- Primary purpose: to address parts of an XML document, and to provide basic facilities for manipulating strings, numbers and booleans.
- Because XPath is a cooperation between the XSL and XPointer working groups, it has a broad definition that both groups can use. Features specific to each host language are defined within that language, so XPath can be used across several languages.
- XPath exists because we needed a way to point to a specific thing, or a set of things, in an XML document.
3
Outline
- Introduction
- Data Model
- XPath Syntax
  - Location Path
  - General XPath Expressions
  - Core Function Library
- XPath Utilities
- Conclusion
4
Introduction
- W3C Recommendation, November 16, 1999.
- Latest version:
- XPath uses a compact, string-based syntax rather than an XML element-based syntax.
- It operates on the abstract, logical structure of an XML document rather than its surface syntax.
- It uses a path notation (as in URLs) to navigate through this hierarchical tree structure.
5
Introduction Cont.
- XPath models an XML document as a tree of nodes and defines a way to compute a string-value for each type of node.
- Supports namespaces. The namespace portion of a node's name is optional; it is null if not specified.
- The expression (Expr) is the primary syntactic construct of XPath.
6
Data Model
The way to represent an XML document: as a tree. The tree contains 7 types of node:
- Root node
- Element nodes
- Attribute nodes
- Namespace nodes
- Processing instruction nodes
- Comment nodes
- Text nodes
The tree is ordered in document order: the order in which the nodes' start-tags occur in the XML document.
7
Data Model Example

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="bib.xsl"?>
<!-- simple XML document -->
<bib>
  <book price="25.00" pages="400">
    <publisher> IDG books</publisher>
    <author>
      <first-name>Rick</first-name>
      <last-name> Hull </last-name>
    </author>
    <author> Simon North</author>
    <title> XML complete </title>
    <year> 1997 </year>
  </book>
  <book>
    <publisher> Freeman </publisher>
    <author> Jeffrey D. Ullman </author>
    <title> Principles of Database </title>
    <year> 1998 </year>
  </book>
</bib>

Notes:
- Children of doc are a processing-instruction node and two para element nodes.
- Four nodes comprise the descendants of the 1st <para> node: 2 text nodes and an <em> node (children), plus the child text node of <em> (a grandchild of <para>).
- ?Pub Caret?: Pub Caret is the XML declaration and contains special information for the XML processor.
- Tree order is top to bottom and left to right.
- It is possible to count nodes and select them by ordinal position with respect to document order. Ex. address and select the 2nd <para> child of doc.
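To make document order concrete, here is a minimal sketch that loads the bib document with Python's standard xml.etree.ElementTree module. This is only an approximation of the XPath data model: ElementTree models elements only (attributes and text are properties of elements, and comments and processing instructions are dropped by default). Whitespace inside element text is trimmed here for readability.

```python
import xml.etree.ElementTree as ET

BIB_XML = """<bib>
  <book price="25.00" pages="400">
    <publisher>IDG books</publisher>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Simon North</author>
    <title>XML complete</title>
    <year>1997</year>
  </book>
  <book>
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database</title>
    <year>1998</year>
  </book>
</bib>"""

root = ET.fromstring(BIB_XML)

# Element.iter() visits elements in document order,
# i.e. in the order their start-tags occur in the source.
document_order = [elem.tag for elem in root.iter()]

# Ordinal selection: the 2nd book child of bib (Python lists are 0-based).
second_book = root.findall("book")[1]
print(document_order)
print(second_book.findtext("title"))
```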
8
XPath Syntax
- The expression is the primary syntactic construct in XPath.
- An expression is evaluated to yield an object of one of 4 basic types:
  - node-set (unordered collection of nodes without duplicates)
  - boolean (true/false)
  - number (floating-point)
  - string (sequence of UCS characters)
- Expression evaluation occurs with respect to a context, which XSLT and XPointer specify. The context consists of:
  - a node (the context node)
  - a pair of non-zero positive integers (context position and context size)
  - a set of variable bindings (mapping from variable names to variable values)
  - a function library (mapping from function names to functions). XPath implementations support a core function library; XSLT and XPointer extend XPath by defining additional functions.
  - a set of namespace declarations (mapping from prefixes to namespace URIs)
- The location path is one important kind of expression. Location paths select a set of nodes relative to the context node.
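The five parts of the evaluation context can be pictured as a small record. The sketch below is illustrative only: the class name XPathContext, its field names and the make_core_functions helper are hypothetical and do not come from any real XPath library.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class XPathContext:
    """Hypothetical container for the XPath expression context."""
    node: Any                      # the context node
    position: int                  # context position (1-based)
    size: int                      # context size
    variables: Dict[str, Any] = field(default_factory=dict)
    functions: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    namespaces: Dict[str, str] = field(default_factory=dict)  # prefix -> URI

    def __post_init__(self):
        # Position and size are non-zero positive integers,
        # and the position never exceeds the size.
        if not (1 <= self.position <= self.size):
            raise ValueError("need 1 <= position <= size")


def make_core_functions():
    # Two core functions that read the context directly.
    # XPath numbers are floating-point, hence float().
    return {
        "position": lambda ctx: float(ctx.position),
        "last": lambda ctx: float(ctx.size),
    }


ctx = XPathContext(node="book", position=2, size=3,
                   functions=make_core_functions())
print(ctx.functions["position"](ctx), ctx.functions["last"](ctx))
```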
9
Location Path
- A location path provides the mechanism for 'addressing' parts of an XML document, similar to file system addressing. It describes a path from one point to another in the XML document and is a special case of an Expr.
  Ex: /book/year (selects the year element children of the book document element)
- Every location path can be expressed using a straightforward but rather verbose syntax:
  - unabbreviated (verbose) syntax. Ex: child::* (selects all element children of the context node)
  - abbreviated syntax. Ex: * (equivalent to child::* above)
- RelativeLocationPath ::= Step | RelativeLocationPath '/' Step | AbbreviatedRelativeLocationPath
  The initial sequence of steps selects a set of nodes relative to a context node. Each node in that set is used as a context node for the following step. The sets of nodes identified by that step are unioned together, and the set of nodes identified by the composition of the steps is this union.
  Ex: child::div/child::para selects the para element children of the div element children of the context node, that is, the para element grandchildren that have div parents.
- AbsoluteLocationPath ::= '/' RelativeLocationPath? | AbbreviatedAbsoluteLocationPath
  / by itself selects the root node of the document containing the context node. If it is followed by a relative location path, the location path selects the set of nodes that the relative location path would select relative to the root node of the document containing the context node.
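The composition rule for relative location paths (apply the next step to every node selected so far, then union the results) can be sketched with ElementTree elements. The helper names child_step and relative_path are hypothetical, only the child axis and name tests are handled, and the note element in the sample document is made up for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<bib>"
    "<book><year>1997</year></book>"
    "<book><year>1998</year></book>"
    "<note><year>2000</year></note>"
    "</bib>"
)


def child_step(context_nodes, name):
    """Apply the step child::name to every context node and union the results."""
    selected = []
    for node in context_nodes:
        for child in node:
            # A node-set has no duplicates, so skip nodes already selected.
            if (name == "*" or child.tag == name) and child not in selected:
                selected.append(child)
    return selected


def relative_path(context_node, path):
    """Evaluate a path like 'book/year' made only of child steps."""
    nodes = [context_node]
    for name in path.split("/"):
        nodes = child_step(nodes, name)
    return nodes


# child::book/child::year relative to the bib element
years = [y.text for y in relative_path(root, "book/year")]
print(years)
```

ElementTree's own findall("book/year") should select the same elements; the manual version only exists to show the step-by-step composition.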
10
Location Path Cont.
Two types of paths: relative and absolute.
- Relative location path: consists of a sequence of one or more location steps separated by /
  Ex. child::bib/child::book (selects the book element children of the bib element children of the context node)
- Absolute location path: consists of / optionally followed by a relative location path
  Ex. / (selects the root node of the document containing the context node)
11
Location Path Examples
Examples in the verbose (unabbreviated) syntax, which has syntactic abbreviations for common cases:
- child::book selects the book element children of the context node
- child::* selects all element children of the context node
- attribute::price selects the price attribute of the context node
- descendant::book selects all book descendants of the context node
- self::book selects the context node if it is a book element (otherwise selects nothing)
- child::*/child::book selects all book grandchildren of the context node
- / selects the document root (which is always the parent of the document element)
12
Location Steps
A location step has 3 parts:
- axis (specifies the relationship between the selected nodes and the context node)
- node test (specifies the node type and expanded-name of the selected nodes)
- predicates (arbitrary expressions that refine the selected set of nodes)
The syntax for a location step is the axis name and node test separated by a double colon, followed by zero or more expressions, each in square brackets.
To evaluate a location step, generate an initial node-set from the axis (relationship to the context node) and the node test (node type and expanded-name), then filter that node-set by each of the predicates in turn.
- Ex: child::book[position()=1]
  child is the name of the axis, book is the node test, and [position()=1] is a predicate.
- Ex: descendant::book[position()=1]
  first selects all book element descendants of the context node, then filters that set down to the first book descendant of the context node.
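The three-stage evaluation of a step (axis, then node test, then each predicate in turn) can be sketched as follows. This is a simplified illustration, not a real XPath engine: the function names are hypothetical, predicates are plain Python callables that receive the node, its position and the set size, only two forward axes are shown, and the shelf element is invented for the example.

```python
import xml.etree.ElementTree as ET


def child_axis(node):
    return list(node)


def descendant_axis(node):
    # iter() yields the node itself first, then its descendants
    # in document order, so drop the first item.
    return list(node.iter())[1:]


def location_step(context_node, axis, name, predicates=()):
    # 1. Axis and 2. node test give the initial node-set.
    selected = [n for n in axis(context_node) if name == "*" or n.tag == name]
    # 3. Each predicate filters the set left by the previous one;
    #    position and size are recomputed for every predicate.
    for predicate in predicates:
        size = len(selected)
        selected = [n for pos, n in enumerate(selected, start=1)
                    if predicate(n, pos, size)]
    return selected


def is_first(node, position, size):
    return position == 1  # plays the role of [position()=1]


root = ET.fromstring(
    '<bib><shelf><book id="a"/></shelf><book id="b"/><book id="c"/></bib>'
)

first_child = location_step(root, child_axis, "book", [is_first])
first_descendant = location_step(root, descendant_axis, "book", [is_first])
print(first_child[0].get("id"), first_descendant[0].get("id"))
```

The two results differ because the axis decides which nodes are counted before the predicate is applied.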
13
Location Steps: Axes and Node Tests
Axes
- 13 axes are defined in XPath. Each describes a relationship to the context node:
  - ancestor, ancestor-or-self: ancestors of the context node; -or-self also includes the context node
  - attribute: attributes of the context node
  - child: children of the context node
  - descendant, descendant-or-self: descendants of the context node; -or-self also includes the context node
  - following: all nodes that come after the context node in document order, excluding descendants (for elements, those whose start-tags come after the end-tag of the context node)
  - preceding: all nodes that come before the context node in document order, excluding ancestors
  - following-sibling, preceding-sibling: siblings that follow (come after) or precede (come before) the context node, left to right in the tree structure
  - namespace: all namespaces in scope at the context node; empty if the context node is not an element
  - parent: parent of the context node
  - self: the context node itself
- We've only seen some of these so far.
Node tests
- A node test identifies the type and expanded-name of the nodes to select. It can use a name, a wildcard or a node-type function to check type and name.
  - child::text() selects the text node children of the context node
  - child::book selects the book element children of the context node
  - attribute::* selects all attributes of the context node
- Every axis has a principal node type. For the attribute axis the principal node type is attribute; for the namespace axis it is namespace; for all other axes it is element. A name test or * selects only nodes of the principal node type.
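A few of the axes can be sketched on top of ElementTree. ElementTree elements do not store a pointer to their parent, so the sketch first builds a parent map. The function names are hypothetical, and only element nodes are considered (the XPath root node, text nodes and attributes are not modelled here).

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<bib><book>"
    "<publisher/><author><first-name/><last-name/></author><title/><year/>"
    "</book></bib>"
)

# Map every element to its parent element.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestor(node):
    """Element ancestors, nearest first (reverse document order)."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result


def following_sibling(node):
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


def preceding_sibling(node):
    """Preceding siblings, nearest first (reverse document order)."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)][::-1]


author = root.find("book/author")
print([n.tag for n in ancestor(author)])
print([n.tag for n in following_sibling(author)])
print([n.tag for n in preceding_sibling(author)])
```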
14
Location Step Cont.
Predicate
- A predicate filters a node-set with respect to an axis to produce a new node-set.
- Write XPath expressions (normally boolean expressions) in square brackets following the basis (axis and node test).
  Ex. child::book[attribute::price="25"] (selects all book children of the context node that have a price attribute with value 25)
- A PredicateExpr is evaluated by evaluating the Expr and converting the result to a boolean (true or false).
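Attribute predicates are part of the limited XPath subset that Python's xml.etree.ElementTree supports, in abbreviated form only. The sketch below assumes a trimmed copy of the bib document. Note that an attribute compared with a quoted value is a string comparison, so '25' does not match the stored value '25.00'; a numeric comparison has to be done in Python here because ElementTree has no number comparisons in predicates.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<bib>'
    '<book price="25.00" pages="400"><title>XML complete</title></book>'
    '<book><title>Principles of Database</title></book>'
    '</bib>'
)

# book[@price]: book children that have a price attribute at all
with_price = root.findall("book[@price]")

# book[@price='25.00']: string comparison against the attribute value
exact = root.findall("book[@price='25.00']")

# The string '25' is not equal to the string '25.00'
no_match = root.findall("book[@price='25']")

# Numeric comparison, done by hand outside the path expression
numeric = [b for b in with_price if float(b.get("price")) == 25]

print(len(with_price), len(exact), len(no_match), len(numeric))
```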
15
Examples
Basis: axis and node test.
Axis and node test:
- descendant::publisher (selects the publisher elements that are descendants of the context node)
- attribute::* (selects all attributes of the context node)
Basis and predicate:
- child::book[3] (selects the 3rd book child of the context node)
- child::*[self::author or self::year][position()=last()] (selects the last author or year child of the context node)
- child::book[attribute::page="400"][5] (selects the fifth book child of the context node that has a page attribute with value 400)
16
Abbreviated Syntax
- The abbreviated syntax is a simpler way to express location paths. Abbreviations cover the common cases concisely (not every case), and each abbreviation can be converted to its unabbreviated form.
- child:: can be omitted from a location step (child is the default axis).
  Ex. bib/book is equivalent to child::bib/child::book
- attribute:: can be abbreviated to @.
  Ex. book[@price="25"] is short for child::book[attribute::price="25"]
- // is short for /descendant-or-self::node()/
  Ex. book//author is short for child::book/descendant-or-self::node()/child::author
- A location step of . is short for self::node()
  Ex. .//book is short for self::node()/descendant-or-self::node()/child::book
- A location step of .. is short for parent::node()
  Ex. ../title is short for parent::node()/child::title
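The abbreviation rules are mechanical enough to sketch as a small string rewriter. The functions expand_step and expand_path are hypothetical helpers written for this illustration; they handle only simple paths without predicates and are not a full XPath parser.

```python
def expand_step(step):
    """Rewrite one abbreviated step into its unabbreviated form."""
    if step == ".":
        return "self::node()"
    if step == "..":
        return "parent::node()"
    if step.startswith("@"):
        return "attribute::" + step[1:]
    if "::" in step:
        return step              # already unabbreviated
    return "child::" + step      # child is the default axis


def expand_path(path):
    """Expand a simple abbreviated location path (no predicates)."""
    # '//' is short for '/descendant-or-self::node()/'
    path = path.replace("//", "/descendant-or-self::node()/")
    absolute = path.startswith("/")
    steps = [s for s in path.split("/") if s]
    expanded = "/".join(expand_step(s) for s in steps)
    return "/" + expanded if absolute else expanded


for example in ["bib/book", "book//author", ".//book", "../title", "book/@price"]:
    print(example, "=>", expand_path(example))
```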
17
Expressions
- Function Calls
- Node-sets
- Booleans
- Numbers
- Strings
18
Function Calls
- A function call expression is evaluated by using the FunctionName to identify a function in the function library of the expression evaluation context.
- Each argument is converted to the type the function requires: to type string (as if calling the string function), to type boolean (as if calling the boolean function), or to type number (as if calling the number function).
- An argument that is not of type node-set cannot be converted to a node-set.
- Ex. the position() function returns the context position (the current node's position in the context node list) as a number.
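The argument conversions can be sketched in Python under some stated assumptions: XPath booleans, numbers and strings are represented by Python bool, float and str, and a node-set is represented as a list of string-values. The names to_boolean and to_number are hypothetical; they mirror the XPath 1.0 boolean and number functions but are not taken from any library.

```python
import math
import re

# An XPath 1.0 number literal inside optional whitespace:
# optional minus sign, digits, optional fraction (no exponent, no '+').
_NUMBER = re.compile(r"[ \t\r\n]*-?(\d+(\.\d*)?|\.\d+)[ \t\r\n]*")


def to_boolean(value):
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # true unless zero or NaN
        return not (value == 0 or math.isnan(value))
    if isinstance(value, str):
        return len(value) > 0       # true unless empty
    if isinstance(value, list):
        return len(value) > 0       # node-set: true unless empty
    raise TypeError("unsupported XPath value")


def to_number(value):
    if isinstance(value, bool):     # check bool before int
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, list):
        # node-set: convert to string first (first node, or "")
        value = value[0] if value else ""
    if isinstance(value, str):
        return float(value) if _NUMBER.fullmatch(value) else math.nan
    raise TypeError("unsupported XPath value")


print(to_boolean("false"), to_number(" 25.00 "), to_number("abc"))
```

Note that the string "false" converts to true, because any non-empty string does.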
20
Node-sets A location path can be used as an expression.
The expression returns the set of nodes selected by the path.
22
Booleans
A boolean can have only two values: true or false.
The following operators can be used in boolean expressions, or to combine two boolean expressions according to the usual rules of boolean logic (listed from lowest to highest precedence):
- or
- and
- =, !=
- <=, <, >=, >
Ex. book='XML complete' or book='Principles of Database'
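When one side of = or != is a node-set and the other is a string, the comparison is true if it holds for at least one node in the set. The sketch below illustrates this with ElementTree elements; the helper names are hypothetical, and the string-value of an element is approximated by joining its text content.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<bib>"
    "<book><title>XML complete</title></book>"
    "<book><title>Principles of Database</title></book>"
    "</bib>"
)


def string_value(element):
    # String-value of an element: concatenation of all descendant text.
    return "".join(element.itertext())


def node_set_equals(nodes, text):
    """node-set = string: true if ANY node's string-value equals text."""
    return any(string_value(n) == text for n in nodes)


def node_set_not_equals(nodes, text):
    """node-set != string: true if ANY node's string-value differs from text."""
    return any(string_value(n) != text for n in nodes)


titles = root.findall("book/title")

# title='XML complete' or title='Principles of Database'
either = (node_set_equals(titles, "XML complete")
          or node_set_equals(titles, "Principles of Database"))

print(either)
print(node_set_equals(titles, "XML complete"),
      node_set_not_equals(titles, "XML complete"))
```

Both = and != are true for the same node-set here, so != on node-sets is not the negation of =.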
24
Numbers
A number represents a floating-point number; no pure integers exist in XPath.
The basic arithmetic operators are +, -, *, div and mod.
div 10
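Because all XPath numbers are floating-point, div and mod behave differently from Python's integer-oriented operators. A minimal sketch, assuming IEEE 754 semantics for div and a truncating remainder for mod (the result takes the sign of the dividend); the function names xpath_div and xpath_mod are hypothetical.

```python
import math


def xpath_div(a, b):
    """Floating-point division; dividing by zero gives an infinity or NaN."""
    a, b = float(a), float(b)
    if b == 0.0:
        if a == 0.0 or math.isnan(a):
            return math.nan
        # The sign of the result combines the signs of both operands
        # (copysign distinguishes -0.0 from 0.0).
        negative = (a < 0) != (math.copysign(1.0, b) < 0)
        return -math.inf if negative else math.inf
    return a / b


def xpath_mod(a, b):
    """Remainder from a truncating division; sign follows the dividend."""
    a, b = float(a), float(b)
    if b == 0.0 or math.isinf(a) or math.isnan(a) or math.isnan(b):
        return math.nan
    return math.fmod(a, b)


print(xpath_div(7, 2), xpath_div(1, 0))
print(xpath_mod(5, -2), xpath_mod(-5, 2), -5 % 2)
```

The last print shows the difference from Python's % operator, whose result takes the sign of the divisor.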
26
Strings
Strings consist of a sequence of zero or more characters.
String literals may be enclosed in either single or double quotes.
Comparison operators: =, !=
27
Core Function Library
XPath defines a core set of functions for use in expressions. All implementations of XPath must implement the core function library.
Four types of functions:
- Node-set functions: operate on or return information about node-sets.
- String functions: used for basic string operations. Ex. substring("12345", 0, 3) returns "12"
- Boolean functions: all return true or false.
- Number functions: used for basic number operations.
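The substring example returns "12" rather than "123" because XPath character positions start at 1, and the function keeps the characters whose position is at least the rounded start and less than the rounded start plus the rounded length. A minimal Python sketch of that rule follows; the names xpath_round and xpath_substring are hypothetical.

```python
import math


def xpath_round(x):
    """Round to the nearest integer, with ties going toward +infinity."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.floor(x + 0.5))


def xpath_substring(s, start, length=None):
    first = xpath_round(float(start))
    if length is None:
        end = math.inf                      # run to the end of the string
    else:
        end = first + xpath_round(float(length))
    # Positions are 1-based. Any comparison with NaN is False,
    # so a NaN start or length yields the empty string.
    return "".join(ch for pos, ch in enumerate(s, start=1)
                   if pos >= first and pos < end)


print(xpath_substring("12345", 0, 3))      # positions 0, 1, 2 -> only 1 and 2 exist
print(xpath_substring("12345", 1.5, 2.6))
print(xpath_substring("12345", 2))
```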
28
XPath Utilities
Miscellaneous utilities related to XPath.
XPath Visualiser:
- A tool for evaluating an XPath expression and visually presenting the resulting node-set.
- Lets you experiment with XPath until you find the correct expression.
- Displays the XML source document much like the default IE display, with the same syntax coloring and collapsible, expandable container nodes.
- Makes learning XPath very straightforward.
29
XPath Visualiser
[Screenshot of XPath Visualiser. Labeled regions: context node, XPath input, tree view of the XML document, XPath evaluation result (the result is highlighted).]
30
Conclusion
- XPath is a complete pattern-matching language.
- It provides a concise way of addressing parts of an XML document.
- It is the base for XSLT, XPointer and the XML Query WG.
- It is supported by the W3C.
- Using XPath basically requires learning the abbreviated syntax of location path expressions and the functions of the core library.