1 XPath: Extracting Data from XML
2 Data stored in an XML document must be extracted before applications can use it. Data can be extracted programmatically. We will discuss XPath, a declarative language for extracting data from XML. XPath is used extensively in other specifications, e.g., XQuery, XSL, and XPointer.
3 An XML document: Dark Side of the Moon, Pink Floyd; Space Oddity, David Bowie, 9.90; Aretha: Lady Soul, Aretha Franklin, 9.90
4 The XML document as a DOM tree: catalog.xml has the root element catalog with three cd children; cd elements carry a country attribute (values UK and USA) and the child elements title, artist, and price. The leaves hold the titles (Dark Side of the Moon, Space Oddity, Aretha: Lady Soul) and the artists (Pink Floyd, David Bowie, Aretha Franklin).
5 Main Element of Syntax: Path Expressions
  / at the beginning of an XPath expression represents the root of the document
  / between element names represents a parent-child relationship
  // represents an ancestor-descendant relationship
  @ marks an attribute
  [condition] specifies a condition
6 Getting the root element of the document: /catalog
7 Finding child nodes: /catalog/cd
8 Finding descendant nodes: /catalog/cd/price
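The path steps on the last few slides can be tried with Python's standard xml.etree.ElementTree module, which implements only a limited subset of XPath. This is a minimal sketch under stated assumptions: the sample document is a hypothetical reconstruction of catalog.xml (the country values and the first price are assumptions), and because ElementTree evaluates paths relative to an element, /catalog/cd is written as ./cd starting from the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)                          # /catalog
cds = root.findall("./cd")                             # /catalog/cd
prices = [p.text for p in root.findall("./cd/price")]  # /catalog/cd/price

print(root.tag, len(cds), prices)
```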
9 Condition on elements: /catalog/cd[price<10]
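A full XPath engine evaluates the predicate [price<10] directly. Python's standard ElementTree module does not support numeric comparisons in predicates, so the sketch below emulates the condition with a plain Python filter, and also shows an equality predicate, which ElementTree does support. The sample document is a hypothetical reconstruction (the country values and the first price are assumptions).

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# Emulation of /catalog/cd[price<10]: select the cd children, then apply
# the numeric test in Python instead of inside the path expression.
cheap = [cd for cd in root.findall("./cd") if float(cd.findtext("price")) < 10]
cheap_titles = [cd.findtext("title") for cd in cheap]

# Equality on a child's text is part of ElementTree's XPath subset.
exact = root.findall("./cd[price='9.90']")

print(cheap_titles, len(exact))
```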
10 // represents any directed path in the document: //title and /catalog//title
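The descendant step can be sketched with ElementTree as follows. Starting from the root element, both //title and /catalog//title correspond to the relative path .//title. The sample document is a hypothetical reconstruction (the country values and the first price are assumptions).

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# .//title finds title elements at any depth below the root element.
titles = [t.text for t in root.findall(".//title")]

print(titles)
```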
11 * represents any element name in the document: /catalog/cd/*
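The wildcard step is also part of ElementTree's XPath subset. A minimal sketch, again on a hypothetical reconstruction of catalog.xml (the country values and the first price are assumptions):

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# /catalog/cd/* : every element child of every cd, whatever its name.
children = root.findall("./cd/*")
tags = [child.tag for child in children]

print(len(children), tags[:3])
```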
12 * represents any element name in the document: /*/* What will the following expressions return? //*, //*[price=9.90], //*[price=9.90]/*
13 Position-based condition: /catalog/cd[1] and /catalog/cd[last()]
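Position predicates are 1-based. A minimal ElementTree sketch of both expressions, on a hypothetical reconstruction of catalog.xml (the country values and the first price are assumptions):

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

first_cd = root.find("./cd[1]")       # /catalog/cd[1]
last_cd = root.find("./cd[last()]")   # /catalog/cd[last()]

print(first_cd.findtext("title"), "|", last_cd.findtext("title"))
```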
14 Use | for union: //title | //price
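ElementTree's XPath subset has no | operator, so the sketch below emulates //title | //price by walking the tree in document order and keeping elements with either name. This is an emulation in Python, not an XPath evaluation; the sample document is a hypothetical reconstruction (the country values and the first price are assumptions).

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# root.iter() visits every element in document order; keeping the ones
# named title or price yields the same elements as //title | //price.
wanted = {"title", "price"}
union = [elem for elem in root.iter() if elem.tag in wanted]
union_tags = [elem.tag for elem in union]

print(union_tags)
```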
15 @ marks attributes
16 @ marks attributes
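As an illustration of attribute tests, the sketch below filters cd elements by their country attribute and reads the attribute values. The two expressions are illustrative choices rather than ones taken from the slides, and the country values in the sample document are assumptions. ElementTree can test attributes inside a predicate but returns elements, not attribute nodes, so the values are read with get().

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# Illustrative: cd elements whose country attribute equals "UK".
uk_cds = root.findall("./cd[@country='UK']")

# Attribute values themselves are read from the selected elements.
countries = [cd.get("country") for cd in root.findall("./cd")]

print(len(uk_cds), countries)
```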
17 How would you write: the price of the CDs whose artist is David Bowie?
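One possible answer is /catalog/cd[artist='David Bowie']/price, which combines a condition on a child element with a further child step. A minimal ElementTree sketch of that answer, on a hypothetical reconstruction of catalog.xml (the country values and the first price are assumptions):

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# Relative form of /catalog/cd[artist='David Bowie']/price.
bowie_prices = [p.text for p in root.findall("./cd[artist='David Bowie']/price")]

print(bowie_prices)
```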
18 XPath Axes
  We have discussed the following axes: child (/), descendant (//), and attribute (@).
  These symbols are actually shorthand, e.g., /cd//price selects the same nodes as /child::cd/descendant::price.
  There are additional shorthands, e.g., self (.) and parent (..).
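The self and parent shorthands can be tried with ElementTree, which supports . and .. but not the long axis names such as child:: or descendant::. A minimal sketch on a hypothetical reconstruction of catalog.xml (the country values and the first price are assumptions):

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of catalog.xml; the country values and the
# first price are assumptions, not taken from the slides.
CATALOG = """<catalog>
  <cd country="UK"><title>Dark Side of the Moon</title><artist>Pink Floyd</artist><price>10.90</price></cd>
  <cd country="UK"><title>Space Oddity</title><artist>David Bowie</artist><price>9.90</price></cd>
  <cd country="USA"><title>Aretha: Lady Soul</title><artist>Aretha Franklin</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# "." is the self shorthand: it selects the context element itself.
self_nodes = root.findall(".")

# ".." is the parent shorthand: step from each title up to its parent cd.
parents = root.findall(".//title/..")
parent_tags = [elem.tag for elem in parents]

print(self_nodes[0].tag, parent_tags)
```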
19 Additional Axes
  ancestor: contains all ancestors (parent, grandparent, etc.) of the current node
  ancestor-or-self: contains the current node plus all its ancestors (parent, grandparent, etc.)
  descendant-or-self: contains the current node plus all its descendants (children, grandchildren, etc.)
  following: contains everything in the document after the closing tag of the current node
  following-sibling: contains all siblings after the current node
  preceding: contains everything in the document that ends before the starting tag of the current node (its ancestors are therefore excluded)
  preceding-sibling: contains all siblings before the current node
20 More Details: the XPath specification at W3C; the XPath tutorial at W3Schools; the Mulberry XPath Quick Reference