Dongwon Lee, Ph.D. IST 516 Fall 2011

1 Dongwon Lee, Ph.D. IST 516 Fall 2011
/*/*/self::* XPath Dongwon Lee, Ph.D. IST 516 Fall 2011

2 XPath
Path-based XML query language
V1.0 – 1999 (W3C Recommendation)
V2.0 – 2003 (working draft; W3C Recommendation in 2007): functional, strongly-typed query language

3 Apps of XPath
XQuery: a full-blown query language for XML
  for $x in doc("books.xml")/bookstore/book
  where $x/price>30
  order by $x/title
  return $x/title
XPointer/XLink: a standard way to create hyperlinks in XML
  <book title="Harry Potter">
    <description
      xlink:type="simple"
      xlink:href="
      xlink:show="new">
      As his fifth year at Hogwarts School of Witchcraft and
      Wizardry approaches, 15-year-old Harry Potter is
    </description>
  </book>

4 Apps of XPath
XSLT: a style sheet language for XML that can transform XML from one format to another
  <xsl:stylesheet version="1.0" xmlns:xsl="
    <xsl:template match="/">
      <xsl:for-each select="catalog/cd">
        <tr>
          <td><xsl:value-of select="title"/></td>
          <td><xsl:value-of select="artist"/></td>
        </tr>
      </xsl:for-each>
    </xsl:template>
  </xsl:stylesheet>

5 XPath vs. SQL
XPath: XML model; trees; hierarchy; order
SQL: relational model; tables; flat; orderless (except ORDER BY)

6 XPath vs. XQuery
XPath: XML model; trees; hierarchy; order
XQuery: can do all that XPath does, but not vice versa; Turing-complete, general-purpose PL; can retrieve, update, and transform XML data; FLWOR expression

7 XPath Expression
Expression (basic building block) returns one of the 4 objects:
node-set (an unordered collection of nodes without duplicates)
boolean (true or false)
number (a floating-point number)
string (a sequence of characters)

8 XPath Nodes
  <?xml version="1.0" encoding="UTF-8"?>
  <note xmlns=" xmlns:xsi=" xsi:noNamespaceSchemaLocation="note.xsd">
    <to>Tove</to>
    <!-- <from>Jani</from> -->
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
Callouts on the example: processing-instruction, namespace, document, comment, text
Nodes: 7 types
element, attribute, text, namespace, processing-instruction, comment, document

9 Location Step
axis :: node-test [predicate]
Location steps are evaluated in order from left to right
Absolute: /step/step/…
Relative: step/step/…
Axis: specifies the node relationship
Node test: specifies node type and name
Predicate: instructions to filter nodes
Preferred – Faster to evaluate
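
As a rough illustration of the axis :: node-test [predicate] shape, the Python sketch below splits one unabbreviated location step into its three parts. The helper name split_step and its regular expression are assumptions made for this example, not part of any XPath library, and the pattern only copes with a single simple predicate.

```python
import re

# Hypothetical helper: splits ONE unabbreviated location step such as
# "child::Room[text()='110 IST']" into (axis, node test, predicate).
STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"   # optional axis, e.g. child::
    r"(?P<test>[^\[\]]+)"          # node test, e.g. Room, *, text()
    r"(?:\[(?P<pred>.*)\])?$"      # optional predicate inside [ ]
)

def split_step(step):
    m = STEP.match(step.strip())
    if m is None:
        raise ValueError(f"not a simple location step: {step!r}")
    # XPath assumes the child axis when no axis is written
    return (m.group("axis") or "child", m.group("test"), m.group("pred"))

print(split_step("child::Room[text()='110 IST']"))
print(split_step("attribute::date"))
print(split_step("emp"))
```

A step written without an axis, such as emp, comes back with the child axis filled in, which mirrors how XPath reads it.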

10 1. Axis
/ selects the root of the node hierarchy
<document/> as the default root of XML document
Forward Axis
child::, descendant::, attribute::, self::, descendant-or-self::, following-sibling::, following::
Backward Axis
parent::, ancestor::, preceding-sibling::, preceding::, ancestor-or-self::
Relative to the current context (Axis::context)
child::emp: "emp" is a child element of the current node
attribute::date: "date" is an attribute of the current node
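
The axes can be pictured as small tree walks. The sketch below emulates a few of them in Python over xml.etree.ElementTree; the sample tree and the function names are assumptions for illustration (loosely following the Courses example used in these slides), and ElementTree itself does not expose XPath axes in this form.

```python
import xml.etree.ElementTree as ET

# Assumed sample tree, loosely following the Courses example in these slides.
root = ET.fromstring(
    "<Courses>"
    "<Undergrad><Room/><Instructor><Name/><Office/><Phone/></Instructor></Undergrad>"
    "<Graduate/>"
    "</Courses>"
)

# ElementTree elements do not store their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def child(node):               # child::*
    return list(node)

def descendant(node):          # descendant::*
    return [d for d in node.iter() if d is not node]

def ancestor(node):            # ancestor::* (nearest ancestor first)
    out = []
    while node in parent_of:
        node = parent_of[node]
        out.append(node)
    return out

def following_sibling(node):   # following-sibling::*
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]

instructor = root.find("Undergrad/Instructor")
print([n.tag for n in child(instructor)])
print([n.tag for n in ancestor(instructor)])
print([n.tag for n in following_sibling(root.find("Undergrad"))])
```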

11 Node Relationships
[Figure: tree with the nodes Courses, Undergrad, Graduate, Room, Instructor, Name, Office, Phone, labeled Parent, Child, Grandchild, Descendants, Ancestors, Sibling, Self]

12 1. Axis Abbreviation
/descendant-or-self::node()/ → //
child:: → omitted (child is the default axis)
attribute:: → @
self::node() → .
parent::node() → ..
Eg
/child::doc/descendant::chapter → /doc//chapter
//doc/attribute::type → //doc/@type

13 2. Node Test
node(): matches all nodes
text(): matches all text nodes
ElementName: matches all elements of type 'ElementName'
*: matches all elements
@*: matches all attributes

14 2. Node Test
* (wildcard) is often used to match unknown XML elements
/catalog/cd/*: all the child elements of all the cd elements of the catalog element
/*: all children of the root <document/>
/*/*: all grandchildren of the root <document/>
//*: all elements of the XML document
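
A minimal sketch of the wildcard in Python's xml.etree.ElementTree, assuming a made-up catalog document. ElementTree paths are written relative to the root element it hands back, so each call is annotated with the absolute XPath it corresponds to.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog document, invented for this example.
catalog = ET.fromstring(
    "<catalog>"
    "<cd><title>T1</title><artist>A1</artist></cd>"
    "<cd><title>T2</title><artist>A2</artist></cd>"
    "</catalog>"
)

cd_children = catalog.findall("cd/*")    # /catalog/cd/*
root_children = catalog.findall("*")     # /*/*  (children of the root element)
all_elements = list(catalog.iter())      # //*   (every element, root included)

print([e.tag for e in cd_children])
print([e.tag for e in root_children])
print(len(all_elements))
```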

15 3. Predicate
Path-expression[ filtering condition ] → the nodes selected by the path expression that satisfy the filtering condition
Eg //doc[@type='PDF'] finds all <doc> elements whose attribute "type" has the value 'PDF'
This returns the <doc> elements, not their "type" attributes
The filtering condition does not change which kind of node is returned (ie, the projection) by the XPath
It just adds more constraints to satisfy
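
To see that a predicate filters without changing what is returned, here is a small Python sketch using xml.etree.ElementTree with a query such as //doc[@type='PDF']. The library document and its <name> children are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with <doc> elements at two depths.
library = ET.fromstring(
    "<library>"
    "<doc type='PDF'><name>a</name></doc>"
    "<doc type='HTML'><name>b</name></doc>"
    "<shelf><doc type='PDF'><name>c</name></doc></shelf>"
    "</library>"
)

# //doc[@type='PDF'] : the predicate narrows the match, the result is still <doc> elements
pdf_docs = library.findall(".//doc[@type='PDF']")
print([d.tag for d in pdf_docs])
print([d.findtext("name") for d in pdf_docs])

# Getting the attribute itself is a separate projection step.
print([d.get("type") for d in pdf_docs])
```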

16 Location Step Examples

17 Examples of usage

18 IST Example
What IST classes are in Room 110 IST?
/Courses/*[child::Room='110 IST']
Original XML
  <Courses>
    <Undergrad ID="IST462">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result: the <Undergrad ID="IST462"> element
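
The same query can be tried in Python's xml.etree.ElementTree, which supports the [tag='text'] form of predicate. This is a sketch: the path is written relative to the <Courses> root element, and the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

courses = ET.fromstring("""
<Courses>
  <Undergrad ID="IST462">
    <Room>110 IST</Room>
    <Instructor />
    <TA>Robert Luo</TA>
  </Undergrad>
  <Graduate ID="IST597">
    <Room>210 IST</Room>
  </Graduate>
</Courses>
""")

# /Courses/*[child::Room='110 IST'], relative to the <Courses> element
in_110 = courses.findall("*[Room='110 IST']")
print([(c.tag, c.get("ID")) for c in in_110])
```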

19 IST Example
What IST courses have TAs?
/Courses/*/TA/parent::*
Original XML
  <Courses>
    <Undergrad ID="IST462">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result: the <Undergrad ID="IST462"> element
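
A short Python sketch of stepping down to <TA> and back up to its parent, using the abbreviated .. step that xml.etree.ElementTree understands. The second query, *[TA], is an equivalent predicate form added here for comparison; it is not on the slide.

```python
import xml.etree.ElementTree as ET

courses = ET.fromstring(
    '<Courses>'
    '<Undergrad ID="IST462"><Room>110 IST</Room><Instructor /><TA>Robert Luo</TA></Undergrad>'
    '<Graduate ID="IST597"><Room>210 IST</Room></Graduate>'
    '</Courses>'
)

# /Courses/*/TA/parent::*  written with the ".." abbreviation
via_parent = courses.findall("*/TA/..")
# Same answer with a predicate: courses that have a TA child
via_predicate = courses.findall("*[TA]")

print([c.get("ID") for c in via_parent])
print([c.get("ID") for c in via_predicate])
```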

20 IST Example
What rooms are used by IST courses?
/Courses/*/Room/text()
Original XML
  <Courses>
    <Undergrad ID="IST402">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result
  110 IST
  210 IST

21 Comparison
Comparison can be performed using =, !=, <=, <, >=, and >
Examples
[child::Room != '205 IST']
[child::Time > 1220]
NOTE: When used within a predicate, child::Room compares the same way as child::Room/text()

22 Math Operators
+ : performs addition
- : performs subtraction
* : performs multiplication
div : performs division
mod : returns the remainder of division
Examples: [child::Time mod 100 = 30]
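
The comparison and mod predicates can be emulated in plain Python when an XPath engine is not at hand. The sketch below assumes invented <Time> values and IDs (the Courses example in these slides has no <Time> element), and number() is only a rough stand-in for XPath's string-to-number conversion.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical data: <Time> values and IDs are made up for this example.
courses = ET.fromstring(
    "<Courses>"
    "<Undergrad ID='A'><Time>1130</Time></Undergrad>"
    "<Graduate ID='B'><Time>1430</Time></Graduate>"
    "<Graduate ID='C'><Time>1500</Time></Graduate>"
    "</Courses>"
)

def number(text):
    """Rough stand-in for XPath number(): non-numeric or missing text becomes NaN."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan

# /Courses/*[child::Time > 1220]
later = [c.get("ID") for c in courses if number(c.findtext("Time")) > 1220]

# /Courses/*[child::Time mod 100 = 30]
# math.fmod keeps the sign of the dividend, as XPath's mod does.
half_past = [c.get("ID") for c in courses
             if math.fmod(number(c.findtext("Time")), 100) == 30]

print(later)
print(half_past)
```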

23 Node Functions
last() : returns the numeric position of the last node in a list
position() : returns the numeric position of the current node
count() : returns the number of nodes in a list
name() : returns the name of a node
id() : selects elements by their unique ID

24 Node Function Example
Which courses have more than 2 child elements?
/Courses/*[count(child::*)>2]
Original XML
  <Courses>
    <Undergrad ID="IST402">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result: the <Undergrad ID="IST402"> element (3 child elements)
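
Python's xml.etree.ElementTree does not implement count(), so the sketch below emulates the predicate with len(), which on an element gives its number of child elements. It is an approximation of the query above, not an XPath evaluation.

```python
import xml.etree.ElementTree as ET

courses = ET.fromstring(
    '<Courses>'
    '<Undergrad ID="IST402"><Room>110 IST</Room><Instructor /><TA>Robert Luo</TA></Undergrad>'
    '<Graduate ID="IST597"><Room>210 IST</Room></Graduate>'
    '</Courses>'
)

# /Courses/*[count(child::*)>2] : keep courses with more than two child elements
busy = [c for c in courses if len(c) > 2]
print([(c.tag, len(c)) for c in busy])
```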

25 String Functions
concat(string, string) : concatenates the string arguments
starts-with(string, string) : returns true if the first string starts with the second string
contains(string, string) : returns true if the first string contains the second string
Eg,
concat('sh', 'oe') = 'shoe'
starts-with('cat', 'ca') = true
contains('puppy', 'upp') = true

26 String Functions
substring(string, number, [number]) : returns a substring of the provided string
string-length(string) : returns the number of characters in the string
Eg,
substring('chicken', 3, 4) = 'icke'
substring('chicken', 3) = 'icken'
string-length('cat') = 3
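
XPath counts character positions from 1, which is easy to trip over when coming from a 0-based language. The Python sketch below emulates substring() under that rule, including the rounding of non-integer arguments described in the XPath 1.0 specification. The function names are made up for this example, and NaN and infinite arguments are not handled.

```python
import math

def xpath_round(x):
    # XPath 1.0 round(): nearest integer, ties go toward positive infinity
    return math.floor(x + 0.5)

def xpath_substring(s, start, length=None):
    """Emulates XPath 1.0 substring(): positions are 1-based."""
    first = xpath_round(start)
    if length is None:
        return "".join(ch for pos, ch in enumerate(s, start=1) if pos >= first)
    end = first + xpath_round(length)
    return "".join(ch for pos, ch in enumerate(s, start=1) if first <= pos < end)

print(xpath_substring("chicken", 3, 4))
print(xpath_substring("chicken", 3))
print(len("cat"))   # string-length('cat')
```

In Python's own slicing the first call would be "chicken"[2:6], one less than the XPath start position.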

27 String Functions Examples
//Book[starts-with(child::Title, "X")]/Price
//Book[string-length(Author/FN)=3]/Title
  <Catalog>
    <Book>
      <Title>XML</Title>
      <Price>19.9</Price>
      <Author>
        <FN>Joe</FN>
      </Author>
    </Book>
    <Book>
      <Title>XSLT</Title>
      <Price>22.9</Price>
      <FN>HJ</FN><LN>Kyle</LN>
    </Book>
  </Catalog>
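
Since xml.etree.ElementTree has no starts-with() or string-length(), the two queries can be approximated in Python by filtering after selection. This sketch writes the slide's catalog with full closing tags and reads <Price> with a capital P, because element names are case-sensitive.

```python
import xml.etree.ElementTree as ET

catalog = ET.fromstring(
    "<Catalog>"
    "<Book><Title>XML</Title><Price>19.9</Price><Author><FN>Joe</FN></Author></Book>"
    "<Book><Title>XSLT</Title><Price>22.9</Price><FN>HJ</FN><LN>Kyle</LN></Book>"
    "</Catalog>"
)

# //Book[starts-with(child::Title, "X")]/Price
prices = [b.findtext("Price") for b in catalog.iter("Book")
          if b.findtext("Title", "").startswith("X")]

# //Book[string-length(Author/FN)=3]/Title
# A missing Author/FN counts as the empty string, as in XPath.
titles = [b.findtext("Title") for b in catalog.iter("Book")
          if len(b.findtext("Author/FN", "")) == 3]

print(prices)
print(titles)
```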

28 Number Functions
sum(node-set) : returns the sum of values for each node in a node set
floor(number) : returns the largest integer that is not greater than the argument
Eg, floor(2.6) = 2
ceiling(number) : returns the smallest integer that is not less than the argument
Eg, ceiling(2.6) = 3
round(number) : returns the closest integer to the argument
Eg, round(2.4) = 2
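
For comparison, here is how these functions line up with Python. floor and ceiling map directly to the math module, while round differs on ties: XPath 1.0 rounds a tie toward positive infinity, whereas Python's built-in round() goes to the nearest even integer. The helper xpath_round and the sample price strings are assumptions for this sketch, which ignores NaN and infinities.

```python
import math

def xpath_round(x):
    """XPath 1.0 round(): nearest integer, ties go toward positive infinity."""
    return math.floor(x + 0.5)

print(math.floor(2.6))    # floor(2.6)
print(math.ceil(2.6))     # ceiling(2.6)
print(xpath_round(2.4))   # round(2.4)
print(xpath_round(2.5), round(2.5))   # XPath tie rule vs Python tie rule

# sum(node-set): convert each node's string value to a number, then add.
price_texts = ["19.9", "22.9"]   # hypothetical string values of selected nodes
total = sum(float(p) for p in price_texts)
print(total)
```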

29 Boolean OPs in XPath
Conjunction: "and"
Disjunction: "or"
Disjunction: "|"
Compute both node-sets and return the union
//Book | //Tape
NOTE: some XPath engines currently support only either "|" or "or" disjunction
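
Python's xml.etree.ElementTree does not accept | inside a path, so a union like //Book | //Tape can be emulated by evaluating both paths and merging the results. The store document below is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document containing both kinds of element.
store = ET.fromstring(
    "<Store>"
    "<Book id='b1'/><Tape id='t1'/>"
    "<Shelf><Book id='b2'/></Shelf>"
    "<CD id='c1'/>"
    "</Store>"
)

# //Book | //Tape : compute both node-sets, then merge without duplicates
books = store.findall(".//Book")
tapes = store.findall(".//Tape")
wanted = set(books) | set(tapes)

# One walk over the tree returns the union in document order.
union = [e for e in store.iter() if e in wanted]
print([e.get("id") for e in union])
```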

30 XPath Lab [www.zvon.org]
/AAA/CCC and /AAA/DDD/BBB, each evaluated against:
  <AAA>
    <BBB/>
    <CCC/>
    <BBB/>
    <BBB/>
    <DDD>
      <BBB/>
    </DDD>
    <CCC/>
  </AAA>

31 XPath Lab [www.zvon.org]
//BBB and /AAA/*, each evaluated against:
  <AAA>
    <BBB/>
    <CCC/>
    <BBB/>
    <BBB/>
    <DDD>
      <BBB/>
    </DDD>
    <CCC/>
  </AAA>

32 XPath Lab [www.zvon.org]
/AAA/BBB[1] and /AAA/BBB[last()], each evaluated against:
  <AAA>
    <BBB/>
    <CCC/>
    <BBB/>
    <BBB/>
    <DDD>
      <BBB/>
    </DDD>
    <CCC/>
  </AAA>

33 XPath Lab [www.zvon.org]
/AAA//BBB[1] and /AAA//BBB[last()], each evaluated against:
  <AAA>
    <BBB/>
    <CCC/>
    <BBB/>
    <BBB/>
    <DDD>
      <BBB/>
    </DDD>
    <CCC/>
  </AAA>
Callouts on the /AAA//BBB[last()] result: Position=3, Position=1

34 Position Explanation
"/AAA//BBB" returns two lists:
Three <BBB> as the children of <AAA>
One <BBB> as the grandchild of <AAA>
Then, a position predicate like [1] or [2] is applied to the answers in each list SEPARATELY
/AAA//BBB[1] returns both:
First <BBB> from the first list -- a child of <AAA>
First <BBB> from the second list -- a grandchild of <AAA>
/AAA//BBB[last()] likewise returns one node from each list:
last() returns the position of the last node in a list, and it is evaluated for each list separately
Third <BBB> from the first list (position 3) -- a child of <AAA>
The only <BBB> from the second list (position 1) -- a grandchild of <AAA>
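
The per-list behaviour of position predicates can be checked with Python's xml.etree.ElementTree, which applies [1] and [last()] among the same-named children of each parent, as XPath 1.0 does. In this sketch the helper describe is an assumption added for readability; it reports each hit as (parent tag, index among that parent's children), and the [last()] hits line up with the Position=3 and Position=1 callouts on the previous slide.

```python
import xml.etree.ElementTree as ET

aaa = ET.fromstring("<AAA><BBB/><CCC/><BBB/><BBB/><DDD><BBB/></DDD><CCC/></AAA>")

# ElementTree elements do not know their parent, so keep a child -> parent map.
parent_of = {child: parent for parent in aaa.iter() for child in parent}

def describe(nodes):
    # (parent tag, 0-based index among ALL children of that parent)
    return [(parent_of[n].tag, list(parent_of[n]).index(n)) for n in nodes]

first_each = describe(aaa.findall(".//BBB[1]"))        # /AAA//BBB[1]
last_each = describe(aaa.findall(".//BBB[last()]"))    # /AAA//BBB[last()]
last_child = describe(aaa.findall("BBB[last()]"))      # /AAA/BBB[last()]

print(first_each)
print(last_each)
print(last_child)
```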

35 XPath Lab [www.zvon.org]
  <AAA>
    <BBB id = "b1"/>
    <BBB id = "b2"/>
    <BBB name = "bbb"/>
    <BBB/>
  </AAA>

36 XPath Lab [www.zvon.org]
//*[count(BBB)=2] and //*[count(*)=3], each evaluated against:
  <AAA>
    <CCC>
      <BBB/>
      <BBB/>
      <BBB/>
    </CCC>
    <DDD>
      <BBB/>
      <BBB/>
    </DDD>
    <EEE>
      <CCC/>
      <DDD/>
    </EEE>
  </AAA>

37 XPath Evaluation S/W
Many software packages now have built-in support for XPath 1.0 and 2.0
Eg,
XPath Visualizer: Windows only
XMLSpy: Windows only
<oXygen/>: Mac and Windows
XMLPad: Windows only

38 #1. XPath Visualizer Answer #2 for //letter/paragraph Answer #1 for
Minor bug here

39 #2. XMLSpy Choose Evaluate XPath

40 #2. XMLSpy Answer #1 for //letter/paragraph

41 #2. XMLSpy Answer #2 for //letter/paragraph

42 #3. <oXygen/> Press Enter key Answer #1 for //letter/paragraph

43 #3. <oXygen/> Answer #2 for //letter/paragraph

44 #4 XMLPad

45 XPath Evaluation in Programming
XPath Engines / Libraries
Apache Xalan-Java
Saxon
Jaxen
PL specific APIs
Java: package javax.xml.xpath + DOM
PHP: domxml's xpath_eval() (v4), SimpleXML (v5)

46 Eg. XPath in JAVA
public Node findAddress(String name, Document source) throws Exception {
    // need to recreate a few helper objects
    XMLParserLiaison xpathSupport = new XMLParserLiaisonDefault();
    XPathProcessor xpathParser = new XPathProcessorImpl(xpathSupport);
    PrefixResolver prefixResolver = new PrefixResolverDefault(source.getDocumentElement());

    // create the XPath and initialize it
    XPath xp = new XPath();
    String xpString = "//address[child::addressee[text() = '" + name + "']]";
    xpathParser.initXPath(xp, xpString, prefixResolver);

    // now execute the XPath select statement
    XObject list = xp.execute(xpathSupport, source.getDocumentElement(), prefixResolver);

    // return the first node of the resulting node-set
    return list.nodeset().item(0);
}
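
For readers without the Java toolchain, the same lookup can be sketched in Python with xml.etree.ElementTree. This is an illustrative counterpart, not a port of the API above: the function name find_address and the address-book document are assumptions. The comparison is done in Python rather than by pasting name into an XPath string, which sidesteps quoting problems when the name contains an apostrophe.

```python
import xml.etree.ElementTree as ET

def find_address(name, source):
    """Return the first <address> with an <addressee> child whose text is name, else None."""
    for address in source.iter("address"):          # like //address
        if any(a.text == name for a in address.findall("addressee")):
            return address
    return None

# Hypothetical address book used only to exercise the function.
book = ET.fromstring(
    "<addressbook>"
    "<address><addressee>Ann O'Neil</addressee><street>1 Main St</street></address>"
    "<address><addressee>Bob</addressee><street>2 Oak Ave</street></address>"
    "</addressbook>"
)

hit = find_address("Bob", book)
print(hit.findtext("street"))
print(find_address("Ann O'Neil", book).findtext("street"))
print(find_address("Zed", book))
```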

47 Eg. SimpleXML in PHP http://www.tuxradar.com/practicalphp/12/3/3
<?php
    $xml = simplexml_load_file('employees.xml');

    echo "<strong>Using direct method...</strong><br />";
    $names = $xml->xpath('/employees/employee/name');
    foreach($names as $name) {
        echo "Found $name<br />";
    }
    echo "<br />";

    echo "<strong>Using indirect method...</strong><br />";
    $employees = $xml->xpath('/employees/employee');
    foreach($employees as $employee) {
        echo "Found {$employee->name}<br />";
    }
    echo "<br />";

    echo "<strong>Using wildcard method...</strong><br />";
    $names = $xml->xpath('//name');
    foreach($names as $name) {
        echo "Found $name<br />";
    }
?>

48 Lab #2 (DUE: Sep. 25 11:55PM) https://online.ist.psu.edu/ist516/labs
Tasks:
Individual lab
Using XML files, practice XPath queries
Turn-In
XPath queries and their English interpretation
Screenshots of the results of the XPath queries

