Dongwon Lee, Ph.D. IST 516 Fall 2011


XPath  /*/*/self::*

XPath
Path-based XML query language
  V1.0 (1999): http://www.w3.org/TR/xpath
  V2.0 (working drafts in 2003, W3C Recommendation in 2007): http://www.w3.org/TR/xpath20/
    Functional, strongly-typed query language
http://www.w3schools.com/xpath/xpath_intro.asp

Apps of XPath
XQuery: a full-blown query language for XML
  for $x in doc("books.xml")/bookstore/book
  where $x/price > 30
  order by $x/title
  return $x/title
XPointer/XLink: a standard way to create hyperlinks in XML
  <book title="Harry Potter">
    <description
        xlink:type="simple"
        xlink:href="http://book.com/images/HPotter.gif"
        xlink:show="new">
      As his fifth year at Hogwarts School of Witchcraft and
      Wizardry approaches, 15-year-old Harry Potter is.......
    </description>
  </book>

Apps of XPath
XSLT: a style sheet language for XML that can transform XML from one format to another
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <xsl:for-each select="catalog/cd">
        <tr>
          <td><xsl:value-of select="title"/></td>
          <td><xsl:value-of select="artist"/></td>
        </tr>
      </xsl:for-each>
    </xsl:template>
  </xsl:stylesheet>

XPath vs. SQL
  XPath           SQL
  XML model       Relational model
  Trees           Tables
  Hierarchy       Flat
  Order           Orderless (except ORDER BY)

XPath vs. XQuery
XPath
  XML model
  Trees
  Hierarchy
  Order
XQuery
  Can do all XPath does but not vice versa
  Turing-complete, general-purpose PL
  Can retrieve, update, and transform XML data
  FLWOR expression

XPath Expression
An expression (the basic building block) evaluates to one of four object types:
  node-set (an unordered collection of nodes without duplicates)
  boolean (true or false)
  number (a floating-point number)
  string (a sequence of characters)

XPath Nodes
<?xml version="1.0" encoding="UTF-8"?>
<note xmlns="http://pike.psu.edu"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd">
  <to>Tove</to>
  <!-- <from>Jani</from> -->
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
Labels on the slide: processing-instruction, namespace, document, comment, text
Nodes: 7 types
  element, attribute, text, namespace, processing-instruction, comment, document
Note: strictly, the XPath 1.0 data model does not count the XML declaration (<?xml ...?>) as a processing-instruction node; other <?target ...?> constructs are.

Location Step
  axis :: node-test [predicate]
Location steps are evaluated in order from left to right
  Absolute: /step/step/…
  Relative: step/step/…
Axis: specifies the node relationship
Node test: specifies node type and name
Predicate: instructions to filter nodes
Preferred: faster to evaluate

1. Axis
/ selects the root of the node hierarchy
  <document/> as the default root of the XML document
Forward axis
  child::, descendant::, attribute::, self::, descendant-or-self::, following-sibling::, following::
Backward axis
  parent::, ancestor::, preceding-sibling::, preceding::, ancestor-or-self::
Axes are relative to the current context node (axis::node-test)
  child::emp: "emp" is a child element of the current node
  attribute::date: "date" is an attribute of the current node

Node Relationships
[Figure: tree rooted at Courses with nodes Undergrad, Graduate, Room, Instructor, Name, Office, Phone; annotated with Self, Parent, Child, Grandchild, Sibling, Ancestors, Descendants]

1. Axis Abbreviation
  /descendant-or-self::node()/  ->  //
  child::  ->  (omitted; child is the default axis)
  attribute::  ->  @
  self::node()  ->  .
  parent::node()  ->  ..
Eg
  /child::doc/descendant::chapter  ->  /doc//chapter
  //doc/attribute::type  ->  //doc/@type

2. Node Test
  node(): matches all nodes
  text(): matches all text nodes
  ElementName: matches all elements named 'ElementName'
  *: matches all elements
  @*: matches all attributes

2. Node Test
* (wildcard) is often used to match unknown XML elements
  /catalog/cd/*: all the child elements of all the cd elements of the catalog element
  /*: all children of the root <document/>
  /*/*: all grandchildren of the root <document/>
  //*: all elements of the XML document

3. Predicate
  Path-expression[ filtering condition ]
  Selects the nodes of the path expression that satisfy the filtering condition
Eg
  //doc[@type='PDF']
  Finds all <doc> elements whose "type" attribute has the value 'PDF'
  This returns the <doc> elements, not their "type" attributes
The filtering condition does not change what kind of node is returned (ie, the projection) by the XPath
  It just adds more constraints to satisfy
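A minimal sketch of the point above, using Python's standard-library xml.etree.ElementTree. That module implements only a subset of XPath, so the query is written in the relative form .//doc instead of //doc. The sample document and its name attributes are made up for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data (not from the slides).
XML = """
<library>
  <doc type="PDF" name="a.pdf"/>
  <doc type="HTML" name="b.html"/>
  <shelf><doc type="PDF" name="c.pdf"/></shelf>
</library>
"""

root = ET.fromstring(XML)

# ElementTree form of //doc[@type='PDF'], evaluated from the root element.
pdf_docs = root.findall(".//doc[@type='PDF']")

# The predicate only filters: the result is still a list of <doc> elements,
# not a list of their "type" attributes.
pdf_tags = [d.tag for d in pdf_docs]
pdf_names = [d.get("name") for d in pdf_docs]

print(pdf_tags)   # ['doc', 'doc']
print(pdf_names)  # ['a.pdf', 'c.pdf']
```

To project the attribute instead, a full XPath engine would use //doc[@type='PDF']/@type; with ElementTree the attribute is read from each returned element with get().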

Location Step Examples

Examples of usage

IST Example
What IST classes are in room 110 IST?
  /Courses/*[child::Room='110 IST']
Original XML
  <Courses>
    <Undergrad ID="IST462">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result
  The <Undergrad ID="IST462"> element

IST Example
What IST courses have TAs?
  /Courses/*/TA/parent::*
Original XML
  <Courses>
    <Undergrad ID="IST462">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result
  The <Undergrad ID="IST462"> element

IST Example
What rooms are used by IST courses?
  /Courses/*/Room/text()
Original XML
  <Courses>
    <Undergrad ID="IST402">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result
  110 IST
  210 IST
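The three IST queries can be tried with Python's standard-library xml.etree.ElementTree. This is only a sketch: ElementTree supports a subset of XPath, so each query is rewritten in abbreviated, relative form (no child:: or parent:: axis names, no text() node test), and the original query is kept in a comment. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

COURSES = """
<Courses>
  <Undergrad ID="IST402">
    <Room>110 IST</Room>
    <Instructor/>
    <TA>Robert Luo</TA>
  </Undergrad>
  <Graduate ID="IST597">
    <Room>210 IST</Room>
  </Graduate>
</Courses>
"""

root = ET.fromstring(COURSES)  # root is the <Courses> element

# /Courses/*[child::Room='110 IST']
in_110 = [c.get("ID") for c in root.findall("*[Room='110 IST']")]

# /Courses/*/TA/parent::*   (".." is the abbreviated parent step)
with_ta = [c.get("ID") for c in root.findall("*/TA/..")]

# /Courses/*/Room/text()    (no text() here, so read .text from each <Room>)
rooms = [r.text for r in root.findall("*/Room")]

print(in_110)   # ['IST402']
print(with_ta)  # ['IST402']
print(rooms)    # ['110 IST', '210 IST']
```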

Comparison
Comparison can be performed using =, !=, <=, <, >=, and >
Examples
  [child::Room != '205 IST']
  [child::Time > 1220]
NOTE: When used within a predicate, child::Room == child::Room/text()

Math Operators
  + : performs addition
  - : performs subtraction
  * : performs multiplication
  div : performs division
  mod : returns the remainder of division
Examples:
  [child::Time mod 100 = 30]

Node Functions
  last() : returns the numeric position of the last node in a list
  position() : returns the numeric position of the current node
  count() : returns the number of nodes in a list
  name() : returns the name of a node
  id() : selects elements by their unique ID

Node Function Example
Which courses have more than 2 child elements?
  /Courses/*[count(child::*)>2]
Original XML
  <Courses>
    <Undergrad ID="IST402">
      <Room>110 IST</Room>
      <Instructor />
      <TA>Robert Luo</TA>
    </Undergrad>
    <Graduate ID="IST597">
      <Room>210 IST</Room>
    </Graduate>
  </Courses>
Result
  The <Undergrad ID="IST402"> element (three children: Room, Instructor, TA)

String Functions
  concat(string, string) : concatenates the string arguments
  starts-with(string, string) : returns true if the first string starts with the second string
  contains(string, string) : returns true if the first string contains the second string
Eg,
  concat('sh', 'oe') = 'shoe'
  starts-with('cat', 'ca') = true
  contains('puppy', 'upp') = true

String Functions
  substring(string, number, [number]) : returns a substring of the provided string
  string-length(string) : returns the number of characters in the string
Eg,
  substring('chicken', 3, 4) = 'icke'
  substring('chicken', 3) = 'icken'
  string-length('cat') = 3

String Functions Examples
  //Book[starts-with(child::Title, "X")]/Price
  //Book[string-length(Author/FN)=3]/Title
<Catalog>
  <Book>
    <Title>XML</Title>
    <Price>19.9</Price>
    <Author>
      <FN>Joe</FN>
    </Author>
  </Book>
  <Book>
    <Title>XSLT</Title>
    <Price>22.9</Price>
    <FN>HJ</FN><LN>Kyle</LN>
  </Book>
</Catalog>

Number Functions
  sum(node-set) : returns the sum of values for each node in a node set
    Eg, sum(//@price)
  floor(number) : returns the largest integer that is not greater than the argument
    Eg, floor(2.6) = 2
  ceiling(number) : returns the smallest integer that is not less than the argument
    Eg, ceiling(2.6) = 3
  round(number) : returns the closest integer to the argument
    Eg, round(2.4) = 2

Boolean OPs in XPath
Conjunction: "and"
  //Product[@price>10.8 and @year>2000]
Disjunction: "or"
  /Customer[@cname='Lee' or @cid>100]
Disjunction: "|"
  Compute both node-sets and return the union
  //Book | //Tape
NOTE: some XPath engines currently support only one of the "|" and "or" disjunctions

XPath Lab [www.zvon.org]
Queries (same document for both)
  /AAA/CCC
  /AAA/DDD/BBB
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>

XPath Lab [www.zvon.org]
Queries (same document for both)
  //BBB
  /AAA/*
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>
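The four lab queries so far can be checked with a short script. This sketch assumes Python's standard-library xml.etree.ElementTree, whose XPath subset is evaluated relative to an element; since the root element here is <AAA>, /AAA/CCC becomes CCC, //BBB becomes .//BBB, and so on. The helper name tags is illustrative only.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>
"""

root = ET.fromstring(DOC)  # the <AAA> element


def tags(path):
    """Return the tag names of the elements selected by path, in document order."""
    return [e.tag for e in root.findall(path)]


ccc = tags("CCC")            # /AAA/CCC
ddd_bbb = tags("DDD/BBB")    # /AAA/DDD/BBB
all_bbb = tags(".//BBB")     # //BBB
children = tags("*")         # /AAA/*

print(len(ccc), len(ddd_bbb), len(all_bbb))  # 2 1 4
print(children)  # ['BBB', 'CCC', 'BBB', 'BBB', 'DDD', 'CCC']
```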

XPath Lab [www.zvon.org]
Queries (same document for both)
  /AAA/BBB[1]
  /AAA/BBB[last()]
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>

XPath Lab [www.zvon.org]
Queries (same document for both)
  /AAA//BBB[1]
  /AAA//BBB[last()]
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>
Slide annotations: Position=3, Position=1

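Position Explanation
"/AAA//BBB" returns two lists:
  Three <BBB> as the children of <AAA>
  One <BBB> as the grandchild of <AAA>
Then, a position predicate like [1] or [2] is applied to the answers in each list SEPARATELY
/AAA//BBB[1] returns both:
  First <BBB> from the first list -- a child of <AAA>
  First <BBB> from the second list -- a grandchild of <AAA>
/AAA//BBB[last()] likewise returns both:
  Last <BBB> from the first list -- the third <BBB> child of <AAA> (position 3)
  Last <BBB> from the second list -- the only <BBB> grandchild of <AAA> (position 1)
  last() returns the position of the last node in a list, and it is evaluated once per list
To pick the single last <BBB> in document order, parenthesize the path first: (/AAA//BBB)[last()]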
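In XPath 1.0 a positional predicate in a step such as //BBB[1] or //BBB[last()] is evaluated per parent, that is, once for each list of sibling <BBB> elements. A quick way to see this is the sketch below, which assumes Python's standard-library xml.etree.ElementTree (a subset of XPath, relative paths only). The n attributes are not in the lab document; they are added here purely to tell the <BBB> elements apart.

```python
import xml.etree.ElementTree as ET

# Same shape as the lab document; the n="..." labels are hypothetical additions.
DOC = """
<AAA>
  <BBB n="child-1"/>
  <CCC/>
  <BBB n="child-2"/>
  <BBB n="child-3"/>
  <DDD>
    <BBB n="grandchild-1"/>
  </DDD>
  <CCC/>
</AAA>
"""

root = ET.fromstring(DOC)  # the <AAA> element

# /AAA//BBB[1]: the first <BBB> among each parent's <BBB> children.
first = [b.get("n") for b in root.findall(".//BBB[1]")]

# /AAA//BBB[last()]: the last <BBB> among each parent's <BBB> children.
last = [b.get("n") for b in root.findall(".//BBB[last()]")]

print(first)  # ['child-1', 'grandchild-1']
print(last)   # ['child-3', 'grandchild-1']
```

Both queries return two elements, one per sibling list, which matches the Position=3 and Position=1 annotations on the lab slide.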

XPath Lab [www.zvon.org]
Queries (same document for both)
  //@id
  //BBB[@id="b2"]
<AAA>
  <BBB id = "b1"/>
  <BBB id = "b2"/>
  <BBB name = "bbb"/>
  <BBB/>
</AAA>
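A rough check of the two attribute queries, again assuming Python's standard-library xml.etree.ElementTree. ElementTree cannot return attribute nodes, so //@id is approximated by selecting every element that carries an id attribute and then reading the value; that substitution is a limitation of this sketch, not of XPath.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB id="b1"/>
  <BBB id="b2"/>
  <BBB name="bbb"/>
  <BBB/>
</AAA>
"""

root = ET.fromstring(DOC)

# Approximation of //@id: elements that have an id attribute, then its value.
ids = [e.get("id") for e in root.findall(".//*[@id]")]

# //BBB[@id="b2"]: the <BBB> elements whose id equals "b2".
b2 = root.findall(".//BBB[@id='b2']")

print(ids)      # ['b1', 'b2']
print(len(b2))  # 1
```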

XPath Lab [www.zvon.org]
Queries (same document for both)
  //*[count(BBB)=2]
  //*[count(*)=3]
<AAA>
  <CCC>
    <BBB/>
    <BBB/>
    <BBB/>
  </CCC>
  <DDD>
    <BBB/>
    <BBB/>
  </DDD>
  <EEE>
    <CCC/>
    <DDD/>
  </EEE>
</AAA>

XPath Evaluation S/W
Many S/W have built-in support for XPath 1.0 and 2.0 now
Eg,
  XPath Visualizer: Windows only, http://xpathvisualizer.codeplex.com/
  XMLSpy: Windows only
  <oXygen/>: Mac and Windows
  XMLPad: Windows only

#1. XPath Visualizer
[Screenshot: Answer #1 and Answer #2 for //letter/paragraph; callout: "Minor bug here"]

#2. XMLSpy Choose Evaluate XPath

#2. XMLSpy Answer #1 for //letter/paragraph

#2. XMLSpy Answer #2 for //letter/paragraph

#3. <oXygen/>
Press the Enter key
Answer #1 for //letter/paragraph

#3. <oXygen/>
Answer #2 for //letter/paragraph

#4 XMLPad

XPath Evaluation in Programming
XPath engines / libraries
  Apache Xalan-Java: http://xml.apache.org/xalan-j/
  Saxon: http://saxon.sourceforge.net/
  Jaxen: http://jaxen.codehaus.org/
PL-specific APIs
  Java: package javax.xml.xpath + DOM
  PHP: domxml's xpath_eval() (v4), SimpleXML (v5)

Eg. XPath in JAVA
http://www.javaworld.com/javaworld/jw-09-2000/jw-0908-xpath.html?page=3

// API as shown in the cited article (2000); these helper classes are not part of
// the javax.xml.xpath package listed on the previous slide.
public Node findAddress(String name, Document source) throws Exception {
    // need to recreate a few helper objects
    XMLParserLiaison xpathSupport = new XMLParserLiaisonDefault();
    XPathProcessor xpathParser = new XPathProcessorImpl(xpathSupport);
    PrefixResolver prefixResolver =
        new PrefixResolverDefault(source.getDocumentElement());

    // create the XPath and initialize it
    XPath xp = new XPath();
    String xpString = "//address[child::addressee[text() = '" + name + "']]";
    xpathParser.initXPath(xp, xpString, prefixResolver);

    // now execute the XPath select statement
    XObject list = xp.execute(xpathSupport, source.getDocumentElement(), prefixResolver);

    // return the first matching <address> node
    return list.nodeset().item(0);
}
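For comparison, the same lookup can be sketched with Python's standard-library xml.etree.ElementTree. This is an illustrative counterpart, not a port of the article's code: the function name find_address, the sample address book, and the street element are all assumptions, and ElementTree's XPath subset expresses the predicate as [addressee='...'] rather than child::addressee[text() = '...'].

```python
import xml.etree.ElementTree as ET


def find_address(name, root):
    """Return the first <address> whose <addressee> child has text `name`, or None."""
    # The name is spliced into the path, as in the Java version. There is no
    # quoting mechanism here, so names containing a single quote are rejected.
    if "'" in name:
        raise ValueError("name must not contain a single quote")
    # Rough ElementTree form of //address[child::addressee[text() = name]].
    return root.find(f".//address[addressee='{name}']")


# Hypothetical sample data (not from the slides or the cited article).
BOOK = ET.fromstring("""
<addressbook>
  <address><addressee>Ann Lee</addressee><street>1 Main St</street></address>
  <address><addressee>Bob Kim</addressee><street>2 Oak Ave</street></address>
</addressbook>
""")

match = find_address("Bob Kim", BOOK)
street = match.find("street").text
missing = find_address("Nobody", BOOK)

print(street)   # 2 Oak Ave
print(missing)  # None
```

Unlike item(0) on an empty node list in some APIs, find() returns None when nothing matches, so callers should check for that before using the result.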

Eg. SimpleXML in PHP
http://www.tuxradar.com/practicalphp/12/3/3

<?php
    $xml = simplexml_load_file('employees.xml');

    echo "<strong>Using direct method...</strong><br />";
    $names = $xml->xpath('/employees/employee/name');
    foreach($names as $name) {
        echo "Found $name<br />";
    }
    echo "<br />";

    echo "<strong>Using indirect method...</strong><br />";
    $employees = $xml->xpath('/employees/employee');
    foreach($employees as $employee) {
        echo "Found {$employee->name}<br />";
    }
    echo "<br />";

    echo "<strong>Using wildcard method...</strong><br />";
    $names = $xml->xpath('//name');
    foreach($names as $name) {
        echo "Found $name<br />";
    }
?>

Lab #2 (DUE: Sep. 25 11:55PM)
https://online.ist.psu.edu/ist516/labs
Tasks:
  Individual lab
  Using an XML file, practice XPath queries
Turn-in:
  XPath queries and English interpretation
  Screenshot of results of XPath queries