XML for .NET Session 1 Introduction to XML Introduction to XSLT

XML for .NET Session 1 Introduction to XML Introduction to XSLT Programmatically Reading XML Documents Introduction to XPATH

XML Documents Can be Read Programmatically The .NET Framework consists of many classes to aid in programmatically iterating through and navigating XML documents. These classes are found in the System.Xml namespace. The various classes in the System.Xml namespace are highlighted in Chapter 6 of the text, XML and ASP.NET (starting on page 261).

Accessing XML Content XML documents can be accessed in one of two ways: in a push model or a pull model. The pull model loads the entire XML document into memory, and then works with the document once it has been completely loaded. The push model accesses only tiny pieces of the XML document when needed.

Comparing and Contrasting Push and Pull Approaches
Pull Model
  Pluses: Quickly iterate and navigate through XML content once it’s fully loaded.
  Minuses: Requires that the entire XML document be loaded into memory; does not scale to large XML content or a large number of users.
Push Model
  Pluses: Allows for navigation and iteration of very large XML files.
  Minuses: Difficult to add and update elements in the XML document.

How to use the Two Methods The .NET Framework provides developers both methods: Pull Method – use the DOM classes in the .NET Framework. Push Method – use the XmlReader and XmlWriter classes.

Using the Pull Method The System.Xml namespace contains a number of classes to work with XML documents in the DOM paradigm:
XmlDocument – represents an XML document.
XmlElement – represents an individual element in the DOM.
XmlAttribute – represents an attribute.
XmlText – represents text content.

Using the Push Method The XmlReader reads one node at a time from a specified XML source. The XmlReader can only read in a FORWARD direction. The XmlReader class cannot be used directly; one of its derived classes must be used instead:
XmlNodeReader – reads one node at a time from an XML DOM.
XmlTextReader – reads one node at a time from an XML source, such as a file with XML content.
XmlValidatingReader – a reader that performs DTD or schema validation (more on this next week!)

Iterating through an XML Document using XmlTextReader To iterate through the contents of an XML document with the XmlTextReader we need to:
Specify the XML document to iterate through when creating the XmlTextReader.
Call the Read() method, which reads in the next Node.
Access the properties of the XmlTextReader to determine the name, value, and other information about the read Node.

Iterating through an XML Document using XmlTextReader We can programmatically read through the contents of an XML file like so:
// create an XmlTextReader to read the specified XML file
XmlTextReader reader = new XmlTextReader(filepath);
// now, display the information of each node in the TextBox
while (reader.Read())
{
    // access the properties of the XmlTextReader class...
    // like reader.Name, reader.NodeType, reader.Value, etc.
}
// close the XmlTextReader
reader.Close();

What is a Node? Recall that the XmlReader classes read XML nodes. What constitutes a node? Can you identify the nodes in the following XML fragment?
<?xml version="1.0" encoding="utf-8" ?>
<books>
  <book price="34.95">
    <title>Animal Farm</title>
    <authors>
      <author>Orwell</author>
    </authors>
  </book>
</books>

What is a Node? In the XML fragment above, the whitespace between each element (if present) is also considered a node! (Although, you can set the XmlTextReader’s WhitespaceHandling property to specify whether the reader should read whitespace nodes.)

What is a Node? Notice that in the same fragment, the attributes of an element (such as price on <book>) are not read as separate nodes...
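The node question above can be explored with a small console program. This is a minimal sketch, not the course demo: the class name NodeLister and the helper ListNodes are made up for illustration, and the fragment is read from a string through a StringReader rather than from a file. It assumes a compiler that supports C# 3 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeLister
{
    // The books fragment from the slides, written with straight quotes.
    public const string BooksXml =
        "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n" +
        "<books>\n" +
        "  <book price=\"34.95\">\n" +
        "    <title>Animal Farm</title>\n" +
        "    <authors>\n" +
        "      <author>Orwell</author>\n" +
        "    </authors>\n" +
        "  </book>\n" +
        "</books>";

    // Returns one "NodeType:Name" entry for each node that Read() visits.
    public static List<string> ListNodes(string xml, WhitespaceHandling whitespace)
    {
        var nodes = new List<string>();
        var reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = whitespace;
        try
        {
            while (reader.Read())
            {
                nodes.Add(reader.NodeType + ":" + reader.Name);
            }
        }
        finally
        {
            reader.Close();
        }
        return nodes;
    }

    public static void Main()
    {
        // Print the nodes with whitespace nodes suppressed.
        foreach (string node in ListNodes(BooksXml, WhitespaceHandling.None))
        {
            Console.WriteLine(node);
        }
    }
}
```

Passing WhitespaceHandling.All instead of WhitespaceHandling.None makes the list longer, because the line breaks and indentation between elements are then reported as nodes too. In both cases the price attribute never appears as an entry of its own.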

Creating a Program to View the Content Read by an XmlTextReader We can create a program that allows the user to select an XML file; then, the contents of the XML file are read by an XmlTextReader, with each read node’s name, type, and value displayed. (Run demo!)

Reading the Attributes As we saw in the demo, the attributes are not read as a separate node. We can determine whether or not a given node has attributes by the HasAttributes property. In order to programmatically access the attributes of a node, we must use the MoveToNextAttribute() method of the XmlTextReader.

Reading the Attributes
// C#
while (reader.Read())
{
    if (reader.HasAttributes)
    {
        while (reader.MoveToNextAttribute())
        {
            // Access the attribute name/value via
            // reader.Name/reader.Value
        }
    }
}

' VB.NET
While reader.Read()
    If reader.HasAttributes Then
        While reader.MoveToNextAttribute()
            ' Access the attribute name/value via
            ' reader.Name/reader.Value
        End While
    End If
End While

The XmlTextReader Properties and Methods The properties and methods of the XmlTextReader are listed starting on pg. 272 of the text. Some of the more germane methods include:
ReadInnerXml() – returns a string containing the current node’s content (child nodes, text content, etc.), including XML markup.
ReadOuterXml() – returns a string containing the current node’s own XML markup along with the markup of its content.
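The difference between the two methods is easiest to see side by side. The following is a small sketch rather than the course demo; the class InnerOuterDemo and its helpers are hypothetical names, and the one-book fragment is an assumption chosen to keep the output short.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InnerOuterDemo
{
    public const string BookXml =
        "<book price=\"34.95\"><title>Animal Farm</title></book>";

    // Positions a fresh reader on the <book> element, then applies the given read call.
    private static string ReadFromBook(Func<XmlTextReader, string> read)
    {
        var reader = new XmlTextReader(new StringReader(BookXml));
        try
        {
            reader.MoveToContent(); // moves to the first content node, here <book>
            return read(reader);
        }
        finally
        {
            reader.Close();
        }
    }

    // Markup of the children of <book> only.
    public static string Inner()
    {
        return ReadFromBook(r => r.ReadInnerXml());
    }

    // Markup of <book> itself plus its children.
    public static string Outer()
    {
        return ReadFromBook(r => r.ReadOuterXml());
    }

    public static void Main()
    {
        Console.WriteLine("Inner: " + Inner());
        Console.WriteLine("Outer: " + Outer());
    }
}
```

Both calls also advance the reader past the element they were called on, which is why the sketch creates a fresh reader for each call.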

The XmlTextReader Properties and Methods Run ReadInnerOutterXml-ForXmlTextReader demo… When reading an XML document, the XmlTextReader class will throw an XmlException if there was an error in parsing the XML. An error can occur if the XML, for example, is malformed. (That is, it is not well-formed.)
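One way to handle the exception is to wrap the read loop in a try/catch block. The sketch below is an illustration under assumptions, not the XmlException demo from the course: WellFormedCheck and TryParse are made-up names, and the input is a string instead of a file.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WellFormedCheck
{
    // Reads every node; returns false and fills in 'error' if the XML is malformed.
    public static bool TryParse(string xml, out string error)
    {
        error = null;
        var reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                // Nothing to do here: we only care whether parsing succeeds.
            }
            return true;
        }
        catch (XmlException ex)
        {
            // LineNumber and LinePosition point at where the parser gave up.
            error = "Line " + ex.LineNumber + ", position " + ex.LinePosition + ": " + ex.Message;
            return false;
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        string error;
        // <title> is never closed, so this input is not well-formed.
        if (!TryParse("<book><title>Animal Farm</book>", out error))
        {
            Console.WriteLine(error);
        }
    }
}
```

Because the reader works one node at a time, the exception is raised only when Read() reaches the offending markup, so nodes before the error may already have been processed.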

The XmlTextReader Properties and Methods Run the XmlException demo We will examine the XmlNodeReader and XmlValidatingReader – the other two XmlReader classes – later in this course.

Using the DOM to Iterate through an XML Document In contrast to the Push method (XmlReader/XmlWriter), the .NET Framework offers a Pull method. Recall that the Pull method reads the entire XML document into memory and then works with it from there. For this model, XML documents are represented in the Document Object Model (DOM).

What is the DOM? DOM stands for Document Object Model, and it’s a model that can be used to describe an XML document. The DOM expresses the XML document as a hierarchy of nodes, where each element can have zero to many children elements. The text content and attributes of an element are expressed as its children as well.

Example XML File
<?xml version="1.0" encoding="UTF-8" ?>
<books>
  <book price="34.95">
    <title>TYASP 3.0</title>
    <authors>
      <author>Mitchell</author>
    </authors>
  </book>
  <book price="29.95">
    <title>ASP.NET Tips</title>
    <authors>
      <author>Mitchell</author>
      <author>Walther</author>
      <author>Seven</author>
    </authors>
  </book>
</books>

The DOM View of the XML Document

The DOM Classes - XmlNode There are a number of classes in the System.Xml namespace that represent the DOM. Each “box” in the DOM model is represented in the .NET Framework by the XmlNode class. This means that elements, attributes, and text values are all represented by the XmlNode class. The XmlNode class is discussed on pg. 287

Extending the XmlNode Class There are a number of classes that are derived from the XmlNode class: XmlAttribute XmlElement XmlDocument And so on…

The XmlNode Properties The XmlNode class has many properties, the most germane ones being:
Name – the name of the node. For elements and attributes, the name is the name of the element or attribute. For text content, the name is #text.
Value – the value of the node. For elements, there is no value. For attributes, it’s the value of the attribute; for text nodes, it’s the value of the text in the node.
NodeType – indicates the type of the node (element, text, attribute, etc.)

More XmlNode Properties
InnerXml – the string content of the XML markup of the node’s children.
OuterXml – the string content of the XML markup of the node itself and its children.
InnerText – the string content of the value of the node and all its children nodes.
HasChildNodes – a Boolean, indicating if the node has any children.
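The properties listed on the last two slides can be inspected on a tiny document. This is a sketch under assumptions: NodePropertiesDemo is a hypothetical class, the one-book XML is chosen for brevity, and LoadXml() is used to parse a string instead of loading a file.

```csharp
using System;
using System.Xml;

public static class NodePropertiesDemo
{
    public const string BookXml =
        "<book price=\"34.95\"><title>Animal Farm</title>" +
        "<authors><author>Orwell</author></authors></book>";

    // Parses the sample and returns its root element as an XmlNode.
    public static XmlNode LoadBook()
    {
        var doc = new XmlDocument();
        doc.LoadXml(BookXml);
        return doc.DocumentElement;
    }

    public static void Main()
    {
        XmlNode book = LoadBook();
        XmlNode titleText = book.FirstChild.FirstChild; // the text node inside <title>

        Console.WriteLine("Name:          " + book.Name);
        Console.WriteLine("NodeType:      " + book.NodeType);
        Console.WriteLine("Value is null: " + (book.Value == null));
        Console.WriteLine("price:         " + book.Attributes["price"].Value);
        Console.WriteLine("Text name:     " + titleText.Name);
        Console.WriteLine("Text value:    " + titleText.Value);
        Console.WriteLine("InnerXml:      " + book.InnerXml);
        Console.WriteLine("OuterXml:      " + book.OuterXml);
        Console.WriteLine("InnerText:     " + book.InnerText);
        Console.WriteLine("HasChildNodes: " + book.HasChildNodes);
    }
}
```

Note that Value is null for the element itself; the text lives in a child node named #text, and InnerText concatenates the text of all descendants.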

The XmlNodeList Class The XmlNodeList class represents an arbitrary collection of XmlNodes. For example, the XmlNode class has a ChildNodes property, which returns an XmlNodeList instance. This instance is a collection of nodes representing the DOM element’s children.

Loading an XML Document into a DOM Representation The XmlDocument’s Load() method has four variations:
Load(Stream)
Load(string)
Load(TextReader)
Load(XmlReader)
In the Load(string) variation, the input string is a file path (or URL) to the XML file to load into the DOM representation.

The XmlDocument Properties The XmlDocument is derived from the XmlNode class, meaning it has all of the properties and methods available to the XmlNode class. Once an XML file has been loaded into an XmlDocument instance, we can access the root element through the DocumentElement property.

The XmlElement and XmlAttribute Classes The XmlElement and XmlAttribute classes are also derived from the XmlNode class. They represent, respectively, an element and an attribute.

Example The following loads an XML document and displays the name of the root element.
Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim rootElementName As String
rootElementName = xmlDoc.DocumentElement.Name

Example Iterating through the root element’s children:
Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim n As XmlNode
For Each n In xmlDoc.DocumentElement.ChildNodes
    ' Display the name of the node using n.Name
Next

An Example of Iterating through an XML Document Let’s create an application that displays an XML document in a TreeView control. Each node in the TreeView represents a Node in the DOM

An Example of Iterating through an XML Document We can recursively iterate through the DOM, ensuring that we’ll visit each node. (Explain recursion?) Examine application code... Questions on the program?
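The recursive walk can be sketched without a TreeView by collecting an indented outline instead. This is not the application code referred to above; DomWalker, Walk, and Outline are hypothetical names, and the XML is parsed from a string for the sake of a self-contained example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class DomWalker
{
    // Adds one line for 'node', then recurses into each of its children.
    public static void Walk(XmlNode node, int depth, List<string> lines)
    {
        lines.Add(new string(' ', depth * 2) + node.NodeType + " " + node.Name);
        foreach (XmlNode child in node.ChildNodes)
        {
            Walk(child, depth + 1, lines);
        }
    }

    // Parses the XML and returns the outline starting at the root element.
    public static List<string> Outline(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var lines = new List<string>();
        Walk(doc.DocumentElement, 0, lines);
        return lines;
    }

    public static void Main()
    {
        string xml = "<books><book price=\"34.95\"><title>TYASP 3.0</title></book></books>";
        foreach (string line in Outline(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

The recursion stops on its own because a node without children has an empty ChildNodes collection. Attributes are not part of ChildNodes, so a walk like this would need to visit node.Attributes separately to show them.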

Navigating through an XML Document So far, all we have seen is how to iterate through an XML document, one node at a time. With the pull method (DOM), however, we can navigate through the document as well. For example, we might want to access just the elements in the document that have a certain name. (Such as elements with the name <author>.)

Accessing Elements with a Certain Name The XmlDocument class contains a GetElementsByTagName() method, which returns an XmlNodeList containing elements that have the specified tag name.
Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim n As XmlNode
For Each n In xmlDoc.GetElementsByTagName("author")
    ' Display n.Value
Next
What would be the output of the above code???

Navigating through an XML Document However, what if we want to access nodes based on more complex criteria, such as: “Access all <book> elements with a price attribute value less than 30,” or, “Access the name of the authors who have written more than one book.” To accomplish this we need something more powerful – enter XPath!

A Quick Examination of XPath XPath is used to define particular sections of an XML document. XPath is named XPath because its syntax is similar to the syntax for a file path. For example, in our books XML document, we could use the following XPath statement to access all of the author elements: /books/book/authors/author

Why We Might Want to Access Certain XML Document Portions When using XSLT to display an XML file, typically we want to display only a subset of the XML document. For example, we might want to display a listing of flights, displaying the date, the departure city and the destination city. When working with XML data, we might want to retrieve only a certain subset of the data. We might want to access data that meets a certain set of criteria. All of these tasks can be accomplished with XPath

XPath Components – Steps To access the root element of the XML document, we use the following syntax: /RootElementName Then, to access immediate descendents (children) of a given element, we use /, followed by the name of the child element. The / operator is referred to as the step operator.

XPath Components – Steps The step operator has parallels to the \ operator in file paths. With file systems (which can be modeled as XML documents), you navigate the directory structure by using \. For example, the file path C:\Games\Quake\SavedGames takes you to the specified directory.

The file system can be represented as an XML document…
<?xml version="1.0" encoding="UTF-8" ?>
<filesystem>
  <drive letter="C">
    <folder name="Program Files" />
    <folder name="Games">
      <folder name="Quake">
        <folder name="SavedGames" />
        <file>Quake.exe</file>
        <file>README.txt</file>
      </folder>
    </folder>
    <folder name="Windows">
      <file>README.txt</file>
    </folder>
  </drive>
  <drive letter="D">
    <folder name="Backup">
      <file>2003-06-01.bak</file>
      <file>2003-06-07.bak</file>
    </folder>
  </drive>
</filesystem>

The DOM Model of the FileSystem XML Document

XPath Components - Steps Using XPath we can access the root element using: /filesystem

XPath Components - Steps To access all of the <drive> elements, we’d use: /filesystem/drive

XPath Components - Steps To access all of the folder elements that were children of <drive> elements, we’d use: /filesystem/drive/folder

XPath Components - Steps What about /filesystem/drive/folder/folder/folder

Descendent Steps Using elementName/elementName2, we get all of the elements that are children of elementName that have the name elementName2. But what if we want all elements that are descendents of elementName, regardless of whether or not the element is a child, grandchild, great-grandchild, etc.? Here, we use the // operator.

Descendent Steps As we saw earlier, /filesystem/drive/folder will return the folders that are immediate children of the <drive> elements (Program Files, Games, Windows, and Backup). If we want to get all folders, regardless of their depth in the hierarchy, we can use: /filesystem/drive//folder

Descendent Steps - Example What will /filesystem//file return?
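The step expressions from the preceding slides can be tried out against the file system document with a few lines of code. This is only a sketch: StepCounter and Count are hypothetical names, the document is embedded as a string (with single-quoted attributes to avoid escaping), and the SelectNodes() method it relies on is covered later in this session.

```csharp
using System;
using System.Xml;

public static class StepCounter
{
    // The file system document from the slides, compacted onto fewer lines.
    public const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'>" +
        "<folder name='Quake'>" +
        "<folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file>" +
        "</folder>" +
        "</folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'>" +
        "<folder name='Backup'>" +
        "<file>2003-06-01.bak</file><file>2003-06-07.bak</file>" +
        "</folder>" +
        "</drive>" +
        "</filesystem>";

    // Returns how many nodes the XPath expression selects in the document.
    public static int Count(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/filesystem",
            "/filesystem/drive",
            "/filesystem/drive/folder",
            "/filesystem/drive/folder/folder/folder",
            "/filesystem/drive//folder",
            "/filesystem//file"
        };
        foreach (string xpath in expressions)
        {
            Console.WriteLine(xpath + " selects " + Count(xpath) + " node(s)");
        }
    }
}
```

Comparing the counts for /filesystem/drive/folder and /filesystem/drive//folder shows the difference between the child step and the descendant step on the same data.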

Accessing an Element’s Text Value If an element has a text value (such as the <file> element), you can access it using the text() node test. For example, to return the contents of the <file> elements, we could use: /filesystem/drive//file/text()

Accessing an Element’s Text Value - Example /filesystem/drive//file/text()

Accessing an Element’s Attribute Value To access an attribute value for all elements matching a particular XPath expression, use the following syntax: xpathExpression/@attributeName So, to access the values of the name attribute in the <folder> elements that are children of the <drive> element, you would use: /filesystem/drive/folder/@name

Accessing Element Attribute Values - Example /filesystem/drive/folder/@name

Example What if you wanted to retrieve the names of subdirectories? That is, you wanted to get the name attribute for all <folder> elements that were not children of the <drive> elements? What XPath expression would you use??? /filesystem/drive/folder//folder/@name

Filtering Imagine that you wanted to return only those folders that contain files. Would the following XPath work? /filesystem/drive//folder/file No! Because the above would return <file> elements. If you want to return folder elements, filtered to only those that contain files, you can use the following syntax: /filesystem/drive//folder[file]

Filtering Example /filesystem/drive//folder[file]

Filtering Similarly, you can return only elements that contain a certain attribute by using: elementName[@attributeName]
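The text(), @attributeName, and filter syntax from the last few slides can be checked the same way. Again a sketch under assumptions: SelectionValues and Values are made-up names, the file system document is embedded as a string, and each selected node is reduced to its Value, which is the text for text nodes and the attribute value for attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class SelectionValues
{
    // The file system document from the slides, compacted onto fewer lines.
    public const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'>" +
        "<folder name='Quake'>" +
        "<folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file>" +
        "</folder>" +
        "</folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'>" +
        "<folder name='Backup'>" +
        "<file>2003-06-01.bak</file><file>2003-06-07.bak</file>" +
        "</folder>" +
        "</drive>" +
        "</filesystem>";

    // Returns the Value of every text or attribute node the expression selects.
    public static List<string> Values(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        var values = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            values.Add(node.Value);
        }
        return values;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/filesystem/drive//file/text()",         // text of every file
            "/filesystem/drive/folder/@name",         // names of top-level folders
            "/filesystem/drive/folder//folder/@name", // names of subdirectories
            "/filesystem/drive//folder[file]/@name"   // names of folders that contain files
        };
        foreach (string xpath in expressions)
        {
            Console.WriteLine(xpath + ": " + string.Join(", ", Values(xpath).ToArray()));
        }
    }
}
```

The last expression appends /@name to the filter so that the result is a list of names; without it, the filter returns the <folder> elements themselves, whose Value is null.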

XPath Components - Predicates Realize that when using steps, all matching elements are returned. From the file system example, /filesystem/drive/folder will return all four <folder> elements (Program Files, Games, Windows, and Backup). Predicates allow you to return only those elements that meet a certain set of criteria. Predicate syntax: [boolean expression]

XPath Components - Predicates For example, to return all <folder> elements with the name attribute equal to Games, we could use: /filesystem/drive//folder[@name="Games"]

Predicate Example /filesystem/drive//folder[@name="Games"]

Predicate Example Predicates can also appear in earlier step expressions, like: /filesystem/drive[@letter="C"]/folder

Predicates A number of operators can be used within predicates: =, !=, <, >, <=, >=, and, or, not(), +, -, div, *, mod Example: to get all of the files in folders named either Windows or Quake, you could do: /filesystem//folder[@name="Quake" or @name="Windows"]/file
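The predicate expressions above can be evaluated against the file system document in the same fashion. This sketch uses hypothetical names (PredicateDemo, Count) and writes the string literals inside the XPath with single quotes, which XPath accepts and which avoids escaping inside C# strings.

```csharp
using System;
using System.Xml;

public static class PredicateDemo
{
    // The file system document from the slides, compacted onto fewer lines.
    public const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'>" +
        "<folder name='Quake'>" +
        "<folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file>" +
        "</folder>" +
        "</folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'>" +
        "<folder name='Backup'>" +
        "<file>2003-06-01.bak</file><file>2003-06-07.bak</file>" +
        "</folder>" +
        "</drive>" +
        "</filesystem>";

    // Returns how many nodes the XPath expression selects in the document.
    public static int Count(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/filesystem/drive//folder[@name='Games']",
            "/filesystem/drive[@letter='C']/folder",
            "/filesystem//folder[@name='Quake' or @name='Windows']/file",
            "/filesystem/drive/folder[@name='Quake']"
        };
        foreach (string xpath in expressions)
        {
            Console.WriteLine(xpath + " selects " + Count(xpath) + " node(s)");
        }
    }
}
```

The last expression uses the child step before the predicate, so it only considers folders directly under a drive; swapping in the // step changes what it can match.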

XPath Components – Predicates - Examples Here are some predicates – what elements would be returned for each? /filesystem/drive/folder[@name="Quake"] NOTHING IS RETURNED! This is because there is no folder that is a child of the <drive> element that has its name attribute equal to Quake.

XPath Components – Predicates - Examples What about the following XPath expression? /filesystem/drive//folder[@name="Quake"]
<folder name="Quake">
  <folder name="SavedGames" />
  <file>Quake.exe</file>
  <file>README.txt</file>
</folder>

XPath Components – Predicates - Examples What about the following XPath expression? /filesystem/drive/folder/@name
name="Program Files"
name="Games"
name="Windows"
name="Backup"

XPath Components – Predicates - Examples What about the following XPath expression? /filesystem/drive//folder[@name="Quake"]/file <file>Quake.exe</file> <file>README.txt</file>

More on XPath There are many more features and much more functionality available with XPath, which we’ll examine in Session 3. For a good tutorial on XPath, see: http://www.w3schools.com/xpath/default.asp.

Navigating through the DOM using XPath The XmlNode class contains two methods for navigating the DOM:
SelectSingleNode(string)
SelectNodes(string)
The string input parameter for both of these methods is an XPath expression. SelectSingleNode() returns at most one node, the first node to match the XPath expression. SelectNodes() returns all of the nodes that match the XPath expression.

An Example The following code displays the titles of books whose price is less than $30.00.
Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim n As XmlNode
For Each n In _
    xmlDoc.SelectNodes("/books/book[@price<30]/title/text()")
    ' Display n.Value
Next

An Example What does the following code output?
Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim n As XmlNode
n = xmlDoc.SelectSingleNode("//author/text()")
' Display n.Value
Answer: the name of the first author found in the XML document.
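A C# counterpart to the two VB.NET examples is sketched below. It is an illustration under assumptions: BookQueries and its methods are hypothetical names, the two-book document from the earlier Example XML File slide is embedded as a string, and the null check reflects the fact that SelectSingleNode() returns null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class BookQueries
{
    // The two-book document from the Example XML File slide.
    public const string BooksXml =
        "<books>" +
        "<book price='34.95'><title>TYASP 3.0</title>" +
        "<authors><author>Mitchell</author></authors></book>" +
        "<book price='29.95'><title>ASP.NET Tips</title>" +
        "<authors><author>Mitchell</author><author>Walther</author>" +
        "<author>Seven</author></authors></book>" +
        "</books>";

    private static XmlDocument LoadBooks()
    {
        var doc = new XmlDocument();
        doc.LoadXml(BooksXml);
        return doc;
    }

    // Titles of books whose price attribute is less than 30.
    public static List<string> CheapTitles()
    {
        var titles = new List<string>();
        foreach (XmlNode n in LoadBooks().SelectNodes("/books/book[@price<30]/title/text()"))
        {
            titles.Add(n.Value);
        }
        return titles;
    }

    // Value of the first node matching the expression, or 'fallback' if none matches.
    public static string FirstValueOrDefault(string xpath, string fallback)
    {
        XmlNode n = LoadBooks().SelectSingleNode(xpath);
        return n == null ? fallback : n.Value;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CheapTitles().ToArray()));
        Console.WriteLine(FirstValueOrDefault("//author/text()", "(none)"));
        Console.WriteLine(FirstValueOrDefault("//publisher/text()", "(none)"));
    }
}
```

The third call asks for a <publisher> element that does not exist in the sample, so the fallback is used instead of dereferencing a null node.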

Summary In this presentation, we saw how to programmatically iterate through XML documents. We examined the differences between the push and pull methods. The pull method uses the DOM, while the push method uses XmlTextReaders and XmlTextWriters.

Summary We studied the syntax of XPath, a technology designed to allow for XML document navigation. We saw how to use the SelectSingleNode() and SelectNodes() methods of the XmlNode class to navigate an XML document. XML document navigation is only possible in the DOM world.

Questions?