Download presentation
Presentation is loading. Please wait.
Published byJanis Hubbard Modified over 9 years ago
1
Copyright © Software Insight 2003 1 Querying and Transforming XML with.NET
2
Copyright © Software Insight 2003 2 Software Consultant –12+ Years programming, engineering, management experience –C#, VB.NET, C++, Java Contributing Editor: –asp.netPRO –C#PRO –Visual Studio Magazine –.NET Magazine Authored –.NET and COM: Working Together (MightyWords press 2001) –Teach Yourself DirectX 7 in 24 Hours, Sams Press (co-author) Participates in the.NET design reviews Speaker at conferences and user groups –INETA Speakers Bureau Contact at www.idesign.net or brian.noyes@idesign.net www.idesign.net About Brian Noyes
3
Copyright © Software Insight 2003 3 Gameplan Quick Intro to XML Technologies Loading XML Query DOM Documents Working with XPathDocuments Transforming XML with XslTransform
4
Copyright © Software Insight 2003 4 XML Technologies Overview XML is the basis for many emerging technologies and capabilities Hype clouds the issue –XML itself is very simple –derivative technologies and use are where the complexity lies XML forms the glue that allows systems to interoperate
5
Copyright © Software Insight 2003 5 XML Uses XML is used extensively for –Data definition –B2B communication –Messaging technologies –Structured storage of data
6
Copyright © Software Insight 2003 6 XML Derivative Technologies Many derivative technologies and grammars continue to be defined and evolve: –XML Schema 1.0 –XPath 1.0 –XSLT 1.0 –XPointer –XLink –SOAP –RSS –WebDAV –…
7
Copyright © Software Insight 2003 7 What is XML? eXtensible Markup Language Simple syntax for data –Self describing markup –Strict but simple rules define a document –Human readable text –Derived from SGML As is HTML
8
Copyright © Software Insight 2003 8 XML Syntax Rules –Case sensitive –A document is composed of nodes Everything in a document is a node –A document has one and only one root node That root node is an element –Elements have both start and end tags –Elements are properly nested –Elements can contain attributes, child elements, or text –Special node definitions for specific purposes: Comments, CDATA, Processing Instructions, Entities
9
Copyright © Software Insight 2003 9 Sample XML Document Going Under Going Under Bring Me To Life Bring Me To Life Everybody's Fool Everybody's Fool
10
Copyright © Software Insight 2003 10 XML Namespaces Used to scope named objects in a document Similar to programming language namespaces Consist of an arbitrary string Should be unique to ensure no name clashes with other XML documents –Two conventions exist for namespaces to try to ensure uniqueness: URL: Used because domain names are guaranteed to be unique in the Internet space xmlns=“http://www.idesign.net/DataAccess” URN: arbitrary naming with specific syntax xmlns=“urn:idesign-net”
11
Copyright © Software Insight 2003 11 XML Namespaces Namespaces can be declared at document or element scope –Child nodes are scoped to the default namespace of their parent node Prefixes are used to allow scoping to multiple namespaces with shorthand notation <Root xmlns="http://www.idesign.net/DataAccess" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:foo="urn:yadda-foobar-skwee"> <!-- Unscoped elements are assumed to be defined in http://www.idesign.net/DataAccess namespace --> <!-- Now unscoped elements are assumed to be defined in urn:idesign-net namespace --> DeDoDoDoDeDaDaDa
12
Copyright © Software Insight 2003 12 Commonly Used XML Grammars in.NET XML Schemas –Used to define the structure / type system of an XML document XPath –Used for querying XML documents XSLT –Used to transform XML into other document formats
13
Copyright © Software Insight 2003 13 XML Schemas XML Schemas define: –Allowable structure of an XML document What elements and attributes are allowed where and in what order –The type system of the XML document The elements and attributes represent data, so the organization of those nodes and the kind of data contained in them defines the type system of the document XML Documents are instances of a particular schema
14
Copyright © Software Insight 2003 14 XML Schemas Three generations to contend with: –Document Type Definitions (DTDs) –XML Data Reduced (XDR) –XML Schema 1.0 Definition Language (XSD)
15
Copyright © Software Insight 2003 15 Document Type Definitions Structural definition of an XML document Not XML itself Limited type definition capability Quickly becoming obsolete <!DOCTYPE Music [ ]>
16
Copyright © Software Insight 2003 16 XML Data Reduced (XDR) XML Schema Definition Language (XDR) –Early attempt by Microsoft to standardize an XML-based schema language based on the W3C Schema Recommendation before W3C approval –Not as expressive as XML Schema 1.0 <Schema xmlns="urn:idesign.net-music" xmlns:dt="urn:schemas-microsoft-com:datatypes">
17
Copyright © Software Insight 2003 17 XML Schema 1.0 (XSD) The formal recommendation approved by W3C Includes: –Primitive type definitions –Simple and complex type definitions –Element and attribute declarations –Namespace scoping –Inheritance, includes, imports The default schema mechanism used by.NET
18
Copyright © Software Insight 2003 18 XML Schema 1.0 Example
19
Copyright © Software Insight 2003 19 XPath Overview Query Syntax for XML –Can think of XPath as the SQL for XML –Provides common syntax and semantics for location specifications in a document for both XSLT and XPointer –Navigates the hierarchical structure of a document –Represents a document as a tree of nodes
20
Copyright © Software Insight 2003 20 XPath Expressions Evaluate to a node set, boolean, number, or string Expressions are evaluated based on a current context or position within a document –Kind of like a cursor in a database Not XML themselves
21
Copyright © Software Insight 2003 21 XPath Expression Composition XPath Expressions are composed of location steps –Each location step has Axis (optional/implied) Node Test Predicate –Results from one location step are used as the context node for the next location step –Location steps are separated by slashes locstep1/locstep2/locstep3/…
22
Copyright © Software Insight 2003 22 XPath Expression Composition Axes to define a relative path –descendant::, child::, parent::, etc. –Implicit child:: if not provided –Shorthand operators for common axes: / - root // - descendant. – current node.. – parent node Node Test defines a node to match against –Element, attribute, built-in function Predicate defines conditional logic to determine matches –Can define multiple predicates Implicit AND of separate predicates –Can define complex predicates Boolean logic, built-in functions, extension functions
23
Copyright © Software Insight 2003 23 XPath and Namespaces Nodes referred to in an XPath expression are frequently scoped to a namespace Up to the processor to decide what to do about it –Usually will only match against properly scoped node names with namespaces Can include namespace prefixes in node names to scope to their namespace –Processor has to be made aware of the namespace URI associated with the prefix somehow Use XmlNamespaceManager in.NET to create collection of prefix/URI associations in the NameTable of the processor object
24
Copyright © Software Insight 2003 24 XPath Examples Examples: Music/Artist Music/Artist/@name /Music/Artist/Album[@name=‘Fallen’] descendant::Album[@name='Fallen'] descendant::Album/@name /Music/child::Artist descendant::*[parent::Album] descendant::*[parent::Album][@number = 2] descendant::*[parent::Album][@number = 2][contains(text(),'Life')] descendant::*[parent::myns:Album][@number = 2]
25
Copyright © Software Insight 2003 25 XSLT Overview Extensible Stylesheet Language Transformations Defines transformations for XSL –Conversions from one document structure to another –Can be used to output document formats suitable for presentation (i.e. HTML, RTF, PDF, etc.) XSLT transforms source trees into result trees –Uses pattern associations specified in XPath –Output determined by templates Fragments of markup that are output based on the matching patterns encountered in the source document match attributes use XPath expressions to evaluate source document node matches –Based on the current context in the document –Selection paths are relative to the current node Really a programming language unto itself
26
Copyright © Software Insight 2003 26 XSLT Constructs XSLT instructions –Declare templates –Apply templates –Call templates –Declare/pass arguments/parameters –Get the value of a matched node –Copy a node –Evaluate functions –Control program flow for-each, if, choose-when-otherwise –Emit nodes Elements, attributes, etc.
27
Copyright © Software Insight 2003 27 XSLT Example Artists
28
Copyright © Software Insight 2003 28 Managing XML in.NET Lots of support for XML in.NET –DOM Support XmlDocument –Data-Oriented Support DataSet / XmlDataDocument –XML Navigation Support XPathNavigator / XPathDocument –Streaming Support XmlReader / XmlWriter –Serialization Support XmlSerialization SOAP formatted object serialization –XML Web Services
29
Copyright © Software Insight 2003 29 Loading XML Document Object Model (DOM) Oriented –XmlDocument.Load() Load from disk or stream –XmlDocument.LoadXml() Load from a string representation of XML document Data Oriented –DataSet.ReadXml() Loads XML that is suitable for representation as a DataSet Will infer a schema if one is not present –XmlDataDocument constructor Takes a DataSet and provides an XmlDocument derived class
30
Copyright © Software Insight 2003 30 Loading XML XPath Oriented –XPathDocument() constructor Load from stream, uri, reader XSLT Oriented –XslTransform.Load() Load from URL Stream Oriented –Stream + XmlTextReader Read only/forward only node access from a stream –Document + XmlNodeReader Read only/forward only node access from a document
31
Copyright © Software Insight 2003 31 XmlResolver Handles all aspects of resolving external entities referred to by an XML Document –DTDs, Entities, schemas, stylesheets Handles security –Logs you into remote locations Handles negotiating network protocols to access entities Optional argument to most XML load operations
32
Copyright © Software Insight 2003 32 Performing DOM Queries XmlNode class methods –SelectNodes() –SelectSingleNode() XmlDocument class methods –GetElementsByTagName() –GetElementById()
33
Copyright © Software Insight 2003 33 XmlNode.SelectNodes() Easiest access queries when working with DOM documents Takes an XPath expression Optionally takes an XmlNamespaceManager to support namespace prefixed node names Returns an XmlNodeList –Iterate through the results and operate on them accordingly Provides Read/Write access to nodes in the DOM –Abstract base class – actually an XPathNodeList instance Uses an XPathNavigator under the covers –Derived DocumentXPathNavigator object
34
Copyright © Software Insight 2003 34 XmlNode.SelectNodes() // Load the document XmlDocument doc = new XmlDocument(); doc.Load("C:\\demos\\MusicBase.xml"); // Call SelectNodes with a valid XPath Query XmlNodeList nodes = doc.SelectNodes( "descendant::Track[@number=2]/text()"); foreach (XmlNode node in nodes) { Console.WriteLine(node.Value); }
35
Copyright © Software Insight 2003 35 XmlNode.SelectSingleNode() Used to obtain a single match for a query Actually just calls SelectNodes and returns first node in list –Not very efficient, query could return a large result set –Just a convenience method Improving performance: use smarter queries –Construct an XPath query that will only return a single node Smart knowledge of the target data [postion() = 1] predicate
36
Copyright © Software Insight 2003 36 XmlNode.SelectSingleNode() // Load the document XmlDocument doc = new XmlDocument(); doc.Load("C:\\demos\\MusicBase.xml"); // Select a single node XmlNode dumb = doc.SelectSingleNode( "descendant::Track[@number=2]"); XmlNode smart = doc.SelectSingleNode( "descendant::Track[@number=2][position()=1]"); Console.WriteLine(dumb.InnerText); Console.WriteLine(smart.InnerText);
37
Copyright © Software Insight 2003 37 Other DOM Query Options XmlDocument.GetElementsByTagName() –Convenience method for getting collections of elements –Uses internal XmlDocument maintained element tables to obtain –Does not allow you to control level of element selected in heirarchy XmlDocument.GetElementByID() –Takes an ID value and returns the element whose ID attribute matches –Only works with DTD schema definition associated with document –Not very useful
38
Copyright © Software Insight 2003 38 Beware of XmlNodeList.Count XmlNodeList is an abstract base class XPathNodeList is returned from SelectNodes() –Very lightweight list implementation You may be tempted to check the Count property to see how many nodes you got –Triggers an extremely inefficient walk and test of every node in the node set to calculate the count
39
Copyright © Software Insight 2003 39 XPathNodeList.Count public virtual int get_Count() { if (!(this.done)) this.ReadUntil(2147483647); return this.list.Count; } internal int ReadUntil(int index) { int local0; XmlNode local1; local0 = this.list.Count; while (!(this.done) && index >= local0) { if (this.iterator.MoveNext()) { local1 = this.GetNode(this.iterator.Current); if (local1 == null) continue; this.list.Add(local1); local0++; continue; } this.done = 1; break; } return local0; }
40
Copyright © Software Insight 2003 40 XmlDataDocument Bridges the XML and relational views of data –Easy access to XML functionality on the data Queries, transformation, validation –Easy access to relational functionality with the data.NET Databinding Derives from XmlDocument –“Is a” XmlDocument –Looks like a DataSet when needed through a property Everything covered about XmlDocument applies –Its just a derived class –Has a specialized XPathNavigator implementation
41
Copyright © Software Insight 2003 41 Enter the.NET XPath Processing Engine.NET provides a new model for working with XML for navigation and querying Other models good for what they were designed for –DOM is heavyweight, read/write, random access, handles all XML node types Rich access to complex object model –XmlReader is lightweight, read-only, forward-only, handles all XML Node Types Fast serialized access for getting data in memory Need something that gives the best of both worlds
42
Copyright © Software Insight 2003 42 XPathNavigator Abstract base class used to define XML navigation and query API for XML in.NET –Document classes that support implement IXPathNavigable Lightweight object model –Hierarchical tree based –Restricted to just elements, attributes, namespaces Name and value Read-Only –Controlled through the API defined by the XPathNavigator methods Random-Access –Can move forward or backward in document, traverse up or down levels easily
43
Copyright © Software Insight 2003 43 XPathDocument Document class for loading XML into memory for an XPathNavigator to operate against Constructor takes stream, URL, or reader Loads XML into memory in a much lighter weight object model than XmlDocument Used to construct an XPathNavigator through CreateNavigator() call –Actually an XPathDocumentNavigator instance
44
Copyright © Software Insight 2003 44 XPath Objects
45
Copyright © Software Insight 2003 45 XPathNavigator Methods MoveXXX Methods –Allows forward/backward navigation amongst the nodes SelectXXX Methods –Select() – primary XPath query evaluation method, returns iterator for node set results –SelectAncenstors()/SelectChildren()/SelectDescendants() Easier selection along particular axes Others –Compile() – Pre-compile XPathExpression for rapid repeated execution –Evaluate() – Similar to Select(), but works for queries that do not return node sets (string, bool, number) –Clone() – Get a copy of the iterator for navigating child nodes –GetAttribute()/GetNamespace() – access node info –Positional checks
46
Copyright © Software Insight 2003 46 XPathNavigator Properties NodeType Name – fully qualified LocalName – no namespace NamespaceURIValue Prefix – namespace prefix NameTableHasAttributesHasChildrenEtc.
47
Copyright © Software Insight 2003 47 XPathExpressions Class to support declaring and compiling XPath Queries Not created directly, returned from XPathNavigator.Compile() Only way to associate namespace information with a query
48
Copyright © Software Insight 2003 48 XPathNodeIterators What gets returned from a call to Select() Lightweight iterator on the resulting node-set Access the current node with Current property Loop through with MoveNext() Count property is pre-calculated
49
Copyright © Software Insight 2003 49 Querying with XPathNavigator Load the XML into a document class Get an XPathNavigator for the document Optionally compile an XPath expression for reuse Call Select() to get an XPathNodeIterator or Call Evaluate() to get a value back –Returned as an object, must cast to the expected type –Can also return node set iterators this way too
50
Copyright © Software Insight 2003 50 Simple XPathNavigator Query // Load the document object XPathDocument doc = new XPathDocument("C:\\demos\\MusicBase.xml"); // Get a navigator for the document XPathNavigator nav = doc.CreateNavigator(); // Perform the query XPathNodeIterator iter = nav.Select("descendant::Track[@number=2]"); // Move through the results with the iterator while (iter.MoveNext()) { Console.WriteLine(iter.Current.Value); }
51
Copyright © Software Insight 2003 51 Working with Namespaces Create an XmlNamespaceManager using the NameTable of the navigator Add namespace prefix/URI combos Compile the XPath expression Set the XPathExpression context
52
Copyright © Software Insight 2003 52 Working with Namespace // Load the document XPathDocument doc = new XPathDocument("C:\\demos\\MusicDefNs.xml"); // Get a navigator for the document XPathNavigator nav = doc.CreateNavigator(); // Compile the query with namespace prefixes XPathExpression exp = nav.Compile("descendant::foo:Track[@number=2]"); // Create the namespace manager XmlNamespaceManager mgr = new XmlNamespaceManager(nav.NameTable); // Add an alias for the default namespace mgr.AddNamespace("foo","urn:foo"); // Set the context on the expression exp.SetContext(mgr); // Perform the query XPathNodeIterator iter = nav.Select(exp); // Loop through the results while (iter.MoveNext()) { Console.WriteLine(iter.Current.Value); }
53
Copyright © Software Insight 2003 53 Advanced Topics Sorting the node set –Use the AddSort methods of the XPathExpression object Specify a sort order or a comparer object Creating custom navigator objects –Can implement IXPathNavigable on any object that you want to expose with an XML representation –Need to implement all the move/select/etc. functionality yourself
54
Copyright © Software Insight 2003 54 Transforming XML Simple conceptual process –Take one document –Apply rules and matching to parts of the document –Produce a new document form that is equivalent or a subset or different representation of the source document XML -> HTML XML Schema A -> XML Schema B XML -> RTF, PDF, text, etc.
55
Copyright © Software Insight 2003 55 XslTransform Class XslTransform class provides a powerful XSLT processing engine for.NET –Load from multiple kinds of sources XPath documents, XML readers, URL, File –Transforms to multiple output targets File, URL, Stream, Text/XML Writers –Can take argument lists –Can process embedded scripts C#, Visual Basic, Jscript –Can pass managed objects to the template and allow their methods to be called
56
Copyright © Software Insight 2003 56 Simple XSLT Transformations Load the XSLT file into an XslTransform instance Call Transform(), passing the source document and the output target // Load the transform document XslTransform xslt = new XslTransform(); xslt.Load("C:\\demos\\MusicTransform.xslt"); // Tranform the source xslt.Transform("C:\\demos\\MusicBase.xml", "C:\\demos\\Music.html",new XmlUrlResolver());
57
Copyright © Software Insight 2003 57 Transforming with Arguments XSLT templates can take parameters that can be declared inline or can be passed in when the template is called XslTransform supports passing in arguments from the calling code –Must comply with the allowable XSLT types: bool, string, number, XPathNodeIterator Process: –Construct an XslArgumentList –Add parameters to it with the AddParams method –Associate it with the transform in the call to Transform()
58
Copyright © Software Insight 2003 58 Working with Arguments // Load the transform document XslTransform xslt = new XslTransform(); xslt.Load("C:\\demos\\MusicTransform.xslt"); // Construct the argument list XsltArgumentList xargs = new XsltArgumentList(); xargs.AddParam("ShowTracks",String.Empty,false); // Construct the document to contain the source XPathDocument doc = new XPathDocument("C:\\demos\\Music.xml"); // Construct the output stream FileStream fso = File.OpenWrite("C:\\demos\\Music.html"); // Tranform the source xslt.Transform(doc,xargs,fso,new XmlUrlResolver()); fso.Close();
59
Copyright © Software Insight 2003 59 XSLT Extensibility –Allows jscript or vbscipt to be declared as part of the template <msxsl:script> –Allows embedded.NET language code –C#, VB.NET, Jscript Extension Objects –Can call methods in managed objects from within the XSLT template –Powerful mechanism for extensibility –Better maintainability and reuse –All of.NET Framework capability available to you
60
Copyright © Software Insight 2003 60 Extension Objects // Load the transform document XslTransform xslt = new XslTransform(); xslt.Load("C:\\demos\\ExtensionObject.xslt"); // Create the extension object DocGenerator dg = new DocGenerator(); // Construct the argument list XsltArgumentList xargs = new XsltArgumentList(); xargs.AddExtensionObject("urn:idesign-net", dg); // Construct the document to contain the source XPathDocument doc = new XPathDocument("C:\\demos\\Empty.xml"); // Tranform the source xslt.Transform(doc,xargs,fso,new XmlUrlResolver());
61
Copyright © Software Insight 2003 61 Extension Objects public class DocGenerator { public XPathNodeIterator GetDoc() { XmlDocument doc = new XmlDocument(); doc.LoadXml(" Fred ” + “ Alberta "); XPathNavigator nav = doc.CreateNavigator(); return nav.Select("*"); } <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:idesign="urn:idesign-net"> We found an employees container node Name:
62
Copyright © Software Insight 2003 62 Summary Prefer XPathDocument/XPathNavigator when you don’t need to modify nodes Use Extension objects to offload transform computation Resources: –Applied XML Programming for Microsoft.NET, Dino Esposito, Microsoft Press –Manipulate XML Data Easily with the XPath and XSLT APIs in the.NET Framework, Dino Esposito, MSDN Magazine, July 2003 http://www.msdn.microsoft.com/msdnmag/issues/03/07/XPathandXSLT/defa ult.aspx http://www.msdn.microsoft.com/msdnmag/issues/03/07/XPathandXSLT/defa ult.aspx –XSL Transformations: XSLT Alleviates XML Schema Incompatibility Headaches, Don Box, Aaron Skonnard, John Lam, MSDN Magazine, August 2000 http://www.msdn.microsoft.com/msdnmag/issues/0800/XSLT/
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.