Programming with XML Written by: Adam Carmi Zvika Gutterman.

XML2 Agenda
– About XML
– Review of XML syntax
– Document Object Model (DOM)
– JAXP
– W3C XML Schema
– Validating Parsers

XML3 About XML
XML – EXtensible Markup Language
Designed to describe data
– Provides semantic and structural information
– Extensible
Human readable and computer-manipulable
Software and hardware independent
Open and standardized by the W3C (World Wide Web Consortium, founded in 1994 by Tim Berners-Lee)
Ideal for data exchange

XML4 offenders.xml
[Listing of offenders.xml with callouts for Comment, Tag and Character Data. Surviving text: David Reuven Harel :32:00 Ran a red light at Arik & Bentz st.]
Information is marked up with structural and semantic information.
The characters &, <, >, ' and " are reserved and should not appear literally in character data. Use &amp;, &lt;, &gt;, &apos; and &quot; instead.
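The escaping rule can be checked with a small sketch like the one below. It is an illustration only: the class name EntityDemo, the helper rootText and the element name description are assumptions, not names from the slides. The parser is handed the escaped form and returns the unescaped character data; an unescaped & makes the document ill-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class EntityDemo {
    /** Parses an XML string and returns the text content of its root element. */
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Ill-formed input (for example a bare '&') ends up here.
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // "description" is a hypothetical element name used only for this sketch.
        String xml = "<description>Ran a red light at Arik &amp; Benz st.</description>";
        // The file holds the escaped form; the application sees the plain character.
        System.out.println(rootText(xml));
    }
}
```

The same helper rejects `<description>Arik & Benz</description>`, because the bare ampersand is read as the start of an entity reference.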

XML5 offenders.xml: Tags
[Listing of offenders.xml with callouts for Start Tag, End Tag, Root Tag, and an empty-element tag shown as shorthand for a start tag immediately followed by its end tag.]
XML tags are not predefined and are case sensitive.
An XML document may have only one root tag.

XML6 offenders.xml: Elements
[Listing of offenders.xml with a callout for the Root Element.]
Elements mark up information.
Element x begins with a start-tag <x> and ends with an end-tag </x>.
XML elements must be properly nested.
XML documents must contain exactly one root element.

XML7 offenders.xml: Content
[Listing of offenders.xml with its whitespace characters made visible.]
The content of an element is all the text that lies between its start and end tags.
An XML parser is required to pass all characters in a document that are not markup, including whitespace characters, to the application.
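A minimal sketch of what whitespace preservation means for a DOM client: the line breaks and indentation between tags show up as Text nodes. The class name WhitespaceDemo and the helper textChildren are hypothetical, and the parser is the default non-validating JAXP builder.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class WhitespaceDemo {
    /** Counts the direct children of the root element that are Text nodes. */
    static int textChildren(String xml) {
        try {
            NodeList kids = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getChildNodes();
            int count = 0;
            for (int i = 0; i < kids.getLength(); i++) {
                if (kids.item(i).getNodeType() == Node.TEXT_NODE) {
                    count++;
                }
            }
            return count;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String pretty = "<offender>\n  <firstName>David</firstName>\n</offender>";
        String compact = "<offender><firstName>David</firstName></offender>";
        // The indented form carries whitespace-only Text nodes around firstName.
        System.out.println("pretty:  " + textChildren(pretty));
        System.out.println("compact: " + textChildren(compact));
    }
}
```

This is why the DOM tree slides show several whitespace-only Text nodes next to the element nodes.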

XML8 offenders.xml: Attributes
[Listing of offenders.xml with its attributes highlighted.]
Attributes are used to provide additional information about elements.
Attribute values must always be enclosed in quotes (" or ').

XML9 DOM™
DOM™ – Document Object Model
A standard hierarchy of objects, recommended by the W3C, that corresponds to XML documents.
Each element, attribute, comment, etc., in an XML document is represented by a Node in the DOM tree.
The DOM API (Application Programming Interface) allows data in an XML document to be accessed and modified by manipulating the nodes in a DOM tree.

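XML10 DOM Class Hierarchy (partial)
[Class diagram of the interfaces Node, Document, Element, Attr, CharacterData, Text, Comment, NodeList and NamedNodeMap.]
In org.w3c.dom, Document, Element, Attr and CharacterData extend Node; Text and Comment extend CharacterData; NodeList and NamedNodeMap are separate collection interfaces.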

XML11 offenders.xml: DOM tree
[Tree diagram. Nodes shown: Document; Element offenders; Comment "Lists all traffic offenders"; Element offender with Attribute id; Element firstName with Text "David"; several whitespace-only Text nodes.]

XML12 Example: offenders DOM
[Tree diagram, continued below offenders/offender. Nodes shown: Element lastName with Text "Harel"; Element violation with Attribute id; Element code with Attributes num and category; Element issueDate with a Text child; attribute values 12, 232 and traffic; several whitespace-only Text nodes.]
The element "middleName" was skipped.

XML13 Example: offenders DOM
[Tree diagram, continued below offenders/offender/violation. Nodes shown: Element issueTime with Text "10:32:00"; Text "Ran a red light at Arik & Benz st."; several whitespace-only Text nodes.]

XML14 JAXP
JAXP – Java™ API for XML Processing
JAXP enables applications to parse and transform XML documents using an API that is independent of a particular XML processor implementation.
JAXP provides two parser types:
– SAX (Simple API for XML) parser: event driven
– DOM document builder: constructs DOM trees by parsing XML documents

XML15 The Simple API for XML (SAX) APIs

XML16 The Document Object Model (DOM) APIs

XML17 Creating a DOM Builder
1. Create a DocumentBuilderFactory object:
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
2. Configure the factory object:
    dbf.setIgnoringComments(true);
3. Create a builder instance using the factory:
    DocumentBuilder docBuilder = dbf.newDocumentBuilder();
A ParserConfigurationException is thrown if a DocumentBuilder that satisfies the requested configuration cannot be created.

XML18 Building a DOM Document
A DOM document can be built manually from within the application:
    Document doc = docBuilder.newDocument();
    Element offenders = doc.createElement("offenders");
    doc.appendChild(offenders);
    Element offender = doc.createElement("offender");
    offender.setAttribute("id", " ");
    offenders.appendChild(offender);
    Element firstName = doc.createElement("firstName");
    Text text = doc.createTextNode(" David ");
    firstName.appendChild(text);
    ...
A DOMException is raised if an illegal character appears in a name, an illegal child is appended to a node, etc.
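As a runnable variation on the fragment above, the following sketch builds the same small tree, serializes it with the JAXP identity Transformer, and shows the DOMException for an illegal name. The class BuildDomDemo, its helper names and the id value "o-1" are assumptions made for the example; the slide's own id value is not preserved.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class BuildDomDemo {
    /** Builds offenders/offender/firstName by hand. */
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element offenders = doc.createElement("offenders");
            doc.appendChild(offenders);
            Element offender = doc.createElement("offender");
            offender.setAttribute("id", "o-1"); // hypothetical id value
            offenders.appendChild(offender);
            Element firstName = doc.createElement("firstName");
            firstName.appendChild(doc.createTextNode("David"));
            offender.appendChild(firstName);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes a DOM tree to text using an identity transformation. */
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if a name containing a space is rejected with a DOMException. */
    static boolean rejectsIllegalName(Document doc) {
        try {
            doc.createElement("first name");
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.INVALID_CHARACTER_ERR;
        }
    }

    public static void main(String[] args) {
        Document doc = build();
        System.out.println(serialize(doc));
        System.out.println("illegal name rejected: " + rejectsIllegalName(doc));
    }
}
```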

XML19 Building a DOM Document
A DOM tree representation of an XML document can be built automatically by parsing the XML document:
    Document doc = docBuilder.parse(new File(xmlFile));
A SAXParseException or SAXException is raised to report parse errors.
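A short sketch of how a parse error surfaces, assuming the input is given as a string rather than a file. ParseErrorDemo and firstErrorLine are hypothetical names. The SAXParseException carries the position of the problem, which is what DumpDom prints later in these slides.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ParseErrorDemo {
    /** Returns the line of the first well-formedness error, or -1 if the text parses cleanly. */
    static int firstErrorLine(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return -1;
        } catch (SAXParseException spe) {
            // Fatal (well-formedness) errors are thrown by the parser.
            return spe.getLineNumber();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The <offender> start tag is never closed.
        String broken = "<offenders>\n<offender>\n</offenders>";
        System.out.println("error reported at line " + firstErrorLine(broken));
    }
}
```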

XML20 DumpDom.java (1 of 5)
Creating and traversing a DOM document
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.ParserConfigurationException;
    import java.io.File;
    import java.io.IOException;

XML21 DumpDom.java (2 of 5)
    public class DumpDom {
        private int indent = 0; // text indentation level

        public DumpDom(String xmlFile) {
            try {
                DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                DocumentBuilder docBuilder = dbf.newDocumentBuilder();
                Document doc = docBuilder.parse(new File(xmlFile));
                recursiveDump(doc);
            } catch (ParserConfigurationException pce) {
                System.err.println("Failed to create document builder");
            } catch (SAXParseException spe) {
                System.err.println("Error: Line=" + spe.getLineNumber() + ": " + spe.getMessage());
            } catch (SAXException se) {
                System.err.println("Parse error found: " + se);
            } catch (IOException e) {
                e.printStackTrace();
            }
        } // end of constructor

XML22 DumpDom.java (3 of 5)
        private void recursiveDump(Node node) {
            switch (node.getNodeType()) {
                case Node.DOCUMENT_NODE:
                    dumpNode("document", node);
                    break;
                case Node.COMMENT_NODE:
                    dumpNode("comment", node);
                    break;
                case Node.ATTRIBUTE_NODE:
                    dumpNode("attribute", node);
                    break;
                case Node.TEXT_NODE:
                    dumpNode("text", node);
                    break;

XML23 DumpDom.java (4 of 5)
                case Node.ELEMENT_NODE:
                    dumpNode("element", node);
                    // attributes are not children, so dump them explicitly
                    indent += 2;
                    NamedNodeMap atts = node.getAttributes();
                    for (int i = 0; i < atts.getLength(); ++i)
                        recursiveDump(atts.item(i));
                    indent -= 2;
                    break;
                default:
                    System.err.println("Unknown node: " + node);
                    System.exit(1);
            } // end of switch

            // print children of the input node (if there are any)
            indent += 2;
            for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
                recursiveDump(child);
            }
            indent -= 2;
        } // end of recursiveDump

XML24 DumpDom.java (5 of 5)
        private void dumpNode(String type, Node node) {
            for (int i = 0; i < indent; ++i)
                System.out.print(" ");
            System.out.print("[" + type + "]: ");
            System.out.print(node.getNodeName());
            if (node.getNodeValue() != null)
                System.out.print("=\"" + node.getNodeValue() + "\"");
            System.out.print("\n");
        }

        public final static void main(String[] args) {
            DumpDom dumper = new DumpDom(args[0]);
        }
    } // end of class DumpDom

XML25 DTD - Document Type Definition
A specification for ensuring the validity of XML documents.
The original mechanism, defined as part of the XML specification.
Various Schema proposals: newer mechanisms for describing validation criteria.

XML26 XML Schema
The purpose of an XML Schema is to define a class of XML documents.
An XML document that is syntactically correct is considered well formed. If it also conforms to an XML schema, it is considered valid.
An XML document is not required to have a corresponding Schema.

XML27 XML Schema (cont.)
XML Schema documents are themselves XML documents.
– Can be manipulated as such
– XML Schema is a language with an XML syntax.
An XML document may explicitly reference the schema document that validates it.
Several schema models exist. In this course we will use the W3C XML Schema (a W3C recommendation since 2001).

XML28 W3C XML Schema
...
A W3C XML Schema consists of a schema element and a variety of sub-elements which determine the appearance of elements and their content in instance documents.
Each of the elements (and predefined simple types) in the schema has (by convention) a prefix xsd: which is associated with the W3C XML schema namespace.

XML29 Elements & Attribute Declarations Elements are declared using the element element: Attributes are declared using the attribute element: A pre-defined (simple) type
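Declarations of this kind can be tried out with the javax.xml.validation API. The sketch below is an assumption-laden illustration, not the schema from the slides: the class DeclarationDemo, the choice of xsd:time for issueTime and xsd:integer for num, and the small schema itself are made up for the example. It declares two elements and one attribute with pre-defined simple types and validates one-element documents against them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class DeclarationDemo {
    // Hypothetical schema: one element of a simple type, one element carrying an attribute.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "  <xsd:element name='issueTime' type='xsd:time'/>"
        + "  <xsd:element name='code'>"
        + "    <xsd:complexType>"
        + "      <xsd:attribute name='num' type='xsd:integer'/>"
        + "    </xsd:complexType>"
        + "  </xsd:element>"
        + "</xsd:schema>";

    private static Schema compile(String xsd) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true if the instance document conforms to XSD. */
    static boolean isValid(String xml) {
        try {
            compile(XSD).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the validator throws on the first validity error
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<issueTime>10:32:00</issueTime>"));
        System.out.println(isValid("<issueTime>late</issueTime>"));
        System.out.println(isValid("<code num='232'/>"));
        System.out.println(isValid("<code num='abc'/>"));
    }
}
```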

XML30 Element & Attribute Types
Elements that contain sub-elements or carry attributes are said to have complex types.
Elements that contain only text (e.g. numbers, strings, dates etc.) but do not contain any sub-elements are said to have simple types.
Attributes always have simple types.
Many simple types (e.g. string, date, integer etc.) are pre-defined.

XML31 A Few Built-in Simple Types
Simple Type    Examples
string         any textual value (white space preserved)
NMTOKEN (1)    student, 342, $$
ID (1)         s1, :myId, _4
integer        -1, 0, 1
float          -INF, -1E4, -0, 0, 12.78, 12.78E-2, NaN
time           13:24:12, 02:15:
date
boolean        true, false, 0, 1
(1) Should only be used as attribute types

XML32 Derived Simple Types
New simple types may be defined by deriving them from existing simple types (built-in and derived).
New simple types are derived by restricting the range of permitted values for an existing simple type.
A new simple type is defined using the simpleType element.

XML33 Derived Simple Types (cont.) Example: Numeric Restriction Example: Enumeration
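The two kinds of restriction named on this slide might look as follows. This is a sketch under stated assumptions: the type name Accuracy and its value "accurate" appear elsewhere in the slides, but the second enumeration value "approximate", the type CodeNumber, its 1 to 999 range and the class DerivedTypeDemo are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class DerivedTypeDemo {
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        // Enumeration: only the listed strings are permitted.
        + "  <xsd:simpleType name='Accuracy'>"
        + "    <xsd:restriction base='xsd:string'>"
        + "      <xsd:enumeration value='accurate'/>"
        + "      <xsd:enumeration value='approximate'/>"
        + "    </xsd:restriction>"
        + "  </xsd:simpleType>"
        // Numeric restriction: integers in a closed range (range is hypothetical).
        + "  <xsd:simpleType name='CodeNumber'>"
        + "    <xsd:restriction base='xsd:integer'>"
        + "      <xsd:minInclusive value='1'/>"
        + "      <xsd:maxInclusive value='999'/>"
        + "    </xsd:restriction>"
        + "  </xsd:simpleType>"
        + "  <xsd:element name='accuracy' type='Accuracy'/>"
        + "  <xsd:element name='num' type='CodeNumber'/>"
        + "</xsd:schema>";

    private static Schema compile(String xsd) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true if the instance document conforms to XSD. */
    static boolean isValid(String xml) {
        try {
            compile(XSD).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<num>232</num>"));
        System.out.println(isValid("<num>1000</num>"));
        System.out.println(isValid("<accuracy>accurate</accuracy>"));
        System.out.println(isValid("<accuracy>vague</accuracy>"));
    }
}
```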

XML34 Complex Types
Complex types are defined using the complexType element.
Elements with complex types may carry attributes.
The content of elements with complex types is categorized as follows:
– Empty: no content is allowed.
– Simple: content must be of simple type.
– Element: content must include only child elements.
– Mixed: both element and character content is allowed.

XML35 Complex Types: Attributes
Attributes may be declared, using the use attribute, as required or optional (default).
Default values for attributes are declared using the default attribute.
– Allowed only for optional attributes
The fixed attribute is used to ensure that an attribute is set to a particular value.
– Appearance of the attribute is optional unless use="required" is also given.
– fixed and default are mutually exclusive.

XML36 Complex Types: Attributes (cont.) Example: use, fixed Example: use, default... <xsd:attribute name="accuracy" type="Accuracy" use="optional" default="accurate"/>...
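The effect of use and fixed can be checked with a small validation sketch. The element code and its attributes num and category come from the offenders example, but the attribute types, the use="required" on num and the class AttributeUseDemo are assumptions for illustration. The element has empty content: it carries attributes only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class AttributeUseDemo {
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "  <xsd:element name='code'>"
        + "    <xsd:complexType>"
        // num must be present; category may be omitted but, if present, must be 'traffic'.
        + "      <xsd:attribute name='num' type='xsd:integer' use='required'/>"
        + "      <xsd:attribute name='category' type='xsd:string' fixed='traffic'/>"
        + "    </xsd:complexType>"
        + "  </xsd:element>"
        + "</xsd:schema>";

    private static Schema compile(String xsd) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true if the instance document conforms to XSD. */
    static boolean isValid(String xml) {
        try {
            compile(XSD).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<code num='232'/>"));                    // fixed attribute omitted
        System.out.println(isValid("<code num='232' category='traffic'/>")); // fixed value matches
        System.out.println(isValid("<code num='232' category='parking'/>")); // fixed value violated
        System.out.println(isValid("<code category='traffic'/>"));           // required attribute missing
    }
}
```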

XML37 Complex Types: Empty Content
Example: schema
    <xsd:attribute name="category" type="ViolationCategory" fixed="traffic"/>
Example: instance document

XML38 Complex Types: Simple Content Example: element with no attributes Example: element with attributes <xsd:attribute name="accuracy" type="Accuracy" use="optional" default="accurate"/> Simple type

XML39 Complex Types: Element Content
Element Occurrence Constraints
– The minimum number of times an element may appear is specified by the value of the optional attribute minOccurs.
– The maximum number of times an element may appear is specified by the value of the optional attribute maxOccurs. The value unbounded indicates that the maximum number of occurrences is unbounded.
– The default value of minOccurs and maxOccurs is 1.

XML40 Complex Types: Element Content (cont.)
The element sequence is used to specify a sequence of sub-elements.
– Elements must appear in the same order that they are declared.
    <xsd:element name="middleName" type="xsd:string" minOccurs="0"/>
    <xsd:element name="violation" type="Violation" minOccurs="0" maxOccurs="unbounded"/>
    ...
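A sketch of sequence together with the occurrence constraints, loosely modelled on the offender element. It simplifies the slides' schema: violation is given type xsd:string here instead of the Violation complex type, and the class name SequenceDemo is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SequenceDemo {
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "  <xsd:element name='offender'>"
        + "    <xsd:complexType>"
        + "      <xsd:sequence>"
        + "        <xsd:element name='firstName' type='xsd:string'/>"
        + "        <xsd:element name='middleName' type='xsd:string' minOccurs='0'/>"
        + "        <xsd:element name='lastName' type='xsd:string'/>"
        // Simplified: the slides use a complex type named Violation here.
        + "        <xsd:element name='violation' type='xsd:string'"
        + "                     minOccurs='0' maxOccurs='unbounded'/>"
        + "      </xsd:sequence>"
        + "    </xsd:complexType>"
        + "  </xsd:element>"
        + "</xsd:schema>";

    private static Schema compile(String xsd) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true if the instance document conforms to XSD. */
    static boolean isValid(String xml) {
        try {
            compile(XSD).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // middleName omitted: allowed because minOccurs is 0
        System.out.println(isValid(
            "<offender><firstName>David</firstName><lastName>Harel</lastName></offender>"));
        // wrong order: a sequence fixes the order of its children
        System.out.println(isValid(
            "<offender><lastName>Harel</lastName><firstName>David</firstName></offender>"));
    }
}
```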

XML41 Complex Types: Mixed Content The optional Boolean attribute mixed is used to specify mixed content:...
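To see what the mixed attribute changes, the sketch below validates the same instance against two variants of a small schema that differ only in mixed="true" versus mixed="false". The schema is a cut-down stand-in for the violation element (only issueTime is kept), and MixedDemo and its helpers are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class MixedDemo {
    /** Builds a schema for violation whose content is mixed or element-only. */
    static String xsd(boolean mixed) {
        return "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
             + "  <xsd:element name='violation'>"
             + "    <xsd:complexType mixed='" + mixed + "'>"
             + "      <xsd:sequence>"
             + "        <xsd:element name='issueTime' type='xsd:time'/>"
             + "      </xsd:sequence>"
             + "    </xsd:complexType>"
             + "  </xsd:element>"
             + "</xsd:schema>";
    }

    private static Schema compile(String xsd) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true if xml conforms to the mixed or element-only variant. */
    static boolean isValid(boolean mixed, String xml) {
        try {
            compile(xsd(mixed)).newValidator()
                    .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Character data sits next to a child element.
        String xml = "<violation><issueTime>10:32:00</issueTime>Ran a red light</violation>";
        System.out.println("mixed=true:  " + isValid(true, xml));
        System.out.println("mixed=false: " + isValid(false, xml));
    }
}
```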

XML42 Global Elements/Attributes Global elements and global attributes are created by declarations that appear as the children of the schema element. A global element is allowed to appear as the root element of an instance document. The attribute ref of element/attribute elements may be used (instead of the name attribute) to reference a global element/attribute. Cardinality constraints cannot be placed on global declarations, although they can be placed on local declarations that reference global declarations.

XML43 Global Elements/Attributes (cont.) Example: global declarations... Example: ref attribute

XML44 Anonymous Type Definitions When a type is referenced only once, or contains very few constraints, it can be more succinctly defined as an anonymous type. Saves the overhead of naming the type and explicitly referencing it.

XML45 Anonymous Type Definitions (cont.)
    <xsd:element name="middleName" type="xsd:string" minOccurs="0"/>
    <xsd:element name="violation" type="Violation" minOccurs="0" maxOccurs="unbounded"/>
[Callouts on the listing: "Is this a global declaration?" and "Anonymous".]

XML46 offenders.xsd (1 of 4) <xsd:attribute name="accuracy" type="Accuracy" use="optional" default="accurate"/> <xsd:attribute name="category" type="ViolationCategory" fixed="traffic"/> Schema for offenders XML documents

XML47 offenders.xsd (2 of 4)
    <xsd:element name="middleName" type="xsd:string" minOccurs="0"/>
    <xsd:element name="violation" type="Violation" minOccurs="0" maxOccurs="unbounded"/>

XML48 offenders.xsd (3 of 4)

XML49 offenders.xsd (4 of 4)

XML50 Validating Parsers
A validating parser is capable of reading a Schema specification or DTD and determining whether or not XML documents conform to it.
A non-validating parser is capable of reading a Schema / DTD but cannot check XML documents for conformity.
– Limited to syntax checking

XML51 Creating a Validating DOM Parser
1. Create a DocumentBuilderFactory object:
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
2. Configure the factory object to produce a validating parser:
    dbf.setAttribute(" + "/schemaLanguage", "
    dbf.setAttribute(" + "/schemaSource", new File(xmlSchema));
    dbf.setValidating(true);
3. Create a builder instance and set its error-handler:
    DocumentBuilder docBuilder = dbf.newDocumentBuilder();
    docBuilder.setErrorHandler(new MyErrorHandler());
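The string arguments in the slide's setAttribute calls did not survive in this transcript. The sketch below uses the standard JAXP property names (http://java.sun.com/xml/jaxp/properties/schemaLanguage and .../schemaSource) and the W3C XML Schema namespace URI, on the assumption that these are what the slide showed. It also passes the schema as an InputSource instead of a File and turns on namespace awareness, which schema validation needs. The class ValidatingParserDemo, the inline schema and the collecting error handler are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidatingParserDemo {
    // Standard JAXP property prefix and schema language URI (assumed to match the slide).
    static final String JAXP_PROPS = "http://java.sun.com/xml/jaxp/properties";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical schema: offenders holds any number of offender elements with a required id.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "  <xsd:element name='offenders'>"
        + "    <xsd:complexType>"
        + "      <xsd:sequence>"
        + "        <xsd:element name='offender' minOccurs='0' maxOccurs='unbounded'>"
        + "          <xsd:complexType>"
        + "            <xsd:attribute name='id' type='xsd:string' use='required'/>"
        + "          </xsd:complexType>"
        + "        </xsd:element>"
        + "      </xsd:sequence>"
        + "    </xsd:complexType>"
        + "  </xsd:element>"
        + "</xsd:schema>";

    /** Parses xml with a validating DOM parser and returns the validity errors reported. */
    static List<String> validationErrors(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for W3C XML Schema validation
            dbf.setValidating(true);
            dbf.setAttribute(JAXP_PROPS + "/schemaLanguage", W3C_XML_SCHEMA);
            dbf.setAttribute(JAXP_PROPS + "/schemaSource",
                    new InputSource(new StringReader(XSD)));
            DocumentBuilder builder = dbf.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validationErrors("<offenders><offender id='o-1'/></offenders>"));
        System.out.println(validationErrors("<offenders><offender/></offenders>"));
    }
}
```

Because the handler records validity errors instead of throwing, parsing continues past the first problem and the caller decides what to do with the list.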

XML52 Handling Parsing Errors By default, JAXP parsers do not throw exceptions when documents are found to be invalid. JAXP provides the interface ErrorHandler so that users will be able to implement their own error-handling semantics.

XML53 BoundedErrorPrinter.java (1 of 3)
    import org.xml.sax.ErrorHandler;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;

    /**
     * An error handler that prints to the standard error stream a specified
     * number of errors. Once the specified number of errors is detected,
     * parsing is aborted.
     */
    public class BoundedErrorPrinter implements ErrorHandler {
        private int errorCount = 0;
        private int errorsToPrint;

        public BoundedErrorPrinter(int errorsToPrint) {
            this.errorsToPrint = errorsToPrint;
        }

XML54 BoundedErrorPrinter.java (2 of 3)
        public void warning(SAXParseException spe) throws SAXException {
            System.err.println("Warning: " + getParseExceptionInfo(spe));
        }

        public void error(SAXParseException spe) throws SAXException {
            if (errorCount < errorsToPrint) {
                System.err.println("Error: " + getParseExceptionInfo(spe));
                ++errorCount;
            }
            if (errorCount >= errorsToPrint)
                throw spe; // abort parsing
        }

XML55 BoundedErrorPrinter.java (3 of 3)
        public void fatalError(SAXParseException spe) throws SAXException {
            if (errorCount < errorsToPrint)
                System.err.println("Fatal: " + getParseExceptionInfo(spe));
            throw spe; // fatal errors always abort parsing
        }

        public boolean errorsFound() {
            return errorCount > 0;
        }

        private String getParseExceptionInfo(SAXParseException spe) {
            return "Line = " + spe.getLineNumber() + ": " + spe.getMessage();
        }
    } // end of class BoundedErrorPrinter