XML. Contents  Parsing an XML Document  Validating XML Documents.

Slides:



Advertisements
Similar presentations
The Semantic Web. The Web Today Designed for Human to read Cannot express meaning Architecture: URL –Decentralized: Link structure Language: html.
Advertisements

XML Parsing Using Java APIs AIP Independence project Fall 2010.
14-Jun-15 DOM. SAX and DOM SAX and DOM are standards for XML parsers-- program APIs to read and interpret XML files DOM is a W3C standard SAX is an ad-hoc.
©Silberschatz, Korth and Sudarshan10.1Database System Concepts W3C Activities HTML: is the lingua franca for publishing on the Web XHTML: an XML application.
Xerces The Apache XML Project Yvonne Yao. Introduction Set of libraries that provides functionalities to parse XML documents Set of libraries that provides.
Java API for XML Processing (JAXP) CSE 4/586: Distributed Systems Department of Computer Science and Engineering University at Buffalo, New York Jia Zhao.
Apache DOM Parser©zwzOctober 24, 2002 Wenzhong Zhao Department of Computer Science The University of Kentucky.
Chapter 24 XML. CHAPTER GOALS Understanding XML elements and attributes Understanding the concept of an XML parser Being able to read and write XML documents.
1 CS122B: Projects in Databases and Web Applications Spring 2015 Notes 05: XML Professor Chen Li Department of Computer Science UC Irvine CS122BNotes 05:
CSE 6331 © Leonidas Fegaras XML Tools1 XML Tools Leonidas Fegaras.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Lecture 7 of Advanced Databases XML Querying & Transformation Instructor: Mr.Ahmed Al Astal.
Lecture 6 of Advanced Databases XML Schema, Querying & Transformation Instructor: Mr.Ahmed Al Astal.
1 XML at a neighborhood university near you Innovation 2005 September 16, 2005 Kwok-Bun Yue University of Houston-Clear Lake.
XML for E-commerce III Helena Ahonen-Myka. In this part... n Transforming XML n Traversing XML n Web publishing frameworks.
Java WWW Week 10 Version 2.1 Mar 2008 Slide Java (JSP) and XML  Format of lecture: What is XML? A sample XML file… How to use.
Lecture 6 of Advanced Databases XML Querying & Transformation Instructor: Mr.Eyad Almassri.
17 Apr 2002 XML Programming - DOM Andy Clark. DOM Design Premise Derived from browser document model Defined in IDL – Lowest common denominator programming.
CSE 6331 © Leonidas Fegaras XML Tools1 XML Tools.
Processing of structured documents Spring 2002, Part 2 Helena Ahonen-Myka.
XML Parsers Overview  Types of parsers  Using XML parsers  SAX  DOM  DOM versus SAX  Products  Conclusion.
Working with the XML Document Object Model ©NIITeXtensible Markup Language/Lesson 7/Slide 1 of 44 Objectives In this lesson, you will learn to: *Identify.
Intro to XML Originally Presented by Clifford Lemoine Modified by Box.
Softsmith Infotech XML. Softsmith Infotech XML EXtensible Markup Language XML is a markup language much like HTML Designed to carry data, not to display.
XML Processing in Java. Required tools Sun JDK 1.4, e.g.: JAXP (part of Java Web Services Developer Pack, already in Sun.
Sheet 1XML Technology in E-Commerce 2001Lecture 3 XML Technology in E-Commerce Lecture 3 DOM and SAX.
DOM Programming The Document Object Model standardises  what an application can see of the XML data  how it can access it An XML structure is a tree.
C# and Windows Programming XML Processing. 2 Contents Markup XML DTDs XML Parsers DOM.
CSE 6331 © Leonidas Fegaras XML Tools1 XML Tools.
XML Study-Session: Part III
Document Object Model DOM. Agenda l Introduction to DOM l Java API for XML Parsing (JAXP) l Installation and setup l Steps for DOM parsing l Example –Representing.
CS 157B: Database Management Systems II February 11 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron.
Apache DOM Parser©zwzOctober 24, 2002 Wenzhong Zhao Department of Computer Science The University of Kentucky.
XML and SAX (A quick overview) ● What is XML? ● What are SAX and DOM? ● Using SAX.
XML. DCS – SWC 2 Data vs. Information We often use the terms data and information interchangeably More precisely, data is some ”value” of a certain type,
Schema Data Processing
1 JAXP & XPATH. Objectives 2  XPath  JAXP Processing of XPath  Workshops.
CS 157B: Database Management Systems II February 13 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron.
When we create.rtf document apart from saving the actual info the tool saves additional info like start of a paragraph, bold, size of the font.. Etc. This.
Representing data with XML SE-2030 Dr. Mark L. Hornick 1.
1 Introduction JAXP. Objectives  XML Parser  Parsing and Parsers  JAXP interfaces  Workshops 2.
Computing & Information Sciences Kansas State University Friday, 20 Oct 2006CIS 560: Database System Concepts Lecture 24 of 42 Friday, 20 October 2006.
XML DTD. XML Validation XML with correct syntax is "Well Formed" XML. XML validated against a DTD is "Valid" XML.
Web services. DOM parsing and SOAP.. Summary. ● Exercise: SAX-Based checkInvoice(), ● push parsing, ● event-based parsing, ● traversal order is depth-first.
©Silberschatz, Korth and Sudarshan10.1Database System Concepts W3C - The World Wide Web Consortium W3C - The World Wide Web Consortium.
13-Mar-16 DOM. 2 Difference between SAX and DOM DOM reads the entire XML document into memory and stores it as a tree data structure SAX reads the XML.
USING ANDROID WITH THE DOM. Slide 2 Lecture Summary DOM concepts SAX vs DOM parsers Parsing HTTP results The Android DOM implementation.
XML & JSON. Background XML and JSON are to standard, textual data formats for representing arbitrary data – XML stands for “eXtensible Markup Language”
21-Jun-16 Document Object Model DOM. SAX and DOM SAX and DOM are standards for XML parsers-- program APIs to read and interpret XML files DOM is a W3C.
XML 1.Introduction to XML 2.Document Type Definition (DTD) 3.XML Parser 4.Example: CGI Gateway to XML Middleware.
Week-9 (Lecture-1) XML DTD (Data Type Document): An XML document with correct syntax is called "Well Formed". An XML document validated against a DTD is.
XML Parsers Overview Types of parsers Using XML parsers SAX DOM
Chapter 24 XML.
Java XML IS
Intro to XML.
Java/XML.
{ XML Technologies } BY: DR. M’HAMED MATAOUI
Data Modeling II XML Schema & JAXB Marc Dumontier May 4, 2004
Database Processing with XML
CHAPTER 9 JAVA AND XML.
DOM Document Object Model.
XML Parsers Overview Types of parsers Using XML parsers SAX DOM
Thinking about grammars
DOM 8-Dec-18.
WaysInJavaToParseXML
DOM 24-Feb-19.
XML Parsers.
Thinking about grammars
WaysInJavaToParseXML
XML and Web Services (II/2546)
Presentation transcript:

XML

Contents  Parsing an XML Document  Validating XML Documents

I. Parsing an XML Document Write a Java program to parse an XML document Helvetica 36 Times Roman Times Roman Helvetica

I. Parsing an XML Document To process an XML document, we need to parse it. The XML format allows you to express the structure hierarchy and repeated elements without contortions. The body of the XML document contains the root element, which can contain other elements. An element can contain child elements, text, or both. XML elements can contain attributes.

I. Parsing an XML Document A parser is a program that  reads a file  confirms that the file has the correct format  breaks it up into the constituent elements  lets a programmer access those elements The Java libraries supplies  Tree parsers: The Document Object Model (DOM) that read an XML document into a tree structure  Streaming parser: The Simple API for XML (SAX) that generate events as they read an XML document.

I. Parsing an XML Document Getting a DocumentBuilder object: DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); DocumentBuilder builder = factory.newDocumentBuilder(); Reading a document from a file: File f =... Document doc = builder.parse(f); Alternatively, you can use a URL: URL u =... Document doc = builder.parse(u); You can even specify an arbitrary input stream: InputStream in =... Document doc = builder.parse(in);

The Document object is an in-memory representation of the tree structure of the XML document.  It is composed of objects whose classes implement the Node interface and its various subinterfaces.

You start analyzing the contents of a document by calling the getDocumentElement method. It returns the root element. Element root = doc.getDocumentElement();... then calling getDocumentElement returns the font element.

To get the element's children (which may be subelements, text, comments, or other nodes) NodeList children = root.getChildNodes(); for (int i = 0; i < children.getLength(); i++) { Node child = children.item(i);... }

Be careful ! Helvetica 36 The parser reports five:  The whitespace between and  The name element  The whitespace between and  The size element  The whitespace between and

If you expect only subelements, then you can ignore the whitespace: for (int i = 0; i < children.getLength(); i++) { Node child = children.item(i); if (child instanceof Element) { Element childElement = (Element) child;... } }

Text nodes are the only children, you can use the getFirstChild method without having to traverse another NodeList. Use the getData method to retrieve the string stored in the Text node. for (int i = 0; i < children.getLength(); i++) { Node child = children.item(i); if (child instanceof Element) { Element childElement = (Element) child; Text textNode = (Text) childElement.getFirstChild(); String text = textNode.getData().trim(); if (childElement.getTagName().equals("name")) name = text; else if (childElement.getTagName().equals("size")) size = Integer.parseInt(text); } }

To enumerate the attributes of a node:  call the getAttributes method. It returns a NamedNodeMap object that contains Node objects describing the attributes.  traverse the nodes in a NamedNodeMap in the same way as a NodeList.  call the getNodeName and getNodeValue methods to get the attribute names and values.

NamedNodeMap attributes = element.getAttributes(); for (int i = 0; i < attributes.getLength(); i++) { Node attribute = attributes.item(i); String name = attribute.getNodeName(); String value = attribute.getNodeValue();... } Alternatively, if you know the name of an attribute, you can retrieve the corresponding value directly: String unit = element.getAttribute("unit");

II. Validating XML Documents Helvetica 36 One of the major benefits of an XML parser is that it can automatically verify that a document has the correct structure. Then the parsing becomes much simpler. For example, if you know that the font fragment has passed validation, then you can simply get the two grandchildren, cast them as Text nodes, and get the text data, without any further checking.

To Specify the Document Structure Document Type Definitions (DTD) might contain a rule:  This rule expresses that a font element must always have two children, which are name and size elements.

To Specify the Document Structure The XML Schema language expresses the same constraint as XML Schema can express more sophisticated validation conditions (such as the fact that the size element must contain an integer) than can DTDs.

Document Type Definitions There are several methods for supplying a DTD. You can include a DTD in an XML document like this: more rules... ]>...

Document Type Definitions Supplying a DTD outside an XML document or

How to Use DTDs Read an XML document and show values of specific elements XML document: configDTD.xml Helvetica 36

DTD file: configDTD.dtd

How to Use DTDs