
Presentation transcript:

XML DOM and SAX Parsers By Omar RABI

Introduction to parsers  The word "parser" comes from compilers.  In a compiler, the parser is the module that reads and interprets the source code of the programming language.

Introduction to parsers  In XML, a parser is a software component that sits between the application and the XML files.

Introduction to parsers  It reads a text-formatted XML file or stream and converts it into a document structure that the application can manipulate.

Well-formedness and validity  Well-formed documents respect the syntactic rules.  Valid documents not only respect the syntactic rules but also conform to a structure as described in a DTD.
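
To make the "syntactic rules" concrete, here is a minimal sketch of one of them: every element that is opened must be closed, in the right order. The function name `tagsAreBalanced` is hypothetical, and the check is deliberately a toy. It assumes no comments, CDATA sections, or processing instructions, it covers only a small part of well-formedness, and it says nothing about validity against a DTD.

```javascript
// Toy check: are element tags properly nested and closed?
// Assumption: no comments, CDATA, processing instructions,
// or '>' characters inside attribute values.
function tagsAreBalanced(xml) {
  const tagPattern = /<(\/?)([A-Za-z_][\w.-]*)[^>]*?(\/?)>/g;
  const open = [];                 // stack of currently open element names
  let match;
  while ((match = tagPattern.exec(xml)) !== null) {
    const isClosing = match[1] === "/";
    const name = match[2];
    const isSelfClosing = match[3] === "/";
    if (isClosing) {
      // A closing tag must match the most recently opened element.
      if (open.pop() !== name) return false;
    } else if (!isSelfClosing) {
      open.push(name);
    }
  }
  return open.length === 0;        // everything opened was closed
}

console.log(tagsAreBalanced("<doc><para>Hello</para></doc>")); // true
console.log(tagsAreBalanced("<doc><para>Hello</doc></para>")); // false
```

A real parser enforces many more rules (a single root element, quoted attribute values, legal characters, and so on); a validating parser additionally checks the structure against the DTD.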

Validating vs. Non-validating parsers  Both kinds of parser enforce the syntactic rules.  Only validating parsers know how to validate documents against their DTDs.

Tree-based parsers  These map an XML document into an internal tree structure, and then allow an application to navigate that tree.  Ideal for browsers, editors, XSL processors.
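
The idea of mapping a document into a tree and then navigating it can be sketched as follows. This is an illustration under stated assumptions, not a real DOM implementation: `parseToTree` and the `name`/`children`/`text` fields are hypothetical, and the toy scanner handles only elements and text (no attributes, comments, entities, or self-closing tags).

```javascript
// Toy tree builder: turns a small XML string into nested plain objects.
// Assumption: elements and text only.
function parseToTree(xml) {
  const root = { name: "#document", children: [] };
  const stack = [root];                       // path from the root to the current element
  const tokenPattern = /<\/([^>]+)>|<([^>\/]+)>|([^<]+)/g;
  let m;
  while ((m = tokenPattern.exec(xml)) !== null) {
    const current = stack[stack.length - 1];
    if (m[1] !== undefined) {                 // closing tag: step back up
      stack.pop();
    } else if (m[2] !== undefined) {          // opening tag: add a child and descend
      const element = { name: m[2], children: [] };
      current.children.push(element);
      stack.push(element);
    } else if (m[3].trim() !== "") {          // non-blank text
      current.children.push({ text: m[3].trim() });
    }
  }
  return root.children[0];                    // the document element
}

const tree = parseToTree("<doc><para>Hello, world!</para></doc>");
// Navigate the finished tree: document element, its first child, that child's text.
console.log(tree.name);                         // doc
console.log(tree.children[0].name);             // para
console.log(tree.children[0].children[0].text); // Hello, world!
```

Note that the whole document is held in memory before the application sees any of it, which is what makes free navigation possible.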

Event-based parsers  An event-based API reports parsing events (such as the start and end of elements) directly to the application through callbacks.  The application implements handlers to deal with the different events.

Event-based vs. Tree-based parsers  Tree-based parsers are generally used for small documents.  Event-based parsers are generally used for large documents.

Event-based vs. Tree-based parsers  Applications built on tree-based parsers are generally easier to implement.  Event-based parsers are more complex to work with and demand more effort from the programmer.

What is DOM?  The Document Object Model (DOM) is an application programming interface (API) for HTML and XML documents.  It defines the logical structure of documents and the way a document is accessed and manipulated

Properties of DOM  Programmers can build documents, navigate their structure, and add, modify, or delete elements and content.  Provides a standard programming interface that can be used in a wide variety of environments and applications.  Structural isomorphism: any two DOM implementations used to represent the same document create the same structure model.

DOM Identifies  The interfaces and objects used to represent and manipulate a document.  The semantics of these interfaces and objects - including both behavior and attributes.  The relationships and collaborations among these interfaces and objects.

What DOM is not!!  The Document Object Model is not a binary specification.  The Document Object Model is not a way of persisting objects to XML or HTML.  The Document Object Model does not define "the true inner semantics" of XML or HTML.

What DOM is not!!  The Document Object Model is not a set of data structures, it is an object model that specifies interfaces.  The Document Object Model is not a competitor to the Component Object Model (COM).

DOM at work
<products>
  <product>XML Editor <price>499.00</price></product>
  <product>DTD Editor <price>199.00</price></product>
  <product>XML Book <price>19.99</price></product>
  <product>XML Training <price>699.00</price></product>
</products>

DOM at work

DOM levels: level 0  DOM Level 0 is a mix of Netscape Navigator 3.0 and MS Internet Explorer 3.0 document functionalities.

DOM levels: DOM 1  It contains functionality for document navigation and manipulation, that is, functions for creating, deleting, and changing elements and their attributes.

DOM level 1 limitations  DOM Level 1 does not cover:  A structure model for the internal subset and the external subset.  Validation against a schema.  Control for rendering documents via style sheets.  Access control.  Thread-safety.  Events.

DOM levels: DOM 2  Adds a style sheet object model and defines functionality for manipulating the style information attached to a document.  Enables traversal of the document.  Defines an event model.  Provides support for XML namespaces.

DOM levels: DOM 3  Document loading and saving, as well as content models (such as DTDs and schemas) with document validation support.  Document views and formatting, key events, and event groups.

An Application of DOM <HTML><HEAD> Currency Conversion </HEAD><BODY><CENTER> File: Rate: </FORM> </CENTER></BODY></HTML>

An Application of DOM  The <xml> element defines an XML island.  XML islands are a mechanism for embedding XML in HTML documents.  In this case, the XML island is used to access Internet Explorer's XML parser. The price list is loaded into the island.

An Application of DOM   The “Convert” button in the HTML file calls the JavaScript function convert(), which is the conversion routine.   convert() accepts two parameters, the form and the XML island.

An Application for DOM
// Entry point, called by the "Convert" button: load the price list and convert every price.
function convert(form, xmldocument) {
  var fname = form.fname.value,
      output = form.output,
      rate = form.rate.value;
  output.value = "";
  var document = parse(fname, xmldocument),
      topLevel = document.documentElement;
  searchPrice(topLevel, output, rate);
}

// Load the file synchronously into the XML island and report any parse error.
function parse(uri, xmldocument) {
  xmldocument.async = false;
  xmldocument.load(uri);
  if (xmldocument.parseError.errorCode != 0)
    alert(xmldocument.parseError.reason);
  return xmldocument;
}

// Visit the node and, recursively, all of its children; append each converted price.
function searchPrice(node, output, rate) {
  if (node.nodeType == 1) {   // 1 = element node
    if (node.nodeName == "price")
      output.value += (getText(node) * rate) + "\r";
    var children, i;
    children = node.childNodes;
    for (i = 0; i < children.length; i++)
      searchPrice(children.item(i), output, rate);
  }
}

// Return the text of an element whose first child is a text node.
function getText(node) {
  return node.firstChild.data;
}

An Application of DOM  nodeType is a code representing the type of the object.  parentNode is the parent (if any) of the current Node object.  childNodes is the list of children of the current Node object.  firstChild is the Node's first child.  lastChild is the Node's last child.  previousSibling is the Node immediately preceding the current one.  nextSibling is the Node immediately following the current one.  attributes is the list of attributes, if the current Node has any.
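
To show how these properties relate to one another, the sketch below wires up plain objects that expose the same property names. This is not a DOM implementation: `makeElement` is a hypothetical helper, the child element names are made up for illustration, and `childNodes` is an ordinary array rather than a DOM node list.

```javascript
// Plain-object stand-ins for DOM element nodes (hypothetical helper, not a DOM API).
function makeElement(nodeName, children) {
  const node = {
    nodeType: 1,                                   // 1 = element node
    nodeName,
    parentNode: null,
    childNodes: children,
    firstChild: children[0] || null,
    lastChild: children[children.length - 1] || null,
    previousSibling: null,
    nextSibling: null,
  };
  // Link every child to its parent and to its neighbours.
  children.forEach((child, i) => {
    child.parentNode = node;
    child.previousSibling = children[i - 1] || null;
    child.nextSibling = children[i + 1] || null;
  });
  return node;
}

// Assumed example structure: a product with two child elements.
const nameNode = makeElement("name", []);
const priceNode = makeElement("price", []);
const productNode = makeElement("product", [nameNode, priceNode]);

// Walk the children with firstChild / nextSibling instead of indexing childNodes.
const visited = [];
for (let n = productNode.firstChild; n !== null; n = n.nextSibling) {
  visited.push(n.nodeName);
}
console.log(visited.join(","));                    // name,price
console.log(priceNode.previousSibling.nodeName);   // name
console.log(nameNode.parentNode.nodeName);         // product
```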

An Application of DOM   The parse() function loads the price list in the XML island and returns its Document object.   The function searchPrice() tests whether the current node is an element.

An Application of DOM   The function searchPrice() visits each node by recursively calling itself for all children of the current node.
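
The recursive visit can be tried outside Internet Explorer with a small stand-in tree. The sketch below is an assumption-laden illustration: `text`, `element`, and `collectPrices` are hypothetical names, the nodes are plain objects exposing only `nodeType`, `nodeName`, `childNodes`, and `data`, and the function returns the converted prices instead of writing them to a form field.

```javascript
// Element and text stand-ins exposing only the properties the traversal reads.
const text = (data) => ({ nodeType: 3, data, childNodes: [] });
const element = (nodeName, ...childNodes) => ({ nodeType: 1, nodeName, childNodes });

// Two entries from the price list, built by hand instead of parsed.
const products = element("products",
  element("product", text("XML Editor"), element("price", text("499.00"))),
  element("product", text("XML Book"), element("price", text("19.99"))));

// Recursive depth-first visit: collect the converted price of every <price> element.
function collectPrices(node, rate, found = []) {
  if (node.nodeType === 1) {                       // 1 = element node
    if (node.nodeName === "price") {
      found.push(Number(node.childNodes[0].data) * rate);
    }
    for (const child of node.childNodes) {
      collectPrices(child, rate, found);           // recurse into each child
    }
  }
  return found;
}

console.log(collectPrices(products, 2).join(", ")); // 998, 39.98
```

Text nodes fail the `nodeType === 1` test, so the recursion stops there, which mirrors the element check in searchPrice().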

An Application for DOM

What is SAX?  SAX (the Simple API for XML) is an event-based API for parsing XML documents.  The parser tells the application what is in the document by notifying the application of a stream of parsing events.  The application then processes those events to act on the data.

SAX History  SAX 1.0 was released on May 11,  SAX is a common, event-based API for parsing XML documents, developed as a collaborative project of the members of the XML-DEV discussion under the leadership of David Megginson.

Why SAX?  For applications that are not so XML-centric, an object-based interface is less appealing.  Efficiency: an event-based interface is lower level than object-based interfaces.

Why SAX?  An event-based interface consumes fewer resources than an object-based one.  With an event-based interface, the application can start processing the document as the parser is reading it.

Limitations of SAX  With SAX, it is not possible to navigate through the document as you can with a DOM.  The application must explicitly buffer those events it is interested in.
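
What "explicitly buffer" means in practice: the handler has to remember where it is in the document and keep the pieces it cares about, because the parser will not hand them back later. In the sketch below, `createPriceHandler` and its callback names are hypothetical (loosely modelled on SAX-style callbacks), and the events are replayed by hand in place of a real parser.

```javascript
// A handler that buffers only what it needs: the text of each <price> element.
function createPriceHandler() {
  let insidePrice = false;   // state the application must track itself
  let buffer = "";
  const prices = [];
  return {
    prices,
    startElement(name) {
      if (name === "price") { insidePrice = true; buffer = ""; }
    },
    characters(chunk) {
      if (insidePrice) buffer += chunk;   // text may arrive in several chunks
    },
    endElement(name) {
      if (name === "price") { prices.push(Number(buffer)); insidePrice = false; }
    },
  };
}

// Stand-in for a parser: replay the events for
// <product>XML Book<price>19.99</price></product>.
const handler = createPriceHandler();
handler.startElement("product");
handler.characters("XML Book");
handler.startElement("price");
handler.characters("19.");
handler.characters("99");
handler.endElement("price");
handler.endElement("product");
console.log(handler.prices.join(", ")); // 19.99
```

Everything outside a price element is dropped as soon as its event has been handled, which is why memory use stays small and also why the application cannot navigate back to it.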

SAX API  Parser events are similar to user-interface events such as ONCLICK (in a browser) or AWT events (in Java).  Events alert the application that something happened and the application might want to react.

SAX API  The parser reports events for:  Element opening tags  Element closing tags  Content of elements  Entities  Parsing errors

SAX API

SAX example <doc><para>Hello, world!</para></doc>

SAX example  start document  start element: doc  start element: para  characters: Hello, world!  end element: para  end element: doc  end document
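
The event sequence above can be reproduced with a toy event-based scanner. This is a sketch under stated assumptions, not the SAX API itself: `parseWithEvents` and the handler method names are hypothetical, and the scanner handles only elements and text (no attributes, comments, entities, or self-closing tags).

```javascript
// Toy event-based parser: scans the text once and calls back as it goes.
// Assumption: elements and text only.
function parseWithEvents(xml, handler) {
  const tokenPattern = /<\/([^>]+)>|<([^>\/]+)>|([^<]+)/g;
  handler.startDocument();
  let m;
  while ((m = tokenPattern.exec(xml)) !== null) {
    if (m[1] !== undefined) handler.endElement(m[1]);              // closing tag
    else if (m[2] !== undefined) handler.startElement(m[2]);       // opening tag
    else if (m[3].trim() !== "") handler.characters(m[3].trim());  // non-blank text
  }
  handler.endDocument();
}

// The application side: handlers that record each event as a line of text.
const trace = [];
parseWithEvents("<doc><para>Hello, world!</para></doc>", {
  startDocument() { trace.push("start document"); },
  startElement(name) { trace.push("start element: " + name); },
  characters(chunk) { trace.push("characters: " + chunk); },
  endElement(name) { trace.push("end element: " + name); },
  endDocument() { trace.push("end document"); },
});
console.log(trace.join("\n"));
```

No tree is ever built: each callback fires while the scanner is still moving through the input, and nothing is retained unless a handler stores it.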

Conclusion