XML and SAX (A quick overview) ● What is XML? ● What are SAX and DOM? ● Using SAX.


What is XML?
● Text files built from text content marked up with text tags
● The tags give the content its meaning
● XML resembles HTML, but every XML document must be well formed
● Start tags must be balanced by matching end tags...
● Unlike HTML, tag names are case sensitive in XML
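The well-formedness rules above can be checked mechanically. The following is a minimal sketch, assuming the JAXP parser that ships with the JDK; the class name WellFormedCheck and the sample documents are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: the class and sample documents are illustrative only.
class WellFormedCheck {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a well-formedness (fatal) error
        } catch (Exception e) {
            throw new IllegalStateException(e); // I/O or configuration problem
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<note><to>Zippy</to></note>")); // true
        System.out.println(isWellFormed("<note><to>Zippy</note>"));      // false: <to> is never closed
        System.out.println(isWellFormed("<note></Note>"));               // false: names differ in case
    }
}
```

An HTML browser would tolerate the second and third documents; an XML parser must reject them.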

XML Documents
● XML documents contain exactly one root node
  [Example document; the tags were not recoverable. Text content: Zippy The Pinhead / Politician / Clown]
● The tags within an XML document form a tree structure.
● All tags must be contained within the root tags.
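To make the tree structure concrete, here is a small sketch that prints one indented line per element. The element names person, name and occupation are invented for illustration, since the original example's tags are not known; the class TreeOutline is likewise hypothetical, and the parser is the one bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TreeOutline {
    // Assumed markup: only the text content comes from the slide.
    static final String DOC =
            "<person>"
          + "<name>Zippy The Pinhead</name>"
          + "<occupation>Politician</occupation>"
          + "<occupation>Clown</occupation>"
          + "</person>";

    // Returns one line per element, indented two spaces per nesting level.
    static String outline(String xml) {
        final StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                for (int i = 0; i < depth; i++) {
                    out.append("  ");
                }
                out.append(qName).append('\n');
                depth++; // children of this element are one level deeper
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--; // back to the parent's level
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints "person" at the left margin with "name" and two
        // "occupation" lines indented beneath it.
        System.out.print(outline(DOC));
    }
}
```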

Tags
● Tags can have attributes
● The tags within an XML document form a tree structure.
● All tags must be contained within the root tags.
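Attributes are delivered together with the start tag they belong to. The sketch below, which assumes the JDK's bundled SAX parser and uses an invented document and class name, lists each attribute it encounters.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: the document and attribute names are made up.
class AttributeDump {
    // Returns entries of the form element@attribute=value, in document order of the elements.
    static List<String> attributesOf(String xml) {
        final List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                for (int i = 0; i < attrs.getLength(); i++) {
                    found.add(qName + "@" + attrs.getQName(i) + "=" + attrs.getValue(i));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String doc = "<person id=\"42\"><name lang=\"en\">Zippy</name></person>";
        System.out.println(attributesOf(doc)); // [person@id=42, name@lang=en]
    }
}
```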

DTDs and Schemas
● An XML document can be checked for "validity".
● What makes a "valid" XML document can be defined within:
    ● DTD: Document Type Definition
        ● Defines which tags must be present and where they may appear
        ● Convoluted syntax
    ● Schema
        ● Similar to DTDs, except that they are themselves written in XML
● DTDs aren't used very much anymore due to the overly complex nature of their syntax
● DTDs and Schemas are NOT required in order to use XML. If you choose to use a validating parser (and validation is enabled), you must have a DTD or Schema.
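A sketch of DTD validation follows, assuming the JDK's bundled parser. The DTD, the element names and the class DtdValidation are invented for illustration. Validity problems are reported to the handler's error method rather than thrown, so the handler records them in a flag.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidation {
    // Hypothetical DTD: a person has one name followed by one or more occupations.
    static final String DTD =
            "<!DOCTYPE person ["
          + "<!ELEMENT person (name, occupation+)>"
          + "<!ELEMENT name (#PCDATA)>"
          + "<!ELEMENT occupation (#PCDATA)>"
          + "]>";

    // Returns true only if the document is well formed AND valid against its DTD.
    static boolean isValid(String xml) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        final boolean[] valid = {true};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                valid[0] = false; // validity errors are recoverable, so just record them
            }
        };
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            return false; // not even well formed
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return valid[0];
    }

    public static void main(String[] args) {
        String ok = DTD + "<person><name>Zippy</name><occupation>Clown</occupation></person>";
        String missingName = DTD + "<person><occupation>Clown</occupation></person>";
        System.out.println(isValid(ok));          // true
        System.out.println(isValid(missingName)); // false: <name> is required
    }
}
```

With validation switched off (the factory default), both documents would parse without complaint, which is the sense in which a DTD is optional.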

What are SAX and DOM?
● DOM (Document Object Model) describes a language-neutral object model which can represent an XML document.
    ● The DOM types are specified in the interface definition language (IDL) defined by the OMG (Object Management Group)
    ● DOM parsers parse an XML file into the DOM types. The user/programmer traverses these structures to pull out relevant portions of the document
● SAX (Simple API for XML) is an API for parsing XML documents
    ● It is an event-based push model of parsing.
    ● As the document is parsed, events are generated.
    ● The programmer must create handlers to deal with these events
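The difference between the two models shows up when the same question is answered both ways. This sketch counts elements with a given name, first by building a DOM tree and querying it, then by reacting to SAX events. It assumes the JAXP implementations bundled with the JDK; the document and the class DomVersusSax are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DomVersusSax {
    // Hypothetical sample document.
    static final String DOC =
            "<person><name>Zippy</name>"
          + "<occupation>Politician</occupation><occupation>Clown</occupation></person>";

    // DOM: the whole document is parsed into a tree first, then the tree is queried.
    static int countWithDom(String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(DOC)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // SAX: no tree is built; the parser pushes events and the handler keeps its own state.
    static int countWithSax(final String tag) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(DOC)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithDom("occupation")); // 2
        System.out.println(countWithSax("occupation")); // 2
    }
}
```

The DOM version keeps the entire document in memory; the SAX version keeps only a counter, at the cost of the programmer managing state across events.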

Using SAX
● To get started with SAX, the developer must obtain a reference to an XMLReader object, in either of the following ways:
    XMLReader aReader = XMLReaderFactory.createXMLReader();
    XMLReader aReader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
● The first call asks the system for an instance of the default parser; the second asks for an instance of a specific parser.
● The second call requires that xerces.jar be on the CLASSPATH.
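On current JDKs, XMLReaderFactory still works but has been marked deprecated since Java 9 in favour of SAXParserFactory. A minimal sketch of that alternative route is shown below; the class name ReaderSetup is hypothetical. Note that a JAXP factory is not namespace-aware by default, whereas SAX2 readers normally are, so the sketch switches namespace awareness on to approximate the behaviour of the calls above.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

class ReaderSetup {
    // Obtains an XMLReader from the JDK's default JAXP implementation.
    static XMLReader newReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP default is false
            return factory.newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        // The concrete class depends on which parser implementation is installed.
        System.out.println(reader.getClass().getName());
    }
}
```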

Defining Content Handlers
● Once an XMLReader has been instantiated, it must be provided with a ContentHandler
● The ContentHandler interface defines the events which are generated as a result of parsing the XML document.
● The interface is as follows:
    void setDocumentLocator(Locator locator);
    void startDocument() throws SAXException;
    void endDocument() throws SAXException;
    void startPrefixMapping(String prefix, String uri) throws SAXException;
    void endPrefixMapping(String prefix) throws SAXException;
    void startElement(String namespaceURI, String localName, String qualifiedName, Attributes attrs) throws SAXException;
    void endElement(String namespaceURI, String localName, String qualifiedName) throws SAXException;
    void characters(char[] ch, int start, int length) throws SAXException;
    void ignorableWhitespace(char[] ch, int start, int length) throws SAXException;
    void processingInstruction(String target, String data) throws SAXException;
    void skippedEntity(String name) throws SAXException;
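To see the order in which these events arrive, the following sketch implements every ContentHandler method and simply records each call. The class EventLog and the sample document are invented; the reader is obtained through the JDK's SAXParserFactory, which is an assumption of this sketch rather than a requirement.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;

// Hypothetical handler that records the name of every event it receives.
class EventLog implements ContentHandler {
    final List<String> events = new ArrayList<>();

    public void setDocumentLocator(Locator locator) { events.add("setDocumentLocator"); }
    public void startDocument() { events.add("startDocument"); }
    public void endDocument() { events.add("endDocument"); }
    public void startPrefixMapping(String prefix, String uri) { events.add("startPrefixMapping:" + prefix); }
    public void endPrefixMapping(String prefix) { events.add("endPrefixMapping:" + prefix); }
    public void startElement(String namespaceURI, String localName, String qualifiedName, Attributes attrs) {
        events.add("startElement:" + qualifiedName);
    }
    public void endElement(String namespaceURI, String localName, String qualifiedName) {
        events.add("endElement:" + qualifiedName);
    }
    public void characters(char[] ch, int start, int length) {
        events.add("characters:" + new String(ch, start, length));
    }
    public void ignorableWhitespace(char[] ch, int start, int length) { events.add("ignorableWhitespace"); }
    public void processingInstruction(String target, String data) { events.add("processingInstruction:" + target); }
    public void skippedEntity(String name) { events.add("skippedEntity:" + name); }

    // Parses the text and returns the recorded events in arrival order.
    static List<String> log(String xml) {
        EventLog handler = new EventLog();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(handler); // register the handler before parsing
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : log("<a><b>hi</b></a>")) {
            System.out.println(event);
        }
    }
}
```

Element events arrive strictly nested: every startElement is matched by an endElement, and startDocument and endDocument bracket everything else.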

The DefaultHandler Class
● To implement a ContentHandler, you must provide an implementation for all of the methods defined in the ContentHandler interface.
● However, not all methods are necessary in every context
● The DefaultHandler class contains an empty (do-nothing) implementation of all the methods defined in the ContentHandler interface.
● You can subclass DefaultHandler and override only those methods you require.
● Generally, most people provide an implementation for:
    ● startElement: called when a start tag is parsed
    ● endElement: called when an end tag is parsed
    ● characters: called when text between tags is parsed
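A typical subclass overrides just those three methods. The sketch below collects the text of every occupation element; the element name, the document and the class OccupationCollector are assumptions made for illustration. Because a parser may deliver one run of text through several characters calls, the text is buffered and only read when the end tag arrives.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers the text content of each <occupation> element.
class OccupationCollector extends DefaultHandler {
    final List<String> occupations = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inOccupation = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (qName.equals("occupation")) {
            inOccupation = true;
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inOccupation) {
            text.append(ch, start, length); // may be called more than once per text run
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("occupation")) {
            occupations.add(text.toString());
            inOccupation = false;
        }
    }

    // Parses the text and returns the collected occupations in document order.
    static List<String> collect(String xml) {
        OccupationCollector handler = new OccupationCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.occupations;
    }

    public static void main(String[] args) {
        String doc = "<person><name>Zippy</name>"
                   + "<occupation>Politician</occupation><occupation>Clown</occupation></person>";
        System.out.println(collect(doc)); // [Politician, Clown]
    }
}
```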

Design Elements to Think About
● When you are doing your first assignment, I recommend that you use SAX
● Think about the design of SAX while you are using it; we will be revisiting SAX at a later time
● We will be evaluating its design to see if we can make improvements.