1 4/13/01 CSE 121/131 Programming Spring 2001 Lecture Notes 7  2000-2001 A. Sahuguet & V.Tannen.

Presentation transcript:

1 4/13/01 CSE 121/131 Programming Spring 2001 Lecture Notes 7  A. Sahuguet & V.Tannen

2 4/13/01 Data on the Web, today: HTML... Primary Faculty Rajeev Alur Associate Professor, Computer and Information Science Formal support for design and analysis of reactive, real-time, and hybrid systems. Hardware verification; Software engineering; Control of distributed multi-agent systems; Logic and concurrency theory; Distributed computing....

3 4/13/01 Data on the Web, tomorrow: XML... Rajeev Alur Associate Professor Computer and Information Science Formal support for design and analysis of reactive, real-time, and hybrid systems. Hardware verification; Software engineering; Control of distributed multi-agent systems; Logic and concurrency theory; Distributed computing....

4 4/13/01 What is XML? Like HTML, XML is a “document markup language” i.e., a way to enrich text with tags and attributes. HTML’s markup is about visual presentation. However, it is difficult for a program to manipulate the data in HTML. XML’s markup is about the meaning of the information. This makes it easier for programs to manipulate XML. Still, what we saw on the previous slide is an external format. Internally, XML is represented as trees.
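The remark that XML is represented internally as a tree can be illustrated with a small sketch. It assumes the standard Java JAXP parser (`javax.xml.parsers`), which is not the parser used later in these notes, and a made-up document whose element names (`faculty`, `name`, `title`) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TreeSketch {
    // Hypothetical input; the element names are illustrative only.
    static final String XML =
        "<faculty><name>Rajeev Alur</name><title>Associate Professor</title></faculty>";

    // Parses a string into a DOM tree (checked exceptions are wrapped for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Renders the subtree rooted at node, one node per line, indented by depth.
    static String render(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            sb.append("text: ").append(node.getNodeValue());
        } else {
            sb.append("element: ").append(node.getNodeName());
        }
        sb.append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(render(children.item(i), depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(parse(XML).getDocumentElement(), 0));
    }
}
```

Each element becomes an inner node and each run of text becomes a leaf, so the tags of the external format turn into the shape of the tree.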

5 4/13/01 How XML overcomes some HTML limitations
Using XML, content providers can separate form and content.
[Figure labels: XML Content; XSL (Stylesheets); Wireless Markup Language; HTML; HTML (Web-TV)]

6 4/13/01 Wireless Applications
Hand-held devices have some constraints:
–small display
–narrowband network connection
–limited memory and computational resources
HTML is not suitable for delivering information to them, hence the need for a Wireless Markup Language (WML).
What WML offers:
–specific layout
–new metaphor (deck, cards)
–state management
–binary XML format to make data more concise
The same metaphor can be used for e-forms in various domains: interactive kiosks, medical forms, etc.

7 4/13/01 Manipulating XML documents
Manipulation:
–parsing: reading, checking syntax, transforming into an internal format
–navigating
–modifying
Fortunately, XML comes with a standard API (Application Programming Interface) that offers all these features: the Document Object Model (DOM).

8 4/13/01 DOM “DOM provides a programmatic access to the content, structure and style of XML documents and allows languages such as Java to extract information from documents containing specific tags as if they were objects.” [Ardent’s white paper on XML] Platform neutral API designed by W3C using CORBA/IDL Mapping to various programming languages (Java, C++, Perl, etc.) DOM supported by all the major players DOM makes XML documents parser and representation independent

9 4/13/01 DOM overview
What DOM is doing
[Figure: sample document with the text values “Shady Grove”, “Aeolian”, “Over the River, Charlie”, “Dorian”]

10 4/13/01 The DOM API (overview)
Interfaces: Node, Attr, CharacterData, Comment, Text, CDATASection, Document, Element, Entity, NodeList
interface Document: createAttribute(…), createCDATASection(…), createComment(…), createElement(…), createTextNode(…)
interface Node: appendChild(…), getAttributes(…), getChildNodes(…)
interface Element: getAttribute(name), getAttributeNode(name), getElementsByTagName(name)
The full API can be found at
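As a rough illustration of how these interfaces fit together, the sketch below builds a tiny document in memory and then reads it back. It assumes the standard JAXP `DocumentBuilderFactory` to obtain an empty `Document`; the element and attribute values (`patent`, `Ada`, `Lovelace`) are invented for the example, and `setAttribute` is an `Element` method that the overview above does not list.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomApiSketch {
    // Builds <patent><Inventor First_Name="Ada">Lovelace</Inventor></patent> in memory.
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            // Document acts as the factory for every kind of node.
            Element root = doc.createElement("patent");
            doc.appendChild(root);
            Element inventor = doc.createElement("Inventor");
            inventor.setAttribute("First_Name", "Ada");
            inventor.appendChild(doc.createTextNode("Lovelace"));
            // Node.appendChild attaches the new subtree to its parent.
            root.appendChild(inventor);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = build();
        // Navigate back down with the Element/Document lookup methods.
        Element first = (Element) doc.getElementsByTagName("Inventor").item(0);
        System.out.println(first.getAttribute("First_Name"));
        System.out.println(first.getTextContent());
    }
}
```

Nodes are always created through the owning `Document` and only become part of the tree once `appendChild` attaches them.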

11 4/13/01 DOM in action
We take an HTML page from the IBM Patent server and convert it to XML. From it, we want to extract some specific information, such as the names of the inventors.
4 ways to do it:
–Java DOM
–Java XQL
–Perl
–XML-QL (will return an XML document)

12 4/13/01 The Patent Example Converted using W4F

13 4/13/01 DOM with Java

    import com.ibm.xml.parser.*;
    import org.w3c.dom.*;
    import java.io.*;

    public class Test {
        public static void main(String args[]) throws Exception {
            // Parse the file named on the command line into a DOM tree.
            Parser parser = new Parser( args[0] );
            Document doc = parser.readStream( new FileInputStream( args[0] ));
            // Collect every Inventor element, wherever it occurs.
            NodeList nodes = doc.getElementsByTagName("Inventor");
            int n = nodes.getLength();
            for(int i=0; i<n; i++) {
                Element node = (Element) nodes.item(i);
                // Print the First_Name attribute of each inventor.
                String firstName = node.getAttribute("First_Name");
                System.out.println(firstName);
            }
        }
    }

14 4/13/01 DOM with Java and XQL (GMD, IBM)

    import de.gmd.ipsi.xql.*;
    import org.w3c.dom.*;
    import com.ibm.xml.parser.*;
    import java.io.*;

    public class XQLTest {
        public static void main(String args[]) throws Exception {
            // Parse the file named on the command line into a DOM tree.
            Parser parser = new Parser( args[0] );
            Document doc = parser.readStream( new FileInputStream( args[0] ));
            // Run the query: every Inventor element at any depth.
            XQLResult r = XQL.execute("//Inventor", doc );
            for(int i=0; i<r.getLength(); i++) {
                Element inventor = (Element) r.getItem(i);
                // Print the First_Name attribute of each inventor.
                String firstName = inventor.getAttribute("First_Name");
                System.out.println(firstName);
            }
        }
    }
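The query string `//Inventor` is also valid XPath, so the same query-then-iterate pattern can be sketched with the XPath support in the standard Java library. This is an assumption-laden stand-in, not the GMD-IPSI XQL engine: it uses `javax.xml.xpath` and a hypothetical inline document in place of the converted patent page.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {
    // Returns the First_Name attribute of every Inventor element in the document.
    static List<String> firstNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Evaluate the path expression against the DOM tree.
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//Inventor", doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(((Element) hits.item(i)).getAttribute("First_Name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input standing in for the converted patent page.
        String xml = "<Patent><Inventor First_Name=\"Ada\"/>"
                   + "<Inventor First_Name=\"Grace\"/></Patent>";
        System.out.println(firstNames(xml));
    }
}
```

The query replaces the explicit tree walk: the path expression selects the nodes, and the loop only has to read attributes.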

15 4/13/01 DOM with Perl
Extracting the names of the inventors from the IBM Patent database.

    #!/usr/bin/perl
    # Include the Perl package.
    use XML::DOM;
    # Instantiate a new parser and parse the source file.
    my $parser = new XML::DOM::Parser;
    my $doc = $parser->parsefile ("patent.xml");
    # Get the list of nodes that correspond to Inventor elements.
    my $nodes = $doc->getElementsByTagName ("Inventor");
    my $n = $nodes->getLength;
    # For each node, extract the First_Name attribute and print it.
    for (my $i = 0; $i < $n; $i++) {
        my $node = $nodes->item ($i);
        my $firstName = $node->getAttribute ("First_Name");
        print $firstName, "\n";
    }

16 4/13/01 SAX, a low-level alternative to DOM
SAX:
–Simple API for XML
–supported by most XML parsers
–event-driven parser
Instead of reading the entire file into memory and building a tree, SAX reads a stream of tokens and triggers events:
–startDocument
–startElement
–endElement
–endDocument
The programmer has to write a document handler that captures these events and does something with the tokens.
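To make the event sequence concrete, here is a minimal sketch that records each event as the parser fires it. It assumes the SAX2 `DefaultHandler` and the JAXP `SAXParserFactory` from the standard Java library, which differ from the SAX1 `DocumentHandler` interface used in these notes; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventsSketch {
    // Parses the input and returns the events in the order they were triggered.
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() {
                log.add("startDocument");
            }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                log.add("startElement " + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                log.add("endElement " + qName);
            }
            @Override public void endDocument() {
                log.add("endDocument");
            }
        };
        try {
            // No tree is built; the handler only ever sees one event at a time.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : events("<a><b/></a>")) {
            System.out.println(event);
        }
    }
}
```

Any state the application needs, such as the current nesting depth, has to be kept by the handler itself, since the parser retains nothing between events.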

17 4/13/01 An Example of SAX

    import java.io.PrintWriter;
    import org.xml.sax.AttributeList;
    import org.xml.sax.DocumentHandler;

    public class OutputHandler implements DocumentHandler {
        private PrintWriter pw;

        public OutputHandler() { this.pw = new PrintWriter( System.out ); }
        public OutputHandler(PrintWriter pw) { this.pw = pw; }

        public String toString() { pw.flush(); return ""; }

        // Echo character data; only the slice [start, start + length) is valid.
        public void characters(char[] ch, int start, int length) {
            pw.print(new String(ch, start, length));
        }

        public void startDocument() { pw.println(" "); }
        public void endDocument() { pw.println(" "); }

        // Print the opening tag together with its attributes.
        public void startElement(String name, AttributeList atts) {
            pw.print("<" + name);
            if (atts != null)
                for(int i = 0; i < atts.getLength(); ++i)
                    pw.print(" " + atts.getName(i) + "=\"" + atts.getValue(i) + "\"");
            pw.println(">");
        }

        // Print the matching closing tag.
        public void endElement(String name) { pw.println("</" + name + ">"); }

        /* to be continued … */
    }

18 4/13/01 SAX vs DOM
SAX:
–does not store anything in memory (great for stream-based processing)
–navigation in the document is clumsy
–does not permit updating an XML document
DOM:
–permits updates
–offers the DOM API for navigation/construction
–requires the entire document to be stored in main memory

19 4/13/01 The Missing Link
There is only a “gentlemen’s agreement” between the application and its XML environment.
Why do we need to go beyond that?
–performance
–static guarantees (helps to identify and control failures)
How do we create a tight contract between the application and its XML environment?
[Figure labels: XML (input); Application; XML (output)]

20 4/13/01 XML Binding
Requirements:
–high-level specification for XML (e.g. DTD, XML-Schemas, UML, etc.)
–a mapping to your favorite programming language (e.g. Java)
–a compiler that will generate code (“stubs” that define an API)
(Same paradigm as CORBA/IDL or ODMG/ODL)
Sun’s Proposal:
[Figure labels: XML spec.; compiler; stubs]

21 4/13/01 Generic (DOM/SAX) vs Domain Specific API
Instead of a generic API (e.g. SAX, DOM), the application will use a domain specific API generated from the specification.
generic API (only runtime checks):
–generic parsing
–getElement(“order”)
–getAttribute(“date”)
–generic marshalling
domain specific API (both static and runtime checks):
–domain specific parsing
–get_order()
–get_date()
–domain specific marshalling
Issues:
–mapping accurately XML “types” to a programming language
–static checks vs runtime checks (some features from the specification cannot be checked statically)
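The difference between the two styles can be sketched as follows. The `Order` class is a hand-written stand-in for what a binding compiler might generate, not the output of any real tool; its name, the `getDate()` accessor (Java spelling of `get_date()`), and the sample document are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.time.LocalDate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class BindingSketch {
    // Hypothetical stand-in for a generated, domain specific class.
    static final class Order {
        private final LocalDate date;

        private Order(LocalDate date) { this.date = date; }

        // Domain specific parsing: the attribute is converted to a typed value once.
        static Order fromElement(Element e) {
            return new Order(LocalDate.parse(e.getAttribute("date")));
        }

        // A misspelled call such as getDat() would be rejected by the compiler.
        LocalDate getDate() { return date; }
    }

    // Generic parsing: returns the root element of the document.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element order = parseRoot("<order date=\"2001-04-13\"/>");
        // Generic API: a misspelled attribute name compiles and yields "" at runtime.
        System.out.println("[" + order.getAttribute("dat") + "]");
        // Domain specific API: a typed accessor checked at compile time.
        System.out.println(Order.fromElement(order).getDate().getYear());
    }
}
```

Even in the domain specific version some checks remain at runtime: a malformed date string is only detected when `LocalDate.parse` runs, which matches the last issue listed above.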

22 4/13/01 XML programming
Resources:
–Java and XML, Brett McLaughlin, Mike Loukides
XML parsers (DOM/SAX):
–Apache
–Oracle
–Sun Project X
–Microsoft
XML-binding frameworks:
–Oracle ClassGenerator
–Castor