Processing XML with Java: Representation and Management of Data on the Internet. A comprehensive tutorial about XML processing with Java.

Presentation transcript:

1 Processing XML with Java. Representation and Management of Data on the Internet. Links: a comprehensive tutorial about XML processing with Java; the XML tutorial of W3Schools.

2 Resources used for this presentation: The Hebrew University of Jerusalem, CS Faculty. An Introduction to XML and Web Technologies (the course's literature).

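3 Parsers. What is a parser? A parser takes an input and a formal grammar and produces analyzed data: the structure(s) of the input, according to the atomic elements and their relationships (as described in the grammar). [Diagram: Input and Formal grammar feed the Parser, which outputs Analyzed Data]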

4 XML-Parsing Standards. We will consider two standard parsing methods for accessing XML. DOM (a W3C standard): converts XML into a tree of objects; a "random access" protocol. SAX (a de facto standard, not a W3C recommendation): a "serial access" protocol; event-driven parsing.
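To make "event-driven" concrete, here is a minimal sketch of serial access with the JDK's SAX parser. The sample document, the class name SaxEventsSketch and the event labels are assumptions made for illustration; they do not come from the slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsSketch {
    // Hypothetical sample document; element names are illustrative only.
    static final String XML =
        "<countries><country><name>Israel</name></country></countries>";

    /** Returns the events reported while the parser reads the input serially. */
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver one text run in several chunks.
                log.add("text:" + new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        // No tree is built: the handler only sees a stream of callbacks.
        events(XML).forEach(System.out::println);
    }
}
```

Unlike DOM, nothing is kept in memory unless the handler stores it, which is why SAX suits large inputs but offers no random access.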

5 XML Examples

6 world.xml [Listing: XML markup not preserved in this transcript. Text content: Israel, 6,199,008, Jerusalem, Ashdod; France, 60,424,213. Callouts: root element; validating DTD file; reference to an entity]

7 XML Tree Model [Figure: tree diagram with legend: element, attribute, simple content]

8 world.dtd [Listing: DTD markup not preserved in this transcript. Callouts: default value (as opposed to required); parsed; not parsed] Open world.xml in your browser. Check world2.xml for a #PCDATA example.

9 Namespaces. [Listing: sales.xml, only partly preserved in this transcript. Surviving fragments: <forsale date="12/2/03" xmlns:xhtml=" DBI:.]]> <comment xmlns=" My favorite book!] Callouts: "xhtml" namespace declaration; default namespace declaration; namespace overriding; (non-parsed) character data.

10 sales.xml (same listing), highlighted element. Namespace: “ ”; local name: “h1”; qualified name: “xhtml:h1”.

11 sales.xml (same listing), highlighted element. Namespace: “ ”; local name: “par”; qualified name: “par”.

12 sales.xml (same listing), highlighted element. Namespace: “”; local name: “title”; qualified name: “title”.

13 sales.xml (same listing), highlighted element. Namespace: “ ”; local name: “b”; qualified name: “xhtml:b”.
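The namespace, local name and qualified name of an element can be read back through the DOM once the factory is made namespace aware. The sketch below is only an illustration: the document is a hypothetical stand-in for sales.xml, the xhtml prefix is bound to the standard XHTML namespace URI as an assumption, and the comment namespace URI is invented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceSketch {
    // Hypothetical stand-in for sales.xml; both namespace URIs are assumptions.
    static final String XML =
        "<forsale xmlns:xhtml='http://www.w3.org/1999/xhtml'>"
      + "<book><title><xhtml:h1>DBI</xhtml:h1></title>"
      + "<comment xmlns='http://example.org/comment'>"
      + "<par>My favorite book!</par></comment>"
      + "</book></forsale>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace processing is off by default
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    /** Describes the first element whose qualified name equals tagName. */
    static String describe(Document doc, String tagName) {
        Element e = (Element) doc.getElementsByTagName(tagName).item(0);
        return "namespace=" + e.getNamespaceURI()
             + " local=" + e.getLocalName()
             + " qualified=" + e.getNodeName();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(describe(doc, "xhtml:h1")); // prefixed element
        System.out.println(describe(doc, "par"));      // inherits the default namespace
        System.out.println(describe(doc, "title"));    // no namespace: URI is null
    }
}
```

An element outside every namespace declaration, such as title here, reports a null namespace URI, which corresponds to the empty namespace shown on slide 12.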

14 DOM – Document Object Model

15 DOM Parser. DOM = Document Object Model. The parser creates a tree object out of the document. The user accesses data by traversing the tree; the tree and its traversal conform to a W3C standard. The API allows for constructing, accessing and manipulating the structure and content of XML documents.

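16 [Listing: world.xml, repeated from slide 6; XML markup not preserved in this transcript. Text content: Israel, 6,199,008, Jerusalem, Ashdod; France, 60,424,213]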

17 The DOM Tree

18 Using a DOM Tree [Diagram: XML File, DOM Parser, DOM Tree (in memory), API, Application]


20 Creating a DOM Tree. A DOM tree is generated by a DocumentBuilder. The builder is generated by a factory, in order to be implementation independent. The factory is chosen according to the system configuration.
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse("world.xml");
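The three steps (factory, builder, parse) can be tried without any file on disk by parsing from a string. This is a minimal sketch; the sample XML and the class name DomCreateSketch are assumptions for illustration, and the real world.xml may look different.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomCreateSketch {
    // Hypothetical sample; not the actual content of world.xml.
    static final String XML =
        "<countries><country><name>Israel</name></country></countries>";

    static Document parse(String xml) {
        try {
            // 1. The factory is selected according to the system configuration.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // 2. The factory creates an implementation-independent builder.
            DocumentBuilder builder = factory.newDocumentBuilder();
            // 3. The builder reads the input and returns the in-memory tree.
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "countries"
    }
}
```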

21 Configuring the Factory. The methods of the document-builder factory enable you to configure the properties of the document building. For example: factory.setValidating(true); factory.setIgnoringComments(false). Read more about the DocumentBuilderFactory class and the DocumentBuilder class.
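The effect of a factory setting shows up in the tree the builder produces. The sketch below toggles setIgnoringComments and counts the children of the root; the one-line document and the class name FactoryConfigSketch are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FactoryConfigSketch {
    // Hypothetical document: one comment followed by one text node.
    static final String XML = "<a><!-- note -->x</a>";

    /** Number of children under the root, with comments kept or dropped. */
    static int childCount(String xml, boolean ignoreComments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(ignoreComments); // must be set before the builder is created
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childCount(XML, false)); // comment and text
        System.out.println(childCount(XML, true));  // text only
    }
}
```

Settings apply to builders created afterwards, so the factory should be configured before calling newDocumentBuilder().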

22 The Node Interface. Every node in the DOM tree implements the Node interface. The nodes of the DOM tree include: a special root (denoted document); element nodes; text nodes and CDATA sections; attributes; comments; and more. The Document interface, whose instance is returned by builder.parse(…), itself extends the Node interface.

23 Interfaces in a DOM Tree [Figure: interface hierarchy. Node is extended by DocumentFragment, Document, CharacterData (extended by Text and Comment; CDATASection extends Text), Attr, Element, DocumentType, Notation, Entity, EntityReference and ProcessingInstruction. NodeList and NamedNodeMap are separate interfaces. Callout on DocumentFragment: a lightweight fragment of the document; can hold several sub-trees.] Figure as it appears in "The XML Companion" by Neil Bradley.

24 Interfaces in the DOM Tree [Figure: an example DOM tree whose nodes are labeled with their interfaces: Document, Document Type, Element, Attribute, Text, Entity Reference, Comment]

25 Node Navigation. Every node has a specific location in the tree. The Node interface specifies methods for tree navigation:
    Node getFirstChild();
    Node getLastChild();
    Node getNextSibling();
    Node getPreviousSibling();
    Node getParentNode();
    NodeList getChildNodes();
    NamedNodeMap getAttributes();

26 Node Navigation (cont) [Figure: a node and its neighbors, with arrows labeled getFirstChild(), getPreviousSibling(), getChildNodes(), getNextSibling(), getLastChild(), getParentNode()]
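The navigation methods can be exercised on a small tree. In this sketch the document and the class name NavigationSketch are assumptions for illustration; the XML contains no whitespace between elements, so no extra text nodes appear among the children.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NavigationSketch {
    // Hypothetical sample document, written without whitespace between tags.
    static final String XML =
        "<country><name>Israel</name><city>Jerusalem</city><city>Ashdod</city></country>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    /** Walks the children through the first-child / next-sibling links. */
    static List<String> childNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            names.add(c.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Element root = parse(XML).getDocumentElement();
        System.out.println(childNames(root)); // prints "[name, city, city]"

        Node last = root.getLastChild();             // <city>Ashdod</city>
        Node previous = last.getPreviousSibling();   // <city>Jerusalem</city>
        System.out.println(previous.getFirstChild().getNodeValue()); // prints "Jerusalem"
        System.out.println(last.getParentNode().isSameNode(root));   // prints "true"
    }
}
```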

27 Node Properties. Every node has: a type; a name; a value; attributes. The roles of these properties differ according to the node types. Nodes of different types implement different interfaces (that extend Node).

28 Names, Values and Attributes
    Interface              nodeName                    nodeValue                  attributes
    Attr                   name of attribute           value of attribute         null
    CDATASection           "#cdata-section"            content of the section     null
    Comment                "#comment"                  content of the comment     null
    Document               "#document"                 null                       null
    DocumentFragment       "#document-fragment"        null                       null
    DocumentType           doc-type name               null                       null
    Element                tag name                    null                       NamedNodeMap
    Entity                 entity name                 null                       null
    EntityReference        name of entity referenced   null                       null
    Notation               notation name               null                       null
    ProcessingInstruction  target                      entire content             null
    Text                   "#text"                     content of the text node   null
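A few rows of the table can be checked directly. The sketch below prints nodeName and nodeValue for an element, an attribute, a comment and a text node; the sample document and the class name NodePropertiesSketch are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodePropertiesSketch {
    // Hypothetical document with one attribute, one comment and one text node.
    static final String XML = "<city capital='yes'><!--note-->Jerusalem</city>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    /** Formats a node as nodeName=nodeValue. */
    static String describe(Node n) {
        return n.getNodeName() + "=" + n.getNodeValue();
    }

    public static void main(String[] args) {
        Element city = parse(XML).getDocumentElement();
        System.out.println(describe(city));                         // prints "city=null"
        System.out.println(describe(city.getAttributes().item(0))); // prints "capital=yes"
        System.out.println(describe(city.getFirstChild()));         // prints "#comment=note"
        System.out.println(describe(city.getLastChild()));          // prints "#text=Jerusalem"
        // Only elements carry a NamedNodeMap; other node types return null.
        System.out.println(city.getLastChild().getAttributes());    // prints "null"
    }
}
```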

29 Node Types: getNodeType()
    ELEMENT_NODE = 1
    ATTRIBUTE_NODE = 2
    TEXT_NODE = 3
    CDATA_SECTION_NODE = 4
    ENTITY_REFERENCE_NODE = 5
    ENTITY_NODE = 6
    PROCESSING_INSTRUCTION_NODE = 7
    COMMENT_NODE = 8
    DOCUMENT_NODE = 9
    DOCUMENT_TYPE_NODE = 10
    DOCUMENT_FRAGMENT_NODE = 11
    NOTATION_NODE = 12

    if (myNode.getNodeType() == Node.ELEMENT_NODE) {
        // process node …
    }
Read more about the Node interface.

30
    import org.w3c.dom.*;
    import javax.xml.parsers.*;

    public class EchoWithDom {
        public static void main(String[] args) throws Exception {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse("world.xml");
            new EchoWithDom().echo(doc);
        }

31
        // Prints a node, then its attributes (elements only), then its children,
        // each one level deeper than the node itself.
        private void echo(Node n) {
            print(n);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                NamedNodeMap atts = n.getAttributes();
                ++depth;
                for (int i = 0; i < atts.getLength(); i++)
                    echo(atts.item(i));
                --depth;
            }
            depth++;
            for (Node child = n.getFirstChild(); child != null;
                     child = child.getNextSibling())
                echo(child);
            depth--;
        }

32
        private int depth = 0;

        // Indexed by the value of getNodeType(), which ranges from 1 to 12.
        private String[] NODE_TYPES = {
            "", "ELEMENT", "ATTRIBUTE", "TEXT", "CDATA", "ENTITY_REF", "ENTITY",
            "PROCESSING_INST", "COMMENT", "DOCUMENT", "DOCUMENT_TYPE",
            "DOCUMENT_FRAG", "NOTATION" };

        // Prints one indented line with the type, name and value of the node.
        private void print(Node n) {
            for (int i = 0; i < depth; i++)
                System.out.print(" ");
            System.out.print(NODE_TYPES[n.getNodeType()] + ":");
            System.out.print("Name: " + n.getNodeName());
            System.out.print(" Value: " + n.getNodeValue() + "\n");
        }
    }

33 Another Example
    import org.w3c.dom.*;
    import javax.xml.parsers.*;

    public class WorldParser {
        public static void main(String[] args) throws Exception {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse("world.xml");
            printCities(doc);
        }

34 Another Example (cont)
        public static void printCities(Document doc) {
            // getElementsByTagName searches within all descendants
            NodeList cities = doc.getElementsByTagName("city");
            for (int i = 0; i < cities.getLength(); ++i) {
                printCity((Element) cities.item(i));
            }
        }

        public static void printCity(Element city) {
            // The first <name> descendant holds the city name as a text child
            Node nameNode = city.getElementsByTagName("name").item(0);
            String cName = nameNode.getFirstChild().getNodeValue();
            System.out.println("Found City: " + cName);
        }
    }

35 Normalizing the DOM Tree. Normalizing a DOM tree has two effects: it combines adjacent text nodes and it eliminates empty text nodes (such nodes are typically created by node manipulation). To normalize, apply the normalize() method to the document element.
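Both effects of normalization can be observed on a tree built by hand. In this sketch the element name, the text values and the class name NormalizeSketch are assumptions for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeSketch {
    /** Builds <p> with three text children, one of them empty. */
    static Element build() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        Element p = doc.createElement("p");
        doc.appendChild(p);
        // Node manipulation leaves adjacent and empty text nodes behind.
        p.appendChild(doc.createTextNode("Hello, "));
        p.appendChild(doc.createTextNode(""));
        p.appendChild(doc.createTextNode("world"));
        return p;
    }

    /** Child count of the element, optionally after normalize(). */
    static int textChildren(boolean normalize) {
        Element p = build();
        if (normalize) {
            p.normalize();
        }
        return p.getChildNodes().getLength();
    }

    public static void main(String[] args) {
        System.out.println(textChildren(false)); // prints "3"
        System.out.println(textChildren(true));  // prints "1"
        Element p = build();
        p.normalize();
        System.out.println(p.getFirstChild().getNodeValue()); // prints "Hello, world"
    }
}
```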

36 Node Manipulation. Children of a node in a DOM tree can be manipulated: added, edited, deleted, moved, copied, etc. To construct new nodes, use the methods of Document: createElement, createAttribute, createTextNode, createCDATASection, etc. To manipulate a node, use the methods of Node: appendChild, insertBefore, removeChild, replaceChild, setNodeValue, cloneNode(boolean deep), etc.

37 Node Manipulation (cont) [Figure: insertBefore places New before Ref; replaceChild puts New in place of Old; cloneNode with deep = 'false' versus deep = 'true'] Figure as it appears in "The XML Companion" by Neil Bradley.
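The manipulation methods can be followed step by step on a small tree built from scratch. This is a sketch under assumptions: the element names, the city "Haifa" and the class name ManipulationSketch are hypothetical and chosen only for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ManipulationSketch {
    /** Creates <city>name</city>; new nodes always come from the Document. */
    static Element city(Document doc, String name) {
        Element city = doc.createElement("city");
        city.appendChild(doc.createTextNode(name));
        return city;
    }

    static Document build() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        Element country = doc.createElement("country");
        doc.appendChild(country);

        Element jerusalem = city(doc, "Jerusalem");
        Element ashdod = city(doc, "Ashdod");
        Element haifa = city(doc, "Haifa");

        country.appendChild(jerusalem);                   // Jerusalem
        country.insertBefore(ashdod, jerusalem);          // Ashdod, Jerusalem
        country.replaceChild(haifa, ashdod);              // Haifa, Jerusalem (new child first, old child second)
        country.appendChild(jerusalem.cloneNode(true));   // deep copy keeps the text child
        country.appendChild(jerusalem.cloneNode(false));  // shallow copy is an empty <city>
        country.removeChild(haifa);                       // Jerusalem, Jerusalem, (empty)
        return doc;
    }

    /** Lists the children of the root as name(text). */
    static String summary(Document doc) {
        StringBuilder sb = new StringBuilder();
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(n.getNodeName()).append('(').append(n.getTextContent()).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(summary(build())); // prints "city(Jerusalem) city(Jerusalem) city()"
    }
}
```

The argument order is easy to mix up: insertBefore takes the new child and then the reference child, and replaceChild takes the new child and then the old one.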