Java/XML.

Parsing XML
Goal: read XML files into data structures in a programming language.
Possible strategies:
- Parse by hand with some reusable libraries
- Parse into a generic tree structure
- Parse as a sequence of events
- Automagically parse to language-specific objects

Parsing by hand
Advantages:
- Complete control
- Good if your needs are simple: build on the regex package
Disadvantages:
- You must write the initial code yourself, even if it later becomes generalized
- Tedious and error-prone
- Gets very hard when a schema or DTD is used to validate
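As a rough illustration of the by-hand approach, the sketch below pulls the text out of flat elements with java.util.regex. The class name, method name, and sample document are made up for illustration. It assumes elements without attributes or nested markup and ignores CDATA, entities, and comments, which is exactly where hand-rolled parsing starts to hurt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class HandParseSketch {

    // Returns the trimmed text of every <tag>text</tag> occurrence, in document order.
    // Assumption: the element has no attributes and contains no nested markup.
    static List<String> textOf(String xml, String tag) {
        String quoted = Pattern.quote(tag);
        Pattern pattern = Pattern.compile("<" + quoted + ">([^<]*)</" + quoted + ">");
        Matcher matcher = pattern.matcher(xml);
        List<String> out = new ArrayList<>();
        while (matcher.find()) {
            out.add(matcher.group(1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<books><title>Java and XML</title><title> DOM Basics </title></books>";
        System.out.println(textOf(xml, "title"));
    }
}
```

Anything beyond this flat case (attributes, nesting, validation) is a reason to move to one of the other strategies.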

Parsing into a generic tree structure
Advantages:
- An industry-wide, language-neutral standard exists, called DOM (Document Object Model)
- Learning DOM for one language makes it easy to learn for any other
- As of JAXP 1.2, support for XML Schema
- You write much less code to get XML into something you can manipulate in your program
Disadvantages:
- Non-intuitive API that doesn't take full advantage of Java
- Still quite a bit of work

What is JAXP?
- JAXP: Java API for XML Processing
- In Java, the definitions of the standard parsing APIs (DOM and SAX), together with the XSLT API, make up a set of interfaces known as JAXP
- Java also provides standard implementations, together with a vendor pluggability layer
- Some of these come standard with the J2SDK; others are only available with the Web Services Developer Pack
- We will study these shortly

Another alternative
- JDOM: a native Java published API for representing XML as a tree
- Like DOM, but much more Java-specific and object-oriented
- However, not supported by other languages
- Also, no support for schema
- dom4j is another alternative

JAXB
- JAXB: Java Architecture for XML Binding
- Defines an API for automagically representing an XML schema as a collection of Java classes
- Most convenient for application programming
- Will be covered next class

DOM

About DOM
- Stands for Document Object Model
- A World Wide Web Consortium (W3C) standard
- The standard is constantly adding new features: Level 3 Core was just released this month
- We'll cover most of the basics. There's always more, and it's always changing.

DOM abstraction layer in Java: architecture
- The emphasis is on allowing vendors to supply their own DOM implementation without requiring changes to source code
- The factory returns a specific parser implementation
- Parsing yields an org.w3c.dom.Document

Sample Code
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    /* set some factory options here */
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(xmlFile);
Notes:
- The factory instance determines the parser implementation. It can be changed with a runtime system property. The JDK has a default; Xerces is much better.
- From the factory one obtains an instance of the parser.
- xmlFile can be a java.io.File, an InputStream, etc.
- For reference, the classes used are javax.xml.parsers.DocumentBuilderFactory, javax.xml.parsers.DocumentBuilder, and org.w3c.dom.Document. Notice that Document comes from the W3C-specified bindings.
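A self-contained version of the sample might look like the sketch below. It parses from an in-memory string rather than a file so it can be run as is; the class name, method name, and sample XML are illustrative assumptions, and checked exceptions are wrapped only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical wrapper around the factory / builder / parse sequence.
final class DomParseSketch {

    // Parses an XML string into a DOM Document using whichever parser JAXP selects.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // one example of a factory option
            DocumentBuilder builder = factory.newDocumentBuilder();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return builder.parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            // ParserConfigurationException, SAXException, or IOException
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<order id=\"7\"><item>pen</item></order>");
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "order"
    }
}
```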

Validation
- Note that by default the parser will not validate against a schema or DTD
- As of JAXP 1.2, Java provides a default parser that can handle most schema features
- See the next slide for details on how to set this up

Important: Schema validation
First, define the property constants:
    String JAXP_SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    String W3C_XML_SCHEMA =
        "http://www.w3.org/2001/XMLSchema";
Next, you need to configure DocumentBuilderFactory to generate a namespace-aware, validating parser that uses XML Schema:
    …
    DocumentBuilderFactory factory =
        DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    factory.setValidating(true);
    try {
        factory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
    } catch (IllegalArgumentException x) {
        // Happens if the parser does not support JAXP 1.2
        ...
    }

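Associating a document with a schema
An XML file can be associated with a schema in two ways:
- Directly in the XML file, in the usual way (an xsi:schemaLocation or xsi:noNamespaceSchemaLocation attribute)
- Programmatically from Java
The latter is done as:
    factory.setAttribute(JAXP_SCHEMA_SOURCE,
        new File(schemaSource));
Here JAXP_SCHEMA_SOURCE is the property name "http://java.sun.com/xml/jaxp/properties/schemaSource".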
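Putting the validation settings and the programmatic schema association together, a runnable sketch could look like this. The schema, the sample documents, and the class and method names are invented for illustration. The sketch also registers an ErrorHandler, which the slides do not show, on the assumption that you want to collect validation errors rather than have the parser report them on standard error.

```java
import java.io.File;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class SchemaValidationSketch {
    static final String JAXP_SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String JAXP_SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Made-up schema: a <note> element holding exactly one <to> string.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='note'><xs:complexType><xs:sequence>"
        + "<xs:element name='to' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Parses xml with schema validation on and returns the reported error messages.
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            // Write the schema to a temporary file so it can be passed as a File.
            File schemaFile = File.createTempFile("note", ".xsd");
            schemaFile.deleteOnExit();
            Files.write(schemaFile.toPath(), SCHEMA.getBytes(StandardCharsets.UTF_8));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            factory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            factory.setAttribute(JAXP_SCHEMA_SOURCE, schemaFile);

            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("validation run failed", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate("<note><to>Ann</to></note>"));     // conforming document
        System.out.println(validate("<note><from>Bob</from></note>")); // violates the schema
    }
}
```

As the slides note, a parser that does not support these JAXP 1.2 properties rejects the setAttribute calls with an IllegalArgumentException, which this sketch simply rethrows.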

A few notes
- The factory makes it easy to switch parser implementations
- Java provides a simple DOM implementation, but it is much better to use a vendor-supplied one when doing serious work
- Xerces, part of the Apache project, is installed on the cluster as an Eclipse plugin. We'll use it next week.
- Note that some properties are not supported by all parser implementations

Document object
- Once a Document object is obtained, there is a rich API to manipulate it
- The first call is usually
    Element root = doc.getDocumentElement();
- This gets the root element of the Document as an instance of Element
- Note that Element extends Node and so has the methods getNodeType(), getNodeName(), getNodeValue(), and getChildNodes()

Types of Nodes
- Note that there are many types of nodes (i.e. subinterfaces of Node): Attr, CDATASection, Comment, Document, DocumentFragment, DocumentType, Element, Entity, EntityReference, Notation, ProcessingInstruction, Text
- Each of these has a special and non-obvious associated type, value, and name
- The standards are language-neutral and are specified in the chart on the following slide
- Important: keep this chart nearby when using DOM

Node                   nodeName                   nodeValue                          attributes    nodeType
Attr                   Attr name                  Value of attribute                 null          2
CDATASection           #cdata-section             CDATA content                      null          4
Comment                #comment                   Comment content                    null          8
Document               #document                  null                               null          9
DocumentFragment       #document-fragment         null                               null          11
DocumentType           Doc type name              null                               null          10
Element                Tag name                   null                               NamedNodeMap  1
Entity                 Entity name                null                               null          6
EntityReference        Name of entity referenced  null                               null          5
Notation               Notation name              null                               null          12
ProcessingInstruction  Target                     Entire content, excluding target   null          7
Text                   #text                      Actual text                        null          3
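To see the chart in action, the sketch below walks a small document and records the type, name, and value of each node. The class name, method names, and sample XML are assumptions made for illustration. Attributes are not children in DOM, so the walk does not visit them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NodeChartSketch {

    // Appends "nodeType nodeName nodeValue" for node and, depth-first, its descendants.
    static void describe(Node node, List<String> out) {
        out.add(node.getNodeType() + " " + node.getNodeName() + " " + node.getNodeValue());
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            describe(child, out);
        }
    }

    // Parses xml and describes every node, starting from the Document node itself.
    static List<String> describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            describe(doc, out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String line : describe("<a><!--hi-->text<?go now?></a>")) {
            System.out.println(line);
        }
        // 9 #document null
        // 1 a null
        // 8 #comment hi
        // 3 #text text
        // 7 go now
    }
}
```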

Transforming XML

The JAXP Transformation Packages
JAXP transformation APIs:
- javax.xml.transform: defines the factory class you use to get a Transformer object. You then configure the transformer with input (Source) and output (Result) objects, and invoke its transform() method to make the transformation happen. The source and result objects are created using classes from one of the other three packages.
- javax.xml.transform.dom: defines the DOMSource and DOMResult classes that let you use a DOM as an input to or output from a transformation.
- javax.xml.transform.sax: defines the SAXSource and SAXResult classes that let you use a SAX event generator as input to a transformation, or deliver SAX events as output to a SAX event processor.
- javax.xml.transform.stream: defines the StreamSource and StreamResult classes that let you use an I/O stream as an input to or output from a transformation.
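Because sources and results from the different packages can be mixed, one transformer can convert between representations. The sketch below uses an identity transformer (one created without a stylesheet) to read from a StreamSource and write into a DOMResult. The class and method names are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class StreamToDomSketch {

    // Copies XML text into a DOM tree by running an identity transformation.
    static Document toDom(String xml) {
        try {
            // newTransformer() with no stylesheet copies the source to the result.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            DOMResult result = new DOMResult();
            identity.transform(new StreamSource(new StringReader(xml)), result);
            Node node = result.getNode();
            // Be defensive about what the implementation hands back.
            return node instanceof Document ? (Document) node : node.getOwnerDocument();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = toDom("<catalog><book/></catalog>");
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "catalog"
    }
}
```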

Transformer Architecture

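Writing DOM to XML
    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.w3c.dom.Document;

    public class WriteDOM {
        public static void main(String[] argv) throws Exception {
            // Parse the file named on the command line into a DOM tree
            File f = new File(argv[0]);
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(f);
            // Write the tree to standard output with an identity transformer
            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer = tFactory.newTransformer();
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult(System.out);
            transformer.transform(source, result);
        }
    }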

Creating a DOM from scratch
Sometimes you may want to create a DOM tree directly in memory. This is done with:
    DocumentBuilderFactory factory =
        DocumentBuilderFactory.newInstance();
    DocumentBuilder builder =
        factory.newDocumentBuilder();
    Document document = builder.newDocument();
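An empty Document is only the starting point: elements, attributes, and text are created through the document and then attached to the tree. The sketch below builds a tiny tree and serializes it with an identity transformer so the result can be inspected. The element names and the class name are invented for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class BuildDomSketch {

    // Builds <inventory><item id="1">widget</item></inventory> in memory and serializes it.
    static String build() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

            // Nodes are created by the Document, then attached explicitly.
            Element root = document.createElement("inventory");
            document.appendChild(root);
            Element item = document.createElement("item");
            item.setAttribute("id", "1");
            item.appendChild(document.createTextNode("widget"));
            root.appendChild(item);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```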

Manipulating Nodes
Once the root node is obtained, the typical tree methods exist to manipulate other elements:
    boolean node.hasChildNodes()
    NodeList node.getChildNodes()
    Node node.getNextSibling()
    Node node.getParentNode()
    String node.getNodeValue()
    String node.getNodeName()
    String node.getTextContent()
    void node.setNodeValue(String nodeValue)
    Node node.insertBefore(Node newChild, Node refChild)
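A short sketch of several of these methods working together follows. The element names and the class name are made up; the tree is built in memory so the example runs on its own.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class NodeEditSketch {

    // Joins the names of parent's children, walking them with getNextSibling().
    static String childNames(Node parent) {
        StringBuilder names = new StringBuilder();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (names.length() > 0) {
                names.append(',');
            }
            names.append(child.getNodeName());
        }
        return names.toString();
    }

    // Builds <list><first/><second>new</second><third/></list>, using insertBefore
    // for the middle element and setNodeValue to replace its text.
    static Element example() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM parser available", e);
        }
        Element list = doc.createElement("list");
        doc.appendChild(list);
        Element first = doc.createElement("first");
        Element third = doc.createElement("third");
        list.appendChild(first);
        list.appendChild(third);

        Element second = doc.createElement("second");
        list.insertBefore(second, third);       // lands between first and third

        Node text = second.appendChild(doc.createTextNode("old"));
        text.setNodeValue("new");               // a Text node's value is its text
        return list;
    }

    public static void main(String[] args) {
        Element list = example();
        System.out.println(list.hasChildNodes());   // true
        System.out.println(childNames(list));       // first,second,third
        System.out.println(list.getTextContent());  // new
    }
}
```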