XML Tools
Leonidas Fegaras
CSE 6331 © Leonidas Fegaras

Presentation transcript:


XML Processing

[Figure: XML document -> document parser (well-formedness checks, reference expansion) -> XML infoset -> document validator (DTD or XML schema) -> XML infoset (annotated) -> application, storage system]

DOM

The Document Object Model (DOM) is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content and structure of XML documents.

The following is part of the DOM interface:

    public interface Node {
        public String getNodeName ();
        public String getNodeValue ();
        public NodeList getChildNodes ();
        public NamedNodeMap getAttributes ();
    }

    public interface Element extends Node {
        // returns all descendant elements with the given tag name
        public NodeList getElementsByTagName ( String name );
    }

    public interface Document extends Node {
        public Element getDocumentElement ();
    }

    public interface NodeList {
        public int getLength ();
        public Node item ( int index );
    }
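The interface excerpt above can be exercised with a small recursive walk. This is only a sketch: the class name `DomWalk`, the helper names, and the inline sample document are assumptions made for illustration, and `getNodeType()` is part of the real `org.w3c.dom.Node` interface even though the excerpt does not list it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomWalk {
    // Hypothetical sample input, written without whitespace between tags
    // so that the tree contains no whitespace-only text nodes.
    static final String SAMPLE =
        "<department><deptname>Computer Science</deptname>"
      + "<gradstudent><name><lastname>Smith</lastname>"
      + "<firstname>John</firstname></name></gradstudent></department>";

    // Parse an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse XML", e);
        }
    }

    // Append the names of all element nodes at or below 'node', in document order.
    static void collectNames(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectNames(children.item(i), names);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        List<String> names = new ArrayList<>();
        collectNames(doc.getDocumentElement(), names);
        System.out.println(names);

        // getElementsByTagName returns a NodeList of matching descendants.
        NodeList last = doc.getDocumentElement().getElementsByTagName("lastname");
        System.out.println(last.item(0).getFirstChild().getNodeValue());
    }
}
```

The text of an element is held in a child text node, which is why the value is read with `getFirstChild().getNodeValue()` rather than from the element itself.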

DOM Example

    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    class Test {
        public static void main ( String args[] ) throws Exception {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new File("depts.xml"));
            NodeList nodes = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node n = nodes.item(i);
                NodeList ndl = n.getChildNodes();
                for (int k = 0; k < ndl.getLength(); k++) {
                    Node m = ndl.item(k);
                    // compare strings with equals(); == only compares object references
                    if ( "dept".equals(m.getNodeName())
                         && m.getFirstChild() != null
                         && "cse".equals(m.getFirstChild().getNodeValue()) ) {
                        NodeList ncl = ((Element) m).getElementsByTagName("tel");
                        for (int j = 0; j < ncl.getLength(); j++) {
                            Node nc = ncl.item(j);
                            System.out.print(nc.getFirstChild().getNodeValue());
                        }
                    }
                }
            }
        }
    }

Better Programming

    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    import java.util.Vector;

    // A sequence of DOM nodes that supports chained navigation steps
    class Sequence extends Vector<Node> {
        Sequence () { super(); }

        // start with the root element of the given XML file
        Sequence ( String filename ) throws Exception {
            super();
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new File(filename));
            add(doc.getDocumentElement());
        }

        // the children of the nodes in this sequence that have the given tag name
        Sequence child ( String tagname ) {
            Sequence result = new Sequence();
            for (int i = 0; i < size(); i++) {
                Node n = elementAt(i);
                NodeList c = n.getChildNodes();
                for (int k = 0; k < c.getLength(); k++)
                    if (c.item(k).getNodeName().equals(tagname))
                        result.add(c.item(k));
            }
            return result;
        }

        void print () {
            for (int i = 0; i < size(); i++)
                System.out.println(elementAt(i).toString());
        }
    }

    class DOM {
        public static void main ( String args[] ) throws Exception {
            (new Sequence("cs.xml")).child("gradstudent").child("name").print();
        }
    }

SAX

SAX is the Simple API for XML. It allows you to process a document as it is being read.
- In contrast, DOM requires the entire document to be read before it takes any action.
The SAX API is event based.
- The XML parser sends events, such as the start or the end of an element, to an event handler, which processes the information.

Parser Events

Receive notification of the beginning of a document:
    void startDocument ()
Receive notification of the end of a document:
    void endDocument ()
Receive notification of the beginning of an element:
    void startElement ( String namespace, String localName, String qName, Attributes atts )
Receive notification of the end of an element:
    void endElement ( String namespace, String localName, String qName )
Receive notification of character data:
    void characters ( char[] ch, int start, int length )
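As a rough illustration of these callbacks, the following sketch records each event as a short string while the parser runs. The class name `EventTrace`, the `trace` helper, and the event labels are assumptions made for this example; the parser calls themselves are standard JAXP/SAX.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventTrace extends DefaultHandler {
    // One entry per SAX event, in the order the parser delivers them.
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("SD"); }

    @Override public void endDocument() { events.add("ED"); }

    @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("SE: " + qName);
    }

    @Override public void endElement(String uri, String localName, String qName) {
        events.add("EE: " + qName);
    }

    @Override public void characters(char[] ch, int start, int length) {
        // A parser is allowed to split one run of text into several calls.
        events.add("C: " + new String(ch, start, length));
    }

    // Parse an XML string and return the recorded event list.
    static List<String> trace(String xml) {
        try {
            EventTrace handler = new EventTrace();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String event : trace("<a><b>hi</b></a>")) {
            System.out.println(event);
        }
    }
}
```

Because the handler only sees one event at a time, any state it needs (depth, current element, collected text) has to be kept in its own fields.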

SAX Example: a Printer

    import java.io.FileReader;
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;

    // Prints the events it receives back out as XML text
    class Printer extends DefaultHandler {
        public Printer () { super(); }
        public void startDocument () {}
        public void endDocument () { System.out.println(); }
        public void startElement ( String uri, String name, String tag, Attributes atts ) {
            System.out.print("<" + tag + ">");
        }
        public void endElement ( String uri, String name, String tag ) {
            System.out.print("</" + tag + ">");
        }
        public void characters ( char text[], int start, int length ) {
            System.out.print(new String(text,start,length));
        }
    }

The Child Handler

    class Child extends DefaultHandler {
        DefaultHandler next;   // the next handler in the pipeline
        String ptag;           // the tagname of the child
        boolean keep;          // are we keeping or skipping events?
        short level;           // the depth level of the current element

        public Child ( String s, DefaultHandler n ) {
            super();
            next = n;
            ptag = s;
            keep = false;
            level = 0;
        }

        public void startDocument () throws SAXException {
            next.startDocument();
        }

        public void endDocument () throws SAXException {
            next.endDocument();
        }

The Child Handler (cont.)

        public void startElement ( String nm, String ln, String qn, Attributes a ) throws SAXException {
            // an element at depth 1 is a child of the root of the incoming stream
            if (level++ == 1)
                keep = ptag.equals(qn);
            if (keep)
                next.startElement(nm,ln,qn,a);
        }

        public void endElement ( String nm, String ln, String qn ) throws SAXException {
            if (keep)
                next.endElement(nm,ln,qn);
            if (--level == 1)
                keep = false;
        }

        public void characters ( char[] text, int start, int length ) throws SAXException {
            if (keep)
                next.characters(text,start,length);
        }
    }

Forming the Pipeline

    class SAX {
        public static void main ( String args[] ) throws Exception {
            SAXParserFactory pf = SAXParserFactory.newInstance();
            SAXParser parser = pf.newSAXParser();
            DefaultHandler handler
                = new Child("gradstudent",
                            new Child("name",
                                      new Printer()));
            parser.parse(new InputSource(new FileReader("cs.xml")), handler);
        }
    }

[Figure: SAX parser -> Child:gradstudent -> Child:name -> Printer]

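Example

Input stream:

    <department>
      <deptname>Computer Science</deptname>
      <gradstudent>
        <name>
          <lastname>Smith</lastname>
          <firstname>John</firstname>
        </name>
      </gradstudent>
      ...
    </department>

SAX events (SD/ED = start/end of document, SE/EE = start/end of element, C = characters):

    SD:
    SE: department
    SE: deptname
    C: Computer Science
    EE: deptname
    SE: gradstudent
    SE: name
    SE: lastname
    C: Smith
    EE: lastname
    SE: firstname
    C: John
    EE: firstname
    EE: name
    EE: gradstudent
    ...
    EE: department
    ED:

[Figure: the events flow through Child: gradstudent -> Child: name -> Printer]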
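To see what reaches the end of the pipeline for this input, the handlers can be run against an in-memory copy of the example. The sketch below is a compact restatement under assumptions: `PipelineDemo`, `ChildFilter` and `Collector` are hypothetical names, `ChildFilter` follows the same depth-counting logic as the Child handler, and `Collector` stands in for the Printer by gathering its output in a string instead of writing to standard output.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class PipelineDemo {
    // The slide's example input, without whitespace between tags.
    static final String SAMPLE =
        "<department><deptname>Computer Science</deptname>"
      + "<gradstudent><name><lastname>Smith</lastname>"
      + "<firstname>John</firstname></name></gradstudent></department>";

    // Last stage: turns the events it receives back into markup text.
    static class Collector extends DefaultHandler {
        final StringBuilder out = new StringBuilder();
        @Override public void startElement(String uri, String ln, String qn, Attributes a) {
            out.append('<').append(qn).append('>');
        }
        @Override public void endElement(String uri, String ln, String qn) {
            out.append("</").append(qn).append('>');
        }
        @Override public void characters(char[] text, int start, int length) {
            out.append(text, start, length);
        }
    }

    // Forwards only the subtrees of depth-1 elements whose tag equals 'tag'.
    static class ChildFilter extends DefaultHandler {
        private final String tag;
        private final DefaultHandler next;
        private boolean keep = false;
        private int level = 0;

        ChildFilter(String tag, DefaultHandler next) {
            this.tag = tag;
            this.next = next;
        }
        @Override public void startElement(String uri, String ln, String qn, Attributes a)
                throws SAXException {
            if (level++ == 1) keep = tag.equals(qn);
            if (keep) next.startElement(uri, ln, qn, a);
        }
        @Override public void endElement(String uri, String ln, String qn) throws SAXException {
            if (keep) next.endElement(uri, ln, qn);
            if (--level == 1) keep = false;
        }
        @Override public void characters(char[] text, int start, int length) throws SAXException {
            if (keep) next.characters(text, start, length);
        }
    }

    // Run parser -> ChildFilter(gradstudent) -> ChildFilter(name) -> Collector.
    static String run(String xml) {
        try {
            Collector sink = new Collector();
            DefaultHandler pipeline =
                new ChildFilter("gradstudent", new ChildFilter("name", sink));
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), pipeline);
            return sink.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("pipeline failed", e);
        }
    }

    public static void main(String[] args) {
        // prints <name><lastname>Smith</lastname><firstname>John</firstname></name>
        System.out.println(run(SAMPLE));
    }
}
```

Each filter removes one level of nesting: the first stage forwards the gradstudent subtree, so in the second stage gradstudent is the root and name sits at depth 1.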

XSL Transformation

A stylesheet specification language for converting XML documents into various forms (XML, HTML, plain text, etc.).
Can transform each XML element into another element, add new elements to the output file, or remove elements.
Can rearrange and sort elements, test and make decisions about which elements to display, and much more.
Based on XPath:

    <xsl:stylesheet version='1.0' xmlns:xsl='http//

XSLT Templates

XSL uses XPath to define parts of the source document that match one or more predefined templates.
When a match is found, XSLT transforms the matching part of the source document into the result document.
Parts of the source document that match no explicit template are processed by the default templates, so their text reaches the result document but their tags do not.
Form:

    <xsl:template match="XPath"> … </xsl:template>

The default (implicit) templates visit all nodes and strip out all tags:

    <xsl:template match="*|/"><xsl:apply-templates/></xsl:template>
    <xsl:template match="text()|@*"><xsl:value-of select="."/></xsl:template>
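The effect of the default templates can be checked by running a stylesheet that defines no templates at all, and then one that adds a single explicit template. This is a sketch under assumptions: the class `DefaultTemplates`, its helper and the sample document are invented for illustration, and text output is requested so that no XML declaration is written.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DefaultTemplates {
    static final String SAMPLE =
        "<department><deptname>Computer Science</deptname>"
      + "<gradstudent><name><lastname>Smith</lastname>"
      + "<firstname>John</firstname></name></gradstudent></department>";

    static final String HEADER =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>";

    // No templates: every node is handled by the built-in rules.
    static final String EMPTY_SHEET = HEADER + "</xsl:stylesheet>";

    // One explicit template: only lastname elements are treated specially.
    static final String LASTNAME_SHEET = HEADER
      + "<xsl:template match='lastname'>[<xsl:value-of select='.'/>]</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply a stylesheet (given as a string) to an XML string.
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // prints Computer ScienceSmithJohn
        System.out.println(transform(EMPTY_SHEET, SAMPLE));
        // prints Computer Science[Smith]John
        System.out.println(transform(LASTNAME_SHEET, SAMPLE));
    }
}
```

In the second run the text of deptname and firstname still appears, because those elements fall through to the default rules.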

Other XSLT Elements

<xsl:value-of select="XPath"/>
    select the value of an XML element and add it to the output stream of the transformation.
<xsl:copy-of select="XPath"/>
    copy the entire XML element to the output stream of the transformation.
<xsl:apply-templates select="XPath"/>
    apply the template rules to the elements that match the XPath expression.
<xsl:element name="XPath"> … </xsl:element>
    add an element to the output with a tag-name derived from the XPath.
Example:

    <xsl:stylesheet version = '1.0' xmlns:xsl='
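A small stylesheet that combines three of these elements might look as follows. It is not the slide's own example, which is cut off in this transcript; the class `XsltElements`, the stylesheet and the sample document are assumptions. The `name` attribute of `xsl:element` is an attribute value template, so the XPath expression is written inside curly braces.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltElements {
    static final String SAMPLE =
        "<department><deptname>Computer Science</deptname>"
      + "<gradstudent><name><lastname>Smith</lastname>"
      + "<firstname>John</firstname></name></gradstudent></department>";

    // For each student name, emit an element named after the last name
    // whose content is the first name.
    static final String SHEET =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<xsl:apply-templates select='//gradstudent/name'/>"
      + "</xsl:template>"
      + "<xsl:template match='name'>"
      + "<xsl:element name='{lastname}'>"
      + "<xsl:value-of select='firstname'/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply a stylesheet (given as a string) to an XML string.
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // expected to print <Smith>John</Smith>
        System.out.println(transform(SHEET, SAMPLE));
    }
}
```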

Copy the Entire Document

    <xsl:stylesheet version = '1.0' xmlns:xsl='
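The stylesheet on this slide is cut off in the transcript. One common way to copy a whole document is the identity template sketched below; it is shown as an assumption about the general idiom, not as a reconstruction of the slide. The class name `IdentityTransform` and the sample input are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class IdentityTransform {
    // Matches every attribute and node, copies it, and recurses into
    // its attributes and children.
    static final String IDENTITY =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply a stylesheet (given as a string) to an XML string.
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The output should contain the same elements, attributes and text as the input.
        System.out.println(transform(IDENTITY, "<a x=\"1\"><b>hi</b></a>"));
    }
}
```

The identity template is a useful starting point: adding a more specific template for one element changes only that element and leaves the rest of the document as it was.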

More on XSLT

Conflict resolution: more specific templates override more general templates. Templates are assigned default priorities, but these can be overridden using priority="n" in a template.
Modes can be used to group templates together. A template without a mode belongs to the empty (default) mode.
Conditional and loop statements:

    <xsl:if test="XPath"> body </xsl:if>
    <xsl:for-each select="XPath"> body </xsl:for-each>

Variables can be used to name data:

    <xsl:variable name="x"> value </xsl:variable>

Variables are referenced as $x in XPath expressions, and as {$x} inside attribute value templates.
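The loop, conditional and variable constructs can be combined in one short stylesheet. The following is a sketch only: the class `XsltControl`, the student data and the threshold are made up for illustration. The variable is declared with a `select` attribute and referenced as `$min` inside the `test` expression.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltControl {
    // Hypothetical input data.
    static final String STUDENTS =
        "<students>"
      + "<student><name>Smith</name><gpa>3.8</gpa></student>"
      + "<student><name>Jones</name><gpa>3.1</gpa></student>"
      + "<student><name>Lee</name><gpa>3.5</gpa></student>"
      + "</students>";

    // Print the name of every student whose gpa is at least $min.
    static final String SHEET =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:variable name='min' select='3.5'/>"
      + "<xsl:for-each select='//student'>"
      + "<xsl:if test='gpa &gt;= $min'>"
      + "<xsl:value-of select='name'/><xsl:text>;</xsl:text>"
      + "</xsl:if>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply a stylesheet (given as a string) to an XML string.
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // prints Smith;Lee;
        System.out.println(transform(SHEET, STUDENTS));
    }
}
```

Inside `xsl:for-each` the context node is the current student, so `gpa` and `name` are resolved relative to it.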

Using XSLT

    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.w3c.dom.*;
    import javax.xml.transform.*;
    import javax.xml.transform.dom.*;
    import javax.xml.transform.stream.*;
    import java.io.*;

    class XSLT {
        public static void main ( String argv[] ) throws Exception {
            File stylesheet = new File("x.xsl");
            File xmlfile = new File("a.xml");
            // parse the input document into a DOM tree
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document document = db.parse(xmlfile);
            // compile the stylesheet into a transformer
            StreamSource stylesource = new StreamSource(stylesheet);
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer(stylesource);
            // transform the DOM tree and write the result to standard output
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult(System.out);
            transformer.transform(source,result);
        }
    }