1 CS122B: Projects in Databases and Web Applications Spring 2015 Notes 05: XML Professor Chen Li Department of Computer Science UC Irvine CS122BNotes 05:

Slides:



Advertisements
Similar presentations
XML-XSL Introduction SHIJU RAJAN SHIJU RAJAN Outline Brief Overview Brief Overview What is XML? What is XML? Well Formed XML Well Formed XML Tag Name.
Advertisements

What is XML? a meta language that allows you to create and format your own document markups a method for putting structured data into a text file; these.
XML: Extensible Markup Language
XML 6.3 DTD 6. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:  Elements.
Document Type Definitions
Introduction to XLink Transparency No. 1 XML Information Set W3C Recommendation 24 October 2001 (1stEdition) 4 February 2004 (2ndEdition) Cheng-Chia Chen.
31 Signs That Technology Has Taken Over Your Life: #6. When you go into a computer store, you eavesdrop on a salesperson talking with customers -- and.
Sistemi basati su conoscenza XML Prof. M.T. PAZIENZA a.a
Java API for XML Processing (JAXP) CSE 4/586: Distributed Systems Department of Computer Science and Engineering University at Buffalo, New York Jia Zhao.
Thayer School of Engineering Dartmouth Lecture 2 Overview Web Services concept XML introduction Visual Studio.net.
Sistemi basati su conoscenza XML Prof. M.T. PAZIENZA a.a
XML(EXtensible Markup Language). XML XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to describe.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
Tutorial 11 Creating XML Document
XML Primer. 2 History: SGML vs. HTML vs. XML SGML (1960) XML(1996) HTML(1990) XHTML(2000)
Document Type Definitions. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:
XML introduction to Ahmed I. Deeb Dr. Anwar Mousa  presenter  instructor University Of Palestine-2009.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Introduction to XML cs3505. References –I got most of this presentation from this site –O’reilly tutorials.
XML CPSC 315 – Programming Studio Fall 2008 Project 3, Lecture 1.
XML What is XML? XML v.s. HTML XML Components Well-formed and Valid Document Type Definition (DTD) Extensible Style Language (XSL) SAX and DOM.
XML 1 Enterprise Applications CE00465-M XML. 2 Enterprise Applications CE00465-M XML Overview Extensible Mark-up Language (XML) is a meta-language that.
SAX Parsing Presented by Clifford Lemoine CSC 436 Compiler Design.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
Advanced Java Session 9 New York University School of Continuing and Professional Studies.
 XML is designed to describe data and to focus on what data is. HTML is designed to display data and to focus on how data looks.  XML is created to structure,
XML Parsers Overview  Types of parsers  Using XML parsers  SAX  DOM  DOM versus SAX  Products  Conclusion.
Tutorial 1: XML Creating an XML Document. 2 Introducing XML XML stands for Extensible Markup Language. A markup language specifies the structure and content.
Electronic Commerce COMP3210 Session 4: Designing, Building and Evaluating e-Commerce Initiatives – Part II Dr. Paul Walcott Department of Computer Science,
How do I use HTML and XML to present information?.
By Mohsen ashouri.  Introduction  Comparison between XML and HTML  XML Syntax  Challenges  Summary.
Softsmith Infotech XML. Softsmith Infotech XML EXtensible Markup Language XML is a markup language much like HTML Designed to carry data, not to display.
E0262 – MIS – Multimedia Storage Techniques XML (Extensible Markup Language  XML is a markup language for creating documents containing structured information.
XML 2nd EDITION Tutorial 1 Creating An Xml Document.
WEB BASED DATA TRANSFORMATION USING XML, JAVA Group members: Darius Balarashti & Matt Smith.
XML Documents Chao-Hsien Chu, Ph.D. School of Information Sciences and Technology The Pennsylvania State University Elements Attributes Comments PI Document.
Introduction to XML This presentation covers introductory features of XML. What XML is and what it is not? What does it do? Put different related technologies.
XML Instructor: Charles Moen CSCI/CINF XML  Extensible Markup Language  A set of rules that allow you to create your own markup language  Designed.
XP 1 Creating an XML Document Developing an XML Document for the Jazz Warehouse XML Tutorial.
School of Computing and Information Systems CS 371 Web Application Programming XML and JSON Encoding Data.
Web Technologies COMP6115 Session 4: Adding a Database to a Web Site Dr. Paul Walcott Department of Computer Science, Mathematics and Physics University.
1 Introduction to XML XML stands for Extensible Markup Language. Because it is extensible, XML has been used to create a wide variety of different markup.
An Introduction to XML Sandeep Bhattaram
XML Introduction. What is XML? XML stands for eXtensible Markup Language XML stands for eXtensible Markup Language XML is a markup language much like.
XML Design Goals 1.XML must be easily usable over the Internet 2.XML must support a wide variety of applications 3.XML must be compatible with SGML 4.It.
XML Introduction. Markup Language A markup language must specify What markup is allowed What markup is required How markup is to be distinguished from.
1 Tutorial 11 Creating an XML Document Developing a Document for a Cooking Web Site.
C# and Windows Programming XML Processing. 2 Contents Markup XML DTDs XML Parsers DOM.
CS 157B: Database Management Systems II February 11 Class Meeting Department of Computer Science San Jose State University Spring 2013 Instructor: Ron.
SAX2 and DOM2 Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 1COMP9321, 15s2, Week.
Well Formed XML The basics. A Simple XML Document Smith Alice.
When we create.rtf document apart from saving the actual info the tool saves additional info like start of a paragraph, bold, size of the font.. Etc. This.
What is XML? eXtensible Markup Language eXtensible Markup Language A subset of SGML (Standard Generalized Markup Language) A subset of SGML (Standard Generalized.
Introduction to DTD A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list.
XML CSC1310 Fall HTML (TIM BERNERS-LEE) HyperText Markup Language  HTML (HyperText Markup Language): December  Markup  Markup is a symbol.
XML DTD. XML Validation XML with correct syntax is "Well Formed" XML. XML validated against a DTD is "Valid" XML.
XML. HTML Before you continue you should have a basic understanding of the following: HTML HTML was designed to display data and to focus on how data.
Tutorial 9 Working with XHTML. New Perspectives on HTML, XHTML, and XML, Comprehensive, 3rd Edition 2 Objectives Describe the history and theory of XHTML.
XML CORE CSC1310 Fall XML DOCUMENT XML document XML document is a convenient way for parsers to archive data. In other words, it is a way to describe.
7-Mar-16 Simple API XML.  SAX and DOM are standards for XML parsers-- program APIs to read and interpret XML files  DOM is a W3C standard  SAX is an.
Introduction to XML Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
XML Introduction to XML Extensible Markup Language.
Unit 4 Representing Web Data: XML
CS122B: Projects in Databases and Web Applications Spring 2017
Java XML IS
The XML Language.
Chapter 7 Representing Web Data: XML
Allyson Falkner Spokane County ISD
Presentation transcript:

1 CS122B: Projects in Databases and Web Applications Spring 2015 Notes 05: XML Professor Chen Li Department of Computer Science UC Irvine CS122BNotes 05: XML

CS122BNotes 05: XML2 Outline XML basics DTD Parsing XML (SAX/DOM)

CS122BNotes 05: XML3 An HTML document

CS122BNotes 05: XML4 HTML code About Overview Distinctions Campus Data Strategic Plan Spotlights on Innovation

CS122BNotes 05: XML5 What is the problem? To do more fancy things with documents: –need to make their logical structure explicit. Otherwise, software applications – do not know what is what – do not have any handle over documents.

CS122BNotes 05: XML6 An XML document QuickBooks Inorganic Chemistry Brooks/Cole Publishing 1991 James Bowser 43.72

CS122BNotes 05: XML7 What is XML? eXtensible Markup Language Data are identified using tags (identifiers enclosed in angle brackets: ) Collectively, the tags are known as “markup” XML tags tell you what the data means, rather than how to display it

CS122BNotes 05: XML8 XML versus relational –Relational: structured –XML: semi-structured –Plain text file: unstructured

CS122BNotes 05: XML9 How does XML work? XML allows developers to write their own Document Type Definitions (DTD) DTD is a markup language’s rule book that describes the sets of tags and attributes that is used to describe specific content If you want to use a certain tag, then it must first be defined in DTD

CS122BNotes 05: XML10 Key Components in XML Three generic components, and one customizable component XML Content DTD Rules XML ParserApplication

CS122BNotes 05: XML11 Meta Markup Language Not a language – but a way of specifying other languages Meta-markup language – gives the rules by which other markup languages can be written Portable - platform independent

CS122BNotes 05: XML12 Markup Languages Presentation based: –Markup languages that describe information for presentation for human consumption Content based: –Describe information that is of interest to another computer application

CS122BNotes 05: XML13 HTML and XML HTML tag says "display this data in bold font" –... XML tag acts like a field name in your program It puts a label on a piece of data that identifies it –...

CS122BNotes 05: XML14 Simple Example XML data for a messaging application: Why is it good? Let me count the ways...

CS122BNotes 05: XML15 Element Data between the tag and its matching end tag defines an element of the data Comment: –

CS122BNotes 05: XML16 Example Why is it good? Let me count the ways...

CS122BNotes 05: XML17 Attributes Tags can also contain attributes Attributes contain additional information included as part of the tag, within the tag's angle brackets Attribute name is followed by an equality sign and the attribute value

CS122BNotes 05: XML18 Other Basics White space is essentially irrelevant Commas between attributes are not ignored - if present, they generate an error Case sensitive: “message” and “MESSAGE” are different

CS122BNotes 05: XML19 Well Formed XML Every tag has a closing tag XML represents hierarchical data structures having one tag to contain others – Tags have to be completely nested Correct: – Incorrect –......

CS122BNotes 05: XML20 Empty Tag Empty tag is used when it makes sense to have a tag that stands by itself and doesn't enclose any content - a "flag" –You can create an empty tag by ending it with /> –

CS122BNotes 05: XML21 Example Why is it good? Let me count the ways...

CS122BNotes 05: XML22 Tree representation Hull California 1995 Su Purdue Hull Purdue BOOKS California Su titleauthor title author article book year 1995 ref loc=“library”

CS122BNotes 05: XML23 Special Characters Some characters need to be escaped because they have special significance: –<< –>> –&& –‘&apos; –“" If they were not escaped - would be processed as markup by XML engine

CS122BNotes 05: XML24 Prolog in XML Files XML file always starts with a prolog The minimal prolog contains a declaration that identifies the document as an XML document: The declaration may also contain additional information –version - version of the XML used in the data –encoding - Identifies the character set used –standalone - whether the document references an external entity or data type specification

CS122BNotes 05: XML25 An Example Introduction to CML Overview Why is XML great?

CS122BNotes 05: XML26 Next: Data Type Definition (DTD)

CS122BNotes 05: XML27 Data Type Definition (DTD) DTD specifies the types of tags that can be included in the XML document –it defines which tags are valid, and in what arrangements –where text is expected, letting the parser determine whether the whitespace it sees is significant or ignorable An optional part of the document prolog

CS122BNotes 05: XML28 XML document and DTD item1 Slideshow DB item slide item title item2 AI title item item3 Slideshow Slide title item + * XML Document XML DTD

CS122BNotes 05: XML29 Qualifiers

CS122BNotes 05: XML30 Defining Text PCDATA: –Parsed Character DATA (PCDATA) –"#" that precedes PCDATA indicates that what follows is a special word, rather than an element name CDATA: –Unparsed character data –Normally used for embedding scripts (such as Javascript scripts). Comparison:

CS122BNotes 05: XML31 Attribute Types

32 Next: Parsing XML (SAX/DOM) CS122BNotes 05: XML

CS122BNotes 05: XML33 What is an XML Parsing API? Programming model for accessing an XML document Sits on top of an XML parsing engine Language/platform independent

CS122BNotes 05: XML34 Java XML Parsing Specification The Java XML Parsing Specification is a request to include a standardised way of parsing XML into the Java standard library The specification defines the following packages: –javax.xml.parsers –org.xml.sax –org.xml.sax.helpers –org.w3c.dom The first is an all-new plugability layer, the others come from existing packages

CS122BNotes 05: XML35 Two ways of using XML parsers: SAX and DOM The Java XML Parsing Specification specifies two interfaces for XML parsers: –Simple API for XML (SAX) is a flat, event-driven parser –Document Object Model (DOM) is an object-oriented parser which translates the XML document into a Java Object hierarchy

CS122BNotes 05: XML36 SAX Simple API for XML Event-based XML parsing API Not governed by any standards body –Guy named David Megginson basically owns it… SAX is simply a programming model that the developers of individual XML parsers implement –SAX parser written in Java would expose the equivalent events –"serial access" protocol for XML

CS122BNotes 05: XML37 SAX (cont) A SAX parser reads the XML document as a stream of XML tags: –starting elements, ending elements, text sections, etc. Every time the parser encounters an XML tag it calls a method in its HandlerBase object to deal with the tag. The HandlerBase object is usually written by the application programmer. The HandlerBase object is given as a parameter to the parse() method in the SAX parser. It includes all the code that defines what the XML tags actually ”do”.

CS122BNotes 05: XML38 XML Document How Does SAX work? John Doe Jane Doe SAX Objects Parser startDocument Parser startElement Parser startElement & characters Parser Parser endElement Parser startElement Parser startElement & characters Parser Parser endElement Parser endElement & endDocument

CS122BNotes 05: XML39 SAX structure

CS122BNotes 05: XML40 SAX tutorial x/parsing.html

CS122BNotes 05: XML41 Document Object Model (DOM) Most common XML parser API Tree-based API W3C Standard All DOM compliant parsers use the same object model

CS122BNotes 05: XML42 DOM (cont) A DOM parser is usually referred to as a document builder. It is not really a parser, more like a translator that uses a parser. In fact, most DOM implementations include a SAX parser within the document builder. A document builder reads in the XML document and outputs a hierarchy of Node objects, which corresponds to the structure of the XML document.

CS122BNotes 05: XML43 DOM Objects How Does DOM work? XML Document John Doe Jane Doe Node addressbook person Node Name=“John Doe” Node Node person Node Name=“John Doe” Node XML Parser

CS122BNotes 05: XML44 DOM Structure Model and API hierarchy of Node objects:hierarchy –document, element, attribute, text, comment,... language independent programming DOM API: –get... first/last child, prev/next sibling, childNodes –insertBefore, replace –getElementsByTagName –... Alternative event-based SAX API (Simple API for XML) –does not build a parse tree (reports events when encountering begin/end tags) –for (partially) parsing very large documents

CS122BNotes 05: XML45 DOM references Online tutorial: m/readingXML.html

CS122BNotes 05: XML46 A few functions –Create a DOM document (tree) Document doc = builder.parse( new File(argv[0]) ); –Remove those text nodes from those XML formatting spaces doc.normalize(); –Generate a list of nodes for those “link” elements NodeList links = doc.getElementsByTagName("link"); –W3C treats the text as a node …getFirstChild().getNodeValue()