Lecture 8 XML & its applications


Definition

Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents and partially describes the behavior of computer programs which process them. XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language [ISO 8879]. By construction, XML documents are conforming SGML documents.

Source: Extensible Markup Language (XML) 1.0 (Third Edition), W3C Recommendation, 04 February 2004

So what is it really?

- A document syntax (markup) standard for text documents that is simple and open (non-proprietary), intended for electronic data exchange and storage.
- It is flexible and extensible (the X in XML) because it allows users to create their own vocabularies (new markup languages); there is no fixed set of tags as in HTML or XHTML.
- XML documents contain only data delimited by tags, with no formatting instructions or style.
- Arguably the most important document syntax standard in the history of computing: “the ASCII of the Internet Age”.

A little history

- Developed by an XML Working Group formed under the auspices of the World Wide Web Consortium (W3C) in 1996.
- A subset of SGML (Standard Generalized Markup Language), originally designed to meet the challenges of large-scale electronic publishing.
- XML is now adopted in fields as diverse as law, healthcare, insurance, multimedia, web publishing, EDI, telecommunications, aeronautics, engineering, software, hospitality, tourism, retail, stock trading, etc.

Design goals

The original design goals for XML were:

- that it should be straightforwardly usable over the Internet.
- that it should support a wide variety of applications.
- that it be compatible with SGML.
- that it should be easy to write programs which process XML documents.
- that the number of optional features in XML was to be kept to the absolute minimum, ideally zero.
- that XML documents should be human-legible and reasonably clear.
- that the XML design would be prepared quickly.
- that the design of XML would be formal and concise.
- that XML documents would be easy to create.
- that terseness in XML markup was to be of minimal importance.

Example XML document

<?xml version="1.0" encoding="UTF-8"?>
<patient nhs-no="7503557856">
  <name>
    <first>Joseph</first>
    <middle>Michael</middle>
    <last>Bloggs</last>
    <previous />
    <preferred>Joe</preferred>
  </name>
  <title>Mr</title>
  <address>
    <street1>2 Gloucester Road</street1>
    <street2 />
    <street3 />
    <city>Bristol</city>
    <county>Avon</county>
    <postcode>BS2 4QS</postcode>
  </address>
  <tel>
    <home>0117 9541054</home>
    <mobile>07710 234674</mobile>
  </tel>
  <email>joe.bloggs@email.com</email>
  <fax />
</patient>
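As a minimal sketch of how a program might read the example document, the snippet below parses it with Python's standard `xml.etree.ElementTree` module. The variable names are illustrative, and the choice of Python is an assumption of this sketch rather than something the lecture prescribes.

```python
import xml.etree.ElementTree as ET

# The example patient document, passed as bytes so the encoding
# declaration is honoured by the parser.
PATIENT_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<patient nhs-no="7503557856">
  <name>
    <first>Joseph</first>
    <middle>Michael</middle>
    <last>Bloggs</last>
    <previous />
    <preferred>Joe</preferred>
  </name>
  <title>Mr</title>
  <address>
    <street1>2 Gloucester Road</street1>
    <street2 />
    <street3 />
    <city>Bristol</city>
    <county>Avon</county>
    <postcode>BS2 4QS</postcode>
  </address>
  <tel>
    <home>0117 9541054</home>
    <mobile>07710 234674</mobile>
  </tel>
  <email>joe.bloggs@email.com</email>
  <fax />
</patient>"""

root = ET.fromstring(PATIENT_XML)

nhs_no = root.get("nhs-no")            # attribute on the root element
first = root.findtext("name/first")    # text of a nested element
postcode = root.findtext("address/postcode")
fax = root.find("fax")                 # an empty element: present, but no text

print(root.tag, nhs_no, first, postcode, fax.text)
```

Note that an empty element such as `<fax />` is still found by the parser; it simply has no text content.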

Other (traditional) formats

Pipe-delimited:

nhs-no|first|middle|last|previous|preferred|……………….|email|fax
7503557856|Joseph|Michael|Bloggs||Joe|………………….|joe.bloggs@email.com|

Relational table:

Patient
nhs-no   7503557856
first    Joseph
middle   Michael
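To make the contrast with XML concrete, here is a rough sketch that turns a pipe-delimited record into the corresponding XML fragment. It uses only the leading columns that the slide shows in full (the elided columns are left out), and the decision to nest the name fields under a `<name>` element is an assumption taken from the example document.

```python
import xml.etree.ElementTree as ET

# Only the columns that are fully visible on the slide; the elided ones are omitted.
header = "nhs-no|first|middle|last|previous|preferred"
record = "7503557856|Joseph|Michael|Bloggs||Joe"

# Pair each column name with its value; an empty string means "no value".
fields = dict(zip(header.split("|"), record.split("|")))

patient = ET.Element("patient", {"nhs-no": fields.pop("nhs-no")})
name = ET.SubElement(patient, "name")
for tag, value in fields.items():
    child = ET.SubElement(name, tag)
    child.text = value or None   # empty field becomes an empty element

xml_text = ET.tostring(patient, encoding="unicode")
print(xml_text)
```

In the delimited form the meaning of each value depends on its position; in the XML form every value carries its own label.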

Example XML document deconstructed

<?xml version="1.0" encoding="UTF-8"?>
<patient nhs-no="7503557856">
  <!-- Patient demographics -->
  <name>
    <first>Joseph</first>
    <middle>Michael</middle>
    <last>Bloggs</last>
    <previous/>
    <preferred>Joe</preferred>
  </name>
  <title>Mr</title>
  <address>
    <street1>2 Gloucester Road</street1>
    <street2 />
    <street3 />
    <city>Bristol</city>
    <county>Avon</county>
    <postcode>BS2 4QS</postcode>
  </address>
  <tel>
    <home>0117 9541054</home>
    <mobile>07710 234674</mobile>
  </tel>
  <email>joe.bloggs@email.com</email>
  <fax />
</patient>

- XML declaration (optional), <?xml version="1.0" encoding="UTF-8"?>: used by the XML processor; this document conforms to XML version 1.0 and uses the UTF-8 encoding (Unicode optimized for ASCII).
- Root element, <patient>: every well-formed XML document must be enclosed by exactly one root element.
- Attribute, nhs-no="7503557856": attributes provide additional information about an element and consist of a name-value pair; the value must be enclosed in single (') or double (") quotes.
- Comment, <!-- Patient demographics -->: comments must be delimited by <!-- and --> as in XHTML.
- Simple element containing text, e.g. <first>Joseph</first>.
- Complex element containing other elements and text, e.g. <name> and <address>.
- Empty elements, e.g. <previous/>, <street2 /> and <fax />.

Tree view of example XML document
(all XML documents are hierarchical in structure)

patient  [attribute: nhs-no = 7503557856]
  name
    first      Joseph
    middle     Michael
    last       Bloggs
    previous
    preferred  Joe
  title        Mr
  address
    street1    2 Gloucester Rd
    street2
    street3
    city       Bristol
    county     Avon
    postcode   BS2 4QS
  tel
    home       01179541054
    mobile     07710234674
  fax

KEY: indented names are elements; values to the right are element content; the bracketed entry is an attribute.
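The hierarchical structure can also be produced programmatically. The sketch below walks a shortened version of the patient document recursively and returns one indented line per element; the `outline` helper and its `@name=value` notation for attributes are illustrative choices, not part of the lecture.

```python
import xml.etree.ElementTree as ET

# A shortened version of the patient document, kept small for readability.
doc = ET.fromstring(
    '<patient nhs-no="7503557856">'
    "<name><first>Joseph</first><last>Bloggs</last></name>"
    "<title>Mr</title><fax /></patient>"
)

def outline(element, depth=0):
    """Return one indented line per element, with attributes and text content."""
    attrs = "".join(f" @{key}={value}" for key, value in element.attrib.items())
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + attrs + (f": {text}" if text else "")
    lines = [line]
    for child in element:          # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(doc)
print("\n".join(tree_lines))
```

Each level of indentation corresponds to one level of nesting in the document.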

Well-formed XML documents (1)

Every XML document must be well-formed and must therefore adhere to the following rules (among others):

- Every start-tag must have a matching end-tag.
- Elements may nest but must not overlap.
    <name>Anna<em>Coffey</em></name> - √
    <name><em>Anna</name>Coffey</em> - ×
- There must be exactly one root element.
- Attribute values must be quoted.
- An element must not have two attributes with the same name.
- Comments and processing instructions may not appear inside tags.
- No unescaped < or & signs may occur in the character data of an element.

Note: An XML document may be well-formed but not valid. A valid document requires a declaration that identifies a Document Type Definition (DTD) or schema that the document conforms to. This ensures that the document meets various grammar rules for each of its elements and attributes, their order and the values that are allowed. A validating parser can check the document to ensure these rules are met. We will look at XML Schemas in some detail in the next lecture.
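A quick way to see these rules in action is to hand small fragments to a parser and observe which ones it rejects. The sketch below uses Python's standard `xml.etree.ElementTree`, which is a non-validating parser: it checks well-formedness only, not validity. The helper name `is_well_formed` and the test fragments other than the two `<name>` examples are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

samples = {
    "nested": "<name>Anna<em>Coffey</em></name>",
    "overlapping": "<name><em>Anna</name>Coffey</em>",
    "two roots": "<a/><b/>",
    "unquoted attribute": "<a id=1/>",
    "duplicate attribute": '<a id="1" id="2"/>',
    "bare ampersand": "<a>fish & chips</a>",
    "escaped ampersand": "<a>fish &amp; chips</a>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the properly nested fragment and the one with the escaped ampersand are accepted; each of the others violates one of the rules listed above.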

Well-formed XML documents (2)

- Element names are case sensitive: <NAME>, <name>, <Name> & <NaMe> are four different element types.
- No white space in element names: <First Name> is not allowed; <First_Name> is OK.
- Element names cannot start with the letters “XML” or “xml”, which are reserved terms.
- Element names must start with a letter or an underscore.
- Element names cannot start with a number, but numbers may be embedded within an element name: <2you> is not allowed; <me2you> is OK.
- Attribute names are constrained by the above rules for element names.
- Entity references are used to substitute specific characters. There are five predefined entities built into XML:

Entity   Char   Notes
&amp;    &      Do not use inside processing instructions.
&lt;     <      Use inside attribute values quoted with ".
&gt;     >      Use after ]] in normal text and inside processing instructions.
&quot;   "      Use inside attribute values quoted with ".
&apos;   '      Use inside attribute values quoted with '.
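The predefined entities and the naming rules can be tried out with the Python standard library. In this sketch, `escape` and `quoteattr` from `xml.sax.saxutils` perform the substitutions, and a parser round trip confirms that the original characters come back; the sample strings are invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = 'Fish & chips < "steak" > pie'

# escape() replaces &, < and > with their predefined entity references.
escaped = escape(raw)

# Parsing the escaped text gives back the original characters.
round_trip = ET.fromstring(f"<note>{escaped}</note>").text

# quoteattr() picks a quote character and escapes the value as needed.
attr_plain = quoteattr("a<b")        # wrapped in double quotes, < escaped
attr_inches = quoteattr('5" disk')   # wrapped in single quotes instead

def parses(text):
    """Return True if the parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

name_checks = {
    "<me2you/>": parses("<me2you/>"),
    "<2you/>": parses("<2you/>"),
    "<First_Name/>": parses("<First_Name/>"),
}
print(escaped, attr_plain, attr_inches, name_checks)
```

Names are also case sensitive for the parser: `<Name></name>` is rejected because the end-tag does not match the start-tag.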

XML Validation (1)

- Documents are rarely written in "plain" XML; instead XML is used to create one or more vocabularies, specific custom markup languages (often referred to as XML applications), and it is these languages which are used to create documents.
- Such a language (a set of namespaces, elements, attributes etc., i.e. a vocabulary) is defined using a set of rules which specify the (potentially infinite) set of complying documents.
- Such a set of rules is generically referred to as a schema.
- For instance, in our example document, we may want to specify rules that state that the <name> element must always contain exactly one each of the <first>, <middle>, <last>, <previous> & <preferred> elements and that they must occur in this order.
- Additional rules we might want to specify are that the <first> & <last> elements must always contain alphanumeric values (not empty) and that they must never exceed 256 characters each.
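To illustrate what such rules check, the sketch below hand-codes the example rules for `<name>` in Python. This is not a schema language and not how DTD or XML Schema validation is implemented; it is only a hypothetical `check_name` helper showing the kind of test a validating parser would derive from a schema.

```python
import xml.etree.ElementTree as ET

REQUIRED_ORDER = ["first", "middle", "last", "previous", "preferred"]
MAX_LENGTH = 256

def check_name(name_element):
    """Return a list of rule violations for a <name> element (empty if none)."""
    problems = []
    tags = [child.tag for child in name_element]
    if tags != REQUIRED_ORDER:
        problems.append(f"expected children {REQUIRED_ORDER}, found {tags}")
    for tag in ("first", "last"):
        value = (name_element.findtext(tag) or "").strip()
        if not value:
            problems.append(f"<{tag}> must not be empty")
        elif not value.isalnum():
            problems.append(f"<{tag}> must be alphanumeric")
        elif len(value) > MAX_LENGTH:
            problems.append(f"<{tag}> exceeds {MAX_LENGTH} characters")
    return problems

good = ET.fromstring(
    "<name><first>Joseph</first><middle>Michael</middle><last>Bloggs</last>"
    "<previous /><preferred>Joe</preferred></name>"
)
bad = ET.fromstring(
    "<name><last>Bloggs</last><first /><middle>Michael</middle>"
    "<previous /><preferred>Joe</preferred></name>"
)

good_problems = check_name(good)
bad_problems = check_name(bad)
print(good_problems, bad_problems)
```

A schema expresses the same constraints declaratively, so that any validating parser can enforce them without custom code.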

XML Validation (2)

A set of rules used to validate an XML document is referred to as a schema. A document conforming to a particular schema is said to be valid against that schema, and the process of checking that conformance is called validation.

Schema languages differentiate between at least four levels of validation:

- The validation of the markup, controlling the structure of a document.
- The validation of the content of individual leaf nodes (data-typing).
- The validation of integrity, i.e. of the links between nodes within a document or between documents.
- Any other tests (often called "business rules").

There are currently two types of schema languages:

- grammar based: for specifying structure, form, and syntax (e.g. DTD, XML Schema, Relax NG)
- rule based: for expressing data relationships, such as operational and business rules (e.g. Schematron)

We will look at XML validation in detail in a forthcoming lecture.

XML Namespaces

Namespaces serve two functions in the XML specification:

- To distinguish between elements and attributes from two different vocabularies that share the same name but have different meanings, and hence avoid naming collisions.
- To group all the related elements and attributes from a single XML application together so that software can easily recognise them.

Consider the following fragments from two different documents:

<name>Bernadette Coffey</name>
and
<name>Hegel in a Nutshell</name>

The first <name> element refers to the name of a person and the second to the name of a book. If we were to build a merged document (say Bernadette’s reading list) we would have a collision, since there are two <name> elements with different meanings. Namespaces can distinguish between the two by using prefixes:

<student:name>Bernadette Coffey</student:name>
<book:name>Hegel in a Nutshell</book:name>

Each prefix is bound, through an xmlns declaration, to a uniform resource identifier (URI) that uniquely identifies the namespace, e.g.

<student:name xmlns:student="http://www.uwe.ac.uk/CEMS/Students">
and
<book:name xmlns:book="http://www.uwe.ac.uk/Library/Books">

BUT: don’t confuse URIs with URLs. URLs are a subset of URIs that locate resources based on a network filename concept. A URL is a path to a file or resource on the Web. A URI used as a namespace is simply a unique name; nothing needs to exist at that address.
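A short sketch of how a namespace-aware parser sees the merged reading list may help here. The `<list>` wrapper element is hypothetical (the slide does not show the merged document's root); the two namespace URIs are the ones from the slide. Python's `xml.etree.ElementTree` expands each prefixed name into `{URI}local-name`, which shows that the URI, not the prefix, is what identifies the element.

```python
import xml.etree.ElementTree as ET

# <list> is an assumed wrapper element for the merged reading list.
READING_LIST = """
<list xmlns:student="http://www.uwe.ac.uk/CEMS/Students"
      xmlns:book="http://www.uwe.ac.uk/Library/Books">
  <student:name>Bernadette Coffey</student:name>
  <book:name>Hegel in a Nutshell</book:name>
</list>
"""

root = ET.fromstring(READING_LIST)

# The parser replaces each prefix with its namespace URI in braces.
expanded_tags = [child.tag for child in root]

# Prefixes used in queries are local to this mapping; only the URIs must match.
NS = {
    "s": "http://www.uwe.ac.uk/CEMS/Students",
    "b": "http://www.uwe.ac.uk/Library/Books",
}
student_name = root.findtext("s:name", namespaces=NS)
book_name = root.findtext("b:name", namespaces=NS)

print(expanded_tags, student_name, book_name)
```

The query deliberately uses different prefixes (`s`, `b`) from the document (`student`, `book`) to show that prefixes are only shorthand for the URI.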

XML Applications (1)

XSLT: Extensible Stylesheet Language Transformations is an application for specifying rules which transform one XML document into another document. It uses template rules in the stylesheet to match patterns in the input document, and when a match is found it writes the template from the rule to the output tree. We will look at XSLT in detail in Lecture 15.

XML Applications (2)

- XLink: the XML Linking Language. It defines how one document links to another. The linking work is divided into two parts, XLink and XPointer (which identifies a particular part of a document; compare anchors in HTML).
- XPath: a non-XML language for identifying particular parts of an XML document. It is designed to be used in conjunction with the Extensible Stylesheet Language Transformations (XSLT) and XPointer.
- XForms: the W3C’s name for a specification of Web forms that can be used with a wide variety of platforms including desktop computers, hand-helds, information appliances and even paper.
- XQuery: a query language for XML, used to extract data from real or virtual documents, providing the needed interaction between the Web and databases.
- SVG: Scalable Vector Graphics. An XML application which describes vector graphics for distribution and display over the web, as an alternative to raster formats such as JPEG, GIF and PNG.
- Other applications (and the list is growing rapidly) include XML Signature, XML Encryption, Web Services (SOAP, WSDL & UDDI), XML Key Management, Synchronized Multimedia Integration Language (SMIL), etc.
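As a small illustration of XPath-style addressing, the sketch below uses the limited XPath subset supported by Python's `xml.etree.ElementTree` (it is not a full XPath implementation). The `<patients>` wrapper and the second patient record are hypothetical, added only so that the queries have something to select between.

```python
import xml.etree.ElementTree as ET

# <patients> and the second record are invented for this example.
PATIENTS = """
<patients>
  <patient nhs-no="7503557856">
    <name><first>Joseph</first><last>Bloggs</last></name>
    <title>Mr</title>
    <tel><home>0117 9541054</home><mobile>07710 234674</mobile></tel>
  </patient>
  <patient nhs-no="0000000000">
    <name><first>Jane</first><last>Doe</last></name>
    <title>Dr</title>
    <tel><home>00000 000000</home></tel>
  </patient>
</patients>
"""

root = ET.fromstring(PATIENTS)

# A path of child steps: every <last> inside a <name> inside a <patient>.
last_names = [e.text for e in root.findall("patient/name/last")]

# An attribute predicate: the patient with a given nhs-no.
bloggs = root.find("patient[@nhs-no='7503557856']")

# A descendant search: every <mobile> anywhere below the root.
mobiles = [e.text for e in root.findall(".//mobile")]

# A child-text predicate: patients whose <title> is exactly "Mr".
misters = root.findall("patient[title='Mr']")

print(last_names, bloggs.findtext("name/first"), mobiles, len(misters))
```

The same path expressions are what an XSLT template rule would use to match parts of the input document.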

XML Vocabularies

- XHTML: the Extensible HyperText Markup Language, which reproduces and extends HTML. An XHTML document conforms to all rules required of a well-formed XML document and drops many of the weak features of HTML, e.g. the <font> tag.
- WML: the Wireless Markup Language is a strict, HTML-like vocabulary for use with wireless-enabled devices such as mobile phones, PDAs & pagers.
- InkML: for representing digital ink data that is input with a pen.
- MathML: for the inclusion of mathematical formulas in web pages and machine-to-machine communications.
- CML: Chemical Markup Language is an XML vocabulary for representing molecular and chemical information. A formula can be transformed into a graphic representation for display on a web page.
- Other standardized vocabularies include the Banking Industry Technology Secretariat (BITS); Interactive Financial Exchange (IFX); Bank Internet Payment System (BIPS); Telecommunications Interchange Markup (TIM); Common Business Library (xCBL); Electronic Business XML Initiative (ebXML); Product Data Markup Language (PDML); Financial Information eXchange protocol (FIX); the Text Encoding Initiative (TEI) and hundreds of others.