Lecture 16 Introduction to XML Boriana Koleva Room: C54

Slides:



Advertisements
Similar presentations
XML-XSL Introduction SHIJU RAJAN SHIJU RAJAN Outline Brief Overview Brief Overview What is XML? What is XML? Well Formed XML Well Formed XML Tag Name.
Advertisements

XML I.
© De Montfort University, XML – a meta language Howell Istance and Peter Norris School of Computing De Montfort University.
XML 6.3 DTD 6. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:  Elements.
XML Document Type Definitions ( DTD ). 1.Introduction to DTD An XML document may have an optional DTD, which defines the document’s grammar. Since the.
Document Type Definition DTDs CS-328. What is a DTD Defines the structure of an XML document Only the elements defined in a DTD can be used in an XML.
Document Type Definitions
A Technical Introduction to XML Transparency No. 1 XML quick References.
Thayer School of Engineering Dartmouth Lecture 2 Overview Web Services concept XML introduction Visual Studio.net.
Chapter 10 © 2001 by Addison Wesley Longman, Inc. 1 Chapter 10 Sebesta: Programming the World Wide Web.
COS 381 Day 15. Agenda Assignment 3 Corrected 3 A’s, 1 B, 1 C and 2 D’s Assignment 4 posted Due March 28 Those that wanted to resubmit Assignment 2 must.
COS 381 Day 14. Agenda Questions?? Resources Source Code Available for examples in Text Book in Blackboard
Document Type Definitions. XML and DTDs A DTD (Document Type Definition) describes the structure of one or more XML documents. Specifically, a DTD describes:
XP New Perspectives on XML Tutorial 3 1 DTD Tutorial – Carey ISBN
Chapter 8 Introduction to XML. © 2006 Pearson Addison-Wesley. All rights reserved Introduction - SGML is a meta-markup language - Developed in.
Validating DOCUMENTS with DTDs
Chapter 7 © 2013 by Pearson Introduction - SGML is a meta-markup language - HTML was developed using SGML in the early 1990s - specifically for.
Chapter 8 © 2003 by Addison-Wesley, Inc. 1 Chapter 8 Introduction to XML.
Chapter 7 © 2010 by Addison Wesley Longman, Inc Introduction - SGML is a meta-markup language - Developed in the early 1980s; ISO std. In
XML CPSC 315 – Programming Studio Fall 2008 Project 3, Lecture 1.
XP 1 CREATING AN XML DOCUMENT. XP 2 INTRODUCING XML XML stands for Extensible Markup Language. A markup language specifies the structure and content of.
Document Type Definitions Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
XML What is XML? XML v.s. HTML XML Components Well-formed and Valid Document Type Definition (DTD) Extensible Style Language (XSL) SAX and DOM.
1 herbert van de sompel CS 502 Computing Methods for Digital Libraries Cornell University – Computer Science Herbert Van de Sompel
Introduction to XML. What is XML? Extensible Markup Language XML Easier-to-use subset of SGML (Standard Generalized Markup Language) XML is a.
XML 1 Enterprise Applications CE00465-M XML. 2 Enterprise Applications CE00465-M XML Overview Extensible Mark-up Language (XML) is a meta-language that.
XML Extensible Markup Language. What is XML? ● meta-markup language ● a language for defining a family of languages ● semantic/structured mark-up language.
XML Syntax - Writing XML and Designing DTD's
XP 1 DECLARING A DTD A DTD can be used to: –Ensure all required elements are present in the document –Prevent undefined elements from being used –Enforce.
XML (2) DTD Sungchul Hong.
1 Introduction to Web Application Introduction to XML.
1 Tutorial 13 Validating Documents with DTDs Working with Document Type Definitions.
Avoid using attributes? Some of the problems using attributes: Attributes cannot contain multiple values (child elements can) Attributes are not easily.
Lecture 6 XML DTD Content of.xml fileContent of.dtd file.
1 Chapter 10: XML What is XML What is XML Basic Components of XML Basic Components of XML XPath XPath XQuery XQuery.
XML 2nd EDITION Tutorial 1 Creating An Xml Document.
XML - DTD Week 4 Anthony Borquez. What can XML do? provides an application independent way of sharing data. independent groups of people can agree to.
XML Documents Chao-Hsien Chu, Ph.D. School of Information Sciences and Technology The Pennsylvania State University Elements Attributes Comments PI Document.
IS432 Semi-Structured Data Lecture 2: DTD Dr. Gamal Al-Shorbagy.
Chapter 7 © 2014 by Pearson Education Introduction - SGML is a meta-markup language - Developed in the early 1980s; ISO std. In HTML was developed.
Chapter 8 Introduction to XML. © 2006 Pearson Addison-Wesley. All rights reserved Introduction - SGML is a meta-markup language - Developed in.
Introduction to XML This presentation covers introductory features of XML. What XML is and what it is not? What does it do? Put different related technologies.
XML Instructor: Charles Moen CSCI/CINF XML  Extensible Markup Language  A set of rules that allow you to create your own markup language  Designed.
XP 1 Creating an XML Document Developing an XML Document for the Jazz Warehouse XML Tutorial.
1 Introduction to XML XML stands for Extensible Markup Language. Because it is extensible, XML has been used to create a wide variety of different markup.
An Introduction to XML Sandeep Bhattaram
McGraw-Hill/Irwin © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Understanding How XML Works Ellen Pearlman Eileen Mullin Programming the.
XML Introduction. What is XML? XML stands for eXtensible Markup Language XML stands for eXtensible Markup Language XML is a markup language much like.
The eXtensible Markup Language (XML). Presentation Outline Part 1: The basics of creating an XML document Part 2: Developing constraints for a well formed.
1 Tutorial 11 Creating an XML Document Developing a Document for a Cooking Web Site.
Introduction to XML February 07, From HTML to XML As mentioned in previous classes, if you know HTML, then you already know XML… really! In this.
INFSY 547: WEB-Based Technologies Gayle J Yaverbaum, PhD Professor of Information Systems Penn State Harrisburg.
SNU OOPSLA Lab. Logical structure © copyright 2001 SNU OOPSLA Lab.
Internet & World Wide Web How to Program, 5/e. © by Pearson Education, Inc. All Rights Reserved.2.
1 herbert van de sompel CS 502 Computing Methods for Digital Libraries Cornell University – Computer Science Herbert Van de Sompel
225 City Avenue, Suite 106 Bala Cynwyd, PA , phone , fax presents… XML Syntax v2.0.
Well Formed XML The basics. A Simple XML Document Smith Alice.
CS3101: Internet Programming Chapter 06: XML and RSS.
XML CSC1310 Fall HTML (TIM BERNERS-LEE) HyperText Markup Language  HTML (HyperText Markup Language): December  Markup  Markup is a symbol.
Document Type Definition (DTD) Eugenia Fernandez IUPUI.
Introduction to XML Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
XML Notes taken from w3schools. What is XML? XML stands for EXtensible Markup Language. XML was designed to store and transport data. XML was designed.
Extensible Markup Language (XML) Pat Morin COMP 2405.
The XML Language.
Session III Chapter 6 – Creating DTDs
7.1 Introduction - SGML is a meta-markup language
Session II Chapter 6 – Creating DTDs
Allyson Falkner Spokane County ISD
Review of XML IST 421 Spring 2004 Lecture 5.
Presentation transcript:

Lecture 16 Introduction to XML Boriana Koleva Room: C54

Overview Introduction The Syntax of XML XML Document Structure Document Type Definitions

Introduction SGML is a meta-markup language Developed in the early 1980s; ISO standard in 1986 HTML was developed using SGML in the early 1990s - specifically for Web documents Two problems with HTML: 1. Fixed set of tags and attributes User cannot define new tags or attributes So, the given tags must fit every kind of document, and the tags cannot connote any particular meaning 2. There are no restrictions on arrangement or order of tag appearance in a document

Introduction One solution to the first of these problems: Let each group of users define their own tags (with implied meanings) (i.e., design their own “HTML”s using SGML) Problem with using SGML: It’s too large and complex to use, and it is very difficult to build a parser for it A better solution: Define a lite version of SGML

XML XML is not a replacement for HTML HTML is a markup language used to describe the layout of any kind of information XML is a meta-markup language that can be used to define markup languages that can define the meaning of specific kinds of information XML is a very simple and universal way of storing and transferring data of any kind XML does not predefine any tags XML has no hidden specifications All docs described with an XML-derived markup language can be parsed with a single parser

XML We will refer to an XML-based markup language as a tag set Strictly speaking, a tag set is an XML application, but that terminology can be confusing An XML processor is a program that parses XML documents and provides the parts to an application Documents that use an XML-based markup language are XML documents Both IE and Netscape support basic XML

The Syntax of XML The syntax of XML is in two distinct levels: 1. The general low-level rules that apply to all XML documents 2. For a particular XML tag set, either a document type definition (DTD) or an XML schema

General XML Syntax XML documents consist of: 1. Data elements 2. Markup declarations instructions for the XML parser 3. Processing instructions for the application program that is processing the data in the document All XML documents begin with an XML declaration: XML comments are just like HTML comments

General XML Syntax XML names: Must begin with a letter or an underscore They can include digits, hyphens, and periods There is no length limitation They are case sensitive (unlike HTML names) Syntax rules for XML: same as those of XHTML Every XML document defines a single root element, whose opening tag must appear as the first line of the document An XML document that follows all of these rules is well formed

Simple XML example 1960 Cessna Centurian Yellow with white trim Gulfport Mississippi

XML Attributes Attributes are not used in XML the way they are in HTML In XML, you often define a new nested tag to provide more info about the content of a tag Nested tags are better than attributes, because attributes cannot describe structure and the structural complexity may grow Attributes should always be used to identify numbers or names of elements (like HTML id and name attributes)

Attribute Example... Maggie Dee Magpie... Maggie Dee Magpie...

XML Document Structure An XML document often uses two auxiliary files: One to specify the structural syntactic rule One to provide a style specification An XML document has a single root element, but often consists of one or more entities An XML document has one document entity All other entities are referenced in the document entity

XML Document Structure Reasons for entity structure: 1. Large documents are easier to manage 2. Repeated entities need not be literally repeated 3. Binary entities can only be referenced in the document entities (XML is all text!) When the XML parser encounters a reference to a non-binary entity, the entity is merged in

XML Entities When the XML parser encounters a reference to a non-binary entity, the entity is merged in Entity names: No length limitation Must begin with a letter, a dash, or a colon Can include letters, digits, periods, dashes, underscores, or colons A reference to an entity has the form: &entity_name;

XML Entities One common use of entities is for special characters that may be used for markup delimiters These are predefined (as in XHTML): < < > > & & " " ' &apos; The user-defined entities can be defined only in DTDs

Document Type Definitions (DTDs) A DTD is a set of structural rules called declarations These rules specify a set of elements, along with how and where they can appear in a document Purpose: provide a standard form for a collection of XML documents and define a markup language for them Not all XML documents have or need a DTD The DTD for a document can be internal or external

Document Type Definitions (DTDs) All of the declarations of a DTD are enclosed in the block of a DOCTYPE markup declaration DTD declarations have the form: There are four possible declaration keywords: ELEMENT – to define tags ATTLIST – to define tag attributes ENTITY – to define entities NOTATION – to define data type notations

Declaring Elements An element declaration specifies the name of an element, and the element’s structure If the element is a leaf node of the document tree, its structure is in terms of characters If it is an internal node, its structure is a list of children elements (either leaf or internal nodes) General form: E.g.:

Declaring Elements Child elements can have modifiers Choices animal element contains either a cat child or a dog child

Declaring Elements Parentheses either a choice or a sequence can be enclosed in parentheses to describe a content model Leaf nodes specify data types, most often PCDATA (parsable character data) Data type could also be EMPTY (no content) and ANY (can have any content) Example of a leaf declaration:

Declaring Attributes General form: There are ten different attribute types CDATA – any string of characters ENUMERATION – list of all possible values for the attribute separated by vertical bars

Declaring Attributes Default values:

Declaring Attributes Attribute specifications in a DTD: An XML element that is valid for above DTD...

Declaring Entities A general entity can be referenced anywhere in the content of an XML document General form of declaration: e.g. A reference: &jfk; If the entity value is longer than a line, define it in a separate file (an external text entity)

A Sample DTD Example DTD /planes.dtdhttp:// /planes.dtd An XML document valid for planes.dtd /planes.xmlhttp:// /planes.xml

DTDs XML Parsers Always check for well formedness Some check for validity, relative to a given DTD Called validating XML parsers You can download a validating XML parser from: Internal DTDs <!DOCTYPE root_name [ … ]> External DTDs

Summary Introduction The Syntax of XML XML Document Structure Document Type Definitions Declaring elements Declaring attributes Declaring entities