SGML and XML: Text Encoding and Markup Languages
Michael Popham


Overview (Welcome to acronym hell)
- The Oxford Text Archive and Arts and Humanities Data Service
- Markup languages
- SGML: development and features
- XML
- Activity at the W3C
- Why does all this matter?

Arts & Humanities Data Service
AHDS Executive (KCL)
- ADS (York)
- HDS (Essex)
- OTA (Oxford)
- PADS (Glasgow)
- VADS (Surrey Inst.)

Markup languages
A markup language is a set of conventions governing the use of markup. These rules typically state:
- what kinds of markup are allowed or required
- where they are allowed or required
- how they relate to each other
- how to distinguish markup from content (the text itself)

Is all markup interchangeable?
  Loomings
  \chapter
  \chapter[1]{Loomings}
  :h1.1. Loomings
  .chapter Loomings
  .cp;.sp 6 a;.ce.bd 1. Loomings
  ~x Loomings

SGML = ISO 8879
- An ISO standard for the definition of markup languages
- Markup: a method of making explicit (and therefore processable) interpretations of a text
- Markup language: a set of defined codes and rules for specifying markup

An SGML document
- SGML Declaration (techie stuff)
- Document Type Definition (DTD)
- Document instance (document)
- Elements, attributes, entities

Putting it all together
- SGML Declaration: intended for “human” readers
- DOCTYPE Declaration: + optional, local extensions
- Document Instance: the text itself (content + markup)

SGML is a metalanguage
[Diagram: SGML/XML (ISO/W3C) at the top; DTDs (A.N.Other) defined in it; documents (users) written against each DTD]

[Diagram: SGML at the top; SGML DTDs below it (HTML, TEI, ISO 12083); documents beneath each DTD]

A newspaper story
- Elements: a story consists of data fields, followed by a headline, and then paragraphs containing sentences of character data, names etc.
- Attributes: it also has an identifier, a date, section etc.
- Entities: represent boilerplate info., special characters etc.
NB: we’re saying nothing about what the elements look like, only what they are
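The story model above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree. This is only an illustration: the element names (story, headline, p, name) and the attribute values are assumptions, not the DTD used in the talk, and the instance is written as XML rather than SGML so that a stock parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of the newspaper story. Element names and the
# attribute values (id, date, section) are placeholders for illustration.
STORY = """<story id="story-1" date="PLACEHOLDER-DATE" section="sport">
  <headline>... but the spin may not wash with Ferguson</headline>
  <p><name type="person">David Beckham</name>'s advisers claimed yesterday
that he had been given no reason whatsoever for being banished from training</p>
</story>"""


def describe(story):
    """Pull out what the markup says the pieces ARE, not how they look."""
    paragraph = story.find("p")
    return {
        # Attributes: information about the story as a whole.
        "attributes": dict(story.attrib),
        # Elements: named parts of the content.
        "headline": story.findtext("headline"),
        "names": [(n.get("type"), n.text) for n in story.iter("name")],
        # The text itself, with the markup stripped away.
        "text": "".join(paragraph.itertext()),
    }


info = describe(ET.fromstring(STORY))
print(info["headline"])
print(info["names"])
```

Nothing in the document says how a headline or a name should be displayed; that decision is left to whatever processes the markup.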

A simple(!) SGML DTD
<!ATTLIST story
    id      ID    #REQUIRED
    date    CDATA #REQUIRED
    section CDATA #IMPLIED>
<!ATTLIST name
    type    (person|place|org|any) any
    reg     CDATA #IMPLIED>
…

An SGML instance
Taylor Daniel
Manchester
Beckham, Posh Spice, Manchester United, childcare, Sir Alex Ferguson
&ellipsis; but the spin may not wash with Ferguson
David Beckham’s advisers claimed yesterday that he had been given no reason whatsoever for being banished from training and dropped from &ManU;’s first-team after incurring the wrath of his manager &SAF;
As Beckham attempted to focus on…

The formatted view

Defining an Element
[Figure: an element declaration with its parts labelled: element name or GI, omissibility, content model]

Elements may take attributes
- Providing information other than type or context
- Useful for identification of element occurrences
- Limited data validation
[Figure: the sentence “David Beckham’s advisers claimed yesterday that he had…” with the attribute name and attribute value labelled]
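To make "limited data validation" concrete, here is a rough Python sketch that checks a document against the attribute rules on the DTD slide: id and date are required on story, and type on name must be one of person, place, org or any, defaulting to any. A validating parser would read these rules from the DTD itself; the hand-written check_attributes function below is a hypothetical stand-in, and it assumes the instance is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hand-copied from the ATTLIST declarations on the DTD slide. A real
# validator would derive these from the DTD; this is only an illustration.
REQUIRED_ON_STORY = ("id", "date")      # #REQUIRED attributes
NAME_TYPES = {"person", "place", "org", "any"}
NAME_TYPE_DEFAULT = "any"               # declared default value


def check_attributes(xml_text):
    """Return a list of attribute problems found in the document."""
    problems = []
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag == "story":
            for attr in REQUIRED_ON_STORY:
                if attr not in elem.attrib:
                    problems.append(f"story: missing required attribute {attr!r}")
        elif elem.tag == "name":
            # An absent type attribute takes the declared default.
            value = elem.get("type", NAME_TYPE_DEFAULT)
            if value not in NAME_TYPES:
                problems.append(f"name: type {value!r} is not an allowed value")
    return problems
```

The check is deliberately shallow: it can say that type="team" is not allowed, but it cannot tell whether a name tagged as a person really is one. That is the sense in which attribute validation is limited.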

Documents: another view
- Documents are made up of entities
- Entities are named units of storage, using an associated notation
- Entities can be…
  - A single character or symbol (or a string of these)
  - Another file (e.g. text, image, sound, video etc.)
  - Something on the Web
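A small Python sketch of the simplest kind of entity, a named string declared in the document's own DOCTYPE and expanded by the parser. The entity names ManU and SAF come from the instance slide; their replacement text here is an assumption, since the original declarations are not shown in the transcript.

```python
import xml.etree.ElementTree as ET

# Entities declared in the internal DTD subset. The replacement strings
# are assumed for illustration; the talk's real declarations are not shown.
DOC = """<!DOCTYPE p [
  <!ENTITY ManU "Manchester United">
  <!ENTITY SAF "Sir Alex Ferguson">
]>
<p>dropped from &ManU;'s first-team after incurring the wrath of his manager &SAF;</p>"""


def expand(doc):
    """Parse the document and return the paragraph text with entities expanded."""
    return ET.fromstring(doc).text


print(expand(DOC))
```

Entities that refer to other files or to something on the Web (external entities) work on the same principle, but how a parser resolves them depends on the parser and its configuration, so they are left out of this sketch.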

Like HTML, XML must...
- Be usable on the net (but not restricted to it!)
- Support a wide variety of applications
- Be compatible with SGML
- Be easy to process
- Have few optional features (ideally none)
- Be human-legible and reasonably clear
- Be specified in a way that is both formal and concise

Unlike HTML...
- XML is an extensible markup language
- XML markup can be verified
- XML markup reflects the meaning of your data, not its appearance

XML cf. SGML: differences
- No tag omission/minimization
- Properly delimited comments
- No inclusions/exclusions
- Mixed content models: optional-repeatable OR-groups with #PCDATA first
- No & in content model groups
- Simpler rules for handling whitespace
- Empty tags use new syntax
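Two of these differences, no tag omission and the new empty-tag syntax, can be seen with any XML parser. The Python sketch below uses the standard library's xml.etree.ElementTree to test well-formedness; the element names (story, p, pb) are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if an XML parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# SGML-style tag omission: the first <p> is never closed.
OMITTED_END_TAG = "<story><p>First paragraph<p>Second paragraph</p></story>"
# The same content with every end tag written out, as XML requires.
EXPLICIT_END_TAGS = "<story><p>First paragraph</p><p>Second paragraph</p></story>"
# SGML-style empty element (no end tag) versus the XML empty-tag syntax.
EMPTY_SGML_STYLE = "<story><pb></story>"
EMPTY_XML_STYLE = "<story><pb/></story>"

for sample in (OMITTED_END_TAG, EXPLICIT_END_TAGS, EMPTY_SGML_STYLE, EMPTY_XML_STYLE):
    print(is_well_formed(sample), sample)
```

An SGML parser with a suitable DTD could accept the first and third samples, because the DTD tells it which end tags may be omitted and which elements are empty. An XML parser has to be able to work without a DTD, so it rejects both.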

How do they really differ?
- Pre-/Post- the success of the Web
- Ease-of-implementation and use
- Greater raw computing power on the desktop
- “XML is what SGML should have been”
- More tools, more books, easier to learn

XML Activity at W3C
- XML Applications: Resource Description Framework (RDF), Synchronized Multimedia Integration Language (SMIL), XHTML
- Extensible Stylesheet Language (XSL): XSL Transformation Language, XSL Formatting Objects
- XML Linking Language (XLink) and XML Pointer Language (XPointer)
- XML Schema, namespaces

Why does this matter?
- The XML revolution (hype?)
- XML = big names
- XML means application independence for your data
- XML means shareable, reusable data
- Improved data longevity(?)

Further information
- The SGML/XML web page
- W3C’s XML web page
- The Text Encoding Initiative
- …and even “XML: the future of web markup?” by Elliott Pritchard at