CS 502 Computing Methods for Digital Libraries, Cornell University – Computer Science, Herbert Van de Sompel

Presentation transcript:

1 CS 502 Computing Methods for Digital Libraries, Cornell University – Computer Science, Herbert Van de Sompel. Lecture 9: Markup languages – XML

2 Problem: the richness of text. Elements: letters, scripts, symbols. Structure: words, sentences, paragraphs, headings, tables. Appearance: fonts, layout, design, materials. Special: mathematics, music. Digital libraries must represent every variant!

3 Markup and page description languages. Markup languages represent the structure of text (e.g., SGML, XML); the markup must be combined with a style sheet for rendering. Page description languages represent the appearance of text (e.g., PostScript, PDF).

4 Markup and style sheets. [Diagram: a marked-up document (document content & structure) and a style sheet (rendering instructions) are fed to rendering software, which produces the formatted document.]

5 Multiple renderings from the same marked-up document. [Diagram: one marked-up document (document content & structure) is rendered twice, each time by rendering software with its own style sheet (style sheet 1, style sheet 2), producing a PC display and a print version.]
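
The idea of rendering one marked-up document in several ways can be sketched in a few lines of Python. This is only an illustration, not the rendering software from the lecture: the element names (note, heading, body) and the two "style sheets" (plain dictionaries of format strings) are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical marked-up document: the element names are illustrative only.
DOC = "<note><heading>Reminder</heading><body>Return the book.</body></note>"

# Two toy "style sheets": each maps an element name to a format string.
SCREEN_STYLE = {"heading": "== {} ==", "body": "{}"}
PRINT_STYLE = {"heading": "{}\n" + "-" * 8, "body": "    {}"}


def render(xml_text, style):
    """Format every child of the root element, in document order."""
    root = ET.fromstring(xml_text)
    return "\n".join(style[child.tag].format(child.text) for child in root)


screen_version = render(DOC, SCREEN_STYLE)
print_version = render(DOC, PRINT_STYLE)
print(screen_version)
print(print_version)
```

The document itself never changes; only the style mapping passed to render differs between the two outputs.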

6 Example: Oxford English Dictionary. The typography of the printed text represented semantic information. Steps: keyboard the text, capturing all typographic information; run an automatic parser to extract semantics (e.g., date, quotation, phonetics); mark up in SGML to tag the semantic information; use separate style sheets for the various editions: print, CD-ROM, online. Done before the web, yet used with the web.

7 XML – general. Extensible Markup Language: a simplified SGML; a meta-language that allows defining markup languages for documents; may replace HTML. HTML can be seen as a markup language defined in XML => XHTML.

8 XML – basic terminology. XML instance document: the document that contains the text in marked-up form. Style sheet: the document that contains the formatting instructions to be applied to an instance document. Document Type Definition (DTD): the document that defines the grammar with which instance documents comply (elements, attributes, character set, required elements, optional elements, …). XML Schema: similar to a DTD, but more powerful. An XML application will usually process three types of documents: the instance document, the style sheet, and the DTD or XML Schema.

9 XML documents – basic building blocks. An XML document consists of one or more elements. Each element has an opening tag, a closing tag, and element content: text (PCDATA) or other elements, for example an element whose content is the text Paul Smith. An element can have attributes, specifying properties of the element; each attribute has an attribute name and a quoted attribute value (name="value"). An empty element has no content; it can carry attributes only.
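
A minimal sketch of these building blocks, using Python's standard xml.etree.ElementTree module. The tags on the slide did not survive, so the element and attribute names here (author, id, cover, format) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element and attribute names are assumptions.
person = ET.fromstring('<author id="a1">Paul Smith</author>')
print(person.tag)      # element name, taken from the opening tag
print(person.text)     # element content (PCDATA)
print(person.attrib)   # attributes as a name -> value mapping

# An empty element has no content; it can still carry attributes.
empty = ET.fromstring('<cover format="paperback"/>')
print(empty.text, empty.attrib)
```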

10 XML – sample instance document (standalone): a record holding the values Kevin Davies, Cracking the Genome and 20.00.
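
One possible shape for such a standalone instance document is sketched below, together with code that reads it back. Only the three values come from the slide; the element names (book, author, title, price) and the exact declaration are assumptions, not a reconstruction of the original slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction: only the three values come from the slide,
# the element names are assumed.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<book>
  <author>Kevin Davies</author>
  <title>Cracking the Genome</title>
  <price>20.00</price>
</book>"""

book = ET.fromstring(SAMPLE)
# Collect each child element as a name -> text pair.
record = {child.tag: child.text for child in book}
print(record)
```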

11 XML – XML declaration. The XML declaration is written with processing-instruction syntax (<?xml … ?>) and states: the XML version; the character encoding used in the text; standalone: is a DTD required to interpret this document? The order of these attributes is significant.
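
The significance of attribute order can be checked with a short experiment. This sketch assumes Python's standard ElementTree parser (which is built on expat); the helper name is_well_formed and the doc element are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# version, then encoding, then standalone: the order the declaration expects.
GOOD = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><doc/>'
# Same attributes, but encoding is placed before version.
SWAPPED = b'<?xml encoding="UTF-8" version="1.0" standalone="yes"?><doc/>'

print(is_well_formed(GOOD), is_well_formed(SWAPPED))
```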

12 XML – comment line. A comment is ignored by the XML processor; it cannot appear before the XML declaration; it cannot reside inside an element tag.
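
The two placement rules for comments can be tried out as follows. This is a sketch using Python's standard ElementTree parser; the doc element and the helper name is_well_formed are assumptions for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Comments after the declaration and between tags are fine.
OK = b'<?xml version="1.0"?><!-- fine here --><doc><!-- and here --></doc>'
# A comment in front of the XML declaration is rejected.
BEFORE_DECL = b'<!-- too early --><?xml version="1.0"?><doc/>'
# A comment inside an element tag is rejected.
INSIDE_TAG = b'<?xml version="1.0"?><doc <!-- not here --> />'

for name, data in [("ok", OK), ("before", BEFORE_DECL), ("inside", INSIDE_TAG)]:
    print(name, is_well_formed(data))
```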

13 XML – the elements. Example values: Kevin Davies, Cracking the Genome. Elements hold no special significance for the XML processor, except for the document and style rules that are defined for them. Elements relate to each other as parent, child, ancestor, descendant.
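
The parent, child, ancestor and descendant relations can be made concrete with a small tree walk. The nesting and element names below (library, book, author, title) are assumptions for illustration; ElementTree keeps no parent pointers, so the sketch builds its own child-to-parent map.

```python
import xml.etree.ElementTree as ET

# Hypothetical nesting: the element names are assumed.
root = ET.fromstring(
    "<library><book><author>Kevin Davies</author>"
    "<title>Cracking the Genome</title></book></library>"
)

children = [child.tag for child in root]              # direct children only
descendants = [el.tag for el in root.iter()][1:]      # iter() yields root first

# Build a child -> parent map, since elements do not know their parent.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(el):
    """Walk up the parent map and return ancestor tag names, nearest first."""
    chain = []
    while el in parent_of:
        el = parent_of[el]
        chain.append(el.tag)
    return chain


title = root.find("./book/title")
print(children, descendants, ancestors(title))
```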

14 XML – well-formedness. XML is not nearly as forgiving as HTML. An HTML browser may accept something like this: This is a paragraph. And this is another one. And yet another one. Not so with XML: XML is picky => well-formed XML.
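
The contrast can be demonstrated with Python's standard library, assuming the slide's example consisted of paragraphs opened with <p> and never closed (the tags themselves did not survive, so this is a guess). The class name TagCounter and the wrapping body element are made up for the sketch.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumed form of the sloppy HTML: three <p> tags, none of them closed.
SLOPPY = "<p>This is a paragraph.<p>And this is another one.<p>And yet another one."


class TagCounter(HTMLParser):
    """Counts opening tags; the HTML parser does not insist on closing tags."""

    def __init__(self):
        super().__init__()
        self.opened = 0

    def handle_starttag(self, tag, attrs):
        self.opened += 1


counter = TagCounter()
counter.feed(SLOPPY)
counter.close()

# The same markup, wrapped in a root element, is rejected by an XML parser.
try:
    ET.fromstring("<body>" + SLOPPY + "</body>")
    xml_accepts = True
except ET.ParseError:
    xml_accepts = False

print(counter.opened, xml_accepts)
```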

15 XML – well-formedness. Exercise: download the file mltest.xml and open it in Notepad.

16 XML – well-formedness. Every XML document should start with an XML declaration (XML 1.0 recommends it; XML 1.1 requires it). Every opening tag must have a closing tag. Tags cannot overlap (they must be well-nested). An XML document can have only one root element. Attribute values must be in quotation marks (single or double). Only one value per attribute.
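
Each of these rules can be checked mechanically by handing a small document to a parser. The sketch below uses Python's standard ElementTree parser; the helper name is_well_formed and the one-letter element names are assumptions for the example.

```python
import xml.etree.ElementTree as ET

DECL = '<?xml version="1.0"?>'


def is_well_formed(text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


cases = {
    "ok": DECL + '<a><b x="1">text</b></a>',
    "missing closing tag": DECL + "<a><b>text</a>",
    "overlapping tags": DECL + "<a><b><c></b></c></a>",
    "two root elements": DECL + "<a/><b/>",
    "unquoted attribute value": DECL + "<a x=1/>",
    "attribute given twice": DECL + '<a x="1" x="2"/>',
}

results = {name: is_well_formed(doc) for name, doc in cases.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

Only the first case is accepted; every other case breaks exactly one rule from the list.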

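17 XML – well-formedness. The characters < and & and the sequence ]]> cannot be used literally in text. Encode these “sanity characters” with entity references: < as &lt;, & as &amp;, ]]> as ]]&gt;, > as &gt;, " as &quot;, ' as &apos;.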
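
In practice the encoding is usually left to a library. A minimal sketch with Python's standard xml.sax.saxutils.escape follows; the sample string and the wrapping t element are made up for the example. By default escape handles &, < and >, and the optional mapping adds the two quote characters.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = 'if a < b && c > d then say "hi"'

# Default behaviour: &, < and > are replaced by entity references.
escaped = escape(raw)

# Quotes are only replaced when an extra mapping is supplied.
escaped_quotes = escape(raw, {'"': "&quot;", "'": "&apos;"})

# The parser turns the references back into the original characters.
round_trip = ET.fromstring("<t>" + escaped + "</t>").text

print(escaped)
print(escaped_quotes)
print(escape("]]>"))
```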

18 XML – well-formedness. Element names must obey the XML naming conventions: start with a letter or underscore; can contain letters, numbers, hyphens, periods, underscores; no spaces in names; no space between < and the name; a colon can only be used to separate the namespace of the element from the element name; names are case-sensitive; names cannot start with xml, XML, xML, …
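
The naming rules can be approximated with a regular expression. This is a deliberately simplified sketch: it assumes ASCII-only names and leaves out the namespace colon, whereas the XML specification allows a much wider range of Unicode letters. The function name is_valid_name is made up for the example.

```python
import re

# Simplified assumption: ASCII letters only, no namespace prefix.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_name(name):
    """Check a candidate element name against the simplified rules."""
    # Names beginning with "xml" in any mix of upper and lower case are reserved.
    if name.lower().startswith("xml"):
        return False
    return NAME_RE.fullmatch(name) is not None


for candidate in ["title", "_id", "book-2.a", "2book", "my name", "XmLdata"]:
    print(candidate, is_valid_name(candidate))
```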

19 XML – well-formedness. White space: space, tab, line feed, carriage return. In HTML, white space must be written explicitly as &nbsp; because HTML processors collapse white space. Not so in XML: a space in PCDATA stays; a tab in PCDATA stays; multiple new-line characters (a carriage return followed by a line feed) are transformed into a single line feed.
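
White-space handling is easy to observe directly. The sketch below, using Python's standard ElementTree parser and a made-up t element, feeds in two spaces, a tab and a carriage return plus line feed, then inspects what the parser hands back.

```python
import xml.etree.ElementTree as ET

# Input contains two spaces, a tab, and a CR LF line ending.
text = ET.fromstring(b"<t>a  b\tc\r\nd</t>").text

# repr() makes the surviving white-space characters visible.
print(repr(text))
```

The spaces and the tab come back unchanged, while the carriage return plus line feed pair is reported as a single line feed.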

20 XML – character references. A character reference names a Unicode code point in decimal or hexadecimal: &#169; == &#xA9; == ©. Exercise: pick some character references and include them in an XML document.
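
As a quick check of the exercise, the decimal reference, the hexadecimal reference and the literal character should all come out of the parser as the same character. A minimal sketch with Python's standard ElementTree parser; the t element is made up for the example.

```python
import xml.etree.ElementTree as ET

# Decimal reference, hexadecimal reference, and the literal character U+00A9.
text = ET.fromstring("<t>&#169; &#xA9; \u00a9</t>").text
parts = text.split(" ")

print(parts)
print(ord(parts[0]), hex(ord(parts[1])))
```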