Presentation is loading. Please wait.

Presentation is loading. Please wait.

University of Nottingham School of Computer Science & Information Technology Introduction to XML 1. The XML Language Tim Brailsford.

Similar presentations


Presentation on theme: "University of Nottingham School of Computer Science & Information Technology Introduction to XML 1. The XML Language Tim Brailsford."— Presentation transcript:

1 University of Nottingham School of Computer Science & Information Technology Introduction to XML 1. The XML Language Tim Brailsford

2 University of Nottingham 2 Markup Languages l The word “Markup” is derived from the printing industry l Detailed stylistic instructions for typesetting l Usually hand-written on the copy (eg underlining some text that is to be set in italics). l Markup languages do the same job for computerised documentation systems. l Markup adds logical structure to a document, or indicates how it is to be laid out (on paper or screen). l Markup languages are a set of instructions that are amenable to automatic processing.

3 University of Nottingham 3 Markup Languages (cont.) l Usually a sequence of characters in a text file that indicate structure or behaviour of the content. l For example (in HTML) l This is bold and this is italic l This is the title. l Markup may be created by directly editing the symbols, but is more usually hidden from end-users. l Examples l HTML l RTF l Hytime

4 University of Nottingham 4 Generalised Markup Languages l Proprietary markup languages are problematic. l Generalised markup languages are langauges for defining markup languages. l Metalanguages l SGML

5 University of Nottingham 5 SGML - History l Standard Generalised Markup Language l 1969 - GML from IBM l text editing l formatting l information retrieval l 1980 SGML first published l 1980’s SGML adopted by US IRS & DOD l 1986 - ISO standard ISO 8879: Information processing--Text and office systems--Standard Generalized Markup Language (SGML), ([Geneva]: ISO, 1986).

6 University of Nottingham 6 SGML l SGML defines a system of tag markup This is a pair of SGML tags l SGML is a standard for how to specify a tag set. l Document Type Definition (DTD) l SGML documents contain structural elements that can be described without consideration of how they are displayed. l SGML application. l HTML is an SGML application.

7 University of Nottingham 7 Benefits of SGML l Documents are created by thinking in terms of structure rather than appearance (which may change over time). l Documents are portable because any SGML compliant software can interpret them by reference to the DTD. l Documents originally intended for one medium can easily be re-purposed for other media, such as the computer display screen.

8 University of Nottingham 8 What is XML? l XML is based upon SGML, but is substantially simplified for use on the WWW. l Like SGML, XML is a metalanguage l arbitrary definition of elements l l Syntax may optionally be described by a DTD l Valid documents - have a DTD l Well formed documents do not have a DTD l Style and content are completely separate l XML documents contain content l Style is specified by stylesheets

9 University of Nottingham 9 Example XML Applications l MathML - maths l CML - chemistry l SVG - vector graphics l XHTML - WWW l SMIL - synchronised multimedia l MusicML - sheet music l FpML - financial products l RETML - real estate transactions l and many, many others

10 University of Nottingham 10 XML Elements l XML documents consist of one or more elements. Elements consist of a pair of tags and (optionally) enclosed text. The XML Companion Elements may have attributes. The XML Companion Elements may contain other elements. The XML Companion Empty elements may be self closing.

11 University of Nottingham 11 Contents vs Style l XML tags contain meaning not appearance. l This allows extra information to be extracted l Consider the example of the scientific names of animals. l scientific names are in latin l by convention they are always printed in italics The scientific name of the domestic dog is Canis familiaris, and of the domestic cat is Felis catus.

12 University of Nottingham 12 Contents vs Style l XML tags contain meaning not appearance. l This allows extra information to be extracted l Consider the example of the scientific names of animals. l scientific names are in latin l by convention they are always printed in italics The scientific name of the domestic dog is Canis familiaris, and of the domestic cat is Felis catus. In HTML The scientific name of the domestic dog is Canis familiaris, and of the domestic cat is Felis catus. NB there is no distinction between scientific names and emphasis. In HTML The scientific name of the domestic dog is Canis familiaris, and of the domestic cat is Felis catus. NB there is no distinction between scientific names and emphasis.

13 University of Nottingham 13 Contents vs Style l XML tags contain meaning not appearance. l This allows extra information to be extracted l Consider the example of the scientific names of animals. l scientific names are in latin l by convention they are always printed in italics The scientific name of the domestic dog is Canis familiaris, and of the domestic cat is Felis catus. In XML The scientific name of the domestic dog is Canis familiaris, and of the domestic cat is Felis catus. NB emphasis and scientific names are different tags. They may both be displayed as italic, but they can be treated separately. In XML The scientific name of the domestic dog is Canis familiaris, and of the domestic cat is Felis catus. NB emphasis and scientific names are different tags. They may both be displayed as italic, but they can be treated separately.

14 University of Nottingham 14 Rendering of XML l XML files contain content not appearance l Stylesheets contain appearance and behaviour l XML data is rendered by being transformed into some form suitable for display l RTF (for simple printing) l PDF or PostScript (for printing or display) l HTML (for display over the web) l HTML 4.0 / DHTML (for complex interfaces) l The transformation is defined by a stylesheet l Rendering may be done by standalone software, or by a web browser, or on a web server.

15 University of Nottingham 15 Standalone Rendering HTML XML XSL

16 University of Nottingham 16 Client Side Rendering XML XSL HTML Server (any) Browser with XSL engine (eg MS IE > 5.0) XML XSL

17 University of Nottingham 17 Server Side Rendering HTML Browser (any) Server with XSL engine eg Apache/Tomcat/Cocoon XML XSL HTML

18 University of Nottingham 18 Client vs Server Stylesheets l Client side stylesheets are processed in client l XML is delivered to the client l XSL/CSS must be supported by client l MS IE supports CSS & XSLT (non-standard in 5.x mostly standard in 6.x) l Netscape 7 & Mozilla supports CSS and possibly XSLT via plugins. l Server side stylesheets are processed in server l XML is not delivered to the client, it is transformed usually to HTML or PDF l XSL/CSS must be supported by server l Cocoon is an Open Source project, implementing XSL as a Java servlet l Any browser can then be used

19 University of Nottingham 19 Cocoon on Nottingham Servers l Any file placed in a directory called public_html is accessible within the Nottingham network with the url: http://www.cs.nott.ac.uk/~username/filename l Files with the.xml extension are automatically processed by cocoon. l Providing that they have an XSL stylesheet and the correct Cocoon processing instructions they will be “transformed” into (usually) HTML.

20 University of Nottingham 20 A Simple XML Document

21 University of Nottingham 21 A Simple XML Document Root element (one per document) XML declaration

22 University of Nottingham 22 A Simple XML Document

23 University of Nottingham 23 A Simple XML Document Define root element and specify DTD.

24 University of Nottingham 24 A Simple XML Document

25 University of Nottingham 25 A Simple XML Document This is a comment (as SGML / HTML)

26 University of Nottingham 26 A Simple XML Document

27 University of Nottingham 27 A Simple XML Document This defines the XSL stylesheet

28 University of Nottingham 28 A Simple XML Document

29 University of Nottingham 29 A Simple XML Document This is a Cocoon processing directive (NB not standard XML, but required by Cocoon 1.7.4).

30 University of Nottingham 30 Adding Content St. Laurent S 1998 XML: A Primer MIS Press

31 University of Nottingham 31 Benefits of a DTD l DTDs are optional in XML l DTD allows validation of documents l DTD defines the application l Vital for collaborative development l IPR implications l DTD allows entity definitions (ie symbols, shortcuts, “foreign” characters etc.).

32 University of Nottingham 32 XML Namespaces l Namespaces are mechanisms to ensure that elements are unique l Namespaces in XML are optional l Consider the following: The Title The Title

33 University of Nottingham 33 Ensuring uniqueness l Unique element names l Unique attribute content The Title The Title The Title The Title

34 University of Nottingham 34 xmlns attribute l The xmlns attribute is used to declare namespaces l This must be a URI The Title <title xmlns=“http://www.cs.nott.ac.uk/~tjb/NSdemo-two” text=“The Title” /> The Title

35 University of Nottingham 35 Namespace Abbreviations l If an element doesn’t have a namespace defined it inherits that of its parent. l Where multiple namespaces are used together aliases may be declared <demo xmlns:first =“http://www.cs.nott.ac.uk/~tjb/NSdemo-one” xmlns:second =“http://www.cs.nott.ac.uk/~tjb/NSdemo-two” > The Title

36 University of Nottingham 36 XML Namespaces l Namespaces in XML are optional l Namespaces ensure that elements are unique l In different contexts a given tag might mean different things - eg consider l To me it might mean a book in a bibliography l To a bookshop it might contain stock details l To a travel agent it might contain information about flight bookings! l Namespaces attach unique labels to a given tag set. l URLs are usually used as namespace labels.


Download ppt "University of Nottingham School of Computer Science & Information Technology Introduction to XML 1. The XML Language Tim Brailsford."

Similar presentations


Ads by Google