Published by Michael Cummings. Modified over 9 years ago.

Week 0534 Introduction to XML

Topics
The "bigger picture":
–The "XML sales pitch"
–XML/XHTML vs. SGML/HTML
–XML in electronic publishing
–XML and the future, Web 2.0
XML basics:
–Building blocks: elements, attributes, …
–Structural constraints: well-formed XML
–Character sets
–Namespaces
–Validity: DTDs and XML schemas

Why Use XML (1)
Consider a line from a .dat file:
2394287410|Verbatim|DataLife MF 2HD|10|3.5"|black
or the XML fragment (element tags not preserved; only the values remain):
Verbatim DataLife MF 2HD 10 3.5" black
Which one is easier to interpret, more robust, and easier to use for complex structures?
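
The comparison can be sketched in Python. The element names (`product`, `id`, `manufacturer`, and so on) are assumptions made for illustration, since the tags of the slide's fragment are not preserved; only the values come from the slide.

```python
import xml.etree.ElementTree as ET

# The delimited record from the slide: the meaning of each field is
# implied only by its position in the line.
dat_line = '2394287410|Verbatim|DataLife MF 2HD|10|3.5"|black'
fields = dat_line.split("|")

# A possible XML equivalent. All element names are hypothetical.
xml_fragment = """
<product>
  <id>2394287410</id>
  <manufacturer>Verbatim</manufacturer>
  <name>DataLife MF 2HD</name>
  <quantity>10</quantity>
  <size>3.5"</size>
  <color>black</color>
</product>
"""
product = ET.fromstring(xml_fragment)

# Positional access versus access by name.
color_from_dat = fields[5]
color_from_xml = product.findtext("color")
print(color_from_dat, color_from_xml)
```

If a field is added or reordered, the positional index silently points at the wrong value, while the lookup by element name keeps working.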

Why Use XML (2)
Simple syntax
Self-documenting format
Support for hierarchical structures
Simple debugging (for both user and machine)
Language and platform independent
Many different tools
Growing library of "standard" formats

Main Types of XML Documents
Narrative-centric documents:
–Largely irregular structure, for instance a novel
Data-centric documents:
–Regular structure, for instance a telephone directory
Hybrid documents:
–Typically contain highly regular parts mixed with irregular content, e.g., a product catalog

XML/XHTML vs. SGML/HTML
Problems with SGML/HTML:
–SGML is a complex markup language
–HTML is only suitable for narrative documents
–HTML became a bad mix of structure and layout
–HTML browsers are too tolerant of language errors
The XML/XHTML promise:
–XML has a simple and extensible structure
–Suitable for both data and narrative documents
–XHTML is for structure only; CSS is for layout
–Enforces strict rules

XML in Electronic Publishing
Some important XML applications:
–Text transformation/printing: XSLT, XSL-FO, SVG, …
–Content: GML, MathML, NewsML, DocBook, …
–Data exchange: SOAP, AJAX, xCAL, …
–Semantics: RDF, Dublin Core, …

Web 2.0
The next generation of the web is about data, not documents!
–Read the O'Reilly Web 2.0 article

XML Basics
Core literature:
–An Introduction to XML and Web Technologies: Chapters 1-2
–XML in a Nutshell: Chapters 1-2, 4-7, 26

What XML consists of
Elements
Attributes
Entities and entity references
Text
CDATA sections
Processing instructions
Comments

XML declaration
XML documents should begin with an XML declaration that gives information about:
–XML version
–Encoding
–Whether external DTDs are to be used
Example: [markup not preserved]
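
A minimal sketch of a declaration carrying all three pieces of information, checked with Python's standard-library parser. The `<city>` element and its content are made up for illustration; `standalone="yes"` states that no external DTD is needed.

```python
import xml.etree.ElementTree as ET

# Version, encoding and the standalone flag all appear in the declaration.
document = (
    '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>\n'
    "<city>Gjøvik</city>"
)

# Encode to bytes using the declared encoding; the parser reads the
# declaration to decide how to decode those bytes.
raw = document.encode("iso-8859-1")
root = ET.fromstring(raw)
print(root.tag, root.text)
```

The parser is given bytes rather than a string here so that the encoding named in the declaration is what actually drives the decoding.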

XML elements
The basic building block in XML
Consists of a start-tag, content and an end-tag
Simple content: Web page for IMT4501
Mixed content: No, you can't do that!
Empty element: [markup not preserved]
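
The three kinds of content might look as follows. The element names (`title`, `p`, `em`, `br`) are assumptions, since the slide's own tags are not preserved; the text values are taken from the slide.

```python
import xml.etree.ElementTree as ET

# Simple content: only text between the start-tag and the end-tag.
simple = ET.fromstring("<title>Web page for IMT4501</title>")

# Mixed content: text interleaved with a child element.
mixed = ET.fromstring("<p>No, you <em>can't</em> do that!</p>")

# Empty element: start-tag and end-tag collapsed into one.
empty = ET.fromstring("<br/>")

print(simple.text)
print(repr(mixed.text), mixed[0].tag, repr(mixed[0].text), repr(mixed[0].tail))
print(empty.text, len(empty))
```

In ElementTree, the text that follows a child element inside mixed content is stored in that child's `tail`, which is why the sentence is spread over three attributes.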

Attributes
Extra information about an element
Example: [markup not preserved]
Values are enclosed in matching pairs of quotation marks (single or double): HiG, Oppland Arbeiderblad
But not: VG, CNN
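
A small sketch of the quoting rule. The `a` element, the `href` attribute and the URLs are assumptions standing in for the examples lost from the slide; only the link texts come from it.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the standard-library XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Matching pairs: double quotes or single quotes are both fine.
double_quoted = ET.fromstring('<a href="http://www.hig.no/">HiG</a>')
single_quoted = ET.fromstring("<a href='http://www.oa.no/'>Oppland Arbeiderblad</a>")

# Mismatched or missing quotes are rejected by the parser.
mismatched = parses("<a href=\"http://www.vg.no/'>VG</a>")
unquoted = parses("<a href=http://www.cnn.com/>CNN</a>")

print(double_quoted.get("href"), single_quoted.get("href"))
print(mismatched, unquoted)
```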

Well-formed XML
One root element
Correct nesting of elements
Every start-tag has a matching end-tag
Case-sensitive names
Attribute values in quotes
An attribute can't appear more than once in the same element
No comments inside tags
No unescaped < or & inside text content
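
Most of these rules can be checked by handing a document to any conforming parser. The sketch below uses Python's `xml.etree.ElementTree`; the sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "<note><to>Ann</to></note>": "one root, correct nesting",
    "<a/><b/>": "two root elements",
    "<a><b></a></b>": "incorrect nesting",
    "<a><b></a>": "missing end-tag",
    "<Note></note>": "names differ in case",
    '<a x="1" x="2"/>': "attribute repeated",
    "<a>1 < 2</a>": "unescaped <",
    "<a>AT&T</a>": "unescaped &",
    "<a>1 &lt; 2 &amp; 3</a>": "escaped < and &",
}

results = {doc: is_well_formed(doc) for doc in samples}
for doc, reason in samples.items():
    print(f"{results[doc]!s:5}  {reason:28}  {doc}")
```

Only the first and the last sample are accepted; every other one violates exactly one rule from the list.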

Sample XML structure [figure not preserved]

Tree for the example [figure not preserved]

XML names
Must start with a letter or '_'
Followed by letters, digits, '_', '-' or '.'
Names beginning with "xml" (regardless of capitalization) are reserved
Acceptable names: [examples not preserved]
Unacceptable names: [examples not preserved]
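
The naming rule can be approximated with a regular expression. This is a simplified, ASCII-only sketch: real XML names also admit many non-ASCII letters, and the colon is set aside for namespace prefixes. The function name and the sample names are invented for illustration.

```python
import re

# ASCII-only approximation of the rule: a letter or '_' first, then
# letters, digits, '_', '-' or '.'.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_acceptable_name(name):
    """Check a name against the simplified rule, rejecting reserved 'xml' names."""
    if _NAME.fullmatch(name) is None:
        return False
    return not name.lower().startswith("xml")

for candidate in ["title", "_id", "first-name.v2", "2nd", "-x", "my name", "XmlData"]:
    print(candidate, is_acceptable_name(candidate))
```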

Entities and entity references
Five predefined entity references in XML
Other entity references can be defined in an external DTD
XHTML entities:
Å  &Aring;   (Unicode: &#197;)
Æ  &AElig;   (Unicode: &#198;)
Ø  &Oslash;  (Unicode: &#216;)
å  &aring;   (Unicode: &#229;)
æ  &aelig;   (Unicode: &#230;)
ø  &oslash;  (Unicode: &#248;)
Predefined XML entities:
<  &lt;    (less than)
>  &gt;    (greater than)
&  &amp;   (ampersand)
"  &quot;  (quotation mark)
'  &apos;  (apostrophe)
Unicode character references:
©  &#169;   (XHTML: &copy;)
α  &#945;   (XHTML: &alpha;)
€  &#8364;  (XHTML: &euro;)
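
A short sketch of how a parser treats the three kinds of reference, using Python's standard library. The `<t>` element is a made-up wrapper.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The five predefined entity references are resolved by the parser.
predefined = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>")
print(predefined.text)

# Numeric character references need no DTD either.
numeric = ET.fromstring("<t>&#197; &#169; &#945; &#8364;</t>")
print(numeric.text)

# Named XHTML entities such as &Aring; are not predefined in XML; with
# no DTD declaring them, the parser rejects the document.
try:
    ET.fromstring("<t>&Aring;</t>")
    named_entity_accepted = True
except ET.ParseError:
    named_entity_accepted = False
print(named_entity_accepted)

# Going the other way: escape() replaces &, < and > before text is
# placed inside an element.
escaped = escape("AT&T < IBM")
print(escaped)
```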

Text and character parsing
Text is basically PCDATA (Parsed Character Data):
–The parser replaces entity references with their values
CDATA sections can be used where we do not want the parser to interpret the character data:
… 0) && (len …
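
The difference between parsed text and a CDATA section can be sketched as follows. The `<code>` element and the expression inside it are invented; the slide's own example is truncated.

```python
import xml.etree.ElementTree as ET

# PCDATA: < and & must be written as entity references, which the
# parser then replaces with the characters they stand for.
parsed = ET.fromstring("<code>if (a &lt; b &amp;&amp; b &lt; c)</code>")

# CDATA: the same characters are written literally and passed through.
cdata = ET.fromstring("<code><![CDATA[if (a < b && b < c)]]></code>")

# Inside CDATA an entity reference is not interpreted at all.
literal = ET.fromstring("<code><![CDATA[&lt;]]></code>")

print(parsed.text)
print(cdata.text)
print(literal.text)
```

Both forms deliver the same text to the application; CDATA only changes how the author has to write it.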

Comments
Enclosed by <!-- and -->
Should not appear inside a tag
A double hyphen -- cannot appear anywhere inside the comment
Are meant for human readers, not for the application
Correct use: [example not preserved]
Wrong use: [example not preserved]
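
These rules can be tried out with Python's standard-library parser. The sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

correct = parses("<a><!-- a note for the reader --><b/></a>")
inside_tag = parses("<a <!-- not allowed here --> ><b/></a>")
double_hyphen = parses("<a><!-- not -- allowed --><b/></a>")
print(correct, inside_tag, double_hyphen)

# With its default settings ElementTree does not hand comments on to
# the application: only the <b> element shows up in the tree.
root = ET.fromstring("<a><!-- a note for the reader --><b/></a>")
child_tags = [child.tag for child in root]
print(child_tags)
```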

Processing instructions
Enclosed by <? and ?>
Target follows right after <?
Can be used to send information to the application
Comments were used for this before, but XML parsers can choose not to send comments to the application
Example: … log in first"; } ?>
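
A sketch of how an application can read a processing instruction, here with `xml.dom.minidom`, which keeps processing instructions in the document tree. The target `my-app` and its data are hypothetical.

```python
from xml.dom import minidom

# The XML declaration looks similar but is not a processing instruction;
# the first node of the document is the <?my-app ...?> instruction.
doc = minidom.parseString(
    '<?xml version="1.0"?><?my-app sort="ascending"?><list/>'
)
pi = doc.firstChild

is_pi = pi.nodeType == minidom.Node.PROCESSING_INSTRUCTION_NODE
print(is_pi, pi.target, pi.data)
print(doc.documentElement.tagName)
```

The target names the application the instruction is meant for; everything after it is passed along as an uninterpreted string.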

Exercise
Complete the ZVON XML tutorial: http://www.zvon.org/xxl/XMLTutorial/General/contents.html

Character Sets
Historically, character encoding has been a challenge:
–The same code has been used for different characters on different systems
Now, there are standards:
–ISO-8859-1 (ISO Latin), "default" on the web
–Unicode: defines a larger character set, used by XML by default, with the encodings:
  UTF-8 (efficient for Western languages)
  UTF-16
  UTF-32
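
The trade-off between the Unicode encodings, and the historical problem of one code meaning different characters, can be illustrated with a few lines of Python. The sample characters and the choice of code page 437 as the "other system" are assumptions made for the example.

```python
# Bytes needed per character in each encoding. The "-le" variants are
# used so that no byte-order mark is included in the count.
samples = ["a", "Å", "€"]
encodings = ["utf-8", "utf-16-le", "utf-32-le"]

sizes = {
    ch: {enc: len(ch.encode(enc)) for enc in encodings}
    for ch in samples
}
for ch, row in sizes.items():
    print(ch, row)

# The same byte value decoded under two different legacy code pages
# gives two different characters.
one_byte = "Å".encode("iso-8859-1")
as_latin1 = one_byte.decode("iso-8859-1")
as_cp437 = one_byte.decode("cp437")
print(one_byte, as_latin1, as_cp437)
```

UTF-8 spends one byte on ASCII characters and more on everything else, which is why it is compact for Western-language text.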

Namespaces – why?
Distinguish between elements and attributes from different XML vocabularies
Namespaces allow two or more XML vocabularies to be used in the same document
Group all related elements and attributes from a single XML application, making them easier for software to recognize

Namespaces – how?
A prefix is bound to a vocabulary (identified by a URI) with an xmlns:prefix attribute
The prefix is in scope within the subtree rooted at the element that declares it
Elements in a vocabulary are identified by the prefix (markup not preserved; only the values remain):
XML in a Nutshell, Elliotte Rusty Harold, W. Scott Means, 2002
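
A sketch of a prefixed vocabulary and its scope. The namespace URI, the prefix `bk` and all element names are assumptions; only the book details come from the slide.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://example.org/book"  # hypothetical namespace URI

doc = f"""
<catalog>
  <bk:book xmlns:bk="{BOOK_NS}">
    <bk:title>XML in a Nutshell</bk:title>
    <bk:author>Elliotte Rusty Harold</bk:author>
    <bk:author>W. Scott Means</bk:author>
    <bk:year>2002</bk:year>
  </bk:book>
</catalog>
"""
root = ET.fromstring(doc)
book = root[0]

# The parser replaces the prefix with the URI it is bound to.
print(book.tag)
title = book.findtext(f"{{{BOOK_NS}}}title")
print(title)

# Outside the subtree that declares it, the prefix is unknown.
out_of_scope = (
    f'<catalog><bk:book xmlns:bk="{BOOK_NS}"/><bk:note/></catalog>'
)
try:
    ET.fromstring(out_of_scope)
    prefix_known_outside = True
except ET.ParseError:
    prefix_known_outside = False
print(prefix_known_outside)
```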

More about the prefix
You choose the name of the prefix; the URI identifies the vocabulary
The prefix has to be a legal XML name

Namespaces – what is it really?
A vocabulary identified by a fixed Uniform Resource Identifier:
–http://...
–ftp://...
–…
The URI has to be unique to make the vocabulary unique
The URI does not need to point to an actual document
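
That the URI, not the prefix, carries the identity can be shown by comparing the names a parser reports. The URIs and prefixes below are hypothetical, and nothing is fetched from them.

```python
import xml.etree.ElementTree as ET

# Two different prefixes bound to the same URI ...
a = ET.fromstring('<bk:book xmlns:bk="http://example.org/book"/>')
b = ET.fromstring('<b:book xmlns:b="http://example.org/book"/>')

# ... and the same prefix bound to a different URI.
c = ET.fromstring('<bk:book xmlns:bk="http://example.org/other"/>')

same_vocabulary = a.tag == b.tag
different_vocabulary = a.tag != c.tag
print(a.tag, b.tag, c.tag)
print(same_vocabulary, different_vocabulary)
```

The URI is compared as a plain string, which is why it only has to be unique and does not have to resolve to anything.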

Example scope [figure not preserved]

Default namespace
A default namespace can be used so that all non-prefixed elements belong to a fixed vocabulary
Example (markup not preserved; only the values remain):
XML in a Nutshell, Elliotte Rusty Harold, W. Scott Means, 2002
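
A sketch of the same book data with a default namespace. The namespace URI, the element names and the `id` attribute are assumptions; the `bk` prefix used for the lookups exists only in the Python code, not in the document.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://example.org/book"  # hypothetical namespace URI

doc = f"""
<book xmlns="{BOOK_NS}" id="b1">
  <title>XML in a Nutshell</title>
  <author>Elliotte Rusty Harold</author>
  <author>W. Scott Means</author>
  <year>2002</year>
</book>
"""
root = ET.fromstring(doc)

# Every non-prefixed element belongs to the default namespace.
print(root.tag)

# Lookups still have to name the namespace; a prefix map local to the
# application keeps the paths readable.
ns = {"bk": BOOK_NS}
title = root.findtext("bk:title", namespaces=ns)
authors = [a.text for a in root.findall("bk:author", ns)]
print(title, authors)

# Unprefixed attributes are not placed in the default namespace.
print(root.attrib)
```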

Exercise
Complete the ZVON XML tutorial: http://www.zvon.org/xxl/NamespaceTutorial/Output/contents.html