XML Documents 1: Structure
  The ubiquitous XML (2)
  © Copyright 2001 SNU OOPSLA Lab.
Outline: XML Documents 1: Structure
  Peeping into XML document
  At physical view: Entity
  At logical view: DTD
Peeping into XML document (1/5)
  An XML document consists of mark-up and character data.
  [Figure: the text "Hello, XML!!" shown as character data enclosed in mark-up; together they form an XML document.]
Peeping into XML document (2/5)
  XML document: date.xml
  XML declaration: declares that this is an XML document; it ends with ?>.
  DTD (Document Type Definition): defines the tags the user will use, in the form <!DOCTYPE DATE [ ]>. Here it defines the DATE tag.
  Content: the marked-up data of the document.
  Comment: the parser ignores it.
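The four parts named on this slide can be seen together in a small file. The following is only a sketch of what date.xml might look like: the slide's original listing is not available, so the content model of DATE and the date value are assumptions.

```xml
<?xml version="1.0"?>
<!-- Comment: the parser ignores this line -->
<!DOCTYPE DATE [
  <!-- DTD: defines the single tag this document uses -->
  <!ELEMENT DATE (#PCDATA)>
]>
<DATE>2001-01-01</DATE>
```

The first line is the XML declaration, the DOCTYPE block is the DTD, and the DATE element is the content.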
Peeping into XML document (3/5): Structure of XML document
  Physical structure: allows components of the document, called entities, to be named and stored separately.
  Logical structure: allows a document to be divided into named units and sub-units, called elements.
Peeping into XML document (4/5)
  [Figure: logical structure (a document divided into units and sub-units, the elements) beside physical structure (internal and separate entities).]
Peeping into XML document (5/5)
  [Figure: an XML document containing the text "kim" and the image file "k.jpg", with parts labeled as element and entity.]
Content of Physical structure
  Entity
  Figures of Document Entity
  Defining an entity
  Grammar in Declaring Entity
  Examples of Entity Declaration
  URL format
Entity (1/3)
  An entity is a unit for physically isolating and storing any part of a document (a unit of information storage).
  Each unit of information is called an entity.
  [Figure: physical structure with internal and separate entities; "kim" and "k.jpg" are shown as entities.]
Entity (2/3): Purpose of Entity
  An entity can contain any kind of information: well-formed XML data, other text files, binary data, and so on.
  [Figure: a document entity containing "kim" and an image entity "k.jpg".]
Entity (3/3)
  Internal entity: an entity that is defined completely within the document itself.
  External entity: an entity that receives its content from an external source identified by a URL.
Figures of Document Entity
  [Figure: three arrangements of the document entity: alone, with no other entities; as the main content, referring to other entities; and as a framework file that assembles entities A, B, C, and D.]
Defining an entity
  Entities must be defined before the first reference to them in the data stream.
  They are declared in the DTD (Document Type Definition), with the entity definitions placed between the brackets of <!DOCTYPE DOCUMENT [ ]>.
Example: Entity Declaration (1/3)
  Internal text entities
  Built-in entities:
    &lt; for the less-than sign (<)
    &gt; for the greater-than sign (>)
    &amp; for the ampersand (&)
    &apos; for the apostrophe (')
    &quot; for the double quotation mark (")
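A minimal sketch of an internal text entity used next to the built-in entities. The entity name lab and its replacement text are hypothetical, chosen only for illustration; the DOCUMENT root follows the DOCTYPE shown earlier in the deck.

```xml
<?xml version="1.0"?>
<!DOCTYPE DOCUMENT [
  <!ELEMENT DOCUMENT (#PCDATA)>
  <!-- Internal text entity: the replacement text is given in the declaration -->
  <!ENTITY lab "SNU OOPSLA Lab.">
]>
<!-- &lab; expands to the declared text; the built-in entities need no declaration -->
<DOCUMENT>&lab; says: 1 &lt; 2 &amp; 3 &gt; 2</DOCUMENT>
```

The entity is declared before its first reference, as the previous slide requires.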
Example: Entity Declaration (2/3)
  External text entities
  Binary entities
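The declarations for this slide did not survive, so the following is a sketch under assumptions rather than the original example. The entity names chapter1 and kimPhoto, the PHOTO element, and the JPEG notation are hypothetical; the file names k.jpg and entities/entity9.xml are borrowed from other slides in this deck.

```xml
<?xml version="1.0"?>
<!DOCTYPE DOCUMENT [
  <!ELEMENT DOCUMENT (#PCDATA | PHOTO)*>
  <!ELEMENT PHOTO EMPTY>
  <!ATTLIST PHOTO SOURCE ENTITY #REQUIRED>
  <!NOTATION JPEG SYSTEM "image/jpeg">
  <!-- External text entity: its parsed content is fetched from the given URL -->
  <!ENTITY chapter1 SYSTEM "entities/entity9.xml">
  <!-- Binary (unparsed) entity: NDATA names the notation of the data -->
  <!ENTITY kimPhoto SYSTEM "k.jpg" NDATA JPEG>
]>
<DOCUMENT>&chapter1;<PHOTO SOURCE="kimPhoto"/></DOCUMENT>
```

A text entity is referenced with &name; in content, while a binary entity can only be named through an attribute of type ENTITY. The sketch also assumes that entity9.xml holds content allowed inside DOCUMENT.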
Example: Entity Declaration (3/3): URL format
  [Figure: two directory trees. In the first, the directory xml contains document.xml and the subdirectory entities with entity9.xml. In the second, xml contains the subdirectory entities with entity9.xml and the subdirectory docs with document.xml.]
Content of Logical structure
  Concepts
  DTD Structure
  Element Declaration
  Attribute Declarations
  Parameter Entities
  Conditional Sections
  Notation Declarations
  DTD Processing Issues
Concepts of DTD (1/3): DTD (Document Type Definition)
  An optional but powerful feature of XML.
  Comprises a set of declarations that define a document structure tree.
  XML processors read the DTD, check whether the document is valid, and use it to build the document model in memory.
  Acts as a meta markup language for describing the user's own tag set.
Concepts of DTD (2/3)
  A DTD describes elements, attributes, notations, and the relations between elements.
  It establishes formal document structure rules.
Concepts of DTD (3/3)
  Declare vs. define
    Declare: "This document is a concert poster."
    Define: "A concert poster must have the following features."
    A DTD defines element types, attributes, and entities.
  Valid vs. invalid
    Valid: conforms to the DTD.
    Invalid: fails to conform to the DTD.
  [Figure: valid XML documents shown as a subset of well-formed XML documents.]
Valid & Invalid Documents
  Example: <!DOCTYPE GREETING [ ]>
  Valid: a GREETING element containing various random text but no markup.
  Invalid: anything else, including various random text wrapped in further markup.
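To make the distinction concrete, here is a sketch that assumes the GREETING element is declared to hold text only. The declaration inside the brackets did not survive on the slide, so the content model (#PCDATA) and the element name B are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE GREETING [
  <!ELEMENT GREETING (#PCDATA)>
]>
<!-- Valid: text only, no child markup -->
<GREETING>various random text but no markup</GREETING>
<!-- Invalid against this DTD, although still well formed:
     <GREETING><B>various random text</B></GREETING>
     because B is neither declared nor allowed inside GREETING -->
```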
DTD structure
  A DTD is composed of a number of declarations:
    ELEMENT (tag definition)
    ATTLIST (attribute definitions)
    ENTITY (entity definition)
    NOTATION (data type notation definition)
  A DTD can be stored in an external subset or an internal subset.
Internal and External Subset (1/3): Internal subset
  Form: <!DOCTYPE … [ … ]>
  Pros: easy to write, since the declarations and the document are edited in one file without moving between two files.
  Cons: other documents cannot reuse the declarations without copying the internal subset.
Internal and External Subset (2/3): External subset
  It is generally better to use external DTDs.
  Benefits: document management, updating, and editing.
  Reasons:
    With an external DTD, you can use public DTDs (capability).
    External DTDs provide for better document management.
    External DTDs make it easier to validate your document.
Internal and External Subset (3/3)
  [Figure: a document with an internal subset and an external subset; the full parsing path covers both subsets.]
Element Declarations
  An ELEMENT type declaration defines a new element: it gives the element's name and its content model, which specifies the allowed content.
  Each tag must be declared in an ELEMENT declaration.
  The content model uses a simple regular-expression-like grammar to specify precisely what is and isn't allowed in an element.
Content Specifications
  ANY
  #PCDATA
  Sequences
  Choices
  Mixed Content
  Modifiers
  Empty
ANY
  With the content specification ANY, a SEASON element can contain any declared child element and/or raw text (parsed character data).
  Rarely used in practice, because it places no constraint on structure.
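A short sketch of ANY applied to the SEASON element. The LEAGUE child is declared here as plain text purely to keep the example small; that simplification is an assumption, not the deck's own declaration.

```xml
<?xml version="1.0"?>
<!DOCTYPE SEASON [
  <!-- ANY: text and any declared element may appear, in any order -->
  <!ELEMENT SEASON ANY>
  <!ELEMENT LEAGUE (#PCDATA)>
]>
<SEASON>raw text, then a declared child: <LEAGUE>National</LEAGUE></SEASON>
```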
#PCDATA
  Parsed Character Data, i.e. raw text with no markup.
  Represents normal data. The keyword is preceded by the hash symbol, '#', to avoid confusion with an identically named element when used within a model group, for example '(#PCDATA | PCDATA)'.
Use of #PCDATA in XML
  Valid: text-only content, such as "The year of our Lord one thousand, nine hundred, and ninety-nine".
  Invalid: content that contains child markup, such as separate elements wrapping January, February, March, April, May, June, July, August, September, October, November, and December.
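The element names on this slide were lost, so the sketch below uses the hypothetical names YEAR and MONTH to show the same contrast between text-only content and content with child markup.

```xml
<?xml version="1.0"?>
<!DOCTYPE YEAR [
  <!ELEMENT YEAR (#PCDATA)>
]>
<!-- Valid: only character data inside YEAR -->
<YEAR>The year of our Lord one thousand, nine hundred, and ninety-nine</YEAR>
<!-- Invalid against this DTD: child markup inside YEAR, for example
     <YEAR><MONTH>January</MONTH><MONTH>February</MONTH></YEAR> -->
```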
Child Elements
  To declare that a LEAGUE element must have a LEAGUE_NAME child: <!ELEMENT LEAGUE (LEAGUE_NAME)>
Sequences (1/2)
  Separate multiple required child elements with commas.
  One or more children: suffix the child with a plus sign (+).
Sequences (2/2)
  Zero or more children: suffix the child with an asterisk (*).
  Choices: separate the alternatives with a vertical bar (|).
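The content-model operators from these slides (required child, comma sequence, +, *, and choice) can be sketched in one small DTD. SEASON, LEAGUE, and LEAGUE_NAME appear in the deck; YEAR, TEAM, TEAM_CITY, and TEAM_NAME are hypothetical names added for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE SEASON [
  <!-- Sequence: exactly one YEAR, then one or more LEAGUE children -->
  <!ELEMENT SEASON (YEAR, LEAGUE+)>
  <!ELEMENT YEAR (#PCDATA)>
  <!-- A required child, then zero or more TEAM children -->
  <!ELEMENT LEAGUE (LEAGUE_NAME, TEAM*)>
  <!ELEMENT LEAGUE_NAME (#PCDATA)>
  <!-- Choice: a TEAM holds either a TEAM_CITY or a TEAM_NAME, not both -->
  <!ELEMENT TEAM (TEAM_CITY | TEAM_NAME)>
  <!ELEMENT TEAM_CITY (#PCDATA)>
  <!ELEMENT TEAM_NAME (#PCDATA)>
]>
<SEASON>
  <YEAR>1999</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National</LEAGUE_NAME>
    <TEAM><TEAM_NAME>Mets</TEAM_NAME></TEAM>
  </LEAGUE>
  <LEAGUE>
    <LEAGUE_NAME>American</LEAGUE_NAME>
  </LEAGUE>
</SEASON>
```

The second LEAGUE has no TEAM at all, which TEAM* permits; dropping every LEAGUE would violate LEAGUE+.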
Grouping With Parentheses
  Parentheses combine several elements into a single unit.
  A parenthesized group can be nested inside other parentheses in place of a single element.
  A parenthesized group can be suffixed with a plus sign, an asterisk, or a question mark.
Mixed Content
  Both #PCDATA and child elements appear in a choice.
  #PCDATA must come first, and the choice group must be followed by an asterisk.
  #PCDATA cannot be used in a sequence.
Empty elements
  Declared with the keyword EMPTY; they have no content.
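A sketch combining grouping, the optional modifier, mixed content, and an empty element. All element names here (PLAYER, NAME, NICKNAME, GAMES, HITS, NOTE, EM, BR) are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE PLAYER [
  <!-- Grouping: the parenthesized pair may repeat; NICKNAME is optional (?) -->
  <!ELEMENT PLAYER (NAME, NICKNAME?, (GAMES, HITS)*, NOTE)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT NICKNAME (#PCDATA)>
  <!ELEMENT GAMES (#PCDATA)>
  <!ELEMENT HITS (#PCDATA)>
  <!-- Mixed content: #PCDATA first, alternatives in a choice, group ends with * -->
  <!ELEMENT NOTE (#PCDATA | EM | BR)*>
  <!ELEMENT EM (#PCDATA)>
  <!-- Empty element: no content at all -->
  <!ELEMENT BR EMPTY>
]>
<PLAYER>
  <NAME>Kim</NAME>
  <GAMES>10</GAMES><HITS>12</HITS>
  <GAMES>8</GAMES><HITS>5</HITS>
  <NOTE>Plays <EM>very</EM> well.<BR/>Joined in 1999.</NOTE>
</PLAYER>
```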
Attribute Declarations
  Consider an element with the content "Hola!" that carries an attribute.
  The attribute is declared in an ATTLIST declaration, which names the element, the attribute, the attribute type, and the default value.
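The element and its declaration were stripped from this slide. A plausible reconstruction, offered only as a sketch, uses the GREETING element seen elsewhere in the deck; the attribute name LANGUAGE and its values are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE GREETING [
  <!ELEMENT GREETING (#PCDATA)>
  <!-- element name, attribute name, attribute type, default value -->
  <!ATTLIST GREETING LANGUAGE CDATA "English">
]>
<GREETING LANGUAGE="Spanish">Hola!</GREETING>
```

If the LANGUAGE attribute were omitted from the element, the default value English would apply.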
Multiple Attribute Declarations
  Consider a RECTANGLE element with LENGTH and WIDTH attributes.
  The attributes can be declared with two attribute declarations or with one.
  With one attribute declaration: <!ATTLIST RECTANGLE LENGTH CDATA "0px" WIDTH CDATA "0px">
  Indentation is a convention, not a requirement.
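For comparison with the single declaration, the two-declaration form might be written as follows. The EMPTY content model and the attribute values in the instance are assumptions made to complete the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE RECTANGLE [
  <!ELEMENT RECTANGLE EMPTY>
  <!-- Two separate declarations, one attribute each -->
  <!ATTLIST RECTANGLE LENGTH CDATA "0px">
  <!ATTLIST RECTANGLE WIDTH CDATA "0px">
]>
<RECTANGLE LENGTH="70px" WIDTH="85px"/>
```

Both forms declare the same two attributes with the same defaults.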
Attribute Types
  CDATA
  ID
  IDREF
  IDREFS
  ENTITY
  ENTITIES
  NOTATION
  NMTOKEN
  NMTOKENS
  Enumerated
CDATA
  The most general attribute type.
  The value can be any string of text not containing a less-than sign (<) or quotation marks (").
ID
  The value must be an XML name.
    May include letters, digits, underscores, hyphens, and periods, but must begin with a letter or an underscore.
    May not include whitespace.
    May contain colons only if used for namespaces.
  The value must be unique among all ID-type attributes in the document.
  Generally the default is #REQUIRED (an ID attribute must be declared either #REQUIRED or #IMPLIED).
IDREF
  The value matches the ID of an element in the same document.
  Used for links and the like.
IDREFS
  The value is a list of ID values in the same document, separated by white space.
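A minimal sketch of ID, IDREF, and IDREFS working together. Every element and attribute name below is hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE TEAM [
  <!ELEMENT TEAM (PLAYER+, CAPTAIN, PAIRING)>
  <!ELEMENT PLAYER (#PCDATA)>
  <!-- ID: each value must be unique in the document -->
  <!ATTLIST PLAYER PLAYER_ID ID #REQUIRED>
  <!ELEMENT CAPTAIN EMPTY>
  <!-- IDREF: must match some ID value in the same document -->
  <!ATTLIST CAPTAIN WHO IDREF #REQUIRED>
  <!ELEMENT PAIRING EMPTY>
  <!-- IDREFS: a white-space separated list of ID values -->
  <!ATTLIST PAIRING MEMBERS IDREFS #REQUIRED>
]>
<TEAM>
  <PLAYER PLAYER_ID="p1">Kim</PLAYER>
  <PLAYER PLAYER_ID="p2">Lee</PLAYER>
  <CAPTAIN WHO="p1"/>
  <PAIRING MEMBERS="p1 p2"/>
</TEAM>
```

A validating parser would reject the document if WHO named an ID that no PLAYER carries, or if two PLAYER elements shared the same PLAYER_ID.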
ENTITY
  The value is the name of an unparsed general entity declared in the DTD.
ENTITIES
  The value is a list of names of unparsed general entities declared in the DTD, separated by white space.
NOTATION
  The value is the name of a notation declared in the DTD.
  Example names: TEXVIEW.EXE, LOGO.TEX
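The ENTITY and NOTATION attribute types can be sketched together using the two file names that survive on the slide. How TEXVIEW.EXE and LOGO.TEX were actually used in the original example is not recoverable, so the declarations and the names DOCUMENT, LOGO, FORMULA, LOGO_FILE, SOURCE, and FORMAT are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE DOCUMENT [
  <!ELEMENT DOCUMENT (LOGO, FORMULA)>
  <!ELEMENT LOGO EMPTY>
  <!ELEMENT FORMULA (#PCDATA)>
  <!-- Notation: names a data format and a helper that can handle it -->
  <!NOTATION TEX SYSTEM "TEXVIEW.EXE">
  <!-- Unparsed entity in that notation -->
  <!ENTITY LOGO_FILE SYSTEM "LOGO.TEX" NDATA TEX>
  <!-- ENTITY attribute: its value is the name of an unparsed entity -->
  <!ATTLIST LOGO SOURCE ENTITY #REQUIRED>
  <!-- NOTATION attribute: its value is one of the listed notation names -->
  <!ATTLIST FORMULA FORMAT NOTATION (TEX) #REQUIRED>
]>
<DOCUMENT>
  <LOGO SOURCE="LOGO_FILE"/>
  <FORMULA FORMAT="TEX">E = mc^2</FORMULA>
</DOCUMENT>
```

The NOTATION attribute is placed on FORMULA rather than on the empty LOGO element because XML 1.0 does not allow a NOTATION attribute on an element declared EMPTY.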
NMTOKEN
  The value is a name token: it is built from the same characters as an XML name, but any of them may come first (for example, it may start with a digit).
NMTOKENS
  The value is a list of name tokens, separated by white space.
Enumerated
  Not a keyword.
  Refers to a list of possible values from which one must be chosen.
  A default value is generally provided explicitly.
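A short sketch of NMTOKEN, NMTOKENS, and an enumerated type on one element. The MEETING element, its attribute names, and the values are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE MEETING [
  <!ELEMENT MEETING (#PCDATA)>
  <!-- The enumerated type is written as a list of values, with no keyword -->
  <!ATTLIST MEETING
    ROOM  NMTOKEN  #REQUIRED
    TAGS  NMTOKENS #IMPLIED
    DAY   (MON | TUE | WED | THU | FRI) "MON">
]>
<MEETING ROOM="301-A" TAGS="xml dtd 2001" DAY="WED">Lab seminar</MEETING>
```

The values 301-A and 2001 are legal name tokens even though they begin with a digit, which would not be allowed for an ID value.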
Attribute Default Values
  A literal string value, or one of these three keywords:
    #REQUIRED
    #IMPLIED
    #FIXED
#REQUIRED
  No default value is provided in the DTD.
  Document authors must provide the attribute value for each element.
#IMPLIED
  No default value is provided in the DTD.
  The author may (but does not have to) provide a value with each element.
#FIXED
  The value is the same for all elements.
  The default value must be provided in the DTD.
  The document author may not change the default value.
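The four kinds of default can be compared side by side in one declaration. The POSTER element and its attributes are hypothetical, loosely following the concert poster mentioned earlier in the deck.

```xml
<?xml version="1.0"?>
<!DOCTYPE POSTER [
  <!ELEMENT POSTER (#PCDATA)>
  <!ATTLIST POSTER
    TITLE    CDATA #REQUIRED
    VENUE    CDATA #IMPLIED
    VERSION  CDATA #FIXED "1.0"
    LANGUAGE CDATA "English">
]>
<!-- TITLE must be given; VENUE may be left out;
     VERSION is always 1.0; LANGUAGE falls back to English -->
<POSTER TITLE="Spring Concert">Doors open at 7 pm</POSTER>
```

Writing VERSION="2.0" on the element would make the document invalid, and omitting TITLE would too.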
Example of Internal DTDs
  Example: <!DOCTYPE GREETING [ ]> with the document content Hello XML!
Internal DTD Subsets
  Internal declarations override external declarations.
  Example: <!DOCTYPE GREETING SYSTEM "greeting.dtd" [ ]> with the document content Hello XML!
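A sketch of the override rule, assuming that the external file greeting.dtd declares the GREETING element and an entity named salutation. The contents of greeting.dtd are not shown in the deck, so both the file's declarations and the entity name are hypothetical.

```xml
<?xml version="1.0"?>
<!-- Assumed contents of greeting.dtd:
       <!ELEMENT GREETING (#PCDATA)>
       <!ENTITY salutation "Hello">
-->
<!DOCTYPE GREETING SYSTEM "greeting.dtd" [
  <!-- Read before the external subset, so this declaration is the one that binds -->
  <!ENTITY salutation "Hola">
]>
<GREETING>&salutation; XML!</GREETING>
```

Under these assumptions the content expands to "Hola XML!". The rule covers entity and attribute-list declarations; declaring the same element type twice is a validity error rather than an override.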