1 Tutorial 3: XML Creating a Valid XML Document

2 Creating a Valid Document: You validate documents to make certain necessary elements are never omitted. For example, each customer order should include a customer name, address, and phone number. A document is also validated to prevent errors in its content or structure. An XML document can be validated using either DTDs (Document Type Definitions) or schemas.

3 Customer orders table

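4 Sample customer data in orders.xml: name David Lynn; address (stored in a CDATA section) 211 Fox Street, Greenville, NH 80021; phone (315) 555-1812; e-mail dlynn@nhs.net; order date 8/1/2008; item codes DCT5Z, SM128, RCL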

5 The structure of the orders.xml document

6 Writing the Document Type Declaration: <!DOCTYPE customers [ ]> DTD statements are inserted between the square brackets. The root element of the document must match the root element listed in the DOCTYPE declaration (here, customers).

7 Declaring a DTD: A DTD can be used to ensure all required elements are present; prevent undefined elements from being used; enforce a specific data structure; specify the use of attributes and define their possible values; define default values for attributes; and describe how the parser should access non-XML or non-textual content.

8 Declaring a DTD: A document type definition is a collection of rules or declarations that define the content and structure of the document. A document type declaration attaches those rules to the document’s content. You create a DTD by first entering a document type declaration into your XML document.

9 Declaring a DTD: While there can only be one DTD per XML document, it can be divided into two parts: an internal subset and an external subset. An internal subset consists of declarations placed in the same file as the document content. An external subset is located in a separate file.

10 Declaring a DTD: To declare an internal DTD subset, use: <!DOCTYPE root [ declarations ]> where root is the name of the document’s root element and declarations are the statements that make up the DTD.
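
As a minimal sketch of the internal-subset form above, the following document carries its own DTD. The two element declarations are a simplified assumption for illustration (the real orders.xml structure is richer), and element declarations themselves are explained later in the tutorial.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- The name after DOCTYPE (customers) must match the root element. -->
<!DOCTYPE customers [
  <!-- Simplified, assumed declarations: one or more text-only customers. -->
  <!ELEMENT customers (customer+)>
  <!ELEMENT customer (#PCDATA)>
]>
<customers>
  <customer>David Lynn</customer>
</customers>
```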

11 To declare an external DTD subset with a system or public location, use: <!DOCTYPE root SYSTEM "uri"> or <!DOCTYPE root PUBLIC "id" "uri"> where id is a text string that tells an application how to locate the external subset and uri is the location and filename of the external subset. Unless your application requires a public identifier, you should use the SYSTEM location form.

12 A DOCTYPE declaration can indicate both an external and an internal subset. The syntax is: <!DOCTYPE root SYSTEM "uri" [ declarations ]> or <!DOCTYPE root PUBLIC "id" "uri" [ declarations ]>
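
A sketch of the combined form might look like the following. The file name orders.dtd and the storeName entity are hypothetical, chosen only to show where each subset goes; the document body is elided because its validity depends on what the external file declares.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- External subset: shared rules in a separate file (hypothetical name). -->
<!DOCTYPE customers SYSTEM "orders.dtd" [
  <!-- Internal subset: declarations specific to this one document. -->
  <!ENTITY storeName "Greenville branch">
]>
<customers>
  <!-- document content, validated against both subsets, goes here -->
</customers>
```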

13 Declaring a DTD: The real power of XML comes from an external DTD that can be shared among many documents. If a document contains both an internal and an external subset, the internal subset takes precedence over the external subset if there is a conflict between the two. This way, the external subset defines basic rules for all the documents, and the internal subset defines the rules specific to each document.

14 Combining External and Internal DTDs

15 Declaring Document Elements: In a valid document, every element must be declared in the DTD. An element (type) declaration specifies the name of the element and indicates what kind of content the element can contain. The syntax is: <!ELEMENT element content-model> where element is the name of the element and content-model specifies what type of content the element contains. The element name is case sensitive.

16 Five different types of element content for content-model: ANY - No restrictions on the element’s content. EMPTY - The element cannot store any content. #PCDATA - The element can only contain parsed character data. Elements - The element can only contain child elements. Mixed - The element contains both parsed character data and child elements.

17 ANY Content: The declared element can store any type of content. The syntax is: <!ELEMENT element ANY> An element declared this way may contain text such as SLR100 Digital Camera, declared child elements, or nothing at all.

18 EMPTY Content: This is reserved for elements that store no content. The syntax is: <!ELEMENT element EMPTY> An empty-element tag such as <element /> satisfies this declaration.

19 #PCDATA Content: The element can store parsed character data. The syntax is: <!ELEMENT element (#PCDATA)> For example, <!ELEMENT name (#PCDATA)> would permit the following element in an XML document: – <name>Lea Ziegler</name> An element declared as #PCDATA does not allow child elements.
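
The three content models above can be sketched side by side. The element names catalog, product, and discontinued are hypothetical, and the root uses a child sequence (a form covered on the following slides) only so that the example is a complete document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE catalog [
  <!-- Root holds the three sample elements in a fixed order. -->
  <!ELEMENT catalog (product, discontinued, name)>
  <!ELEMENT product ANY>           <!-- any text or declared elements -->
  <!ELEMENT discontinued EMPTY>    <!-- no content allowed -->
  <!ELEMENT name (#PCDATA)>        <!-- text only, no child elements -->
]>
<catalog>
  <product>SLR100 Digital Camera</product>
  <discontinued/>
  <name>Lea Ziegler</name>
</catalog>
```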

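20 Working with Child Elements: The syntax is: – <!ELEMENT element (children)> where element is the parent element and children is a listing of its child elements. A declaration such as <!ELEMENT customer (phone)> indicates that the following would be invalid, because the customer element may contain only a phone element: <customer> <name>Lea Ziegler</name> <phone>555-2819</phone> </customer>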

21 Working with Child Elements: To declare the order of child elements, use: – <!ELEMENT element (child1, child2, …)> where child1, child2, … is the order in which the child elements must appear within the parent element. Thus, <!ELEMENT customer (name, phone, email)> indicates that the customer element should contain three child elements named name, phone, and email, in that order.

22 Working with Child Elements: To allow for a choice of child elements, use: – <!ELEMENT element (child1 | child2 | …)> where child1, child2, etc. are the possible child elements of the parent element. – <!ELEMENT customer (name | company)> allows the customer element to contain either the name element or the company element.
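
Sequences and choices can be nested. The sketch below assumes a customer that starts with either a name or a company and then always has a phone and an email, in that order; the e-mail address is a made-up placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customer [
  <!-- A choice (name or company) followed by a fixed sequence. -->
  <!ELEMENT customer ((name | company), phone, email)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<customer>
  <name>Lea Ziegler</name>
  <phone>555-2819</phone>
  <email>lziegler@example.com</email>
</customer>
```

Swapping phone and email, or supplying both name and company, would make this document invalid.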

23 Modifying Symbols: A modifying symbol specifies the number of occurrences of each element: – ? allows zero or one of the item. – + allows one or more of the item. – * allows zero or more of the item. Modifying symbols can be applied within sequences or choices. They can also modify an entire sequence or choice when the symbol is placed immediately after its closing parenthesis.

24 Modifying Symbols: <!ELEMENT customers (customer+)> indicates that the customers element must contain at least one element named customer. <!ELEMENT order (orderDate, items)+> indicates that the child element sequence (orderDate, items) can be repeated one or more times within each order element. A declaration such as <!ELEMENT customer (name, address, phone, email?)> allows the customer element to contain zero or one email elements.
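
A short sketch of the three symbols in use, under an assumed and simplified customer structure (a required name, any number of phone elements, and at most one email):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customers [
  <!ELEMENT customers (customer+)>              <!-- one or more -->
  <!ELEMENT customer (name, phone*, email?)>    <!-- zero or more, zero or one -->
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<customers>
  <customer>
    <name>David Lynn</name>
    <phone>(315) 555-1812</phone>
    <email>dlynn@nhs.net</email>
  </customer>
  <!-- Also valid: phone and email are both optional here. -->
  <customer>
    <name>Lea Ziegler</name>
  </customer>
</customers>
```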

25 Working with Mixed Content: Mixed content elements contain both parsed character data and child elements. The syntax is: <!ELEMENT element (#PCDATA | child1 | child2 | …)*> The parent element can contain character data or any number of the specified child elements, or it can contain no content at all. It is better not to work with mixed content if you want a tightly structured document.
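
As an illustration of the mixed-content form, the hypothetical note element below freely interleaves text with name and phone elements. Note that #PCDATA must come first and the group must end with an asterisk.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!-- Text and the listed children may appear in any order, any number of times. -->
  <!ELEMENT note (#PCDATA | name | phone)*>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<note>Call <name>Lea Ziegler</name> at <phone>555-2819</phone> before shipping.</note>
```

Because the DTD cannot constrain the order or count of the children here, this form gives up most structural checking, which is why the slide advises against it for tightly structured documents.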

26 Declaring Element Attributes: For a document to be valid, all the attributes associated with elements must also be declared. To enforce attribute properties, you must add an attribute-list declaration to the document’s DTD.

27 Attributes used in orders.xml
Element    Attribute   Required?   Default Value(s)
customer   custID      Yes         None
customer   custType    No          "home" or "business"
name       Title       No          "Mr.", "Mrs.", "Ms."
order      orderID     Yes         None
order      orderBy     Yes         None
item       itemPrice   Yes         None
item       itemQty     No          "1"

28 Declaring Element Attributes: The attribute-list declaration lists the names of all attributes associated with a specific element, specifies the data type of each attribute, indicates whether each attribute is required or optional, and provides a default value for the attribute, if necessary.

29 Declaring Element Attributes: The syntax to declare a list of attributes is: <!ATTLIST element attribute1 type1 default1 attribute2 type2 default2 attribute3 type3 default3 … > – where element is the name of the element associated with the attributes, attribute is the name of an attribute, type is the attribute’s data type, and default indicates whether the attribute is required and whether it has a default value.
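
A sketch of an attribute-list declaration for the item element follows. The attribute names come from the orders.xml attribute table; the CDATA type and the price value are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE item [
  <!ELEMENT item (#PCDATA)>
  <!-- One ATTLIST can declare several attributes for the same element. -->
  <!ATTLIST item
      itemPrice CDATA #REQUIRED
      itemQty   CDATA "1">
]>
<!-- itemQty is omitted, so a validating parser supplies the default "1". -->
<item itemPrice="179.95">DCT5Z</item>
```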

30 Declaring Element Attributes: Attribute-list declarations can be placed anywhere within the document type declaration, although they are easier to work with if they are located adjacent to the declaration for the element with which they are associated.

31 Working with Attribute Types: Attribute values can consist only of character data, but you can control the format of those characters. Three general categories of attribute values are: CDATA, which can contain any character except those reserved by XML; enumerated types, which are limited to a set of possible values; and tokenized types, which are text strings that follow certain rules for format and content.

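32 CDATA: The syntax is: <!ATTLIST element attribute CDATA default> An attribute declared this way accepts any character data as its value, apart from the characters reserved by XML.

33 Enumerated Types: The general form for an enumerated type is: attribute (value1 | value2 | value3 | …) where value1, value2, … are the allowed values. Under a declaration such as <!ATTLIST customer custType (home | business) #IMPLIED> any custType attribute whose value is not “home” or “business” causes parsers to reject the document as invalid.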
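
A minimal sketch of an enumerated attribute in context. The customer element is reduced to text-only content here as a simplifying assumption, and #IMPLIED is assumed because the attribute table lists custType as not required.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customers [
  <!ELEMENT customers (customer+)>
  <!ELEMENT customer (#PCDATA)>
  <!-- Only these two values are accepted; the attribute itself is optional. -->
  <!ATTLIST customer custType (home | business) #IMPLIED>
]>
<customers>
  <customer custType="home">David Lynn</customer>
  <customer>Lea Ziegler</customer>
  <!-- custType="school" on a customer would make the document invalid. -->
</customers>
```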

34 Working with Attribute Types: Another type of enumerated attribute is notation. It associates the value of the attribute with a declaration located elsewhere in the DTD. The notation provides information to the XML parser about how to handle non-XML data.

35 Tokenized Types: Tokenized types are character strings that follow certain rules for format and content. To declare an attribute as a tokenized type, use: attribute token DTDs support seven tokens: ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, and ENTITIES. An ID is used when an attribute value must be unique within a document. For example: – <!ATTLIST customer custID ID #REQUIRED> This ensures each customer will have a unique ID.

36 IDREF Token: An attribute declared with the IDREF token must have a value equal to the value of an ID attribute located elsewhere in the same document. This enables an XML document to contain cross-references between one element and another.
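
The ID and IDREF tokens work as a pair. In the sketch below, the orders root element, the flattened structure, and the choice of IDREF for orderBy are all assumptions for illustration. ID values must be valid XML names, so they cannot begin with a digit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE orders [
  <!ELEMENT orders (customer+, order+)>
  <!ELEMENT customer (#PCDATA)>
  <!ELEMENT order (#PCDATA)>
  <!-- Each custID must be unique within the document. -->
  <!ATTLIST customer custID ID #REQUIRED>
  <!-- Each orderBy must match some ID value in the document. -->
  <!ATTLIST order orderBy IDREF #REQUIRED>
]>
<orders>
  <customer custID="cust201">David Lynn</customer>
  <order orderBy="cust201">DCT5Z</order>
</orders>
```

A second customer with custID="cust201", or an order whose orderBy matches no ID, would be reported as a validity error.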

37 Attribute Defaults: There are four possible defaults: – #REQUIRED: The attribute must appear with every occurrence of the element. – #IMPLIED: The attribute is optional. – A default value: A validating XML parser will supply the default value if one is not specified. – #FIXED: The attribute is optional, but if a value is specified, it must match the default.
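
The four kinds of default can be sketched in one declaration. Only orderID comes from the orders.xml attribute table; the gift, shipping, and currency attributes and all values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE order [
  <!ELEMENT order (#PCDATA)>
  <!ATTLIST order
      orderID  CDATA #REQUIRED
      gift     CDATA #IMPLIED
      shipping CDATA "ground"
      currency CDATA #FIXED "USD">
]>
<!-- orderID must be present; gift may be omitted; shipping defaults to
     "ground"; currency, if written at all, must be exactly "USD". -->
<order orderID="or10311">DCT5Z</order>
```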

38 Validating a Document with XMLSpy: XMLSpy is an XML development environment created by Altova, which is used for designing and editing professional applications involving XML, XML Schema, and other XML-based technologies. Install and use the XMLSpy Home Edition, a free application which can be downloaded from the Altova Web site at http://www.altova.com/

39 Introducing Entities: Entities are storage units for a document’s content. The most fundamental entity is the XML document itself, known as the document entity. Entities can also refer to: a text string; a DTD; an element or attribute declaration; an external file containing character or binary data.

40 Working with Entities: Entities can be declared in a DTD. How to declare an entity depends on how it is classified. There are three factors involved in classifying entities: the content of the entity; how the entity is constructed; and where the definition of the entity is located.

41 General Parsed Entities: To create an internal parsed entity, use: – <!ENTITY entity "value"> where entity is the name assigned to the entity and value is the entity’s value. For example, to store the product description for the Tapan digital camera, use: <!ENTITY DCT5Z "Tapan Digital Camera 5 Mpx – zoom"> The value can also include markup around the same text.

42 General Parsed Internal Entities: After an entity is declared, it can be referenced anywhere within the document. The syntax is: &entity; For example, &DCT5Z; is interpreted as Tapan Digital Camera 5 Mpx – zoom

43 General Parsed External Entities: For longer text strings, it is preferable to place the content in an external file. To create an external parsed entity, use: <!ENTITY entity SYSTEM "uri"> For example, in the declaration: <!ENTITY DCT5Z SYSTEM "description.xml"> an entity named DCT5Z gets its value from the description.xml file.
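
A sketch of internal general entities declared and then referenced. The entity names and values are the product codes and descriptions shown elsewhere in this presentation; the items/item structure is simplified for illustration. Note the semicolon that closes each reference.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE items [
  <!ELEMENT items (item+)>
  <!ELEMENT item (#PCDATA)>
  <!ENTITY DCT5Z "Tapan Digital Camera 5 Mpx – zoom">
  <!ENTITY SM128 "SmartMedia 128MB Card">
  <!ENTITY RCL "Rechargeable Lithium Ion Battery">
]>
<items>
  <!-- Each reference is replaced by the entity's value when parsed. -->
  <item>&DCT5Z;</item>
  <item>&SM128;</item>
  <item>&RCL;</item>
</items>
```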

44 Declare parsed entities in the codes.dtd file for the product codes in the orders.xml document (each declaration gives an entity name and an entity value)

45 Parameter Entities: Parameter entities are used to store the content of a DTD. For internal parameter entities, the syntax is: <!ENTITY % entity "value"> For external parameter entities, the syntax is: <!ENTITY % entity SYSTEM "uri"> Once a parameter entity has been declared, you can add a reference to it within the DTD using: %entity;

46 Using Parameter Entities to Combine Multiple DTDs

47 Add a parameter entity to the DTD within the orders.xml file to load the contents of the codes.dtd file: <!DOCTYPE customers [ … %itemCodes; ]> The items in the order dated 8/1/2008 then use the entity references &DCT5Z;, &SM128;, and &RCL;

48 The order as displayed: 8/1/2008; Tapan Digital Camera 5 Mpx – zoom; SmartMedia 128MB Card; Rechargeable Lithium Ion Battery

49 Parameter Entities: Parameter entity references can only be placed where a declaration would normally occur, such as in an internal or external DTD subset. Parameter entities used with an internal DTD do not offer any time or effort savings. However, an external parameter entity can allow XML to use more than one DTD per document by combining declarations from multiple DTDs.
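
The combination of a parameter entity with an external file can be sketched in two parts. The names codes.dtd, itemCodes, and the product entities come from this presentation; the items/item structure is a simplified assumption. First, the external file codes.dtd would hold only declarations:

```xml
<!ENTITY DCT5Z "Tapan Digital Camera 5 Mpx – zoom">
<!ENTITY SM128 "SmartMedia 128MB Card">
<!ENTITY RCL "Rechargeable Lithium Ion Battery">
```

The document then declares a parameter entity pointing at that file and references it between declarations, which pulls those entity declarations into its own DTD:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE items [
  <!ELEMENT items (item+)>
  <!ELEMENT item (#PCDATA)>
  <!-- Declare the parameter entity, then reference it with %name; -->
  <!ENTITY % itemCodes SYSTEM "codes.dtd">
  %itemCodes;
]>
<items>
  <item>&DCT5Z;</item>
  <item>&SM128;</item>
  <item>&RCL;</item>
</items>
```

Some parsers do not load external entities unless configured to do so, in which case the general entity references above would remain unresolved.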

50 Unparsed Entities: You need to create an unparsed entity in order to reference binary data such as images or video clips, or character data that is not well formed. The declaration of an unparsed entity includes instructions for how the entity should be treated. A notation is declared that identifies a resource to handle the unparsed data.

51 Unparsed Entities: For example, to create a notation named “audio” that points to an application recorder.exe, use: <!NOTATION audio SYSTEM "recorder.exe"> Once the notation has been declared, you then declare an unparsed entity that instructs the XML parser to associate the data with the notation.

52 Unparsed Entities: To take unparsed data in an audio file and assign it to an unparsed entity named “theme”, use: <!ENTITY theme SYSTEM "uri" NDATA audio> where uri is the location of the audio file. Here, the notation is the audio notation that points to the recorder.exe file. This declaration does not tell the recorder.exe application to run the file but simply identifies for the XML parser what resource is able to handle the unparsed data.
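
Putting the notation and the unparsed entity together, a sketch might look like this. The file name theme.wav and the clip element with its source attribute are hypothetical. An unparsed entity is not referenced with &theme; in the content; instead its name is given as the value of an attribute declared with the ENTITY token.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE clip [
  <!-- The notation names a resource able to handle the data. -->
  <!NOTATION audio SYSTEM "recorder.exe">
  <!-- The unparsed entity ties an external file to that notation. -->
  <!ENTITY theme SYSTEM "theme.wav" NDATA audio>
  <!ELEMENT clip EMPTY>
  <!ATTLIST clip source ENTITY #REQUIRED>
]>
<clip source="theme"/>
```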

