1
2005 http://www.cs.huji.ac.il/~dbi
XML: eXtensible Markup Language, Part 2
2
XML Entities
3
XML Entities should not be Confused with Entities in the Sense of the ER Model
An entity is a short name that denotes more complex information, which may reside inside or outside the XML document or its DTD
Entities save typing
Entities make changes easy (when the same change is likely to be needed in many places)
Sometimes entities must be used to avoid XML syntax violations
Applications should decode and encode entities, using their definitions
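A minimal sketch of encoding and decoding with the predefined entities, using Python's standard library. The sample string and the code element name are assumptions chosen for illustration; they do not come from the slides.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b && c > d"

# Written directly into a document, the raw text violates XML syntax.
try:
    ET.fromstring("<code>" + raw + "</code>")
    raw_is_well_formed = True
except ET.ParseError:
    raw_is_well_formed = False

# escape() encodes &, < and > as the predefined entities &amp;, &lt; and &gt;.
encoded = escape(raw)

# The parser decodes the entity references back to the original characters.
decoded = ET.fromstring("<code>" + encoded + "</code>").text

print(raw_is_well_formed)
print(encoded)
print(decoded)
```

The application encodes on output and relies on the parser to decode on input, so the text it works with never contains entity references.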
4
General entities
A general entity is defined in the DTD
It is used in the document by writing &Name;
5
Example
<!DOCTYPE mdb [ ]>
Oh God! Woody Allen $2M
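The markup of this example did not survive extraction. The following is only a sketch of what a document with general entities might look like: the entity names (wa, cost) and the element names (movie, title, director, budget) are assumptions, and only the text values come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: entity and element names are illustrative assumptions.
doc = """<!DOCTYPE mdb [
  <!ENTITY wa "Woody Allen">
  <!ENTITY cost "$2M">
]>
<mdb>
  <movie>
    <title>Oh God!</title>
    <director>&wa;</director>
    <budget>&cost;</budget>
  </movie>
</mdb>"""

root = ET.fromstring(doc)
movie = root.find("movie")

# The parser replaces each &Name; reference with the entity's definition.
director = movie.findtext("director")
budget = movie.findtext("budget")
print(director, budget)
```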
6
Browser View
7
Parameter Entities
Parameter entities are used only within DTDs
Internal entities are defined entirely within the DTD
External entities draw their content from outside files
Parameter Entity declaration: <!ENTITY % Name "replacement text">
8
An Example of a Parameter Entity
<!ATTLIST person
  friend (yes | no) #IMPLIED
  id ID #REQUIRED
  knows IDREFS #IMPLIED>
9
Unparsed Entities
<!DOCTYPE mdb [
<!ATTLIST movie
  id ID #REQUIRED
  opinion CDATA #IMPLIED
  starimage ENTITY #IMPLIED>
]>
Entities are defined
Types are defined
10
Data
Oh God! Woody Allen $2M
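A sketch of how an application might handle an unparsed entity, assuming Python's xml.parsers.expat. The ATTLIST is the one shown on the slide; the entity name (allen), the file name, and the notation (jpeg) are assumptions, since the original declarations were lost.

```python
import xml.parsers.expat

# Hypothetical reconstruction: the entity name, file name and notation are assumed.
doc = """<!DOCTYPE mdb [
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!ENTITY allen SYSTEM "allen.jpg" NDATA jpeg>
  <!ATTLIST movie id        ID     #REQUIRED
                  opinion   CDATA  #IMPLIED
                  starimage ENTITY #IMPLIED>
]>
<mdb>
  <movie id="m1" starimage="allen">
    <title>Oh God!</title>
  </movie>
</mdb>"""

unparsed = {}      # entity name -> (system identifier, notation name)
movie_attrs = {}   # attributes of the movie element

def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
    # An unparsed entity is the one declared with NDATA, so it has a notation.
    if notation is not None:
        unparsed[name] = (system_id, notation)

def on_start(tag, attrs):
    if tag == "movie":
        movie_attrs.update(attrs)

parser = xml.parsers.expat.ParserCreate()
parser.EntityDeclHandler = on_entity_decl
parser.StartElementHandler = on_start
parser.Parse(doc, True)

# The parser never reads allen.jpg; the application resolves the name itself.
image_file, notation = unparsed[movie_attrs["starimage"]]
print(image_file, notation)
```

The attribute value is just the entity's name; what to do with the referenced file is left to the application.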
11
Defining Entities
Entities can be defined
–in the local document, as part of the DOCTYPE definition
–with a link to external files that contain the entity data (this, too, is done through the DOCTYPE definition)
–in an external DTD
Define locally when the entity is used in only one particular document
Define by a link to an external file when the entity is used in many documents
12
Defining Entities – An Example
Local Definition:
<!DOCTYPE [
<!ENTITY copyright "Copyright 2000, As The World Spins Corp. All rights reserved. Please do not copy or use without authorization. For authorization contact legal@worldspins.com.">
]>
Global Definition:
<!DOCTYPE [
<!ENTITY copyright SYSTEM "http://www.worldspins.com/legal/copyright.xml">
]>
13
Another Example
<!DOCTYPE [
<!ENTITY copyright "Copyright 2000, As The World Spins Corp. All rights reserved. Please do not copy or use without authorization. For authorization contact legal@worldspins.com.">
]>
14
Example (cont’d)
Mini-globe revolutionizes keychain industry
Today As The World Spins introduces a new approach to key chains. With the new MINI-GLOBE keys can be kept inside a chain, called for upon demand, and stored safely. Never more will consumers lose a key or stand at a door flipping through a stack of keys seeking the right one.
&trademark; &copyright;
15
XML Namespaces
16
XML Namespaces
When an element name appears in two different XML documents, we would like to know that it has the same meaning in both documents
–Is the tag used as the XHTML tag in both documents?
–If two documents about books have the tag, does it mean that they use the same system for cataloging books?
17
What XML Namespaces are and What They are not
Namespaces merely provide a mechanism for creating unique names (for elements and attributes) that can be used in XML documents all over the Web
–A namespace is just a collection of names that were created for a specific domain of applications
Namespaces are not DTDs, and they do not provide a mechanism for validating XML documents against multiple DTDs
18
Identifying an XML Namespace
A namespace is identified by a URI
The URI does not have to point to anything
–It is merely used as a mechanism for creating unique names
An element or attribute name from a namespace has two parts: prefix:name
–prefix identifies the namespace
–name is just a name from the namespace
19
Namespaces are not Part of the XML 1.0 Recommendation
When an XML 1.0 parser sees a qualified name prefix:name, the parser treats this name just as it would treat any other attribute or element name (it is legal to use the character “:” in element and attribute names)
Namespaces must be hardwired into DTDs
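The difference between plain XML 1.0 parsing and namespace-aware parsing can be sketched with Python's xml.parsers.expat, which supports both modes. The prefix foo and the element names are illustrative assumptions, and the prefix is deliberately left undeclared.

```python
import xml.parsers.expat

# No xmlns:foo declaration appears anywhere in this text.
doc = "<foo:doc><foo:title>XML Namespaces</foo:title></foo:doc>"

def element_names(parser, text):
    """Parse text and return the element names exactly as the parser reports them."""
    names = []
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(text, True)
    return names

# Plain XML 1.0 parsing: the colon is an ordinary name character.
plain_names = element_names(xml.parsers.expat.ParserCreate(), doc)

# Namespace-aware parsing rejects the same text, because foo is not bound to a URI.
try:
    element_names(xml.parsers.expat.ParserCreate(namespace_separator=" "), doc)
    namespace_error = None
except xml.parsers.expat.ExpatError as exc:
    namespace_error = str(exc)

print(plain_names)
print(namespace_error)
```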
20
But
When an application sees a qualified name, it may recognize it and act accordingly
–A browser identifies tags that belong to the XHTML namespace and processes them
–An XSLT processor identifies tags and attributes that belong to the XSLT namespace and executes them
21
The W3C Recommendation for Namespaces in XML
The two-part naming system is the only thing defined in the W3C Namespace recommendation
–and even that is not so short!
This recommendation is just a collection of syntactic rules
–Some rules are rather subtle
22
Declaring a Namespace
An XML namespace is declared in the xmlns attribute
XML Namespaces John Doe
Using foo as the prefix, instead of using the URI, is more convenient
23
The Default Namespace
The default namespace is declared without a prefix
XML Namespaces John Doe
All the elements belong to the default namespace
24
Technically
The namespace mechanism is just a mapping from prefixes to URIs, e.g.,
– is replaced with
It is done in a processing layer that operates on the element tree resulting from XML 1.0 parsing
It creates unique names
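The prefix-to-URI mapping can be observed with Python's xml.etree.ElementTree, which reports each name as {URI}name. This is a sketch: the URI http://example.org/foo and the element names are assumptions, and only the prefix foo and the text values come from the slides.

```python
import xml.etree.ElementTree as ET

# The URI is hypothetical and does not need to resolve to anything.
prefixed = """<foo:doc xmlns:foo="http://example.org/foo">
  <foo:title>XML Namespaces</foo:title>
  <foo:author>John Doe</foo:author>
</foo:doc>"""

# The same document, written with a default namespace instead of a prefix.
default = """<doc xmlns="http://example.org/foo">
  <title>XML Namespaces</title>
  <author>John Doe</author>
</doc>"""

prefixed_tags = [el.tag for el in ET.fromstring(prefixed).iter()]
default_tags = [el.tag for el in ET.fromstring(default).iter()]

print(prefixed_tags)
print(prefixed_tags == default_tags)
```

Both spellings yield the same expanded names, which is why the choice of prefix has no effect on a namespace-aware application.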
25
DTDs as Namespaces
The URI of a namespace may point to a DTD
A DTD defines a namespace comprising all its element names and attribute names
–But it is just a namespace, not a DTD!
26
Example
xmlns:bib="http://www.acm.org/bibliography.dtd" xmlns:isbn="http://www.isbn-org.org/def.dtd">
Proceedings of SIGMOD 472010 1-58113-332-4
This document is invalid according to either DTD!
But the document is well formed! (e.g., in the book element, attribute names are unique)
27
Alternatively, One Namespace can be Declared as the Default
xmlns="http://www.acm.org/bibliography.dtd" xmlns:isbn="http://www.isbn-org.org/def.dtd">
Proceedings of SIGMOD 472010 1-58113-332-4
This document is well formed but invalid according to either DTD!
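A sketch of how an application might read a document that mixes the two namespaces, using xml.etree.ElementTree. The two URIs are the ones on the slides; the element names (book, title, number) are assumptions, because the original tags were lost.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; only the URIs and text values come from the slides.
doc = """<book xmlns="http://www.acm.org/bibliography.dtd"
      xmlns:isbn="http://www.isbn-org.org/def.dtd">
  <title>Proceedings of SIGMOD</title>
  <isbn:number>1-58113-332-4</isbn:number>
</book>"""

# The prefixes used for lookup are local to this code; only the URIs must match.
ns = {
    "b": "http://www.acm.org/bibliography.dtd",
    "i": "http://www.isbn-org.org/def.dtd",
}

root = ET.fromstring(doc)
title = root.findtext("b:title", namespaces=ns)
number = root.findtext("i:number", namespaces=ns)
print(title, number)
```

The parse succeeds because the document is well formed; nothing here checks it against either DTD.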
28
Scope of Namespaces
The scope of a namespace declaration is the element containing the declaration and all descendant elements
–Names from the namespace must be written with the prefix anywhere in the scope
A declaration can be overridden by a new declaration in a descendant element; in Namespaces in XML 1.0, only the default namespace can be undeclared (with xmlns="")
More than one namespace can be declared in the same scope
–At most one can be the default namespace
–All others must have unique prefixes
29
What about Attributes?
Recall that element names and attribute names must be qualified if they belong to a nondefault namespace
Unqualified element names belong to the default namespace (if they are inside the scope)
However, an unqualified attribute does not belong to the default namespace
An unqualified attribute is processed according to the rules that apply to its element name
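The asymmetry between elements and attributes shows up directly in the names a namespace-aware parser reports. A minimal sketch with xml.etree.ElementTree follows; the URIs, the element name, and the attribute names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical names: one unqualified attribute (lang), one qualified (isbn:number).
doc = """<book xmlns="http://example.org/books"
      xmlns:isbn="http://example.org/isbn"
      lang="en" isbn:number="1-58113-332-4"/>"""

root = ET.fromstring(doc)

# The unqualified element name picks up the default namespace.
element_name = root.tag

# The unqualified attribute does not; only the prefixed attribute is expanded.
attribute_names = set(root.attrib)

print(element_name)
print(sorted(attribute_names))
```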
30
Namespaces and DTDs: The Problem
DTD syntax does not support namespaces
The previous example showed an XML document with two DTDs that were used as namespaces
–It is impossible to declare constraints that specify where fragments from each namespace can occur
31
Namespaces and DTDs: The Solutions
Use a namespace-aware schema language, or
Modify one of the two DTDs so that it will be a DTD for the new document
–Two alternatives, as illustrated on the next two slides, using the previous example
32
One Alternative
Add the required new elements to the DTD
Give the appropriate unique names to these elements using parameter entities
33
The Second Alternative
Add the required new elements to the DTD, using qualified names
Use the attribute-list declaration for the new elements to declare the namespace as a fixed value
34
Data Exchange and Data Representation in XML
35
Exchanging Relational Data
Each tuple can be wrapped inside an element
See example on the following slides
36
Two Ways of Wrapping Relations in XML Documents
projects: title, budget, managedBy
employees: name, ssn, age
37
The Project and Employee Relations in XML
Projects and employees are intermixed
Pattern recognition 10000 Joe
Joe 344556 34
Sandra 2234 35
Auto guided vehicle 70000 Sandra
:
38
Employees follow projects
Projects
Pattern recognition 10000 Joe
Auto guided vehicles 70000 Sandra
:
Employees
Joe 344556 34
Sandra 2234 35
:
39
Or without “separator” tags …
Pattern recognition 10000 Joe
Auto guided vehicles 70000 Sandra
:
Joe 344556 34
Sandra 2234 35
:
Can be done if it is clear where each employee and each project starts
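Wrapping each tuple in an element can be sketched as follows, for the layout in which employees follow projects. The column names and data values come from the slides; the element names db, projects, project, employees, and employee are assumptions, since the original tags were lost.

```python
import xml.etree.ElementTree as ET

PROJECT_FIELDS = ("title", "budget", "managedBy")
EMPLOYEE_FIELDS = ("name", "ssn", "age")

projects = [("Pattern recognition", 10000, "Joe"),
            ("Auto guided vehicles", 70000, "Sandra")]
employees = [("Joe", 344556, 34), ("Sandra", 2234, 35)]

def wrap(parent, tag, fields, row):
    """Wrap one tuple in an element with one subelement per column."""
    element = ET.SubElement(parent, tag)
    for field, value in zip(fields, row):
        ET.SubElement(element, field).text = str(value)
    return element

db = ET.Element("db")
projects_el = ET.SubElement(db, "projects")     # "separator" element
employees_el = ET.SubElement(db, "employees")   # "separator" element

for row in projects:
    wrap(projects_el, "project", PROJECT_FIELDS, row)
for row in employees:
    wrap(employees_el, "employee", EMPLOYEE_FIELDS, row)

xml_text = ET.tostring(db, encoding="unicode")
print(xml_text)
```

Dropping the two separator elements and attaching every tuple directly to db gives the variant without separator tags; appending tuples in arbitrary order gives the intermixed one.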
40
DTDs for the First Two Documents
<!DOCTYPE db [... ]>
<!DOCTYPE db [... ]>
41
Wrapping Relations is not a Good Design Strategy
When designing XML documents from ER diagrams,
–ER entities are described by XML elements
–ER attributes can be described either by XML attributes or by subelements
–How to represent ER relationships?
By using the built-in relationship in XML between elements and subelements
But it is not always possible, so ID references might have to be used
42
How to use XML Attributes
XML attributes describe properties of the contents, rather than the contents
cheese fromage branza A food made …
43
Attributes (cont’d)
Another common use for attributes is to express dimensions or types
2400 96 M05-.+C$@02!G96YE<FEC...
44
Using Attributes
Jeff Cohen 04-828-1345 054-470-778 jeffc@cs.technion.ac.il
Irma Levy 03-426-1142 irmal@yourmail.com
45
It is not Always Clear When to Use Attributes
L. Simpson lisa@cs.huji.ac.il...
123 4589 L. Simpson lisa@cs.huji.ac.il...
46
Using IDs
Jeff Cohen 04-828-1345 054-470-778 jeffc@cs.technion.ac.il
Irma Levy 03-426-1142 irmal@yourmail.com
ID attributes
47
How to Represent Relationships
Two related ER entities, e.g., employees and departments, can be represented as follows
A department is an element, and the employees are subelements of the department
The relationship must be many-to-one or one-to-one
–Subelements are the “many”
48
No Multiple Copies of the Same Element (to Avoid Redundancies)
Cannot represent in this way
–A many-to-many relationship
–A relationship with more than two entities
–A binary relationship between an entity and itself or between two entities that are related by an ISA relationship
ID references must be used in the above cases
49
More Problematic Cases
If there are several many-to-one relationships between two ER entities, then only one can be represented as an element-subelement relationship
For example, employees can be subelements of their department
But the relationship between a department and its manager (who is one of the employees) must be represented by an IDREF
50
Missing Information is another Problem
If there could be an employee without a department, then employees cannot be represented as subelements of departments
–IDREFS have to be used
51
Inverse Relationships
XML does not have built-in inverse relationships
Must use IDREF to represent inverse relationships
For example, add an IDREF attribute to each employee element for denoting the department of the employee
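A sketch of how an application might follow ID references, combining the cases above: a manager reference, an inverse reference from employee to department, and an employee without a department. All element names, attribute names, and identifiers are assumptions. Because xml.etree.ElementTree does not read attribute types from a DTD, the code treats the attribute named id as the ID by convention.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: id plays the role of an ID attribute,
# manager and dept play the role of IDREF attributes.
doc = """<company>
  <department id="d1" manager="e2">
    <name>Research</name>
    <employee id="e1" dept="d1"><name>Joe</name></employee>
    <employee id="e2" dept="d1"><name>Sandra</name></employee>
  </department>
  <employee id="e3"><name>Irma</name></employee>
</company>"""

root = ET.fromstring(doc)

# Build the ID index by hand, one entry per element that carries an id.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

# Second many-to-one relationship: the department's manager, via a reference.
department = by_id["d1"]
manager_name = by_id[department.get("manager")].findtext("name")

# Inverse relationship: from an employee back to the department.
department_of_joe = by_id[by_id["e1"].get("dept")].findtext("name")

# Missing information: an employee with no department at all.
without_department = [e.findtext("name")
                      for e in root.iter("employee")
                      if e.get("dept") is None]

print(manager_name, department_of_joe, without_department)
```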
52
XML Schemas
W3Schools on XML Schemas
53
XML Schemas
W3C XML Schema Language, also known as the language for XML Schema Definition (XSD)
There are other proposals for XML Schemas
54
XSDs have Types
XSDs use complex types that generalize the content model of DTDs (i.e., the regular expressions for describing elements)
Many simple types, e.g., xs:string, xs:integer
–Generalize PCDATA and CDATA
Many facets of simple types, e.g., length, maxInclusive, maxExclusive
55
xs:sequence and xs:all
Can specify that subelements should appear in a specific order (i.e., sequence) or in any order (i.e., all)
–But xs:all is not as general as xs:sequence
Can restrict the number of occurrences of subelements, e.g., a department can have between 10 and 100 employees
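A sketch of how the occurrence bounds might be written in an XSD, using xs:sequence with minOccurs and maxOccurs. The element names and types are assumptions. Python's standard library does not validate against XML Schema, so the code only parses the schema as ordinary XML and reads the bounds back.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: a department with a name and 10 to 100 employees.
schema = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="department">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="employee" type="xs:string"
                    minOccurs="10" maxOccurs="100"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

root = ET.fromstring(schema)

# An XSD is itself an XML document in the XML Schema namespace.
employee = root.find(".//{%s}element[@name='employee']" % XS)
bounds = (int(employee.get("minOccurs")), int(employee.get("maxOccurs")))

# Children of xs:sequence must appear in the listed order.
sequence = root.find(".//{%s}sequence" % XS)
order = [child.get("name") for child in sequence]

print(bounds, order)
```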
56
References
References are to specific elements or attributes, e.g., a reference to “person”, where “person” is the name of an element
57
More Features
Mixed content can be defined more generally, compared to DTDs
Local and global definitions of elements and types
Derived types by restriction or extension
58
XSDs and Namespaces
XSDs recognize namespaces
Easier (than with DTDs) to check validity of a document with respect to multiple schemas
–A very important feature when collecting information from multiple heterogeneous sources
–XSDs are more extensible than DTDs
59
Summary of XML
XML is a new data format, and its main virtues are:
–widespread acceptance
–the (important) ability to handle semistructured data (data without schema)
DTDs provide some useful syntactic constraints on documents, but as schemas they are weak
How to store large XML documents?
How to query them?
How to map between XML and other representations?