Published by Lewis Harrison. Modified over 9 years ago.
1
XML: Extensible Markup Language
FST-UMAC
Gong Zhiguo
2
How the Web is Today
HTML documents
– all intended for human consumption
– many generated automatically by applications
Easy to fetch any Web page, from any server, on any platform
3
Limits of the Web Today
Applications cannot consume HTML
HTML wrapper technology is brittle
– screen scraping
OO technology (CORBA) requires a controlled environment
Companies merge and form partnerships; they need interoperability fast
4
Paradigm Shift on the Web
New Web standard XML:
– XML generated by applications
– XML consumed by applications
Data exchange
– across platforms: enterprise interoperability
– across enterprises
Web: from a collection of documents to data and documents
5
XML
A W3C standard to complement HTML
Origins: structured text (SGML)
Motivation:
– HTML describes presentation
– XML describes content
http://www.w3.org/TR/REC-xml (2/98)
6
From HTML to XML
HTML describes the presentation
7
HTML
Bibliography
Foundations of Databases
Abiteboul, Hull, Vianu
Addison Wesley, 1995
Data on the Web
Abiteboul, Buneman, Suciu
Morgan Kaufmann, 1999
8
XML
Foundations…
Abiteboul
Hull
Vianu
Addison Wesley
1995
…
9
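XML Terminology
tags: book, title, author, …
start tag: <book>, end tag: </book>
elements: <book>…</book>, <author>…</author>
elements are nested
empty element: a start tag immediately followed by its end tag, abbreviated as a single tag ending in />
an XML document: a single root element
well-formed XML document: one whose tags match and nest properly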
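The terminology above can be tried out with Python's standard-library `xml.etree.ElementTree`. This is only a sketch: the tag names `book`, `title`, and `author` come from the slide, while the `bibliography` root and the empty `note` element are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a single root element, nested elements,
# and one empty element written in its abbreviated form.
doc = """
<bibliography>
  <book>
    <title>Foundations of Databases</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <note/>
  </book>
</bibliography>
"""

root = ET.fromstring(doc)      # parsing succeeds only for well-formed text
book = root.find("book")       # elements are nested: book is a child of the root
authors = [a.text for a in book.findall("author")]
note = book.find("note")       # <note/> abbreviates <note></note>

print(root.tag, authors, note.text)
```

The empty element parses to a node with no text and no children, which is why `note.text` is `None` here.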
10
More XML: Attributes
Foundations of Databases
Abiteboul
…
1995
Attributes are an alternative way to represent data
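To illustrate that attributes and child elements are interchangeable ways to carry the same data, here is a small sketch. Which fields the slide stored as attributes is not recoverable from the transcript, so placing `year` in an attribute is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the same fact stored once as an attribute and once as a child element.
as_attribute = ET.fromstring(
    '<book year="1995"><title>Foundations of Databases</title></book>'
)
as_element = ET.fromstring(
    "<book><title>Foundations of Databases</title><year>1995</year></book>"
)

year_from_attribute = as_attribute.get("year")     # read an attribute value
year_from_element = as_element.findtext("year")    # read a child element's text

print(year_from_attribute, year_from_element)
```

One practical difference worth keeping in mind: an attribute holds a single string, whereas a child element can repeat and can contain further structure.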
11
Query Languages: Motivation
Granularity of the HTML Web: one file
Granularity of Web data varies:
– single data item: “get John’s salary”
– entire database: “get all salaries”
– aggregates: “get average salary”
Need a query language to define granularity
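The three granularities can be sketched over a small XML document. The `staff`, `employee`, `name`, and `salary` element names and the salary figures are invented for illustration; Python's `ElementTree` stands in for a real XML query language.

```python
import xml.etree.ElementTree as ET

# Hypothetical data set; element names and values are assumptions.
doc = """<staff>
  <employee><name>John</name><salary>50000</salary></employee>
  <employee><name>Mary</name><salary>70000</salary></employee>
</staff>"""
root = ET.fromstring(doc)

# Single data item: "get John's salary"
john_salary = next(
    int(e.findtext("salary"))
    for e in root.findall("employee")
    if e.findtext("name") == "John"
)

# Entire database: "get all salaries"
all_salaries = [int(s.text) for s in root.iter("salary")]

# Aggregate: "get average salary"
average_salary = sum(all_salaries) / len(all_salaries)

print(john_salary, all_salaries, average_salary)
```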
12
XML-QL: A Query Language for XML
http://www.w3.org/TR/NOTE-xml-ql (8/98)
Features:
– regular path expressions
– patterns, templates
– Skolem functions
Based on the OEM data model
13
Pattern Matching in XML-QL
where Morgan Kaufmann $a in “www.a.b.c/bib.xml”
construct $a
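The element tags of the pattern did not survive in the transcript. Assuming the query binds `$a` to each author of a book whose publisher is Morgan Kaufmann, the matching can be emulated in Python as follows. The `bib`, `book`, `publisher`, and `name` element names and the helper `authors_published_by` are hypothetical, and this is an emulation with `ElementTree`, not XML-QL itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the file at "www.a.b.c/bib.xml".
bib = ET.fromstring("""<bib>
  <book>
    <publisher><name>Morgan Kaufmann</name></publisher>
    <title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author>
  </book>
  <book>
    <publisher><name>Addison Wesley</name></publisher>
    <title>Foundations of Databases</title>
    <author>Abiteboul</author><author>Hull</author><author>Vianu</author>
  </book>
</bib>""")

def authors_published_by(root, publisher):
    """Return the text of every author of a book whose publisher name matches."""
    return [
        a.text
        for book in root.findall("book")
        if book.findtext("publisher/name") == publisher
        for a in book.findall("author")
    ]

matches = authors_published_by(bib, "Morgan Kaufmann")
print(matches)
```

Each element of `matches` corresponds to one binding of `$a`; the `construct` clause would then emit one result per binding.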
14
Simple Constructors in XML-QL
Note: </> abbreviates the matching end tag
where $a in “www.a.b.c/bib.xml”
construct $a $l
Result:
Smith English
Smith Mandarin
Doe English
15
Schemas in XML
Document Type Definition (DTD)
XML Schema
RDF Schema
16
Document Type Definition: DTD
Part of the original XML specification
An XML document may have a DTD
Terminology for XML:
– well-formed: if tags are correctly closed
– valid: if it has a DTD and conforms to it
Validation is useful in data exchange
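Well-formedness can be checked with any XML parser. The sketch below uses Python's standard-library `ElementTree`; the helper name `is_well_formed` is hypothetical. It covers only the well-formed case: `ElementTree` does not validate against a DTD, so checking validity would need a validating parser.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<paper><title>XML Storage</title></paper>"
bad = "<paper><title>XML Storage</paper>"   # <title> is never closed

print(is_well_formed(good), is_well_formed(bad))
```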
17
DTDs as Grammars
<!DOCTYPE paper [ ]>
…
18
DTDs as Schemas
Not so well suited:
– impose unwanted constraints on order
– references cannot be constrained
– can be too vague:
19
XML Storage
Text file (XML)
Store in ternary relation
Use DTD to derive schema
Mine data to derive schema
Build special-purpose repository (Lore)
20
XML Storage: Text File
Advantages
– simple
– less space than one thinks
– reasonable clustering
Disadvantages
– no updates
– requires a special-purpose query processor
21
Store XML in Ternary Relation [Florescu, Kossmann 1999]
[Figure: object graph with nodes &o1 to &o6, edge labels paper, title, author, year, and leaf values “…” and “1986”; stored in two tables, Ref and Val]
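A rough sketch of the idea: every parent-child edge becomes a row (source, label, target) in `Ref`, and every leaf value becomes a row in `Val`. The table names come from the slide; the exact column layout, the `&root` pseudo-node, the object-id numbering, and the helper `shred` are assumptions made for this illustration rather than the scheme from the cited paper.

```python
import itertools
import xml.etree.ElementTree as ET

def shred(root):
    """Return (ref, val): ref rows are (source, label, target), val rows are (node, value)."""
    counter = itertools.count(1)
    ref, val = [], []

    def visit(elem, parent_oid):
        oid = f"&o{next(counter)}"               # assign a fresh object id
        ref.append((parent_oid, elem.tag, oid))  # one edge per element
        text = (elem.text or "").strip()
        if len(elem) == 0 and text:              # leaf element: record its value
            val.append((oid, text))
        for child in elem:
            visit(child, oid)

    visit(root, "&root")                         # hypothetical id for the document node
    return ref, val

paper = ET.fromstring(
    "<paper><title>T</title><author>A</author><year>1986</year></paper>"
)
ref, val = shred(paper)
for row in ref + val:
    print(row)
```

Both lists map directly onto relational tables, so path queries turn into self-joins on `Ref`.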
22
Use DTD to derive Schema
[Christophides et al. 1994, Shanmugasundaram et al. 1999]
DTD:
ODMG classes:
class Employee public type tuple (name:string, address:Address, project:List(Project))
class Address public type tuple (street:string, …)
23
Mine Data to Derive Schema [Deutsch et al. 1999]
[Figure: paper with author, title, year; author with fn, ln; instances Paper1, Paper2]
24
XML and Databases (1)
“Is XML a database?”
In a strict sense, no.
In a more liberal sense, yes, but …
– XML has:
  Storage (the XML document)
  A schema (DTD)
  Query languages (XQL, XML-QL, …)
  Programming interfaces (SAX, DOM)
– XML lacks:
  Efficient storage, indexes, security, transactions, multi-user access, triggers, queries across multiple documents
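The two programming interfaces named above differ in style: DOM builds the whole document as an in-memory tree, while SAX delivers a stream of events to a handler. A small sketch using Python's standard-library `xml.dom.minidom` and `xml.sax`; the sample document and the `TitleHandler` class are made up for illustration.

```python
import xml.sax
from xml.dom import minidom

doc = (
    "<bib><book><title>Data on the Web</title></book>"
    "<book><title>Foundations of Databases</title></book></bib>"
)

# DOM: parse everything into a tree, then navigate it.
dom = minidom.parseString(doc)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]

# SAX: react to start/end/character events as the parser reads the input.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        if self._in_title:            # text may arrive in several chunks
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

handler = TitleHandler()
xml.sax.parseString(doc.encode("utf-8"), handler)
sax_titles = handler.titles

print(dom_titles, sax_titles)
```

SAX keeps memory use low for large documents; DOM is more convenient when the program needs random access to the tree.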
25
XML and Databases (2)
Data versus Documents
– There are two ways to use XML in a database environment:
  Use XML as a data transport, i.e., to get data in and out of the database
  – Data is stored in a relational or object-oriented database
  – Middleware converts between the database and XML
  Use a “native XML” database, i.e., store data in document form
  – Use a content management system
26
XML and Databases (3)
Data-centric documents
– Fairly regular structure
– Fine-grained data
– Little or no mixed content
– Order of sibling elements often not significant
Document-centric documents
– Irregular structure
– Larger-grained data
– Lots of mixed content
– Order of sibling elements is significant
27
XML and Databases (4)
Data-centric storage and retrieval systems
– Use a database
  Add middleware to convert to/from XML
– Use an XML server (specialized product for e-commerce)
– Use an XML-enabled web server with a database backend
Document-centric storage and retrieval systems
– Content management system
– Persistent DOM implementation
28
XML and Databases (5)
Mapping document structure to database structure
– Template-driven
  No predefined mapping
  Embedded commands process (retrieve) data
  Currently only available from RDBMS to XML
Example template:
  The following flights have available seats:
  SELECT Airline, FltNumber, Depart, Arrive FROM Flights
  We hope one of these meets your needs
29
XML and Databases (6)
– Template-driven, example result:
  The following flights have available seats:
  ACME 123 Dec 12, 2000, 13:43 Dec 13, 2000, 01:21
  We hope one of these meets your needs
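The template-driven approach can be sketched with Python's `sqlite3`: a query embedded in the template is executed and replaced by its rows rendered as XML. The markup of the original template was lost in the transcript, so the `SelectStmt`, `Intro`, `Conclude`, `Flights`, and `Row` element names, the table definition, and the `expand` helper are all assumptions made for this example, not the syntax of any particular middleware product.

```python
import re
import sqlite3
from xml.sax.saxutils import escape

# Hypothetical template: the SELECT statement is an embedded command.
TEMPLATE = """<FlightInfo>
<Intro>The following flights have available seats:</Intro>
<SelectStmt>SELECT Airline, FltNumber, Depart, Arrive FROM Flights</SelectStmt>
<Conclude>We hope one of these meets your needs</Conclude>
</FlightInfo>"""

def expand(template, conn):
    """Replace each embedded SELECT with its result rows rendered as XML."""
    def run(match):
        cursor = conn.execute(match.group(1))
        columns = [d[0] for d in cursor.description]   # column names become tags
        rows = []
        for row in cursor:
            cells = "".join(
                f"<{c}>{escape(str(v))}</{c}>" for c, v in zip(columns, row)
            )
            rows.append(f"<Row>{cells}</Row>")
        return "<Flights>" + "".join(rows) + "</Flights>"
    return re.sub(r"<SelectStmt>(.*?)</SelectStmt>", run, template, flags=re.S)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Flights (Airline TEXT, FltNumber INTEGER, Depart TEXT, Arrive TEXT)"
)
conn.execute(
    "INSERT INTO Flights VALUES ('ACME', 123, 'Dec 12, 2000, 13:43', 'Dec 13, 2000, 01:21')"
)
result = expand(TEMPLATE, conn)
print(result)
```

The template fixes the document shape and the query supplies the data, which is why no predefined mapping between the database schema and the XML structure is needed.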
30
XML and Databases (7)
Mapping document structure to database structure
– Model-driven
  A data model is imposed on the structure of the XML document
  This model is mapped to the structures in the database
  There are two common models:
  – Model the XML document as a single table or a set of tables
  – Model the XML document as a tree of data-specific objects (good for OODBMS mapping)
31
XML and Databases (8)
– Single table or set of tables: ...
– Tree organization:
            Orders
              |
          SalesOrder
         /    |    \
  Customer   Item   Item
              |      |
             Part   Part
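For the table model, a document shaped like a set of tables can be read directly into rows. The slide's own markup was not preserved, so the `database`, `table`, and `row` element names, the `name` attribute, and the `table_rows` helper below are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical table-shaped document: tables contain rows, rows contain columns.
doc = """<database>
  <table name="Flights">
    <row><Airline>ACME</Airline><FltNumber>123</FltNumber></row>
    <row><Airline>ACME</Airline><FltNumber>456</FltNumber></row>
  </table>
</database>"""

def table_rows(root):
    """Map each table element to a list of rows, each row a dict of column -> text."""
    return {
        table.get("name"): [
            {column.tag: column.text for column in row}
            for row in table.findall("row")
        ]
        for table in root.findall("table")
    }

tables = table_rows(ET.fromstring(doc))
print(tables)
```

A tree-shaped document such as the Orders example would instead map to nested objects, one class per element type.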
32
XML and Databases (9)
Generating DTDs from a database schema and vice versa
– Often the DTD for an application changes rarely and does not need to be generated automatically.
– Some simple conversions are possible
Example: DTD from relational schema:
1. For each table, create an ELEMENT.
2. For each column in a table, create an attribute or a PCDATA-only child ELEMENT.
3. For each primary key/foreign key relationship in which a column of the table contributes the primary key, create a child ELEMENT.
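The three conversion rules can be sketched as a short generator. This version always picks the PCDATA-only child ELEMENT option for columns, and it assumes column names are unique across tables so that no element is declared twice. The function name `dtd_from_schema`, its input format, and the sample schema are hypothetical.

```python
def dtd_from_schema(tables, foreign_keys):
    """Build DTD text.

    tables: {table_name: [column_names]}
    foreign_keys: [(parent_table, child_table)], where the parent table
    contributes the primary key of the relationship.
    """
    lines = []
    for table, columns in tables.items():
        # Rule 3: tables that reference this one become repeatable child elements.
        children = [child for parent, child in foreign_keys if parent == table]
        content = list(columns) + [f"{c}*" for c in children]
        # Rule 1: one ELEMENT per table.
        lines.append(f"<!ELEMENT {table} ({', '.join(content)})>")
        # Rule 2: one PCDATA-only child ELEMENT per column.
        for column in columns:
            lines.append(f"<!ELEMENT {column} (#PCDATA)>")
    return "\n".join(lines)

schema = {"SalesOrder": ["Number", "Date"], "Item": ["Part", "Quantity"]}
dtd = dtd_from_schema(schema, [("SalesOrder", "Item")])
print(dtd)
```

Using attributes instead of child elements for columns would replace the per-column `ELEMENT` declarations with an `ATTLIST` on the table element.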
33
XML and Databases (10)
Document-centric storage and retrieval systems
– Content management system
  Allows the storage of discrete content fragments, such as examples, procedures, and chapters, as well as metadata such as author names, revision dates, etc.
  Many content management systems are built on top of relational or object-oriented database systems.
  Examples:
  – BladeRunner (Interleaf), SigmaLink (STEP), Parlance Content Manager (XyEnterprise), Target 2000 (Progressive Information Technology)
– Persistent DOM implementation
34
Further Readings
www.w3.org/XML
www-db.stanford.edu/~widom
www-rocq.inria.fr/~abiteboul
db.cis.upenn.edu
www.research.att.com/~suciu
Abiteboul, Buneman, Suciu. Data on the Web: From Relational to Semistructured to XML. Morgan Kaufmann, 1999 (appears in October)