XML: Extensible Markup Language FST-UMAC Gong Zhiguo.


Slide 2: How the Web is Today
HTML documents
– all intended for human consumption
– many generated automatically by applications
Easy to fetch any Web page, from any server, any platform

Slide 3: Limits of the Web Today
Applications cannot consume HTML
HTML wrapper technology is brittle
– screen scraping
OO technology (CORBA) requires a controlled environment
Companies merge, form partnerships; need interoperability fast

Slide 4: Paradigm Shift on the Web
New Web standard XML:
– XML generated by applications
– XML consumed by applications
Data exchange
– across platforms: enterprise interoperability
– across enterprises
Web: from a collection of documents to data and documents

Slide 5: XML
a W3C standard (2/98) to complement HTML
origins: structured text SGML
motivation:
– HTML describes presentation
– XML describes content

Slide 6: From HTML to XML
HTML describes the presentation

Slide 7: HTML
Bibliography
Foundations of Databases. Abiteboul, Hull, Vianu. Addison Wesley, 1995
Data on the Web. Abiteboul, Buneman, Suciu. Morgan Kaufmann, 1999

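Slide 8: XML
<bibliography>
  <book>
    <title> Foundations… </title>
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    <publisher> Addison Wesley </publisher>
    <year> 1995 </year>
  </book>
  …
</bibliography>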

Slide 9: XML Terminology
tags: book, title, author, …
start tag: <book>, end tag: </book>
elements: <book>…</book>, <author>…</author>
elements are nested
empty element: <tag></tag>, abbreviated <tag/>
an XML document: single root element
well-formed XML document: if it has matching tags
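The terminology above can be tried out on a small document. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the sample document, the `note` element and the `depth` helper are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

# A small bibliography document (hypothetical sample data).
DOC = """<bibliography>
  <book>
    <title>Foundations of Databases</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <note/>
  </book>
</bibliography>"""

root = ET.fromstring(DOC)      # the single root element
book = root.find("book")       # an element nested inside the root
authors = [a.text for a in book.findall("author")]
note = book.find("note")       # an empty element: no text, no children


def depth(elem):
    """Nesting depth of an element tree (a lone element has depth 1)."""
    return 1 + max((depth(child) for child in elem), default=0)
```

Here `root.tag` names the root element and `depth(root)` counts how deeply elements are nested.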

Slide 10: More XML: Attributes
Foundations of Databases Abiteboul … 1995
attributes are alternative ways to represent data
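To illustrate that attributes and child elements are alternative encodings of the same data, the sketch below stores a year both ways and reads it back with one function. Both sample documents and the `year_of` helper are hypothetical; the slide's own markup did not survive in this transcript.

```python
import xml.etree.ElementTree as ET

# The same fact encoded two ways (hypothetical sample documents).
AS_ATTRIBUTE = '<book year="1995"><title>Foundations of Databases</title></book>'
AS_ELEMENT = '<book><title>Foundations of Databases</title><year>1995</year></book>'


def year_of(book_xml):
    """Return the year whether it is an attribute or a child element."""
    book = ET.fromstring(book_xml)
    if "year" in book.attrib:
        return book.attrib["year"]
    return book.findtext("year")  # None if neither form is present
```

Applications that consume such data usually have to agree on one encoding up front; this helper only shows that both carry the same information.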

Slide 11: Query Languages: Motivation
granularity of the HTML Web: one file
granularity of Web data varies:
– single data item: “get John’s salary”
– entire database: “get all salaries”
– aggregates: “get average salary”
need query language to define granularity
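The three granularities can be sketched against one small document. The `staff` document, its element names and the salary figures are made-up sample data; the point is only that each request touches a different amount of the same source.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
STAFF = ET.fromstring("""<staff>
  <employee><name>John</name><salary>50000</salary></employee>
  <employee><name>Mary</name><salary>70000</salary></employee>
</staff>""")


def salary_of(name):
    """Single data item: one employee's salary."""
    for emp in STAFF.findall("employee"):
        if emp.findtext("name") == name:
            return int(emp.findtext("salary"))
    return None


def all_salaries():
    """Entire database: every salary in the document."""
    return [int(s.text) for s in STAFF.iter("salary")]


def average_salary():
    """Aggregate: a value computed over all salaries."""
    salaries = all_salaries()
    return sum(salaries) / len(salaries)
```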

Slide 12: XML-QL: A Query Language for XML
(8/98)
features:
– regular path expressions
– patterns, templates
– Skolem functions
based on the OEM data model

Slide 13: Pattern Matching in XML-QL
where … Morgan Kaufmann … $a … in “…” construct … $a …
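The markup of the XML-QL pattern is not preserved here, so the sketch below emulates the apparent intent in Python rather than in XML-QL: bind `$a` to the authors of books published by Morgan Kaufmann, then construct a result from the bindings. The element names (`bib`, `book`, `publisher`, `name`, `author`, `result`) and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography; the book data comes from the HTML slide.
BIB = ET.fromstring("""<bib>
  <book>
    <publisher><name>Morgan Kaufmann</name></publisher>
    <title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author>
  </book>
  <book>
    <publisher><name>Addison Wesley</name></publisher>
    <title>Foundations of Databases</title>
    <author>Abiteboul</author><author>Hull</author><author>Vianu</author>
  </book>
</bib>""")


def authors_published_by(bib, publisher_name):
    """'where' part: match books by publisher; 'construct' part: emit authors."""
    result = ET.Element("result")
    for book in bib.findall("book"):
        if book.findtext("publisher/name") == publisher_name:
            for author in book.findall("author"):
                ET.SubElement(result, "author").text = author.text
    return result
```

Calling `authors_published_by(BIB, "Morgan Kaufmann")` returns a `result` element with one `author` child per binding.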

Slide 14: Simple Constructors in XML-QL
where … $a … in “…” construct … $a … $l …
Note: </> abbreviates any end tag
Result:
Smith English
Smith Mandarin
Doe English

Slide 15: Schemas in XML
Document Type Definition (DTD)
XML Schema
RDF Schema

Slide 16: Document Type Definition: DTD
part of the original XML specification
an XML document may have a DTD
terminology for XML:
– well-formed: if tags are correctly closed
– valid: if it has a DTD and conforms to it
validation is useful in data exchange
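Well-formedness can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which is a non-validating parser: it reports unmatched tags but does not check a document against its DTD, so testing validity would need a validating parser instead. The helper name `is_well_formed` is an assumption.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if the text parses as XML: matching tags and a single root."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False
```

For example, `is_well_formed("<paper><title>XML</paper>")` is false because the `title` start tag has no matching end tag.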

Slide 17: DTDs as Grammars
<!DOCTYPE paper [ ]>
…

Slide 18: DTDs as Schemas
Not so well suited:
– impose unwanted constraints on order
– references cannot be constrained
– can be too vague:

Slide 19: XML Storage
text file (XML)
store in ternary relation
use DTD to derive schema
mine data to derive schema
build special-purpose repository (Lore)

Slide 20: XML Storage: Text File
advantages
– simple
– less space than one thinks
– reasonable clustering
disadvantages
– no updates
– requires a special-purpose query processor

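Slide 21: Store XML in Ternary Relation
[Florescu, Kossmann 1999]
[Figure: an XML graph with object identifiers &o1 to &o6, edge labels paper, title, author, year, and leaf values “…” and “1986”, stored in two relations, Ref and Val]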
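A rough sketch of the ternary-relation idea, using Python's `sqlite3` and `xml.etree.ElementTree`: every parent-child edge becomes a row of `Ref`, and every leaf value a row of `Val`. Only the relation names `Ref` and `Val` and the value “1986” come from the slide; the column names, the integer node ids, the sample paper and the `shred` helper are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import count


def shred(xml_text, conn):
    """Store an XML tree as Ref(source, label, dest) and Val(node, value)."""
    conn.execute("CREATE TABLE Ref (source INTEGER, label TEXT, dest INTEGER)")
    conn.execute("CREATE TABLE Val (node INTEGER, value TEXT)")
    ids = count(1)

    def visit(elem, parent_id):
        node_id = next(ids)
        conn.execute("INSERT INTO Ref VALUES (?, ?, ?)",
                     (parent_id, elem.tag, node_id))
        if len(elem) == 0 and elem.text and elem.text.strip():
            conn.execute("INSERT INTO Val VALUES (?, ?)",
                         (node_id, elem.text.strip()))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), 0)  # id 0 stands for the document itself


conn = sqlite3.connect(":memory:")
shred("<paper><title>XML Storage</title><author>Doe</author>"
      "<year>1986</year></paper>", conn)

# A path query (paper/year) becomes a self-join of Ref, then a join with Val.
years = conn.execute(
    "SELECT v.value FROM Ref p "
    "JOIN Ref y ON y.source = p.dest "
    "JOIN Val v ON v.node = y.dest "
    "WHERE p.label = 'paper' AND y.label = 'year'"
).fetchall()
```

The schema is fixed no matter what the documents look like, at the price of one join per step of a path expression.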

Slide 22: Use DTD to Derive Schema
[Christophides et al. 1994, Shanmugasundaram et al. 1999]
DTD: …
ODMG classes:
class Employee public type tuple (name:string, address:Address, project:List(Project))
class Address public type tuple (street:string, …)

Slide 23: Mine Data to Derive Schema
[Deutsch et al. 1999]
[Figure: element tree over paper, author, title, year, fn, ln; derived relations Paper1 and Paper2]

Slide 24: XML and Databases (1)
“Is XML a database?”
In a strict sense, no.
In a more liberal sense, yes, but …
– XML has:
  Storage (the XML document)
  A schema (DTD)
  Query languages (XQL, XML-QL, …)
  Programming interfaces (SAX, DOM)
– XML lacks:
  Efficient storage, indexes, security, transactions, multi-user access, triggers, queries across multiple documents
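The two programming interfaces named above differ in style: SAX delivers parse events to callbacks as the document streams by, while DOM builds the whole tree in memory for navigation. A minimal sketch with Python's standard `xml.sax` and `xml.dom.minidom`; the sample document and the `TitleHandler` class are assumptions.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = ("<bib><book><title>Data on the Web</title></book>"
       "<book><title>Foundations of Databases</title></book></bib>")


class TitleHandler(xml.sax.ContentHandler):
    """SAX: collect title text from start/characters/end events."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self.titles.append("")

    def characters(self, content):
        if self._in_title:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "title":
            self._in_title = False


handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_titles = handler.titles

# DOM: parse everything first, then navigate the in-memory tree.
dom = minidom.parseString(DOC)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]
```

Both interfaces yield the same titles; SAX keeps memory use flat on large documents, while DOM is more convenient for random access.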

Slide 25: XML and Databases (2)
Data versus Documents
– There are two ways to use XML in a database environment:
  Use XML as a data transport, i.e., to get data in and out of the database
  – Data is stored in a relational or object-oriented database
  – Middleware converts between the database and XML
  Use a “native XML” database, i.e., store data in document form
  – Use a content management system

Slide 26: XML and Databases (3)
Data-centric documents
– Fairly regular structure
– Fine-grained data
– Little or no mixed content
– Order of sibling elements often not significant
Document-centric documents
– Irregular structure
– Larger-grained data
– Lots of mixed content
– Order of sibling elements is significant

Slide 27: XML and Databases (4)
Data-centric storage and retrieval systems
– Use a database
  Add middleware to convert to/from XML
– Use an XML server (specialized product for e-commerce)
– Use an XML-enabled web server with a database backend
Document-centric storage and retrieval systems
– Content management system
– Persistent DOM implementation

Slide 28: XML and Databases (5)
Mapping document structure to database structure
– Template-driven
  No predefined mapping
  Embedded commands process (retrieve) data
  Currently only available from RDBMS to XML
Example template:
  The following flights have available seats:
  SELECT Airline, FltNumber, Depart, Arrive FROM Flights
  We hope one of these meets your needs

Slide 29: XML and Databases (6)
– Template-driven, example result:
  The following flights have available seats:
  ACME 123 Dec 12, 2000, 13:43 Dec 13, 2000, 01:21
  We hope one of these meets your needs
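The template-driven approach can be sketched as a small middleware step: find the embedded command in the template, run it against the database, and splice the rows back in as XML. The element names (`FlightInfo`, `Intro`, `SelectStmt`, `Conclude`, `Rows`, `Row`) are assumptions, since the slide's markup is not preserved; the SELECT statement and the sample flight come from the slides. `sqlite3` stands in for the RDBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical template; only the text and the query are from the slides.
TEMPLATE = """<FlightInfo>
  <Intro>The following flights have available seats:</Intro>
  <SelectStmt>SELECT Airline, FltNumber, Depart, Arrive FROM Flights</SelectStmt>
  <Conclude>We hope one of these meets your needs</Conclude>
</FlightInfo>"""


def expand(template, conn):
    """Replace each SelectStmt element with the rows its query returns."""
    root = ET.fromstring(template)
    for index, child in enumerate(list(root)):
        if child.tag != "SelectStmt":
            continue
        cursor = conn.execute(child.text)
        columns = [col[0] for col in cursor.description]
        rows = ET.Element("Rows")
        for record in cursor:
            row = ET.SubElement(rows, "Row")
            for column, value in zip(columns, record):
                ET.SubElement(row, column).text = str(value)
        root[index] = rows  # swap the embedded command for its result
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Flights "
             "(Airline TEXT, FltNumber INTEGER, Depart TEXT, Arrive TEXT)")
conn.execute("INSERT INTO Flights VALUES "
             "('ACME', 123, 'Dec 12, 2000, 13:43', 'Dec 13, 2000, 01:21')")
result = expand(TEMPLATE, conn)
```

No mapping between document and table structure is declared anywhere; the shape of the output follows from the template and the query's column names.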

Slide 30: XML and Databases (7)
Mapping document structure to database structure
– Model-driven
  A data model is imposed on the structure of the XML document
  This model is mapped to the structures in the database
  There are two common models:
  – Model the XML document as a single table or a set of tables
  – Model the XML document as a tree of data-specific objects (good for OODBMS mapping)

Slide 31: XML and Databases (8)
– Single table or set of tables:
– Tree organization:

           Orders
             |
         SalesOrder
        /    |    \
 Customer  Item   Item
             |      |
           Part   Part
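A sketch of the table model applied to the tree above: each `SalesOrder` becomes a row in one table and each `Item` a row in a second table that points back to its order. The table layout, the `Number` attribute and the sample values are assumptions; only the element names come from the slide.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document following the Orders tree.
ORDERS = """<Orders>
  <SalesOrder Number="1">
    <Customer>ACME</Customer>
    <Item><Part>Bolt</Part></Item>
    <Item><Part>Nut</Part></Item>
  </SalesOrder>
</Orders>"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesOrder "
             "(Number INTEGER PRIMARY KEY, Customer TEXT)")
conn.execute("CREATE TABLE Item "
             "(OrderNumber INTEGER REFERENCES SalesOrder(Number), Part TEXT)")

for order in ET.fromstring(ORDERS).findall("SalesOrder"):
    number = int(order.get("Number"))
    conn.execute("INSERT INTO SalesOrder VALUES (?, ?)",
                 (number, order.findtext("Customer")))
    # Repeated child elements map to rows of a separate table.
    for item in order.findall("Item"):
        conn.execute("INSERT INTO Item VALUES (?, ?)",
                     (number, item.findtext("Part")))

parts = [row[0] for row in conn.execute(
    "SELECT Part FROM Item WHERE OrderNumber = 1 ORDER BY rowid")]
```

Nesting in the document turns into a foreign key in the database, which is why this model suits data-centric documents with regular structure.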

Slide 32: XML and Databases (9)
Generating DTDs from a database schema and vice versa
– For many applications the DTD changes rarely and does not need to be generated automatically.
– Some simple conversions are possible
Example: DTD from relational schema:
1. For each table, create an ELEMENT.
2. For each column in a table, create an attribute or a PCDATA-only child ELEMENT.
3. For each primary key/foreign key relationship in which a column of the table contributes the primary key, create a child ELEMENT.
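The three rules can be sketched as a small generator. The function name `dtd_from_schema`, its input format and the sample schema are assumptions; the sketch always picks the PCDATA-only child ELEMENT option of rule 2, marks child tables as repeatable with `*`, and assumes every table has at least one column and that column names are unique across tables.

```python
def dtd_from_schema(tables, foreign_keys):
    """Build DTD declarations from a relational schema description.

    tables: {table_name: [column_name, ...]}
    foreign_keys: [(child_table, parent_table), ...], meaning the child
    table has a foreign key referencing the parent's primary key.
    """
    lines = []
    for table, columns in tables.items():
        # Rule 3: tables referencing this table's key become child elements.
        children = [child for child, parent in foreign_keys if parent == table]
        content = columns + [child + "*" for child in children]
        # Rule 1: one ELEMENT per table.
        lines.append("<!ELEMENT %s (%s)>" % (table, ", ".join(content)))
        # Rule 2: one PCDATA-only child ELEMENT per column.
        for column in columns:
            lines.append("<!ELEMENT %s (#PCDATA)>" % column)
    return "\n".join(lines)


DTD = dtd_from_schema(
    {"SalesOrder": ["Number", "Customer"], "Item": ["Part"]},
    [("Item", "SalesOrder")],
)
```

For this schema the first generated line is `<!ELEMENT SalesOrder (Number, Customer, Item*)>`, followed by the declarations for the columns and for `Item`.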

Slide 33: XML and Databases (10)
Document-centric storage and retrieval systems
– Content management system
  Allows the storage of discrete content fragments, such as examples, procedures, chapters, as well as metadata such as author names, revision dates, etc.
  Many content management systems are built on top of relational or object-oriented database systems.
  Examples:
  – BladeRunner (Interleaf), SigmaLink (STEP), Parlance Content Manager (XyEnterprise), Target 2000 (Progressive Information Technology)
– Persistent DOM implementation

Slide 34: Further Readings
www.w3.org/XML
www-db.stanford.edu/~widom
www-rocq.inria.fr/~abiteboul
db.cis.upenn.edu
Abiteboul, Buneman, Suciu. Data on the Web: From Relational to Semistructured to XML. Morgan Kaufmann, 1999 (appears in October)