Advanced Topics: XML and Databases

Presentation transcript:

1 Advanced Topics: XML and Databases

2 XML
• Overview
• Structure of XML Data
  – XML Document Type Definition (DTD)
  – Namespaces
  – XML Schema
• Query and Transformation
  – XPath
  – XSLT
  – XQuery

3 XML Overview
• eXtensible Markup Language (XML).
• Related markup languages: Hyper-Text Markup Language (HTML) for document presentation and Standard Generalized Markup Language (SGML) for document management.
• XML can handle the structured data typical of a DBMS.
• XML is flexible and can also handle semi-structured data, which does not fit the fixed schema of a relational DBMS.
• XML is the de facto representation for exchanging data between applications on the Web.

4 XML Overview
• Markup language
  – separates content from markup;
  – defines the meaning of the markup;
  – e.g., HTML markup describes how a document is presented;
  – tags enclose content, e.g., a start tag and end tag around the text "Database System Concepts";
  – HTML has a fixed set of tags;
  – XML is extensible: applications can define tags as needed.

5 XML Overview
• Comparison with DBMS
  – The focus is on the EXCHANGE of data between applications.
  – Storage and management of XML is more complex than for a relational DBMS because XML is semi-structured.
  – Tags make an XML message self-documenting, so no catalog is needed to interpret it.
  – The format is not rigid, so an application can ignore fields it does not need.
  – Versatile, since most browsers are XML-enabled and most DBMS vendors support XML data.

6 Structure of XML Data
• XML document: has a single root element, e.g., bank in Figure 10.1.
• Element: bank is the root element; the document also contains customer, account and depositor elements.
• Elements must be properly nested, i.e., each element's start tag and end tag must both lie within the same parent element.
• <account> … <balance> … </balance> </account> is properly nested.
• <account> … <balance> … </account> </balance> is not properly nested.
• Figure 10.2: unstructured data (text) can be combined with semi-structured data. This is one of the strengths of XML for data exchange.
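
The nesting and single-root rules can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the three sample strings are assumptions made for illustration and are not taken from the figures.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# End tags close in the reverse order of their start tags.
nested_ok = "<account><balance>500</balance></account>"
# balance is still open when account is closed.
nested_bad = "<account><balance>500</account></balance>"
# Two top-level elements, so there is no single root.
two_roots = "<account/><customer/>"

for sample in (nested_ok, nested_bad, two_roots):
    print(sample, "->", is_well_formed(sample))
```

Only the first sample is accepted; the parser rejects the other two before any application code sees the data.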

7 Structure of XML Data
• Nested data in XML is similar to the output of a join over multiple tables, or to an unnormalized (nested) relational table.
• Figure 10.3 shows account elements nested within customer elements.
  – Advantage: there is no need to join customer and account.
  – Example: a shipping address is stored with each shipment.
  – Disadvantage: if customer and account are in a many-to-many relationship, the account information is replicated under every owning customer, with the usual costs of redundant data (extra storage and the risk of inconsistent updates).
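
The replication cost of nesting can be seen in a small sketch. The relational rows below (`customers`, `accounts`, `depositor`) are hypothetical sample data, not the contents of Figure 10.3; the code nests each customer's accounts under the customer element, which amounts to materializing the join.

```python
import xml.etree.ElementTree as ET

# Hypothetical relational data: account A-401 is jointly owned.
customers = {"C1": "Joe", "C2": "Mary"}          # customer-id -> name
accounts = {"A-401": 500, "A-402": 900}           # account-number -> balance
depositor = [("C1", "A-401"), ("C2", "A-401"), ("C2", "A-402")]

bank = ET.Element("bank")
for cust_id, name in customers.items():
    cust = ET.SubElement(bank, "customer")
    ET.SubElement(cust, "customer-name").text = name
    # Nest every account the customer owns directly under the customer.
    for owner, acct_no in depositor:
        if owner == cust_id:
            acct = ET.SubElement(cust, "account")
            ET.SubElement(acct, "account-number").text = acct_no
            ET.SubElement(acct, "balance").text = str(accounts[acct_no])

# Account numbers in document order; a shared account appears once per owner.
copies = [a.findtext("account-number") for a in bank.iter("account")]
print(copies)
```

No join is needed to read a customer's accounts, but the jointly owned account is stored twice, so a balance update has to touch both copies.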

8 Structure of XML Data
• Element
• Subelement
• Attribute
  – Figure 10.4
  – An attribute value is a string; an attribute name cannot be repeated within an element, and an attribute cannot have subelements.
  – account is an element; acct-type is an attribute; account-number, branch-name and balance are subelements of account.
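
A short sketch of how a parser exposes these three kinds of item, using Python's standard `xml.etree.ElementTree`. The element and attribute names follow the slide; the values (`checking`, `A-102`, and so on) are made up for illustration.

```python
import xml.etree.ElementTree as ET

doc = """
<account acct-type="checking">
  <account-number>A-102</account-number>
  <branch-name>Perryridge</branch-name>
  <balance>400</balance>
</account>
"""

account = ET.fromstring(doc)
print(account.tag)                        # the element name
print(account.attrib)                     # attributes: name -> string value
subelements = [child.tag for child in account]
print(subelements)                        # subelements in document order

# Repeating an attribute name in one start tag is not well formed.
try:
    ET.fromstring('<account acct-type="checking" acct-type="savings"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```

Attribute values always arrive as strings, so any numeric interpretation is left to the application.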

9 XML Namespace
• A namespace allows organizations to specify globally unique names for element tags.
• Each tag or attribute name is associated with a URI, and the combination of URI and tag (or attribute) name is unique.
• A namespace can be declared in the root element.
• …. …. …
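
A minimal sketch of a namespace declared on the root element and of how a parser resolves prefixed tags. The URI `http://www.example.com/FirstBank`, the prefix `FB` and the element names are hypothetical; they are not recovered from the slide's example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; any globally unique URI would serve.
FB = "http://www.example.com/FirstBank"

doc = f"""
<bank xmlns:FB="{FB}">
  <FB:branch>
    <FB:branchname>Downtown</FB:branchname>
    <FB:branchcity>Brooklyn</FB:branchcity>
  </FB:branch>
</bank>
"""

root = ET.fromstring(doc)

# The parser replaces the prefix with the URI, so the full name is unique.
branch = root.find("FB:branch", {"FB": FB})
print(branch.tag)

# The prefix is only a local abbreviation; a different one finds the same node.
same_branch = root.find("x:branch", {"x": FB})
print(same_branch is branch)

# bank itself carries no prefix and is therefore in no namespace.
print(root.tag)
```

Two organizations can both use a `branch` tag without clashing as long as their URIs differ.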

10 XML DTD
• XML documents do not have to conform to any schema or set of pre-defined tags.
• However, in most cases applications require that the data conform to some pre-defined set of tags.
• XML DTD
  – Specifies the elements allowed in a document and the subelements allowed within each element.
  – Does not specify data types or other constraints.
  – Operators: | (or), + (1 or more), * (0 or more), ? (0 or 1)
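
A DTD content model is a regular expression over the names of an element's children, so the occurrence operators can be illustrated with Python's `re` module. This is only a sketch of the idea, not a DTD validator; both content models and the helper `matches` are assumptions made for the example.

```python
import re

# Hypothetical content model: (account | customer | depositor)+
# Each child name is followed by a space so that names cannot run together.
BANK_MODEL = re.compile(r"(?:(?:account|customer|depositor) )+")

# Hypothetical content model: (name, street?, phone*)
CUSTOMER_MODEL = re.compile(r"name (?:street )?(?:phone )*")

def matches(model: re.Pattern, children: list) -> bool:
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in children)
    return model.fullmatch(sequence) is not None

print(matches(BANK_MODEL, ["customer", "account", "account"]))  # + : one or more
print(matches(BANK_MODEL, []))                                  # + rejects zero
print(matches(CUSTOMER_MODEL, ["name"]))                        # ? and * allow zero
print(matches(CUSTOMER_MODEL, ["name", "street", "street"]))    # ? allows at most one
print(matches(CUSTOMER_MODEL, ["name", "phone", "phone"]))      # * allows many
```

The comma in a content model fixes the order of children, while `|` inside a repeated group allows them in any order.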

11 XML DTD
• Figure 10.6 DTD Example
  – The bank element consists of one or more account, customer or depositor elements, in any order.
  – The account element has subelements account-number, branch-name, balance, etc.
  – Elements such as account-number and branch-name are of type #PCDATA (parsed character data, i.e., text).
  – EMPTY: the element has no contents.
  – ANY: the element can have any subelements.
  – Attributes must have a type declaration and a default declaration (a default value, #REQUIRED or #IMPLIED).
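
The sketch below embeds a small DTD in a document and parses it with Python's standard `xml.etree.ElementTree`. The DTD is a hypothetical one written in the spirit of the description above, not a copy of Figure 10.6, and the `acct-type` default of `checking` is an assumption. The standard-library parser reads the declarations but does not validate against them; a validating parser would be needed to enforce the content models.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE bank [
  <!ELEMENT bank (account | customer | depositor)+>
  <!ELEMENT account (account-number, branch-name, balance)>
  <!ATTLIST account acct-type CDATA "checking">
  <!ELEMENT account-number (#PCDATA)>
  <!ELEMENT branch-name (#PCDATA)>
  <!ELEMENT balance (#PCDATA)>
]>
<bank>
  <account>
    <account-number>A-401</account-number>
    <branch-name>Downtown</branch-name>
    <balance>500</balance>
  </account>
</bank>"""

bank = ET.fromstring(doc)
account = bank.find("account")

# The start tag has no acct-type, so the value comes from the ATTLIST default.
acct_type = account.get("acct-type")
print(acct_type)

# #PCDATA content is delivered as plain text.
balance_text = account.findtext("balance")
print(balance_text)
```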

12 XML DTD
• ID, IDREF and IDREFS: Figure 10.7
• ID
  – An attribute of type ID provides a unique (document-wide) identifier, or key, for its element.
  – An element can have at most one attribute of type ID.
  – <!ATTLIST account account-number ID #REQUIRED>
• An attribute of type IDREF is a reference to an element; its value MUST BE the ID value of some element in the document.
• An attribute of type IDREFS holds a list of ID values separated by spaces.
• ID, IDREF and IDREFS capture the primary key and foreign key functionality of the relational data model.
• Figure 10.8: example of an XML document with ID and IDREFS.
• An IDREF must point to an ID, but there is no type checking, so it can point to the ID of an account, of a customer or of a branch!
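
The lack of type checking on references can be made concrete with a small hand-written check. The document below is a hypothetical fragment in the style described for Figure 10.8; the attribute names `owners`, `accounts` and `customer-id` and the table `EXPECTED` are assumptions for the example. The code first collects every ID value, then verifies each reference twice: once the way a DTD does (the value is some ID) and once the way a typed foreign key would (the value is the ID of the right kind of element).

```python
import xml.etree.ElementTree as ET

doc = """
<bank-2>
  <account account-number="A-401" owners="C100 C102">
    <balance>500</balance>
  </account>
  <customer customer-id="C100" accounts="A-401">
    <customer-name>Joe</customer-name>
  </customer>
  <customer customer-id="C102" accounts="A-401 C100">
    <customer-name>Mary</customer-name>
  </customer>
</bank-2>
"""

ID_ATTR = {"account": "account-number", "customer": "customer-id"}
# element tag -> (IDREFS attribute, kind of element it is meant to reference)
EXPECTED = {"account": ("owners", "customer"), "customer": ("accounts", "account")}

root = ET.fromstring(doc)

# Pass 1: collect ID values, which must be unique across the whole document.
ids = {}
for elem in root.iter():
    attr = ID_ATTR.get(elem.tag)
    if attr is not None and attr in elem.attrib:
        value = elem.attrib[attr]
        if value in ids:
            raise ValueError(f"duplicate ID {value}")
        ids[value] = elem.tag

# Pass 2: check every reference.
dangling, wrong_kind = [], []
for elem in root.iter():
    if elem.tag not in EXPECTED:
        continue
    attr, expected_tag = EXPECTED[elem.tag]
    for ref in elem.get(attr, "").split():
        if ref not in ids:
            dangling.append(ref)       # a DTD would reject this
        elif ids[ref] != expected_tag:
            wrong_kind.append(ref)     # a DTD accepts this silently

print(dangling)
print(wrong_kind)
```

Mary's `accounts` list names a customer ID. The DTD-level check finds nothing wrong with it; only the extra typed check reports it.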

13 XML Schema – Figure 10.9
• XML Schema is closer in spirit to relational schemas.
• It is closely associated with namespaces, e.g., xmlns:xsd="http://www.w3.org/2001/XMLSchema"
• Supports uniqueness constraints for primary keys and reference constraints for foreign keys.
• An element declaration has a name and a type.
• Each complexType (account, customer or depositor) is a sequence of subelements.
• complexType BankType is a sequence of references to elements of type account, customer or depositor.
  – More precise than an XML DTD, where an IDREF could refer to any element irrespective of whether it was an account or a customer.
• minOccurs and maxOccurs are multiplicity constraints.
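
Python's standard library does not include an XML Schema validator, so the sketch below only imitates the multiplicity part of a schema by counting subelements. The rule table `ACCOUNT_RULES` is illustrative and is not copied from Figure 10.9; the check ignores element order and data types, which a real schema processor would also enforce.

```python
import xml.etree.ElementTree as ET

# (subelement name, minOccurs, maxOccurs); None stands for "unbounded".
# These rules are illustrative and are not copied from Figure 10.9.
ACCOUNT_RULES = [
    ("account-number", 1, 1),
    ("branch-name", 1, 1),
    ("balance", 1, 1),
]

def occurrence_errors(elem, rules):
    """List (name, count) for subelements outside [minOccurs, maxOccurs]."""
    errors = []
    for tag, min_occurs, max_occurs in rules:
        count = len(elem.findall(tag))
        too_few = count < min_occurs
        too_many = max_occurs is not None and count > max_occurs
        if too_few or too_many:
            errors.append((tag, count))
    return errors

good = ET.fromstring(
    "<account><account-number>A-401</account-number>"
    "<branch-name>Downtown</branch-name><balance>500</balance></account>"
)
bad = ET.fromstring(
    "<account><account-number>A-402</account-number>"
    "<balance>300</balance><balance>900</balance></account>"
)

print(occurrence_errors(good, ACCOUNT_RULES))
print(occurrence_errors(bad, ACCOUNT_RULES))
```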

14 Query and Transformation of XML
• 3 kinds of query languages
  – XPath is the building block of path expressions.
  – XSLT is a transformation language.
    » Originally designed to convert XML to HTML.
    » XSLT can transform one XML document into another, so it is also a query language.
    » Most widely supported.
  – XQuery is more like an object query language.
• Tree model of XML data
  – Root
  – Nodes are either elements or attributes.
  – Element nodes can have children, which are subelements or attributes of that element.
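
The tree model can be made visible by walking a parsed document. This is a minimal sketch with Python's standard `xml.etree.ElementTree`; the one-line sample document and the helper `walk` are assumptions for illustration. Element nodes are listed with their attribute and subelement children indented beneath them.

```python
import xml.etree.ElementTree as ET

doc = '<bank><account acct-type="checking"><balance>500</balance></account></bank>'

def walk(elem, depth=0, out=None):
    """Collect one line per node, indented by its depth in the tree."""
    out = [] if out is None else out
    out.append("  " * depth + f"element {elem.tag}")
    # Attributes are children of the element that carries them.
    for name, value in elem.attrib.items():
        out.append("  " * (depth + 1) + f"attribute {name}={value}")
    # Subelements are visited in document order.
    for child in elem:
        walk(child, depth + 1, out)
    return out

lines = walk(ET.fromstring(doc))
print("\n".join(lines))
```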

15 Query and Transformation of XML
• Path expression
  – A sequence of steps such as /xx/yy/zz, where the leading / refers to the root.
  – The result is the set of nodes or values in the XML document that match the path.
  – /bank-2/customer/customer-name on Figure 10.8 returns the customer-name elements for Joe, Lisa and Mary.
  – /bank-2/customer/customer-name/text() returns only the text values, without the enclosing tags.
  – … also returns the set of account … cannot be applied to IDREFS.
• Selection
  – /bank-2/account[balance > 400]
  – /bank-2/account[balance > …
• Count
  – /bank-2/account[count(customer) > 2]
• Skip intermediate elements
  – /bank-2//name
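
These path expressions can be approximated with Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath: paths are written relative to the root element, `text()` and `count()` are not available, and numeric comparisons in predicates are not supported. The sketch below therefore expresses the selection and the count in plain Python. The sample document is hypothetical and is not the content of Figure 10.8.

```python
import xml.etree.ElementTree as ET

doc = """
<bank-2>
  <account account-number="A-401"><balance>500</balance></account>
  <account account-number="A-402"><balance>300</balance></account>
  <customer><customer-name>Joe</customer-name></customer>
  <customer><customer-name>Lisa</customer-name></customer>
  <customer><customer-name>Mary</customer-name></customer>
</bank-2>
"""
root = ET.fromstring(doc)  # root is the bank-2 element

# /bank-2/customer/customer-name : the matching elements
name_elems = root.findall("./customer/customer-name")

# /bank-2/customer/customer-name/text() : only the text values
names = [elem.text for elem in name_elems]
print(names)

# /bank-2/account[balance > 400] : the predicate is evaluated in Python
rich = [
    acct.get("account-number")
    for acct in root.findall("./account")
    if float(acct.findtext("balance")) > 400
]
print(rich)

# /bank-2/account[count(customer) > 2] : count child elements in Python
many_owners = [
    acct for acct in root.findall("./account")
    if len(acct.findall("customer")) > 2
]
print(len(many_owners))

# /bank-2//customer-name : skip intermediate levels with //
anywhere = [elem.text for elem in root.findall(".//customer-name")]
print(anywhere)
```

A full XPath processor would evaluate the predicates inside the path expression itself; the split into path plus Python filter is a limitation of the library used here, not of XPath.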

16 BMGTG402 Namespace
• …. …