What Is XML?

Presentation transcript:

1 What Is XML? eXtensible Markup Language for data –Standard for publishing and interchange –“Cleaner” SGML for the Internet Applications: –Data exchange over intranets, between companies –E-business –Native file formats (Word, SVG) –Publishing of data –Storage format for irregular data –…

2 How Does it Look? –Emerging format for data exchange on the web and between applications.

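3 XML Terminology tags: book, title, author, … start tag: <book>, end tag: </book>; elements: <book>…</book>, <author>…</author>; elements are nested; empty element: a start tag immediately followed by its end tag, abbreviated as a single self-closing tag; an XML document: single root element; well-formed XML document: all tags match and nest properly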
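
The terminology above can be tried out with a small sketch using Python's standard `xml.etree.ElementTree` parser. The sample document, its element names (`bib`, `out-of-print`) and its values are hypothetical, chosen only to illustrate nesting, an empty element, and well-formedness.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (names and values are illustrative only).
DOC = """<bib>
  <book>
    <title>Title A</title>
    <author>Author 1</author>
    <author>Author 2</author>
    <out-of-print/>
  </book>
</bib>"""


def is_well_formed(text):
    """Return True if the text parses as XML (single root, matching tags)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOC)            # the single root element: <bib>
book = root.find("book")             # elements are nested inside the root
child_tags = [child.tag for child in book]
empty = book.find("out-of-print")    # empty element, written <out-of-print/>

print(root.tag, child_tags)
print(is_well_formed(DOC))               # matching, properly nested tags
print(is_well_formed("<a><b></a></b>"))  # tags overlap instead of nesting
```

The parser rejects the second string because the end tags do not match the innermost open element, which is exactly the well-formedness condition on the slide.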

4 Attributes and References XML distinguishes attributes from sub-elements. IDs and IDREFs are used to reference objects. Oids and references in XML are just syntax.
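
Since references are just syntax, following one is a lookup by attribute value. The sketch below assumes a hypothetical document in which `id` plays the role of an ID attribute and `mother` the role of an IDREF; without a DTD the parser itself does not know this, so the roles are assigned by the code.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: "id" acts as an ID attribute, "mother" as an IDREF.
DOC = """<db>
  <person id="o1"><name>Jane</name></person>
  <person id="o2" mother="o1"><name>Mary</name></person>
</db>"""

root = ET.fromstring(DOC)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def deref(element, attr):
    """Follow the reference stored in `attr`; None if absent or dangling."""
    ref = element.get(attr)
    return by_id.get(ref) if ref is not None else None


mary = by_id["o2"]
mother = deref(mary, "mother")
mother_name = mother.findtext("name")
print(mother_name)
```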

5 What’s Special about XML? Supported by almost everyone Easy to parse (even with no info about the doc) Can encode data with little or much structure Supports data references inside & outside document Presentation layer for publishing (XSL) Human readable. No need for proprietary formats anymore. Many, many tools

6 Origin of XML Comes from SGML (very nasty language). Principle: separate the data from the graphical presentation.

7 XML, After the Roots A format for sharing data. Applications: –EDI (electronic data interchange): transactions between banks; producers and suppliers sharing product data (auctions); extranets: building relationships between companies; scientists sharing data about experiments. –Sharing data between different components of an application. –Format for storing all data in Office. Basis for data sharing and integration.

8 Why are we DB’ers interested? It’s data, stupid. That’s us. Proof by Altavista: –database+XML -- 40,000 pages. Database issues: –How are we going to model XML? (graphs). –How are we going to query XML? (XML-QL) –How are we going to store XML (in a relational database? object-oriented?) –How are we going to process XML efficiently? (uh… well..., um..., ah..., get some good grad students!)

9 Document Type Definitions (DTDs) Sort of like a schema, but not really. Inherited from the SGML DTD standard. BNF grammar establishing constraints on element structure and content. Definitions of entities.
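
One way to read "BNF grammar" here: a DTD content model such as `(title, author+, year?)` is a regular expression over child element names. The following is a minimal sketch under that reading, not a DTD validator. The helper names are hypothetical, and it handles only element names, sequence, choice, and the `*`, `+`, `?` operators (no `#PCDATA`, `EMPTY`, or `ANY`).

```python
import re
import xml.etree.ElementTree as ET

OPERATORS = set("()|*+?")


def content_model_to_regex(model):
    """Translate a simple DTD content model into a regex over 'name ' tokens."""
    tokens = re.findall(r"[A-Za-z_][\w.-]*|[()|*+?,]", model)
    parts = []
    for tok in tokens:
        if tok == ",":
            continue                      # sequence is plain concatenation
        if tok in OPERATORS:
            parts.append(tok)             # same meaning in regex syntax
        else:
            parts.append("(?:" + re.escape(tok) + " )")
    return re.compile("".join(parts))


def children_match(element, model):
    """Check the element's child names, in order, against the content model."""
    names = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(names) is not None


MODEL = "(title, author+, year?)"         # hypothetical content model for book
good = ET.fromstring("<book><title/><author/><author/></book>")
bad = ET.fromstring("<book><author/><title/></book>")
print(children_match(good, MODEL), children_match(bad, MODEL))
```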

10 Shortcomings of DTDs Useful for documents, but not so good for data: No support for structural re-use –Object-oriented-like structures aren’t supported No support for data types –Can’t validate data values Can have a single key item (ID), but: –No support for multi-attribute keys –No support for foreign keys (references to other keys) –No constraints on IDREF targets (e.g., cannot require that a reference point only to a Section)

11 XML Schema In XML format Includes primitive data types (integers, strings, dates, etc.) Supports value-based constraints (integers > 100) User-definable structured types Inheritance (extension or restriction) Foreign keys Element-type reference constraints

12 Sample XML Schema …

13 Subtyping in

14 Important XML Standards XSL/XSLT*: presentation and transformation standards. RDF: Resource Description Framework (meta-info such as ratings, categorizations, etc.). XPath/XPointer/XLink*: standards for linking to documents and elements within them. Namespaces: for resolving name clashes. DOM: Document Object Model for manipulating XML documents. SAX: Simple API for XML parsing. This weekend, somewhere in Germany, a W3C committee is meeting to discuss a standard query language.
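
The difference between DOM and SAX is easiest to see side by side. The sketch below uses Python's standard `xml.dom.minidom` and `xml.sax` modules on a hypothetical two-book document: DOM builds the whole tree in memory and is then navigated, while SAX delivers start/end/character events and the handler keeps only what it needs.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical input document.
DOC = "<bib><book><title>T1</title></book><book><title>T2</title></book></bib>"

# DOM: parse everything into a tree, then query the tree.
dom = minidom.parseString(DOC)
dom_titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]


# SAX: react to parsing events as they stream by.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        if self._in_title:
            self._buffer.append(content)   # text may arrive in several chunks

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_titles = handler.titles

print(dom_titles, sax_titles)
```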

15 XML Data Model (Graph) Issues: distinguish between attributes and sub-elements? Should we conserve order? Think of the labels as names of binary relations.

16 Comparison with Relational Data No strict typing. Arbitrary nesting. Data can be irregular. Schema is part of the data. (Figure: rows with name and phone labels; surviving values: “John”, 3634, “Sue”, “Dick”.)

17 Querying XML Requirements: –Query a graph, not a relation. –The result should be a graph (representing an XML document), not a relation. –No schema. –We may not know much about the data, so we need to navigate the XML.

18 Query Languages First, there was XQL (from Microsoft). It was very quickly realized that it was very limited. Then a bunch of database researchers looked at XML and invented XML-QL. –XML-QL comes from the nicer StruQL language. –Many people got excited and formed a committee. Last week: Quilt, a new language combining the best of XML-QL and XQL. Stay tuned.

19 Extracting Data by Query Matching data using element patterns. WHERE <book> <publisher><name>Addison-Wesley</name></publisher> <title>$t</title> <author>$a</author> </book> IN “…” CONSTRUCT $a

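20 Constructing XML Data WHERE <book> <publisher><name>Addison-Wesley</name></publisher> <title>$t</title> <author>$a</author> </book> IN “…” CONSTRUCT <result> <author>$a</author> <title>$t</title> </result>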
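
The WHERE/CONSTRUCT pattern of the two slides above can be emulated in a few lines of Python, which may help make the semantics concrete: the WHERE part binds `$t` and `$a` for every match, and the CONSTRUCT part emits one result per binding. This is a sketch, not an XML-QL implementation; the element names (`book`, `publisher`, `name`, `title`, `author`, `result`) and the sample data are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography; element names and values are assumed.
BIB = """<bib>
  <book>
    <publisher><name>Addison-Wesley</name></publisher>
    <title>Title A</title>
    <author>Author 1</author>
    <author>Author 2</author>
  </book>
  <book>
    <publisher><name>Other Press</name></publisher>
    <title>Title B</title>
    <author>Author 3</author>
  </book>
</bib>"""


def query(bib_root):
    """Bind ($t, $a) for Addison-Wesley books; construct one result per binding."""
    results = ET.Element("results")
    for book in bib_root.iter("book"):
        names = [n.text for n in book.findall("publisher/name")]
        if "Addison-Wesley" not in names:
            continue                                   # pattern does not match
        for title in book.findall("title"):            # binds $t
            for author in book.findall("author"):      # binds $a
                result = ET.SubElement(results, "result")
                ET.SubElement(result, "author").text = author.text
                ET.SubElement(result, "title").text = title.text
    return results


out = query(ET.fromstring(BIB))
pairs = [(r.findtext("author"), r.findtext("title")) for r in out]
print(pairs)
```

Note that a book with two authors yields two results, one per variable binding, which is why grouping needs the nested-query form shown on the next slide.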

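21 Grouping with Nested Queries WHERE <book> <title>$t</title>, <publisher><name>Addison-Wesley</name></publisher> </book> CONTENT_AS $p IN “…” CONSTRUCT <result> <title>$t</title> WHERE <author>$a</author> IN $p CONSTRUCT <author>$a</author> </result>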

22 Joining Elements by Value WHERE … $f … $l … ELEMENT_AS $e IN “…”, … $f … $l … IN “…”, $y > 1995 CONSTRUCT $e Find all articles whose writers also published a book after 1995.
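
A value join like this one is an ordinary hash join on the shared variables. The sketch below assumes hypothetical element names (`article`, `book`, `author`, `firstname`, `lastname`) and a `year` attribute on `book`; none of these are confirmed by the slide, which only shows the variables `$f`, `$l`, `$e` and the year condition.

```python
import xml.etree.ElementTree as ET

# Hypothetical sources; structure and values are assumptions.
ARTICLES = """<articles>
  <article><title>Article X</title>
    <author><firstname>Ann</firstname><lastname>Lee</lastname></author></article>
  <article><title>Article Y</title>
    <author><firstname>Bob</firstname><lastname>Ray</lastname></author></article>
</articles>"""

BOOKS = """<books>
  <book year="1998">
    <author><firstname>Ann</firstname><lastname>Lee</lastname></author></book>
  <book year="1990">
    <author><firstname>Bob</firstname><lastname>Ray</lastname></author></book>
</books>"""


def author_keys(element):
    """The set of ($f, $l) bindings found under one article or book."""
    return {(a.findtext("firstname"), a.findtext("lastname"))
            for a in element.findall("author")}


def articles_by_recent_book_authors(articles_root, books_root, after_year=1995):
    # Build side: authors of books published after the given year.
    recent = set()
    for book in books_root.iter("book"):
        if int(book.get("year", "0")) > after_year:
            recent |= author_keys(book)
    # Probe side: keep each article ($e) sharing an ($f, $l) pair.
    return [art for art in articles_root.iter("article")
            if author_keys(art) & recent]


matches = articles_by_recent_book_authors(ET.fromstring(ARTICLES),
                                          ET.fromstring(BOOKS))
match_titles = [m.findtext("title") for m in matches]
print(match_titles)
```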

23 Tag Variables WHERE … $f … $l … ELEMENT_AS $e IN “…”, … $f … $l … IN “…”, $y > 1995 CONSTRUCT $e Find all articles whose writers have done something after 1995.

24 Regular Path Expressions WHERE <part*> <name>$r</name> <brand>Ford</brand> </> IN “…” CONSTRUCT $r Find all parts whose brand is Ford, no matter what level they are in the hierarchy.
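
A close analogue of "at any level in the hierarchy" is the descendant step `.//` in the XPath subset supported by Python's `xml.etree.ElementTree`. The sketch below is not XML-QL and the two are not identical in general; the catalog and its element names (`catalog`, `part`, `name`, `brand`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical parts catalog with parts nested inside parts.
CATALOG = """<catalog>
  <part><name>engine</name><brand>Ford</brand>
    <part><name>piston</name><brand>Ford</brand></part>
    <part><name>gasket</name><brand>Acme</brand></part>
  </part>
  <part><name>wheel</name><brand>Acme</brand>
    <part><name>hubcap</name><brand>Ford</brand></part>
  </part>
</catalog>"""

root = ET.fromstring(CATALOG)

# ".//part" reaches part elements at any depth; the predicate keeps those
# with a direct <brand> child whose text is exactly "Ford".
ford_parts = [p.findtext("name") for p in root.findall(".//part[brand='Ford']")]
print(ford_parts)
```

The hubcap is found even though its enclosing part has a different brand, because the path does not constrain the intermediate levels.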

25 Regular Path Expressions WHERE … $r … IN “…” CONSTRUCT $r

26 XML Data Integration WHERE … ELEMENT_AS $n … $ssn … IN “…”, … $ssn … ELEMENT_AS $I IN “…” CONSTRUCT … $n $I … A query can access more than one XML document.

27 Skolem Functions in XML-QL where … $a … in “…” construct … $a … $l … where … $a … in “…” construct … $a … $l … Result: Smith: English, Mandarin; Doe: English
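
The idea behind a Skolem function is that the same argument always yields the same object id, so results constructed by separate queries land in one element instead of producing duplicates. A minimal sketch of that behavior, using a dictionary as the Skolem function; the element names (`person`, `name`, `lang`) and the split of the bindings across two sources are assumptions, while the values are those shown on the slide.

```python
import xml.etree.ElementTree as ET


def skolem_construct(bindings):
    """Merge ($a, $l) bindings so that each $a gets exactly one element."""
    root = ET.Element("result")
    by_key = {}                       # acts as the Skolem function: key -> element
    for author, language in bindings:
        person = by_key.get(author)
        if person is None:            # first time this argument is seen
            person = ET.SubElement(root, "person")
            ET.SubElement(person, "name").text = author
            by_key[author] = person
        ET.SubElement(person, "lang").text = language
    return root


# Bindings as they might come from two separate where-clauses (assumed split).
source_1 = [("Smith", "English"), ("Doe", "English")]
source_2 = [("Smith", "Mandarin")]

merged = skolem_construct(source_1 + source_2)
summary = {p.findtext("name"): [l.text for l in p.findall("lang")]
           for p in merged}
print(summary)
```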

28 Query Processing For XML Approach 1: store XML in a relational database. Translate an XML-QL query into a set of SQL queries. –Leverage 20 years of research & development. Approach 2: store XML in an object-oriented database system. –OO model is closest to XML, but systems do not perform well and are not well accepted. Approach 3: build an entire DBMS tailored to XML. –Still in the research phase.

29 Store XML in Ternary Relation [Florescu, Kossmann 1999] (Figure: a graph with object ids &o1 to &o6, edges labeled paper, title, author, and year, and leaf values “The Calculus”, “…”, and “1986”, stored in the relations Ref and Val.)
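
A minimal sketch of the relational approach, using Python's standard `sqlite3`: each edge of the XML graph becomes a row (source, label, destination) in `Ref`, and each leaf value a row in `Val`. The column names, the numbering of nodes, and the author value are assumptions for illustration and are not taken from the cited paper. A path query then becomes a chain of self-joins on `Ref`, which is what translating a query into SQL amounts to here.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import count

# Hypothetical document; the author value is a placeholder.
DOC = ("<paper><title>The Calculus</title>"
       "<author>A. Author</author><year>1986</year></paper>")


def shred(root, conn):
    """Store the element tree as edges (Ref) and leaf values (Val)."""
    conn.execute("CREATE TABLE Ref (source INTEGER, label TEXT, dest INTEGER)")
    conn.execute("CREATE TABLE Val (node INTEGER, value TEXT)")
    ids = count(1)

    def visit(element, parent_id):
        node_id = next(ids)
        conn.execute("INSERT INTO Ref VALUES (?, ?, ?)",
                     (parent_id, element.tag, node_id))
        text = (element.text or "").strip()
        if text:
            conn.execute("INSERT INTO Val VALUES (?, ?)", (node_id, text))
        for child in element:
            visit(child, node_id)

    visit(root, 0)                    # 0 stands for a virtual document root


conn = sqlite3.connect(":memory:")
shred(ET.fromstring(DOC), conn)

# The path paper/title as a two-way self-join on Ref plus a lookup in Val.
rows = conn.execute("""
    SELECT v.value
    FROM Ref AS p
    JOIN Ref AS t ON t.source = p.dest
    JOIN Val AS v ON v.node = t.dest
    WHERE p.label = 'paper' AND t.label = 'title'
""").fetchall()
print(rows)
```

Each additional step in a path adds one more join, which is one reason efficient processing of long or recursive paths is listed among the open problems.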

30 Use DTD to Derive Schema [Christophides et al. 1994, Shanmugasundaram et al. 1999] DTD: … ODMG classes: class Employee public type tuple (name:string, address:Address, project:List(Project)) class Address public type tuple (street:string, …)

31 The Future Many research problems remain: –Efficient storage of XML –How to leverage relational DBMS –Update formalisms –Processing streaming data –Transactions –Everything else we think about in databases.