Managing XML and Semistructured Data Lecture 1: Preliminaries and Overview Prof. Dan Suciu Spring 2001.



In this lecture
  Goals of the course
  Prerequisites
  Resources
    – textbooks
    – research papers
  Overview of the course

Goals of the Course
  Purpose:
    – Foundations of semistructured data
    – Issues in semistructured data management
    – Glimpse at current XML standards and technology

Prerequisites
  A graduate course in database systems
  Logic
  Programming languages
  Complexity theory
  Algorithms and data structures

Textbooks
  Data on the Web: From Relations to Semistructured Data and XML, by Abiteboul, Buneman, Suciu
    – For foundations
  W3C homepage
    – For current standards
  Professional XML Databases, by Kevin Williams
    – For current XML technologies

Other Useful Texts
  A First Course in Database Systems (2 vols), by Ullman, Widom and Garcia-Molina
  Data and Knowledge based Systems (2 vols), by Ullman
  Foundations of Databases, by Abiteboul, Hull, Vianu
  Proceedings of the SIGMOD, VLDB, and PODS conferences

Papers: Data Models
  XML, Java, and the future of the Web, by Jon Bosak, Sun Microsystems
  W3C XML Query Data Model, by Mary Fernandez, Jonathan Robie
  Adding structure to semistructured data, by Buneman, Davidson, Fernandez, Suciu, in ICDT 97
  Object Exchange Across Heterogeneous Information Sources, by Y. Papakonstantinou, H. Garcia-Molina and J. Widom, in Data Engineering 95

Papers: Query Languages
  A formal semantics of patterns in XSLT, by Phil Wadler
  XQuery: A Query Language for XML, by Chamberlin, Florescu, et al.
  XML-QL: A Query Language for XML, by Deutsch, Fernandez, Florescu, Levy, Suciu, in WWW8
  Catching the boat with Strudel, in VLDBJ 2001
  UnQL: A Query Language and Algebra for Semistructured Data Based on Structural Recursion, by Buneman, Fernandez, Suciu, in VLDBJ 2000
  The Lorel Query Language for Semistructured Data, by Abiteboul, Quass, McHugh, Widom, Wiener, in International Journal on Digital Libraries, 1997

Papers: Schemas
  MSL: A Model for W3C XML Schema, by Brown, Fuchs, Robie, Wadler, in WWW10, 2001
  Keys for XML, by Buneman, Davidson, Fan, Hara, Tan, in WWW10, 2001
  Subsumption for XML Types, by Kuper and Simeon, in ICDT'2001
  Extracting Schema from Semistructured Data, by Nestorov, Abiteboul, Motwani, in SIGMOD 98

Papers: Query Analysis, Typechecking
  Optimizing Regular Path Expressions Using Graph Schemas, by Fernandez, Suciu, in ICDE'98
  XDuce: A typed XML processing language, by Hosoya and Pierce
  Regular Expression Pattern Matching for XML, by Hosoya and Pierce, in POPL 2001
  Typechecking for XML Transformers, by Milo, Vianu, Suciu

Papers: Indexing
  Index Structures for Path Expressions, by Milo and Suciu, in ICDT'99

Papers: Publishing
  Efficiently Publishing Relational Data as XML Documents, by Shanmugasundaram, Shekita, Barr, Carey, Lindsay, Pirahesh, Reinwald, in VLDB'2000
  SilkRoute: Trading between relations and XML, by Fernandez, Suciu, Tan, in WWW9, 2000
  Efficient Evaluation of XML Middle-ware Queries, in SIGMOD'2001

Papers: Compression
  XMILL: An Efficient Compressor for XML Data, by Liefke and Suciu, in SIGMOD'2001

Overview
  Semistructured Data
    – Model
    – Syntax
    – Comparison with relational data
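As a rough illustration of the comparison with relational data, the sketch below encodes a small relation as a labeled tree and then adds an entry that no fixed relational schema would accept. The function name relation_to_tree, the "row" label, and all sample values are hypothetical and chosen only for this example; the nested-dictionary encoding is one possible reading of the semistructured model, not the course's definition.

```python
def relation_to_tree(name, columns, rows):
    """Encode a relation as a tree: one 'row' child per tuple,
    one labeled leaf per non-null attribute value."""
    return {
        name: [
            {"row": {col: val for col, val in zip(columns, row) if val is not None}}
            for row in rows
        ]
    }


# Hypothetical relation person(name, phone); Bob has a NULL phone.
columns = ["name", "phone"]
rows = [("Alice", "555-0100"), ("Bob", None)]
tree = relation_to_tree("person", columns, rows)

# Semistructured data tolerates irregular entries: an extra label (email)
# and a repeated label (two phones) fit without changing any schema.
tree["person"].append(
    {"row": {"name": "Carol",
             "phone": ["555-0101", "555-0102"],
             "email": "carol@example.org"}}
)

# The labels travel with the data, so each entry describes itself.
labels = [sorted(entry["row"]) for entry in tree["person"]]
print(labels)
```

In the tree form a missing value is simply an absent label, whereas the relation needs a NULL in a fixed column.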

Overview
  XML
    – Motivation
    – Syntax:
        Basic stuff: elements, attributes, content
        Esoteric stuff: PIs, entities, CDATA, comments
    – DTDs
    – Data model (XQuery)
    – Miscellaneous: Namespaces, XPointer, XLink
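To make the syntax items above concrete, here is a minimal sketch that parses one small document containing an element with an attribute, text content, a comment, a processing instruction (PI), an internal entity, and a CDATA section. It uses Python's standard xml.dom.minidom; the document itself, the PI target "render", and the entity name "pub" are made-up sample values.

```python
from xml.dom import minidom, Node

# Hypothetical sample document exercising each syntax construct once.
DOC = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!ENTITY pub "Example Press">
]>
<book lang="en">
  <!-- sample record -->
  <?render style="plain"?>
  <title>Data on the Web</title>
  <publisher>&pub;</publisher>
  <note><![CDATA[a < b && c]]></note>
</book>"""


def describe(node, out):
    """Walk the DOM depth-first and record what kind of node each one is."""
    if node.nodeType == Node.ELEMENT_NODE:
        out.append(("element", node.tagName, dict(node.attributes.items())))
        for child in node.childNodes:
            describe(child, out)
    elif node.nodeType == Node.COMMENT_NODE:
        out.append(("comment", node.data.strip()))
    elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        out.append(("pi", node.target, node.data))
    elif node.nodeType == Node.CDATA_SECTION_NODE:
        out.append(("cdata", node.data))
    elif node.nodeType == Node.TEXT_NODE and node.data.strip():
        out.append(("text", node.data.strip()))
    return out


events = describe(minidom.parseString(DOC).documentElement, [])
for event in events:
    print(event)
```

The entity reference &pub; never shows up as a node of its own: the parser expands it, so the publisher element simply contains the replacement text.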

Overview
  Query Languages
    – Lorel: extends OQL
    – UnQL: structural recursion, patterns
    – StruQL: Skolem functions
    – XML-QL: everything for XML
    – Quilt/XQuery: the standard
    – XSL: the standard
    – XDuce: a general-purpose language

Overview
  Schemas
    – Theory: lower bound, upper bound
    – XML-Schema
    – “XML-Schema are regular tree languages”
    – Constraints (keys for XML)

Overview
  Query analysis
    – Query pruning
    – Query containment

Overview
  XML Publishing from Relational Databases
    – Virtual XML publishing: SilkRoute, Microsoft’s XDR
    – Materialized XML publishing: XPERANTO, SilkRoute, Microsoft’s “for XML”

Overview
  Indexes
    – Indexes for ss data: data guides, T-indexes
    – Indexes for XML: we are still waiting for them...
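For orientation, a data guide can be thought of as a summary in which every label path occurring in the data appears exactly once. The sketch below computes that set of distinct label paths for tree-shaped data only; graph-shaped data needs a more involved construction, which is not attempted here. The function name label_paths and the sample document are hypothetical.

```python
from xml.etree import ElementTree as ET


def label_paths(root):
    """Return the sorted distinct root-to-node label paths of a tree."""
    paths = set()

    def walk(elem, prefix):
        path = prefix + (elem.tag,)
        paths.add(path)  # repeated paths collapse into one entry
        for child in elem:
            walk(child, path)

    walk(root, ())
    return sorted(paths)


# Hypothetical data: two author elements under book share one label path.
DOC = "<bib><book><title/><author/><author/></book><paper><title/></paper></bib>"
summary = label_paths(ET.fromstring(DOC))
for path in summary:
    print("/".join(path))
```

A path query such as bib/book/author can be checked against this summary first; if the path is absent, the data need not be scanned at all.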

Overview
  Miscellaneous
    – XML compression (XMill)