
Tools for XML Data Exchange. Dan Suciu, AT&T Labs. Joint work with Mary Fernandez.


1 Tools for XML Data Exchange
Dan Suciu
AT&T Labs
Joint work with Mary Fernandez

2 XML Has Many Facets
XML for fancier Web pages
–XML generated with structural editors
XML for messaging
–generated during applications
XML for Data Exchange
–generated from legacy data

3 XML in Data Exchange
communities agree on common DTD
export their data in XML
exchange over HTTP protocol
applications understand only that DTD

4 An Example of XML Data
Addison-Wesley; Serge Abiteboul, Rick Hull, Victor Vianu; Foundations of Databases; 1995
Freeman; Jeffrey D. Ullman; Principles of Database and Knowledge Base Systems; 1998

5 XML Exchange Vision
[Diagram labels: application; relational data; object-relational; legacy data; XML Data; WEB (HTTP); Transform; Integrate; Warehouse]

6 Tools
export legacy data to XML
–RXL
query/transform/integrate XML data
–XML-QL
compress XML data
–XMill
store/process incoming XML data
–STORED

7 XML-QL: A Query Language for XML
http://www.w3.org/TR/NOTE-xml-ql (8/98)
W3C new Working Group on QL (9/99)
XML-QL characteristics:
–relationally complete (like SQL)
–XML input, XML output
–queries, transforms, integrates XML data
[Deutsch et al., 1999 (WWW8)]

8 Querying in XML-QL
where Morgan Kaufmann $a in "www.a.b.c/bib.xml"
construct $a
[label: Pattern]
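To make the where/construct idea concrete, here is a minimal Python sketch that emulates this kind of query with the standard library's xml.etree.ElementTree: match a pattern (books from a given publisher), bind each author, and construct one result element per binding. The element names (bib, book, publisher, name, author, title, year, results, result) are assumptions, since the transcript does not preserve the slide's tags, and the helper authors_by_publisher is hypothetical. The sample data reuses the two books listed in the example slide. This is an emulation in Python, not XML-QL itself.

```python
import xml.etree.ElementTree as ET

# Sample document; the tag names are assumed, the values come from the example slide.
BIB_XML = """<bib>
  <book>
    <publisher><name>Addison-Wesley</name></publisher>
    <author>Serge Abiteboul</author>
    <author>Rick Hull</author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book>
    <publisher><name>Freeman</name></publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""


def authors_by_publisher(xml_text, publisher):
    """Emulate: where <book> with the given publisher name and author $a,
    construct <result>$a</result> for every binding of $a."""
    root = ET.fromstring(xml_text)
    results = ET.Element("results")
    for book in root.iter("book"):
        # "where" part: the pattern must match the publisher name.
        if book.findtext("publisher/name") == publisher:
            for author in book.findall("author"):
                # "construct" part: one output element per variable binding.
                ET.SubElement(results, "result").text = author.text
    return results


out = authors_by_publisher(BIB_XML, "Addison-Wesley")
print(ET.tostring(out, encoding="unicode"))
```

Asking for "Morgan Kaufmann", as on the slide, yields an empty result with this sample data, because neither sample book has that publisher.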

9 Transformations in XML-QL
Note: abbreviates or or...
where $a in "www.a.b.c/bib.xml"
construct $a $l ......
[label: Template]

10 Transformations in XML-QL
where $a in "www.a.b.c/bib.xml"
construct $a $l .........
[label: Skolem Functions in Templates]

11 Data Integration in XML-QL
{ where $n $t in "www.books.com" construct $t }
{ where $n $r in "www.reviews.com" construct $r }
...

12 RXL: Export Legacy Data To XML
legacy data
–fragmented into many flat relations
–3rd normal form
–schema is proprietary
XML data
–nested
–un-normalized
–schema designed by agreement

13 RXL: An Example
relational database: Store, SB, Book
virtual XML view: n1... n2... …

14 A Simple RXL Query
specify XML view declaratively
from Store, SB, Book
where Store.sid=SB.sid and SB.bid=Book.bid
construct Store.name Book.title
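The following sketch shows roughly what evaluating such a view involves: run the three-way join over Store, SB, and Book, then wrap the resulting rows in XML. It uses Python's sqlite3 and xml.etree.ElementTree rather than RXL. The table and column names (sid, bid, name, title) come from the slide. The sample rows, the element names (view, store, name, book, title), the helpers build_db and export_view, and the choice to nest books under their store are all assumptions for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_db():
    """Create a tiny in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Store(sid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Book(bid INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE SB(sid INTEGER, bid INTEGER);
        INSERT INTO Store VALUES (1, 'n1'), (2, 'n2');
        INSERT INTO Book VALUES (10, 'The Calculus'), (11, 'Foundations of Databases');
        INSERT INTO SB VALUES (1, 10), (1, 11), (2, 11);
    """)
    return conn


def export_view(conn):
    """Evaluate the from/where part as SQL, then build the XML (construct part)."""
    rows = conn.execute("""
        SELECT Store.sid, Store.name, Book.title
        FROM Store, SB, Book
        WHERE Store.sid = SB.sid AND SB.bid = Book.bid
        ORDER BY Store.sid, Book.bid
    """)
    root = ET.Element("view")
    stores = {}  # sid -> <store> element, so each store appears only once
    for sid, name, title in rows:
        if sid not in stores:
            store = ET.SubElement(root, "store")
            ET.SubElement(store, "name").text = name
            stores[sid] = store
        book = ET.SubElement(stores[sid], "book")
        ET.SubElement(book, "title").text = title
    return root


view = export_view(build_db())
print(ET.tostring(view, encoding="unicode"))
```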

15 RXL: Querying the XML View
users ask XML-QL queries:
–find stores that sell "The Calculus"
where $n The Calculus
construct $n

16 RXL: Query composition
system composes query with view:
from Store, SB, Book
where Store.sid=SB.sid and SB.bid=Book.bid and Book.title="The Calculus"
construct Store.name
[Diagram labels: Store, SB, Book; n1... n2... …; RXL; XML-QL]
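The point of composition is that the user's condition is pushed into the relational query, so the full view never has to be materialized. The sketch below contrasts the two strategies using Python's sqlite3. The table and column names follow the slide; the sample rows and both helper functions are hypothetical, and the code only illustrates the idea, not how the RXL system actually rewrites queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store(sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Book(bid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE SB(sid INTEGER, bid INTEGER);
    INSERT INTO Store VALUES (1, 'n1'), (2, 'n2');
    INSERT INTO Book VALUES (10, 'The Calculus'), (11, 'Foundations of Databases');
    INSERT INTO SB VALUES (1, 10), (1, 11), (2, 11);
""")


def stores_via_full_view(conn, title):
    """Naive strategy: compute the whole view, then filter in the application."""
    view = conn.execute("""
        SELECT Store.name, Book.title
        FROM Store, SB, Book
        WHERE Store.sid = SB.sid AND SB.bid = Book.bid
    """)
    return sorted(name for name, book_title in view if book_title == title)


def stores_via_composed_query(conn, title):
    """Composed strategy: the title condition is part of the SQL itself,
    so the database returns only the matching stores."""
    rows = conn.execute("""
        SELECT Store.name
        FROM Store, SB, Book
        WHERE Store.sid = SB.sid AND SB.bid = Book.bid AND Book.title = ?
    """, (title,))
    return sorted(name for (name,) in rows)


naive = stores_via_full_view(conn, "The Calculus")
composed = stores_via_composed_query(conn, "The Calculus")
print(naive, composed)
```

Both functions return the same stores; the composed form simply lets the database do the selection.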

17 Compressing XML Data
for exchange and archiving
can use general tool (gzip)
but specialized tool twice as good (XMill)

18 XMill Example: Weblogs
202.239.238.16|GET / HTTP/1.0|text/html|200|1997/10/01-00:00:02|-|4478|-|-|http://www02.so-net.or.jp/|Mozilla/3.01 [ja] (Win95; I)
202.239.238.16 GET / HTTP/1.0 text/html 200 1997/10/01-00:00:02 4478 http://www02.so-net.or.jp/ Mozilla/3.01 [ja] (Win95; I)

19 XMill Example: Weblogs
weblog.dat: 15.9MB    weblog.dat.gz: 1.6MB
weblog.xml: 24.2MB    weblog.xml.gz: 2.1MB
weblog1.xmi: 1.75MB   xmill -p // weblog.xml weblog1.xmi
weblog2.xmi: 1.33MB   xmill weblog.xml weblog2.xmi
weblog3.xmi: 0.82MB   xmill -f settings.pz weblog.xml weblog3.xmi
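The ratios implied by these sizes can be worked out directly. The short Python sketch below uses only the figures quoted on the slide; the dictionary and the helper name ratio are illustrative.

```python
# File sizes in MB, exactly as quoted on the slide.
SIZES_MB = {
    "weblog.dat": 15.9,
    "weblog.dat.gz": 1.6,
    "weblog.xml": 24.2,
    "weblog.xml.gz": 2.1,
    "weblog1.xmi": 1.75,
    "weblog2.xmi": 1.33,
    "weblog3.xmi": 0.82,
}


def ratio(larger, smaller):
    """How many times smaller the second file is than the first."""
    return SIZES_MB[larger] / SIZES_MB[smaller]


for name in ("weblog.xml.gz", "weblog1.xmi", "weblog2.xmi", "weblog3.xmi"):
    print(f"{name}: {ratio('weblog.xml', name):.1f}x smaller than weblog.xml, "
          f"{ratio('weblog.xml.gz', name):.2f}x smaller than weblog.xml.gz")

# The tuned XMill output compared with gzip applied to the original flat log.
print(f"weblog3.xmi vs weblog.dat.gz: {ratio('weblog.dat.gz', 'weblog3.xmi'):.2f}x")
```

By these numbers, gzip shrinks the XML file about 11.5 times, while the tuned XMill run (weblog3.xmi) shrinks it about 29.5 times, which is roughly 2.6 times smaller than the gzipped XML and close to half the size of the gzipped flat log.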

20 XMill: Fine Tuning the Compression
-p//apache:host=>seqcomb(u8 "." u8 "." u8 "." u8)
-p//apache:userAgent=>seq(e "/" e)
-p//apache:byteCount=>u
-p//apache:statusCode=>e
-p//apache:contentType=>e
-p//apache:requestLine=>seq("GET " rep("/" e) " HTTP/1." e)
-p//apache:date=>seq(u "/" u8 "/" u8 "-" u8 ":" di ":" di)
-p//apache:referer=>or(seq("file:" t) seq("http://" or(seq(rep("." e) "/" rep("/" e)) rep("." e))) t)
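To give a feel for why type-specific options like these help, the sketch below encodes a dotted host address the way the host option suggests: four small integers separated by dots, each stored in a single byte. It assumes u8 denotes an unsigned integer below 256. The functions encode_host and decode_host are hypothetical Python illustrations of the idea, not XMill's implementation or file format.

```python
def encode_host(host: str) -> bytes:
    """Pack a dotted address such as '202.239.238.16' into 4 bytes.

    The "." separators are implied by the format, so they are not stored.
    """
    parts = host.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {host!r}")
    # bytes() raises ValueError if any number is outside 0..255.
    return bytes(int(part) for part in parts)


def decode_host(data: bytes) -> str:
    """Rebuild the dotted text form from the 4 packed bytes."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    return ".".join(str(b) for b in data)


host = "202.239.238.16"  # host value from the weblog example
packed = encode_host(host)
print(len(host), "characters ->", len(packed), "bytes")
```

The round trip is exact only for numbers written without leading zeros; a real compressor would need a fallback for values that do not fit the declared pattern.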

21 Storing XML Data
Scenario:
–receive a large XML data instance
–want to store, manage it
Could build an XML management system from scratch (eXcelon)
Preferably: use existing database systems

22 Storing XML: Ternary Relation [Florescu, Kossmann 1999]
[Diagram: graph with nodes &o1 to &o6; edge labels paper, title, author, year; leaf values "The Calculus", "…", "1986"; tables Ref and Val]
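A minimal sketch of this storage scheme, assuming a Ref table of (source, label, target) edges and a Val table of (oid, value) leaf values. The table names come from the slide, while the column names, the object numbering, the root element bib, and the helper shred are assumptions. The code uses Python's sqlite3 and xml.etree.ElementTree and ignores attributes, mixed content, and sibling order.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import count

# Sample data modeled on the slide's diagram; the author value is elided there.
SAMPLE = """<bib>
  <paper>
    <title>The Calculus</title>
    <author>…</author>
    <year>1986</year>
  </paper>
</bib>"""


def shred(xml_text, conn):
    """Store every parent-child edge in Ref and every leaf text in Val."""
    conn.executescript("""
        CREATE TABLE Ref(source INTEGER, label TEXT, target INTEGER);
        CREATE TABLE Val(oid INTEGER, value TEXT);
    """)
    oids = count(1)  # fresh object identifiers: 1, 2, 3, ...

    def visit(elem, oid):
        text = (elem.text or "").strip()
        if len(elem) == 0 and text:
            conn.execute("INSERT INTO Val VALUES (?, ?)", (oid, text))
        for child in elem:
            child_oid = next(oids)
            conn.execute("INSERT INTO Ref VALUES (?, ?, ?)",
                         (oid, child.tag, child_oid))
            visit(child, child_oid)

    root_oid = next(oids)
    visit(ET.fromstring(xml_text), root_oid)
    return root_oid


conn = sqlite3.connect(":memory:")
shred(SAMPLE, conn)

# Path query paper/title becomes a self-join on Ref plus a lookup in Val.
titles = conn.execute("""
    SELECT v.value
    FROM Ref p
    JOIN Ref t ON t.source = p.target
    JOIN Val v ON v.oid = t.target
    WHERE p.label = 'paper' AND t.label = 'title'
""").fetchall()
print(titles)
```

Each additional step in a path expression adds one more self-join on Ref, which is the main cost of this generic layout.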

23 Storing XML: Derive Schema from DTD [Christophides et al. 1994, Shanmugasundaram et al. 1999]
DTD:
ODMG classes:
class Employee public type tuple (name:string, address:Address, project:List(Project))
class Address public type tuple (street:string, …)

24 STORED Approach: Mine Data to Derive Schema [Deutsch et al. 1999]
[Diagram labels: paper; author; title; year; fn; ln; Paper1; Paper2]

25 Summary
XML - simple (?), lightweight syntax
Challenge: build bridges to existing database tools
XML in data exchange: YES
XML as a new data model: NO

26 More Info
http://www.research.att.com/~suciu
Data on the Web: From Relational to Semistructured to XML, Morgan Kaufmann, 1999

