Combined Static and Dynamic Analysis for Effective Buffer Minimization in Streaming XQuery Evaluation Michael Schmidt Stefanie Scherzinger Christoph Koch.

Presentation transcript:

Combined Static and Dynamic Analysis for Effective Buffer Minimization in Streaming XQuery Evaluation
Michael Schmidt, Stefanie Scherzinger, Christoph Koch
Saarland University, Database Group, Saarbrücken, Germany
2007 IEEE 23rd International Conference on Data Engineering, April 17, 2007

2 Outline
I. Streaming XQuery Evaluation
– Motivation and Requirements
– Desiderata for Streaming and In-Memory XQuery Engines
– Existing Approaches
II. Combining Static and Dynamic Buffer Minimization
– Query Normalization
– The Concept of Roles
– Active Garbage Collection
– System Architecture
– Optimizations
III. The GCX XQuery Engine
– Prototype Implementation
– Benchmark Results
IV. Summary

3 Motivation and Requirements
The growing importance of streaming XML processing comes along with the proliferation of the WWW
Streams may arrive at very high rates
– storing incoming data to disk is often infeasible
A main-memory DOM tree representation of XML documents is very space-consuming
– buffer management becomes the key prerequisite for performance
The problem becomes even more urgent when evaluating (powerful fragments of) XQuery rather than simple filters on data streams
Streaming techniques are very useful for in-memory XQuery engines

4 Desiderata for in-memory XQuery Engines
(1) Only buffer data that is relevant for query evaluation
(2) Avoid multiple copies of the data in main memory
(3) Do not keep data buffered longer than necessary
Claim: Combination of static and dynamic analysis required to satisfy all desiderata

5 Existing Approaches (1)
(1) Only buffer data that is relevant for query evaluation
Document Projection
– Static query analysis
– Detect parts of the document that are relevant to query evaluation
– Project away those parts of the document that are not relevant to query evaluation
References:
A. Marian and J. Siméon, “Projecting XML Documents”, In Proc. VLDB’03, pages 213–224, 2003
S. Bréssan, B. Catania, Z. Lacroix, Y. G. Li, and A. Maddalena, “Accelerating Queries by Pruning XML Documents”, TKDE, 54(2):211–240, 2005
V. Benzaken, G. Castagna, D. Colazzo, and K. Nguyen, “Type-Based XML Projection”, In Proc. VLDB’06, 2006

6 Existing Approaches (2)
Document Projection
XQuery:
  { for $b in /bib/book
    where ($b/author = "A. Turing" and fn:exists($b/price))
    return $b/title }
Projection Paths:
  { /bib/book,
    /bib/book/author/dos::node(),
    /bib/book/price,
    /bib/book/title/dos::node() }
  dos := descendant-or-self
[Figure: XML document tree rooted at bib, containing book and article elements with nodes such as author, price, title, and isbn]
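The projection step on this slide can be illustrated with a minimal Python sketch. It is not taken from any of the cited systems: the helper names (`steps`, `matches`, `is_ancestor_of_match`, `keep`) are hypothetical, node paths are simplified to slash-separated element names, and the sketch assumes that ancestors of selected nodes are kept so that the document structure survives.

```python
# Projection paths copied from the slide.
PROJECTION_PATHS = [
    "/bib/book",
    "/bib/book/author/dos::node()",
    "/bib/book/price",
    "/bib/book/title/dos::node()",
]

def steps(path):
    """Split '/bib/book' into ['bib', 'book']; the document root '/' gives []."""
    return [s for s in path.split("/") if s]

def step_ok(name, test):
    """A node test matches by name or by wildcard."""
    return test == "*" or test == name

def matches(node_path, projection_path):
    """True if the node at node_path is selected by projection_path."""
    node, proj = steps(node_path), steps(projection_path)
    if proj and proj[-1] == "dos::node()":
        prefix = proj[:-1]
        # descendant-or-self: the node is the prefix node or lies below it
        return len(node) >= len(prefix) and all(
            step_ok(n, p) for n, p in zip(node, prefix))
    return len(node) == len(proj) and all(
        step_ok(n, p) for n, p in zip(node, proj))

def is_ancestor_of_match(node_path, projection_path):
    """True if the node lies on the way to a possible match (assumption: kept for structure)."""
    node = steps(node_path)
    proj = [p for p in steps(projection_path) if p != "dos::node()"]
    return len(node) < len(proj) and all(
        step_ok(n, p) for n, p in zip(node, proj))

def keep(node_path):
    """Decide whether a node survives projection."""
    return any(matches(node_path, p) or is_ancestor_of_match(node_path, p)
               for p in PROJECTION_PATHS)

for path in ["/bib", "/bib/book/title", "/bib/book/isbn", "/bib/article"]:
    print(path, "kept" if keep(path) else "projected away")
```

Under these assumptions, article subtrees and the isbn children of book are discarded before they ever reach the buffer, while everything below author and title is retained.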

7 Existing Approaches (3)
(2) Avoid multiple copies of the data in main memory
(3) Do not keep data buffered longer than necessary
Hard to satisfy both requirements in combination
XQuery:
  { for $x1 in //book return
      for $x2 in //* return
        for $x3 in //article return }
Two approaches:
(1) Single DOM-tree
(2) Buffers for variables

8 The Big Picture
[Figure: overview of the approach. Components: XQuery, Normalized XQuery, Projection Tree, Roles, Rewritten XQuery (role updates), Buffer (nodes annotated with roles), Evaluator, input stream, output stream. Arrow types: transformation/extraction, input/output, communication. Communication between evaluator and buffer: variable bindings, role removals, active garbage collection.]

9 Query Normalization
(1) Rewriting where-expressions to if-statements
(2) Pushing down if-statements
Before:
  { for $b in /bib
    where (fn:exists($b/book))
    return { $b/book } }
After:
  { for $b in /bib return
    ( if (fn:exists($b/book)) then else (),
      if (fn:exists($b/book)) then $b/book else (),
      if (fn:exists($b/book)) then else () ) }
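The two normalization steps can be sketched on a toy expression tree. This is an illustrative Python model, not the GCX implementation: the classes `Seq`, `If`, and `For` and the functions `normalize` and `push_down` are hypothetical, conditions and leaf expressions are plain strings, the else branch is always assumed to be the empty sequence, and `OPEN_TAG` / `CLOSE_TAG` are placeholders for the element constructor parts that the transcript does not preserve.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

@dataclass(frozen=True)
class Seq:
    items: Tuple["Expr", ...]      # a sequence expression (e1, e2, ...)

@dataclass(frozen=True)
class If:
    cond: str                      # condition, kept as text in this sketch
    then: "Expr"                   # else branch is always () here (assumption)

@dataclass(frozen=True)
class For:
    var: str
    source: str
    body: "Expr"
    where: Optional[str] = None    # optional where-clause

Expr = Union[str, Seq, If, For]

def push_down(cond, expr):
    """Step (2): distribute an if-condition over the items of a sequence."""
    if isinstance(expr, Seq):
        return Seq(tuple(push_down(cond, item) for item in expr.items))
    return If(cond, expr)

def normalize(expr):
    """Step (1): turn where-clauses into if-expressions, then push them down."""
    if isinstance(expr, For):
        body = normalize(expr.body)
        if expr.where is not None:
            body = push_down(expr.where, body)
        return For(expr.var, expr.source, body)
    if isinstance(expr, Seq):
        return Seq(tuple(normalize(item) for item in expr.items))
    if isinstance(expr, If):
        return push_down(expr.cond, normalize(expr.then))
    return expr

# The query from the slide; OPEN_TAG and CLOSE_TAG are hypothetical placeholders.
query = For("$b", "/bib",
            Seq(("OPEN_TAG", "$b/book", "CLOSE_TAG")),
            where="fn:exists($b/book)")
print(normalize(query))
```

The result has no where-clause and guards each of the three output items with its own copy of the condition, which mirrors the shape of the normalized query on the slide.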

10 Deriving Roles
XQuery:
  { for $bib in /bib return
    ( for $x in $bib/* return
        if (not(fn:exists($x/price))) then $x else (),
      for $b in $bib/book return $b/title ) }
[Figure: projection tree with root /, child bib, and below bib the branches * (with price[1] and dos::node()) and book (with title, followed by dos::node())]
Roles:
  Role | Projection path             | Expression
  r1   | /                           |
  r2   | /bib                        | $bib
  r3   | /bib/*                      | $x
  r4   | /bib/*/price[1]             | $x/price
  r5   | /bib/*/dos::node()          | $x
  r6   | /bib/book                   | $b
  r7   | /bib/book/title/dos::node() | $b/title

11 Assigning Roles
Matching document nodes get assigned roles when projected into the buffer
Roles assigned on-the-fly while reading the input
Nodes without roles and role-carrying ancestors need not be buffered (projection)
XML document (node: assigned roles):
  bib: {r2}
  book: {r3, r5, r6}
  author: {r5}
  title: {r5, r7}
Roles:
  r1  /
  r2  /bib
  r3  /bib/*
  r4  /bib/*/price[1]
  r5  /bib/*/dos::node()
  r6  /bib/book
  r7  /bib/book/title/dos::node()
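Role assignment for the role paths r1 to r7 on this slide can be sketched as follows. This is a simplified Python illustration, not the GCX preprojector (which, according to the architecture slide, uses a lazily constructed DFA): the function names are hypothetical, a node is modelled as a list of (name, position) pairs from the root element downwards, and `position` is assumed to be the 1-based index among same-named siblings so that the `[1]` predicate can be checked.

```python
# Role paths copied from the slide.
ROLES = {
    "r1": "/",
    "r2": "/bib",
    "r3": "/bib/*",
    "r4": "/bib/*/price[1]",
    "r5": "/bib/*/dos::node()",
    "r6": "/bib/book",
    "r7": "/bib/book/title/dos::node()",
}

def split(path):
    """'/bib/*' -> ['bib', '*']; the document root '/' -> []."""
    return [s for s in path.split("/") if s]

def step_matches(name, position, test):
    """Check one node test; a trailing [1] requires the first same-named sibling."""
    if test.endswith("[1]"):
        return position == 1 and step_matches(name, position, test[:-3])
    return test == "*" or test == name

def selects(pattern, node):
    """True if the role path `pattern` selects `node`."""
    tests = split(pattern)
    if tests and tests[-1] == "dos::node()":
        tests = tests[:-1]
        if len(node) < len(tests):
            return False
        node = node[:len(tests)]   # descendant-or-self: only the prefix must match
    if len(node) != len(tests):
        return False
    return all(step_matches(n, pos, t) for (n, pos), t in zip(node, tests))

def assign_roles(node):
    """Return the set of roles a node receives when it is projected into the buffer."""
    return {role for role, pattern in ROLES.items() if selects(pattern, node)}

bib = [("bib", 1)]
book = bib + [("book", 1)]
print(sorted(assign_roles(bib)))                   # ['r2']
print(sorted(assign_roles(book)))                  # ['r3', 'r5', 'r6']
print(sorted(assign_roles(book + [("author", 1)])))  # ['r5']
print(sorted(assign_roles(book + [("title", 1)])))   # ['r5', 'r7']
```

The printed role sets agree with the annotations shown for bib, book, author, and title on the slide.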

12 Inserting Role Updates
Original query:
  { for $bib in /bib return
    ( for $x in $bib/* return
        if (not(fn:exists($x/price))) then $x else (),
      for $b in $bib/book return $b/title ) }
Rewritten query with role updates:
  { for $bib in /bib return
    ( for $x in $bib/* return
      ( if (not(exists($x/price))) then $x else (),
        signOff($x,r3),
        signOff($x/price[1],r4),
        signOff($x/dos::node(),r5) ),
      for $b in $bib/book return
      ( $b/title,
        signOff($b,r6),
        signOff($b/title/dos::node(),r7) ),
      signOff($bib,r2) ) }
Roles:
  Role | Projection path             | Expression
  r1   | /                           |
  r2   | /bib                        | $bib
  r3   | /bib/*                      | $x
  r4   | /bib/*/price[1]             | $x/price
  r5   | /bib/*/dos::node()          | $x
  r6   | /bib/book                   | $b
  r7   | /bib/book/title/dos::node() | $b/title
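The static rewriting on this slide appends, at the end of each for-loop body, one signOff per role that was introduced for that loop variable. A small Python sketch of that idea follows; it only builds query text, the helper names `add_sign_offs` and `for_loop` are hypothetical, and the grouping of roles by variable is read off the role table rather than derived automatically.

```python
# Roles grouped by the for-loop variable they depend on (taken from the role table).
ROLES_BY_VAR = {
    "$bib": [("$bib", "r2")],
    "$x": [("$x", "r3"), ("$x/price[1]", "r4"), ("$x/dos::node()", "r5")],
    "$b": [("$b", "r6"), ("$b/title/dos::node()", "r7")],
}

def add_sign_offs(var, body):
    """Append a signOff call for every role bound to `var` to a loop body."""
    calls = [f"signOff({path},{role})" for path, role in ROLES_BY_VAR.get(var, [])]
    return "(" + ", ".join([body] + calls) + ")"

def for_loop(var, source, body):
    """Build a for-expression whose body ends with the role updates for `var`."""
    return f"for {var} in {source} return {add_sign_offs(var, body)}"

inner_x = for_loop("$x", "$bib/*", "if (not(exists($x/price))) then $x else ()")
inner_b = for_loop("$b", "$bib/book", "$b/title")
rewritten = for_loop("$bib", "/bib", f"({inner_x}, {inner_b})")
print(rewritten)
```

The generated text nests one extra pair of parentheses around the two inner loops compared with the slide; since XQuery sequences flatten, this does not change the order in which the signOff statements are reached.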

13 Active Garbage Collection
Rewritten query:
  { for $bib in /bib return
    ( for $x in $bib/* return
      ( if (not(exists($x/price))) then $x else (),
        signOff($x,r3),
        signOff($x/price[1],r4),
        signOff($x/dos::node(),r5) ),
      for $b in $bib/book return
      ( $b/title,
        signOff($b,r6),
        signOff($b/title/dos::node(),r7) ),
      signOff($bib,r2) ) }
[Animation: input stream, buffer, and output stream during evaluation. The buffer initially holds bib {r2}, book {r3, r5, r6}, title {r5, r7}, and author {r5}. Role sets shown in later frames: {r5, r6}, {r7}, {}, {r6}.]
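The effect of signOff on the buffer can be sketched with a small Python model. It is an illustration under stated assumptions, not the GCX buffer manager: the `Node` class and the functions `sign_off` and `collect` are hypothetical, roles are modelled as plain sets, each signOff is applied to a single node, and a node is treated as garbage once it holds no roles and has no buffered children left.

```python
class Node:
    """A buffered document node annotated with roles."""
    def __init__(self, name, roles, parent=None):
        self.name = name
        self.roles = set(roles)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_garbage(self):
        # Assumption: removable once no roles remain and no children are buffered.
        return not self.roles and not self.children

def collect(node):
    """Remove a garbage node and walk upwards while ancestors become garbage too."""
    while node is not None and node.is_garbage():
        parent = node.parent
        if parent is not None:
            parent.children.remove(node)
        node.parent = None
        node = parent

def sign_off(node, role):
    """Take one role away from a node and trigger garbage collection."""
    node.roles.discard(role)
    collect(node)

# Buffer state from the slide.
bib = Node("bib", {"r2"})
book = Node("book", {"r3", "r5", "r6"}, bib)
author = Node("author", {"r5"}, book)
title = Node("title", {"r5", "r7"}, book)

# End of the $x iteration for this book: r3 and r5 are signed off.
sign_off(book, "r3")
for n in (book, author, title):
    sign_off(n, "r5")
print([c.name for c in book.children])   # author is gone, title still carries r7

# End of the $b iteration: r6 and r7 are signed off.
sign_off(title, "r7")
sign_off(book, "r6")
print([c.name for c in bib.children])    # book and title have been removed
```

In this model, memory is released as soon as the last role disappears, without waiting for the end of the enclosing loop; bib itself stays buffered until signOff($bib,r2) is reached.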

14 Optimizations
Rewrite path steps to for-expressions
Use aggregated roles
Remove redundant roles
Original query:
  { for $bib in /bib return $bib/book }
With role updates:
  { for $bib in /bib return
    ( $bib/book,
      signOff($bib,r1),
      signOff($bib/book/dos::node(),r2) ) }
After rewriting the path step to a for-expression:
  { for $bib in /bib return
      for $_1 in $bib/book return $_1/book }
With role updates:
  { for $bib in /bib return
    ( for $_1 in $bib/book return
      ( $_1/book,
        signOff($_1/book/dos::node(),r2) ),
      signOff($bib,r1) ) }

15 The GCX XQuery Engine
Garbage Collected XQuery
Implemented in C++ for a fragment of composition-free XQuery
– Arbitrarily nested single-step for-loops
– FWR-expressions
– Child and descendant axes
– Node tests for tags, wildcards, node(), text()
– If-expressions with and, or, not, fn:exists
– Let/some-expressions and aggregations not yet supported
– No support for attributes (no restriction)
Open Source (Berkeley Software Distribution Licence)
GCX project page:
GCX download page:

16 Benchmark Results (1)
Time and memory consumption
Queries and documents from the XMark Benchmark
Queries and documents modified to match the supported fragment
3GHz Intel Pentium IV CPU with 2GB RAM
SuSE Linux 10.0, J2RE v1.4.2 for Java-based systems
Time limit: 1 hour
Benchmarks against the following systems
– FluX
  Java in-memory engine for streaming XQuery evaluation.
– MonetDB v4.12.0/XQuery v
  A secondary storage engine written in C++. Loading of the document is included in time measurements.
– QizX/open v1.1
  Free in-memory XQuery engine written in Java.
– Saxon v8.7.1
  Free in-memory XQuery engine written in Java.

17 Benchmark Results (2)
XMark Q1:
  { for $s in /site return
      for $p in $s/people return
        for $pe in $p/person return
          if ($pe/person_id = "person0") then { $pe/name } else () }
[Plot: running time (s)]

18 Benchmark Results (3)
XMark Q1:
  { for $s in /site return
      for $p in $s/people return
        for $pe in $p/person return
          if ($pe/person_id = "person0") then { $pe/name } else () }
[Plot: memory consumption (MB)]

19 Benchmark Results (4)
XMark Q8:
  { for $root in (/) return
      for $site in $root/site return
        for $people in $site/people return
          for $person in $people/person return
            { ( { $person/name },
                { for $site2 in $root/site return
                    for $cas in $site2/closed_auctions return
                      for $ca in $cas/closed_auction return
                        for $buyer in $ca/buyer return
                          if ($buyer/buyer_person = $person/person_id) then { $ca } else () } ) } }

20 Benchmark Results (5)
XMark Q8
[Plots: running time (s) and memory consumption (MB)]
Failure for 100MB: MonetDB
Failure for 200MB: GCX, FluxQuery, MonetDB

21 Summary
Combination of static and dynamic buffer minimization
Roles are derived from the XQuery and assigned to matching document nodes in the preprojection phase
XQuery expression statically rewritten: at runtime, signOff-statements cause buffered nodes to lose roles
An active garbage collection mechanism removes nodes from buffers that have lost their last role
Document projection integrated in the role concept
Technique behaves very well for composition-free XQuery w.r.t. execution time and memory consumption
Applicable in streaming contexts, but also useful for common in-memory XQuery engines

22 Thank you for your attention!

Z. Bar-Yossef, M. Fontoura, and V. Josifovski, “On the Memory Requirements of XPath Evaluation over XML Streams”, In Proc. PODS’04, pages 177–188, 2004
M. Benedikt, W. Fan, and F. Geerts, “XPath Satisfiability in the Presence of DTDs”, In Proc. PODS, pages 25–36, 2005
V. Benzaken, G. Castagna, D. Colazzo, and K. Nguyen, “Type-Based XML Projection”, In Proc. VLDB’06, 2006
S. Bréssan, B. Catania, Z. Lacroix, Y. G. Li, and A. Maddalena, “Accelerating Queries by Pruning XML Documents”, TKDE, 54(2):211–240, 2005
L. Fegaras, R. Dash, and Y. Wang, “A Fully Pipelined XQuery Processor”, In XIME-P, 2006
L. Fegaras, D. Levine, S. Bose, and V. Chaluvadi, “Query Processing of Streamed XML Data”, In Proc. CIKM 2002, pages 126–133, 2002
T. J. Green, G. Miklau, M. Onizuka, and D. Suciu, “Processing XML Streams with Deterministic Automata”, In Proc. ICDT’03, pages 173–189, 2003
C. Koch, “On the complexity of nonrecursive XQuery and functional query languages on complex values”, ACM Transactions on Database Systems, 31(4), 2006
C. Koch, S. Scherzinger, N. Schweikardt, and B. Stegmaier, “Schema-based Scheduling of Event Processors and Buffer Minimization for Queries on Structured Data Streams”, In Proc. VLDB’04, pages 228–239, 2004
X. Li and G. Agrawal, “Efficient evaluation of XQuery over streaming data”, In Proc. VLDB’05, pages 265–276, 2005
A. Marian and J. Siméon, “Projecting XML Documents”, In Proc. VLDB’03, pages 213–224, 2003
D. Olteanu, H. Meuss, T. Furche, and F. Bry, “XPath: Looking Forward”, In EDBT 02: Proceedings of the Workshops XMLDM, MDDE, and YRWS on XML-Based Data Management and Multimedia Engineering-Revised Papers, pages 109–127, 2002
D. Olteanu, T. Kiesling, and F. Bry, “An Evaluation of Regular Path Expressions with Qualifiers against XML Streams”, In Proc. ICDE’03, page 702, 2003
H. Su, E. A. Rundensteiner, and M. Mani, “Semantic Query Optimization for XQuery over XML Streams”, In Proc. VLDB, pages 277–288, 2005
P. R. Wilson, “Uniprocessor Garbage Collection Techniques”, In Proc. IWMM’92, pages 1–42, 1992

24 Additional Resources

25 Full Benchmark Results
Each cell shows running time / memory consumption.
Query | Size  | GCX             | FluxQuery       | Galax           | MonetDB         | Saxon           | Qizx/open
Q1    | 10MB  | 0.18s / 1.2MB   | 1.59s / 50MB    | 5.45s / 186MB   | 0.86s / 30MB    | 1.48s / 80MB    | 1.20s / 38MB
Q1    | 50MB  | 0.92s / 1.2MB   | 3.96s / 111MB   | 42.33s / 880MB  | 3.69s / 98MB    | 4.29s / 292MB   | 3.74s / 195MB
Q1    | 100MB | 1.87s / 1.2MB   | 6.94s / 111MB   | 02:07m / 1.8GB  | 7.19s / 225MB   | 7.96s / 547MB   | 6.56s / 285MB
Q1    | 200MB | 3.53s / 1.2MB   | 12.27s / 111MB  | timeout         | 13.60s / 244MB  | 14.30s / 973MB  | 11.82s / 480MB
Q6    | 10MB  | 0.34s / 1.2MB   | n/a             | 7.66s / 240MB   | 0.98s / 29MB    | 1.73s / 82MB    | 1.56s / 33MB
Q6    | 50MB  | 1.68s / 1.2MB   | n/a             | 57.98s / 1.2GB  | 5.06s / 111MB   | 5.78s / 292MB   | 6.13s / 169MB
Q6    | 100MB | 3.33s / 1.2MB   | n/a             | 05:08m / 2GB    | 9.94s / 253MB   | 10.85s / 622MB  | 11.74s / 484MB
Q6    | 200MB | 6.42s / 1.2MB   | n/a             | timeout         | 19.95s / 337MB  | 20.14s / 1.2GB  | 20.33s / 805MB
Q8    | 10MB  | 13.15s / 9.8MB  | 18.04s / 128MB  | 01:04m / 377MB  | 02:56m / 407MB  | 6.61s / 145MB   | 9.89s / 148MB
Q8    | 50MB  | 05:13m / 43MB   | 06:51m / 169MB  | 33:08m / 1.8GB  | 03:26m / 1.35GB | 02:02m / 352MB  | 03:38m / 265MB
Q8    | 100MB | 22:07m / 86MB   | 27:01m / 216MB  | timeout         | -               | 08:39m / 650MB  | 14:27m / 397MB
Q8    | 200MB | timeout         | timeout         | timeout         | -               | 32:43m / 1.15GB | 52:05m / 636MB
Q13   | 10MB  | 0.17s / 1.2MB   | 1.60s / 52MB    | 5.92s / 182MB   | 0.80s / 31MB    | 1.53s / 48MB    | 1.26s / 28MB
Q13   | 50MB  | 0.85s / 1.2MB   | 3.98s / 111MB   | 43.91s / 899MB  | 3.64s / 98MB    | 4.45s / 292MB   | 3.85s / 195MB
Q13   | 100MB | 1.69s / 1.2MB   | 7.00s / 111MB   | 02:04m / 1.8GB  | 7.34s / 224MB   | 8.35s / 547MB   | 6.81s / 285MB
Q13   | 200MB | 3.24s / 1.2MB   | 12.33s / 111MB  | timeout         | 13.52s / 271MB  | 15.02s / 1.05GB | 12.30s / 480MB
Q20   | 10MB  | 0.25s / 1.2MB   | 1.65s / 48MB    | 6.95s / 215MB   | 0.85s / 34MB    | 1.65s / 62MB    | 1.43s / 39MB
Q20   | 50MB  | 1.24s / 1.2MB   | 4.19s / 111MB   | 53.08s / 1.5GB  | 4.17s / 120MB   | 4.90s / 292MB   | 4.18s / 195MB
Q20   | 100MB | 2.48s / 1.2MB   | 7.37s / 111MB   | 03:14m / 2GB    | 8.47s / 247MB   | 9.13s / 622MB   | 8.71s / 350MB
Q20   | 200MB | 4.74s / 1.2MB   | 13.14s / 111MB  | timeout         | 16.40s / 296MB  | 16.58s / 1.15GB | 15.80s / 628MB

26 Benchmark Queries (1)
XMark Q1:
  { for $s in /site return
      for $p in $s/people return
        for $pe in $p/person return
          if ($pe/person_id = "person0") then { $pe/name } else () }
XMark Q6:
  { for $site in //site return
      for $regions in $site/regions return
        $regions//item }

27 Benchmark Queries (2)
XMark Q8:
  { for $root in (/) return
      for $site in $root/site return
        for $people in $site/people return
          for $person in $people/person return
            { ( { $person/name },
                { for $site2 in $root/site return
                    for $cas in $site2/closed_auctions return
                      for $ca in $cas/closed_auction return
                        for $buyer in $ca/buyer return
                          if ($buyer/buyer_person = $person/person_id) then { $ca } else () } ) } }

28 Benchmark Queries (3)
XMark Q13:
  { for $site in /site return
      for $regions in $site/regions return
        for $australia in $regions/australia return
          for $item in $australia/item return
            { ( { $item/name }, { $item/description } ) } }

29 Benchmark Queries (4)
XMark Q20:
  { for $site in /site return
      for $people in $site/people return
        for $person in $people/person return
          if (fn:not(fn:exists($person/person_income))) then $person else () }

30 Buffer Plot (1)
XMark Q6:
  { for $site in //site return
      for $regions in $site/regions return
        $regions//item }
[Plot: buffer plot for XMark Q6 on 10MB input document]
According to the DTD: all regions occur at the beginning of the document

31 Buffer Plot (2)
XMark Q8:
  { for $root in (/) return
      for $site in $root/site return
        for $people in $site/people return
          for $person in $people/person return
            { ( { $person/name },
                { for $site2 in $root/site return
                    for $cas in $site2/closed_auctions return
                      for $ca in $cas/closed_auction return
                        for $buyer in $ca/buyer return
                          if ($buyer/buyer_person = $person/person_id) then { $ca } else () } ) } }
[Plot: buffer plot for XMark Q8 on 10MB input document]
First partition of join partners: persons
Second partition of join partners: buyers

32 Buffer Plot (3)
XQuery:
  { for $bib in /bib return
    ( for $x in $bib/* return
        if (not(exists($x/price))) then $x else (),
      for $b in $bib/book return $b/title ) }
[Figure: document structure bib (book|article)*, with title, author, and price elements]
[Plots: buffer plots for two input documents, labelled 9 x article + 1 x book and 9 x book + 1 x article]

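33 The GCX Runtime Engine
[Figure: runtime components Stream Preprojector, Buffer Manager, Buffer, and Evaluator. The Evaluator executes the XQuery and writes the output stream; the Stream Preprojector reads the input stream. Labelled interactions: getNext($x/π) answered with node/NULL; signOff($x/π,r) answered with OK; nextNode() answered with node/eos; nodes/roles, node lookup, and garbage collection on the Buffer.]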

34 System Architecture
[Figure: system architecture. Components: XQuery, Normalized XQuery, Rewritten XQuery (role updates), Projection Paths, Roles, Projection DFA (constructed lazily, assigns roles), Stream Preprojector, Buffer (nodes & roles), Evaluator, input stream, output stream. The Evaluator sends role updates to the Buffer and reads its input from it.]