Download presentation
Presentation is loading. Please wait.
Published byVirgil Floyd Modified over 9 years ago
1
MonetDB/XQuery Technology Preview 1 Stefan Manegold CWI Amsterdam http://monetdb.cwi.nl/ - http://pathfinder-xquery.org/
2
European Pathfinder Team University of Konstanz (Germany) –Torsten Grust, Jens Teubner, Jan Rittinger University of Twente (Netherlands) –Maurice van Keulen, Jan Flokstra CWI, Amsterdam (Netherlands) –Peter Boncz, Stefan Manegold, Sjoerd Mullender Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
3
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
4
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery Results: Performance (1)
5
did not finish Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery Results: Performance (2)
6
Story XQuery Example Relational XQuery –System Architecture –XML Encoding Science & Reseach Scalability Outlook –Conclusions –Roadmaps –Release & References Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
7
For each author, return number of books and receipts for books published in the past 2 years, ordered by name let $cat := fn:doc(“www.bn.com/catalog.xml”), (: Documents :) $sales := fn:doc(“www.publishersweekly.com/sales.xml”) for $author in distinct-values($cat//author) (: Grouping :) let $books := $cat//book[@year >= 2003 and author = $a], (: Sel. :) $receipts := $sales/book[@isbn = $books/@isbn]/receipts (: Join :) order by $author (: Ordering :) return (: XML Construction :) { $author } { fn:count($books) } (: Aggregation :) { fn:sum($receipts) } Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery XQuery Example
8
For each author, return number of books and receipts for books published in the past 2 years, ordered by name let $cat := fn:doc(“www.bn.com/catalog.xml”), Documents $sales := fn:doc(“www.publishersweekly.com/sales.xml”) for $author in distinct-values($cat//author) Grouping let $books := $cat//book[@year >= 2003 and author = $a], Sel. $receipts := $sales/book[@isbn = $books/@isbn]/receipts Join order by $author Ordering return XML Construction { $author } { fn:count($books) } Aggregation { fn:sum($receipts) } Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery XQuery Example
9
XQuery Systems: 2 Approaches Existing “native” XML/XQuery systems are built from scratch –Galax, Saxon, … –X-Hive, Tamino, … –(Still have to) re-invent optimization technology Our approach: –Build XQuery system on top of an RDBMS –Leverage mature relational technology to achieve efficient XQuery processing Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
10
Architecture Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
11
XML in an RDBMS: XPath Accelerator Node-based relational encoding of XQuery's data model Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery 1.f /descendant:SELECT * FROM pre_post WHERE pre > f.pre AND post < f.post 2.f /ancester:SELECT * FROM pre_post WHERE pre f.post 3.f /preceeding:SELECT * FROM pre_post WHERE pre < f.pre AND post < f.post 4.f /following:SELECT * FROM pre_post WHERE pre > f.pre AND post > f.post
12
Science & Research More research lead to more optimization –Join Recognition –Embedded XPath processing –Order Awareness Various scientific publications Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
13
Results: Scalability (3) Unsurpassed scalability Standard Opteron PC, 8GB RAM, 64-bit Linux Can process 11GB documents! Mostly linear scaling with document size Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
14
Conclusions Relational approach Works Is fast Is scalable Crucial Optimizations –Join recognition –Embedded XPath processing –Order awareness Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
15
Roadmap 30-05-05: MonetDB/XQuery 4.8/0.8 “Mercurius” –Developers Release / Technology Preview 1 30-09-05: MonetDB/XQuery 4.10/0.10 “Venus” –Student Release / Technology Preview 2 –XUpdate, Algebraic Query Optimization 30-12-05: MonetDB/XQuery 4.12/1.12 “Mars” –Final Release –Application Programming Interfaces –End-User Front-Ends Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
16
Open Source Release MonetDB + Pathfinder on SourceForge –Mozilla-like License MonetDB homepage –http://monetdb.cwi.nl/ Pathfinder homepage –http://pathfinder-xquery.org/ Developers website –http://sf.net/projects/monetdb/ Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
17
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
18
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
19
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
20
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
21
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
22
Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
23
Outline Basic XML / XQuery Introduction of Pathfinder and MonetDB projects Relational XQuery –XPath steps in the pre/post plane –Translating for-loops, and beyond Optimizations –Order prevention –Loop-Lifted Staircase join –Join recognition Outlook –Conclusions –Roadmaps Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
24
Outline Basic XML / XQuery Introduction of Pathfinder and MonetDB projects Relational XQuery –XPath steps in the pre/post plane –Translating for-loops, and beyond Optimizations –Order prevention –Loop-Lifted Staircase join –Join recognition Outlook –Conclusions –Roadmaps Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
25
XML Standard, flexible syntax for data exchange –Regular, structured data Database content of all kinds: Inventory, billing, orders, … “Small” typed values –Irregular, unstructured text Documents of all kinds: Transcripts, books, legal briefs, … “Large” untyped values Lingua franca of B2B Applications… –Increase access to products & services –Integrate disparate data sources –Automate business processes … and numerous other application domains –Bio-informatics, library science, … Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
26
XML : A First Look XML document describing catalog of books No Such Thing as a Bad Day Hamilton Jordan Longstreet Press, Inc. 17.60 Publisher : This book is the moving account of one man's successful battles against three cancers... No Such Thing as a Bad Day is warmly recommended. Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
27
XQuery 1.0 Functional, strongly-typed query language XQuery 1.0 = XPath 2.0 for navigation, selection, extraction + A few more expressions For-Let-Where-Order By-Return (FLWOR) XML construction Operators on types + User-defined functions & modules + Strong typing Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
28
XSLT vs. XQuery XSLT 1.0: XML XML, HTML, Text –Loosely-typed scripting language –Format XML in HTML for display in browser –Must be highly tolerant of variability/errors in data XQuery 1.0: XML XML –Strongly-typed query language –Large-scale database access –Must guarantee safety/correctness of operations on data Over time, XSLT & XQuery may both serve needs of many application domains XQuery will become a hidden, commodity language Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
29
XQuery Example For each author, return number of books and receipts books published in past 2 years, ordered by name let $cat := fn:doc(“www.bn.com/catalog.xml“), Joinwww.bn.com/catalog.xml $sales := fn:doc(“www.publishersweekly.com/sales.xml“)www.publishersweekly.com/sales.xml for $author in distinct-values($cat//author) Grouping let $books := $cat//book[@year >= 2000 and author = $a], S.J. $receipts := $sales/book[@isbn = $books/@isbn]/receipts order by $author Ordering return XML Construction { $author } { fn:count($books) } Aggregation { fn:sum($receipts) } Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
30
Outline Basic XML / XQuery Introduction of Pathfinder and MonetDB projects Relational XQuery –XPath steps in the pre/post plane –Translating for-loops, and beyond Optimizations –Order prevention –Loop-Lifted Staircase join –Join recognition Outlook –Conclusions –Roadmaps Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
31
XQuery Systems: 2 Approaches Tree-based –Tree is basic data structure Also on disk (if an XQuery DBMS) –Navigational Approach Galax [Simeon..], Flux [Koch..], X-Hive –Tree Algebra Approach TIMBER [Jagadish..] Relational –Data shredded in relational tables –XQuery translated into database query (e.g. SQL) Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
32
The Pathfinder Project Challenge / Goal: –Turn RDBMSs into efficient XQuery engines People: –Torsten Grust, Jens Teubner University of Konstanz (June 2005: Technical University of Munich) –Maurice van Keulen University of Twente –Jan Rittinger University of Konstanz & CWI Task: generate code for MonetDB Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
33
The Pathfinder Project Challenge / Goal: –Turn RDBMSs into efficient XQuery engines People: –Torsten Grust, Jens Teubner,... University of Konstanz (June 2005: Technical University of Munich) –Maurice van Keulen, Jan Flokstra,... University of Twente –Jan Rittinger University of Konstanz & CWI Task: generate code for MonetDB –Peter Boncz, Stefan Manegold, Sjoerd Mullender,... CWI, Amsterdam Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
34
MonetDB: Applied CS Research at CWI a decade of “query-intensive” application experience image retrieval: Peter Bosch ImageSpotter audio/video retrieval: Alex van Ballegooij RAM XML text retrieval: de Vries / Hiemstra TIJAH biological sequences: Arno Siebes BRICKS XML databases: Albrecht Schmidt XMark Grust / vKeulen Pathfinder GIS: Wilco Quak MAGNUM data warehousing / OLAP / data mining SPSS DataDistilleries Univ. Massachussetts PROXIMITY CWI research group successfully spun off DataDistilleries (now SPSS) Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
35
Pathfinder — MonetDB Pathfinder MonetDB Parser Sem. Analysis Core Translation Typechecking Relational Algebra Database MIL SQL Parser Sem. Analysis Core Translation Typechecking Database MIL Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
36
Pathfinder — MonetDB Pathfinder MonetDB Parser Sem. Analysis Core Translation Typechecking Relational Algebra Database MIL SQL Core to MIL Translation Parser Sem. Analysis Core Translation Typechecking Database MIL Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
37
Open Source MonetDB + Pathfinder on Sourceforge – Mozilla License MonetDB Homepage – http://monetdb.cwi.nl/ http://monetdb.cwi.nl Pathfinder Homepage – http://pathfinder-xquery.org/ Developers website: – http://sf.net/projects/monetdb/ RoadMap 14-apr-04: initial Beta release MonetDB/SQL 30-sep-04: first official release MonetDB/SQL 30-may-05: Developer release of MonetDB/XQuery (i.e. Pathfinder) 30-sep-05: Student release of MonetDB/XQuery (incl. XUpdate) 30-dec-05: Users release of MonetDB/XQuery (?) Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
38
MonetDB: extensible architecture Front-end/back-end: support multiple data models support multiple end- user languages support diverse application domains Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
39
Front-end/back-end: support multiple data models support multiple end- user languages support diverse application domains Pathfinder XQuery Frontend MonetDB: extensible architecture Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
40
Architecture Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
41
The Architecture Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
42
Outline Basic XML / XQuery Introduction of Pathfinder and MonetDB projects Relational XQuery –XPath steps in the pre/post plane –Translating for-loops, and beyond Optimizations –Order prevention –Loop-Lifted Staircase join –Join recognition Outlook –Conclusions –Roadmaps Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
43
XPath on an RDBMS Node-based relational encoding of XQuery's data model Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
44
Pre/Post Pre/Level/Size done for better skipping and updates Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
45
Outline Basic XML / XQuery Introduction of Pathfinder and MonetDB projects Relational XQuery –XPath steps in the pre/post plane –Translating for-loops, and beyond Optimizations –Order prevention –Loop-Lifted Staircase join –Join recognition Outlook –Conclusions –Roadmaps Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
46
Order Prevention [VLDB03 Wang&Cherniack] define: Order properties of relations Order propagation rules for relational operators Decoration of physical plans with order properties eliminate sort Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
47
–For loop map with all combinations O(N*N) –If `simple’ condition exist on two loop variables join –Only make a map with the matching combinations –E.g. with Hash-Table O(N) Performed on the XCore tree Recognize if-then expressions Open question: where to optimize best?? Join Recognition for $p in $auction/site/people/person for $t in $auction/site/closed_auctions/closed_auction where $t/buyer/@person = $p/@id return $t Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
48
Loop-lifted staircase join Staircase join [VLDB03]: –Single-pass for a *set* of context nodes Loop-lifting multiple iters multiple sets of context nodes –elaborate skipping! –Loop-Lifted Staircase Join In a single pass: process multiple input context node lists –Use a stack –Exploit axis properties for pruning Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
49
Scalability Test platform Opteron 1.6GHz, 8GB RAM, 64-bit Linux (Fedora Core 3) Can process 11GB document! Mostly linear scaling with document size Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
50
Scalability Test platform Opteron 1.6GHz, 8GB RAM, 64-bit Linux (Fedora Core 3) Can process 11GB document! Mostly linear scaling with document size Some swapping in the join queries Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
51
Scalability Test platform Opteron 1.6GHz, 8GB RAM, 64-bit Linux (Fedora Core 3) Can process 11GB document! Mostly linear scaling with document size Some swapping in the join-queries Q11 + Q12 generate quadratic result Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
52
XMark 10MB : Pathfinder vs XHive & Galax Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
53
XMark 1GB: Pathfinder vs X-Hive did not finish Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
54
Outline Basic XML / XQuery Introduction of Pathfinder and MonetDB projects Relational XQuery –XPath steps in the pre/post plane –Translating for-loops, and beyond Optimizations –Order prevention –Loop-Lifted Staircase join –Join recognition Outlook –Conclusions –Roadmaps Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
55
Conclusions Relational approach can be scalable & fast Crucial Optimizations –Join recognition –Loop-lifted XPath steps –Order awareness Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
56
Conclusions Relational approach can be scalable & fast Crucial Optimizations –Join recognition –Loop-lifted XPath steps –Order awareness Future Roadmap (Scientific/Research): Algebraic Query Optimization Updates Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
57
Product Roadmap 30-05-05: MonetDB/XQuery 4.8/0.8 “Mercurius” –Developers Release / Technology Preview 1 30-09-05: MonetDB/XQuery 4.10/0.10 “Venus” –Student Release / Technology Preview 2 –XUpdate, Algebraic Query Optimization 30-12-05: MonetDB/XQuery 4.12/1.12 “Mars” –Final Release –Application Programming Interfaces –End-User Front-Ends Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
58
Loop-lifted staircase join document List of context nodesActive stack Multiple lists of context nodes Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
59
Staircase join document List of context nodes Stefan ManegoldHollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.