MonetDB/XQuery Technology Preview 1
Stefan Manegold
Centrum voor Wiskunde en Informatica, Amsterdam
European Pathfinder Team
CWI, Amsterdam (Netherlands) – Peter Boncz, Stefan Manegold, Sjoerd Mullender
University of Twente (Netherlands) – Maurice van Keulen, Jan Flokstra
University of Konstanz (Germany) – Torsten Grust, Jens Teubner, Jan Rittinger
Stefan Manegold HollandOpen, Amsterdam MonetDB/XQuery
Results: Performance (1)
XMark benchmark, 110 MB: MonetDB/XQuery vs. X-Hive & Galax
Results: Performance (2)
XMark benchmark, 1.1 GB: MonetDB/XQuery vs. X-Hive
Story
XQuery Example
Relational XQuery
–System Architecture
–XML Encoding
Science & Research
Scalability
Outlook
–Conclusions
–Roadmaps
–Release & References
XQuery Example
For each author, return the number of books and the receipts for books published in the past 2 years, ordered by name
let $cat := fn:doc(“ (: Documents :)
    $sales := fn:doc(“
for $author in distinct-values($cat//author) (: Grouping :)
let $books := >= 2003 and author = $author], (: Sel. :)
    $receipts := = (: Join :)
order by $author (: Ordering :)
return (: XML Construction :)
  { $author }
  { fn:count($books) } (: Aggregation :)
  { fn:sum($receipts) }
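The operations annotated in the query (grouping, selection, join, ordering, aggregation) can be mirrored in a few lines of Python. This is only an illustrative sketch: the document paths and the join condition are not legible on the slide, so the in-memory lists `catalog` and `sales`, the field names, and the `isbn` join key are all assumptions made for the example, and plain dictionaries stand in for the XML construction step.

```python
# Hypothetical stand-ins for the two documents loaded with fn:doc();
# the field names and the "isbn" join key are assumptions, not taken from the slide.
catalog = [
    {"isbn": "1", "year": 2004, "authors": ["Baker"]},
    {"isbn": "2", "year": 2001, "authors": ["Baker"]},
    {"isbn": "3", "year": 2003, "authors": ["Adams", "Baker"]},
]
sales = [
    {"isbn": "1", "receipt": 10.0},
    {"isbn": "1", "receipt": 5.0},
    {"isbn": "2", "receipt": 7.0},
    {"isbn": "3", "receipt": 20.0},
]


def report(catalog, sales, min_year=2003):
    """Per author: number of recent books and the sum of their receipts."""
    # Grouping: the distinct author names over the whole catalog.
    authors = {name for book in catalog for name in book["authors"]}
    result = []
    # Ordering: iterate over the authors sorted by name.
    for author in sorted(authors):
        # Selection: books by this author published in min_year or later.
        books = [b for b in catalog
                 if b["year"] >= min_year and author in b["authors"]]
        # Join: sales entries whose (assumed) key matches one of those books.
        keys = {b["isbn"] for b in books}
        receipts = [s["receipt"] for s in sales if s["isbn"] in keys]
        # Aggregation: count and sum (dicts replace XML construction here).
        result.append({"author": author,
                       "books": len(books),
                       "receipts": sum(receipts)})
    return result


rows = report(catalog, sales)
```

With the sample data, `rows` holds one entry per author in name order. An author with no qualifying books still appears, with a count of 0 and receipts of 0, which matches the behaviour of `fn:count` and `fn:sum` on empty sequences.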
XQuery Systems: 2 Approaches
Existing “native” XML/XQuery systems are built from scratch
–Galax, Saxon, …
–X-Hive, Tamino, …
–(Still have to) re-invent optimization technology
Our approach:
–Build XQuery system on top of an RDBMS
–Leverage mature relational technology to achieve efficient XQuery processing
Architecture
XML in an RDBMS: XPath Accelerator
Node-based relational encoding of XQuery's data model
f/following: SELECT * FROM pre_post WHERE pre > f.pre AND post > f.post
f/descendant: SELECT * FROM pre_post WHERE pre > f.pre AND post < f.post
f/preceding: SELECT * FROM pre_post WHERE pre < f.pre AND post < f.post
f/ancestor: SELECT * FROM pre_post WHERE pre < f.pre AND post > f.post
Similar queries for all 13 XPath axes
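The pre/post encoding can be tried out with a small, self-contained sketch. It assigns each element its preorder and postorder rank, loads the ranks into a `pre_post` table, and evaluates the four axis predicates as SQL range conditions. SQLite and Python's `xml.etree` are used purely for convenience here, and the toy document, the `tag` column, and the helper names `encode` and `axis` are assumptions made for illustration, not part of MonetDB/XQuery.

```python
import sqlite3
import xml.etree.ElementTree as ET


def encode(root):
    """Return (pre, post, tag) rows: preorder and postorder rank per element."""
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]          # rank when the element is entered
        counters["pre"] += 1
        for child in node:
            visit(child)
        post = counters["post"]        # rank when the element is left
        counters["post"] += 1
        rows.append((pre, post, node.tag))

    visit(root)
    return rows


# The four axis predicates, relative to a context node (:pre, :post).
AXES = {
    "following":  "pre > :pre AND post > :post",
    "descendant": "pre > :pre AND post < :post",
    "preceding":  "pre < :pre AND post < :post",
    "ancestor":   "pre < :pre AND post > :post",
}


def axis(conn, context_tag, name):
    """Tags reached from the (unique) element `context_tag` along axis `name`."""
    pre, post = conn.execute(
        "SELECT pre, post FROM pre_post WHERE tag = ?", (context_tag,)
    ).fetchone()
    sql = f"SELECT tag FROM pre_post WHERE {AXES[name]} ORDER BY pre"
    return [tag for (tag,) in conn.execute(sql, {"pre": pre, "post": post})]


# Hypothetical toy document; every tag is unique so a tag identifies a node.
DOC = "<a><b><c/><d/></b><e><f/><g/></e><h/></a>"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pre_post (pre INTEGER PRIMARY KEY, post INTEGER, tag TEXT)"
)
conn.executemany(
    "INSERT INTO pre_post VALUES (?, ?, ?)", encode(ET.fromstring(DOC))
)

for name in AXES:
    print(name, axis(conn, "e", name))
```

For context node `e`, the four regions partition the rest of the document: every other element falls into exactly one of following, descendant, preceding, or ancestor. Because each axis is a pair of range conditions on `pre` and `post`, an ordinary relational index can serve the lookup.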
Science & Research
More research led to more optimization
–Join recognition
–Embedded XPath processing
–Order awareness
Various scientific publications
Results: Scalability (3)
Unsurpassed scalability
Standard Opteron PC, 8 GB RAM, 64-bit Linux
Can process 11 GB documents!
Mostly linear scaling with document size
Conclusions
Relational approach
–Works
–Is fast
–Is scalable
Crucial optimizations
–Join recognition
–Embedded XPath processing
–Order awareness
Research turned into open-source release
Roadmap
: MonetDB/XQuery 4.8/0.8 “Mercurius”
–Developers Release / Technology Preview
: MonetDB/XQuery 4.10/0.10 “Venus”
–Student Release / Technology Preview 2
–XUpdate, More Optimization
: MonetDB/XQuery 4.12/1.12 “Mars”
–Final Release
–Application Programming Interfaces
–End-User Front-Ends
Open Source Release & References
MonetDB + Pathfinder on SourceForge
–Mozilla-like License
MonetDB homepage –
Pathfinder homepage –
Developers website –
You are welcome to join the MonetDB/XQuery team!
Results: Performance (4)
XMark performance in seconds: MonetDB/XQuery vs. Galax & X-Hive