M.Kersten 2008 1 MonetDB, a Column-Store in Midflight Martin Kersten CWI Amsterdam.

Presentation transcript:

MonetDB, a Column-Store in Midflight
Martin Kersten, CWI Amsterdam

Database view of the world 25 years ago…
Design and Analysis of a Relational Join Operation for VLSI
The "Group By" Operation in Relational Algebra
Implementing Unknown and Imprecise Values in Databases
Finding an Optimal Search Sequence of Files
Modelling Events a Data Base Application Design
Deferring Updates in a Relational Data Base System

Where is the field heading to
Mike Stonebraker, VLDB 2007: One size fits all: A concept whose time has come and gone
Martin Kersten, ICDE 2008: Mike is wrong… Every size fits him always… Even a Euro solution.

If you want to play with a generic column-store, we recommend you download the academic version of Vertica, the commercialization of the C-Store project, or download MonetDB, an open-source column-store.

Paste Present Potency
Cracking Columns Chaos

Paste Present Potency
PAX stores
N-ary stores
Column stores

Early 80s: tuple storage structures for PCs were simple
John 32 Houston OK
Mary 31 Houston OK
Easy to access at the cost of wasted space
Try to keep things simple

Slotted pages
Logical pages equated physical pages
32 John Houston
31 Mary Houston
Try to keep things simple

Slotted pages
Logical pages equated multiple physical pages
32 John Houston
31 Mary Houston
Try to keep things simple

Not all attributes are equally important
Avoid things you don't always need

A column orientation is as simple and acts like an array
Attributes of a tuple are correlated by offset
Avoid moving too much around
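
The idea that attributes of a tuple are correlated by offset can be sketched in a few lines of Python. This is only an illustration, not MonetDB code: the dictionary layout and the helper names `reconstruct` and `scan_one_attribute` are assumptions made for the example, and the data is the two-tuple table from the earlier slides.

```python
# Hypothetical column-wise layout of the John/Mary table from the slides.
columns = {
    "name": ["John", "Mary"],
    "age": [32, 31],
    "city": ["Houston", "Houston"],
}


def reconstruct(columns, offset):
    """Rebuild one tuple by reading the same offset in every column."""
    return tuple(col[offset] for col in columns.values())


def scan_one_attribute(columns, attr):
    """A query on a single attribute touches only that column."""
    return list(columns[attr])


second_tuple = reconstruct(columns, 1)
ages = scan_one_attribute(columns, "age")
```

No per-tuple key is stored; the array position alone ties the values of one tuple together.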

MonetDB Binary Association Tables
Try to keep things simple

Physical data organization
Binary Association Tables
Bat Unit fixed size
Dense sequence
Memory mapped files
Try to avoid doing things twice

Binary Association Tables accelerators
Hash-based access
Column properties: key-ness, non-null, dense, ordered
Try to avoid doing things twice

Binary Association Tables storage control
A BAT can be used as an encoding table
A VID datatype can be used to represent dense enumerations
Type remappings are used to squeeze space
Try to avoid doing things twice
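
As a rough illustration of using a table as an encoding table, the sketch below dictionary-encodes a column: each distinct value is stored once, and the column itself keeps only small integer codes. The function names and the sample city values are assumptions for the example; the position in `encoding_table` plays the role of a dense, implicit identifier.

```python
def dictionary_encode(values):
    """Return (encoding_table, codes); codes index into encoding_table."""
    encoding_table = []   # code -> value; the list position is the dense id
    code_of = {}          # value -> code, used only while encoding
    codes = []
    for v in values:
        if v not in code_of:
            code_of[v] = len(encoding_table)
            encoding_table.append(v)
        codes.append(code_of[v])
    return encoding_table, codes


def dictionary_decode(encoding_table, codes):
    """Map every code back to its original value."""
    return [encoding_table[c] for c in codes]


city = ["Houston", "Houston", "Austin", "Houston"]   # hypothetical data
table, codes = dictionary_encode(city)
```

A column with few distinct values then needs only as many bits per entry as the code range requires, which is one way to "squeeze space".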

Column orientation benefits datawarehousing
Brings a much tighter packaging and improves transport through the memory hierarchy
Each column can be more easily optimized for storage using compression schemes
Each column can be replicated for read-only access
Mantra: Try to keep things simple

Paste Present Potency
Materialize All Model
Vectorized model
Volcano model
Try to maximize performance

Volcano Refresher
Query: SELECT name, salary*.19 AS tax FROM employee WHERE age > 25
Try to maximize performance

Volcano Refresher
Operators: iterator interface
- open()
- next(): tuple
- close()
Try to maximize performance

Volcano paradigm
The Volcano model is based on a simple pull-based iterator model for programming relational operators.
The Volcano model minimizes the amount of intermediate store
The Volcano model is CPU intensive and inefficient
Try to maximize performance
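
A minimal sketch of the pull-based iterator interface, applied to the refresher query. The class names, the `None` end-of-stream convention, and the employee rows (including the salaries and the third tuple) are assumptions made for illustration.

```python
class Scan:
    """Leaf operator: hands out one stored tuple per next() call."""
    def __init__(self, rows):
        self.rows = rows
        self.pos = 0

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None                      # None signals end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class Select:
    """Pulls tuples from its child until one satisfies the predicate."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate

    def open(self):
        self.child.open()

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()


class Project:
    """Applies a function to every tuple pulled from its child."""
    def __init__(self, child, fn):
        self.child, self.fn = child, fn

    def open(self):
        self.child.open()

    def next(self):
        row = self.child.next()
        return None if row is None else self.fn(row)

    def close(self):
        self.child.close()


def run(plan):
    """Drive the plan from the root, one tuple at a time."""
    plan.open()
    out = []
    row = plan.next()
    while row is not None:
        out.append(row)
        row = plan.next()
    plan.close()
    return out


employee = [
    {"name": "John", "age": 32, "salary": 1000.0},
    {"name": "Mary", "age": 31, "salary": 2000.0},
    {"name": "Ann", "age": 22, "salary": 3000.0},
]
plan = Project(Select(Scan(employee), lambda r: r["age"] > 25),
               lambda r: (r["name"], r["salary"] * 0.19))
result = run(plan)
```

Every tuple costs several method calls on its way to the root, which is the per-tuple interpretation overhead the slide calls CPU intensive; in exchange, nothing is buffered between operators.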

MonetDB paradigm
The MonetDB kernel is a programmable relational algebra machine
Relational operators operate on 'array'-like structures
Based on experiences in database machines 25 years ago: RAP, CASSM, … ICL RAP … PRISMA … IDIOMS (J. Kerridge) …
Try to use a simple software pattern

[Diagram: SQL, XQuery, MAL, MonetDB Server, MonetDB Kernel]
select count(*) from photoobjall;

    function user.s3_1():void;
        X1:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",0);
        X6:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",1);
        X9:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",2);
        X13:bat[:oid,:oid] := sql.bind_dbat("sys","photoobjall",1);
        X8 := algebra.kunion(X1,X6);
        X11 := algebra.kdifference(X8,X9);
        X12 := algebra.kunion(X11,X9);
        X14 := bat.reverse(X13);
        X15 := algebra.kdifference(X12,X14);
        X16 :=
        X18 := algebra.markT(X15,X16);
        X19 := bat.reverse(X18);
        X20 := aggr.count(X19);
        sql.exportValue(1,"sys.","count_","int",32,0,6,X20,"");
    end s3_1;

Try to use a simple software pattern

Operator implementation
All algebraic operators materialize their result
Local optimization decisions
Heavy use of code expansion to reduce cost
55 selection routines
149 unary operations
335 join/group operations
134 multi-join operations
72 aggregate operations
Try to use a simple software pattern
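
To contrast with the Volcano iterators, here is a hedged sketch of operator-at-a-time processing for the same refresher query, where every operator consumes whole columns and materializes its full result. The function names only loosely echo algebra operators and are not MonetDB's API; the sample columns are invented.

```python
def select_gt(column, threshold):
    """Materialize the offsets of all values greater than threshold."""
    return [i for i, v in enumerate(column) if v > threshold]


def fetch(column, offsets):
    """Materialize the values found at the given offsets."""
    return [column[i] for i in offsets]


def multiply(column, factor):
    """Materialize a new column with every value scaled by factor."""
    return [v * factor for v in column]


def query(name, salary, age):
    """SELECT name, salary*.19 AS tax FROM employee WHERE age > 25."""
    qualifying = select_gt(age, 25)          # intermediate result 1
    names = fetch(name, qualifying)          # intermediate result 2
    taxes = multiply(fetch(salary, qualifying), 0.19)
    return names, taxes


# Hypothetical employee columns.
name = ["John", "Mary", "Ann"]
salary = [1000.0, 2000.0, 3000.0]
age = [32, 31, 22]
names, taxes = query(name, salary, age)
```

Each operator is a tight loop over an array with no per-tuple dispatch, at the price of storing every intermediate result in full.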

DBtapestry tables
The first column is an ordered sequence
All other columns are permutations
Organize them as a column store
[Figure: tapestry table with columns A0 to A8]

Micro-benchmark
System, cost in milliseconds/10K, fixed cost in ms:
MonetDB/SQL 0.34 N 44
MySQL 25.1 N 238
PostgreSQL 10.6 N 1230
Commercial N 800
Commercial 217 N 150
Keeping the query result in a new table is often too expensive
    select * into tmp from tapestry where attr1>=0 and attr1
    create table tmp( attr0 int, attr1 int);
    insert into tmp select * from tapestry where attr1>=0 and attr1 ;

Multi-column tapestry
Experiments ran on Athlon 1.4, Linux
[Figure: ms versus #joins for a commercial system and MonetDB/SQL]

A column store should be designed from scratch to benefit from its characteristics
Simulation of a column store on top of an n-ary system using the Volcano model does not work

Paste Present Potency
Execution Paradigm
Database Structures
Query optimizer
Try to maximize performance

Applications have different characteristics
Platforms have different characteristics
The actual state of computation is crucial
A generic all-encompassing optimizer cost model does not work
Try to avoid the search space trap

[Diagram: SQL, XQuery, MAL, MonetDB Server, MonetDB Kernel]
Operational optimizer:
- Exploit everything you know at runtime
- Re-organize if necessary
Try to disambiguate decisions

[Diagram: SQL, XQuery, MAL, MonetDB Server, MonetDB Kernel]
Strategic optimizer:
- Exploit the semantics of the language
- Rely on heuristics
Operational optimizer:
- Exploit everything you know at runtime
- Re-organize if necessary
Try to disambiguate decisions

[Diagram: SQL, XQuery, MAL, MonetDB Server, Tactical Optimizer, MonetDB Kernel]
Tactical MAL optimizer:
- No changes in front-ends and no direct human guidance
- Minimal changes in the engine

    y1:bat[:oid,:dbl]:= bpm.take("sys_photoobjall_ra");
    y2 := bpm.new(:oid,:oid);
    barrier rs:= bpm.newIterator(y1,A0,A1);
        t1:= algebra.uselect(rs,A0,A1);
        bpm.addSegment(y2,t1);
        redo rs:= bpm.hasMoreElements(y1,A0,A1);
    exit rs;

    x1:bat[:oid,:dbl]:= sql.bind("sys","photoobjall","ra",0);
    x14:= algebra.uselect(x1,A0,A1);

Try to disambiguate decisions

Code Inliner. Constant Expression Evaluator. Accumulator Evaluations. Strength Reduction. Common Term Optimizer. Join Path Optimizer. Ranges Propagation. Operator Cost Reduction. Foreign Key handling. Aggregate Groups. Code Parallelizer. Replication Manager. Result Recycler. MAL Compiler. Dynamic Query Scheduler. Memo-based Execution. Vector Execution. Alias Removal. Dead Code Removal. Garbage Collector.
Try to disambiguate decisions

Paste Present Potency
Execution Paradigm
Database Structures
Query optimizer
Try to maximize performance

Execution paradigms
The MonetDB kernel is set up to accommodate different execution engines
The MonetDB assembler program is
- Interpreted in the order presented
- Interpreted in a dataflow driven manner
- Compiled into a C program
- Vectorised processing (X100 project)
No data from persistent store to the memory trash

MonetDB/x100
Combine Volcano model with vector processing.
All vectors together should fit the CPU cache
Vectors are compressed
Optimizer should tune this, given the query characteristics.
[Diagram: X100 query engine, CPU cache, RAM, ColumnBM (buffer manager), networked ColumnBM-s]

Varying the vector size on TPC-H query 1
[Figure labels: mysql, oracle, db2; X100; MonetDB; low IPC, overhead; RAM bandwidth bound]
No data from persistent store to the memory trash

Vectorized-Volcano processing can be used for both multi-core and distributed processing
The architecture and the parameters are influenced heavily by
- Hardware characteristics
- Data distribution to compress columns
No data from persistent store to the memory trash
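
The combination of the Volcano model with vector processing can be sketched with Python generators: operators still pull from their children, but each call moves a small vector of values instead of one tuple. This is an illustrative sketch only, not X100 code; the operator names, the `vector_size` parameter, and the sample data are assumptions, and no compression is modelled.

```python
def scan(columns, vector_size):
    """Yield the table as a stream of small column slices (vectors)."""
    n = len(next(iter(columns.values())))
    for start in range(0, n, vector_size):
        yield {name: col[start:start + vector_size]
               for name, col in columns.items()}


def select_age_gt(vectors, threshold):
    """Keep only the positions of each vector where age > threshold."""
    for vec in vectors:
        keep = [i for i, a in enumerate(vec["age"]) if a > threshold]
        if keep:
            yield {name: [col[i] for i in keep] for name, col in vec.items()}


def project_tax(vectors, rate):
    """Compute tax for a whole vector in one tight loop."""
    for vec in vectors:
        yield {"name": vec["name"], "tax": [s * rate for s in vec["salary"]]}


def run(columns, vector_size):
    """SELECT name, salary*.19 AS tax FROM employee WHERE age > 25."""
    names, taxes = [], []
    pipeline = project_tax(select_age_gt(scan(columns, vector_size), 25), 0.19)
    for vec in pipeline:
        names.extend(vec["name"])
        taxes.extend(vec["tax"])
    return names, taxes


# Hypothetical employee columns.
employee = {
    "name": ["John", "Mary", "Ann", "Bob"],
    "age": [32, 31, 22, 40],
    "salary": [1000.0, 2000.0, 3000.0, 4000.0],
}
small = run(employee, vector_size=1)     # degenerates to tuple-at-a-time
medium = run(employee, vector_size=2)
large = run(employee, vector_size=100)   # degenerates to full materialization
```

The vector size is the tuning knob: very small vectors bring back per-tuple overhead, while very large ones no longer fit the CPU cache.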

Does MonetDB stand a 'real' test?
Is the main memory orientation a bottleneck?
Is it functionally complete?
The proof of the pudding is in the eating

TPC-H
ATHLON X (2000mhz), 2 disks in raid 0, 2G main memory
TPC-H 60K rows line_item table
Comfortably fit in memory
Performance in milliseconds

TPC-H
ATHLON X (2000mhz), 2 disks in raid 0, 2G main memory
Scale-factor 1, 6M row line-item table
Out of the box performance
Queries produce empty or erroneous results

TPC-H
ATHLON X (2000mhz), 2 disks in raid 0, 2G main memory

Code base for MonetDB/SQL is 1.2M lines of C
Nightly regression testing on 17 platforms

Paste Present Potency
Cracking
B-tree, Hash Indices
Materialized Views
Try to maximize performance

Indices in database systems focus on:
- All tuples are equally important for fast retrieval
- There are ample resources to maintain indices
MonetDB cracks the database into pieces based on actual query load
Find a trusted fortune teller

Cracking algorithms
Physical reorganization happens per column based on selection predicates.
Split a piece of a column in two new pieces: A<10, A>=10
Split a piece of a column in three new pieces: A<5, 5<A<10, A>=10
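
The two splitting steps can be sketched as in-place partitioning, much like the partition step of quicksort. This is an illustrative sketch rather than MonetDB's implementation; the function names, the half-open bounds (`low <= value < high`), and the sample column are assumptions.

```python
def crack_in_two(col, lo, hi, pivot):
    """Reorder col[lo:hi] in place so that values < pivot come first.

    Returns the position of the first value >= pivot.
    """
    i, j = lo, hi - 1
    while i <= j:
        if col[i] < pivot:
            i += 1
        else:
            col[i], col[j] = col[j], col[i]
            j -= 1
    return i


def crack_in_three(col, lo, hi, low, high):
    """Reorder col[lo:hi] into: < low | low <= v < high | >= high.

    Returns the start positions of the middle and the upper piece.
    """
    mid = crack_in_two(col, lo, hi, low)
    upper = crack_in_two(col, mid, hi, high)
    return mid, upper


# Hypothetical integer column; A>5 and A<10 equals 6 <= A < 10 on integers.
column = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6]
original = list(column)
mid, upper = crack_in_three(column, 0, len(column), 6, 10)
answer = column[mid:upper]
```

After the call the query answer is a contiguous slice of the column, and the two boundaries are what a cracker index would remember.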

Cracking example
select A>5 and A<10
The column is cracked into pieces: <=5, >5 and <10, >=10
Improve data access for future queries
select A>3 and A<14
The column is cracked further into pieces: <=3, >3, >5, >=10, >=14
The more we crack the more we learn

Design
The first time a range query is posed on an attribute A, a cracking DBMS makes a copy of column A, called the cracker column of A
A cracker column is continuously physically reorganized based on queries that need to touch attribute A, such that the result is in a contiguous space
For each cracker column, there is a cracker index
[Diagram: Cracker Index, Cracker Column]
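
A minimal sketch of this design, under stated assumptions: the cracker index is kept as two parallel sorted lists instead of a tree, range queries use half-open bounds, and the class and attribute names are invented for the example.

```python
import bisect


class CrackerColumn:
    """Copy of a base column that is reorganized by the queries it serves."""

    def __init__(self, base_column):
        self.values = list(base_column)  # the copy; the base stays untouched
        self.bounds = []                 # sorted values already cracked on
        self.starts = []                 # starts[k]: first position >= bounds[k]

    def _crack(self, pivot):
        """Make sure a piece boundary exists at pivot; return its position."""
        k = bisect.bisect_left(self.bounds, pivot)
        if k < len(self.bounds) and self.bounds[k] == pivot:
            return self.starts[k]        # index hit, no data movement
        # Only the single piece that contains pivot has to be partitioned.
        lo = self.starts[k - 1] if k > 0 else 0
        hi = self.starts[k] if k < len(self.bounds) else len(self.values)
        i, j = lo, hi - 1
        while i <= j:
            if self.values[i] < pivot:
                i += 1
            else:
                self.values[i], self.values[j] = self.values[j], self.values[i]
                j -= 1
        self.bounds.insert(k, pivot)
        self.starts.insert(k, i)
        return i

    def select(self, low, high):
        """Return all values v with low <= v < high as one contiguous slice."""
        start = self._crack(low)
        end = self._crack(high)
        return self.values[start:end]


base = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6]   # hypothetical data
cracker = CrackerColumn(base)
first = cracker.select(6, 10)     # A>5 and A<10 on integers
second = cracker.select(4, 14)    # A>3 and A<14 on integers
```

The second query only partitions the two outer pieces left by the first one, which is how earlier queries pay part of the cost of later ones.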

A simple range query
Try to avoid useless investments

TPC-H query 6
Try to avoid useless investments

Cracking is easy in a column store and is part of the critical execution path
Cracking works under high volume updates
Try to avoid useless investments

Updates
Base columns are updated as usual
We need to update the cracker column and the cracker index
- Efficiently
- Maintain the self-organization properties
Two issues:
- When
- How

When to propagate updates in cracking
Follow the workload to maintain self-organization
Updates become part of query processing
When an update arrives, it is not applied
For each cracker column there is a pending insertions column and a pending deletions column
Pending updates are applied only when a query needs the specific values

Updates aware select
We extended the cracker select operator to apply the needed updates before cracking
The select operator:
1. Search the pending insertions column
2. Search the pending deletions column
3. If Steps 1 or 2 find tuples run an update algorithm
4. Search the cracker index
5. Physically reorganize the cracker column
6. Update the cracker index
7. Return a slice of the cracker column

Merging
Cracker index:
Start position: 1 values: >1
Start position: 4 values: >12
Start position: 7 values: >35
Insert a new tuple with value 9
The new tuple belongs to the blue piece

Merging
Cracker index:
Start position: 1 values: >1
Start position: 5 values: >12
Start position: 8 values: >35
Insert a new tuple with value 9
The new tuple belongs to the blue piece
Pieces in the cracker column are ordered
Tuples inside a piece are not ordered
Shifting is not a viable solution

Merging by Hopping
Cracker index:
Start position: 1 values: >
Start position: 4 values: >12
Start position: 8 values: >35
Insert a new tuple with value 9
We need to make enough room to fit the new tuples

Merge Gradually
A query merges only the qualifying values, i.e., only the values that it needs for a correct and complete result
We avoid the large peaks but...
Average cost increases significantly
[Figure panels: Merge Completely, Merge Gradually]
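
The update-aware select and gradual merging can be combined in one small sketch. It is a simplification under several assumptions: only pending insertions are modelled (no pending deletions), the cracker index is a pair of sorted lists, bounds are half-open, and the hopping step is written as moving the first tuple of every later piece to the free slot behind that piece. All names are invented for the example.

```python
import bisect


class UpdatableCrackerColumn:
    def __init__(self, base_column):
        self.values = list(base_column)
        self.bounds = []      # sorted values already cracked on
        self.starts = []      # starts[k]: first position holding v >= bounds[k]
        self.pending = []     # pending insertions, not yet merged

    def insert(self, value):
        """An arriving update is only remembered, not applied."""
        self.pending.append(value)

    def _crack(self, pivot):
        k = bisect.bisect_left(self.bounds, pivot)
        if k < len(self.bounds) and self.bounds[k] == pivot:
            return self.starts[k]
        lo = self.starts[k - 1] if k > 0 else 0
        hi = self.starts[k] if k < len(self.bounds) else len(self.values)
        i, j = lo, hi - 1
        while i <= j:
            if self.values[i] < pivot:
                i += 1
            else:
                self.values[i], self.values[j] = self.values[j], self.values[i]
                j -= 1
        self.bounds.insert(k, pivot)
        self.starts.insert(k, i)
        return i

    def _hop_insert(self, value):
        """Place value in its piece by moving one tuple per later piece."""
        k = bisect.bisect_right(self.bounds, value)   # piece that owns value
        self.values.append(None)                      # free slot at the end
        hole = len(self.values) - 1
        for p in range(len(self.starts) - 1, k - 1, -1):
            first = self.starts[p]
            self.values[hole] = self.values[first]    # hop: first tuple to end
            hole = first
            self.starts[p] += 1                       # piece shifts by one
        self.values[hole] = value

    def select(self, low, high):
        """Merge only the pending values this query needs, then crack."""
        needed = [v for v in self.pending if low <= v < high]
        self.pending = [v for v in self.pending if not (low <= v < high)]
        for v in needed:
            self._hop_insert(v)
        start = self._crack(low)
        end = self._crack(high)
        return self.values[start:end]


col = UpdatableCrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
first = col.select(6, 10)
col.insert(9)
col.insert(29)
second = col.select(7, 15)    # merges 9, leaves 29 pending
```

Because tuples inside a piece are unordered, one move per later piece is enough; no piece has to be shifted as a whole.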

The Ripple
Touch only the pieces that are relevant for the current query
Immediately make room for the new tuples
Avoid shifting down non interesting pieces
Cracker index before the query:
Start position: 1 values: >1
Start position: 4 values: >22
Start position: 7 values: >35
Select 7<= A< 15
Pending insertions 29
Cracker index in the final frame:
Start position: 1 values: >
Start position: 5 values: >22
Start position: 7 values: >35

The Ripple
Maintain high performance through the whole query sequence in a self-organizing way
[Figure panels: Merge Gradually, Merge Completely, Merge Ripple]

MonetDB Research & Development line
XQuery and Information Retrieval
Astronomy databases (SkyServer)
Event Stream Engine
Relational Cache Effectiveness
Multi-core Parallelism
DataStorage Rings
RDF Engine
….
Array Databases become en vogue

Summary
MonetDB is a mature column store
Cracking based on physical re-organization in the critical path can be made to work
There is work for another decade

[Cartoon: Cstore, SQLserver, DB2, PostgreSQL, MySQL. Whoa MonetDB! Speed lines!]

Acknowledgements
Martin Kersten, Peter Boncz, Niels Nes, Stefan Manegold, Fabian Groffen, Sjoerd Mullender, Steffen Goeldner, Arjen de Vries, Menzo Windhouwer, Tim Ruhl, Romulo Goncalves, Jan Rittinger, Wouter Alink, Jennie Zhang, Stratos Idreos, Erietta Liarou, Lefteris Sidirourgos, Florian Waas, Albrecht Schmidt, Jonas Karlsson, Martin van Dinther, Peter Bosch, Carel van den Berg, Wilco Quak, Alex van Ballegooij, Johan List, Georgina Ramirez, Marcin Zukowski, Roberto Cornacchia, Sandor Heman, Torsten Grust, Jens Teubner, Maurice van Keulen, Jan Flokstra, Milena Ivanova
MonetDB/{SQL,XQuery}: open-source platform
MonetDB/SQL Version 5: experimentation platform
MonetDB/X100: aimed at cpu & IO squeezing
MonetDB/RAM: aimed at IR and science DB
MonetDB/Armada: aimed at evolving databases
MonetDB/SkyServer: portal for Astronomy