Database Architecture Optimized for the New Bottleneck: Memory Access

Presentation transcript:

Database Architecture Optimized for the New Bottleneck: Memory Access
Peter Boncz (Data Distilleries B.V., Amsterdam, The Netherlands)
Stefan Manegold, Martin Kersten (CWI, Amsterdam, The Netherlands)

2 Contents
How Memory Access works
Simple Scan Experiment
Consequences for DBMS
  – Data Structures: vertical decomposition
  – Algorithms: tune random memory access
Partitioned Join Algorithms
  – Monet Experiments
  – Accurate Cost Models
Conclusion

3 CPU Speed vs. Memory Speed Moore’s Law: CPU speed doubles every 3 years

4 Memory Access in Hierarchical Systems

5 Simple Scan Experiment

6 Consequences for DBMS
Memory access is a bottleneck
Prevent cache & TLB misses
Cache lines must be used fully
DBMS must optimize
  – Data structures
  – Algorithms (focus: join)
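To make "cache lines must be used fully" concrete, the following minimal sketch counts how many distinct cache lines a scan reads when it needs one field from every record. The function name, the 64-byte line size, and the layout (records stored back to back from address 0, field at the start of each record) are illustrative assumptions, not values from the slides.

```python
def cache_lines_touched(num_records, record_width, field_size=1, line_size=64):
    """Count the distinct cache lines read when scanning one field per record.

    Assumption: records are laid out back to back starting at address 0,
    and the scanned field sits at the start of each record.
    """
    touched = set()
    for i in range(num_records):
        first_byte = i * record_width
        last_byte = first_byte + field_size - 1
        # A field may straddle a line boundary, so add every line it covers.
        for line in range(first_byte // line_size, last_byte // line_size + 1):
            touched.add(line)
    return len(touched)


# Scan a 4-byte field in 1,000 records, first narrow, then wide.
narrow = cache_lines_touched(1000, record_width=4, field_size=4)
wide = cache_lines_touched(1000, record_width=128, field_size=4)
```

Under these assumptions, the wide layout touches one cache line per record, while the narrow layout packs many useful values into each line it loads.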

7 Vertical Decomposition in Monet

8 Partitioned Joins
Cluster both input relations
Create clusters that fit in memory cache
Join matching clusters
Two algorithms:
  – Partitioned hash-join
  – Radix-Join (partitioned nested-loop)
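The three steps above (cluster both inputs, keep clusters small, join matching clusters) can be sketched as a partitioned hash-join. This is a simplified illustration rather than Monet's implementation: the function names, the tuple-based relations, and the fixed `num_clusters` parameter are assumptions, and a real system would size clusters to the cache rather than pick a count by hand.

```python
from collections import defaultdict


def partition(relation, key_index, num_clusters):
    """Cluster tuples so that equal keys always land in the same cluster."""
    clusters = [[] for _ in range(num_clusters)]
    for tup in relation:
        clusters[hash(tup[key_index]) % num_clusters].append(tup)
    return clusters


def hash_join(left, right, left_key, right_key):
    """Plain hash-join: build a table on `left`, probe it with `right`."""
    table = defaultdict(list)
    for tup in left:
        table[tup[left_key]].append(tup)
    return [(l, r) for r in right for l in table.get(r[right_key], [])]


def partitioned_hash_join(left, right, left_key=0, right_key=0, num_clusters=4):
    """Cluster both inputs, then hash-join only the matching cluster pairs."""
    left_clusters = partition(left, left_key, num_clusters)
    right_clusters = partition(right, right_key, num_clusters)
    result = []
    for lc, rc in zip(left_clusters, right_clusters):
        result.extend(hash_join(lc, rc, left_key, right_key))
    return result
```

Because both inputs are clustered with the same function, a tuple can only match tuples in the cluster with the same index, so each per-cluster hash table stays small.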

9 Partitioned Joins: Straightforward Clustering
Problem: Number of clusters exceeds number of
  – TLB entries ==> TLB thrashing
  – Cache lines ==> cache thrashing
Solution: Multi-pass radix-cluster
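One way to read the constraint above is as a small planning calculation: if a clustering needs 2^B clusters in total but a single pass should not create more clusters than some hardware limit, the B radix bits have to be spread over several passes. The helper below is a hypothetical sketch; the name `plan_passes` and the limit of 64 clusters per pass used in the example are assumptions chosen for illustration.

```python
def plan_passes(total_bits, max_clusters_per_pass):
    """Split `total_bits` radix bits over as few passes as possible.

    Each pass uses at most floor(log2(max_clusters_per_pass)) bits, so no
    pass creates more than `max_clusters_per_pass` clusters.
    Requires total_bits >= 1 and max_clusters_per_pass >= 2.
    """
    max_bits = max_clusters_per_pass.bit_length() - 1  # floor(log2(limit))
    passes = -(-total_bits // max_bits)                # ceiling division
    base, extra = divmod(total_bits, passes)
    # Spread the bits as evenly as possible over the passes.
    return [base + 1] * extra + [base] * (passes - extra)


# 4,096 clusters in total (12 bits), at most 64 clusters per pass (assumed limit).
bits_per_pass = plan_passes(12, 64)
```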

10 Partitioned Joins: Multi-Pass Radix-Cluster
Multiple clustering passes
Limit number of clusters per pass
Avoid cache/TLB thrashing
Trade memory cost for CPU cost
Any data type (hashing)
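A multi-pass radix-cluster along these lines can be sketched as follows. The sketch assumes non-negative integer keys and clusters on their lowest bits; for other data types the key would first be hashed to an integer, as the slide notes. The function name and the list-of-lists representation are illustrative assumptions; an actual implementation would work on contiguous arrays.

```python
def radix_cluster(keys, bits_per_pass):
    """Cluster integer keys on their lowest sum(bits_per_pass) bits.

    Pass p refines every existing cluster into 2**bits_per_pass[p]
    sub-clusters, starting with the most significant of the radix bits.
    The result has 2**sum(bits_per_pass) clusters, and cluster i holds
    exactly the keys whose radix bits equal i.
    """
    clusters = [list(keys)]
    shift = sum(bits_per_pass)
    for bits in bits_per_pass:
        shift -= bits
        mask = (1 << bits) - 1
        refined = []
        for cluster in clusters:
            buckets = [[] for _ in range(1 << bits)]
            for key in cluster:
                buckets[(key >> shift) & mask].append(key)
            refined.extend(buckets)
        clusters = refined
    return clusters


keys = [(i * 37) % 101 for i in range(50)]
two_pass = radix_cluster(keys, [2, 1])  # 4 clusters, then 2 sub-clusters each
one_pass = radix_cluster(keys, [3])     # 8 clusters in a single pass
```

Both calls produce the same eight clusters. The difference is that each pass of the two-pass variant writes to at most four output clusters at a time, which is the property that keeps the number of simultaneously active memory regions below a cache or TLB limit.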

11 Monet Experiments: Setup
Platform:
  – SGI Origin2000 (MIPS R10000, 250 MHz)
System:
  – Monet DBMS
Data sets:
  – Integer join columns
  – Join hit-rate of 1
  – Cardinalities: 15,625 to 64,000,000
Hardware event counters
  – to analyze cache & TLB misses

12 Monet Experiments: Radix-Cluster (64,000,000 tuples)

13 Accurate Cost Modeling: Radix-Cluster

14 Monet Experiments: Partitioned Hash-Join

15 Monet Experiments: Radix-Join

16 Monet Experiments: Overall Performance (64,000,000 tuples)

17 Conclusion
Problem:
  – Memory access is increasingly the most important bottleneck for database performance
Solutions:
  – Vertical decomposition improves column-wise data access
  – Radix-algorithms optimize join performance
General:
  – Algorithms can be tuned to achieve optimal memory access
  – Detailed and accurate estimation of memory cost is possible
Monet homepage:

18 Introduction: Hardware Trends
CPU speed has been, is, and will be growing rapidly
Main-memory access latency has hardly improved over the last decade
Wider busses and new DRAM standards improve only the memory bandwidth
Cache memories reduce the access latencies only if the accessed data is in the cache
There is a main-memory access bottleneck and it will remain in the foreseeable future

19 Consequences for MM-DBMS: Overview
Data structures: full vertical table fragmentation
  – Reduce record width, and thus
  – Optimize column-wise data access
Query processing algorithms
  – Avoid random memory access pattern beyond cache limits
  – Minimize number of cache & TLB misses
Example: partitioned hash-join
  – Create clusters that fit in memory cache
  – Perform hash-join on matching clusters
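The idea of full vertical fragmentation can be illustrated with a small sketch that splits a table of row tuples into one list per column. The function names, the sample table, and the use of the list position as an implicit row identifier are assumptions made for the example; this is not a description of Monet's actual storage format.

```python
def decompose(rows, column_names):
    """Split row tuples into one list per column; the list index acts as row id."""
    return {name: [row[i] for row in rows] for i, name in enumerate(column_names)}


def reconstruct(columns, column_names, row_id):
    """Rebuild one full row by fetching the same position from every column."""
    return tuple(columns[name][row_id] for name in column_names)


names = ("id", "city", "value")
rows = [(1, "Amsterdam", 250), (2, "Utrecht", 180), (3, "Delft", 90)]
columns = decompose(rows, names)

# A column-wise aggregate reads only the "value" column, not the full records.
total = sum(columns["value"])
```

A query that aggregates one attribute then scans a single narrow column, which is the reduced record width the slide refers to. Rebuilding full rows costs one lookup per column.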