Breaking the Memory Wall in MonetDB

Presentation transcript:

Breaking the Memory Wall in MonetDB
Presented by Janhavi Digraskar, CSE 704

Memory Wall
- Growing disparity in speed between the CPU and memory outside the CPU chip
- 20-40% of the instructions in a program reference memory
- Accessing main memory is a performance bottleneck

Want to break the Memory Wall?
- Average time to access memory can be written as:
  t_avg = p × t_c + (1 - p) × t_m
  where p = probability of a cache hit, t_c = time to access the cache, t_m = DRAM access time
- To reduce t_avg, increase p (or reduce t_c)
- Exploit the cache
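As a quick illustration of the formula, here is a minimal C sketch. The function name `avg_access_time` and the sample latencies in the note below are assumptions chosen only to show the effect of raising p; they do not come from the slides.

```c
#include <assert.h>

/* Average memory access time: t_avg = p * t_c + (1 - p) * t_m.
 * p   : probability of a cache hit, in [0, 1]
 * t_c : time to access the cache
 * t_m : DRAM access time (same unit as t_c) */
double avg_access_time(double p, double t_c, double t_m)
{
    assert(p >= 0.0 && p <= 1.0);
    return p * t_c + (1.0 - p) * t_m;
}
```

With the illustrative values t_c = 1 and t_m = 100, raising p from 0.90 to 0.99 lowers t_avg from 10.9 to 1.99, which is why the rest of the talk concentrates on the hit rate.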

Exploit the power of cache by:
- Vertical storage
- Bulk query algebra
- Cache-conscious algorithms
- Memory access cost modeling

MonetDB
- 1994 to now
- Open source
- Column store
- Has been successfully applied in high-performance applications for data mining, OLAP, GIS, XML Query, and text and multimedia retrieval

MonetDB Architecture

MonetDB: Vertical Storage

MonetDB: Vertical Storage
- Uses the Decomposed Storage Model (DSM)
- Each column is stored in a separate Binary Association Table (BAT)
- A BAT holds <surrogate, value> pairs, also called <head, tail>
- Each BAT is implemented as two simple memory arrays
- Internally, columns are stored using memory-mapped files
- Provides an O(1) database lookup mechanism
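The O(1) lookup can be pictured with a small C sketch. This is not MonetDB's actual data structure: the type `IntBat`, the function `bat_lookup`, and the assumption that the head column is a dense sequence of surrogates (so that it is implied by array position) are all introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified BAT holding an int column.
 * The head (surrogate) is assumed to be the dense sequence
 * base, base+1, ..., so it is implied by the array position
 * and only the tail values need to be stored. */
typedef struct {
    size_t base;      /* surrogate of the first entry */
    size_t count;     /* number of entries */
    const int *tail;  /* column values */
} IntBat;

/* O(1) lookup: the surrogate maps directly to an array index. */
int bat_lookup(const IntBat *b, size_t surrogate)
{
    assert(surrogate >= b->base && surrogate - b->base < b->count);
    return b->tail[surrogate - b->base];
}
```

Under that assumption, reading one attribute of one tuple is a single array access, with no page directory or slot lookup in between.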

Bulk Query Algebra
- Low-level algebra called BAT Algebra
- Complex expressions are broken into a sequence of BAT algebra operators
- These operators are implemented as simple array operations
- No expression-interpreting engine is needed

Example
BAT algebra expression:
    R:bat[:oid,:oid] := select(B:bat[:oid,:int], V:int)
C code implementing the above expression:
    for (i = j = 0; i < n; i++)
        if (B.tail[i] == V)
            R.tail[j++] = i;
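The loop on this slide can be made self-contained as follows. The function name `bat_select_eq` and its signature are assumptions for illustration, not MonetDB's API; the body is the same scan, writing the positions of matching tail values into an output array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for select(B, V): scans the tail array of B
 * and writes the positions of all values equal to v into out.
 * out must have room for n entries; returns the number of matches. */
size_t bat_select_eq(const int *tail, size_t n, int v, size_t *out)
{
    assert(tail != NULL && out != NULL);
    size_t j = 0;
    for (size_t i = 0; i < n; i++)
        if (tail[i] == v)
            out[j++] = i;
    return j;
}
```

The operator handles one fixed type and one fixed predicate, so the loop body contains no per-tuple interpretation.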

MonetDB Vertical Storage
- Advantage: the tight for loop of each operator creates high instruction locality and thus eliminates the instruction cache miss problem
- Disadvantage: intermediate results must be materialized

MonetDB: Cache-Conscious Algorithms
- Partitioned hash join
  - Relations are partitioned into H separate clusters
  - Each cluster fits in the L2 memory cache
- Problem: the number of clusters (H) exceeds the number of
  - TLB entries -> TLB thrashing
  - Cache lines -> cache thrashing
- Solution: multi-pass radix cluster

Multi-pass radix cluster
- Multiple clustering passes
- Limits the number of clusters per pass
- Avoids TLB/cache thrashing
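The idea can be sketched in C as below. This is a minimal illustration, not MonetDB's implementation: the function name `radix_cluster`, clustering directly on the low bits of 32-bit keys (rather than on hash values of tuples), and the 16-bit cap on the total number of bits are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Clusters keys[0..n) on their lowest B bits, where B is the sum of
 * pass_bits[0..passes). Pass p refines every existing cluster on the
 * next pass_bits[p] bits (most significant first), so a single pass
 * never writes to more than 2^pass_bits[p] clusters at a time.
 * Returns 0 on success, -1 if memory allocation fails. */
int radix_cluster(uint32_t *keys, size_t n,
                  const unsigned *pass_bits, size_t passes)
{
    unsigned total = 0;
    for (size_t p = 0; p < passes; p++)
        total += pass_bits[p];
    assert(total <= 16); /* keep the bookkeeping arrays small */

    size_t max_clusters = (size_t)1 << total;
    uint32_t *tmp = malloc((n ? n : 1) * sizeof *tmp);
    size_t *start = malloc((max_clusters + 1) * sizeof *start);
    size_t *next = malloc((max_clusters + 1) * sizeof *next);
    size_t *cursor = malloc(max_clusters * sizeof *cursor);
    if (!tmp || !start || !next || !cursor) {
        free(tmp); free(start); free(next); free(cursor);
        return -1;
    }

    uint32_t *src = keys, *dst = tmp;
    size_t nclusters = 1;
    start[0] = 0;
    start[1] = n;
    unsigned shift = total;

    for (size_t p = 0; p < passes; p++) {
        size_t fanout = (size_t)1 << pass_bits[p];
        uint32_t mask = (uint32_t)(fanout - 1);
        shift -= pass_bits[p];

        for (size_t c = 0; c < nclusters; c++) {
            size_t lo = start[c], hi = start[c + 1];

            /* Histogram of this cluster's sub-cluster sizes. */
            memset(cursor, 0, fanout * sizeof *cursor);
            for (size_t i = lo; i < hi; i++)
                cursor[(src[i] >> shift) & mask]++;

            /* Turn sizes into start offsets (exclusive prefix sum). */
            size_t off = lo;
            for (size_t k = 0; k < fanout; k++) {
                size_t len = cursor[k];
                next[c * fanout + k] = off;
                cursor[k] = off;
                off += len;
            }

            /* Scatter the keys into their sub-clusters. */
            for (size_t i = lo; i < hi; i++)
                dst[cursor[(src[i] >> shift) & mask]++] = src[i];
        }

        nclusters *= fanout;
        next[nclusters] = n;

        size_t *ts = start; start = next; next = ts;
        uint32_t *tk = src; src = dst; dst = tk;
    }

    if (src != keys)
        memcpy(keys, src, n * sizeof *keys);
    free(tmp); free(start); free(next); free(cursor);
    return 0;
}
```

Calling it with `pass_bits = {1, 2}` produces 8 clusters in two passes with fan-outs of 2 and 4, instead of one pass that scatters to 8 targets at once. In a real setting the per-pass fan-out would be chosen to stay below the number of TLB entries and cache lines.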

Memory Access Cost Modeling
- Cost models are created that take the cost of memory access into account
- The main goal is to predict the number and kind of cache misses for all cache levels
- The complex data access patterns of database algorithms are studied

Basic Access Patterns
- Single sequential traversal: s_trav(R)
- Single random traversal: r_trav(R)
- Repetitive sequential traversal: rs_trav(r,d,R)
- Repetitive random traversal: rr_trav(r,d,R)
- R = region representing a database table
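To give a feel for what a per-pattern miss estimate looks like, here is a deliberately simplified C sketch for the two single-traversal patterns and one cache level. These are not the cost functions of the model presented in the talk. The function names, the line-aligned region, the requirement that an item be no larger than a cache line, and the worst-case treatment of random traversal are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Number of cache lines covered by a line-aligned region of `bytes` bytes. */
static size_t lines_spanned(size_t bytes, size_t line_bytes)
{
    return (bytes + line_bytes - 1) / line_bytes;
}

/* s_trav(R): items are visited in storage order, so each cache line
 * of R is loaded exactly once. */
size_t s_trav_misses(size_t items, size_t item_bytes, size_t line_bytes)
{
    assert(item_bytes > 0 && item_bytes <= line_bytes);
    return lines_spanned(items * item_bytes, line_bytes);
}

/* r_trav(R): every item is visited once, in random order.
 * If R fits in the cache, each line is still loaded only once.
 * Otherwise this sketch assumes the worst case of one miss per item. */
size_t r_trav_misses(size_t items, size_t item_bytes,
                     size_t line_bytes, size_t cache_bytes)
{
    assert(item_bytes > 0 && item_bytes <= line_bytes);
    size_t bytes = items * item_bytes;
    if (bytes <= cache_bytes)
        return lines_spanned(bytes, line_bytes);
    return items;
}
```

For 1,000 four-byte items and 64-byte lines, both functions give 63 misses when the region fits in the cache. For 1,000,000 such items and a 32 KB cache, the random estimate rises to one miss per item while the sequential estimate stays at one miss per line.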

Compound Access Patterns
- Describe database operations that access more than one data region
- Can be created by combining basic access patterns
- Basic access patterns can be combined either
  - sequentially
  - concurrently
- Example: s_trav(U) ⊙ s_trav(W) (concurrent execution) for W <- select(U)

Sequential and Concurrent Execution
Sequential execution
- The total number of cache misses is at most the sum of the cache misses of all patterns
- However, if the patterns access the same data region, cache misses can be saved
Concurrent execution
- Patterns compete for the same cache
- The total number of cache misses is calculated with this competition taken into account

Conclusion
- MonetDB has redefined database architecture in the face of an ever-changing computer architecture
- The principle of vertical storage is currently being adopted by many commercial systems
- C-Store is another column store; HBase and Apache Cassandra are often listed alongside it, although they are wide-column (column-family) stores