Flash research report. Da Zhou, 2009-7-4.


Outline
–Query Processing Techniques for Solid State Drives (Research Paper)
–Join Processing for Flash SSDs: Remembering Past Lessons (DaMoN)
–Evaluating and Repairing Write Performance on Flash Devices (DaMoN)
–Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices (VLDB 2009)

Query Processing Techniques for Solid State Drives
Dimitris Tsirogiannis –University of Toronto, Toronto, ON, Canada
Stavros Harizopoulos, Mehul A. Shah, Janet L. Wiener, Goetz Graefe –HP Labs, Palo Alto, CA, USA

Motivation Although SSDs may immediately benefit applications that stress random reads, they may not improve database applications, especially those running long data analysis queries. Database query processing engines have been designed around the speed mismatch between random and sequential I/O on hard disks, and their algorithms currently emphasize sequential accesses for disk-resident data.

Contributions
–Column-based layout: PAX
–FlashScan
–FlashJoin

PAX Figure: traditional row-based (NSM) and column-based (PAX) page layouts. Within each page, PAX groups the values of each attribute together into a minipage.

FlashScan FlashScan takes advantage of the small transfer unit of SSDs to read only the minipages of the attributes that it needs.

FlashScanOpt FlashScanOpt improves on FlashScan even further by reading only the minipages that contribute to the final result.

FlashScan When the predicate is applied to a sorted attribute, however, FlashScanOpt outperforms plain FlashScan for all selectivities below 100%: only a few pages contain the contiguous matching tuples, and all other minipages can be skipped.
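The difference between the two scans can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' implementation: `PaxPage`, `read_minipage`, `flash_scan`, `flash_scan_opt`, and `demo_pages` are hypothetical names, a page is modelled as an in-memory dictionary of columns, and the `reads` counter stands in for the small SSD transfers that a real system would issue.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class PaxPage:
    """One PAX page: a minipage (list of values) per attribute. Hypothetical model."""

    def __init__(self, minipages: Dict[str, List]) -> None:
        self.minipages = minipages
        self.reads = 0  # number of minipage reads, a stand-in for SSD transfers

    def read_minipage(self, attr: str) -> List:
        self.reads += 1
        return self.minipages[attr]


def flash_scan(pages: Sequence[PaxPage], select_attrs: Sequence[str],
               pred_attr: str, pred: Callable[[object], bool]) -> List[Tuple]:
    """Read only the minipages of the needed attributes, on every page."""
    needed = list(dict.fromkeys([pred_attr, *select_attrs]))  # de-duplicated
    out: List[Tuple] = []
    for page in pages:
        cols = {a: page.read_minipage(a) for a in needed}
        out.extend(tuple(cols[a][i] for a in select_attrs)
                   for i, v in enumerate(cols[pred_attr]) if pred(v))
    return out


def flash_scan_opt(pages: Sequence[PaxPage], select_attrs: Sequence[str],
                   pred_attr: str, pred: Callable[[object], bool]) -> List[Tuple]:
    """Read the predicate minipage first; fetch the rest only if a row matches."""
    out: List[Tuple] = []
    for page in pages:
        pred_col = page.read_minipage(pred_attr)
        hits = [i for i, v in enumerate(pred_col) if pred(v)]
        if not hits:
            continue  # no matching tuple: skip the other minipages of this page
        cols = {pred_attr: pred_col}
        for a in select_attrs:
            if a not in cols:
                cols[a] = page.read_minipage(a)
        out.extend(tuple(cols[a][i] for a in select_attrs) for i in hits)
    return out


def demo_pages() -> List[PaxPage]:
    """Two pages of a table sorted on "id" (made-up data)."""
    return [
        PaxPage({"id": [1, 2], "name": ["a", "b"], "city": ["x", "y"]}),
        PaxPage({"id": [3, 4], "name": ["c", "d"], "city": ["z", "w"]}),
    ]
```

On the made-up data, selecting `name` where `id >= 4` touches four minipages with `flash_scan` and three with `flash_scan_opt`, because the first page has no matching tuple and its `name` minipage is never read.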

FlashJoin The join kernel computes the join and outputs a join index. Each join index tuple consists of the join attributes as well as the row-ids (RIDs) of the participating rows from the base relations. The fetch kernel retrieves the needed attributes using the RIDs specified in the join index.
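The two-kernel split can be illustrated with a minimal Python sketch. It assumes an in-memory hash join for the join kernel and dictionaries keyed by RID for the base relations; the function names `join_kernel` and `fetch_kernel` and the data layout are hypothetical choices for illustration, not the paper's code.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

JoinIndex = List[Tuple[Hashable, int, int]]  # (join value, RID in R, RID in S)


def join_kernel(r_keys: Sequence[Tuple[int, Hashable]],
                s_keys: Sequence[Tuple[int, Hashable]]) -> JoinIndex:
    """Join on (rid, join_value) pairs only and emit a join index."""
    table = defaultdict(list)
    for r_rid, value in r_keys:
        table[value].append(r_rid)
    return [(value, r_rid, s_rid)
            for s_rid, value in s_keys
            for r_rid in table.get(value, [])]


def fetch_kernel(join_index: JoinIndex,
                 r_rows: Dict[int, dict], s_rows: Dict[int, dict],
                 r_attrs: Sequence[str], s_attrs: Sequence[str]) -> List[Tuple]:
    """Use the RIDs in the join index to fetch only the requested attributes."""
    return [tuple(r_rows[r_rid][a] for a in r_attrs)
            + tuple(s_rows[s_rid][a] for a in s_attrs)
            for _, r_rid, s_rid in join_index]
```

The point of the split is that the join kernel touches only join attributes and RIDs, while the remaining attributes are read later and only for rows that actually appear in the join index.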

Join Processing for Flash SSDs: Remembering Past Lessons
Jaeyoung Do, Jignesh M. Patel –Univ. of Wisconsin-Madison
Jignesh M. Patel's research interests: energy-efficient data processing, multi-core query processing, methods for searching and mining large graph and sequence/string data sets, and spatial data management.
–Towards Eco-friendly Database Management Systems, Willis Lang, Jignesh M. Patel, CIDR 2009.
–Data Morphing: An Adaptive, Cache-Conscious Storage Technique, R. A. Hankins and J. M. Patel, VLDB 2003.
–Effect of Node Size on the Performance of Cache-Conscious B+-trees, R. A. Hankins and J. M. Patel, SIGMETRICS 2003.

Motivation We must carefully consider the lessons that we have learnt from over three decades of designing and tuning algorithms for magnetic HDD-based systems, so that we continue to reuse techniques that worked for magnetic HDDs and also work with flash SSDs.

Four classic ad hoc join algorithms
Block Nested Loops Join
–Block nested loops join first logically splits the smaller relation R into equal-size chunks. For each chunk of R that is read, a hash table is built to efficiently find matching pairs of tuples. Then, all of S is scanned, and the hash table is probed with the tuples of S.
Sort-Merge Join
–Sort-merge join starts by producing sorted runs of each of R and S. After R and S are sorted into runs on disk, sort-merge join reads the runs of both relations and merges/joins them.
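As a rough illustration of the block nested loops description above, the following Python sketch joins two in-memory lists. The name `block_nested_loops_join` and the `chunk_size` parameter (counted in tuples, standing in for the share of the buffer pool given to R) are assumptions made for the example; a real system would read chunks of R and scan S from storage.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence, Tuple


def block_nested_loops_join(r: Sequence, s: Sequence,
                            r_key: Callable[[object], Hashable],
                            s_key: Callable[[object], Hashable],
                            chunk_size: int) -> List[Tuple]:
    """Join R and S, holding only one chunk of R in memory at a time."""
    out: List[Tuple] = []
    for start in range(0, len(r), chunk_size):
        # Build a hash table on the current chunk of the smaller relation R.
        table = defaultdict(list)
        for t in r[start:start + chunk_size]:
            table[r_key(t)].append(t)
        # Scan all of S once per chunk and probe the hash table.
        for u in s:
            for t in table.get(s_key(u), []):
                out.append((t, u))
    return out
```

S is scanned once per chunk of R, so a larger chunk (a larger buffer pool) means fewer passes over S, and every pass is a sequential read.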

Four classic ad hoc join algorithms
Grace Hash Join
–Grace hash join has two phases. In the first phase, the algorithm hashes the tuples of R and S into buckets.
–In the second phase, the first bucket of R is loaded into the buffer pool, and a hash table is built on it. Then, the corresponding bucket of S is read and used to probe the hash table. The remaining buckets are processed in the same way.
Hybrid Hash Join
–Hybrid hash join works like Grace hash join, except that a portion of the buffer pool is reserved for an in-memory hash bucket for R, so that bucket is never written to disk.
–Furthermore, as S is read and hashed, tuples of S that match the in-memory R bucket can be joined immediately and need not be written to disk.
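The two phases of Grace hash join can be sketched as follows. This is a simplified, in-memory illustration: the name `grace_hash_join` and the `num_buckets` parameter are assumptions, and plain Python lists stand in for the bucket files that a real implementation would write to and read back from storage.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence, Tuple


def grace_hash_join(r: Sequence, s: Sequence,
                    r_key: Callable[[object], Hashable],
                    s_key: Callable[[object], Hashable],
                    num_buckets: int) -> List[Tuple]:
    # Phase 1: partition both relations with the same hash function.
    # Each list stands in for a bucket file written to storage.
    r_buckets: List[list] = [[] for _ in range(num_buckets)]
    s_buckets: List[list] = [[] for _ in range(num_buckets)]
    for t in r:
        r_buckets[hash(r_key(t)) % num_buckets].append(t)
    for u in s:
        s_buckets[hash(s_key(u)) % num_buckets].append(u)

    # Phase 2: for each bucket pair, build a hash table on the R bucket
    # and probe it with the corresponding S bucket.
    out: List[Tuple] = []
    for r_bucket, s_bucket in zip(r_buckets, s_buckets):
        table = defaultdict(list)
        for t in r_bucket:
            table[r_key(t)].append(t)
        for u in s_bucket:
            for t in table.get(s_key(u), []):
                out.append((t, u))
    return out
```

A hybrid hash join variant would keep one R bucket as an in-memory hash table during phase 1 and join matching S tuples against it immediately instead of writing them out.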

Experimental Setup DB: SQLite3. Our experiments were performed on a dual-core 3.2 GHz Intel Pentium machine with 1 GB of RAM running Red Hat Enterprise 5. For the comparison, we used a 5400 RPM TOSHIBA 320 GB external HDD and an OCZ Core Series 60 GB SATA II 2.5-inch flash SSD. As our test query, we used a primary/foreign key join between the TPC-H customer and orders tables, generated with a scale factor of 30. The customer table contains 4,500,000 tuples (730 MB), and the orders table has 45,000,000 tuples (5 GB).

Effect of Varying the Buffer Pool Size The block nested loops join, whose I/O pattern is sequential reads, shows the biggest performance improvement, with speedup factors between 1.59X and 1.73X. The other join algorithms also performed better on the flash SSD than on the magnetic HDD, but with smaller speedups than the block nested loops join. This is because the write transfer rate is slower than the read transfer rate on the flash SSD, and unexpected erase operations might degrade write performance further.

Effect of Varying the Buffer Pool Size For sort-merge join, while the I/O speedup of the second phase was between 2.63X and 3.0X due to faster random reads, the I/O speedup in the first phase (which has sequential writes as the dominant I/O pattern) was only between 1.52X and 2.0X. Note that the dominant I/O pattern of Grace hash join is random writes in the first phase, followed by sequential reads in the second phase.

Summary
1. Joins on flash SSDs have a greater tendency to become CPU-bound (rather than I/O-bound), so ways to improve CPU performance, such as better cache utilization, are of greater importance with flash SSDs.
2. Trading random reads for random writes (that is, accepting random reads in order to avoid random writes) is likely a good design choice for flash SSDs.
3. Compared to sequential writes, random writes produce more I/O variations with flash SSDs, which makes the join performance less predictable.

Effect of Varying the Page Size As can be seen from Figure 2, when blocked I/O is used, the page size has a small impact on the join performance in both the magnetic HDD and the flash SSD cases.

Effect of Varying the Page Size When the I/O size is less than the flash page size (4 KB), every write operation is likely to generate an erase operation, which severely degrades performance.

Summary
1. Using blocked I/O significantly improves the join performance on flash SSDs over magnetic HDDs.
2. The I/O size should be a multiple of the flash page size.
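One simple way to follow the second recommendation is to round every requested I/O size up to the next multiple of the flash page size. The helper below is a hypothetical sketch: `aligned_io_size` is an invented name, and the 4 KB default is the flash page size quoted in the slides for the tested device, which may differ on other SSDs.

```python
FLASH_PAGE_SIZE = 4096  # 4 KB, the flash page size quoted for the tested SSD


def aligned_io_size(requested: int, page_size: int = FLASH_PAGE_SIZE) -> int:
    """Round a requested I/O size (in bytes) up to a multiple of page_size."""
    if requested <= 0 or page_size <= 0:
        raise ValueError("sizes must be positive")
    pages = -(-requested // page_size)  # ceiling division
    return pages * page_size
```

For example, a 10,000-byte request would be issued as three flash pages (12,288 bytes) rather than as a size that ends in a partial page.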

Evaluating and Repairing Write Performance on Flash Devices
Anastasia Ailamaki –EPFL, VD, Switzerland; CMU, PA, USA
In 2001, she joined the Computer Science Department at Carnegie Mellon University, where she is currently an Associate Professor. In February 2007, she joined EPFL as a visiting professor.
–S. Harizopoulos and A. Ailamaki. Improving instruction cache performance in OLTP. ACM Transactions on Database Systems, 31(3): , 2006.

An Append and Pack Data Layout The layer always writes dirty pages, flushed by the buffer manager of the overlying DBMS, sequentially and in multiples of the erase block size. From a conceptual point of view, the physical database representation is an append-only structure. As a result, our writing mechanism benefits from optimal flash memory performance as long as enough space is available.

An Append and Pack Data Layout The proposed layer consolidates the least recently updated logical pages, starting from the head of the append structure, packs them together, then writes them back sequentially to the tail. We append them to the write-cold dataset because pages that reach the beginning of the hot dataset have gone the longest without being updated and are therefore likely to be write-cold. We read data from the head of the cold log structure and write them to the end.
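The append and pack idea can be illustrated with a deliberately simplified Python sketch. It is not the authors' design: it uses a single log instead of separate hot and cold datasets, ignores erase-block granularity, and keeps everything in memory. The class name `AppendPackLog` and its methods are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class AppendPackLog:
    """Append-only page store with a logical-to-physical map (simplified)."""

    def __init__(self) -> None:
        self.log: List[Optional[Tuple[int, bytes]]] = []  # None = invalidated slot
        self.location: Dict[int, int] = {}  # logical page id -> slot in log
        self.head = 0  # first slot that has not been reclaimed yet

    def write(self, page_id: int, data: bytes) -> None:
        """Never update in place: invalidate the old copy, append the new one."""
        old = self.location.get(page_id)
        if old is not None:
            self.log[old] = None
        self.location[page_id] = len(self.log)
        self.log.append((page_id, data))

    def read(self, page_id: int) -> bytes:
        entry = self.log[self.location[page_id]]
        assert entry is not None
        return entry[1]

    def pack(self, n_slots: int) -> None:
        """Reclaim n_slots from the head; still-valid pages move to the tail."""
        end = min(self.head + n_slots, len(self.log))
        for i in range(self.head, end):
            entry = self.log[i]
            if entry is not None:  # least recently updated valid page
                self.log[i] = None
                self.location[entry[0]] = len(self.log)
                self.log.append(entry)
        self.head = end
```

All writes, including the ones issued by `pack`, go to the tail of the log, which is what turns random page updates into sequential writes.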

Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices
Yanlei Diao –Department of Computer Science, University of Massachusetts Amherst

Motivation Flash devices present significant challenges in designing tree indexes due to their fundamentally different read and write characteristics in comparison to magnetic disks.

Key Features
–Cascaded Buffers
–Adaptive Buffering

The scan cost of lookup L1 is s1 = 75, while that of lookup L2 is s2 = 90. If the buffer is emptied at L1, each of the three lookups after L1 saves s1. Hence the benefit of emptying at lookup L1, denoted by payoff p1, is given by p1 = 3 · s1 = 225.
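The arithmetic of this example can be written as a one-line helper. The function name `emptying_payoff` is hypothetical, and the sketch covers only the case described on the slide, where every later lookup saves the same scan cost; it is not the full adaptive buffering algorithm.

```python
def emptying_payoff(scan_cost: int, later_lookups: int) -> int:
    """Payoff of emptying the buffer at a lookup: each later lookup saves scan_cost."""
    if scan_cost < 0 or later_lookups < 0:
        raise ValueError("arguments must be non-negative")
    return later_lookups * scan_cost


# Values from the slide: s1 = 75 and three lookups after L1.
p1 = emptying_payoff(scan_cost=75, later_lookups=3)
```

With the slide's numbers, `p1` evaluates to 225, matching p1 = 3 · s1.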

Raw Flash Memory

SSD

Thank You