No Bits Left Behind – Eugene Wu, Carlo Curino, Sam Madden
Memory Hierarchy: Registers 0.3ns – Cache 10ns – RAM 100ns – Disk 10^7 ns
Ideal Memory Hierarchy: Registers 0.3ns – Cache 10ns – RAM 100ns – Disk 10^7 ns (Hottest → Coldest)
The Hierarchy is Lukewarm: Registers 0.3ns – Cache 10ns – RAM 100ns – Disk 10^7 ns
Why so cold?
Why So Cold? Cache misses – Inefficient data structures – Replacement policies – Picking the wrong data structures – Poor locality – …and so on
Waste – definition: bits not useful to the application right now
Bits that don’t contain any data Unused Space Locality Waste Encoding Waste
Bits of data that are not useful now Unused Space Locality Waste Encoding Waste
Bits of inefficiently represented data Unused Space Locality Waste Encoding Waste
Fundamental to system design Unused Space Locality Waste Encoding Waste
Result of poorly defined schemas Unused Space Locality Waste Encoding Waste
Can we find ways to reduce or reuse the waste? Unused Space Locality Waste Encoding Waste
Standing On Tall Shoulders Self tuning databases Database cracking Clustering Index selection Optimal database parameters LRU Horizontal Partitioning Column Stores Compression 2Q Vertical Partitioning Indexes
But The Clouds Are Higher Self tuning databases Database cracking Clustering Index selection Optimal database parameters LRU Horizontal Partitioning Column Stores Compression 2Q Vertical Partitioning Indexes Huge amount of room for interesting work
Outrageously Awesome (Preliminary!) Results
Unused Space Allocated bits that do not contain any data
B-Tree Indexes
Why?
1. Lots of Unused Space In an insert workload – 32% unused space* – 42% in Cartel’s centroid_locations index Waste is designed into the data structure – Amortizes insert costs * A. C.-C. Yao. On random 2-3 trees
2. Indexes are Large Wikipedia’s revision table – Data: 33 GB – Indexes: 27 GB Cartel’s centroid_locations table – Data: 13 GB – Indexes: 8.24 GB
Life Of A Wikipedia Query SELECT page_id FROM page WHERE title = “Donkey” Leaf Pages Buffer Pool Disk 17 | “Donkey” | 1/10/2009 | true
Life Of A Wikipedia Query Leaf Pages Buffer Pool Disk | “Donkey” | 1/10/2009 | true SELECT page_id FROM page WHERE title = “Donkey”
Index Caching Use free space in secondary index leaf pages to cache hot data on the access path – Improve locality – Save memory for other queries Guarantees: – Never increase disk IO – Minimal overhead – Consistency
Anatomy of an Index Page Fixed Size Header Fixed Size Footer Free Space Index Keys Directory Entries
Anatomy of an Index Page Fixed Size Header Fixed Size Footer Cache Slot Index Keys Directory Entries Hottest Data Coldest Data Cache Slot
Cache Replacement Policy Hotter entries move to stable locations – New data replaces the coldest entries – On a cache hit, swap the entry into a hotter slot
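The policy on this slide can be sketched in a few lines. This is an illustrative model, not the paper's implementation: slots are assumed to be ordered hottest-first, a hit swaps the entry one slot hotter (a transposition heuristic), and a miss overwrites the coldest slot.

```python
class CacheSlots:
    """Toy model of slot-based index-page caching (hottest slot first)."""

    def __init__(self, n_slots):
        self.slots = [None] * n_slots  # index 0 = hottest, last = coldest

    def access(self, key, value):
        for i, entry in enumerate(self.slots):
            if entry is not None and entry[0] == key:
                if i > 0:  # cache hit: swap the entry into the next-hotter slot
                    self.slots[i - 1], self.slots[i] = self.slots[i], self.slots[i - 1]
                return True
        self.slots[-1] = (key, value)  # miss: replace the coldest entry
        return False
```

Repeated hits migrate an entry toward slot 0, so frequently accessed data stabilizes at the hot end while one-off accesses churn only the coldest slot.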
What? Does This Work!? 44% of Wikipedia queries – Use name_title index – Read ≤ 4 out of 11 fields – name_title can store data for 70% of tuples Using above workload (zipf α=0.5) – 98% hit rate on read-only workload – 95% hit rate on insert/read mix
Up To Order-of-magnitude Wins
100x
2X Win Even If Data Is In Memory
Locality Waste When cold data pollutes higher levels of the memory hierarchy due to ineffective placement
SELECT * FROM T WHERE id =
Locality Waste In Wikipedia 99.9% of queries access the latest revision of an article – 5% of revision tuples – As low as 2% utilized Can’t optimize based on data values alone – Traditional clustering/partitioning doesn’t work
Workload-Specific Data Placement
Access Frequency Clustering Cluster accessed tuples at end of table – Insert new copy of tuple – Invalidate old copy – Update pointers to new copy
Access Frequency Clustering Cluster accessed tuples at end of table – Insert new copy of tuple – Invalidate old copy – Update pointers to new copy Hot Partition
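The three bullets above can be sketched as a toy append-only table; the class and field names are illustrative, not from the paper. On access, the tuple is re-inserted at the end of the table (the hot partition), the old copy is invalidated, and an indirection map is updated so references find the new location.

```python
class ClusteredTable:
    """Toy model of access-frequency clustering via re-insertion."""

    def __init__(self):
        self.pages = []     # append-only slots: [valid_flag, row]
        self.location = {}  # tuple id -> current slot index (the "pointer")

    def insert(self, tid, row):
        self.location[tid] = len(self.pages)
        self.pages.append([True, row])

    def access(self, tid):
        old = self.location[tid]
        _, row = self.pages[old]
        self.pages[old][0] = False  # invalidate the old copy
        self.insert(tid, row)       # re-insert at the hot end; pointer updated
        return row
```

Accessed tuples accumulate at the tail, so a scan of the hot partition touches only recently used data; invalidated slots would be reclaimed by vacuuming in a real system.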
Revision Table Experiments Clustering – 2.2X query performance win Partitioning – 8X query performance win – Reduces effective index size from 27GB to 1.4GB – The smaller index fits in memory
Encoding Waste Data that is inefficiently represented or at higher granularity than the application needs
Poorly Defined Schema
CREATE TABLE taxi_pings (
  mac    varchar(17),  -- “02:22:28:E0..”
  tstamp varchar(14),  -- “ …”
  flags  int,          -- 0,1,2,3
  year   int           -- 2011
)
Schemas must be defined upfront – Little knowledge about data – Designed to be conservative and overallocated
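A back-of-the-envelope estimate makes the overallocation concrete. The declared widths come from the taxi_pings schema above; the "tight" widths are assumptions (a MAC address fits in 6 raw bytes, a timestamp in a 4-byte integer, flags 0..3 and a small year offset in 1 byte each), not measurements.

```python
# Per-row widths in bytes; "tight" values are assumed ideal encodings.
declared = {"mac": 17, "tstamp": 14, "flags": 4, "year": 4}
tight    = {"mac": 6,  "tstamp": 4,  "flags": 1, "year": 1}

waste_per_row = sum(declared[f] - tight[f] for f in declared)
declared_total = sum(declared.values())
# Under these assumptions, 27 of 39 bytes per row carry no information.
```

Multiplied across millions of pings, the conservative schema pays for its flexibility with a majority of every row wasted.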
Tools We Need Now Automatically detect the waste – Field’s true value range – Field’s true data type Surface waste information to developer – Impact on performance
Interpret Schemas As Hints The DB should… – Detect field storage sizes – Replace bits with CPU cycles – Store fields based on semantic properties Flexible field storage
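One way to "replace bits with CPU cycles" is dictionary encoding of a low-cardinality field: store a small integer code per row and pay a decode step on read. This is a generic illustration of the trade-off, not a mechanism from the paper.

```python
class DictColumn:
    """Dictionary-encoded column: small codes stored, strings decoded on read."""

    def __init__(self):
        self.codes = []   # per-row small integer codes (the stored bits)
        self.lookup = {}  # value -> code
        self.values = []  # code -> value (decode table)

    def append(self, value):
        if value not in self.lookup:
            self.lookup[value] = len(self.values)
            self.values.append(value)
        self.codes.append(self.lookup[value])

    def get(self, i):
        return self.values[self.codes[i]]  # CPU cycles spent decoding
```

For a field like flags with four distinct values, each row shrinks from a 4-byte int to a code that needs only 2 bits, at the cost of an extra indirection per read.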
Research Opportunities Tools that detect waste – Amount of memory my workload really needs? – How inefficient is my schema? Workload specific data structures & policies – Index caching algorithms – Storage layout techniques Automatic optimization as workload changes – Can this field be dropped? – Runtime repartitioning
Unused space Locality waste Encoding waste Thanks!
Additional Slides
How Can You Just Re-insert Tuples? Experiments specific to Wikipedia – ID fields are opaque to the application – App looks up ID by article name Update foreign key references Lookup Table – old ID → new ID – Query rewrites
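The old ID → new ID lookup table can be sketched as a simple mapping; the chain-following detail is an assumption (a tuple re-inserted twice leaves two hops), not something the slide specifies.

```python
# Hypothetical old-ID -> new-ID mapping maintained as tuples are re-inserted.
id_map = {}

def remap(tid):
    """Resolve an ID through the lookup table before executing the query."""
    while tid in id_map:  # follow the chain if the tuple moved more than once
        tid = id_map[tid]
    return tid

# A query rewrite would replace "WHERE id = 17" with "WHERE id = remap(17)".
```

Because Wikipedia looks up IDs by article name anyway, the application never observes the remapping.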
Index Caching Assumptions Fixed size cache entries Set of fields is fixed over the index Secondary index We know what fields to cache ahead of time
Index Cache Consistency Page Cache-LSN (PCLSN) in each index page Global Cache-LSN (GCLSN) An index page’s cache is valid IFF: – PCLSN == GCLSN GCLSN++ invalidates entire cache – Finer grained invalidation in paper
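The PCLSN/GCLSN check above can be sketched directly; this is a minimal model of the slide's validity rule, omitting the finer-grained invalidation the paper mentions.

```python
class IndexCacheLSN:
    """Cache-LSN consistency: a page's cache is valid iff PCLSN == GCLSN."""

    def __init__(self):
        self.gclsn = 0
        self.pclsn = {}  # page id -> PCLSN recorded when its cache was written

    def write_cache(self, page_id):
        self.pclsn[page_id] = self.gclsn  # stamp the page with the current GCLSN

    def cache_valid(self, page_id):
        return self.pclsn.get(page_id) == self.gclsn

    def invalidate_all(self):
        self.gclsn += 1  # one increment logically drops every page's cache
```

Invalidation is O(1): no page is touched, and each page discovers its cache is stale lazily on the next access.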
Alternative: Fillfactor = 100% Specify the amount of free space in index pages on creation Problems – Trade read speed + waste for insert performance – Fillfactor is only initial setting
Alternative: Covering Indexes Store all the queried fields in the index Queries will be index-only Problems – Utilization is still 68% – Creates more waste because index is larger!
Cache Replacement Experiments
Cache Replacement - Occasional
Index Caching Performance
Clustering Experiment Setup revision table queries – Derived from a sample of Wikipedia’s HTTP requests – 800K accesses of 5.2M tuples – 10% sample of 2 hours
Access Frequency Clustering
Overhead of Clustering
Benefit of Clustering Over Time