SILT: A Memory-Efficient, High-Performance Key-Value Store

Presentation transcript:

SILT: A Memory-Efficient, High-Performance Key-Value Store
Hyeontaek Lim, Bin Fan, David G. Andersen, Michael Kaminsky†
Carnegie Mellon University, †Intel Labs
2011-10-24

Key-Value Store
Clients talk to a key-value store cluster through three operations: PUT(key, value), value = GET(key), and DELETE(key).
Example uses: e-commerce (Amazon), web server acceleration (Memcached), data deduplication indexes, photo storage (Facebook).

Many projects have examined flash memory-based key-value stores, because flash is faster than disk and cheaper than DRAM. This talk introduces SILT, which uses drastically less memory than previous systems while retaining high performance.

Flash Must be Used Carefully
Random reads/sec: 48,000. Fast, but not THAT fast.
$/GB: 1.83. Space is precious.
Another long-standing problem: random writes are slow and bad for flash life (wearout).

DRAM Must be Used Efficiently
DRAM is used for the index that locates items on flash. With 1 TB of data to store on flash and 4 bytes of DRAM per key-value pair (previous state of the art), the index size depends on the key-value pair size:
32 B (data deduplication) => 125 GB!
168 B (tweet) => 24 GB
1 KB (small image) => 4 GB
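The index sizes on this slide follow from simple arithmetic: number of entries times index bytes per entry. The sketch below reproduces them, assuming decimal units (1 TB = 10^12 bytes, 1 KB = 1000 bytes); the function name `index_size_gb` is made up for illustration.

```python
def index_size_gb(data_bytes, pair_bytes, index_bytes_per_entry=4):
    """DRAM needed for the index, in GB (decimal units assumed)."""
    entries = data_bytes / pair_bytes          # how many key-value pairs fit
    return entries * index_bytes_per_entry / 1e9

ONE_TB = 1e12  # assumption: 1 TB = 10^12 bytes

for label, pair_bytes in [("data deduplication", 32),
                          ("tweet", 168),
                          ("small image", 1000)]:
    size = index_size_gb(ONE_TB, pair_bytes)
    print(f"{label:>20}: {pair_bytes:>5} B/pair -> {size:6.1f} GB of index")
```

The smaller the key-value pairs, the more entries there are per terabyte, so per-entry index overhead dominates for workloads such as deduplication.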

Three Metrics to Minimize
Memory overhead = index size per entry. Ideally 0 (no memory overhead).
Read amplification = flash reads per query. Limits query throughput. Ideally 1 (no wasted flash reads).
Write amplification = flash writes per entry. Limits insert throughput and also reduces flash life expectancy. Must be small enough for flash to last a few years.

Landscape: Where We Were
[Figure: read amplification vs. memory overhead (bytes/entry) for SkimpyStash, HashCache, BufferHash, FlashStore, and FAWN-DS, with a "?" marked on the plot.]

Seesaw Game?
[Figure: a seesaw between memory efficiency and high performance, with FAWN-DS, FlashStore, HashCache, SkimpyStash, and BufferHash placed along it.]
How can we improve?

Solution Preview: (1) Three Stores with (2) New Index Data Structures
Queries look up stores in sequence (from new to old).
Inserts only go to Log.
Data are moved in background.
In memory: SILT Sorted Index (memory efficient), SILT Filter, SILT Log Index (write friendly). The stored data is on flash.
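The three rules on this slide (inserts go only to the log, queries check stores from newest to oldest, data moves between stores in the background) can be illustrated with a toy model. This is only a sketch under loud assumptions: Python dicts stand in for the on-flash structures and their in-memory indexes, `log_capacity` is a made-up threshold, and the class and method names are hypothetical, not SILT's API. The middle tier corresponds to what later slides call HashStore.

```python
_TOMBSTONE = object()  # marks a deleted key until a merge drops it


class ThreeStoreSketch:
    def __init__(self, log_capacity=4):
        self.log = {}            # newest data; the only store that takes writes
        self.hash_stores = []    # frozen former logs, newest first
        self.sorted_store = {}   # oldest data, kept in key order
        self.log_capacity = log_capacity

    def put(self, key, value):
        self.log[key] = value                    # inserts only go to the log
        if len(self.log) >= self.log_capacity:
            self.convert_log()

    def delete(self, key):
        self.put(key, _TOMBSTONE)                # deletion is also a log write

    def get(self, key):
        # Look up stores in sequence, from new to old; first hit wins.
        for store in [self.log, *self.hash_stores, self.sorted_store]:
            if key in store:
                value = store[key]
                return None if value is _TOMBSTONE else value
        return None

    def convert_log(self):
        # Background step 1: freeze the full log into a read-only store.
        self.hash_stores.insert(0, self.log)
        self.log = {}

    def merge(self):
        # Background step 2: fold frozen stores into the sorted store,
        # applying them oldest first so that newer values overwrite older ones.
        merged = dict(self.sorted_store)
        for store in reversed(self.hash_stores):
            merged.update(store)
        self.sorted_store = {k: merged[k] for k in sorted(merged)
                             if merged[k] is not _TOMBSTONE}
        self.hash_stores = []
```

Because reads stop at the first store that knows the key, a newer write or delete always shadows older copies without touching them in place.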

LogStore: No Control over Data Layout
In memory: SILT Log Index (6.5+ B/entry), compared with a naive hashtable (48+ B/entry). Still need pointers: size ≥ log N bits/entry.
On flash: a log. Inserted entries are appended (older to newer).
Memory overhead: 6.5+ bytes/entry. Write amplification: 1.

SortedStore: Space-Optimized Layout
In memory: SILT Sorted Index (0.4 B/entry).
On flash: a sorted array. Need to perform bulk-insert to amortize cost.
Memory overhead: 0.4 bytes/entry. Write amplification: high.

Combining SortedStore and LogStore
SortedStore: SILT Sorted Index over the on-flash sorted array.
LogStore: SILT Log Index over the on-flash log.
The two are combined by a merge.

Achieving both Low Memory Overhead and Low Write Amplification
SortedStore alone: low memory overhead, but high write amplification.
LogStore alone: low write amplification, but high memory overhead.
SortedStore + LogStore: now we can achieve both simultaneously. Write amplification = 5.4 (= 3 year flash life). Memory overhead = 1.3 B/entry.
With "HashStores", memory overhead = 0.7 B/entry! (see paper)

SILT’s Design (Recap)
SortedStore: SILT Sorted Index in memory, on-flash sorted array.
HashStore: SILT Filter in memory, on-flash hashtables.
LogStore: SILT Log Index in memory, on-flash log.
Data moves between stores by conversion and merge.
Memory overhead: 0.7 bytes/entry. Read amplification: 1.01. Write amplification: 5.4.

Review on New Index Data Structures in SILT
SILT Sorted Index: entropy-coded tries. For SortedStore. Highly compressed (0.4 B/entry).
SILT Filter & Log Index: partial-key cuckoo hashing. For HashStore & LogStore. Compact (2.2 & 6.5 B/entry). Very fast (> 1.8 M lookups/sec).
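To make "partial-key cuckoo hashing" concrete, here is a minimal sketch of the general idea, not SILT's actual implementation. Each key has two candidate buckets; the in-memory slot stores only a short tag plus an offset into the log, and the tag is chosen to be the key's other candidate bucket, so an entry can be displaced without rereading the full key from flash. Assumptions for illustration: one slot per bucket, SHA-1 as the hash, a Python list standing in for the on-flash log, and hypothetical names throughout.

```python
import hashlib


class PartialKeyCuckooIndex:
    def __init__(self, num_buckets=1024, max_kicks=64):
        self.num_buckets = num_buckets
        self.max_kicks = max_kicks
        self.buckets = [None] * num_buckets  # slot: (tag, log offset) or None
        self.log = []                        # stand-in for the on-flash log

    def _candidates(self, key):
        digest = hashlib.sha1(key).digest()
        h1 = int.from_bytes(digest[0:4], "big") % self.num_buckets
        h2 = int.from_bytes(digest[4:8], "big") % self.num_buckets
        return h1, h2

    def _find(self, key):
        h1, h2 = self._candidates(key)
        # In bucket h1 the tag must be h2, and vice versa.
        for bucket, tag in ((h1, h2), (h2, h1)):
            slot = self.buckets[bucket]
            if slot is not None and slot[0] == tag:
                # Tags can collide, so confirm with the full key
                # (this comparison models the single flash read).
                if self.log[slot[1]][0] == key:
                    return bucket
        return None

    def put(self, key, value):
        existing = self._find(key)
        offset = len(self.log)
        self.log.append((key, value))        # entries are only ever appended
        if existing is not None:
            self.buckets[existing] = (self.buckets[existing][0], offset)
            return True
        h1, h2 = self._candidates(key)
        bucket, entry = h1, (h2, offset)
        for _ in range(self.max_kicks):
            resident = self.buckets[bucket]
            self.buckets[bucket] = entry
            if resident is None:
                return True
            # Displace the resident to its other bucket, which is its tag;
            # its new tag becomes the bucket it just left.
            alt_bucket, resident_offset = resident
            entry = (bucket, resident_offset)
            bucket = alt_bucket
        # Too many displacements: the entry left in `entry` is unindexed.
        # A real store would treat the table as full at this point.
        return False

    def get(self, key):
        bucket = self._find(key)
        if bucket is None:
            return None
        return self.log[self.buckets[bucket][1]][1]
```

The memory saving comes from never keeping full keys in DRAM: a slot holds only the tag and the offset, which is why the slides still count pointer bits in the per-entry overhead.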

Compression in Entropy-Coded Tries
[Figure: a binary trie over hashed keys, with leaves colored red and blue.]
Hashed keys (bits are random), so # red (or blue) leaves ~ Binomial(# all leaves, 0.5).
Entropy coding (Huffman coding and more).
(More details of the new indexing schemes in paper)
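One way to see why the binomial observation matters is to write a trie over sorted keys as a sequence of "number of leaves in the left subtrie" counts. The sketch below is a simplified illustration under stated assumptions: keys are distinct, equal-length bit strings given in sorted order, `-1` marks a leaf or empty subtrie, and the entropy-coding step itself is left out. The function names are made up for this example.

```python
def build(keys):
    """Represent the trie over sorted, distinct bit strings as a flat list."""
    if len(keys) <= 1:
        return [-1]                               # leaf or empty subtrie
    left = [k[1:] for k in keys if k[0] == "0"]   # keys under the 0 branch
    right = [k[1:] for k in keys if k[0] == "1"]  # keys under the 1 branch
    return [len(left)] + build(left) + build(right)


def skip_subtrie(rep, pos):
    """Return the position just past the subtrie that starts at rep[pos]."""
    if rep[pos] == -1:
        return pos + 1
    pos = skip_subtrie(rep, pos + 1)   # skip the left child
    return skip_subtrie(rep, pos)      # then the right child


def lookup(rep, key):
    """Return the key's position in the sorted array."""
    pos, index = 0, 0
    for bit in key:
        if rep[pos] == -1:             # reached a leaf: index is final
            break
        left_count = rep[pos]
        if bit == "0":
            pos += 1                   # descend into the left subtrie
        else:
            index += left_count        # every left-subtrie key sorts before us
            pos = skip_subtrie(rep, pos + 1)
    return index


keys = ["00010", "00101", "01110", "10010",
        "10100", "10111", "11010", "11101"]
rep = build(keys)
print(rep)
print([lookup(rep, k) for k in keys])
```

Because hashed key bits are random, each count is close to half the size of its subtrie, so the counts are highly predictable and compress well under entropy coding. A lookup for a key that is not in the set still returns some index, so the entry read from flash has to be compared against the full key.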

Landscape: Where We Are
[Figure: read amplification vs. memory overhead (bytes/entry) for SkimpyStash, HashCache, BufferHash, FlashStore, FAWN-DS, and SILT.]

Evaluation
Evaluated: various combinations of indexing schemes, background operations (merge/conversion), and query latency.
Experiment setup:
CPU: 2.80 GHz (4 cores)
Flash drive: SATA 256 GB (48 K random 1024-byte reads/sec)
Workload size: 20-byte key, 1000-byte value, ≥ 50 M keys
Query pattern: uniformly distributed (worst for SILT)

LogStore Alone: Too Much Memory
Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)

LogStore+SortedStore: Still Much Memory
Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)

Full SILT: Very Memory Efficient
Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)

Small Impact from Background Operations
Workload: 90% GET (100~ M keys) + 10% PUT
[Figure annotations: 40 K; 33 K; "Oops! bursty TRIM by ext4 FS"]

Low Query Latency
Workload: 100% GET (100 M keys)
Best throughput @ 16 threads: median = 330 μs, 99.9th percentile = 1510 μs
[Figure x-axis: # of I/O threads]

Conclusion
SILT provides a key-value store that is both memory-efficient and high-performance, built on:
a multi-store approach
entropy-coded tries
partial-key cuckoo hashing
Full source code is available at https://github.com/silt/silt

Thanks!