Log-Structured Memory for DRAM-Based Storage
Stephen Rumble, Ankita Kejriwal, and John Ousterhout
Stanford University

Introduction
● Traditional memory allocators can’t provide all of:
  ○ Fast allocation/deallocation
  ○ Handling of changing workloads
  ○ Efficient use of memory
● RAMCloud: log-structured allocator
  ○ Incremental copying garbage collector
  ○ Two-level approach to cleaning (separate policies for disk and DRAM)
  ○ Concurrent cleaning (no pauses)
● Results:
  ○ High performance even at 80-90% memory utilization
  ○ Handles changing workloads
  ○ Makes sense for any DRAM-based storage system

RAMCloud Overview
(Figure: 1,000 – 100,000 application servers, each with an Appl. Library, connect over the datacenter network to 1,000 – 10,000 storage servers, each running a Master and a Backup; a Coordinator manages the cluster.)
● Key-value store
● GBs of DRAM per server
● 5 µs RTT for small RPCs
● Durable replica storage for crash recovery
● All data in DRAM at all times
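For a concrete sense of the data model, here is a minimal sketch of the kind of key-value interface the client library exposes; the names and signatures below are illustrative assumptions, not RAMCloud's actual client API.

    // Hypothetical, simplified key-value interface in the spirit of the
    // RAMCloud client library (illustrative names, not the real API).
    #include <cstdint>
    #include <string>

    class KeyValueClient {
    public:
        virtual ~KeyValueClient() = default;
        // Write (or overwrite) the object identified by (tableId, key).
        virtual void write(uint64_t tableId, const std::string& key,
                           const std::string& value) = 0;
        // Read the current value; returns false if the object does not exist.
        virtual bool read(uint64_t tableId, const std::string& key,
                          std::string* value) = 0;
        // Delete the object (recorded internally as a tombstone in the log).
        virtual void remove(uint64_t tableId, const std::string& key) = 0;
    };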

Workload Sensitivities
● 7 memory allocators, 8 workloads
  ○ Total live data held constant (10 GB)
  ○ But the workload changes over time (except W1)
● All allocators waste at least 50% of memory in some situations
● Example: glibc malloc needs 20 GB of memory to hold 10 GB of data under workload W8 (allocate many small objects, then delete 90% of them and write new 5-15 KB objects)
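To illustrate why this kind of workload defeats non-copying allocators, here is a rough sketch of a W8-style driver; the object sizes and total data volume are placeholders (scaled down from the paper's 10 GB), not the exact experimental parameters.

    // Sketch of a W8-style fragmentation workload (parameters are
    // illustrative and scaled down; see the paper for the real values).
    #include <cstdlib>
    #include <vector>

    int main() {
        std::vector<void*> objects;
        const size_t smallSize  = 100;            // assumed small-object size
        const size_t liveTarget = 1ULL << 30;     // 1 GB of live data (scaled down)

        // Phase 1: fill memory with many small objects.
        for (size_t live = 0; live < liveTarget; live += smallSize)
            objects.push_back(malloc(smallSize));

        // Phase 2: delete 90% of them, leaving small holes scattered everywhere.
        for (size_t i = 0; i < objects.size(); ++i)
            if (i % 10 != 0) { free(objects[i]); objects[i] = nullptr; }

        // Phase 3: write new 5-15 KB objects. The freed holes are too small to
        // hold them, so a non-copying allocator must grow the heap even though
        // the total amount of live data has not increased.
        for (size_t live = 0; live < liveTarget * 9 / 10; ) {
            size_t size = 5000 + static_cast<size_t>(rand()) % 10000;
            objects.push_back(malloc(size));
            live += size;
        }
        return 0;
    }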

Non-Copying Allocators
(Figure: free areas scattered between allocated blocks.)
● Blocks cannot be moved once allocated
● Result: fragmentation

Copying Garbage Collectors
● Must scan all memory to update pointers
  ○ Expensive, scales poorly
  ○ Must wait for lots of free space to accumulate before running GC
● State of the art: 3-5x overallocation of memory
● Long pauses: 3+ seconds for a full GC
(Figure: heap before collection and after collection, with live objects compacted together.)

Allocator for RAMCloud
● Requirements:
  ○ Must use a copying approach
  ○ Must collect free space incrementally
● Storage system advantage: pointers are restricted
  ○ Pointers are stored only in index structures
  ○ Easy to locate the pointers for a given memory block
  ○ Enables incremental copying
● Can achieve the overall goals:
  ○ Fast allocation/deallocation
  ○ Insensitive to workload changes
  ○ 80-90% memory utilization

Log-Structured Storage
● Each master server keeps its objects in an immutable, append-only log divided into 8 MB segments
● A hash table, indexed by {table id, object key}, points to each object’s current location in the log
● New objects are added at the log head
● Each segment is replicated on the disks of 3 backup servers
(Figure: master server with hash table pointing into log segments; each segment annotated with the ids of its backup servers.)
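A minimal sketch of these structures (simplified, single-threaded, and without backup replication; the type and field names are illustrative rather than RAMCloud's actual implementation):

    #include <cstdint>
    #include <cstring>
    #include <functional>
    #include <memory>
    #include <string>
    #include <unordered_map>
    #include <vector>

    constexpr size_t SEGMENT_SIZE = 8 * 1024 * 1024;   // 8 MB segments

    // One append-only segment of the in-memory log.
    struct Segment {
        std::unique_ptr<char[]> data{new char[SEGMENT_SIZE]};
        size_t used = 0;                               // bytes appended so far
        bool append(const void* buf, size_t len, size_t* offset) {
            if (used + len > SEGMENT_SIZE) return false;
            std::memcpy(data.get() + used, buf, len);
            *offset = used;
            used += len;
            return true;
        }
    };

    // Where an object currently lives in the log.
    struct Reference { Segment* segment; size_t offset; size_t length; };

    // Hash table key: {table id, object key}.
    struct Key {
        uint64_t tableId;
        std::string objectKey;
        bool operator==(const Key& o) const {
            return tableId == o.tableId && objectKey == o.objectKey;
        }
    };
    struct KeyHash {
        size_t operator()(const Key& k) const {
            return std::hash<uint64_t>()(k.tableId) ^
                   (std::hash<std::string>()(k.objectKey) << 1);
        }
    };

    // Master-side log: new objects go to the head segment; the hash table is
    // the only structure that points into the log (this restriction on
    // pointers is what makes incremental copying practical).
    struct Log {
        std::vector<std::unique_ptr<Segment>> segments;
        Segment* head = nullptr;
        std::unordered_map<Key, Reference, KeyHash> hashTable;

        void write(const Key& key, const void* value, size_t len) {
            size_t offset;
            if (head == nullptr || !head->append(value, len, &offset)) {
                segments.emplace_back(new Segment());   // start a new head segment
                head = segments.back().get();           // (replication to backups omitted)
                head->append(value, len, &offset);
            }
            hashTable[key] = Reference{head, offset, len};
        }
    };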

Log Cleaning
1. Pick segments with lots of free space
2. Copy the live objects (survivors) into new segments
3. Free the cleaned segments (and their backup replicas)
Cleaning is incremental.
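Building on the sketch above, a simplified version of one cleaning pass might look like the following; pickLowestUtilization, liveEntries, survivorSegmentWithRoom, and releaseSegments are assumed helpers, and the real cleaner tracks per-segment live bytes incrementally rather than scanning.

    // Simplified cleaning pass over the sketch's Log structure. The helper
    // functions declared below are assumptions, not real RAMCloud functions.
    #include <utility>
    #include <vector>

    std::vector<Segment*> pickLowestUtilization(Log& log, size_t count);          // assumed
    std::vector<std::pair<Key, Reference>> liveEntries(Log& log, Segment* seg);   // assumed
    Segment* survivorSegmentWithRoom(Log& log, size_t length);                    // assumed
    void releaseSegments(Log& log, const std::vector<Segment*>& segs);            // assumed

    void cleanOnce(Log& log, size_t segmentsToClean) {
        // 1. Pick the segments with the most free space (lowest live fraction).
        std::vector<Segment*> victims = pickLowestUtilization(log, segmentsToClean);

        // 2. Copy each live object (survivor) into a survivor segment and
        //    re-point the hash table at the new copy.
        for (Segment* victim : victims) {
            for (const auto& entry : liveEntries(log, victim)) {
                const Reference& old = entry.second;
                Segment* survivor = survivorSegmentWithRoom(log, old.length);
                size_t offset;
                survivor->append(victim->data.get() + old.offset, old.length, &offset);
                log.hashTable[entry.first] = Reference{survivor, offset, old.length};
            }
        }

        // 3. Free the cleaned segments; the real system also tells the backups
        //    to free the corresponding replicas.
        releaseSegments(log, victims);
    }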

Cleaning Cost
Let U be the fraction of live bytes in the cleaned segments:
● Bytes copied by the cleaner: U
● Bytes freed: 1 - U
● Bytes copied per byte freed: U / (1 - U)
Conflicting needs:
● Disk: capacity is cheap, bandwidth is expensive
● Memory: capacity is expensive, bandwidth is cheap
→ Need different policies for cleaning disk and memory
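As a quick sanity check of the U / (1 - U) formula: at 50% utilization the cleaner copies 1 byte per byte freed, but at 90% it copies 9, which is why disk (where bandwidth is precious) wants to clean at low U while memory (where capacity is precious) wants to run at high U. A tiny sketch of the arithmetic:

    // Write amplification of cleaning: copying U bytes of live data frees
    // (1 - U) bytes, so the cost is U / (1 - U) bytes copied per byte freed.
    #include <cstdio>

    int main() {
        for (double u : {0.5, 0.8, 0.9, 0.95})
            std::printf("U = %.2f -> %4.1f bytes copied per byte freed\n",
                        u, u / (1.0 - u));
        // Prints 1.0, 4.0, 9.0, and 19.0 respectively.
        return 0;
    }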

Two-Level Cleaning
● Combined cleaning:
  ○ Cleans multiple segments at once
  ○ Frees the old segments both in DRAM and on disk
● Compaction:
  ○ Cleans a single segment in memory
  ○ No change to the replicas on backups
(Figure: segments in DRAM and their replicas on backups, shown before and after each kind of cleaning.)

Two-Level Cleaning, cont’d
● Best of both worlds:
  ○ Optimize utilization of memory (can afford the high bandwidth cost of compaction)
  ○ Optimize disk bandwidth (can afford extra disk space to reduce cleaning cost)
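A rough sketch of how a cleaner might choose between the two levels; the thresholds and helper functions here are illustrative assumptions, not the actual balancing policy described in the paper.

    // Illustrative two-level cleaning policy. The thresholds and the helpers
    // declared below are assumptions for this sketch.
    size_t liveBytesInMemory(const Log& log);               // assumed
    size_t memoryCapacity(const Log& log);                  // assumed
    size_t diskLogSize(const Log& log);                     // assumed
    size_t diskLogBudget(const Log& log);                   // assumed
    void combinedClean(Log& log, size_t segmentsToClean);   // assumed
    void compactInMemory(Log& log);                         // assumed

    void cleanerPass(Log& log) {
        double memoryUtilization =
            static_cast<double>(liveBytesInMemory(log)) / memoryCapacity(log);
        double diskUtilization =
            static_cast<double>(diskLogSize(log)) / diskLogBudget(log);

        if (diskUtilization > 0.90) {
            // The on-disk log is getting too large: run combined cleaning,
            // which rewrites several segments and frees their backup replicas.
            combinedClean(log, /*segmentsToClean=*/10);
        } else if (memoryUtilization > 0.80) {
            // Memory is tight but disk has headroom: compact segments in
            // memory only, spending memory bandwidth but no disk bandwidth.
            compactInMemory(log);
        }
    }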

Parallel Cleaning
● Survivor data is written to a “side log”
  ○ No competition for the log head
  ○ Different backups for the survivor replicas
● Synchronization points:
  ○ Updates to the hash table
  ○ Adding survivor segments to the log
  ○ Freeing cleaned segments
(Figure: normal writes continue at the log head while the cleaner fills separate survivor segments.)
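A sketch of the hash-table synchronization point, again building on the earlier structures; the single mutex here is a simplification assumed for the sketch, since the real implementation uses finer-grained synchronization.

    #include <mutex>

    // The cleaner copies an object into a survivor segment, then briefly
    // synchronizes with writers to re-point the hash table. If a concurrent
    // write or delete already replaced the object, the survivor copy is
    // simply abandoned.
    bool relocateObject(Log& log, std::mutex& hashTableLock, const Key& key,
                        const Reference& oldRef, const Reference& survivorRef) {
        std::lock_guard<std::mutex> guard(hashTableLock);
        auto it = log.hashTable.find(key);
        if (it == log.hashTable.end() ||
            it->second.segment != oldRef.segment ||
            it->second.offset != oldRef.offset) {
            return false;   // object was deleted or overwritten since cleaning began
        }
        it->second = survivorRef;   // hash table now points at the survivor copy
        return true;
    }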

Throughput vs. Memory Utilization
(1 master, 3 backups, 1 client, concurrent multi-writes)

  Memory Utilization    Performance Degradation
  80%                   17-27%
  90%                   26-49%
  80%                   14-15%
  90%                   30-42%
  80%                   3-4%
  90%                   3-6%

1-Level vs. 2-Level Cleaning
(Figure: throughput with one-level cleaning vs. two-level cleaning.)

Cleaner’s Impact on Latency
(1 client, sequential 100 B overwrites, no locality, 90% utilization)
● Median latency: 16.70 µs with cleaning vs. 16.35 µs with no cleaner
● 99.9th percentile: 900 µs with cleaning vs. 115 µs with no cleaner

Additional Material in Paper
● Tombstones: log entries that mark deleted objects
  ○ A mixed blessing: they impact cleaner performance
● Preventing memory deadlock
  ○ The cleaner needs free space in order to free space
● Fixed a segment-selection defect in LFS
● Modified memcached to use log-structured memory:
  ○ 15-30% better memory utilization
  ○ 3% higher throughput
  ○ Negligible cleaning cost (5% CPU utilization)
● YCSB benchmarks vs. HyperDex and Redis:
  ○ RAMCloud performs better, except against Redis under write-heavy workloads with slow RPCs

Related Work
● Storage allocators and garbage collectors
  ○ Performance limited by lack of control over pointers
  ○ Some slab allocators are almost log-like (a slab plays the role of a segment)
● Log-structured file systems
  ○ In RAMCloud all information is in DRAM (faster, more efficient cleaning)
● Other large-scale storage systems
  ○ Increasing use of DRAM: Bigtable/LevelDB, Redis, memcached, H-Store, ...
  ○ Log-structured mechanisms for distributed replication
  ○ Tombstone-like objects for deletion
  ○ Most use traditional memory allocators

Conclusion
● The logging approach is an efficient way to allocate memory (if pointers are restricted)
  ○ Allows 80-90% memory utilization
  ○ Good performance (no pauses)
  ○ Tolerates workload changes
● Works particularly well in RAMCloud
  ○ Both disk and DRAM are managed with the same mechanism
● Also makes sense for other DRAM-based storage systems

Tombstones
● Server crash? The log is replayed on other servers to reconstruct the lost data
● Tombstones identify deleted objects:
  ○ Written into the log when an object is deleted or overwritten
  ○ Information in a tombstone:
    ● Table id
    ● Object key
    ● Version of the dead object
    ● Id of the segment where the object was stored
● When can tombstones be deleted?
  ○ After the segment containing the object has been cleaned (and its replicas deleted on the backups)
● Note: tombstones are a mixed blessing
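A minimal sketch of a tombstone record and the rule for when it can be dropped; the field names are illustrative, not RAMCloud's actual log-entry format.

    #include <cstdint>
    #include <string>

    // Illustrative tombstone record: enough information for crash-recovery
    // replay to cancel the dead object, and for the cleaner to decide when
    // the tombstone itself is no longer needed.
    struct Tombstone {
        uint64_t tableId;        // table the dead object belonged to
        std::string objectKey;   // key of the dead object
        uint64_t version;        // version of the object that was deleted
        uint64_t segmentId;      // id of the segment that held the dead object
    };

    // A tombstone may be freed only after the segment it refers to has been
    // cleaned (and that segment's replicas deleted on the backups); from then
    // on, log replay can never encounter the dead object again.
    bool tombstoneCanBeFreed(const Tombstone& t,
                             bool (*segmentCleaned)(uint64_t segmentId)) {
        return segmentCleaned(t.segmentId);
    }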