Log-Structured Memory for DRAM-Based Storage
Stephen Rumble and John Ousterhout
Stanford University

Introduction
● Storage allocation: the impossible dream
  ▪ Fast allocation/deallocation
  ▪ Efficient use of memory
  ▪ Handle changing workloads
● RAMCloud: log-structured allocator
  ▪ Incremental garbage collector for both disk and DRAM
  ▪ Two-level approach to cleaning (separate policies for disk and DRAM)
  ▪ Concurrent cleaning
● Results:
  ▪ High performance even at 80-90% memory utilization
  ▪ No pauses
  ▪ Handles changing workloads

RAMCloud Overview
● Datacenter storage system
● Key-value store
● All data in DRAM at all times
  ▪ Disk/flash used for backup only
● Large scale:
  ▪ 1,000-10,000 servers
  ▪ 100 TB - 10 PB capacity
● Low latency:
  ▪ 5 µs remote access
[Architecture diagram: 1,000-100,000 application servers (application + RAMCloud library) reach 1,000-10,000 storage servers (each running a master and a backup) over the datacenter network, plus a coordinator.]

Workload Sensitivities
● Allocators waste memory if workloads change:
  ▪ E.g., W2 (simulates a schema change):
     ● Allocate 100B objects
     ● Gradually overwrite them with 130B objects
● All existing allocators waste at least 50% of memory under some conditions.
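To make the waste concrete, here is a minimal sketch (hypothetical code, not from the paper) of W2 against an idealized non-copying allocator: every freed 100 B slot is too small for its 130 B replacement, so each overwrite extends the heap and strands the old slot as a hole.

```cpp
// W2 against a non-copying allocator: 100 B holes cannot hold 130 B
// replacements, so the heap grows even though the live set barely does.
#include <cstdio>

int main() {
    const long kObjects = 1000;
    const long kOldSize = 100, kNewSize = 130;

    // Phase 1: fill memory with 100 B objects, packed back to back.
    long heapEnd = kObjects * kOldSize;

    // Phase 2: overwrite each object with a 130 B version. The new copy
    // cannot reuse the freed 100 B slot, so it is appended to the heap.
    heapEnd += kObjects * kNewSize;
    long liveBytes = kObjects * kNewSize;

    printf("live: %ld B, heap: %ld B, utilization: %.0f%%\n",
           liveBytes, heapEnd, 100.0 * liveBytes / heapEnd);
    // Prints ~57% utilization: over 40% of memory is stranded in holes;
    // allocators with size-class rounding can do even worse.
}
```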

Non-Copying Allocators
● Blocks cannot be moved once allocated
● Result: fragmentation
[Diagram: current memory layout with scattered free areas, each smaller than the space needed for a new allocation.]

Copying Allocators
● Garbage collector moves objects and coalesces free space
● Expensive, scales poorly:
  ▪ Must scan all memory to find and update pointers
  ▪ Only collect when there is lots of free space
● State of the art: 3-5x overallocation of memory
● Long pauses: 3+ seconds for a full GC
[Diagram: memory layout before and after collection; live objects are compacted and free space is coalesced.]

Storage System Goals
● Use memory efficiently even with workload changes (80-90% utilization)
● Must use a copying approach
● Must be able to collect free space incrementally:
  ▪ Pick areas with the most free space
  ▪ Avoid long pauses
● Key advantage: restricted use of pointers
  ▪ Pointers stored only in index structures
  ▪ Easy to locate the pointers for a given memory block

Log-Structured Storage
● A master stores its objects in an append-only log of 8 MB segments; new objects are added at the log head.
● A hash table keyed on {table id, object key} points to each object's current location in the log.
● Each segment is replicated on the disks of 3 backup servers.
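This structure can be sketched in a few lines of C++ (types and names here are illustrative, not RAMCloud's actual classes; replication to backups is omitted). The key property is that the hash table holds the only pointer to each object, which is what makes relocation by the cleaner cheap:

```cpp
// Minimal sketch of a log-structured master: objects live only in
// append-only segments; the hash table holds the single pointer to each.
#include <cstdint>
#include <cstring>
#include <memory>
#include <unordered_map>
#include <vector>

struct Key {
    uint64_t tableId;
    uint64_t objectKey;
    bool operator==(const Key& o) const {
        return tableId == o.tableId && objectKey == o.objectKey;
    }
};
struct KeyHash {
    size_t operator()(const Key& k) const {
        return std::hash<uint64_t>()(k.tableId) ^ (k.objectKey * 0x9e3779b97f4a7c15ULL);
    }
};

constexpr size_t kSegmentSize = 8 * 1024 * 1024;   // 8 MB segments

struct Segment {
    std::unique_ptr<uint8_t[]> data{new uint8_t[kSegmentSize]};
    size_t used = 0;
};

class Log {
public:
    // Append a new object at the log head; returns its location.
    uint8_t* append(const void* value, size_t len) {
        if (head_ == nullptr || head_->used + len > kSegmentSize) {
            segments_.emplace_back(new Segment());  // open a new head segment
            head_ = segments_.back().get();
        }
        uint8_t* where = head_->data.get() + head_->used;
        memcpy(where, value, len);
        head_->used += len;
        return where;
    }
private:
    std::vector<std::unique_ptr<Segment>> segments_;
    Segment* head_ = nullptr;
};

class Master {
public:
    void write(Key key, const void* value, size_t len) {
        // The old version (if any) becomes dead space in the log;
        // the cleaner reclaims it later.
        hashTable_[key] = log_.append(value, len);
    }
    const uint8_t* read(Key key) {
        auto it = hashTable_.find(key);
        return it == hashTable_.end() ? nullptr : it->second;
    }
private:
    Log log_;
    std::unordered_map<Key, uint8_t*, KeyHash> hashTable_;  // the only pointers
};
```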

Log Cleaning
● Pick segments with lots of free space
● Copy live objects (survivors) into new segment(s)
● Free the cleaned segments (reuse them for new objects)
● Cleaning is incremental: only a few segments are cleaned at a time
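A self-contained sketch of one cleaning pass (a simplified model, not the RAMCloud cleaner: objects are (id, size) pairs, and an object is live if the index still points at its segment). The real selection policy is more sophisticated than "emptiest first", but this captures the core mechanism:

```cpp
// One incremental cleaning pass: clean the emptiest segments, copy
// survivors into a fresh segment, repoint the index, free the old ones.
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Obj { uint64_t id; size_t size; };
struct Seg {
    std::vector<Obj> objs;
    size_t liveBytes(const std::unordered_map<uint64_t, const Seg*>& index) const {
        size_t n = 0;
        for (const Obj& o : objs)
            if (index.count(o.id) && index.at(o.id) == this) n += o.size;
        return n;
    }
};

void cleanOnce(std::vector<Seg*>& segs,
               std::unordered_map<uint64_t, const Seg*>& index,
               size_t batch) {
    // 1. Clean the segments with the least live data first.
    std::sort(segs.begin(), segs.end(), [&](Seg* a, Seg* b) {
        return a->liveBytes(index) < b->liveBytes(index);
    });
    Seg* survivor = new Seg();                      // fresh survivor segment
    batch = std::min(batch, segs.size());
    for (size_t i = 0; i < batch; i++) {
        // 2. Copy each survivor and update its only pointer in the index.
        for (const Obj& o : segs[i]->objs)
            if (index.count(o.id) && index.at(o.id) == segs[i]) {
                survivor->objs.push_back(o);
                index[o.id] = survivor;
            }
        delete segs[i];                             // 3. Segment is now free.
    }
    segs.erase(segs.begin(), segs.begin() + batch);
    segs.push_back(survivor);
}
```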

Cleaning Cost
● Cleaning cost increases with memory utilization (worked values below).
  ▪ Let U be the fraction of bytes still live in the segments being cleaned:
      Bytes copied by cleaner:      U
      Bytes freed:                  1 - U
      Bytes copied per byte freed:  U / (1 - U)
● Conflict between disk and memory:
  ▪ The initial RAMCloud implementation cleaned disk and memory together.
  ▪ Better to run the disk at low utilization to reduce cleaning costs.
  ▪ But that would mean low utilization of DRAM too.
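A few plugged-in values show how steeply the copying cost rises with utilization; this tiny program just evaluates U/(1-U):

```cpp
// Write amplification of cleaning: bytes copied per byte freed = U/(1-U).
#include <cstdio>
#include <initializer_list>

int main() {
    for (double u : {0.5, 0.8, 0.9, 0.95})
        printf("U = %.2f -> %4.1f bytes copied per byte freed\n",
               u, u / (1.0 - u));
    // U = 0.50 -> 1.0, U = 0.80 -> 4.0, U = 0.90 -> 9.0, U = 0.95 -> 19.0
}
```

This is why the disk log wants to run at low utilization (cheap cleaning) while DRAM wants to run at 80-90%: exactly the tension that two-level cleaning resolves.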

Two-Level Cleaning
● Combined cleaning:
  ▪ Clean multiple segments at once
  ▪ Write new survivor segments (disk and memory)
  ▪ Free the old segments (disk and memory)
● Compaction:
  ▪ Clean a single segment in memory
  ▪ Free unused memory space
  ▪ No change to the disk log
[Diagram: compaction shrinks a segment in DRAM while its disk replica is unchanged; combined cleaning rewrites both.]
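One way to picture the resulting policy (thresholds and names here are illustrative assumptions, not RAMCloud's actual heuristics): compaction handles routine DRAM reclamation at zero disk-bandwidth cost, and combined cleaning runs only when dead space in the disk log must be reclaimed.

```cpp
// Hedged sketch of a two-level cleaning policy; the 0.2 and 0.1
// thresholds are made up for illustration.
#include <cstdio>

struct LogStats { double memoryFree; double diskFree; };  // fractions

enum class CleanerAction { Idle, Compact, CombinedClean };

CleanerAction choose(const LogStats& s) {
    // Combined cleaning is the only way to reclaim disk space, so run it
    // when the disk log is bloated; keeping disk utilization low makes
    // it cheap when it does run.
    if (s.diskFree < 0.2) return CleanerAction::CombinedClean;
    // Otherwise reclaim DRAM with in-memory compaction, which costs no
    // disk bandwidth and lets memory run at 80-90% utilization.
    if (s.memoryFree < 0.1) return CleanerAction::Compact;
    return CleanerAction::Idle;
}

int main() {
    printf("%d\n", (int)choose({0.05, 0.5}));  // memory tight -> Compact
}
```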

Two-Level Cleaning, cont'd
● Best of both worlds:
  ▪ Optimize utilization of memory (can afford the high bandwidth cost of compaction)
  ▪ Optimize disk bandwidth (can afford extra disk space to reduce cleaning cost)

Parallel Cleaning
● The cleaner runs concurrently with normal requests
● The log is immutable (no updates in place)
● Survivor data is written to a "side log"
  ▪ No competition for the log head
● Synchronization points:
  ▪ Updates to the hash table (the cleaner may move an object while it is being read or written)
  ▪ Adding survivor segments to the log
  ▪ Freeing cleaned segments (wait for active requests to complete)
[Diagram: survivor segments accumulate in a side log while normal writes continue at the log head.]
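The hash-table synchronization point can be sketched with an atomic compare-and-swap (a hedged sketch; RAMCloud's real synchronization is more involved):

```cpp
// The cleaner installs an object's new location with a CAS, so a
// concurrent client write that already replaced the object wins.
#include <atomic>
#include <cstdint>

struct Entry { std::atomic<uint8_t*> location; };

// Called by the cleaner after copying an object into a survivor segment.
bool relocate(Entry& e, uint8_t* oldLoc, uint8_t* newLoc) {
    uint8_t* expected = oldLoc;
    // If the entry still points at the old copy, swing it to the survivor
    // copy. If a concurrent write changed it meanwhile, leave it alone:
    // the survivor copy we just wrote is simply dead space that a later
    // cleaning pass will reclaim.
    return e.location.compare_exchange_strong(expected, newLoc);
}
```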

Client Write Throughput
[Figure: client write throughput at varying memory utilization, one panel per workload. Degradation relative to running without a cleaner:]
● 80% utilization: 17-27% degradation; 90% utilization: 26-49%
● 80% utilization: 14-15% degradation; 90% utilization: 30-42%
● 80% utilization: 3-4% degradation; 90% utilization: 3-6%

1-Level vs. 2-Level Cleaning
[Figure: throughput of one-level cleaning vs. two-level cleaning.]

Cleaner's Impact on Latency
(100B overwrites, no locality)
● Median latency: 16.70 µs with cleaning vs. 16.35 µs with no cleaner
● 99.9th percentile: 900 µs with cleaning vs. 115 µs with no cleaner

Conclusion
● The logging approach to storage allocation works well if pointers are restricted:
  ▪ Allows 80-90% memory utilization
  ▪ Good performance independent of workload
  ▪ Supports concurrent cleaning: no pauses
● Works particularly well in RAMCloud:
  ▪ Manages both disk and DRAM with the same mechanism
● Also makes sense for other DRAM-based storage systems (see the paper for details)

Tombstones
● Server crash? Replay the log on other servers to reconstruct the lost data.
● Tombstones identify deleted objects:
  ▪ Written into the log when an object is deleted or overwritten
  ▪ Each tombstone records (sketched below):
     ● Table id
     ● Object key
     ● Version of the dead object
     ● Id of the segment where the object was stored
● When can tombstones be deleted?
  ▪ After the segment containing the object has been cleaned (and its replicas deleted on the backups)
● Note: tombstones are a mixed blessing.
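A concrete (assumed) layout for the tombstone record described above; the four fields come from the slide, while the types and the fixed-size key are simplifications, not RAMCloud's on-log format:

```cpp
#include <cstdint>

// Sketch of a tombstone record; RAMCloud's actual format differs (keys,
// for instance, are variable-length).
struct Tombstone {
    uint64_t tableId;        // table that held the dead object
    uint64_t objectKey;      // key of the dead object (fixed-size here)
    uint64_t objectVersion;  // version of the dead object
    uint64_t segmentId;      // segment that stored the dead object
};

// During crash-recovery replay, a resurrected object is discarded if the
// log also contains a tombstone matching its table id and key with a
// version at least as new; segmentId lets the cleaner decide when the
// tombstone itself may be dropped (once that segment has been cleaned).
```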