RAMCloud: a Low-Latency Datacenter Storage System
John Ousterhout, Stanford University
Exascale Radio Astronomy Conference, April 3, 2014


Introduction
● RAMCloud: a new class of datacenter storage; all data is always in DRAM
● Large scale:
  ▪ 1,000-10,000 storage servers
  ▪ 100 TB - 1 PB total capacity
● Low latency:
  ▪ 5 µs remote access time from anywhere in the datacenter
● Durable and available
● Overall goal: enable a new class of applications
● Does this make sense for Radio Astronomy apps?

Traditional Storage Choices

                             Local DRAM    Local Disk           Local Flash            Network Disk
Latency                      50-100 ns     5-10 ms              50-200 µs              5-10 ms
Bandwidth                    10-50 GB/s    100 MB/s (per disk)  250 MB/s (per drive)   1 GB/s/node (network)
Maximum capacity             1 TB          5-20 TB              1-5 TB                 10-100 PB
Maximum cores                24            24                   24                     scalable
Durable and fault tolerant?  no            partially            partially              yes

RAMCloud Architecture (diagram)
● 1,000-10,000 commodity storage servers, each running a Master (data in DRAM) and a Backup (replicas on disk/flash)
● 1,000-100,000 application servers, each using the RAMCloud client library
● A Coordinator that manages cluster configuration
● Datacenter network with high-speed networking: 5 µs round-trip, full bisection bandwidth

Data Model: Key-Value Store
● Objects are grouped into tables; each object has a key (≤ 64 KB), a 64-bit version, and a blob value (≤ 1 MB)
● Basic operations:
  ▪ read(tableId, key) => blob, version
  ▪ write(tableId, key, blob) => version
  ▪ delete(tableId, key)
● Other operations:
  ▪ cwrite(tableId, key, blob, version) => version (only overwrite if version matches)
  ▪ Enumerate objects in table
  ▪ Efficient multi-read, multi-write
  ▪ Atomic increment
● Under development:
  ▪ Secondary indexes
  ▪ Atomic updates of multiple objects
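To make the data model concrete, here is a minimal, self-contained C++ sketch of these operations, including the conditional-write semantics of cwrite. It is an in-memory toy, not the RAMCloud client library: the MiniKV class, its signatures, and the example table and keys are invented for illustration only.

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>

// A single stored object: in RAMCloud the key is <= 64 KB, the blob <= 1 MB,
// and the version is a 64-bit counter bumped on every write.
struct Object {
    std::string blob;
    uint64_t version = 0;
};

class MiniKV {   // hypothetical stand-in, not the real client API
public:
    uint64_t createTable(const std::string& /*name*/) {
        uint64_t id = nextTableId_++;
        tables_[id];                       // create an empty table
        return id;
    }

    // read(tableId, key) => blob, version
    std::optional<Object> read(uint64_t tableId, const std::string& key) const {
        auto t = tables_.find(tableId);
        if (t == tables_.end()) return std::nullopt;
        auto o = t->second.find(key);
        if (o == t->second.end()) return std::nullopt;
        return o->second;
    }

    // write(tableId, key, blob) => version
    uint64_t write(uint64_t tableId, const std::string& key, const std::string& blob) {
        Object& obj = tables_.at(tableId)[key];
        obj.blob = blob;
        return ++obj.version;
    }

    // cwrite(tableId, key, blob, version) => version
    // Conditional write: only overwrite if the stored version still matches.
    uint64_t cwrite(uint64_t tableId, const std::string& key,
                    const std::string& blob, uint64_t expectedVersion) {
        Object& obj = tables_.at(tableId)[key];
        if (obj.version != expectedVersion)
            throw std::runtime_error("version mismatch: object changed since it was read");
        obj.blob = blob;
        return ++obj.version;
    }

    // delete(tableId, key); named remove() because 'delete' is a C++ keyword
    void remove(uint64_t tableId, const std::string& key) {
        tables_.at(tableId).erase(key);
    }

private:
    uint64_t nextTableId_ = 1;
    std::map<uint64_t, std::map<std::string, Object>> tables_;
};

int main() {
    MiniKV kv;
    uint64_t t  = kv.createTable("antennas");                          // example table name
    uint64_t v1 = kv.write(t, "dish-42", "pointing=12.7,-33.1");       // version 1
    uint64_t v2 = kv.cwrite(t, "dish-42", "pointing=12.8,-33.0", v1);  // succeeds
    std::cout << "versions: " << v1 << " -> " << v2 << "\n";
    try {
        kv.cwrite(t, "dish-42", "stale update", v1);                   // rejected: v1 is stale
    } catch (const std::exception& e) {
        std::cout << "rejected: " << e.what() << "\n";
    }
    std::cout << "current blob: " << kv.read(t, "dish-42")->blob << "\n";
}
```

The version field is what makes optimistic concurrency possible: a reader records the version it saw, and a later cwrite succeeds only if no other write has touched the object in between.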

Data Durability
● One copy of data in DRAM
● Multiple copies on disk/flash
  ▪ Each master's backup data scattered across the cluster
● Fast crash recovery
  ▪ Remaining servers work together to recover lost data
  ▪ Typical recovery time: 1-2 seconds
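The effect of scattering can be shown with a short simulation. This is a hedged sketch, not RAMCloud code: the segment count, replication factor, and uniform random placement below are assumptions chosen for illustration, but they show why spreading one master's backup segments across the whole cluster lets recovery proceed in parallel. After a crash, each backup only has to read and return a small slice of the lost master's log.

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <random>
#include <set>
#include <vector>

int main() {
    const int numBackups  = 1000;   // servers available to hold replicas (assumed)
    const int numSegments = 8192;   // e.g. a 64 GB log split into 8 MB segments (assumed)
    const int replicas    = 3;      // disk/flash copies of each segment (assumed)

    std::mt19937 rng(42);
    std::uniform_int_distribution<int> pick(0, numBackups - 1);

    // Scatter: choose 'replicas' distinct backups for every segment of one master's log.
    std::vector<int> segmentsPerBackup(numBackups, 0);
    for (int seg = 0; seg < numSegments; seg++) {
        std::set<int> chosen;
        while (static_cast<int>(chosen.size()) < replicas)
            chosen.insert(pick(rng));
        for (int b : chosen)
            segmentsPerBackup[b]++;
    }

    // If this master crashes, every backup reads its own slice of the log in
    // parallel, so recovery time tracks the busiest backup, not the total log size.
    int maxLoad = *std::max_element(segmentsPerBackup.begin(), segmentsPerBackup.end());
    std::cout << "busiest backup holds " << maxLoad << " of "
              << numSegments * replicas << " scattered segment replicas\n";
}
```

With replicas spread this evenly, the crashed master's DRAM contents can be reconstructed by hundreds of servers at once, which is what makes 1-2 second recovery plausible.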

RAMCloud Performance (1 client, 1 server)
● Using Infiniband networking (24 Gb/s, kernel bypass)
  ▪ Other networking also supported, but slower
● Reads:
  ▪ 100B objects: 5 µs
  ▪ 10KB objects: 10 µs
  ▪ Single-server throughput (100B objects): 700 Kops/sec
  ▪ Small-object multi-reads: 1-2M objects/sec
● Durable writes:
  ▪ 100B objects: 16 µs
  ▪ 10KB objects: 40 µs
  ▪ Small-object multi-writes: K objects/sec
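These figures come from microbenchmarks with a single client talking to a single server. As a rough illustration of how such numbers are typically gathered, the sketch below times many back-to-back operations and reports the median and 99th percentile. remoteRead is a hypothetical stand-in for the real client call, so the printed values are meaningless except as a template.

```cpp
#include <algorithm>
#include <chrono>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for a networked read; a real measurement would invoke
// the storage system's client library here.
static std::string remoteRead(const std::string& key) {
    return "value-for-" + key;
}

int main() {
    const int iterations = 100000;
    std::vector<double> micros;
    micros.reserve(iterations);
    size_t checksum = 0;                    // keeps the compiler from eliding the calls

    for (int i = 0; i < iterations; i++) {
        auto start = std::chrono::steady_clock::now();
        checksum += remoteRead("object-100B").size();   // operation under test
        auto end = std::chrono::steady_clock::now();
        micros.push_back(std::chrono::duration<double, std::micro>(end - start).count());
    }

    std::sort(micros.begin(), micros.end());
    std::cout << "median latency:  " << micros[iterations / 2] << " us\n"
              << "99th percentile: " << micros[iterations * 99 / 100] << " us\n"
              << "(checksum " << checksum << ")\n";
}
```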

Comparisons

                             Local DRAM    Network Disk            RAMCloud
Latency                      50-100 ns     5-10 ms                 5 µs
Bandwidth                    10-50 GB/s    1 GB/s/node (network)   1 GB/s/node (network)
Maximum capacity             1 TB          10-100 PB               1-5 PB
Maximum cores                24            scalable                scalable
Durable and fault tolerant?  no            yes                     yes

RAMCloud Status
● Ongoing research project at Stanford
● Goal: production-quality system
  ▪ Source code freely available
  ▪ Version 1.0 tagged in January 2014 (first version suitable for real applications)
  ▪ Starting to work with early adopters
● System requirements:
  ▪ x86 servers (minimum cluster size: servers)
  ▪ Linux operating system
  ▪ Need networking with kernel-bypass NICs
    ● Built-in support for Mellanox Infiniband
    ● Driver for SolarFlare 10 Gb/s Ethernet NICs under development

Is RAMCloud Right for You?
Issues to consider:
● Remote access data model
  ▪ Sparse vs. bulk
● Key-value store
● Durability

Large-Scale Applications
(Diagram: computation nodes and data/storage nodes connected by the datacenter network)

Computation and data colocated:
● Works best for:
  ▪ Bulk processing (touch all data)
  ▪ High locality of access
● Performance dominated by bandwidth
● Examples: analytics

Remote data access (separate computation and storage nodes):
● Works best for:
  ▪ Sparse and unpredictable data accesses
  ▪ No locality
● Performance dominated by latency
● Example: transactional Web applications (Facebook)
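The bandwidth-versus-latency distinction can be made concrete with a back-of-envelope calculation using the latency figures from the earlier comparison table; the access count below is an arbitrary assumption.

```cpp
#include <iostream>

int main() {
    const double accesses        = 1e6;   // sparse, unpredictable lookups (assumed workload)
    const double diskLatency     = 5e-3;  // ~5 ms per networked-disk access (from the table)
    const double ramcloudLatency = 5e-6;  // ~5 us per RAMCloud access

    // With no locality, total time is roughly accesses * per-access latency.
    std::cout << "networked disk: " << accesses * diskLatency     << " s\n";  // ~5000 s
    std::cout << "RAMCloud:       " << accesses * ramcloudLatency << " s\n";  // ~5 s
}
```

For sparse access patterns the three-orders-of-magnitude latency gap translates directly into elapsed time, which is why the remote-access model only becomes attractive once per-access latency drops to microseconds.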

Conclusion
● RAMCloud: general-purpose DRAM-based storage
  ▪ Scale
  ▪ Latency
● Goals:
  ▪ Harness the full performance potential of DRAM-based storage
  ▪ Enable new applications: intensive manipulation of large-scale data
● What could you do with:
  ▪ 1M cores
  ▪ 1 petabyte of data
  ▪ 5-10 µs access time