Storage: Scaling Out > Scaling Up? Ankit Singla and Chi-Yao Hong


How Can Great Firms Fail? C. M. Christensen

Sustain vs. Disrupt
Sustaining technologies
– Customer demand driven
– Pioneered by incumbent big firms
Disruptive technologies
– Often convince people: "You want this!"
– Create new markets, new customers
– Often pioneered by new firms
LOWER ON TECH, BIGGER ON OPPORTUNITY

Sustain vs. Disrupt
[Figure: performance over time; sustaining and disruptive technology curves plotted against high-end and low-end customer demand]

A Case for Redundant Arrays of Inexpensive Disks (RAID) David Patterson, Garth Gibson and Randy Katz

Why RAID?
Performance of SLEDs (Single Large Expensive Disks) has increased slowly
Amdahl's law and the I/O crisis
– With a 100x faster CPU but 10% of time spent on I/O, what is the overall speedup S? (worked out below)
Large memories / SSDs as buffers don't work
– Transaction processing: large volume of small requests
– Supercomputing-like environments: small volume of large requests
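A quick worked version of the Amdahl's law question above, using the slide's numbers (100x CPU speedup, 10% of time in I/O); the formula itself is standard:

```latex
% Amdahl's law: speed up a fraction f of the work by a factor k.
% Here the CPU-bound 90% gets 100x faster; the 10% spent on I/O is unchanged.
\[
S = \frac{1}{(1 - f) + \frac{f}{k}}
  = \frac{1}{0.1 + \frac{0.9}{100}}
  \approx 9.2
\]
```

So a 100x faster CPU yields less than a 10x overall speedup, which is the I/O crisis the paper uses as motivation.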

Inexpensive Disks
Lower cost, lower performance
Better price/MB, smaller size and power footprint
Also, lower reliability for the same capacity: ADD REDUNDANCY!

RAID 1
Idea: just mirror each disk
Better throughput for reads
Huge MTTF (rough estimate below)
Inefficient / costly
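A rough sketch of why mirroring gives a huge MTTF, using the standard independent-failure approximation; the example numbers are illustrative, not from the paper:

```latex
% A mirrored pair loses data only if the surviving disk also fails
% before the first failure is repaired (within the MTTR window).
\[
\mathrm{MTTF}_{\mathrm{pair}} \approx \frac{\mathrm{MTTF}_{\mathrm{disk}}^{2}}{2 \cdot \mathrm{MTTR}}
\]
% e.g., MTTF_disk = 30,000 h and MTTR = 1 h give about 4.5 x 10^8 hours.
```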

RAID 2
Hamming codes for error detection/correction
Hamming (7, 4) example: 3 parity disks protect 4 data disks (encoding sketch below)
Inefficient for small data transfers
High cost for reliability
Parity coverage (codeword order P1 P2 D1 P3 D2 D3 D4):
– P1 checks P1, D1, D2, D4
– P2 checks P2, D1, D3, D4
– P3 checks P3, D2, D3, D4
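A minimal Hamming(7,4) sketch to make the parity coverage above concrete; this is illustrative code (bit layout p1 p2 d1 p3 d2 d3 d4), not anything from the paper:

```python
# Minimal Hamming(7,4) encoder/checker sketch (illustrative).
# Codeword layout: [p1, p2, d1, p3, d2, d3, d4]; each parity bit is the XOR
# of the positions whose 1-based index includes that parity's power of two.

def hamming74_encode(d1, d2, d3, d4):
    p1 = d1 ^ d2 ^ d4          # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_syndrome(codeword):
    """Return the 1-based position of a single-bit error, or 0 if none."""
    p1, p2, d1, p3, d2, d3, d4 = codeword
    s1 = p1 ^ d1 ^ d2 ^ d4     # check over positions 1, 3, 5, 7
    s2 = p2 ^ d1 ^ d3 ^ d4     # check over positions 2, 3, 6, 7
    s3 = p3 ^ d2 ^ d3 ^ d4     # check over positions 4, 5, 6, 7
    return s1 * 1 + s2 * 2 + s3 * 4

cw = hamming74_encode(1, 0, 1, 1)
cw[4] ^= 1                     # flip one bit (position 5)
assert hamming74_syndrome(cw) == 5
```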

RAID 3
Disk controllers know which disk failed
– So just use 1 parity disk to reconstruct that disk
Low-enough cost of reliability
RAID 4 and 5 improve throughput for small accesses

RAID 4: SECTORS, NOT BITS
Exploit parallel I/O across disks in a group
– Interleave data at the sector level instead of the bit level
A small write to one disk doesn't touch the other data disks (see the sketch below)
But the check disk is the bottleneck
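A minimal sketch of the small-write path that makes the single check disk the bottleneck: every small write does a read-modify-write of the parity. The toy "disks" (plain dicts) are illustrative, not the paper's design:

```python
# Sketch of the RAID 4 small-write path (read-modify-write parity update).
# A toy "disk" is a dict from sector number to a byte value.

def small_write(data_disk, parity_disk, sector, new_byte):
    old_byte = data_disk[sector]
    old_parity = parity_disk[sector]
    # New parity = old parity XOR old data XOR new data: XOR removes this
    # disk's old contribution from the parity and adds the new one.
    parity_disk[sector] = old_parity ^ old_byte ^ new_byte
    data_disk[sector] = new_byte

# Example: three data disks plus one check disk, one sector each.
data = [{0: 0x11}, {0: 0x22}, {0: 0x33}]
parity = {0: 0x11 ^ 0x22 ^ 0x33}
small_write(data[1], parity, sector=0, new_byte=0x55)
assert parity[0] == 0x11 ^ 0x55 ^ 0x33   # parity still covers all data disks
```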

RAID 5
Distribute the parity blocks across disks! (layout sketch below)
Good performance across small/large reads/writes
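A small sketch of rotated parity placement; the rotation convention here is one common choice and only illustrative:

```python
# Rotated parity placement across N disks (RAID 5 style sketch).

def parity_disk_for_stripe(stripe: int, num_disks: int) -> int:
    """Disk index holding the parity block for a given stripe."""
    return (num_disks - 1 - stripe) % num_disks

def layout(num_disks: int, num_stripes: int):
    """Return a stripe-by-disk grid of 'P' (parity) or 'D' (data) labels."""
    return [
        ["P" if d == parity_disk_for_stripe(s, num_disks) else "D"
         for d in range(num_disks)]
        for s in range(num_stripes)
    ]

for row in layout(num_disks=5, num_stripes=5):
    print(" ".join(row))
# Each disk holds parity for 1 in N stripes, so small writes no longer all
# queue behind a single dedicated check disk.
```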

RAID vs. SLED
10x performance, reliability, power efficiency
Modular growth
– Add groups or even individual disks
– Also makes hot standbys practical
Lower power/MB makes battery backup for the whole disk array feasible

Comparison of RAID Levels

Discussion
What's wrong with the organization below?
[Figure: layering of File System on top of RAID on top of the Disk Driver]
Other applications for the RAID ideas?
– Across data centers or cloud providers?
Better metrics for reliability than MTTF?

FAWN: A Fast Array of Wimpy Nodes David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan (figures adapted from the slides by Vijay Vasudevan)

Facebook Relies on Coal Power in Data Center

Energy in Computing Is a Real Issue
"Google's power consumption … would incur an annual electricity bill of nearly $38 million" [1]
"Energy consumption by … data centers could nearly double... (by 2011) to more than 100 billion kWh, representing a $7.4 billion annual electricity cost" [2]
[1] Cutting the Electric Bill for Internet-Scale Systems, SIGCOMM'09
[2] Environmental Protection Agency (EPA) Report

FAWN: A Fast Array of Wimpy Nodes

FAWN-KV: An Energy-Efficient Cluster for Key-Value Storage
Goal: improve Queries/Joule
Prototype: PC Engines Alix 3c2 devices
– Single-core 500 MHz AMD Geode LX
– 256 MB DDR SDRAM (400 MHz)
– 100 Mbps Ethernet
– 4 GB Sandisk Extreme IV CompactFlash device
Challenges:
– Efficient and fast failover (to achieve SLA)
– Wimpy CPUs, limited DRAM
– Flash is poor at small random writes

FAWN-KV: Architecture
Front-end nodes manage the back-ends, act as the gateway, and route requests (routing sketch below)
Back-end nodes are organized with consistent hashing
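A minimal consistent-hashing sketch of how a front-end can route a key to a back-end node; node names and the single-hash-per-node simplification are illustrative (FAWN-KV additionally uses virtual IDs and replication along the ring):

```python
# Consistent-hashing ring sketch: keys and nodes hash onto the same circular
# space; the first node clockwise from a key owns it.
import bisect
import hashlib

def _hash(s: str) -> int:
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        # Each node is placed on the ring at the hash of its identifier.
        self.points = sorted((_hash(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        h = _hash(key)
        idx = bisect.bisect(self.points, (h, chr(0x10FFFF)))
        return self.points[idx % len(self.points)][1]   # wrap around the ring

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.owner("user:42"))   # a front-end routes this key to one back-end
```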

FAWN-KV Uses a DRAM Hash Index to Map a Key to a Value
Get -> one random flash read
Low probability of needing multiple flash reads (index sketch below)
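A sketch of the in-DRAM index idea: each bucket stores only a small key fragment plus an offset into the on-flash log, so the index fits in the wimpy node's limited DRAM. Bucket count, fragment width, and field layout here are illustrative assumptions:

```python
# FAWN-DS-style DRAM hash index sketch (illustrative sizes and layout).
import hashlib

NUM_BUCKETS = 2 ** 16                         # indexed by low bits of key hash

def key_hash(key: bytes) -> int:
    return int(hashlib.sha1(key).hexdigest(), 16)

class HashIndex:
    def __init__(self):
        self.buckets = [None] * NUM_BUCKETS   # each: (key_fragment, log_offset)

    def put(self, key: bytes, offset: int):
        h = key_hash(key)
        frag = (h >> 16) & 0x7FFF             # small fragment of the hash
        self.buckets[h % NUM_BUCKETS] = (frag, offset)

    def lookup(self, key: bytes):
        h = key_hash(key)
        entry = self.buckets[h % NUM_BUCKETS]
        if entry is None:
            return None
        frag, offset = entry
        if frag != (h >> 16) & 0x7FFF:        # fragment mismatch: not this key
            return None
        # Caller does one random flash read at `offset` and verifies the full
        # key stored there; a fragment collision (rare) costs an extra read.
        return offset
```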

FAWN-DS Uses a Log-Structured Datastore to Avoid Small Random Writes
Log-structured FS: all updates to data and metadata are written sequentially to a continuous stream, called a log
Put: append an entry to the log
Delete: append an entry (a tombstone) to the log
(see the sketch below)
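A minimal append-only datastore sketch in the spirit of FAWN-DS: puts and deletes become sequential appends, and an in-memory map tracks each key's latest offset. The file name, record format, and plain-dict index are illustrative assumptions, not the paper's on-flash format:

```python
# Append-only log store sketch: writes are sequential; gets are one random read.
import os
import struct

class LogStore:
    def __init__(self, path="fawn_ds.log"):
        self.log = open(path, "ab+")
        self.index = {}                        # key -> offset of latest entry

    def _append(self, key: bytes, value: bytes) -> int:
        offset = self.log.seek(0, os.SEEK_END)
        header = struct.pack(">II", len(key), len(value))
        self.log.write(header + key + value)   # sequential write only
        self.log.flush()
        return offset

    def put(self, key: bytes, value: bytes):
        self.index[key] = self._append(key, value)

    def delete(self, key: bytes):
        self._append(key, b"")                 # tombstone: empty value
        self.index.pop(key, None)

    def get(self, key: bytes):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.log.seek(offset)
        klen, vlen = struct.unpack(">II", self.log.read(8))
        self.log.read(klen)                    # skip the stored key
        return self.log.read(vlen)

store = LogStore()
store.put(b"k1", b"hello")
assert store.get(b"k1") == b"hello"
```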

Front-End Manages the VID Membership List
Consistent hashing: node additions and failures require transfer of key-ranges
A back-end node looks at the front-end ID list to choose one to contact to join the ring
Efficient failover

Maintenance Example: How Does FAWN Transfer Data?

FAWN Uses Replication for Fault Tolerance

FAWN: Join Protocol
[Figure: chain replication during a join, with head, mid, and tail nodes and a log flush]

Performance: FAWN Throughput

Performance: FAWN-DS Lookups (…-byte lookups)
System & Storage | Queries per Second (QPS) | Watts | QPS/Watt
Alix3c2 w/ Sandisk (CF) | 1,… | … | …
Desktop w/ Mobi (SSD) | 4,… | … | …
Desktop w/ HD | … | … | …
[remaining table values not preserved in the transcript]

Cost Comparison: FAWN vs. Traditional Servers
Total Cost = Capital Cost + 3-year Power Cost (at $0.10/kWh)
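A small sketch of the slide's cost formula. The node count, per-node price, and wattage in the example are illustrative assumptions, not figures from the paper:

```python
# Total cost = capital cost + 3 years of power at $0.10/kWh (sketch).

HOURS_3YR = 3 * 365 * 24          # 26,280 hours
PRICE_PER_KWH = 0.10

def total_cost(num_nodes: int, capital_per_node: float, watts_per_node: float) -> float:
    power_cost = num_nodes * (watts_per_node / 1000.0) * HOURS_3YR * PRICE_PER_KWH
    return num_nodes * capital_per_node + power_cost

# e.g., a hypothetical 100-node wimpy cluster at $250 and 5 W per node:
print("Total 3-year cost: $%.0f" % total_cost(100, 250.0, 5.0))
```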

Discussion
Does FAWN work well for any workload?
Scalability issues?
– Query latency could degrade
– Potential bottleneck: networking / CPU