Exploiting Flash for Energy Efficient Disk Arrays Shimin Chen (Intel Labs) Panos K. Chrysanthis (University of Pittsburgh) Alexandros Labrinidis (University.

Presentation transcript:

Exploiting Flash for Energy Efficient Disk Arrays Shimin Chen (Intel Labs) Panos K. Chrysanthis (University of Pittsburgh) Alexandros Labrinidis (University of Pittsburgh)

Motivation
Growing concern about data center energy consumption
Energy consumption of data storage:
  - Fastest annual growth among data center components
  - 20% between 2000 and 2006 [EPA report]
Goal: energy-proportional data storage
  - i.e., energy consumption ∝ system utilization
Challenge: HDD is the dominant technology
  - HDD idle power is often 80% of active power
  - Transition to/from standby mode takes ~10 seconds
  - Could incur significant application slowdowns

Previous Approach: Exploit Redundancy and NVRAM
Most storage systems today employ redundancy
  - High reliability, availability, and performance for applications
  - E.g., TPC-E requires redundancy in both data and logs
Idea: spin down disks containing redundant copies of data when the system is under low load
  - Mirror-based (e.g., RAID 10): one disk active per mirror
  - Parity-based (e.g., RAID 5): use parity reconstruction
NVRAM (battery-backed RAM):
  - Maintain redundancy for writes
  - Spin up disks to apply buffered writes when NVRAM is full
[Li & Wang '04] [Pinheiro et al. '06] [Yao & Wang '06]
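The parity-based case relies on the standard RAID 5 property that the parity block is the XOR of the data blocks in a stripe, so a block on a spun-down disk can be rebuilt from the disks that are still spinning. The sketch below illustrates only that property; the function names and the in-memory stripe layout are assumptions made for illustration, not the interface of any cited system.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_block(stripe, index, spun_down):
    """Return stripe[index], rebuilding it if its disk is in standby.

    stripe: one block per disk (data blocks plus one parity block).
    spun_down: index of the single disk that is currently in standby.
    """
    if index != spun_down:
        return stripe[index]
    # XOR of all surviving blocks (data and parity) yields the missing block.
    survivors = [blk for i, blk in enumerate(stripe) if i != spun_down]
    return xor_blocks(survivors)


# Hypothetical 3+1 stripe: three data blocks and their parity.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
stripe = data + [xor_blocks(data)]
```

Reading block 1 while disk 1 is in standby costs a read on every other disk in the stripe, which is why this approach is suited to periods of low load.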

Limitations of Previous Approach
NVRAM size vs. HDD spin-up/down wear cycles
  - Server-class HDDs: ~50,000 spin-up/down cycles
  - A 5-year lifetime allows only ~1.1 spin-up/down cycles per hour
  - NVRAM is expensive and thus small
  - Often hundreds of MB per disk array
  - Requires frequent disk spin-up/down to flush NVRAM-buffered writes, reducing disk lifetime
Exploiting redundancy alone cannot achieve the energy proportionality goal
  - E.g., for mirrored disks, 50% of the disks are active when the system is 1% utilized
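The wear-cycle figure follows from simple arithmetic, and the same arithmetic shows why a buffer of a few hundred MB is too small. The sketch below is a back-of-the-envelope calculation; the 1 MB/s write rate is an assumed value chosen only for illustration, and the model assumes one spin-up per buffer fill.

```python
HOURS_PER_YEAR = 365 * 24


def spin_cycles_per_hour(rated_cycles, lifetime_years):
    """Spin-up/down cycles per hour that exhaust the rating exactly at end of life."""
    return rated_cycles / (lifetime_years * HOURS_PER_YEAR)


def min_buffer_bytes(write_rate_bytes_per_s, cycles_per_hour):
    """Smallest write buffer that keeps flush-triggered spin-ups within budget.

    Assumes the disk spins up once each time the buffer fills.
    """
    seconds_between_flushes = 3600 / cycles_per_hour
    return write_rate_bytes_per_s * seconds_between_flushes


# Figures from the slide: ~50,000 rated cycles, 5-year lifetime.
budget = spin_cycles_per_hour(50_000, 5)

# Assumed (hypothetical) sustained write rate of 1 MB/s under low load.
needed = min_buffer_bytes(1_000_000, budget)
```

Under these assumptions `budget` is about 1.14 cycles per hour and `needed` is about 3.15 GB, roughly an order of magnitude more than the hundreds of MB of NVRAM typical per disk array.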

Proposal 1: Exploit Flash as Write Buffer
Desirable properties of flash:
  - Nonvolatile: maintain redundancy
  - Much cheaper and much larger capacity: reduce spin-up/down cycles
  - Good performance for sequential writes and random reads: can be efficiently used as write buffer under low load
Flash-based cache products with hundreds of GB of capacity are already available for storage systems
Our proposal shares the flash resource:
  - Existing use: improve performance under high load
  - New use: reduce energy consumption under low load
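The effect of buffer capacity on spin-up count can be seen in a toy model. The sketch below is not the authors' design: `BufferedDisk`, its fields, and the flush-when-full policy are assumptions used to show how a larger nonvolatile buffer absorbs more writes before the standby disk has to spin up.

```python
class BufferedDisk:
    """A standby disk fronted by a nonvolatile write buffer (toy model)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.pending = {}   # block address -> latest buffered data
        self.used = 0       # bytes currently held in the buffer
        self.spin_ups = 0   # times the disk had to leave standby
        self.disk = {}      # contents of the (simulated) disk

    def flush(self):
        """Spin the disk up once and apply all buffered writes."""
        self.spin_ups += 1
        self.disk.update(self.pending)
        self.pending.clear()
        self.used = 0

    def write(self, addr, data):
        """Buffer a write; assumes a single write fits in the buffer."""
        old = self.pending.get(addr)
        new_used = self.used - (len(old) if old is not None else 0) + len(data)
        if new_used > self.capacity:
            self.flush()
            new_used = len(data)
        self.pending[addr] = data
        self.used = new_used

    def read(self, addr):
        """Serve the newest copy: the buffer first, then the disk."""
        return self.pending.get(addr, self.disk.get(addr))


def spin_ups_for(capacity_blocks, n_writes, block=4096):
    """Count spin-ups for n_writes distinct block writes."""
    d = BufferedDisk(capacity_blocks * block)
    for addr in range(n_writes):
        d.write(addr, bytes(block))
    return d.spin_ups
```

With 16 distinct block writes, a 4-block buffer forces three spin-ups in this model, while a 16-block buffer forces none.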

Proposal 2: Applications and Storage Collaborate to Further Save Energy
Energy proportionality goal implies spinning down more disks when the system is under very low load
  - Not all data are immediately available
  - Potentially incurs large application slowdown!
Applications (e.g., DBMS) and storage collaborate:
  - DBMS specifies hot and cold address ranges on the RAID volume
  - DBMS chooses object temperature based on user requests and usage patterns in observed workloads
  - Storage guarantees data in the hot address range are always available
  - Opportunities for data movement and replication in storage
  - DBMS can query storage to see if there will be a spin-up delay to access data in cold address ranges
  - DBMS may schedule work differently to tolerate such delays
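One way to picture the collaboration is as a small interface between the DBMS and the storage layer. The slides do not define such an API, so everything in the sketch below (`Volume`, `mark_hot`, `expected_delay`, `schedule`) is a hypothetical illustration; only the ~10 second spin-up delay comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical storage-side view of a RAID volume with hot/cold ranges."""
    spin_up_delay_s: float = 10.0            # ~10 s transition quoted in the slides
    hot_ranges: list = field(default_factory=list)
    cold_disks_spinning: bool = False

    def mark_hot(self, start, end):
        """DBMS declares addresses in [start, end) as hot."""
        self.hot_ranges.append((start, end))

    def is_hot(self, addr):
        return any(s <= addr < e for s, e in self.hot_ranges)

    def expected_delay(self, addr):
        """DBMS query: seconds of spin-up delay before addr can be served."""
        if self.is_hot(addr) or self.cold_disks_spinning:
            return 0.0
        return self.spin_up_delay_s


def schedule(volume, requests):
    """Run requests with no spin-up delay first (stable order otherwise)."""
    return sorted(requests, key=volume.expected_delay)


vol = Volume()
vol.mark_hot(0, 100)
order = schedule(vol, [500, 10, 700, 20])
```

Here the DBMS serves addresses 10 and 20 immediately and defers 500 and 700, which lets it overlap useful work with the spin-up of the cold disks.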