1 PARAID: A Gear-Shifting Power-Aware RAID Charles Weddle, Mathew Oldham, Jin Qian, An-I Andy Wang – Florida St. University Peter Reiher – University of.

Presentation transcript:

1 PARAID: A Gear-Shifting Power-Aware RAID Charles Weddle, Mathew Oldham, Jin Qian, An-I Andy Wang – Florida St. University Peter Reiher – University of California, Los Angeles Geoff Kuenning – Harvey Mudd College

2 Motivation
- Energy costs are rising
  - An increasing concern for servers
  - No longer limited to laptops
- Energy consumption of disk drives
  - 24% of the power usage in web servers
  - 27% of electricity cost for data centers
- More energy → more heat → more cooling → lower computational density → more space → more costs
- Is it possible to reduce energy consumption without degrading performance while maintaining reliability?

3 Challenges
- Energy
  - Not enough opportunities to spin down RAIDs
- Performance
  - Essential for peak loads
- Reliability
  - Server-class drives are not designed for frequent power switching

4 Existing Work
- Most trade performance for energy savings directly
  - e.g. vary speed of disks
- Most are simulated results

5 Observations
- RAID is configured for peak performance
  - RAID keeps all drives spinning for light loads
- Unused storage capacity
  - Over-provision of storage capacity
  - Unused storage can be traded for energy savings
- Fluctuating load
  - Cyclic fluctuation of loads
  - Infrequent on-off power transitions can be effective

6 Performance vs. Energy Optimizations
- Performance benefits
  - Realized under heavy loads
- Energy benefits
  - Realized instantaneously

7 Power-Aware RAID
- Skewed striping for energy savings
- Preserving peak performance
- Maintaining reliability
- Evaluation
- Conclusion
- Questions

8 Skewed Striping for Energy Saving
- Use over-provisioned spare storage
  - Organized into hierarchical overlapping subsets
[Figure: a RAID of disks 1 to 5]

9 Skewed Striping for Energy Saving
- Each set analogous to gears in automobiles
[Figure: the RAID with its gears marked]

10 Skewed Striping for Energy Saving
- Soft states can be reclaimed for space
- Persist across reboots
[Figure: the RAID with its gears and soft states marked]

11 Skewed Striping for Energy Saving
- Operate in gear 1
  - Disks 4 and 5 are powered off
[Figure: the RAID with its gears and soft states marked]
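
The gear idea on slides 8 to 11 can be sketched as nested disk sets. This is a minimal illustration, not the PARAID implementation: the function names are hypothetical, and the gear sizes (a 3-disk gear 1 and a 5-disk gear 2) are borrowed from the Cello99 configuration listed later in the talk.

```python
def build_gears(gear_sizes):
    """Return one disk set per gear; gear k uses disks 1..gear_sizes[k-1].

    Because every gear starts at disk 1, the sets overlap hierarchically:
    each lower gear is a subset of every higher gear.
    """
    if list(gear_sizes) != sorted(set(gear_sizes)):
        raise ValueError("gear sizes must be strictly increasing")
    return [set(range(1, n + 1)) for n in gear_sizes]


def powered_off_disks(gears, active_gear):
    """Disks that can be spun down while running in active_gear (1-based)."""
    all_disks = gears[-1]  # the highest gear spans every disk
    return sorted(all_disks - gears[active_gear - 1])


# Assumed configuration: 3-disk gear 1, 5-disk gear 2.
gears = build_gears([3, 5])
print(powered_off_disks(gears, 1))  # [4, 5]
print(powered_off_disks(gears, 2))  # []
```

With this configuration, running in gear 1 leaves disks 4 and 5 idle, which matches the example on slide 11.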

12 Skewed Striping for Energy Saving
- Approximate the workload
  - Gear shift into most appropriate gear
  - Minimize the opportunity lost to save power
[Figure: energy (powered-on disks) versus workload (disk parallelism), comparing conventional RAID and PARAID against the workload]

13 Skewed Striping for Energy Saving
- Adapt to cyclic fluctuating workload
  - Gear shift when gear utilization threshold is met
[Figure: load over time, with the utilization threshold and a gear shift marked]

14 Preserving Peak Performance
- Operate in the highest gear
  - When the system demands peak performance
  - Uses the same disk layout
- Maximize parallelism within each gear
  - Load is balanced
  - Uniform striping pattern
- Delay block replication until gear shifts
  - Capture block writes

15 Maintaining Reliability
- Reuse existing RAID levels (RAID-5)
  - Also used in various gears
- Drives have a limited number of power cycles
  - Ration number of power cycles
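
One way to picture "ration number of power cycles" is a daily budget derived from a drive's rated cycle count and its intended service life. The sketch below is an assumption-laden illustration: the class name, the 20,000-cycle rating, and the 5-year lifetime are hypothetical values chosen for the example, not figures from the slides.

```python
class PowerCycleBudget:
    """Daily ration of power cycles for one drive (illustrative only)."""

    def __init__(self, rated_cycles, lifetime_days):
        # Spread the rated cycles evenly over the intended lifetime.
        self.per_day = rated_cycles / lifetime_days
        self.used_today = 0

    def may_cycle(self):
        """True if one more power cycle still fits in today's ration."""
        return self.used_today + 1 <= self.per_day

    def record_cycle(self):
        if not self.may_cycle():
            raise RuntimeError("daily power-cycle ration exhausted")
        self.used_today += 1

    def new_day(self):
        self.used_today = 0


# Hypothetical drive: 20,000 rated cycles over a 5-year service life.
budget = PowerCycleBudget(rated_cycles=20_000, lifetime_days=5 * 365)
allowed = 0
while budget.may_cycle():
    budget.record_cycle()
    allowed += 1
print(allowed)  # 10 with these assumed numbers
```

A gear manager could consult such a budget before spinning a disk down, and simply stay in the current gear once the ration is used up.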

16 Maintaining Reliability
- Busy disks stay powered on, idle disks stay powered off
- Outside disks are role exchanged with middle disks
[Figure: disks 1 to 6 across gears 1 to 3, marking busy disks, power-cycled disks, idle disks, and the role exchange]

17 Logical Component Design
[Figure: component diagram spanning user space and the Linux kernel, showing the admin tool, file system, RAID, PARAID block mapping, Soft RAID, reliability manager, load monitor, gear manager, and disk device driver]

18 Data Layout
                 Disk 1    Disk 2    Disk 3      Disk 4          Disk 5
Gear 1 RAID-5    (1-4)     8         12          ((1-4),8,12)
                 16        20        (16,20,_)   _
Gear 2 RAID-5    1         2         3           4               (1-4)
                 5         6         7           (5-8)           8
                 9         10        (9-12)      11              12
                 13        (13-16)   14          15              16
                 (17-20)   17        18          19              20
- Resembles the data flow of RAID 1+0
- Parity for 5 disks does not work for 4 disks
  - For example, replicated block 12 on disk 3

19 Data Layout
                 Disk 1    Disk 2    Disk 3      Disk 4          Disk 5
Gear 1 RAID-5    (1-4)     8         12          ((1-4),8,12)
                 16        20        (16,20,_)   _
Gear 2 RAID-5    1         2         3           4               (1-4)
                 5         6         7           (5-8)           8
                 9         10        (9-12)      11              12
                 13        (13-16)   14          15              16
                 (17-20)   17        18          19              20
- Cascading parity updates
  - For example, updating block 8 on disk 5
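
The cascading update named on this slide can be illustrated with plain XOR parity. The sketch assumes byte-wise XOR as in ordinary RAID-5; the block contents and function names are made up for the example. It shows that changing block 8 invalidates both the gear 2 parity (5-8) and the gear 1 parity ((1-4),8,12), because gear 1 protects its replicas of disk 5's blocks with its own parity.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks, as in RAID-5 parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical 4-byte contents for blocks 1..12, keyed by block number.
data = {n: bytes([n] * 4) for n in range(1, 13)}


def gear2_parity(first):
    """Parity over one 5-disk stripe: four data blocks starting at `first`."""
    return xor_parity([data[n] for n in range(first, first + 4)])


def gear1_parity():
    """Parity over gear 1's first row: the replicas of (1-4), 8 and 12."""
    return xor_parity([gear2_parity(1), data[8], data[12]])


before = (gear2_parity(5), gear1_parity())
data[8] = b"\xff" * 4  # update block 8, which lives on disk 5 in gear 2
after = (gear2_parity(5), gear1_parity())

print(before[0] != after[0])  # True: gear 2 parity (5-8) must be rewritten
print(before[1] != after[1])  # True: gear 1 parity ((1-4),8,12) must be rewritten
```

The same reasoning explains why the 5-disk parity cannot simply be reused in the 4-disk gear: the gear 1 stripe covers a different set of blocks, so it needs parity of its own.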

20 Update Propagation
- Up-shift propagation (e.g. shifting from 3 to 5 disks)
  - Full synchronization
  - On-demand synchronization
  - Need to respect block dependency
- Downshift propagation
  - Full synchronization
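
A rough sketch of up-shift propagation follows: writes made in the low gear are captured, then replayed to the re-awakened disks either all at once (full synchronization) or block by block as they are touched (on-demand synchronization). The `UpdateLog` class and its method names are hypothetical, and the block-dependency ordering mentioned on the slide is not modeled here.

```python
class UpdateLog:
    """Remembers blocks written in a low gear that higher-gear disks lack."""

    def __init__(self):
        self.dirty = {}  # block number -> latest payload written in low gear

    def capture_write(self, block, payload):
        # Only the newest version of a block needs to be propagated.
        self.dirty[block] = payload

    def sync_on_demand(self, block, write_to_high_gear):
        """Propagate one block when it is accessed; return True if it was stale."""
        if block in self.dirty:
            write_to_high_gear(block, self.dirty.pop(block))
            return True
        return False

    def full_sync(self, write_to_high_gear):
        """Propagate every outstanding block; return how many were written."""
        count = 0
        for block in sorted(self.dirty):
            write_to_high_gear(block, self.dirty[block])
            count += 1
        self.dirty.clear()
        return count


high_gear = {}  # stands in for the disks that were powered off
log = UpdateLog()
log.capture_write(8, b"old")
log.capture_write(8, b"new")
log.capture_write(12, b"x")
log.sync_on_demand(8, high_gear.__setitem__)
remaining = log.full_sync(high_gear.__setitem__)
print(high_gear, remaining)
```

Keeping only the latest payload per block is a design choice of this sketch; it keeps the propagation cost proportional to the number of distinct blocks written rather than the number of writes.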

21 Asymmetric Gear-Shifting Policies
- Up-shift (aggressive)
  - Moving utilization average + moving standard deviation > utilization threshold
- Downshift (conservative)
  - Modified utilization moving average + moving standard deviation < utilization threshold
  - Moving average modified to account for fewer drives and extra parity updates
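
The two inequalities above can be sketched as a small decision function. This is an illustration under stated assumptions: the class name, window size, and threshold are hypothetical, and the "modified" average is approximated by scaling the current average by the ratio of high-gear to low-gear disks. The slides also mention extra parity updates, which this sketch does not account for.

```python
from collections import deque
from statistics import mean, pstdev


class GearShiftPolicy:
    """Asymmetric up-shift/downshift decision over a moving window."""

    def __init__(self, threshold, window, low_gear_disks, high_gear_disks):
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # recent utilization samples, 0..1
        # Assumed correction: the same load spread over fewer disks.
        self.scale = high_gear_disks / low_gear_disks

    def observe(self, utilization):
        self.samples.append(utilization)

    def decide(self, in_low_gear):
        if len(self.samples) < 2:
            return "stay"
        avg, dev = mean(self.samples), pstdev(self.samples)
        if in_low_gear:
            # Aggressive: shift up as soon as average plus deviation crosses.
            return "upshift" if avg + dev > self.threshold else "stay"
        # Conservative: project the load onto the smaller gear first.
        projected = avg * self.scale
        return "downshift" if projected + dev < self.threshold else "stay"


policy = GearShiftPolicy(threshold=0.8, window=4, low_gear_disks=3, high_gear_disks=5)
for sample in (0.2, 0.2, 0.2, 0.2):
    policy.observe(sample)
print(policy.decide(in_low_gear=False))  # downshift
```

Adding the standard deviation makes a bursty load look heavier than its average, which pushes the policy toward the higher gear in both directions.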

22 Implementation
- Prototyped in Linux
  - Open source, software RAID
- Implemented block I/O handler, monitor, disk manager
- Implemented user admin tool to configure device
- Updated Raid Tools to recognize PARAID level

23 Evaluation
- Challenges
  - Prototyping PARAID
  - Commercial machines
  - Conceptual barriers
  - Benchmarks designed to measure peak performance
  - Trace replay
  - Time consuming

24 Evaluation
- Measurement framework
[Figure: a client and a server joined by a crossover cable; a multimeter, attached by USB cable, takes readings through power measurement probes on the 12 V and 5 V power lines between the power supply and the disks, which connect over a SCSI cable. Labels: RAID, BOOT. Machine specifications shown: Xeon 2.8 GHz, 512 MB RAM, 36.7 GB 15k RPM SCSI; P4 2.8 GHz, 1 GB RAM, 160 GB 7200 RPM SATA]

25 Evaluation
- Three different workloads using two different RAID settings
  - Web trace - RAID level 0 (2-disk gear 1, 5-disk gear 2)
    - Mostly read activity
  - Cello99 - RAID level 5 (3-disk gear 1, 5-disk gear 2)
    - I/O-intensive workload with writes
  - PostMark - RAID level 5
    - Measure peak performance and gear shifting overhead
- Speed up trace playback
  - To match hardware
  - Explore range of speed up factors and power savings

26 Web Trace
- UCLA CS Dept Web Servers (8/11/2006 – 8/14/2006)
- File system: ~32 GB (~500k files)
- Trace replay: ~95k requests with ~4 GB data (~260 MB unique)

27 Web Trace Power Savings
[Figure: power plots at three replay speeds: 64x (60 requests/sec), 128x (120 requests/sec), 256x (240 requests/sec)]
- Energy savings
  - 64x - 34%
  - 128x - 28%
  - 256x - 10%

28 Web Trace Latency
[Figure: latency plots at 256x, 128x, and 64x]
- Overhead
  - 256x - within 2.7%
  - 64x - 240% (80 ms vs. 33 ms)

29 Web Trace Bandwidth
[Figure: bandwidth plots at 256x, 128x, and 64x]
- Overhead
  - 256x - within 1.3% in high gear

30 Cello99 Trace
- Cello99 Workload
  - HP Storage Research Labs
  - 50 hours beginning on 9/12/
  - 1.5 million requests (12 GB) to 440 MB of unique blocks
  - I/O-intensive with 42% writes

31 Cello99 Power Savings
[Figure: power plots at three replay speeds: 32x (270 requests/sec), 64x (550 requests/sec), 128x (1000 requests/sec)]
- Energy savings
  - 32x - 13%
  - 64x - 8.2%
  - 128x - 3.5%

32 Cello99 Completion Time
[Figure: completion time plots at 128x, 64x, and 32x]
- Overhead
  - 32x - 1.8 ms, 26% slower due to time spent in low gear

33 Cello99 Bandwidth
[Figure: bandwidth plots at 64x, 32x, and 128x]
- Overhead
  - < 1% degradation during peak hours

34 PostMark Benchmark
- Popular synthetic benchmark
  - Generates ISP-style workloads
  - Stresses peak read/write performance of storage device

35 PostMark Performance

36 PostMark Power Savings

37 Related Work
- Pergamum
- EERAID
- RIMAC
- Hibernator
- MAID
- PDC
- BlueFS

38 Ongoing Work
- Try more workloads
- Optimize PARAID gear configuration
- Explore asynchronous update propagation
- Speed up recovery
- Live testing

39 Lessons Learned
- Third version of design, early design too complicated
- Data alignment problems
- Difficult to measure system under normal load
- Hard to predict workload transformations due to complex system optimizations
- Challenging to match trace environments

40 Conclusion
- PARAID reuses standard RAID levels without special hardware while decreasing their energy use by up to 34%.
  - Optimized version can save even more energy
- Empirical evaluation important

41 PARAID Recovery
- 2.7 times slower than conventional RAID
  - For example, 2-gear PARAID device
  - First, the soft state must recover
  - Second, data must be propagated
  - Third, conventional RAID must recover
- Recovery not as bad for read-intensive workloads

42 PARAID Gear-Shifting
Web Trace Gear-Shifting Stats
                                        256x     128x     64x
Number of gear switches
% time spent in low gear                52%      88%      98%
% extra I/Os for update propagations    0.63%    0.37%    0.21%

Cello99 Gear-Shifting Stats
                                        128x     64x      32x
Number of gear switches
% time spent in low gear                47%      74%      88%
% extra I/Os for update propagations    8.0%     15%      21%