Deconstructing Storage Arrays
Timothy E. Denehy, John Bent, Florentina I. Popovici, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
University of Wisconsin, Madison

Gray-box Research
- Computer systems becoming more complex
  - Transistors
  - Lines of code
- Each component is becoming more complex
- Interactions between subsystems can affect:
  - Performance
  - Reliability
  - Power
  - Security

Gray-box Research
- Interfaces remain the same
  - Changes can be difficult and impractical
  - Must support multiple platforms or legacy systems
  - Commercial acceptance needed for widespread adoption
- A hardware and software phenomenon
  - IA-32 instruction set, POSIX OS, SCSI storage
- Problem: lack of information

Gray-box Solution
- Treat the target system as a gray box
  - General characteristics are known
- Extract information from an existing interface
  - e.g., determine cache contents
- Exploit information to control system behavior
  - e.g., access cached data first

Gray-box Information Techniques
- Make assumptions about the target system
- Observe system inputs and outputs
  - Statistical methods
  - Draw inferences about internal structure
- Microbenchmarks and probes
  - Parameterize system components
  - Observe the system under controlled input

Gray-box Applications
Gray-box techniques have been used to identify:
- Memory hierarchy parameters [Saavedra and Smith]
- Processor cycle time [Staelin and McVoy]
- Low-level disk characteristics [Worthington et al.]
- Buffer cache replacement algorithms [Burnett et al.]
- File system data structures [Sivathanu et al.]
- Storage array characteristics: Shear (this work)

Shear
- A software tool that automatically determines the important properties of a storage array
- Enables file system performance tuning with knowledge of storage array characteristics
- Acts as a management tool to help configure, monitor, and maintain storage arrays

Outline
- Introduction
- Shear
  - Background
  - Algorithm
- Case Studies
  - Performance: stripe-aligned writes
  - Management: detecting misconfiguration and failure
- Conclusion

Shear Goals
- Determine storage array characteristics (behind the SCSI interface):
  - Number of disks
  - Chunk size
  - Layout and redundancy scheme
[Animation: an array behind a SCSI interface is drawn as RAID-0, then RAID-1, then RAID-5 with parity blocks P]

Shear Motivation
- Performance
  - Tune file systems to array characteristics
- Management
  - Verify configuration
  - Detect failure

Shear Techniques
- Microbenchmarks and probes
  - Controlled, random-access read and write patterns
  - Measure response time of access patterns
  - Measure steady-state performance
- Statistical clustering
  - Automatically classify fast and slow regimes
  - Identify patterns that utilize only a single disk
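
To make the clustering step concrete, here is a minimal sketch of one way to split one-dimensional probe timings into fast and slow regimes (a simple two-mean iteration). The slides do not specify Shear's exact clustering method, so treat this as illustrative:

```python
def classify_fast_slow(times, iters=20):
    """Split 1-D timing samples into 'fast' and 'slow' clusters
    using a simple two-mean (k=2) iteration."""
    lo, hi = min(times), max(times)
    fast, slow = [], []
    for _ in range(iters):
        mid = (lo + hi) / 2
        fast = [t for t in times if t <= mid]
        slow = [t for t in times if t > mid]
        if not fast or not slow:
            break
        lo = sum(fast) / len(fast)   # new 'fast' centroid
        hi = sum(slow) / len(slow)   # new 'slow' centroid
    return fast, slow

# Example: probe times in milliseconds; two regimes emerge clearly.
fast, slow = classify_fast_slow([4.1, 4.3, 8.9, 4.2, 9.2, 9.0, 4.0])
print("fast:", fast, "slow:", slow)
```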

Shear Assumptions
- Storage array
  - Layout follows a repeatable pattern
  - Composed of homogeneous disks
- System
  - Able to bypass the file system and buffer cache
  - Little traffic from other processes

Outline
- Introduction
- Shear
  - Background
  - Algorithm
- Case Studies
  - Performance: stripe-aligned writes
  - Management: detecting misconfiguration and failure
- Conclusion

Shear Algorithm
1. Pattern size
2. Chunk size
3. Layout of chunks to disks
4. Level of redundancy

Determining the Pattern Size
- Find the size of the layout's repeating pattern
  - Not always the stripe size
- Choose a hypothetical pattern size
- Perform random reads at multiples of that distance
- Repeat for a range of pattern sizes
- Cluster results and identify the actual pattern size
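
A sketch of this probe under stated assumptions: `/dev/sdb` is a hypothetical array device, a thread pool stands in for parallel outstanding I/Os, and cache-bypassing details (O_DIRECT with aligned buffers) are omitted for brevity:

```python
import os, random, time
from concurrent.futures import ThreadPoolExecutor

DEV = "/dev/sdb"           # hypothetical array device (assumption)
BLOCK = 512                # probe read size in bytes
REGION = 256 * 1024 ** 2   # probe within the first 256 MB

def probe_pattern_size(pattern, reads=128, depth=16):
    """Time a batch of parallel random reads whose offsets are all
    multiples of the hypothesized pattern size."""
    fd = os.open(DEV, os.O_RDONLY)
    offsets = [random.randrange(REGION // pattern) * pattern
               for _ in range(reads)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=depth) as pool:
        list(pool.map(lambda off: os.pread(fd, BLOCK, off), offsets))
    os.close(fd)
    return time.perf_counter() - start

for kb in range(2, 65, 2):                    # hypotheses: 2 KB .. 64 KB
    print(f"{kb:3d} KB: {probe_pattern_size(kb * 1024):.3f} s")
```

When the hypothesized size is a multiple of the true pattern size, every offset falls at the same position within the pattern and thus on the same disk, so that batch clusters as slow; other batches spread across disks and cluster as fast.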

Pattern Size Example (RAID-0, 4 disks, 8 KB chunks)
[Animation: hypothetical pattern sizes from 2 KB to 32 KB are tested in turn; random reads at multiples of each candidate spread across multiple disks, but at the true 32 KB pattern size every read lands on a single disk. Clustering the timings identifies the actual pattern size of 32 KB.]

Shear Algorithm
1. Pattern size
2. Chunk size
3. Layout of chunks to disks
4. Level of redundancy

Determining the Chunk Size
- Chunk size: the amount of data contiguously allocated to one disk
- Find the boundaries between disks
- Choose a hypothetical boundary offset
- Perform random reads on both sides of that offset
- Repeat for all offsets in the pattern
- Cluster results and identify the actual chunk size
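
A sketch of the boundary probe, with the same hypothetical device and caveats as above; `PATTERN` is the pattern size recovered in the previous step, and the two straddling reads are issued concurrently so that a shared disk shows up as queueing delay:

```python
import os, random, time
from concurrent.futures import ThreadPoolExecutor

DEV = "/dev/sdb"            # hypothetical array device (assumption)
PATTERN = 32 * 1024         # pattern size found in the previous step
SECTOR = 512

def probe_boundary(offset, reads=100):
    """Time paired reads straddling a candidate boundary offset.
    If 'offset' is a disk boundary, the two reads hit different
    disks and overlap; otherwise both queue on one disk."""
    fd = os.open(DEV, os.O_RDONLY)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(reads):
            base = random.randrange(1, 1024) * PATTERN + offset
            f1 = pool.submit(os.pread, fd, SECTOR, base - SECTOR)
            f2 = pool.submit(os.pread, fd, SECTOR, base)
            f1.result(); f2.result()
    os.close(fd)
    return time.perf_counter() - start

for kb in range(0, PATTERN // 1024, 2):       # candidate offsets
    print(f"{kb:3d} KB: {probe_boundary(kb * 1024):.3f} s")
```

Fast offsets repeat at the chunk-size interval, so clustering the timings yields the chunk size.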

Chunk Size Example (RAID-0, 4 disks, 8 KB chunks)
[Animation: candidate boundary offsets from 0 KB to 16 KB are tested in turn; paired reads straddling a true disk boundary hit two disks in parallel, while pairs inside a chunk queue on one disk. Clustering the timings identifies the actual chunk size of 8 KB.]

Shear Algorithm
1. Pattern size
2. Chunk size
3. Layout of chunks to disks
4. Level of redundancy

Determining the Read Layout
- Find the mapping of chunks to disks
- Choose a pair of chunks in the pattern
- Perform random reads to both chunks
- Repeat for all pairs of chunks
- Cluster results and identify chunks on the same disk
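
A sketch of the pairwise probe, again against the hypothetical `/dev/sdb`, with `PATTERN` and `CHUNK` taken from the earlier steps:

```python
import os, random, time
from concurrent.futures import ThreadPoolExecutor

DEV = "/dev/sdb"            # hypothetical array device (assumption)
PATTERN = 64 * 1024         # pattern size from step 1
CHUNK = 8 * 1024            # chunk size from step 2
NCHUNKS = PATTERN // CHUNK  # chunks per pattern

def probe_pair(c1, c2, reads=100):
    """Issue concurrent random reads to chunks c1 and c2 of the
    repeating pattern.  Pairs that share a disk queue behind each
    other (slow); pairs on different disks overlap (fast)."""
    fd = os.open(DEV, os.O_RDONLY)
    def read_chunk(c):
        off = random.randrange(1024) * PATTERN + c * CHUNK
        os.pread(fd, 512, off)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(reads):
            f1 = pool.submit(read_chunk, c1)
            f2 = pool.submit(read_chunk, c2)
            f1.result(); f2.result()
    os.close(fd)
    return time.perf_counter() - start

for i in range(NCHUNKS):                      # all pairs {0,0} .. {7,7}
    for j in range(i, NCHUNKS):
        print(f"{{{i}, {j}}}: {probe_pair(i, j):.3f} s")
```

Clustering the slow pairs groups chunks that live on the same disk, which reveals the layout (as in the zig-zag example below).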

Read Layout Example (RAID-0 ZIG-ZAG, 4 disks)
[Animation: chunk pairs { 0, 0 } through { 1, 7 } are tested over the eight chunks (0-7) of the pattern; slow pairs share a disk. Clustering identifies { 0, 7 }, { 1, 6 }, { 2, 5 }, and { 3, 4 } as pairs of chunks on the same disk in the zig-zag layout.]

Shear Algorithm
1. Pattern size
2. Chunk size
3. Layout of chunks to disks
4. Level of redundancy

Determining Level of Redundancy
- The ratio of read to write bandwidth reveals the type of redundancy in the array
- Expected read/write ratios:
  - RAID-0: 1 (no redundancy)
  - RAID-1: 2 (mirroring)
  - RAID-4: varies (examine write layout)
  - RAID-5: 4 (parity)
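
The ratios follow from write amplification: a mirrored write goes to both copies (2 I/Os), and a RAID-5 small write must read the old data and parity and write the new data and parity (4 I/Os). A sketch of how the measured ratio might be classified, with illustrative tolerances:

```python
def classify_redundancy(read_bw, write_bw, tol=0.25):
    """Map the measured read/write bandwidth ratio to a redundancy
    scheme; 'tol' is a relative tolerance on the expected ratio."""
    ratio = read_bw / write_bw
    expectations = (("RAID-0", 1.0),   # writes cost 1 I/O
                    ("RAID-1", 2.0),   # writes go to both mirrors
                    ("RAID-5", 4.0))   # read data+parity, write data+parity
    for scheme, expected in expectations:
        if abs(ratio - expected) <= tol * expected:
            return scheme
    return "unknown (RAID-4 varies: examine the write layout)"

print(classify_redundancy(read_bw=200.0, write_bw=52.0))   # RAID-5
```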

Shear Experience
Shear has been applied to:
- Linux software RAID
  - Uncovered poor RAID-5 parity updates
- Adaptec hardware RAID controller
  - Implements the RAID-5 left-asymmetric layout
- Redundancy schemes covered: RAID-0, RAID-1, Chained Declustering, RAID-4, RAID-5, P+Q

Outline
- Introduction
- Shear
  - Background
  - Algorithm
- Case Studies
  - Performance: stripe-aligned writes
  - Management: detecting misconfiguration and failure
- Conclusion

RAID-5 Performance
- Small writes on RAID-5 are problematic
  - Require two reads, a parity calculation, and two writes
- Writing in full stripes is more efficient
[Figure: RAID-5 array with parity blocks P distributed across the disks]
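
The "two reads, two writes" cost comes from the standard XOR parity update: the new parity can be computed from the old data, old parity, and new data alone, without reading the rest of the stripe. A one-byte illustration:

```python
old_data   = 0b1100   # current contents of the target block
old_parity = 0b0110   # current parity for the stripe
new_data   = 0b1010   # block being written

# Read old data and old parity (2 reads), recompute parity, then
# write new data and new parity (2 writes) -- no full-stripe read.
new_parity = old_parity ^ old_data ^ new_data
print(bin(new_parity))   # 0b0: parity of the updated stripe
```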

Stripe-aligned Writes
- Overcome the RAID-5 small-write problem
- Modified Linux disk scheduler
  - Groups writes into full stripes
  - Aligns writes along stripe boundaries
  - Approximately 20 lines of code
- Experiment
  - Hardware RAID-5, 4 disks, 16 KB chunks
  - Create 100 files of varying sizes
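
A minimal sketch of the grouping idea (hypothetical code; the actual change was roughly 20 lines inside the Linux disk scheduler): carve each queued write into a stripe-aligned middle, issued as full-stripe writes, plus ragged edges that fall back to read-modify-write:

```python
STRIPE = 3 * 16 * 1024   # data per stripe: 3 data disks x 16 KB chunks

def stripe_align(writes):
    """Split (offset, length) writes into full, stripe-aligned spans
    (no parity reads needed) and leftover partial spans."""
    full, partial = [], []
    for off, length in writes:
        end = off + length
        first = -(-off // STRIPE) * STRIPE    # round up to a stripe start
        last = (end // STRIPE) * STRIPE       # round down to a stripe end
        if first < last:                      # contains >= 1 full stripe
            full.append((first, last - first))
            if off < first:
                partial.append((off, first - off))
            if last < end:
                partial.append((last, end - last))
        else:                                 # no full stripe inside
            partial.append((off, length))
    return full, partial

# A 100 KB write: one 96 KB full-stripe span plus a 4 KB remainder.
print(stripe_align([(0, 100 * 1024)]))
```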

Stripe-aligned Writes Experiment
- The simple modification has a large impact
[Figure: results of the file-creation experiment]

Detecting Misconfigurations
- Software RAID, 4 disks, 8 KB chunks
- What if one disk is accidentally used twice?
[Figure: Shear fingerprints for correct vs. misconfigured arrays under RAID 5-LS, RAID 5-LA, RAID 5-RS, and RAID 5-RA layouts]
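
A small sketch of how Shear's observed chunk-to-disk mapping could be checked against the intended configuration (hypothetical helper; mappings are chunk index to disk index):

```python
def check_configuration(expected, observed):
    """Flag chunks whose observed disk differs from the intended
    mapping -- e.g., one disk accidentally used twice."""
    return {chunk: (exp, observed.get(chunk))
            for chunk, exp in expected.items()
            if observed.get(chunk) != exp}

expected = {0: 0, 1: 1, 2: 2, 3: 3}   # intended: four distinct disks
observed = {0: 0, 1: 1, 2: 2, 3: 2}   # observed: disk 2 used twice
print(check_configuration(expected, observed))   # {3: (3, 2)}
```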

Detecting Failures
- Software RAID, RAID-5 LS, 10 disks, 8 KB chunks
[Animation: the Shear fingerprint is shown before and after disk 5 fails; the failure visibly changes the observed read layout]

Outline
- Introduction
- Shear
  - Background
  - Algorithm
- Case Studies
  - Performance: stripe-aligned writes
  - Management: detecting misconfiguration and failure
- Conclusion

Conclusion
- Gray-box research
  - Extract and exploit information from existing interfaces
- Shear
  - Extracts information: automatically determines storage array properties
  - Exploits information: file system performance tuning and storage management

Questions?