The HP AutoRAID Hierarchical Storage System
John Wilkes, Richard Golding, Carl Staelin, and Tim Sullivan
Hewlett-Packard Laboratories

File System Review

UNIX File System (1974)
- provides an addressable structure to store and retrieve files from disk
- simple and elegant, but slow (about 2% of raw disk bandwidth)

Berkeley Fast File System (1984)
- modified the block size, allowing bandwidth utilization to reach up to 47%
- created cylinder groups that spread metadata across the disk to reduce seek times
- considered hardware specifics during file system parameterization

Sprite Log-structured File System (1991)
- relies on increasingly large file caches to handle most reads
- buffers many writes in memory, then copies the buffer to disk in a single large sequential write (see the sketch below)
- introduced the concept of extents (large contiguous runs of free blocks)
- requires cleaning (garbage collection) and restructuring of active/inactive data
- improved crash recovery with roll-forward capability
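A minimal sketch of the LFS buffer-then-flush write path described above (the segment size, class name, and disk interface are assumptions for illustration, not taken from Sprite LFS):

```python
class LogStructuredWriter:
    """Toy model of LFS-style writing: buffer many small writes in
    memory, then flush them to disk as one large sequential write."""

    SEGMENT_SIZE = 512 * 1024  # assumed segment size

    def __init__(self, disk):
        self.disk = disk            # assumed interface: disk.append(bytes)
        self.buffer = bytearray()

    def write(self, data: bytes) -> None:
        self.buffer += data
        if len(self.buffer) >= self.SEGMENT_SIZE:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.disk.append(bytes(self.buffer))  # one sequential write
            self.buffer.clear()
```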

What about Hardware Failure?

Redundancy: none of these file systems provides it. Not FS, not FFS, not LFS. The answer is RAID.

RAID
- Redundant arrays of disks go back to the early 1980s and the early days of mainframes.
- "Redundant Array of Inexpensive Disks" (Patterson et al., 1988) targeted smaller PC-class disks, became widely popular, and introduced the concept of partial redundancy.
- Virtualization: the array of disks is viewed as a single virtual disk.
- Requires an array controller (with SCSI connectivity and hardware/software support) that controls the array of disks.

The Many Levels of RAID
- Patterson introduced five levels.
- No standards exist; companies are free to invent their own versions.

RAID 0: Striping
Pros
- good performance on large requests
- 100% storage capacity
Cons
- not fault tolerant
- not considered RAID by many enthusiasts, because nothing is redundant
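To make the striping layout concrete, here is a minimal sketch of how RAID 0 might map a logical block to a physical location (the function name and block numbering are illustrative, not from the paper):

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    RAID 0 stripes consecutive logical blocks round-robin across disks,
    which is why large requests can be served by many disks in parallel.
    """
    disk = logical_block % num_disks
    offset = logical_block // num_disks
    return disk, offset

# Example: with 4 disks, logical blocks 0..7 land on disks 0,1,2,3,0,1,2,3.
if __name__ == "__main__":
    for b in range(8):
        print(b, raid0_locate(b, 4))
```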

RAID 1: Mirroring
Pros
- good performance, and it is fault tolerant
Cons
- 50% storage capacity
- gets expensive to scale

Parity
- Parity is calculated using XOR: the controller takes a bit from each disk; if the number of 1s is even, parity = 0; if it is odd, parity = 1.
- Same protection as mirroring, without all the overhead.
- Increases usable capacity to 1 - 1/n, where n is the number of disks (e.g., 80% with five disks).
- Makes it easy to restore the bits of a single failed drive: for each missing bit, choose the value that makes the parity correct (see the sketch below).
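A minimal sketch of XOR parity and single-disk reconstruction, done at byte level rather than bit level for readability (names are illustrative):

```python
from functools import reduce

def parity(blocks: list[bytes]) -> bytes:
    """XOR corresponding bytes of each data block to form the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(surviving: list[bytes], parity_block: bytes) -> bytes:
    """Recover a single lost block: XOR the parity with all surviving blocks."""
    return parity(surviving + [parity_block])

data = [b"\x0f\xf0", b"\xaa\x55", b"\x12\x34"]
p = parity(data)
# Simulate losing disk 1: the remaining blocks plus parity recover it.
assert reconstruct([data[0], data[2]], p) == data[1]
```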

RAID 3: Combine Striping and Redundancy
Pros
- increased storage capacity (1 - 1/N of raw space)
- high throughput for large files
- partial redundancy using parity
Cons
- parity is computed at the bit level
- poor performance for small I/O
- no parallel reads or writes are possible, because the parity lives on a single disk

RAID 5: Spread Parity Across All Disks
Pros
- block-level striping
- allows hot-swappable disk replacement on failure
- small requests can be performed in parallel
Cons
- a small write requires reading the old data, reading the corresponding old parity value, writing the new data, and writing the new parity value (the "small-write problem"; see the sketch below)
- if the workload contains too many small writes, performance suffers dramatically
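The small-write penalty comes from the read-modify-write parity update. A minimal sketch of the arithmetic (function names are illustrative):

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Compute the new parity for a RAID 5 small write.

    new_parity = old_parity XOR old_data XOR new_data,
    so updating a single block costs two reads (old data, old parity)
    and two writes (new data, new parity): four I/Os where a plain
    write would take one.
    """
    return xor(xor(old_parity, old_data), new_data)
```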

All These Levels: How Do I Choose the Right One?
- No level fits all occasions.
- RAID 1 is fast, but it doesn't scale well (50% storage capacity).
- RAID 5 scales, but it can't handle many small writes.

How can we combine the best of both levels?
- Use RAID 1 for active data and RAID 5 for inactive data.
- Create a mapping that allows migration between the two.
- Assign a hierarchical preference to each level.
- Provide a way to migrate data between the two hierarchies.

Who Manages the Migration?
- Not the system administrator: error prone, and unable to adapt fast enough to a changing environment.
- Not the file system: a good idea, but not a portable solution.
- An array controller could do it, if it were smart enough. It would have to:
  - identify active and inactive data
  - migrate active data to mirrored storage and inactive data to RAID 5 storage
  - provide a virtual disk to the existing file system
  - be easy to configure

HP AutoRAID
- A smart array controller that uses embedded software to manage the storage hierarchy.
- Presents virtual logical units to the file system.
- The file system is unaware of the storage hierarchy, the active/inactive grouping, and the data migration.
- The controller therefore has to provide a mapping from virtual to physical addresses!

Data Layout: Placing the Data on the Disk
- PEX (physical extent): a 1 MB unit of disk space allocation; these are the columns of data.
- PEG (physical extent group): a group of at least three PEXes on different disks, spread across disks to balance data. A PEG can be assigned to the mirrored storage class, assigned to the RAID 5 storage class, or left unassigned.
- Segment: 128 KB of contiguous space, included in a stripe or a mirrored pair.
- RB (relocation block): 64 KB; the unit of migration.
- LUN (logical unit): a host-visible virtual disk.
- Stripe: a row of parity and data segments in a RAID 5 PEG.

[Mapping structure diagram: the OS file system addresses a LUN; LUN/virtual device tables hold pointers to PEGs; each PEG table lists its RBs and its PEXes; per-disk PEX tables (one per disk, each with a per-PEX segment table) map the PEXes onto Disk 1, Disk 2, Disk 3, ...]
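A minimal sketch of how such a lookup chain might be represented, with class names and fields inferred from the diagram rather than taken from the actual AutoRAID implementation:

```python
from dataclasses import dataclass, field

RB_SIZE = 64 * 1024  # relocation block: 64 KB

@dataclass
class Pex:
    """1 MB physical extent on one disk."""
    disk: int
    start_offset: int  # byte offset of this PEX on its disk

@dataclass
class Peg:
    """Physical extent group: PEXes on different disks holding RBs."""
    storage_class: str                                       # "mirrored" or "raid5"
    pexes: list[Pex] = field(default_factory=list)
    rb_slots: dict[int, int] = field(default_factory=dict)   # RB id -> slot in PEG

@dataclass
class Lun:
    """Host-visible virtual disk: maps RBs to the PEGs that hold them."""
    rb_to_peg: dict[int, Peg] = field(default_factory=dict)

def locate(lun: Lun, virtual_addr: int) -> tuple[str, int]:
    """Follow the table chain: virtual address -> RB -> PEG -> slot.

    The PEG's slot, together with its PEX table, would then yield the
    physical disk and byte offset.
    """
    rb_id = virtual_addr // RB_SIZE
    peg = lun.rb_to_peg[rb_id]
    return peg.storage_class, peg.rb_slots[rb_id]
```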

HP AutoRAID: What Can It Do?
- Initially the array starts out empty; data is added to mirrored storage until it is full.
- Some mirrored storage is then immediately reallocated to RAID 5 storage, simply by re-mapping PEXes from mirrored PEGs to RAID 5 PEGs.
- As the workload changes, newly active data are promoted to mirrored storage, and less active data are demoted to RAID 5 storage (see the sketch below).
- All of this is done in the background, with no performance interference.
- Hot-pluggable disks allow a failed component to be removed while the system is running.
- Disks can be added to the array at any time, up to a maximum of 12.
- Controller fail-over support.
- An active hot spare reduces the risk of having two drive failures.
- RAID 5 uses log-structured writes for added performance.
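A minimal sketch of the promote/demote idea using an LRU-style activity heuristic (the policy details here are assumptions for illustration; the real controller applies its own heuristics and does this work in the background):

```python
from collections import OrderedDict

class MigrationManager:
    """Toy model: keep the most recently written RBs mirrored, and
    demote the least recently used RBs to RAID 5 when the mirrored
    storage class runs out of room."""

    def __init__(self, mirrored_capacity_rbs: int):
        self.capacity = mirrored_capacity_rbs
        self.mirrored = OrderedDict()   # RB id -> True, kept in LRU order
        self.raid5 = set()

    def on_write(self, rb_id: int) -> None:
        self.raid5.discard(rb_id)            # promote if it was demoted
        self.mirrored[rb_id] = True
        self.mirrored.move_to_end(rb_id)     # mark as most recently used
        while len(self.mirrored) > self.capacity:
            victim, _ = self.mirrored.popitem(last=False)  # least recent
            self.raid5.add(victim)           # demote to RAID 5
```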

HP AutoRAID Is Very Slick
- We added redundancy, and we can add disks to the array on the fly.
- We pushed disk control from the file system into smart hardware with embedded software.
- As far as the file system is concerned, we have solved all the problems, right? Well, not really: RAID 5 uses log-structured writes, so what about the garbage collection?
- Like layout balancing, garbage collection is done in the background, by identifying periods of idleness.
- Cleaning requires filling the holes left behind when data are promoted to the mirrored storage class.

Compaction: Cleaning and Hole-Plugging
RAID 5 PEG hole-plugging garbage collection (see the sketch below):
- If a PEG is nearly full, RBs from almost-empty PEGs are copied in to fill its holes, minimizing data movement.
- If a PEG is almost empty, its RBs are used to fill holes in the nearly full PEGs.
- If a PEG is almost empty and no other holes are ready to be plugged, its valid RBs are written to the end of the log, and the complete PEG is reclaimed as a unit.
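A minimal sketch of choosing between hole-plugging and log-appending when reclaiming a nearly empty PEG (the thresholds, names, and PEG interface are all assumptions for illustration):

```python
def clean_peg(peg, all_pegs, log, full=0.9, empty=0.1):
    """Toy cleaner for one RAID 5 PEG: plug holes when possible,
    otherwise append live RBs to the end of the log.

    Assumed interface: peg.fullness() -> fraction of slots in use,
    peg.holes() -> number of free slots, peg.fill_hole(rb),
    peg.reclaim(), peg.valid_rbs -> list of live RBs, log.append(rb).
    """
    if peg.fullness() > empty:
        return  # only almost-empty PEGs are emptied and reclaimed

    # Prefer filling holes in nearly full PEGs: minimal data movement.
    targets = [p for p in all_pegs if p is not peg and p.fullness() >= full]
    for rb in list(peg.valid_rbs):
        while targets and targets[0].holes() == 0:
            targets.pop(0)
        if targets:
            targets[0].fill_hole(rb)   # hole-plugging
        else:
            log.append(rb)             # no holes ready: log-structured append
        peg.valid_rbs.remove(rb)

    peg.reclaim()  # the complete PEG is reclaimed as a unit
```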

Performance: OLTP Macrobenchmark Results
- Compared: a conventional RAID array (redundancy), HP AutoRAID (redundancy), and JBOD-LVM (no redundancy, but striped, so geared for speed).
- Results are as expected when plotting transaction rate against the number of disks.
- With 5 drives the working set is too large: the write set doesn't fit entirely in mirrored storage, and the resulting thrashing causes poor performance.

Summary
- HP AutoRAID works well, providing both performance and redundancy.
- It is extremely easy to set up and use.
- It works in a variety of real-life environments.
- It provides outstanding general-purpose storage.

References
- Wilkes, J., Golding, R., Staelin, C., and Sullivan, T. "The HP AutoRAID Hierarchical Storage System." Hewlett-Packard Laboratories.
- Patterson, D. A., et al. "A Case for Redundant Arrays of Inexpensive Disks (RAID)." Department of Electrical Engineering, UC Berkeley.
- Henson, V. "A Brief History of Unix File Systems."
- Rosenblum, M., and Ousterhout, J. K. "The Design and Implementation of a Log-Structured File System." Department of Electrical Engineering, UC Berkeley.
- McKusick, M. K., et al. "A Fast File System for UNIX." Department of Electrical Engineering, UC Berkeley.
- Tanenbaum, A. S. "Modern Operating Systems," 2nd edition. Prentice-Hall of India.