The HP AutoRAID Hierarchical Storage System John Wilkes, Richard Golding, Carl Staelin, and Tim Sullivan Hewlett-Packard Laboratories Presented by Sri.

Presentation transcript:

The HP AutoRAID Hierarchical Storage System John Wilkes, Richard Golding, Carl Staelin, and Tim Sullivan Hewlett-Packard Laboratories Presented by Sri Ramkrishna

HP AutoRAID Hierarchical Storage System Overview
A two-level storage hierarchy implemented inside a single array controller.
Consists of:
–Mirrored storage for fast access
–RAID5 storage for slower, cheaper capacity
Seamlessly moves data from one level to the other.

Overview: What is a RAID system?
–RAID stands for “Redundant Array of Independent Disks” (originally “Inexpensive Disks”).
Usually comes as RAID3 or RAID5:
–RAID3 is some number of data disks plus one disk dedicated to parity.
–RAID5 is some number of disks where one block from each disk forms a stripe, and each stripe includes a parity block. The parity block rotates among the disks instead of living on a dedicated disk.
–Both require an array controller.
Mirroring (RAID1): two copies of the data on two disks. Generally faster.
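The parity idea behind RAID3 and RAID5 can be sketched in a few lines. This is a minimal illustration, not the array's firmware: it assumes parity is the byte-wise XOR of the data blocks in a stripe, and the helper names (`xor_blocks`, `parity_disk_for_stripe`) and the rotation rule are hypothetical choices made for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_disk_for_stripe(stripe_index, num_disks):
    """Hypothetical rotation rule: parity moves one disk to the left per stripe.
    Real arrays use several different layouts."""
    return (num_disks - 1 - stripe_index) % num_disks

# One stripe with three data blocks and one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[1]; rebuild it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, any single missing block in a stripe can be rebuilt from the remaining blocks plus parity, which is why one failed disk is survivable.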

Disk Arrays
The problem is that disk arrays are hard to use.
–Configuring one requires understanding the disk load.
–If you get it wrong, it is expensive to fix: system performance is degraded, and the data has to be moved off to other storage.
–Adding new capacity or new disks requires moving the data off and then restoring it.

Hierarchical Storage: the Solution
Combine the performance of mirrored disks with the cost-capacity benefits of RAID5.
Constraint:
–The active data must change slowly.
Can be implemented three ways:
–Manually: error prone.
–In the file system: not particularly portable.
–In a smart array controller: how HP’s solution is done.

Important Features
Mapping to allow transparent migration of disk blocks.
Mirroring and RAID5.
Adaptation to changes in the amount of data stored:
–Starts empty; data is stored in mirrored space until it is full, then gets migrated to RAID5.
–Moves data between mirrored and RAID5 space at a fine granularity of 64 KB units.
Hot-pluggable disks, fans, power supplies, and controllers.

Features continued
Online storage capacity expansion:
–Can add up to 12 disks transparently.
–New disks are easily added.
Active hot spare.
Simple administration.
Log-structured RAID5 writes.

AutoRAID Details
Similar to a regular RAID array:
–A set of disks, an intelligent controller, and caches for staging data.
Physical layout consists of:
–Physical Extents (PEXes): 1 MB in size, each made up of 128 KB segments.
–Segments are part of either a mirrored set or RAID5.
–Physical Extent Groups (PEGs): a stripe of PEXes. PEXes are allocated to PEGs so that the load is distributed across all disks. A PEG spans at least three disks and is either assigned to mirrored storage, assigned to RAID5 storage, or unassigned.

Logical View
AutoRAID divides the logical storage it presents to host machines into 64 KB pieces called Relocation Blocks (RBs).
–When a new LUN (Logical Unit Number) is created or enlarged, its address space is mapped onto a set of RBs.
–A LUN is the logical address a host uses for an individual drive in a disk array.
–Allocation occurs on write: an RB takes up no physical space until the host first writes to it.
Each PEG can hold a number of RBs; how many is a function of the size of the PEG.
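The sizes quoted on the last two slides (1 MB PEXes, 128 KB segments, 64 KB RBs) fix how many RBs fit in a PEX. The sketch below works through that arithmetic. The function `peg_capacity_in_rbs` is a hypothetical helper, and its capacity rules are simplifying assumptions that are not stated in the slides: a mirrored PEG is assumed to lose half its space to the second copy, and a RAID5 PEG is assumed to lose one PEX's worth of space to parity.

```python
KB = 1024
PEX_SIZE = 1024 * KB      # physical extent: 1 MB
SEGMENT_SIZE = 128 * KB   # segment: 128 KB
RB_SIZE = 64 * KB         # relocation block: 64 KB

SEGMENTS_PER_PEX = PEX_SIZE // SEGMENT_SIZE   # 8 segments in each PEX
RBS_PER_SEGMENT = SEGMENT_SIZE // RB_SIZE     # 2 RBs in each segment
RBS_PER_PEX = PEX_SIZE // RB_SIZE             # 16 RBs in each PEX

def peg_capacity_in_rbs(num_pexes, storage_class):
    """Rough count of RBs a PEG can hold (simplified, assumed model)."""
    if num_pexes < 3:
        raise ValueError("a PEG spans at least three disks")
    if storage_class == "mirrored":
        # Assumption: every RB is stored twice.
        return num_pexes * RBS_PER_PEX // 2
    if storage_class == "raid5":
        # Assumption: one PEX's worth of space per stripe holds parity.
        return (num_pexes - 1) * RBS_PER_PEX
    raise ValueError("storage_class must be 'mirrored' or 'raid5'")
```

Under these assumptions a four-PEX PEG holds 32 RBs when mirrored and 48 RBs as RAID5, which is the cost-capacity trade-off the hierarchy exploits.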

How it works
The host initiates a read or write operation to the disk array.
–Reads can be served from the array’s cache, which is fast.
–Writes are more complicated.

AutoRAID Writes
The controller has a non-volatile write cache (NVRAM).
–The host loads its request into the NVRAM; once that completes, the host believes its request is done.
–Some policies wait for additional writes so that the writes can be batched together.
The NVRAM is then flushed, and a background write is initiated.
–If the data already exists in mirrored space, it is written there.
–Otherwise, the data is promoted to mirrored space, since it is now active, and then written.

Promotions
Migration code is called to move data from RAID5 space to mirrored space.
If no space is left in mirrored space, some data is demoted to RAID5 space.
There are some tricky catch-22 situations that need to be handled.
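The write and promotion rules above can be sketched as a toy model. Everything here is illustrative: the class `ToyAutoRAID`, its fields, and the idea of measuring mirrored space as a simple count of RBs are assumptions made for the example. The demotion victim is taken to be the RB written longest ago, and the model ignores the NVRAM stage, parity, and the tricky corner cases.

```python
from collections import OrderedDict

class ToyAutoRAID:
    """Toy two-level store: a small mirrored space in front of RAID5 space."""

    def __init__(self, mirrored_capacity):
        self.mirrored_capacity = mirrored_capacity  # max RBs in mirrored space
        self.mirrored = OrderedDict()  # rb_id -> data, oldest write first
        self.raid5 = {}                # rb_id -> data

    def write(self, rb_id, data):
        if rb_id in self.mirrored:
            # Data already lives in mirrored space: update it in place.
            self.mirrored[rb_id] = data
            self.mirrored.move_to_end(rb_id)
            return
        # Otherwise promote: the RB is now active, so it leaves RAID5 space.
        self.raid5.pop(rb_id, None)
        if len(self.mirrored) >= self.mirrored_capacity:
            # No room left: demote the RB that was written longest ago.
            victim_id, victim_data = self.mirrored.popitem(last=False)
            self.raid5[victim_id] = victim_data
        self.mirrored[rb_id] = data

    def read(self, rb_id):
        if rb_id in self.mirrored:
            return self.mirrored[rb_id]
        return self.raid5[rb_id]

array = ToyAutoRAID(mirrored_capacity=2)
array.write(1, "a")
array.write(2, "b")
array.write(3, "c")   # mirrored space is full, so RB 1 is demoted
array.write(1, "a2")  # RB 1 is promoted again, so RB 2 is demoted
```

After these four writes, RBs 3 and 1 sit in mirrored space and RB 2 sits in RAID5 space, yet every RB is still readable through the same `read` call, which is the transparency the mapping layer provides.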

Reads and Writes
Reads and writes in mirrored space are simple.
–A read picks one of the copies and reads it.
–A write goes to both disks and is complete when both copies have been written.
Reads in RAID5 space are straightforward.
Writes in RAID5 space are more complicated.

RAID5 Writes
RAID5 storage is laid out like a log.
–RBs that move from mirrored space are appended to a RAID5 storage PEG, provided it has free slots.
RB writes can be done in two ways:
–Per-RB writes: each RB generates two disk writes, one for the data and one for the parity.
–Batched writes: wait until all the RBs in a stripe have been written, then do a single parity write. Commonly used in most RAID5 implementations.
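The difference between the two write styles comes down to how many parity writes are issued. The sketch below counts disk writes under a simplified model that is an assumption of this example: each stripe holds a fixed number of data RBs, a per-RB write costs one data write plus one parity write, and a batched stripe costs one write per RB plus a single parity write (a partly filled final stripe still needs its own parity write). Both function names are hypothetical.

```python
def per_rb_disk_writes(num_rbs):
    """Each RB is written on its own: one data write plus one parity write."""
    return 2 * num_rbs

def batched_disk_writes(num_rbs, data_rbs_per_stripe):
    """RBs are grouped into stripes; each stripe needs only one parity write."""
    full_stripes, leftover = divmod(num_rbs, data_rbs_per_stripe)
    stripes = full_stripes + (1 if leftover else 0)
    return num_rbs + stripes
```

With four data RBs per stripe, writing four RBs costs 8 disk writes one at a time but only 5 when batched, so the saving grows with the stripe width.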

Compaction: Holes and Garbage Collection
Demoting and promoting RBs causes holes in mirrored space.
–Holes are added to a free list.
–They can be reused for promotions from RAID5 space.
–RBs can also be moved into holes to empty out a PEG, so that the PEG can be used for RAID5 storage.
The same problem exists in RAID5 space.
–There the cleanup is called garbage collection.
–Holes cannot be filled; they must be cleaned up.
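Filling holes to free up a PEG in mirrored space can be sketched as follows. This is an assumed, simplified policy, not the controller's actual algorithm: a PEG is modelled as a list of slots (an RB id, or `None` for a hole), the least-occupied PEG is chosen as the one to empty, and its RBs are moved into holes in other PEGs that are still in use. The function name `free_one_peg` is hypothetical.

```python
def free_one_peg(pegs):
    """Empty the least-occupied PEG by moving its RBs into holes elsewhere.

    pegs maps a PEG id to a list of slots; a slot is an RB id or None (a hole).
    Returns the id of the PEG that was emptied, or None if no PEG can be freed.
    """
    def used(peg_id):
        return sum(slot is not None for slot in pegs[peg_id])

    occupied = [peg_id for peg_id in pegs if used(peg_id) > 0]
    if not occupied:
        return None
    source = min(occupied, key=used)

    # Holes in other PEGs that are still in use are the possible targets.
    holes = [(peg_id, index)
             for peg_id in occupied if peg_id != source
             for index, slot in enumerate(pegs[peg_id]) if slot is None]
    if len(holes) < used(source):
        return None  # not enough holes elsewhere to absorb this PEG

    targets = iter(holes)
    for index, rb in enumerate(pegs[source]):
        if rb is not None:
            target_peg, target_index = next(targets)
            pegs[target_peg][target_index] = rb
            pegs[source][index] = None
    return source

pegs = {
    "A": [1, None, 2, None],
    "B": [3, None, None, None],
    "C": [4, 5, None, 6],
}
freed = free_one_peg(pegs)
```

In this example PEG "B" holds a single RB, so that RB is moved into the first hole in PEG "A" and "B" becomes completely free for reuse.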

Migrations/Balancing
Migration from mirrored space to RAID5 space:
–RBs are selected by a Least Recently Written (LRW) policy.
–Done in the background.
Balancing:
–When new drives are added, data is migrated to balance the performance across the disks.

Testing Setup
The baseline configuration was a 12-disk system with one controller and 24 MB of controller data cache.
Connected to an HP 9000/K400 system with one processor and 512 MB of main memory.
Compared against a Data General CLARiiON disk array.

Performance Results
AutoRAID vs. RAID array vs. JBOD-LVM (JBOD means Just a Bunch of Disks):
–OLTP results show that AutoRAID outperforms the RAID array and reaches about 3/4 of the performance of JBOD-LVM.
–AutoRAID writes were slower than JBOD-LVM because mirrored writes are slower than JBOD-LVM writes.

Some Notes
Increasing the speed of the disks improves the back-end performance.
–Improving the transfer rate matters more than improving rotational latency.

Summary
HP AutoRAID is very easy to use.
Sysadmins are able to add disks and do various other tasks without having to worry about whether the disk layout is correct.