The HP AutoRAID Hierarchical Storage System
John Wilkes, Richard Golding, Carl Staelin, and Tim Sullivan
Hewlett-Packard Laboratories
Presented by Sri Ramkrishna

HP AutoRAID Hierarchical Storage System Overview
A two-level storage hierarchy implemented inside a single array controller.
Consists of:
– Mirrored storage for the fast level
– RAID5 storage for the slower level
Moves data between the two levels transparently.

Overview: What is a RAID system?
RAID stands for "Redundant Array of Inexpensive Disks" (also expanded as "Independent Disks").
Usually comes as RAID3 or RAID5.
– RAID3: a set of disks with one disk dedicated to parity.
– RAID5: data is striped across the disks in blocks, and each stripe carries one parity block; the parity blocks are spread over all the disks instead of living on one.
– Requires an array controller.
– Mirroring (RAID1): two copies of the data on two disks. Generally faster.
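
The parity idea behind RAID3 and RAID5 can be shown in a few lines. This is a minimal sketch, assuming parity is the bytewise XOR of the data blocks in a stripe; the block contents, the block size, and the helper name `xor_blocks` are made up for illustration and are not from the talk.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus one parity block (sizes are illustrative).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[1] and rebuild it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, any single missing block in a stripe can be recomputed from the remaining blocks, which is why one parity block per stripe is enough to survive one disk failure.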

Disk Arrays
The problem: disk arrays are hard to use.
– Configuring one requires understanding the disk workload.
– A wrong configuration is expensive to fix: system performance is degraded, and the data has to be moved off to other storage.
– Adding capacity or new disks requires moving the data off and then restoring it.

Hierarchical Storage: the Solution
Combine the performance of mirrored disks with the cost-capacity benefits of RAID5.
Constraint
– The set of active data must change slowly.
Can be implemented three ways
– Manually: error-prone
– In the file system: not particularly portable
– In a smart array controller: how HP's solution is done

Important Features
Mapping that allows transparent migration of disk blocks.
Mirroring and RAID5.
Adaptation to changes in the amount of data stored.
– Starts empty; data is stored in mirrored space until it is full, then migrated to RAID5.
– Data moves between mirrored and RAID5 space at a fine granularity of 64 KB units.
Hot-pluggable disks, fans, power supplies, and controllers.

Features, continued
Online storage capacity expansion
– Can add up to 12 disks transparently
New disks are easily added.
Active hot spare.
Simple administration.
Log-structured RAID5 writes.

AutoRAID Details
Similar to a regular RAID array
– A set of disks, an intelligent controller, and caches for staging data
Physical layout consists of:
– Physical Extents (PEXes): 1 MB in size, each made up of 128 KB segments
– Segments are part of either a mirrored set or RAID5
– Physical Extent Groups (PEGs): a PEG is a group of PEXes, allocated so that the load is distributed across all disks
– A PEG spans at least three disks
– A PEG is assigned to mirrored storage, assigned to RAID5 storage, or left unassigned
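
The sizes above fix some simple arithmetic. The sketch below uses the PEX and segment sizes from this slide and the 64 KB unit from the other slides; the capacity formula is a simplification of mine, assuming a mirrored PEG spends half its space on the second copy and a RAID5 PEG spends one PEX worth of space on parity. The function name `peg_capacity_rbs` is hypothetical.

```python
KB = 1024
PEX_SIZE = 1024 * KB      # 1 MB physical extent
SEGMENT_SIZE = 128 * KB   # segment size within a PEX
RB_SIZE = 64 * KB         # 64 KB unit of data movement

SEGMENTS_PER_PEX = PEX_SIZE // SEGMENT_SIZE
RBS_PER_PEX = PEX_SIZE // RB_SIZE

def peg_capacity_rbs(num_pexes, layout):
    """Usable 64 KB blocks in a PEG (simplified model, see assumptions above)."""
    if num_pexes < 3:
        raise ValueError("a PEG spans at least three disks")
    if layout == "mirrored":
        # Every block is stored twice, so half the raw space is usable.
        return num_pexes * RBS_PER_PEX // 2
    if layout == "raid5":
        # One PEX worth of space in the group holds parity.
        return (num_pexes - 1) * RBS_PER_PEX
    raise ValueError(f"unknown layout: {layout}")
```

Under these assumptions a three-PEX group holds 24 blocks when mirrored and 32 when used as RAID5, which illustrates the capacity advantage of RAID5 that motivates the hierarchy.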

Logical View
To hosts, AutoRAID presents storage as logical 64 KB pieces called Relocation Blocks (RBs).
– When a LUN (Logical Unit Number) is created or enlarged, its address space is mapped onto a set of RBs.
– A LUN is the logical address of an individual drive presented by a disk array.
– Physical space is allocated only on write.
Each PEG can hold a number of RBs
– The number is a function of the size of the PEG.
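
The address mapping and allocate-on-write behavior can be sketched as follows. This is an illustration only: the class `Lun`, its methods, and the use of a simple counter as the "backing slot" are assumptions of mine, not the controller's data structures.

```python
RB_SIZE = 64 * 1024  # relocation block size from the slide

class Lun:
    """Hypothetical LUN whose RBs get backing space only when first written."""

    def __init__(self, size_bytes):
        self.num_rbs = -(-size_bytes // RB_SIZE)  # ceiling division
        self.rb_map = {}       # RB index -> backing slot (illustrative integer)
        self._next_slot = 0

    def locate(self, offset):
        """Split a byte offset into (RB index, offset within that RB)."""
        rb_index, within = divmod(offset, RB_SIZE)
        if not 0 <= rb_index < self.num_rbs:
            raise IndexError("offset outside the LUN")
        return rb_index, within

    def write(self, offset):
        rb_index, within = self.locate(offset)
        if rb_index not in self.rb_map:       # first write: allocate now
            self.rb_map[rb_index] = self._next_slot
            self._next_slot += 1
        return self.rb_map[rb_index], within

    def read(self, offset):
        rb_index, within = self.locate(offset)
        return self.rb_map.get(rb_index), within  # None if never written

lun = Lun(1024 * 1024)
first = lun.write(70000)
```

A 1 MB LUN built this way has 16 RB entries in its address space but consumes backing space only for the RBs that have actually been written.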

How It Works
The host initiates a read or write operation to the disk array.
– Reads can be served from the array's cache, which is fast.
– Writes are more complicated.

AutoRAID Writes
The controller has a non-volatile write cache (NVRAM).
– The host loads its request into NVRAM; once the data is there, the host considers its request complete.
– Some policies wait for additional writes so that they can be batched together.
When NVRAM is flushed, a background write is initiated.
– If the data already exists in mirrored space, it is written there.
– Otherwise, the data is first promoted to mirrored space, since it is now active, and then written.
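
The write path described above can be sketched as a toy model. Everything here is illustrative: the class `ArraySketch`, the use of dictionaries for NVRAM and the two storage classes, and the method names are assumptions, and the model ignores capacity limits and failure handling.

```python
class ArraySketch:
    """Toy model of the write path: NVRAM first, mirrored space on flush."""

    def __init__(self):
        self.nvram = {}     # RB id -> data waiting to be flushed
        self.mirrored = {}  # RB id -> data held in mirrored space
        self.raid5 = {}     # RB id -> data held in RAID5 space

    def host_write(self, rb, data):
        # The host sees completion as soon as the data is in NVRAM.
        self.nvram[rb] = data
        return "ack"

    def flush(self):
        # Background write: everything lands in mirrored space.
        for rb, data in self.nvram.items():
            if rb not in self.mirrored:
                # The RB is active again, so promote it out of RAID5 space.
                self.raid5.pop(rb, None)
            self.mirrored[rb] = data
        self.nvram.clear()

array = ArraySketch()
array.raid5[7] = b"old"
ack = array.host_write(7, b"new")
pending = dict(array.mirrored)  # still empty: nothing flushed yet
array.flush()
```

Keying NVRAM by RB also shows why batching helps: repeated writes to the same RB before a flush collapse into a single background write.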

Promotions
Migration code is called to move data from RAID5 space to mirrored space.
If no space is left in mirrored space, some data is demoted to RAID5 space to make room.
Some of these cases are catch-22 situations that need special handling.

Reads and Writes
Reads and writes in mirrored space are simple.
– A read picks one of the copies and reads it.
– A write goes to both disks and is complete when both copies have been written.
Reads in RAID5 space are straightforward.
Writes in RAID5 space are more complicated.
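
A minimal sketch of the mirrored case follows. The class `MirroredPair` is hypothetical, and alternating reads between the two copies is my assumption; the slide only says that a read picks one copy.

```python
class MirroredPair:
    """Two copies of every block; dictionaries stand in for the two disks."""

    def __init__(self):
        self.disks = [{}, {}]
        self._turn = 0

    def write(self, block, data):
        # The write counts as complete only after both copies are stored.
        for disk in self.disks:
            disk[block] = data
        return True

    def read(self, block):
        # Pick one copy; here we simply alternate between the disks.
        disk = self.disks[self._turn]
        self._turn ^= 1
        return disk[block]

pair = MirroredPair()
done = pair.write(5, b"payload")
first_read = pair.read(5)
second_read = pair.read(5)
```

Either copy can serve a read, which is one reason mirrored reads tend to be fast, while every write pays for two disk operations.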

RAID5 Writes
RAID5 storage is laid out like a log.
– RBs that move out of mirrored space are appended to a RAID5 PEG, provided it has free slots.
RB writes can be done in two ways:
– Per-RB writes: each RB generates two disk writes, one for the data and one for the parity.
– Batched writes: wait until all the RBs in a stripe have been written, so the stripe needs only one parity write. Commonly used in RAID5 implementations.
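
The difference between the two write styles is easy to count. This sketch tallies disk writes only (it ignores any reads needed to compute parity), and it assumes a partially filled final stripe still costs one parity write; the function names and the stripe width used in the example are illustrative.

```python
def per_rb_writes(num_rbs):
    """Each RB costs one data write plus one parity write."""
    return 2 * num_rbs

def batched_writes(num_rbs, data_slots_per_stripe):
    """One data write per RB, plus one parity write per stripe touched."""
    full, remainder = divmod(num_rbs, data_slots_per_stripe)
    stripes = full + (1 if remainder else 0)
    return num_rbs + stripes

# Demoting 8 RBs into stripes with 4 data slots each.
unbatched = per_rb_writes(8)
batched = batched_writes(8, 4)
```

With these numbers the per-RB approach issues 16 disk writes and the batched approach 10, and the gap widens as stripes get wider.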

Compaction: Holes and Garbage Collection
Demoting and promoting leave holes in mirrored space.
– Holes are added to a free list.
– They can be reused for promotions from RAID5 space.
– RBs can also be moved into holes to empty a PEG, freeing it for use as RAID5 storage.
RAID5 space has the same problem.
– Cleaning it up is called garbage collection.
– Holes there cannot simply be filled; they must be cleaned up.
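
Hole filling in mirrored space can be sketched like this. The representation (each PEG as a list of slots holding an RB id or `None` for a hole), the choice of the emptiest PEG as the one to free, and the function name `free_emptiest_peg` are all assumptions made for illustration.

```python
def free_emptiest_peg(pegs):
    """Move RBs out of the least-used PEG into holes elsewhere.

    pegs maps a PEG name to a list of slots (RB id, or None for a hole).
    Returns the name of the PEG that was emptied, or None if the other
    PEGs do not have enough holes to absorb its RBs.
    """
    def used(name):
        return sum(slot is not None for slot in pegs[name])

    victim = min(pegs, key=used)
    holes = [(name, i)
             for name, slots in pegs.items() if name != victim
             for i, slot in enumerate(slots) if slot is None]
    movers = [i for i, slot in enumerate(pegs[victim]) if slot is not None]
    if len(movers) > len(holes):
        return None
    for src, (dst_name, dst_index) in zip(movers, holes):
        pegs[dst_name][dst_index] = pegs[victim][src]
        pegs[victim][src] = None
    return victim

pegs = {"A": [1, None, 2, None], "B": [None, 3, None, None], "C": [4, 5, None, 6]}
freed = free_emptiest_peg(pegs)
```

After the call, PEG "B" holds no RBs and could be handed over to RAID5 storage, while its single RB now occupies a former hole in PEG "A".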

Migrations/Balancing
Migration from mirrored space to RAID5 space
– RBs are selected by a Least Recently Written policy (analogous to LRU, but driven by writes).
– Done in the background.
Balancing
– When new drives are added, data is migrated to balance performance across the disks.
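
Least Recently Written selection can be sketched with a write counter per RB. The class `MirroredSpace`, the logical clock, and the capacity parameter are illustrative assumptions; the point is only that writes, not reads, refresh an RB's position.

```python
class MirroredSpace:
    """Tracks when each RB was last written and picks demotion victims."""

    def __init__(self, capacity_rbs):
        self.capacity = capacity_rbs
        self.last_write = {}   # RB id -> logical time of the latest write
        self._clock = 0

    def record_write(self, rb):
        self._clock += 1
        self.last_write[rb] = self._clock

    def pick_demotions(self):
        """Return the RBs to demote, least recently written first."""
        excess = len(self.last_write) - self.capacity
        if excess <= 0:
            return []
        oldest_first = sorted(self.last_write, key=self.last_write.get)
        return oldest_first[:excess]

space = MirroredSpace(capacity_rbs=2)
for rb in ["a", "b", "c", "a"]:
    space.record_write(rb)
victims = space.pick_demotions()
```

Here "a" was written first but rewritten last, so "b" becomes the demotion candidate when the mirrored space is one RB over capacity.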

Testing Setup
The baseline configuration was a 12-disk system with one controller and 24 MB of controller data cache.
Connected to an HP 9000/K400 system with one processor and 12 MB.
Compared against a Data General CLARiiON disk array.

Performance Results
AutoRAID vs. RAID array vs. JBOD-LVM
– On the OLTP workload, AutoRAID outperforms the RAID array and reaches about three quarters of the JBOD-LVM performance.
– In JBOD-LVM, JBOD means "Just a Bunch of Disks".
– AutoRAID writes were slower than on JBOD-LVM because mirrored writes, which go to two disks, were slower than JBOD-LVM writes.

Some Notes
Increasing the speed of the disks improves back-end performance.
– Improving the transfer rate matters more than reducing rotational latency.

Summary
HP AutoRAID is very easy to use.
Sysadmins can add disks and carry out other tasks without worrying about whether the disk layout is correct.