MASSIVE ARRAYS OF IDLE DISKS FOR STORAGE ARCHIVES
D. Colarelli and D. Grunwald
U. Colorado, Boulder
Highlights
Paper proposes:
– To replace tape libraries by large non-redundant arrays of disks
– To cache on active drives:
  – Files that have been recently accessed
  – Update logs for other files
– To keep other drives mostly inactive by spinning them down between accesses
Introduction (I)
Robotic tape libraries are now the standard solution for archiving very large amounts of data
Disadvantages include:
– Slow access times: average search time of 41 s for T9940 drives
– Not much cheaper than disk drives
Could we replace them with massive arrays of hard drives?
Introduction (II)
Major limitation of the hard-drive solution is power consumption
– Almost ten times that of an equivalent tape library
Could power down disks that are not currently accessed
– 50% of data are likely to be never accessed
– 25% of data are likely to be accessed once
Introduction (III)
Must be at least as reliable as tape libraries
– No need to use a redundant scheme
Solution is a Massive Array of Idle Disks (MAID)
Paper investigates design issues through trace-driven simulations
Design Issues
Two major design decisions:
– Data migration or duplication (caching)
– File system or block-level interface
Migration or caching
Migration would move “hot” data to the active drives
– Uses disk space more efficiently
– Requires a map or directory mechanism that maps the storage across all drives
Caching would cache read data and act as a write log for write data
– Keeps two copies of all cached files
– Maps or directories are proportional to the size of the cache
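To make the directory-size point concrete, here is a minimal Python sketch; the class names and structures are illustrative assumptions, not the paper's design. Migration needs a directory covering every block in the archive, while caching only needs one proportional to the cache.

```python
# Illustrative sketch of the two directory structures; not the paper's code.

class MigrationDirectory:
    """Migration: must record the drive holding every logical block,
    so the map grows with the whole archive."""
    def __init__(self):
        self.location = {}                # logical block id -> drive id

    def place(self, block, drive):
        self.location[block] = drive

    def lookup(self, block):
        return self.location[block]      # every access consults this map


class CacheDirectory:
    """Caching: only tracks blocks currently duplicated on the active drives,
    so the map is proportional to the cache size."""
    def __init__(self):
        self.cached = {}                  # logical block id -> cache slot

    def insert(self, block, slot):
        self.cached[block] = slot

    def lookup(self, block):
        # A miss falls through to the block's fixed home on a passive drive.
        return self.cached.get(block)
```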
File system or block interface
File-system interface:
– Could use file system information to cache entire files
– Would probably perform better
– Would require system modifications
Block-level interface:
– Would work with existing systems
MAID with caching
– Active drives (always on)
– Passive drives (spin up/down)
– Virtualization Manager
– Cache Manager
– Passive Drive Manager
Design choices (I)
Compared MAID-cache and MAID-no-cache
MAID-cache:
– Caches reads and writes on the active drives
– Caching unit is a “chunk” of 64 sectors
– Cache policy is LRU
– All writes are placed in the cache write log, where they wait to be committed to the non-active (passive) drives
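As a rough illustration of the cache design just listed (64-sector chunks, LRU replacement, a write log on the active drives), here is a small Python sketch. It is not the authors' implementation; the class name, sector size, and capacity handling are assumptions.

```python
from collections import OrderedDict

SECTOR_SIZE = 512                  # bytes (assumed)
CHUNK_SECTORS = 64                 # caching unit from the slide: 64-sector chunks
CHUNK_SIZE = SECTOR_SIZE * CHUNK_SECTORS

class ActiveDriveCache:
    """LRU cache of 64-sector chunks plus a write log, both kept on the
    always-on (active) drives. Illustrative sketch only."""

    def __init__(self, capacity_chunks):
        self.capacity = capacity_chunks
        self.chunks = OrderedDict()        # chunk id -> data, kept in LRU order
        self.write_log = []                # writes waiting to be committed

    def read(self, chunk_id):
        if chunk_id in self.chunks:
            self.chunks.move_to_end(chunk_id)   # refresh LRU position on a hit
            return self.chunks[chunk_id]
        return None                             # miss: caller goes to a passive drive

    def insert(self, chunk_id, data):
        self.chunks[chunk_id] = data
        self.chunks.move_to_end(chunk_id)
        if len(self.chunks) > self.capacity:
            self.chunks.popitem(last=False)     # evict the least recently used chunk

    def write(self, chunk_id, data):
        # All writes are buffered in the write log (and also cached),
        # to be committed to the passive drives later.
        self.write_log.append((chunk_id, data))
        self.insert(chunk_id, data)
```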
Design choices (II)
Must always check the write log before reading data from the cache or the passive drives
Passive drives remain on standby until:
– A cache miss occurs
– The write log becomes too long
Passive drives return to standby when the spin-down inactivity time limit is reached
– Varying this time limit is the primary way to affect system performance and energy consumption
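A minimal sketch of the resulting read path, under the same caveats (hypothetical names, toy data structures, an assumed spin-down delay): the write log is consulted first, then the cached copy on the active drives, and only on a miss is the passive drive spun up; an inactivity timer sends it back to standby.

```python
import time

SPIN_DOWN_DELAY = 30.0     # seconds of inactivity before spin-down (assumed value)

class PassiveDrive:
    """Toy model of a passive drive that spins up on a cache miss."""
    def __init__(self, blocks=None):
        self.spinning = False
        self.last_access = 0.0
        self.blocks = blocks or {}         # chunk id -> data

    def read(self, chunk_id):
        self.spinning = True               # a real model would charge a spin-up delay here
        self.last_access = time.monotonic()
        return self.blocks.get(chunk_id)

    def maybe_spin_down(self, now):
        """Return to standby once the inactivity time limit is reached."""
        if self.spinning and now - self.last_access > SPIN_DOWN_DELAY:
            self.spinning = False

def read_chunk(chunk_id, write_log, cache, drive):
    """Read path: consult the write log first, then the cache, then the passive drive."""
    # 1. The write log may hold a newer version of the chunk than the passive drive.
    for logged_id, data in reversed(write_log):
        if logged_id == chunk_id:
            return data
    # 2. Otherwise try the cached copy on the active drives.
    if chunk_id in cache:
        return cache[chunk_id]
    # 3. Cache miss: spin up the passive drive and remember the result.
    data = drive.read(chunk_id)
    if data is not None:
        cache[chunk_id] = data
    return data
```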
Simulation parameters
1. Power management policy:
– Always on
– Fixed-delay spin-down
– Adaptive spin-down
2. Data layout:
– Linear: keep successive blocks on the same drive
– Striped: spread successive blocks across all drives
3. Caching / no caching
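The two data layouts can be made concrete with a short sketch; the drive-selection formulas below are the natural reading of "linear" and "striped", and the array sizes are made-up parameters, not values from the paper.

```python
# Illustrative mapping of logical chunks onto passive drives for the two layouts.
# NUM_DRIVES and CHUNKS_PER_DRIVE are assumed parameters, not from the paper.

NUM_DRIVES = 100
CHUNKS_PER_DRIVE = 1_000_000

def linear_drive(chunk):
    """Linear layout: successive chunks stay on the same drive until it is full."""
    return chunk // CHUNKS_PER_DRIVE

def striped_drive(chunk):
    """Striped layout: successive chunks rotate round-robin across all drives."""
    return chunk % NUM_DRIVES

# A sequential scan of 8 chunks touches one drive under the linear layout
# but 8 different drives under the striped layout:
print([linear_drive(c) for c in range(8)])   # [0, 0, 0, 0, 0, 0, 0, 0]
print([striped_drive(c) for c in range(8)])  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The layout interacts with power management: a long sequential access keeps a single drive spinning under the linear layout, while under striping it touches many drives in turn.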
Simulation results (I)
Based on a supercomputer center workload
All MAID configurations achieve similar power consumption
– 15 to 16% of that of the always-on configuration
MAID configurations without a cache have average response times comparable to that of the always-on configuration
– Workload had little locality
Simulation results (II)
Average response times of MAID configurations with a cache are much worse than that of the always-on configuration
– 0.680 to 0.720 s compared to 0.303 s
The striped configuration with a fixed spin-down delay has the lowest average response time of all MAID configurations
– 0.309 s
Conclusion
MAID can achieve average response times comparable to that of an always-on configuration with much lower power consumption
IMPORTANT: In a more recent paper, the authors found that cached configurations worked much better for workloads exhibiting more locality of access than their supercomputer center workload