Published by Elijah Dustin Powers. Modified over 9 years ago.
1
Storage Research Meets The Grid
Remzi Arpaci-Dusseau
2
ADSL: Where Gray-box techniques meet storage systems
[Diagram labels: Storage, Gray Box]
3
The Who, How, and What of ADSL
- Who: Andrea and Remzi Arpaci-Dusseau
  - And of course a bunch of students
- How: Gray-box Techniques
  - Assume system is a “gray box”
  - Leverage knowledge of its implementation to:
    - Gain more information
    - Control its behavior
- What: Storage Systems
  - Smarter disks and RAIDs
4
Semantically-smart Disks
- Problem: Most disks don’t know much
  - Block-based SCSI interface limits knowledge
- And what a waste of potential!
  - Modern RAIDs have substantial processing, memory
- A semantically-smart disk system
  - Figures out how the file system is using it
  - Exploits that to build new functionality into storage
5
Trend that Drives This Session: Data Demands on the Rise
- Focus of original batch queueing systems: CPU
  - “Cycle stealing”
  - Compute clusters
  - Distributed supercomputer
- But data demands of jobs are on the rise…
  - Input, output, temp files and checkpoints
  - Modern science is increasingly data centric
6
Focus of this talk: Traditional storage vs. Grid storage
- Most aspects of modern storage systems are designed with a certain domain in mind
  - Local area environment, presence of admin, etc.
- Grid changes almost every assumption
  - Wide area, no admin, etc.
- Conclusion: Must reexamine how to build storage systems from the ground up
7
Outline
- Introduction
- Traditional vs. Grid Storage
  - Data reliability
  - Management
  - Caching and Overlap
  - Evaluation
- Conclusions
8
Data Reliability: Traditional
- All data treated equally, and is sacred
- Most users tolerate some amount of data loss (30-second delay before flush to disk)
- Losing one byte after flush is catastrophic
- Strong implications for design: Backup + disaster recovery
9
Data Reliability: Grid
- Different types of I/O, treat accordingly
- Einstein’s Matter-Energy equivalence: E = mc^2
- Grid analogy: Data-Computation equivalence
  - E(M) = C
- Knowledge is key: If you can refetch M, you can recompute C
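One way to read the data-computation equivalence: if M is an input that can be refetched and E is a deterministic computation, then the output C = E(M) does not have to be stored reliably, because it can be regenerated after a loss. The sketch below illustrates only that assumed reading. `OutputStore`, `fetch_input`, and `run_job` are hypothetical names invented for the example, not part of any system named in the talk.

```python
from typing import Callable, Dict


class OutputStore:
    """Unreliable store for computed outputs C (hypothetical illustration)."""

    def __init__(self, fetch_input: Callable[[str], bytes],
                 run_job: Callable[[bytes], bytes]) -> None:
        self._fetch_input = fetch_input  # refetches input M
        self._run_job = run_job          # deterministic computation E
        self._outputs: Dict[str, bytes] = {}
        self.computations = 0            # how many times E(M) was evaluated

    def get(self, key: str) -> bytes:
        # If C is missing (lost or never computed), rebuild it as E(M).
        if key not in self._outputs:
            self._outputs[key] = self._run_job(self._fetch_input(key))
            self.computations += 1
        return self._outputs[key]

    def lose(self, key: str) -> None:
        # Simulate a storage failure that drops C.
        self._outputs.pop(key, None)


def fetch_input(key: str) -> bytes:
    # Stand-in for refetching input M from wherever it lives.
    return key.encode("utf-8")


def run_job(data: bytes) -> bytes:
    # Stand-in for a deterministic job E.
    return data.upper()


store = OutputStore(fetch_input, run_job)
first = store.get("dataset-a")
store.lose("dataset-a")           # C is lost...
second = store.get("dataset-a")   # ...and transparently recomputed
```

The cost of losing C here is one extra fetch and one extra run, which is why the type of data matters when deciding how much reliability to pay for.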
10
Management: Traditional
- Storage administrators control system
  - Performance tuning
  - Problem fixing
  - User handling
- Human intelligence can be applied to make things run smoothly
11
Management: Grid
- No administrator to help out
  - Though may have to live within administrative limitations
- System must automatically handle problems
  - Tune to environment
  - Deal with failures
  - Give reasonable feedback to users upon errors and other problem scenarios
12
Buffering and Overlap: Traditional
- Used throughout systems for performance
- Important cache: Client-side
  - NFS: Memory
  - AFS: Disk (and memory)
- Caches are managed transparently
- Overlap: Disk->memory, across network, also transparent
- Result: Operations can run as if they are local
[Diagram labels: Client, Server, $$]
13
Buffering and Overlap: Grid
- Used throughout for performance, reliability
- Many more levels of cache
  - Not just clients/servers
- Caches managed both transparently and not transparently
- Overlap is more complex too (multiple users, resources)
- Have to deal with more issues: failure, cost differentials
[Diagram labels: $$$, $, Home Site, WAN]
14
Evaluation: Traditional
- Traditional storage metrics: Myopic focus
  - May miss “big picture”
- One example: Availability
  - Defined as “uptime” of system
  - What’s good: “5 9s” of availability (up 99.999%)
- Implications: Systems are engineered for enterprise use (and thus over-engineered for many uses)
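To make the “5 9s” figure concrete, the short calculation below converts an availability fraction into allowed downtime per year. It assumes a 365-day year; the function name is chosen for this example only.

```python
def downtime_minutes_per_year(availability: float,
                              days_per_year: float = 365.0) -> float:
    """Return the yearly downtime, in minutes, that an availability fraction allows."""
    minutes_per_year = days_per_year * 24 * 60  # 525,600 for a 365-day year
    return (1.0 - availability) * minutes_per_year


five_nines = downtime_minutes_per_year(0.99999)   # about 5.3 minutes per year
three_nines = downtime_minutes_per_year(0.999)    # about 525.6 minutes (8.76 hours)
```

Going from three nines to five nines shrinks the yearly downtime budget by a factor of 100, which suggests why systems built to that target may be over-engineered for uses that can tolerate brief outages.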
15
Evaluation: Grid
- Grid metrics can focus on what’s important for Grid jobs: Job throughput
- Instead of availability, measure impact of failure on the aspect of system that matters most
- Result: An end-to-end perspective to evaluate merit of new approaches in the Grid space
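A toy model can show why availability and job throughput may diverge. In the sketch below, two hypothetical systems have identical uptime, but in one a storage failure discards the running job's progress and in the other only the failed hour is lost. The model, its parameter names, and the failure schedule are invented for illustration and are not measurements from the talk.

```python
from typing import Set


def jobs_completed(job_hours: int, total_hours: int,
                   failure_hours: Set[int],
                   failure_loses_progress: bool) -> int:
    """Count jobs finished in total_hours when storage is down during failure_hours."""
    done = 0
    progress = 0
    for hour in range(total_hours):
        if hour in failure_hours:
            if failure_loses_progress:
                progress = 0  # the running job must start over
            continue          # no work gets done in a failed hour
        progress += 1
        if progress == job_hours:
            done += 1
            progress = 0
    return done


failures = {9, 29, 49, 69, 89}             # made-up schedule: 5 bad hours in 100
availability = 1 - len(failures) / 100     # 95% for both systems

fragile = jobs_completed(10, 100, failures, failure_loses_progress=True)
robust = jobs_completed(10, 100, failures, failure_loses_progress=False)
```

With these inputs both systems are up 95% of the time, yet the fragile one finishes 5 ten-hour jobs and the robust one finishes 9, so an availability number alone would hide the difference that matters to Grid jobs.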
16
Summary
- Grid changes storage systems
  - Makes some things harder (caching, overlap, failures)
  - Makes other things easier (better understanding of workload and metrics)
- How to make it all work?
  - Exploit knowledge of workloads and systems to reduce difficult problems to tractable ones
17
The Data-centric Lineup
- Lots of exciting work going on at Wisconsin in this space!
- First session:
  - John Bent - “Batch-pipelined Workloads”
  - Doug Thain - “Migratory File Services”
- Second session:
  - Joseph Stanley - “NeST”
  - Tevfik Kosar - “Stork”
  - George Kola - “Disk Router”
- Guest speaker: Arie Shoshani - “Coscheduling Storage and CPUs”