Published by Clara Stanley. Modified over 6 years ago.
1
Scaling for the Future
Katherine Yelick
U.C. Berkeley, EECS
2
Two Independent Problems
Building a reliable, scalable infrastructure
  Scalable processor, cluster, and wide-area systems
  IRAM, ISTORE, and OceanStore
One example application for the infrastructure
  Microscale simulation of biological systems
  Model signals from cell membrane to nucleus
  For understanding disease and for pharmacological and BioMEMS-mediated therapy
3
IRAM: Scaling within a Chip
Microprocessor & DRAM on a single chip:
  Avoids memory bus bottleneck
  Addresses power limits by spreading logic over chip
VIRAM chip:
  Vector architecture exploits bandwidth, preserves power & area advantages
  Support for multimedia
  IBM will fabricate Sp ’01
  200 MHz, 3.2 Gflops, 2 W
  0.18 µm mixed logic/DRAM
[Figure: block diagrams with labels Proc, $, L2$, Bus, I/O, DRAM, "Logic fab", and "DRAM fab"]
$B (billions of dollars) for separate fab lines for logic and memory
Single chip: either processor in DRAM or memory in logic fab
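The VIRAM figures quoted above (200 MHz, 3.2 Gflops, 2 W) imply a per-cycle and per-watt rate. A minimal sketch of that arithmetic follows; the function and constant names are illustrative and not from the talk.

```python
# Figures quoted on the slide for the VIRAM chip.
CLOCK_HZ = 200e6     # 200 MHz
PEAK_FLOPS = 3.2e9   # 3.2 Gflops
POWER_W = 2.0        # 2 W


def flops_per_cycle(peak_flops: float, clock_hz: float) -> float:
    """Floating-point operations per clock cycle at the peak rate."""
    return peak_flops / clock_hz


def flops_per_watt(peak_flops: float, power_w: float) -> float:
    """Peak floating-point operations per second for each watt."""
    return peak_flops / power_w


per_cycle = flops_per_cycle(PEAK_FLOPS, CLOCK_HZ)
per_watt = flops_per_watt(PEAK_FLOPS, POWER_W)
print(f"{per_cycle:.0f} flops/cycle, {per_watt / 1e9:.1f} Gflops/W")
```

At the quoted peak this works out to 16 operations per cycle and 1.6 Gflops per watt, which is the kind of rate a vector unit with several parallel lanes can sustain only if memory bandwidth keeps up.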
4
ISTORE: Scaling Clusters
Design points
  2001: 80 nodes in 3 racks
  2002: 1000 nodes in 10 racks (?)
  2005: 10K nodes in 1 rack (?)
    Add IRAM to 1” disk
Key problems are availability, maintainability, and evolutionary growth (AME) of thousand-node servers
Approach
  Hardware built for availability: monitor, diagnostics
  New class of benchmarks for AME
  Reliable systems from unreliable hw/sw components
  Introspection: the system watches itself
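The design points above imply a steep rise in packaging density. The sketch below computes nodes per rack for each year; it assumes "10K" means 10,000 nodes, and the 2002 and 2005 entries are the slide's own tentative figures.

```python
# (year, nodes, racks) as listed on the slide.
# Assumption: "10K" is read as 10,000 nodes.
DESIGN_POINTS = [
    (2001, 80, 3),
    (2002, 1000, 10),
    (2005, 10_000, 1),
]


def nodes_per_rack(nodes: int, racks: int) -> float:
    """Average number of nodes housed in one rack."""
    return nodes / racks


density = {year: nodes_per_rack(n, r) for year, n, r in DESIGN_POINTS}
for year, d in density.items():
    print(f"{year}: {d:.1f} nodes per rack")
```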
5
OceanStore: Scaling to Utilities
[Figure: federated service providers, with labels Canadian OceanStore, Sprint, AT&T, Pac Bell, and IBM]
Transparent data service provided by federation of companies:
  Monthly fee paid to one service provider
  Companies buy and sell capacity from each other
Assumptions:
  Untrusted Infrastructure: only ciphertext in the infrastructure
  Promiscuous Caching: cache anywhere, anytime
  Optimistic Concurrency Control: avoid locking
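Optimistic concurrency control, as named in the assumptions above, lets writers proceed without locks and rejects an update only if the data changed underneath it. The following is a generic version-check sketch of that idea, not OceanStore's actual protocol; `VersionedObject`, `try_update`, and `update_with_retry` are hypothetical names.

```python
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when an update was computed from a stale version."""


@dataclass
class VersionedObject:
    value: bytes = b""
    version: int = 0

    def read(self):
        """Return the current value together with its version number."""
        return self.value, self.version

    def try_update(self, new_value: bytes, based_on_version: int) -> int:
        """Commit only if nobody else committed since the caller's read.

        In a real distributed store this check-and-set would have to be
        atomic; here it is single-threaded for illustration.
        """
        if based_on_version != self.version:
            raise VersionConflict(
                f"expected version {based_on_version}, found {self.version}"
            )
        self.value = new_value
        self.version += 1
        return self.version


def update_with_retry(obj: VersionedObject, transform, max_attempts: int = 5) -> int:
    """Read, transform, and commit; re-read and retry on conflict."""
    for _ in range(max_attempts):
        value, version = obj.read()
        try:
            return obj.try_update(transform(value), version)
        except VersionConflict:
            continue
    raise VersionConflict("gave up after repeated conflicts")
```

No writer ever blocks another: a conflicting writer simply re-reads and tries again, which is cheap when conflicts are rare.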
6
The Real Scalability Problems: AME
Availability
  Systems should continue to meet quality of service goals despite failures and extreme load
Maintainability
  Minimize human administration
Evolutionary Growth
  Graceful evolution; dynamic scalability
These are problems for computation and storage services
7
Research Principles
Redundancy everywhere
  Hardware: processors, networks, disks,…
  Software: language, libraries, runtime,…
Introspection
  Reactive techniques to detect and adapt to failures, workload variations, and system evolution
  Proactive techniques to anticipate and avert problems before they happen
Benchmarking
  Define quantitative AME measures
  Benchmarks drive the field
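The reactive and proactive sides of introspection can be contrasted in a few lines. This is a minimal sketch under assumed metrics (request latency and disk usage); the function names and thresholds are hypothetical and not taken from ISTORE.

```python
from statistics import mean


def reactive_alert(latencies_ms, limit_ms):
    """Reactive: report a problem once the observed average exceeds the limit."""
    return mean(latencies_ms) > limit_ms


def steps_until_full(usage_samples, capacity):
    """Proactive: extrapolate linear growth to estimate when capacity runs out.

    Returns the number of further sampling intervals until usage reaches
    capacity, or None if there is too little data or usage is not growing.
    """
    if len(usage_samples) < 2:
        return None
    growth = (usage_samples[-1] - usage_samples[0]) / (len(usage_samples) - 1)
    if growth <= 0:
        return None
    return (capacity - usage_samples[-1]) / growth


print(reactive_alert([10, 12, 50], limit_ms=20))
print(steps_until_full([50, 60, 70], capacity=100))
```

The reactive check fires only after service has degraded, while the proactive estimate gives the system or its operator time to act before the disk fills.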
8
Benchmarks
Availability benchmarks
  Measure QoS as fault events occur
  Support for fault injection is key
  Example of software RAID system
Maintainability benchmarks
  Human factor is a challenge
Evolutionary growth benchmarks
  Performance with heterogeneous hardware
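An availability benchmark of the kind described above records quality of service over time while a fault is injected, then summarizes how often the QoS goal was met. The sketch below uses a simulated service with made-up numbers; it illustrates the measurement, not any real benchmark harness.

```python
def run_availability_benchmark(duration, base_qos, fault_at, degraded_qos, recovery_steps):
    """Simulate a per-step QoS trace for a toy service with one injected fault.

    QoS drops to degraded_qos at step fault_at, stays there for
    recovery_steps steps, then returns to base_qos. All values are
    hypothetical.
    """
    trace = []
    for t in range(duration):
        in_fault = fault_at <= t < fault_at + recovery_steps
        trace.append(degraded_qos if in_fault else base_qos)
    return trace


def fraction_meeting_goal(trace, goal):
    """Share of time steps in which QoS met or exceeded the goal."""
    return sum(1 for q in trace if q >= goal) / len(trace)


trace = run_availability_benchmark(
    duration=10, base_qos=100.0, fault_at=4, degraded_qos=60.0, recovery_steps=3
)
print(fraction_meeting_goal(trace, goal=90.0))  # prints 0.7
```

Reporting the whole trace, and not only the summary fraction, shows both the depth of the degradation and how long recovery took.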
9
Example: Faults in Software RAID
[Figure: two panels labeled Linux and Solaris]
Compares Linux and Solaris reconstruction
  Linux: minimal performance impact but longer window of vulnerability to second fault
  Solaris: large performance impact but restores redundancy fast
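The trade-off above can be pictured with a toy model in which reconstruction takes a fixed share of disk bandwidth. The model and every number in it are assumptions for illustration; they are not measurements of Linux or Solaris.

```python
def rebuild_tradeoff(data_gb, disk_mb_per_s, rebuild_share):
    """Toy model of RAID reconstruction.

    rebuild_share is the fraction of disk bandwidth given to rebuilding.
    Returns (window_hours, foreground_fraction): how long the array stays
    exposed to a second fault, and the bandwidth left for normal requests.
    """
    if not 0 < rebuild_share <= 1:
        raise ValueError("rebuild_share must be in (0, 1]")
    rebuild_mb_per_s = disk_mb_per_s * rebuild_share
    window_hours = data_gb * 1024 / rebuild_mb_per_s / 3600
    return window_hours, 1 - rebuild_share


# Hypothetical disk: 36 GB to rebuild at 20 MB/s raw bandwidth.
gentle = rebuild_tradeoff(36, 20, rebuild_share=0.1)
aggressive = rebuild_tradeoff(36, 20, rebuild_share=0.8)
print(f"gentle rebuild:     {gentle[0]:.1f} h exposed, {gentle[1]:.0%} foreground")
print(f"aggressive rebuild: {aggressive[0]:.1f} h exposed, {aggressive[1]:.0%} foreground")
```

A small rebuild share keeps foreground performance high but lengthens the window of vulnerability; a large share does the opposite.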
10
Simulating Microscale Biological Systems
Large-scale simulation useful for
  Fundamental biological questions: cell behavior
  Design of treatments, including BioMEMS
Simulations limited in part by
  Machine complexity, e.g., memory hierarchies
  Algorithmic complexity, e.g., adaptation
Old software model:
  Hide the machine from the users
  Implicit parallelism, hardware-controlled caching
  Results were unusable
  Witness success of MPI
11
New Model for Scalable High Confidence Computing
Domain-specific language that judiciously exposes machine structure
  Explicit parallelism, load balancing, and locality control
  Allows for construction of complex, distributed data structures
Current
  Demonstration on higher level models
  Heart simulation
Future plans
  Algorithms and software that adapt to faults
  Microscale systems
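Explicit load balancing and locality control mean the programmer, not the runtime, decides which worker owns which data. The sketch below shows one common form of this, a balanced block distribution, written in Python as a stand-in; it is not the domain-specific language the slide refers to, and the function names are hypothetical.

```python
def block_ranges(n_items, n_workers):
    """Split range(n_items) into contiguous blocks whose sizes differ by at most one.

    Contiguous blocks keep each worker's data local; near-equal sizes
    balance the load.
    """
    base, extra = divmod(n_items, n_workers)
    ranges = []
    start = 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def owner_of(index, ranges):
    """Return the worker that owns a given global index."""
    for worker, (lo, hi) in enumerate(ranges):
        if lo <= index < hi:
            return worker
    raise IndexError(index)


ranges = block_ranges(10, 3)
print(ranges)               # prints [(0, 4), (4, 7), (7, 10)]
print(owner_of(5, ranges))  # prints 1
```

Because ownership is explicit, a computation can tell whether an access is local or remote, which is the information an implicit model hides.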
12
Conclusions
Scaling at all levels
  Processors, clusters, wide area
Application challenges
  Both storage and compute intensive
Key challenges to future infrastructure are:
  Availability and reliability
  Complexity of the machine