Published by Kenneth Riley. Modified over 9 years ago.
2
Relax: An Architectural Framework for Software Recovery of Hardware Faults
Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaralingam
UW-Madison Computer Sciences, Vertical Research Group, © 2010
3
ISCA 2010 - 3
Executive Summary
  Problem
    Technology is driving simple hardware
    Fault recovery requires complex hardware
  Software Recovery
    Enables simple hardware
    High energy efficiency
  Relax: An Architectural Framework for Software Recovery
    ISA: a well-defined interface for software recovery
    Software: support to use the ISA
    Hardware: support to implement the ISA
4
Architecture Trend
  Energy efficiency
  Hardware simplification
5
Applications Trend
  Data-intensive, error-tolerant applications
  Search, Computer Vision, Data Mining, Media Processing, Scientific Computing, …
Architecture Trend
  Energy efficiency
  Hardware simplification
6
CMOS Trend
  Device variability, wear-out, soft errors
  [Figure: CMOS circuit diagram labeled Vdd, In, Out]
Applications Trend
  Data-intensive, error-tolerant applications
  Search, Computer Vision, Data Mining, Media Processing, Scientific Computing, …
Architecture Trend
  Energy efficiency
  Hardware simplification
7
Recovery Support Is Needed
  CMOS Trend
    Device variability, wear-out, soft errors
    [Figure: CMOS circuit diagram labeled Vdd, In, Out]
  Applications Trend
    Data-intensive, error-tolerant applications
    Search, Computer Vision, Data Mining, Media Processing, Scientific Computing, …
  Architecture Trend
    Energy efficiency
    Hardware simplification
Hardware Recovery
  Inefficient
  No flexibility
  Checkpoints conservative
  Complex Hardware: speculative state
Software Recovery
  Efficient
  Error tolerance
  Natural recovery points
  Simple Hardware: no speculative state
8
Relax
  Software Recovery
  Hardware Detection
  ISA
9
Relax
  ISA
  Software
  Hardware
10
ISA
  Software defines recovery handler
  Hardware detects and jumps to handler on fault, and is allowed to commit corrupted state *

    rlx RECOVER
    ...
  RECOVER:
    ...

  application error tolerance, software-defined recovery, simplicity, energy efficiency, flexibility
SIMPLE HARDWARE
* Details in paper
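The control transfer described on this slide can be mimicked in portable C. The following is only a sketch under stated assumptions, not the Relax ISA: `setjmp` stands in for `rlx RECOVER` (it records where recovery resumes), and `longjmp` stands in for the hardware jumping to the handler once a fault is detected. The names `relax_ctx`, `fault_pending`, `detect_point`, and `relaxed_double` are hypothetical, and the fault is injected by software because ordinary hardware offers no such detector.

```c
#include <setjmp.h>

/* Hypothetical software stand-ins for Relax hardware state. */
static jmp_buf relax_ctx;      /* plays the role of the recovery PC register */
static int fault_pending = 0;  /* set by the simulated fault detector */

/* Simulated detection point: if a fault was flagged, jump to the handler. */
static void detect_point(void) {
    if (fault_pending) {
        fault_pending = 0;
        longjmp(relax_ctx, 1);
    }
}

/* Doubles x inside an emulated relax region and retries on a fault.
 * inject_fault != 0 simulates one fault during the first attempt.
 * *recoveries receives how many times the handler ran. */
int relaxed_double(int x, int inject_fault, int *recoveries) {
    /* volatile: these locals change between setjmp and longjmp */
    volatile int inject = inject_fault;
    volatile int recovered = 0;

    if (setjmp(relax_ctx) != 0) {   /* "rlx RECOVER" */
        recovered++;                /* "RECOVER:" handler; then re-run region */
    }

    int result = x * 2;             /* region body, recomputed on every attempt */
    if (inject) {                   /* simulate the hardware flagging one fault */
        inject = 0;
        fault_pending = 1;
    }
    detect_point();                 /* may jump back to the setjmp above */

    *recoveries = recovered;
    return result;
}
```

Calling `relaxed_double(21, 1, &n)` runs the region twice and reports one recovery in `n`; with `inject_fault` set to 0 the handler never runs.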
11
ISA
Software
Hardware
12
Software
SAD (Sum of Absolute Differences) Example (adapted from an H.264 video encoder)

  int sad(int *left, int *right, int len) {
    int sum = 0;
    for (int i = 0; i < len; ++i) {
      sum += abs(left[i] - right[i]);
    }
    return sum;
  }
13
Software
SAD (Sum of Absolute Differences) Example (adapted from an H.264 video encoder)

  int sad(int *left, int *right, int len) {
    int sum = 0;
    for (int i = 0; i < len; ++i) {
      sum += abs(left[i] - right[i]);
    }
    return sum;
  }

  [Figure labels: raw, encoded]
  1. No writes to memory
  2. Idempotent
  3. Recoverable by re-execution
SIMPLE + INTUITIVE + FLEXIBLE
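The three properties listed for `sad` are what make re-execution a safe recovery strategy: the loop only reads its inputs and writes a local `sum`. A minimal sketch of retry recovery in plain C follows. It does not reproduce the paper's language constructs; the callback type `fault_check_fn`, the function `sad_retry`, and the test detector `fail_n_times` are hypothetical names, and the callback merely stands in for hardware fault detection.

```c
#include <stdlib.h>

/* Hypothetical detector hook: returns nonzero if a fault occurred in the
 * region that just ran. On Relax hardware this check would be implicit. */
typedef int (*fault_check_fn)(void *state);

/* SAD with retry recovery: the region writes only the local `sum`, so
 * re-executing it from scratch is always safe (idempotent). */
int sad_retry(const int *left, const int *right, int len,
              fault_check_fn fault_detected, void *state) {
    int sum;
    do {
        sum = 0;                             /* reset region-local state */
        for (int i = 0; i < len; ++i) {
            sum += abs(left[i] - right[i]);
        }
    } while (fault_detected(state));         /* fault: drop sum and retry */
    return sum;
}

/* Example detector for experiments: reports a fault for the first
 * *state calls, then reports none. */
int fail_n_times(void *state) {
    int *remaining = state;
    if (*remaining > 0) {
        --*remaining;
        return 1;
    }
    return 0;
}
```

With `fail_n_times` and a counter of 2, the region runs three times and still returns the fault-free sum, since nothing outside the region was modified by the failed attempts.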
14
ISA
Hardware
Software
15
Hardware
Microarchitecture
  1. Fine-grained hardware detection (e.g. Argus)
  2. Recovery PC register + control logic
SIMPLE MICROARCHITECTURE
16
Hardware Organization
  Homogeneous Relax: all cores with no hardware recovery support
  Statically Heterogeneous Relax: some cores with hardware recovery support; some cores without
  Dynamically Heterogeneous Relax: hardware recovery adaptively disabled
  Legend: “Relaxed” cores have no hardware recovery; normal cores have hardware recovery
FLEXIBLE DESIGN
17
ISA
Software
Hardware
Evaluation
18
Evaluation
  Is it useful?
  How useful is it?
19
Is it Useful?
  Language support using LLVM
  7 applications
  One relax region per application (most dominant function)
  Retry and discard behavior

  Application Name         Percent Execution Time Contribution of Function
  BarnesHut (Lonestar)     >99.9%
  bodytrack (PARSEC)       21.9%
  canneal (PARSEC)         89.4%
  ferret (PARSEC)          15.7%
  kmeans (MineBench)       83.3%
  raytrace (PARSEC)        49.4%
  x264 (PARSEC)            49.2%
IT WORKS!
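The slide names two recovery behaviors, retry and discard. Where retry re-executes the region, discard abandons its result. The sketch below illustrates discard under an assumption that is not taken from the slides: the caller uses SAD to choose the best-matching candidate block and can tolerate skipping a candidate. A faulting computation returns `INT_MAX`, so that candidate is never selected. All names (`fault_check_fn`, `sad_discard`, `best_match`, and the two test detectors) are hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical detector hook standing in for hardware fault detection. */
typedef int (*fault_check_fn)(void *state);

/* SAD with discard recovery: on a fault the result is dropped and the worst
 * possible score is returned, so the caller ignores this candidate. */
int sad_discard(const int *left, const int *right, int len,
                fault_check_fn fault_detected, void *state) {
    int sum = 0;
    for (int i = 0; i < len; ++i) {
        sum += abs(left[i] - right[i]);
    }
    return fault_detected(state) ? INT_MAX : sum;
}

/* Returns the index of the candidate with the lowest SAD against `ref`.
 * Discarded candidates score INT_MAX and lose to any clean candidate.
 * Returns -1 if every candidate was discarded. */
int best_match(const int *ref, const int *const *cands, int ncands, int len,
               fault_check_fn fault_detected, void *state) {
    int best = -1;
    int best_score = INT_MAX;
    for (int c = 0; c < ncands; ++c) {
        int score = sad_discard(ref, cands[c], len, fault_detected, state);
        if (score < best_score) {
            best_score = score;
            best = c;
        }
    }
    return best;
}

/* Example detectors for experiments. */
int never_fault(void *state) { (void)state; return 0; }

int fault_on_first_call(void *state) {
    int *calls = state;
    return (*calls)++ == 0;   /* fault only on the very first call */
}
```

The output now depends on where faults land: the same inputs can select different candidates from run to run, which is the kind of non-determinism that makes discard harder to reason about than retry.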
20
How Useful Is It?
  Software recovery for timing speculation
21
Methodology
  Instruction-level fault injection
  Execution time model
  Statically Heterogeneous Architecture
  Energy model
  Energy-delay product (EDP)
  Analytical model for hardware efficiency
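To give a feel for what an execution time model for retry might capture, here is a deliberately simple analytical sketch. It is not the model used in the paper. It assumes faults are independent with a fixed per-cycle probability `p`, and that a faulting region is always re-executed in full. Under those assumptions the number of attempts is geometric, so the expected overhead is `1/q - 1`, where `q` is the probability that a region completes without a fault. The function names are hypothetical.

```c
/* Probability that a region of `cycles` cycles completes with no fault,
 * assuming independent faults with per-cycle probability p (an assumption
 * of this sketch, not a claim about the paper's model). */
double region_success_prob(double p, int cycles) {
    double q = 1.0;
    for (int i = 0; i < cycles; ++i) {
        q *= 1.0 - p;
    }
    return q;
}

/* Expected execution-time overhead of retry, as a fraction of fault-free
 * time, if a faulting region is always re-executed in full. The number of
 * attempts is geometric with mean 1/q, so overhead = 1/q - 1. */
double retry_overhead(double p, int cycles) {
    return 1.0 / region_success_prob(p, cycles) - 1.0;
}
```

Under these assumptions, a 1000-cycle region at 10^-5 errors/cycle gives an overhead of roughly 1%. Charging a full re-execution for every fault is pessimistic when faults are detected partway through a region.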
22
Results – Execution Time
  Execution time overhead is less than 10%; 1% is typical
  Discard performance is comparable to retry
* Error rates range from 10^-3 to 10^-6 errors/cycle
23
Results – Energy-delay
  Relax achieves energy improvements for timing speculation
* Error rates range from 10^-3 to 10^-6 errors/cycle
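For readers unfamiliar with the metric, energy-delay product is energy consumed multiplied by execution time, so a design improves EDP only if its energy savings outweigh any slowdown from recovery. The helper below is a small illustrative sketch; the function names are hypothetical and no values from the paper are encoded in it.

```c
/* Energy-delay product: energy consumed multiplied by execution time. */
double edp(double energy_joules, double delay_seconds) {
    return energy_joules * delay_seconds;
}

/* EDP of a relaxed design relative to a baseline. Both arguments are
 * ratios (relaxed / baseline); a result below 1.0 is an improvement. */
double relative_edp(double energy_ratio, double delay_ratio) {
    return energy_ratio * delay_ratio;
}
```

As a made-up illustration, a design that uses 20% less energy but runs 1% longer has a relative EDP of 0.8 × 1.01 = 0.808, an improvement despite the recovery overhead.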
24
Future Work
  Better software support
    Compiler automation?
    Binary instrumentation?
    Nesting relax blocks?
  Hardware support
    What are the chip-level area and power savings?
    Is Relax hardware truly simpler?
  Other domains
    Software rollback for hardware transactional memory?
  Tools to assist analysis of “discard”
    Discard is hard to reason about; non-deterministic
25
Summary
  Emerging Architectures
    Many-core architectures are simple
    Hardware fault recovery is complex
  Emerging Applications
    Error tolerant
    Large idempotent regions
  Software Recovery is a natural fit
  Relax: an architectural framework for software recovery
    ISA: an interface to define it
    Software: support for applications to use it
    Hardware: hardware that enables it
26
?
27
ISA Semantics
  Errors must be “spatially contained” to the target resources of a relax block
    Misdirected stores and register writes are not recoverable by Relax!
  Errors must be “temporally contained” to the scope of a relax block
    ECC (or other technique) necessary for memory
    Cache coherence, cache writeback, etc. require other mechanisms
  Control flow must be “legal” (follow static control flow edges)
    Includes hardware exceptions (must wait on detection before trap)
  Atomic operations (e.g. atomic increment) are problematic
    Not supported (sorry)
28
Fault Detection
  Short latencies important for
    Detecting misdirected stores
    Detecting misdirected register writes
  Otherwise, latencies depend on region sizes
    50 cycle regions + 5 cycle latency = 10% overhead
    Average region sizes in paper = 1000 cycles
    Then, 10 cycle latency = 1% overhead
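The overhead figures on this slide follow from dividing detection latency by region size. The one-line helper below reproduces that arithmetic. It assumes, as the slide's numbers imply, that each region pays the detection latency once; the function name is hypothetical.

```c
/* Detection-latency overhead as a fraction of region execution time,
 * assuming each region pays the detection latency once. */
double detection_overhead(int region_cycles, int latency_cycles) {
    return (double)latency_cycles / (double)region_cycles;
}
```

`detection_overhead(50, 5)` gives 0.10 and `detection_overhead(1000, 10)` gives 0.01, matching the 10% and 1% quoted on the slide.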
29
“Optimal” Error Rate
  [Figure: plot against error rate, with axes labeled Time and EDP; curves for Hardware Efficiency, Execution Time, and Overall Efficiency; optimum marked]
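The idea behind the plot can be sketched numerically: tolerating more errors makes the hardware more efficient but lengthens execution, so the combined curve has a minimum somewhere in between. The sketch below assumes, purely for illustration, that overall EDP at each sampled error rate is the product of a hardware EDP factor and an execution-time EDP factor. That composition, the function name `optimal_index`, and the sample values in the note are assumptions, not data from the paper.

```c
#include <stddef.h>

/* Returns the index of the sampled error rate with the lowest overall EDP.
 * hw_edp[i]:   hardware EDP relative to baseline at sample i
 * time_edp[i]: EDP factor caused by recovery time at sample i
 * Overall EDP is modeled as their product (an assumption of this sketch).
 * Returns 0 when n is 0 or 1. */
size_t optimal_index(const double *hw_edp, const double *time_edp, size_t n) {
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (hw_edp[i] * time_edp[i] < hw_edp[best] * time_edp[best]) {
            best = i;
        }
    }
    return best;
}
```

With illustrative inputs `hw_edp = {1.0, 0.8, 0.7, 0.65}` and `time_edp = {1.0, 1.01, 1.1, 1.5}`, the products are 1.0, 0.808, 0.77, and 0.975, so the third sample (index 2) is the optimum.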