Published by Chloe Hampton. Modified over 5 years ago.
1
Hardware Assisted Fault Tolerance Using Reconfigurable Logic
Ayse K. Coskun CSE 237A - Project
2
Project Outline
- Motivation
- Hardware Fault Tolerance Techniques
- Fault Tolerance Design Flow
- Example Circuitry for Implementation
  - Redundancy: fault masking, error detection, diagnosis
  - Reconfiguration: eliminating faulty blocks
- Results & Discussion
3
Motivation
- Fault resilience is required at certain levels in each circuit
  - High fault rates in very deep submicron (VDSM) & nanoscale devices
- Fault masking
  - Transient (temporary) errors, single event upsets
  - High clock rates, fast propagation of faults
- Further solutions needed for:
  - Eliminating defects to increase manufacturing yield
  - Eliminating permanent faults (run-time)
  - Increasing device lifetime
4
HW Assisted Fault Tolerance
Pros
- Fast detection and recovery
- Transparent to the user
- No alternative at present for building circuits with high reliability
Cons
- Area overhead
- Timing overhead may be a problem
  - Hard real-time applications
- Considerable design time & effort
5
Goals
- Study fault tolerance techniques and design flow
- Implement a redundancy-based circuit
  - Fault masking
  - Error detection and diagnosis
  - Recovery
- Practice reconfiguration on the circuit
  - Mark off faulty blocks and reconfigure (off-line)
  - Dynamic reconfiguration (canceled)
6
Fault Tolerance Design Flow
7
Example Circuitry for Implementation (VHDL)
8
Redundancy Triple Modular Redundancy (TMR)
- Fast fault masking & recovery for hard real-time and safety-critical applications
- Place voters at the outputs of every clocked block (no voting scheme for combinational circuits)
- Masking ability: faults are masked when
  - Only one copy is faulty
  - More than one copy is faulty, but the errors are at different register locations
- Duplication can detect errors but cannot mask them
- Diagnosis and recovery
  - Additional circuitry added for diagnosis and recovery
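The masking cases listed above can be checked with a small behavioural model. This is a minimal Python sketch, not the project's VHDL: the 8-bit register value `GOLDEN` and the function names are assumptions made for illustration, and the voter is assumed to be a bitwise 2-of-3 majority.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three copies of a register."""
    return (a & b) | (a & c) | (b & c)


def duplication_detects(a: int, b: int) -> bool:
    """With only two copies, a mismatch reveals an error but not which copy is right."""
    return a != b


GOLDEN = 0b1011_0010  # hypothetical fault-free register contents

# Case 1: only one copy is faulty (several bits flipped) -> masked.
one_faulty = majority_vote(GOLDEN, GOLDEN ^ 0b0000_1111, GOLDEN)

# Case 2: two copies are faulty, but at different bit positions -> still masked.
two_faulty_disjoint = majority_vote(GOLDEN ^ 0b0000_0001, GOLDEN ^ 0b1000_0000, GOLDEN)

# Case 3: two copies are faulty at the same bit position -> the voter is outvoted.
two_faulty_same_bit = majority_vote(GOLDEN ^ 0b0000_0001, GOLDEN ^ 0b0000_0001, GOLDEN)

# Duplication flags the mismatch but has no third copy to decide the correct value.
dup_flag = duplication_detects(GOLDEN, GOLDEN ^ 0b0000_0001)
```

Cases 1 and 2 return the fault-free value; case 3 does not, which is why masking depends on the errors landing at different register locations.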
10
TMR with Roll-Forward Recovery
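The figure for this slide is not part of the transcript. As a rough behavioural illustration of roll-forward recovery in general (not the author's circuit), the Python sketch below assumes each copy's register is reloaded from the voted value every cycle, so a corrupted copy rejoins the others without rolling back to an earlier state. The counter used as next-state logic and all names are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three register copies."""
    return (a & b) | (a & c) | (b & c)


def next_state(state: int) -> int:
    """Hypothetical next-state logic: an 8-bit counter."""
    return (state + 1) & 0xFF


def run_tmr_roll_forward(cycles: int, upsets: dict):
    """Simulate three copies with voting and roll-forward recovery.

    upsets maps a cycle number to (copy_index, xor_mask), modelling a
    transient bit flip in one copy's register during that cycle.
    Returns (voted outputs per cycle, final registers, corrected copy count).
    """
    regs = [0, 0, 0]
    trace = []
    corrections = 0
    for cycle in range(cycles):
        regs = [next_state(r) for r in regs]
        if cycle in upsets:
            idx, mask = upsets[cycle]
            regs[idx] ^= mask  # inject the transient fault
        voted = majority_vote(*regs)
        corrections += sum(1 for r in regs if r != voted)
        trace.append(voted)
        # Roll forward: every copy continues from the voted state.
        regs = [voted, voted, voted]
    return trace, regs, corrections


trace, final_regs, corrections = run_tmr_roll_forward(5, {2: (1, 0x10)})
```

In this run the upset in copy 1 at cycle 2 never reaches the voted output, and all three copies hold the same state again from the next cycle onward.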
11
Fault Insertion & Diagnosis
- MUXes added at several points to force lines to faulty values
- ModelSim verification
- Diagnosis:
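The fault-insertion idea can be sketched in a few lines of Python. This is a behavioural model under stated assumptions, not the project's VHDL or its diagnosis circuit: `fault_mux` stands in for an inserted multiplexer, and `diagnose` assumes diagnosis is done by comparing each copy's output with the voted value. Both names are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three module outputs."""
    return (a & b) | (a & c) | (b & c)


def fault_mux(value: int, inject: bool, forced_value: int) -> int:
    """Model of an inserted MUX: pass the line through, or force a faulty value."""
    return forced_value if inject else value


def diagnose(outputs):
    """Return the indices of copies whose output disagrees with the voted value."""
    voted = majority_vote(*outputs)
    return [i for i, out in enumerate(outputs) if out != voted]


CORRECT = 0x3C  # hypothetical fault-free module output

# Force the output of copy 1 to all zeros, leave the other copies untouched.
outputs = [
    fault_mux(CORRECT, False, 0x00),
    fault_mux(CORRECT, True, 0x00),
    fault_mux(CORRECT, False, 0x00),
]
faulty_copies = diagnose(outputs)

# With no fault inserted, no copy is reported.
clean_copies = diagnose([CORRECT, CORRECT, CORRECT])
```

The same comparison reports two copies when both are faulty at different bit positions, since the voted value is still correct in that case.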
12
Xilinx RTL Schematic - Top level
13
Controller - RTL schematic
14
Reconfiguration
Dynamic reconfiguration:
- Needs an interface to load different configurations onto the chip online
- Canceled because of complexity
Off-line reconfiguration:
- Xilinx Area Constraints Editor
- Edit *.ucf file
- AREA_GROUP: includes selected instances of the circuit
    INST "instance" AREA_GROUP="GROUP1";
    ...
    AREA_GROUP "GROUP1" RANGE=SLICE_X0Y0:SLICE_X7Y35;
    AREA_GROUP "GROUP1" GROUP=CLOSED;
    AREA_GROUP "GROUP1" PLACE=OPEN;
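One way to mark off a faulty block off-line is to regenerate the area constraints so that the slice range skips the faulty column. The Python sketch below only emits text in the constraint syntax shown above. The instance name `tmr_copy_a`, the helper names, and the choice to emit one RANGE line per fault-free span are assumptions, and whether the tools accept several RANGE lines for one group should be checked against the Xilinx Constraints Guide.

```python
def columns_excluding(x_lo: int, x_hi: int, faulty_x: int):
    """Split the column span [x_lo, x_hi] into spans that skip column faulty_x."""
    spans = []
    if faulty_x > x_lo:
        spans.append((x_lo, min(faulty_x - 1, x_hi)))
    if faulty_x < x_hi:
        spans.append((max(faulty_x + 1, x_lo), x_hi))
    return spans


def area_group_ucf(group: str, instances, x_spans, y_max: int) -> str:
    """Build UCF text that assigns instances to a group and constrains its slices."""
    lines = [f'INST "{inst}" AREA_GROUP="{group}";' for inst in instances]
    for x_lo, x_hi in x_spans:
        # Assumption: one RANGE line per fault-free span of columns.
        lines.append(
            f'AREA_GROUP "{group}" RANGE=SLICE_X{x_lo}Y0:SLICE_X{x_hi}Y{y_max};'
        )
    lines.append(f'AREA_GROUP "{group}" GROUP=CLOSED;')
    lines.append(f'AREA_GROUP "{group}" PLACE=OPEN;')
    return "\n".join(lines)


# Hypothetical case: slice column X3 is marked faulty inside the X0..X7 region.
spans = columns_excluding(0, 7, faulty_x=3)
ucf_text = area_group_ucf("GROUP1", ["tmr_copy_a"], spans, y_max=35)
print(ucf_text)
```

The generated text would replace the corresponding lines in the *.ucf file before rerunning the implementation steps.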
15
Reconfiguration cont’d
16
Reconfiguration cont’d
Common reconfiguration approaches:
- Tile-based
- Column-based
- Hierarchical (column & row) based
Xilinx design flow for reconfiguration:
VHDL -> Synthesis -> Translate -> Map -> Place & Route -> Edit floorplan / *.ucf file -> back to Translate ...
18
Before & After Reconfiguration
20
Evaluation and Discussion
- TMR with roll-forward provides effective fault masking and recovery
- Column-based reconfiguration does not add significant area overhead to the TMR circuit
- Fault-tolerant design takes considerable design time and effort
  - Development of an automated FT design flow
- Fault diagnosis is also a bottleneck for large-scale circuits
21
Summary
- Fault tolerance design flow
- Redundancy methods:
  - Fault masking
  - Fault detection
  - Recovery
  - TMR roll-forward implementation
- Reconfiguration:
  - Dynamic / off-line
  - Off-line column-based implementation