Hardware Assisted Fault Tolerance Using Reconfigurable Logic


1 Hardware Assisted Fault Tolerance Using Reconfigurable Logic
Ayse K. Coskun CSE 237A - Project

2 Project Outline
  Motivation
  Hardware Fault Tolerance Techniques
  Fault Tolerance Design Flow
  Example Circuitry for Implementation
    Redundancy
    Fault Masking, Error Detection, Diagnosis
  Reconfiguration
    Eliminating faulty blocks
  Results & Discussion

3 Motivation
  Fault resilience is required at certain levels in each circuit
    High fault rates in VDSM & nanoscale devices
  Fault masking
    Transient (temporary) errors, single event upsets
    High clock rates, fast propagation of faults
  Further solutions needed for:
    Eliminating defects to increase manufacturing yield
    Eliminating permanent faults (run-time)
    Increasing device life-time

4 HW Assisted Fault Tolerance
  Pros
    Fast detection and recovery
    Transparent to user
    No other way currently exists to build circuits with high reliability
  Cons
    Area overhead
    Possible timing overhead, a problem for hard real-time applications
    Considerable design time & effort

5 Goals
  Study fault tolerance techniques and design flow
  Implement a redundancy based circuit
    Fault masking
    Error detection and diagnosis
    Recovery
  Practice reconfiguration on the circuit
    Mark off faulty blocks and reconfigure (off-line)
    Dynamic Reconfiguration - canceled

6 Fault Tolerance Design Flow

7 Example Circuitry for Implementation (VHDL)

8 Redundancy
  Triple Modular Redundancy (TMR)
    Fast fault masking & recovery for hard real-time and safety-critical applications
    Place voters at the outputs of every clocked block (no voting scheme for combinational circuits)
  Masking ability
    Only one copy is faulty
    More than one copy is faulty but errors are at different register locations
  Duplication can detect errors but cannot mask them
  Diagnosis and recovery
    Additional circuitry added for diagnosis and recovery
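The masking cases listed on this slide can be checked with a small behavioral model. The following is a minimal Python sketch, not the VHDL from the project: it assumes a bitwise 2-of-3 majority voter over three copies of a register, and the names `vote` and `golden` as well as the 8-bit example value are illustrative assumptions.

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three copies of a register value."""
    return (a & b) | (a & c) | (b & c)


# Fault-free register contents (arbitrary example value).
golden = 0b10110010

# Case 1: only one copy is faulty (bit 3 flipped in the second copy).
one_faulty = vote(golden, golden ^ 0b00001000, golden)

# Case 2: two copies are faulty, but at different bit positions,
# so every bit position still has two correct copies.
different_bits = vote(golden ^ 0b00000001, golden ^ 0b01000000, golden)

# Case 3: two copies are faulty at the same bit position.
# The majority is now wrong for that bit and the fault is not masked.
same_bit = vote(golden ^ 0b00000001, golden ^ 0b00000001, golden)

print(one_faulty == golden)      # masked
print(different_bits == golden)  # masked
print(same_bit == golden)        # not masked
```

Because the vote is taken per bit, masking depends on each register location having at most one faulty copy, which is why faults in different locations of different copies are still tolerated.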

9

10 TMR with Roll-Forward Recovery
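The slide itself carries only a figure, so the following is a generic Python sketch of roll-forward recovery under stated assumptions rather than the circuit from the project. It assumes the clocked block is a hypothetical 8-bit counter (`step`), that a transient upset flips bits in one copy's next-state value, and that all three registers reload the voted value on each clock edge so a corrupted copy continues from the correct current state instead of re-executing earlier cycles.

```python
from typing import List, Optional, Tuple


def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three copies of a register value."""
    return (a & b) | (a & c) | (b & c)


def step(state: int) -> int:
    """Hypothetical clocked block: an 8-bit counter."""
    return (state + 1) & 0xFF


def tmr_cycle(
    regs: List[int], upset: Optional[Tuple[int, int]] = None
) -> Tuple[List[int], int]:
    """Simulate one clock cycle of three redundant copies.

    upset is an optional (copy_index, bit_mask) transient fault that
    corrupts one copy's next-state value during this cycle.
    Returns the new register contents and the voted output.
    """
    nxt = [step(r) for r in regs]
    if upset is not None:
        idx, mask = upset
        nxt[idx] ^= mask
    voted = vote(nxt[0], nxt[1], nxt[2])
    # Roll forward: every copy loads the voted value, which overwrites
    # the corrupted copy with the correct current state.
    return [voted, voted, voted], voted


regs = [0, 0, 0]
regs, out1 = tmr_cycle(regs, upset=(1, 0x10))  # copy 1 hit by an upset
regs, out2 = tmr_cycle(regs)                   # next cycle, no fault
print(out1, out2, regs)
```

The contrast is with roll-back schemes, which would return to a saved checkpoint and repeat work; here no cycle is repeated.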

11 Fault Insertion & Diagnosis
  Adding MUXes at several points to force lines to faulty values
  ModelSim verification
  Diagnosis:
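The fault-insertion idea can be illustrated with a short behavioral model. This Python sketch is an assumption-laden illustration, not the project's diagnosis circuit (which the slide shows only as a figure): `mux` models a multiplexer that forces a line to a chosen faulty value, and `diagnose` assumes diagnosis is done by comparing each copy against the voted output.

```python
from typing import List


def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three copies of a register value."""
    return (a & b) | (a & c) | (b & c)


def mux(select_fault: bool, normal: int, forced: int) -> int:
    """Fault-insertion MUX: pass the normal value or force a faulty one."""
    return forced if select_fault else normal


def diagnose(copies: List[int]) -> List[int]:
    """Return the indices of copies that disagree with the voted output."""
    voted = vote(copies[0], copies[1], copies[2])
    return [i for i, value in enumerate(copies) if value != voted]


correct = 0x3C
# Force the second copy's output lines to all zeros; leave the others alone.
copies = [
    mux(False, correct, 0x00),
    mux(True, correct, 0x00),
    mux(False, correct, 0x00),
]
print(diagnose(copies))  # -> [1]
```

This comparison only points at the right copy while the voted value itself is correct; if two copies carry the same fault, the voter output is wrong and the healthy copy would be flagged instead.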

12 Xilinx RTL Schematic - Top level

13 Controller - RTL schematic

14 Reconfiguration
  Dynamic Reconfiguration:
    Needs interface to load different configurations online to the chip
    Canceled because of complexity
  Off-line reconfiguration:
    Xilinx Area Constraints Editor
    Edit *.ucf file
    AREA_GROUP: includes selected instances of circuit
      INST "instance" AREA_GROUP="GROUP1";
      ...
      AREA_GROUP "GROUP1" RANGE=SLICE_X0Y0:SLICE_X7Y35;
      AREA_GROUP "GROUP1" GROUP=CLOSED;
      AREA_GROUP "GROUP1" PLACE=OPEN;
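Editing the *.ucf file by hand for each faulty block is repetitive, so the constraint text could be generated. The following Python sketch is hypothetical tooling, not part of the Xilinx flow or of the project: it assumes faults are marked off at slice-column granularity, picks the widest contiguous run of fault-free columns, and emits the same four constraint forms shown on the slide. The function names, the instance name `tmr_copy_a`, and the 8-column, Y0 to Y35 region are illustrative assumptions.

```python
from typing import Iterable, List, Tuple


def widest_healthy_columns(num_columns: int, faulty: Iterable[int]) -> Tuple[int, int]:
    """Return (first, last) of the widest contiguous run of fault-free columns."""
    bad = set(faulty)
    best = None
    start = None
    # Iterate one past the end so a run reaching the last column is closed.
    for col in range(num_columns + 1):
        healthy = col < num_columns and col not in bad
        if healthy and start is None:
            start = col
        elif not healthy and start is not None:
            run = (start, col - 1)
            if best is None or run[1] - run[0] > best[1] - best[0]:
                best = run
            start = None
    if best is None:
        raise ValueError("no fault-free column available")
    return best


def area_group_ucf(group: str, instances: List[str],
                   x_range: Tuple[int, int], y_max: int) -> str:
    """Build UCF text that confines the instances to the given slice range."""
    first, last = x_range
    lines = [f'INST "{inst}" AREA_GROUP="{group}";' for inst in instances]
    lines.append(
        f'AREA_GROUP "{group}" RANGE=SLICE_X{first}Y0:SLICE_X{last}Y{y_max};'
    )
    lines.append(f'AREA_GROUP "{group}" GROUP=CLOSED;')
    lines.append(f'AREA_GROUP "{group}" PLACE=OPEN;')
    return "\n".join(lines)


# Example: columns X0..X7 available, column X2 marked faulty.
healthy = widest_healthy_columns(8, faulty=[2])
print(area_group_ucf("GROUP1", ["tmr_copy_a"], healthy, y_max=35))
```

The generated text would then be pasted into the *.ucf file before rerunning the implementation steps; whether a single rectangular range is the best use of the remaining area is a design choice this sketch does not address.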

15 Reconfiguration cont’d

16 Reconfiguration cont’d
  Common reconfiguration approaches:
    Tile based
    Column-based
    Hierarchical (column & row) based
  Xilinx design flow for reconfiguration:
    VHDL -> Synthesis -> Translate -> Map -> Place & Route -> Edit Floorplan / *.ucf file -> Back to Translate -> ...

17

18 Before & After Reconfiguration

19

20 Evaluation and Discussion
  TMR with roll-forward provides effective fault masking and recovery
  Column-based reconfiguration does not add significant area overhead to the TMR circuit
  Fault tolerant design requires considerable design time and effort
    Development of automated FT design flow
  Fault diagnosis is also a bottleneck for large scale circuits

21 Summary
  Fault tolerance design flow
  Redundancy methods:
    Fault Masking
    Fault Detection
    Recovery
    TMR roll-forward implementation
  Reconfiguration:
    Dynamic / Off-line
    Off-line column-based implementation

