Published by Victor Willis. Modified over 8 years ago.
1
IPR: In-Place Reconfiguration for FPGA Fault Tolerance
Zhe Feng (1), Yu Hu (1), Lei He (1) and Rupak Majumdar (2)
(1) Electrical Engineering Department, (2) Computer Science Department
University of California, Los Angeles
Presented by Zhe Feng
Address comments to lhe@ee.ucla.edu
2
Outline
Introduction and motivation
Algorithms
Experimental Results
Conclusions
3
Soft Error
Soft errors can be caused by cosmic rays or noise upsets
Future devices are more vulnerable due to scaling
  See Special Session 1E, “Resilient Computing”
Two types of soft errors in FPGAs
  Single Event Upset (SEU): modification of the content of memory bits
  Single Event Transient (SET): glitches latched by registers
4
SEU for FPGA
SEU of block memory can be detected and corrected by row-based CRC and ECC
SEU of configuration memory can be fixed by
  Periodic memory scrubbing
  Scan-based CRC and ECC
Both may be too late, as the circuit function may already have been changed
5
SER (Soft Error Rate)
SER is calculated by Monte Carlo simulation under a single-fault model.
In each run, SER is the percentage of clock cycles with observable errors at the primary outputs for a given test bench.
The overall SER is the average over all runs.
SER ∝ 1/MTTF (mean time to failure)
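The Monte Carlo procedure on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-LUT circuit, the helper names (`lut`, `circuit`, `estimate_ser`), and the run and cycle counts are hypothetical and are not taken from the presentation.

```python
import random

def lut(config, inputs):
    """Evaluate a LUT: `config` is its truth table, `inputs` the address bits (MSB first)."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return config[addr]

def circuit(configs, a, b, c, d):
    """Hypothetical toy circuit: out = L2(L0(a, b), L1(c, d))."""
    return lut(configs[2], (lut(configs[0], (a, b)), lut(configs[1], (c, d))))

def estimate_ser(golden, runs=200, cycles=256, seed=0):
    """Average, over all runs, the fraction of cycles with an observable output error."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        # Single-fault model: flip exactly one configuration bit per run.
        faulty = [list(cfg) for cfg in golden]
        i = rng.randrange(len(faulty))
        j = rng.randrange(len(faulty[i]))
        faulty[i][j] ^= 1
        errors = 0
        for _ in range(cycles):
            vec = [rng.randint(0, 1) for _ in range(4)]
            if circuit(golden, *vec) != circuit(faulty, *vec):
                errors += 1
        total += errors / cycles  # SER of this run
    return total / runs           # overall SER: average over runs

AND2 = [0, 0, 0, 1]
OR2 = [0, 1, 1, 1]
golden = [AND2, OR2, AND2]
ser = estimate_ser(golden)
print(f"estimated SER: {ser:.3f}")
```

The random test bench here stands in for whatever test bench a real flow would use; only the structure (one fault per run, error fraction per run, average across runs) follows the slide.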
6
Impact of SEU for FPGA
FPGAs have a 10x higher SER than ASICs
  Due to large configuration memory
SEU is one of the biggest challenges for FPGA-based applications
  Most FPGAs are used in systems, not prototypes
  One of the biggest applications is Internet routers
  FPGA boards are returned after two crashes
7
FPGA Resynthesis
Resynthesis
  Rewrites the circuit in the logic or physical netlist
  Reconfigures the LUTs
Flow: RTL Synthesis -> Logic Synthesis -> Technology Mapping -> Resynthesis -> Packing -> P&R
(Source: Andrew Ling, University of Toronto, DAC'05)
8
ROSE: RObust REsynthesis [ICCAD'08]
ROSE performs iterative logic transformations with explicit stochastic yield rate evaluation
Logic transformation by fault-tolerant Boolean matching
  Inputs
    Template H and Boolean function F for the logic block
    Fault rates for the inputs and the SRAM bits of the template
  Outputs
    Either that F cannot be implemented by template H
    Or the configuration of H that implements F
Fault-tolerant Boolean matching minimizes the observable faults at the output of the template
9
Need of In-place Logic Optimization
ROSE, like most existing logic optimization techniques, does not preserve the layout (topology) of a circuit design.
  Interconnect dominates in FPGAs
In-place resynthesis (IPR) leads to faster design closure.
  Minimal or no impact on the physical design
[Figure labels: IPR, ROSE]
10
Our Major Contributions
Propose an in-place resynthesis algorithm, IPR
  Maximizes the yield rate for FPGAs
  Preserves the topology of the logic network
  Reduces the runtime complexity compared to other SAT-based approaches
IPR reduces the fault rate by 48% and increases MTTF by 1.94X
  Compared to the state-of-the-art academic technology mapper Berkeley ABC
  With the same area and performance
11
Outline
Background
Algorithms
Experimental Results
Conclusions
12
IPR: In-place Reconfiguration
[Figure: LUT truth-table example, with labeled fault rates of 37.5% and 12.5%]
Maximize identical configuration bits for complementary inputs of an LUT.
Change the functions of multiple LUTs to guarantee the function of the circuit unchanged.
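One way to read "identical configuration bits for complementary inputs" is as a count of address pairs that differ in a single input pin and store the same bit: if that pin carries a wrong value, the LUT output is unchanged for such a pair. The sketch below counts those pairs. The function name `identical_pairs` and the bit-ordering convention are assumptions made for illustration, not definitions from the slides.

```python
def identical_pairs(config, k, pin):
    """Count address pairs that differ only in input `pin` and hold the same bit.

    `config` is the truth table of a k-input LUT; bit `pin` of the address
    selects the value of that input (an assumed convention).
    """
    count = 0
    for addr in range(1 << k):
        if addr & (1 << pin):
            continue  # visit each pair once, from the side where the pin is 0
        partner = addr | (1 << pin)
        if config[addr] == config[partner]:
            count += 1
    return count

XOR2 = [0, 1, 1, 0]
AND2 = [0, 0, 0, 1]
CONST1 = [1, 1, 1, 1]

print(identical_pairs(XOR2, 2, 0))    # no pair is identical: every input flip propagates
print(identical_pairs(AND2, 2, 0))    # one of the two pairs is identical
print(identical_pairs(CONST1, 2, 0))  # both pairs are identical
```

Under this reading, a reconfiguration that raises the count for a critical input pin makes the LUT mask more upstream faults, provided other LUTs are adjusted so the circuit function stays the same.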
13
IPR algorithm
Circuit Analysis
  Initial Full-chip Functional Simulation
  Initial Full-chip ODC Mask Calculation
  Node Criticality Analysis
Cone Construction
In-place LUT Reconfiguration and Boolean Matching
Localized Update
  Localized Truth Table Update
  Localized ODC Mask Update
15
ODC Mask based Node Criticality
[Figure: logic network with simulated bit vectors on its nodes and primary outputs; example ODC mask: 1010 (I. Markov, ICCAD'07)]
The ODC mask quantifies the impact of a node on the primary outputs.
The criticality of a node is defined as the percentage of ones in the ODC mask, and decides the priority of reconfiguration in IPR.
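A simulation-based ODC mask can be illustrated on a tiny network. In this sketch, a mask bit is 1 when flipping the node's value changes the primary output for that input pattern. The network (n = a AND b, out = n OR c), the chosen patterns, and the function names are hypothetical, picked so that the mask comes out as the 1010 shown on the slide.

```python
def odc_mask(patterns):
    """Return one mask bit per input pattern for node n = a & b in out = n | c."""
    mask = []
    for a, b, c in patterns:
        n = a & b
        out = n | c
        out_flipped = (n ^ 1) | c  # same pattern with the node value inverted
        mask.append(1 if out != out_flipped else 0)
    return mask

def criticality(mask):
    """Fraction of ones in the ODC mask."""
    return sum(mask) / len(mask)

patterns = [(1, 1, 0), (1, 1, 1), (0, 1, 0), (0, 0, 1)]
mask = odc_mask(patterns)
print(mask, criticality(mask))
```

Whenever c = 1 the OR gate hides the node, so those patterns contribute zeros; a node with a higher fraction of ones would be reconfigured earlier.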
17
Cone Construction
Select a subset S_N of first-order fanout LUTs of n
Construct a cone for a selected root LUT
  Root LUT is a fanout of S_N
  Include S_N but not its first-order fanins
  Cut size of the cone is limited
[Figure: example cone with nodes a, b, c, d, e, n and the root LUT]
18
In-place LUT Reconfiguration
The functions of the LUTs in the cone are changed to increase the number of identical configuration pairs
But the functions of the input/output nets and the topology of the internal nets are kept unchanged
  No change of circuit function or layout
[Figure: example cone with nodes a, b, c, d, e, n and the root LUT]
19
In-place Boolean Matching
Conjunctive Normal Form (CNF)
Truth table can be encoded as follows: [equation]
The cone can be encoded as follows: [equation]
To make a pair of configuration bits (ci, cj) in LUT L symmetric, we have: [equation]
Combining all the three, we have the CNF formulation for in-place Boolean matching (IP-BM).
IP-BM preserves both the logic function and topology of the cone.
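The constraint that IP-BM solves can be shown on a very small cone by exhaustive enumeration. This is not the SAT formulation from the slides: the sketch swaps in brute force over all configurations, and the two-LUT cone, the function names, and the scoring by total identical pairs are assumptions chosen to keep the example short.

```python
from itertools import product

def lut2(config, x, y):
    """Evaluate a 2-input LUT whose truth table is indexed by (x << 1) | y."""
    return config[(x << 1) | y]

def cone(l1, l2, a, b, c):
    """Fixed topology: inner LUT l1(a, b) drives the root LUT l2(., c)."""
    return lut2(l2, lut2(l1, a, b), c)

def identical_pairs(config):
    """Count address pairs differing in exactly one input that hold the same bit."""
    return sum(config[i] == config[i ^ m]
               for m in (1, 2) for i in range(4) if not i & m)

def in_place_match(l1_orig, l2_orig):
    """Enumerate configurations that keep the cone function; return them and the best one."""
    inputs = list(product((0, 1), repeat=3))
    target = [cone(l1_orig, l2_orig, *v) for v in inputs]
    feasible = []
    for l1 in product((0, 1), repeat=4):
        for l2 in product((0, 1), repeat=4):
            if [cone(l1, l2, *v) for v in inputs] == target:
                feasible.append((l1, l2))
    best = max(feasible,
               key=lambda p: identical_pairs(p[0]) + identical_pairs(p[1]))
    return feasible, best

AND2 = (0, 0, 0, 1)
OR2 = (0, 1, 1, 1)
feasible, best = in_place_match(AND2, OR2)
print(len(feasible), best)
```

For this cone only two configurations preserve the function (AND feeding OR, or NAND feeding a root with its first input inverted), and they tie on the score. Enumeration grows exponentially with the number of configuration bits, which is why a SAT solver is the practical choice for realistic cones.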
20
Outline
Background
Algorithms
Experimental Results
Conclusions
21
Experimental Settings and CAD Flows
Implemented in C++, using miniSAT2.0 as the SAT solver
Results collected on an Ubuntu workstation with a 2.6GHz Xeon CPU and 2GB memory
QUIP benchmarks are tested
  Mapped with 4-LUTs by Berkeley ABC
Perform and compare the following synthesis flows: ABC, IPR, ROSE+IPR
22
Experimental Settings and CAD Flows (Cont'd)
Fault model
  During IPR: uniform soft error rate for all configuration bits in LUTs; interconnect configuration bits are ignored.
  During validation: uniform soft error rate for all configuration bits in LUTs and interconnect.
The fault rate of the chip is calculated by Monte Carlo simulation
  Single fault injection for all configuration bits in LUTs and interconnect
  32k random inputs
23
Full-chip Fault Rate by Monte Carlo Simulation
59% fault rate reduction!
ABC vs. IPR vs. ROSE+IPR: 1 : 0.52 : 0.51
24
Area (LUT#)
ABC vs. IPR vs. ROSE+IPR: 1 : 1 : 0.81
25
Estimation of Mean Time To Failure
The best flow in terms of robustness and area is ROSE+IPR
50x faster!
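As a rough cross-check, the normalized fault rates reported earlier (ABC : IPR : ROSE+IPR = 1 : 0.52 : 0.51) can be turned into relative MTTF figures, assuming MTTF is simply inversely proportional to the fault rate. The dictionary names below are illustrative only.

```python
# Normalized full-chip fault rates from the Monte Carlo results (ABC = 1).
fault_rate = {"ABC": 1.00, "IPR": 0.52, "ROSE+IPR": 0.51}

# Assumption: MTTF scales as 1 / fault rate, so the gain over ABC is a ratio.
mttf_gain = {flow: fault_rate["ABC"] / rate for flow, rate in fault_rate.items()}

for flow, gain in mttf_gain.items():
    print(f"{flow}: {gain:.2f}x MTTF relative to ABC")
```

Under that assumption the gains come out near 1.92x and 1.96x, in the same range as the roughly 1.94X and 2X figures quoted in the slides; the exact reported number may be computed differently, for example per benchmark.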
26
Conclusions
We develop an in-place resynthesis algorithm, IPR.
  Increases MTTF by 2X over ABC;
  Preserves the topology of the logic network for faster design closure;
  Complementary to existing fault-tolerant resynthesis algorithms.
In the future, we will consider
  Experiments that assume multiple uncorrelated faults, or faults with given correlations;
  Extending IPR with a criticality measure that considers interconnects explicitly.
27
Thank You! IPR: In-Place Reconfiguration for FPGA Fault Tolerance Zhe Feng, Yu Hu, Lei He and Rupak Majumdar
28
Backup Slides
29
Criticality for Configuration Bit
Depends on two criteria:
  One is a sequence of input vectors for the LUT.
  The other is the ODC mask of the LUT.
The criticality of a configuration bit c: [equation]
32
Localized Update
Localized update of the ODC mask reduces runtime
  Reconfigured cone C_R: the ODC mask is updated.
  Maximum fanin cone C_MFI: affected, but the ODC mask is not updated, to save time.
  Maximum fanout cone C_MFO: not affected, so the ODC mask does not need to be updated.
33
Key to stochastic synthesis: Logic Masking
Defects are created equally but not propagated equally
Logic don’t-cares may mask the propagation of defects
[Figure: a defect that does not affect the output; observability don’t-cares with a=1 & b=1]
We can maximize don’t-cares while keeping the logic function.
34
IPR Enhancement
Iterative (i.e., random) algorithm without the greedy procedure based on criticality
  Provides different orderings for the optimization of gates
  Without periodic yield rate evaluation
  With periodic yield rate evaluation
Large cut size
  Increases the opportunity to find a feasible cone
35
IPR Enhancement (Cont'd)
Extend to MIMO
  MISO -> MIMO
  Increases the opportunity to try more LUTs