MAPLD 2005/202 Pratt1 Improving FPGA Design Robustness with Partial TMR Brian Pratt 1,2 Michael Caffrey, Paul Graham 2 Eric Johnson, Keith Morgan, Michael.


1 MAPLD 2005/202 Pratt1
Improving FPGA Design Robustness with Partial TMR
Brian Pratt (1,2); Michael Caffrey, Paul Graham (2); Eric Johnson, Keith Morgan, Michael Wirthlin (1)
(1) Brigham Young University, Department of Electrical Engineering
(2) Los Alamos National Laboratory

2 Motivation for Partial TMR
Factors of fault-tolerant computing:
– Availability
– Reliability
– Mitigation Cost
Full TMR
– Expensive in terms of power, speed, area, etc.
– Worthwhile if affordable!
[Figure: MTBF versus area cost, with a reliability constraint and an area constraint marked]

3 Motivation for Partial TMR
Partial TMR offers:
– Mitigation of most sensitive design structures
– Increased availability of a system by decreasing number of system resets
– Decreased mitigation cost over full TMR
Suitability of Partial TMR is application dependent
– Reduced reliability compared to full TMR

4 Scrubbing
Must be included with Partial Mitigation
Continuously ‘read’ and ‘clean’ configuration memory
A single bit will be upset no longer than t_s, where t_s = time for one scrub
[Figure: configuration bitstream being scrubbed]
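The read-and-clean loop can be illustrated with a small software model. This is only a sketch: the frame values, the `golden` reference copy, and the `scrub` function are hypothetical stand-ins, and a real scrubber would work through the device's configuration readback and partial reconfiguration interface rather than a Python list.

```python
def scrub(config, golden):
    """One scrub pass: compare each configuration frame with its golden
    copy and rewrite any frame that differs. Returns repaired indices."""
    repaired = []
    for i, (frame, good) in enumerate(zip(config, golden)):
        if frame != good:
            config[i] = good          # 'clean' the upset frame
            repaired.append(i)
    return repaired

# Hypothetical 8-bit frames standing in for configuration memory.
golden = [0b10010110, 0b10101000, 0b11000110, 0b01011010]
config = list(golden)

config[2] ^= 1 << 3                   # inject a single-bit upset (SEU)
repaired = scrub(config, golden)      # upset lasts at most one scrub period
print(repaired, config == golden)
```

Because every frame is visited once per pass, an upset bit survives for at most one pass, which is the t_s bound stated on the slide.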

5 Non-Persistent Errors
An SEU in the non-persistent cross-section will cause a temporary interruption of service
Requires partial reconfiguration to correct
[Figure: error magnitude versus time (cycles); the output returns to correct after scrubbing repairs the configuration. Error = delta between outputs of a golden and a DUT circuit]

6 Persistent Errors
An SEU in the persistent cross-section will cause a permanent interruption of service
Requires full system reset to correct
[Figure: error magnitude versus time (cycles); the output remains incorrect after scrubbing repairs the configuration. Error = delta between outputs of a golden and a DUT circuit]
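The distinction between the two error types can be sketched as a check on the error trace, using the slides' definition of error as the delta between golden and DUT outputs. The function name, the `settle` parameter, and the sample traces are assumptions made for illustration; they do not come from the presentation.

```python
def classify_upset(golden_out, dut_out, scrub_cycle, settle=2):
    """Classify an upset from per-cycle outputs.

    The error at each cycle is |golden - DUT|. If the error is still
    non-zero more than `settle` cycles after the scrub repaired the
    configuration, the upset is treated as persistent.
    """
    errors = [abs(g - d) for g, d in zip(golden_out, dut_out)]
    if not any(errors):
        return "no error"
    tail = errors[scrub_cycle + settle:]
    if not tail:
        raise ValueError("trace must extend past the scrub and settle window")
    return "persistent" if any(tail) else "non-persistent"

golden_out = [5] * 12
# Output is wrong for a few cycles, then recovers after the scrub at cycle 4.
temporary = [5, 5, 9, 9, 9, 5, 5, 5, 5, 5, 5, 5]
# Output stays wrong after the scrub: corrupted state keeps circulating.
permanent = [5, 5, 9, 9, 9, 8, 8, 8, 8, 8, 8, 8]

print(classify_upset(golden_out, temporary, scrub_cycle=4))
print(classify_upset(golden_out, permanent, scrub_cycle=4))
```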

7 Non-Persistent Circuit Structures
Generally consist of circuit components and routing in a feed-forward path
[Figure: block diagram with Logic, FF, and Logic blocks]

8 Persistent Circuit Structures
Generally consist of circuit components and routing in, or contributing to, a feed-back path
[Figure: block diagram with Logic, FF, and Logic blocks]

9 Partial Mitigation
Apply a mitigation technique to just the persistent cross section
[Figure: block diagram with Logic, FF, and Logic blocks, with TMR applied to part of the circuit]
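TMR masks a fault in one of three redundant copies by majority voting. A minimal bitwise voter is sketched below; the function name and the 4-bit values are illustrative only.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority: each output bit takes the value held
    by at least two of the three redundant copies."""
    return (a & b) | (a & c) | (b & c)

good = 0b1010
upset = 0b0110            # one copy corrupted in two bit positions
voted = majority_vote(good, good, upset)
print(bin(voted))
```

A single corrupted copy is outvoted; the voter gives no protection once two copies disagree with the correct value in the same bit, which is one reason the slides pair TMR with scrubbing.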

10 Limitations of Partial Mitigation
Does not prevent all errors
– System must be corrected with configuration bitstream scrubbing
– Circuit configuration can be incorrect between scrubs
Non-persistent errors remain

11 Automated Partial TMR
Analyze an EDIF source file for feedback structures
– Protect these sections with TMR to reduce persistent cross section

12 BLTmr Partial TMR Tool
BYU-LANL Triple Modular Redundancy: Configurable Reliability
– Limit mitigation to minimize design resource requirements and power consumption
– Mitigation focused on persistent circuit structures

13 BLTmr Partial TMR Tool
Design divided into three sections:
– Feedback, Input to FB, Output
[Figure: block diagram with Logic, FF, and Logic blocks]
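One way to picture the three-way split is as a reachability analysis on a directed graph of cells. The sketch below is an assumption about how such a classification could be done, not a description of BLTmr's internals; the cell names and the `classify` function are hypothetical.

```python
def reachable(graph, start):
    """Cells reachable from `start` by following one or more edges."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def classify(graph):
    """Split cells into feedback, input-to-feedback, and output sections."""
    reach = {cell: reachable(graph, cell) for cell in graph}
    # A cell lies on a feedback path if it can reach itself.
    feedback = {c for c in graph if c in reach[c]}
    # A cell feeds the feedback section if it can reach a feedback cell.
    input_to_fb = {c for c in graph
                   if c not in feedback and reach[c] & feedback}
    output = set(graph) - feedback - input_to_fb
    return feedback, input_to_fb, output

# Hypothetical accumulator: in_logic -> adder -> acc_ff -> (adder, out_logic)
graph = {
    "in_logic": ["adder"],
    "adder": ["acc_ff"],
    "acc_ff": ["adder", "out_logic"],
    "out_logic": ["out_ff"],
    "out_ff": [],
}
feedback, input_to_fb, output = classify(graph)
print(sorted(feedback), sorted(input_to_fb), sorted(output))
```

The per-cell search is quadratic in the worst case; a strongly connected components pass would find the feedback cells in linear time on a large netlist.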

17 BLTmr Tool Options
BLTmr Tool applies TMR mitigation to subsections of the design:
– Feedback Only
– Feedback + Input to Feedback
– FB + Input to FB + Output (Full TMR)
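The three options nest, each adding one section to the previous one. A sketch of that selection follows; the option names and the `sections` dictionary are hypothetical and do not reflect the tool's actual command-line interface.

```python
# Hypothetical option names mapped to the design sections each one mitigates.
OPTIONS = {
    "feedback_only": ("feedback",),
    "feedback_and_input": ("feedback", "input_to_fb"),
    "full_tmr": ("feedback", "input_to_fb", "output"),
}

def cells_to_triplicate(sections, option):
    """Union of the cells in every section covered by the chosen option."""
    selected = set()
    for name in OPTIONS[option]:
        selected |= sections[name]
    return selected

sections = {
    "feedback": {"adder", "acc_ff"},
    "input_to_fb": {"in_logic"},
    "output": {"out_logic", "out_ff"},
}
for option in OPTIONS:
    print(option, sorted(cells_to_triplicate(sections, option)))
```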

21 BLTmr Tool Flow
BYU EDIF development environment reads in user design
Design organized into graph structure for analysis
[Figure: tool flow diagram with stages Original Design, Parse EDIF, Create Design Database, User Constraints, Analysis (Feedback, Input to FB, etc.), Cell Triplication, Voter Insertion, Partially Mitigated Design]

22 BLTmr Tool Flow
User may direct mitigation
Design analyzed to classify components as described
[Figure: BLTmr tool flow diagram]

23 BLTmr Tool Flow
Circuit elements triplicated
Voters inserted
Mitigated design written in EDIF format
[Figure: BLTmr tool flow diagram]
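The triplication and voter-insertion stages can be sketched on a toy netlist. The slides do not describe where BLTmr places voters, so this example makes a simplifying assumption: a voter is inserted only where a triplicated cell drives an unmitigated one. The netlist format and the `_tmr`/`_voter` naming are likewise hypothetical.

```python
def triplicate(netlist, selected):
    """Return a new netlist (cell -> list of sinks) with `selected` cells
    triplicated and a voter at each mitigated-to-unmitigated boundary."""
    def copies(cell):
        if cell in selected:
            return [f"{cell}_tmr{i}" for i in range(3)]
        return [cell]

    out = {name: [] for cell in netlist for name in copies(cell)}
    for src, sinks in netlist.items():
        for dst in sinks:
            if src in selected and dst not in selected:
                voter = f"{src}_voter"
                if voter not in out:          # one voter per driving cell
                    out[voter] = []
                    for name in copies(src):
                        out[name].append(voter)
                out[voter].append(dst)
            elif src in selected:             # both ends triplicated
                for s, d in zip(copies(src), copies(dst)):
                    out[s].append(d)
            else:                             # fan out to every copy of dst
                out[src].extend(copies(dst))
    return out

netlist = {
    "in_logic": ["adder"],
    "adder": ["acc_ff"],
    "acc_ff": ["adder", "out_logic"],
    "out_logic": ["out_ff"],
    "out_ff": [],
}
mitigated = triplicate(netlist, selected={"adder", "acc_ff"})
for cell, sinks in mitigated.items():
    print(cell, "->", sinks)
```

A production flow would typically also place voters inside the feedback loop so that the three copies resynchronize after a scrub; that is omitted here to keep the sketch short.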

24 Example Circuits
Tests on two designs
1. DSP Kernel
2. Synthetic Design
– LFSR modules feeding into an add-multiply tree

25 Unmitigated Fault Analysis

| Design | FPGA Editor Layout | Sensitivity Map | Persistence Map |
|---|---|---|---|
| DSP Kernel | 5,746 slices (46%) | 575,448 bits (9.9%) | 13,841 bits (0.23%) |
| Synthetic Design | 2,538 slices (20%) | 189,835 bits (3.3%) | 77,159 bits (1.3%) |

26 Experimental Results – Design #1, DSP Kernel

| Configuration | FPGA Editor Layout | Sensitivity Map | Persistence Map |
|---|---|---|---|
| Unmitigated | 5,746 slices (46%) | 575,448 (9.90%) | 13,841 (0.24%) |
| Partial TMR applied to Feedback & Input to FB | 8,036 slices (65%) | 569,700 (9.81%) | 152 (0.0026%) |

27 Experimental Results – Design #2, Synthetic (LFSR/Mult)

| Configuration | FPGA Editor Layout | Sensitivity Map | Persistence Map |
|---|---|---|---|
| Unmitigated | 2,538 slices (20%) | 189,835 (3.27%) | 77,159 (1.33%) |
| Full TMR Applied | 11,961 slices (97%) | 20,256 (0.35%) | 671 (0.012%) |

28 Experimental Results
* Full TMR could not be applied to DSP Kernel due to FPGA resource constraints
Reference: “Qpro Virtex 2.5V radiation hardened FPGAs”, Xilinx Inc., DS028 (v1.2), Nov. 5, 2001.

29 Experimental Results
GPS orbit (22,200 km altitude, 55° inclination)
AP-8 Solar Minimum, JPL Solar Proton Quiet, CREME96 Solar Minimum

30 Summary of Results

| Design | Size Increase | Sensitivity Decrease | Persistence Decrease | Average MTBF Increase ‡‡ |
|---|---|---|---|---|
| DSP Kernel * | 40% | 3% | 99% | 90x |
| Synthetic Design ‡ | 370% | 89% | 99% | 114x |

* Unmitigated to Partial TMR of Feedback + Input to FB
‡ Unmitigated to Full TMR
‡‡ GPS orbit; AP-8 Solar Minimum, JPL Solar Proton Quiet, CREME96 Solar Minimum
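The size and persistence percentages in the summary can be recomputed from the slice and bit counts on slides 26 and 27. The sketch below does only that arithmetic; the helper names are illustrative. One caveat: the DSP Kernel sensitivity counts on slide 26 work out to a decrease of about 1%, not the 3% listed in the summary, so the summary figure may rest on data not shown in this transcript. The MTBF column depends on orbit and environment models and is not reproduced here.

```python
def pct_increase(before, after):
    return 100.0 * (after - before) / before

def pct_decrease(before, after):
    return 100.0 * (before - after) / before

# (unmitigated, mitigated) counts taken from slides 26 and 27.
designs = {
    "DSP Kernel": {"slices": (5746, 8036),
                   "sensitive": (575448, 569700),
                   "persistent": (13841, 152)},
    "Synthetic Design": {"slices": (2538, 11961),
                         "sensitive": (189835, 20256),
                         "persistent": (77159, 671)},
}

for name, d in designs.items():
    print(f"{name}: size +{pct_increase(*d['slices']):.1f}%, "
          f"sensitivity -{pct_decrease(*d['sensitive']):.1f}%, "
          f"persistence -{pct_decrease(*d['persistent']):.1f}%")
```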

31 Conclusions
Pros: Partial TMR (BLTmr) as fault mitigation offers:
– Increased system availability due to fewer system resets
– More “affordable” fault mitigation than full TMR
– Critical design areas are mitigated with an automated tool
Cons:
– Much of the design may be unmitigated, leaving sensitive sections
– May result in temporary errors

