BL-TMR and Mitigation Approaches for FPGAs

Slides:

Advertisements

Similar presentations

IHP Im Technologiepark Frankfurt (Oder) Germany IHP Im Technologiepark Frankfurt (Oder) Germany ©

Advertisements

Survey of Detection, Diagnosis, and Fault Tolerance Methods in FPGAs

FPGA (Field Programmable Gate Array)

Sana Rezgui 1, Jeffrey George 2, Gary Swift 3, Kevin Somervill 4, Carl Carmichael 1 and Gregory Allen 3, SEU Mitigation of a Soft Embedded Processor in.

10/14/2005Caltech1 Reliable State Machines Dr. Gary R Burke California Institute of Technology Jet Propulsion Laboratory.

Copyright 2001, Agrawal & BushnellVLSI Test: Lecture 261 Lecture 26 Logic BIST Architectures n Motivation n Built-in Logic Block Observer (BILBO) n Test.

Scrubbing Approaches for Kintex-7 FPGAs

Discussion of: “Terrestrial-based Radiation Upsets: A Cautionary Tale” CprE 583 Tony Kuker 12/06/05.

Multi-Bit Upsets in the Virtex Devices Heather Quinn, Paul Graham, Jim Krone, Michael Caffrey Los Alamos National Laboratory Gary Swift, Jeff George, Fayez.

Logic Synthesis – 3 Optimization Ahmed Hemani Sources: Synopsys Documentation.

Using FPGAs in a Radiation Environment ATLAS CMS Electronics Workshop for LHC upgrades (ACES2014) March, 2014, CERN Michael Wirthlin Brigham Young.

HPEC 2012 Scrubbing Optimization via Availability Prediction (SOAP) for Reconfigurable Space Computing Quinn Martin Alan George.

Complex Upset Mitigation Applied to a Re-Configurable Embedded Processor EEL 6935 Lu Hao Wenqian Wu.

1 Fault Tolerant FPGA Co-processing Toolkit Oral defense in partial fulfillment of the requirements for the degree of Master of Science 2006 Oral defense.

ICAP CONTROLLER FOR HIGH-RELIABLE INTERNAL SCRUBBING Quinn Martin Steven Fingulin.

Power Visualization, Analysis, and Optimization Tools for FPGAs Matthew French, Li Wang and Michael Wirthlin USC, Information Sciences Institute, BYU.

Maintaining Data Integrity in Programmable Logic in Atmospheric Environments through Error Detection Joel Seely Technical Marketing Manager Military &

L189/MAPLD2004Carmichael 1 A Triple Module Redundancy Scheme for SEU Mitigation of Static Latch-Based FPGAs (“Birds-of-a-Feather”) Carl Carmichael 1, Brendan.

Micro-RDC Microelectronics Research Development Corporation A Programmable Scrubber for FPGAs ACKNOWLEDGMENT OF SUPPORT: This material is based upon work.

Fault-Tolerant Softcore Processors Part I: Fault-Tolerant Instruction Memory Nathaniel Rollins Brigham Young University.

Behavioral Design Outline –Design Specification –Behavioral Design –Behavioral Specification –Hardware Description Languages –Behavioral Simulation –Behavioral.

ECE Synthesis & Verification1 ECE 667 Spring 2011 Synthesis and Verification of Digital Systems Verification Introduction.

1 Advanced Digital Design Asynchronous Design: Research Concept by A. Steininger and M. Delvai Vienna University of Technology.

Implementation of DSP Algorithm on SoC. Mid-Semester Presentation Student : Einat Tevel Supervisor : Isaschar Walter Accompaning engineer : Emilia Burlak.

© 2011 Xilinx, Inc. All Rights Reserved This material exempt per Department of Commerce license exception TSU Xilinx Tool Flow.

Radiation Effects and Mitigation Strategies for modern FPGAs 10 th annual workshop for LHC and Future experiments Los Alamos National Laboratory, USA.

Nadpis 1 Nadpis 2 Nadpis 3 Jméno Příjmení Vysoké učení technické v Brně, Fakulta informačních technologií v Brně Božetěchova 2, Brno

Reduced Cost Reliability via Statistical Model Detection Jon-Paul Anderson- PhD Student Dr. Brent Nelson- Faculty Dr. Mike Wirthlin- Faculty Brigham Young.

© 2003 Xilinx, Inc. All Rights Reserved FPGA Design Techniques.

A comprehensive method for the evaluation of the sensitivity to SEUs of FPGA-based applications A comprehensive method for the evaluation of the sensitivity.

ASIC/FPGA design flow. FPGA Design Flow Detailed (RTL) Design Detailed (RTL) Design Ideas (Specifications) Design Ideas (Specifications) Device Programming.

Protecting the Public, Astronauts and Pilots, the NASA Workforce, and High-Value Equipment and Property Mission Success Starts With Safety Believe it or.

© 2003 Xilinx, Inc. All Rights Reserved For Academic Use Only Xilinx Design Flow FPGA Design Flow Workshop.

1 Extending Atmel FPGA Flow Nikos Andrikos TEC-EDM, ESTEC, ESA, Netherlands DAUIN, Politecnico di Torino, Italy NPI Final Presentation 25 January 2013.

SiLab presentation on Reliable Computing Combinational Logic Soft Error Analysis and Protection Ali Ahmadi May 2008.

J. Christiansen, CERN - EP/MIC

FORMAL VERIFICATION OF ADVANCED SYNTHESIS OPTIMIZATIONS Anant Kumar Jain Pradish Mathews Mike Mahar.

FT-UNSHADES Analysis of SEU effects in Digital Designs for Space Gioacchino Giovanni Lucia TEC-EDM, MPD - 8 th March Phone: +31.

MAPLD 2005/202 Pratt1 Improving FPGA Design Robustness with Partial TMR Brian Pratt 1,2 Michael Caffrey, Paul Graham 2 Eric Johnson, Keith Morgan, Michael.

©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 20 Slide 1 Critical systems development 3.

Synthesis Of Fault Tolerant Circuits For FSMs & RAMs Rajiv Garg Pradish Mathews Darren Zacher.

® Java Debug Hardware Modules Using JBits by Jonathan Ballagh Eric Keller Peter Athanas Reconfigurable Architectures Workshop 2001.

This material exempt per Department of Commerce license exception TSU Xilinx Tool Flow.

RTL and Synthesis Design Approach to Radiation-Tolerant and Fail-Safe Targeted Applications Buu Huynh & Roger Do Mentor Graphics Corp.

Introductory project. Development systems Design Entry –Foundation ISE –Third party tools Mentor Graphics: FPGA Advantage Celoxica: DK Design Suite Design.

Analytical Approach for Soft Error Rate Estimation of SRAM-Based FPGAs Ghazanfar (Hossein) Asadi and Mehdi B. Tahoori Why Soft Error Rate (SER) Estimation?

1 - CPRE 583 (Reconfigurable Computing): VHDL to FPGA: A Tool Flow Overview Iowa State University (Ames) CPRE 583 Reconfigurable Computing Lecture 5: 9/7/2011.

2011/IX/27SEU protection insertion in Verilog for the ABCN project 1 Filipe Sousa Francis Anghinolfi.

French 207 MAPLD 2005 Slide 1 Integrated Tool Suite for Post Synthesis FPGA Power Consumption Analysis Matthew French, Li Wang University of Southern California,

LaRC MAPLD 2005 / A208 Ng 1 Radiation Tolerant Intelligent Memory Stack (RTIMS) Tak-kwong Ng, Jeffrey Herath Electronics Systems Branch Systems Engineering.

Evaluating Logic Resources Utilization in an FPGA-Based TMR CPU

Paper by F.L. Kastensmidt, G. Neuberger, L. Carro, R. Reis Talk by Nick Boyd 1.

ASIC/FPGA design flow. Design Flow Detailed Design Detailed Design Ideas Design Ideas Device Programming Device Programming Timing Simulation Timing Simulation.

Gill 1 MAPLD 2005/234 Analysis and Reduction Soft Delay Errors in CMOS Circuits Balkaran Gill, Chris Papachristou, and Francis Wolff Department of Electrical.

Xilinx V4 Single Event Effects (SEE) High-Speed Testing Melanie D. Berg/MEI – Principal Investigator Hak Kim, Mark Friendlich/MEI.

Chandrasekhar 1 MAPLD 2005/204 Reduced Triple Modular Redundancy for Tolerating SEUs in SRAM based FPGAs Vikram Chandrasekhar, Sk. Noor Mahammad, V. Muralidharan.

MAPLD 2005/213Kakarla & Katkoori Partial Evaluation Based Redundancy for SEU Mitigation in Combinational Circuits MAPLD 2005 Sujana Kakarla Srinivas Katkoori.

Introduction to the FPGA and Labs

SEU Mitigation Techniques for Virtex FPGAs in Space Applications

ENG3050 Embedded Reconfigurable Computing Systems

Radiation Tolerance of an Used in a Large Tracking Detector

Maintaining Data Integrity in Programmable Logic in Atmospheric Environments through Error Detection Joel Seely Technical Marketing Manager Military &

ECE 553: TESTING AND TESTABLE DESIGN OF DIGITAL SYSTES

MAPLD 2005 BOF-L Mitigation Methods for

M. Aguirre1, J. N. Tombs1, F. Muñoz1, V. Baena1, A. Torralba1, A

Evaluation of Power Costs in Triplicated FPGA Designs

Design of a ‘Single Event Effect’ Mitigation Technique for Reconfigurable Architectures SAJID BALOCH Prof. Dr. T. Arslan1,2 Dr.Adrian Stoica3.

Analytical Approach for Soft Error Rate Estimation of SRAM-Based FPGAs

Hardware Assisted Fault Tolerance Using Reconfigurable Logic

Xilinx Kintex7 SRAM-based FPGA

Presentation transcript:

BL-TMR and Mitigation Approaches for FPGAs Mike Wirthlin BYU #2: "BL-TMR and Mitigation Approaches for FPGAs" It is well known that active mitigation techniques such as TMR are needed to protect FPGA circuits when operating in harsh radiation environments. The BL-TMR tool was developed jointly by BYU and the Los Alamos National Laboratory (LANL) to automatically insert TMR structures within FPGA circuits. This tool has been used in a number of experiments both in spacecraft and in radiation testing. These results suggest that the BL-TMR tool, coupled with configuration scrubbing, significantly improves the reliability of FPGA circuits. This presentation will describe the objectives of the BL-TMR tool, the mitigation approach it applies, and reliability improving results that have been demonstrated by the tool. In addition, this presentation will summarize a number of other mitigation approaches that have been or could be applied to FPGA circuits.

1. TMR Overview

Triple Modular Redundancy (TMR) A form of N Modular Redundancy Triplicate hardware resources Majority Vote on hardware outputs Tolerates any single fault Tolerates many multiple fault combinations Mike Wirthlin, BYU

TMR Granularity System Level Device Level Module Level RTL Level process(clk_int_a) begin if clk_int_a'event and clk_int_a='1' then locked_d_a <= locked_a_int; if (all_locked_a = '0') then all_locked_a <= (locked_d_a and locked_d_b and locked_d_c); else all_locked_a <= tmr_voter( locked_d_a, locked_d_b, locked_d_c); end if; end process Module Level RTL Level Logic Level Mike Wirthlin, BYU

TMR Reliability TMR has lower reliability than non-redundant for long mission times Effective TMR almost always is coupled with “repair” Non-redundant TMR Mike Wirthlin, BYU

TMR + Repair = Very Reliable! Mike Wirthlin, BYU

FPGA Configuration “Repair” x Configuration Upset Mike Wirthlin, BYU

FPGA Configuration “Repair” x Configuration Upset Repaired Mike Wirthlin, BYU

TMR & Scrubbing Example Mike Wirthlin, BYU

Voters Before Flip Flops Mike Wirthlin, BYU

Voters After Flip-Flops Mike Wirthlin, BYU

More Frequent Voting Mike Wirthlin, BYU

TMR Synchronization Fault repair through scrubbing Fixes the cause of the error Does NOT fix the state of the circuit State of circuit must be synchronized to working circuits Mike Wirthlin, BYU

Synchronizing Voters Mike Wirthlin, BYU

Synchronizing Voters Mike Wirthlin, BYU

Clock Domain Crossing Mike Wirthlin, BYU

Partial TMR TMR may be applied selectively Failures in some circuit areas cause more harm than others Some circuit areas are protected by other SEE mitigation techniques (TMR not needed) Challenge: deciding where to apply TMR Circuits with feedback (state machines) Circuits with high “functional influence” Mike Wirthlin, BYU

Persistent vs. Non-persistent Upset Some upsets repaired through scrubbing Non-persistent upsets: repairable through scrubbing Persistent upsets: requires reconfiguration Non-Persistent Upset Persistent Upset Bitstream Repair Upset Bitstream Repair Upset = First type of functional error is non-persistent error = Example - DSP circuit output streams - correct operation -> difference is zero - incorrect operation -> difference is non-zero = This plot - non-zero for 64 time steps - Upset -> outputs diverge - @ 130 scrubbing repairs - functional errors disappear - circuit and functional data returned to normal *without* reset = We call this a non-persistent error or temporary interruption of service The first type of functional error is a non-persistent error. To illustrate this type of error, we collected output data streams from identical copies of a real DSP circuit. When the circuit is operating correctly, the difference is zero. When functional errors are present, the difference is non-zero. In this plot, the DUT and golden circuit are in sync for the first 64 time-steps. At time cycle 65, an SEU upsets a configuration bit in the DUT. Almost immediately, the outputs of the two circuits diverge due to functional errors and the difference is non-zero. At time cycle 130 scrubbing repairs the upset bit and shortly thereafter the functional errors disappear. The circuit and its functional data returned back to normal *without* the need for a system reset. Due to the upset bit, the system experienced a temporary interruption of service. = An SEU in the non-persistent cross section causes a temporary interruption of service (non-persistent error) = After scrubbing repairs configuration bitstream, output errors disappear error magnitude error magnitude Correct Output Incorrect Output time cycle time cycle

Persistent Circuit Structures Logic FF Logic FF Logic FF Logic FF Logic FF Non-Persistent Structure – Feed-forward Persistent Structures – Contribute to feedback Partial TMR – Priority given to persistent structures Non-persistent Circuit Structures Feed-forward logic Errors are flushed out of the circuit once scrubbing repairs the configuration Persistent Circuit Structures Those that contribute to feedback Feedback logic Logic that feeds into feedback logic Errors are trapped in the feedback loops (state of design) even after configuration is repaired by scrubbing A computer can be (relatively) easily programmed to find feedback in circuits Static analysis of circuit done at netlist (EDIF) level Same analysis for any circuit Some gains could be realized if customizing analysis to circuit type, but would be much more complicated. Remember, one of the benefits of TMR is its simplicity! Our Strategy for Partial Mitigation is: Focus on (give priority to) the persistent cross section of a design Still gain the desired improvements in availability/reliability (at lower cost than TMR) This only works for designs that can tolerate temporary data loss (not complete reliability like Full TMR) Mike Wirthlin, BYU

Full TMR TMR Reliable (when combined with configuration scrubbing) Simple to implement Expensive (in terms of area, power, etc.)

Partial TMR BYU and LANL have developed a tool to find the persistent components of a design and focus mitigation on them Applies Triple Modular Redundancy to these persistent sections of a design Mike Wirthlin, BYU

TMR Automation TMR is relatively easy to automate Analyze design Replicate resources Insert voters Verify resulting circuit Different Strategies for Automated TMR Netlist level HDL Level Selective/Partial Several tools available for Automatic TMR Mike Wirthlin, BYU

Automated TMR Tools BL-TMR (and other several other academic projects) Mike Wirthlin, BYU

2. BL-TMR

BL-TMR BYU-LANL TMR Tool BYU-LANL Triple Modular Redundancy Developed at BYU under the support of Los Alamos National Laboratory (Cibola Flight Experiment) Used to test TMR on many designs Fault injection, Radiation testing, in Orbit Testbed for experimenting with various TMR application techniques (used for research) Mike Wirthlin, BYU

Ongoing Development Based on the success of BL-TMR, additional funding has been provided to extend BL-TMR for additional devices, environments, and address new problems Commercial companies concerned about SER rates Cisco Systems High Energy Physics Brookhaven National Laboratory (BNL), CERN Space system developers SEAKR systems, Sandia, LANL, Lockheed Martin Interest in BL-TMR is growing Commercialization currently under consideration

BL-TMR (BYU/LANL TMR) EDIF data structure & API Available tools: Parse, represent, and manipulate EDIF Available tools: EDIF parser Half-latch removal SRL replacement Feedback cutset tool Full and partial TMR Detection circuitry insertion EDIF output Project size ~50 Java packages 350+ Java classes 478,401 lines of code Includes contributions from CHREC member LANL [brian@tiger:test] java -cp ~/jars/BLTmr.jar byucc.edif.tools.tmr.FlattenTMR ../no_tmr/synth/counters80.edf --removeHL --full_tmr --technology virtex -p xcv1000fg680 --log counters80.log BLTmr Tool version 0.2.3, 12 Oct 2006 Search for EDIF files in these directories: [.] Parsing file ../no_tmr/synth/counters80.edf Removing half-latches... Flattening Flattened circuit contains 3451 primitives, 3461 nets, and 13692 net connections Processing: ASUF 1.0 Forcing triplication of instance safeConstantCell_zero Analyzing design . . . Full TMR requested. Triplicating design . . . domainreport=BLTmr_domain_report.txt Added 1931 voters. 3431 instances out of 3451 cells triplicated (99% coverage) 6862 new instances added to design. 3431 nets triplicated (6862 new nets added). 0 ports triplicated. Tools and code available at: http://sourceforge.net/projects/byuediftools/ Mike Wirthlin, BYU

BL-TMR User Control Provides significant control to user Can be scripted for complex BL-TMR runs Usage: java byucc.edif.tools.tmr.FlattenTMR <input_file> [(-o|--output) <output_file>] [(-d|--dir) dir1,dir2,...,dirN ] [(-f|--file) file1,file2,...,fileN ] [--tmrSuffix suffix1,suffix2,...,suffixN ] [--full_tmr] [--tmr_inports] [--tmr_outports] [--no_tmr_p port1,port2,...,portN ] [--tmr_c cell_type1,cell_type2,...,cell_typeN ] [--tmr_i cell_instance1,cell_instance2,...,cell_instanceN ] [--no_tmr_c cell_type1,cell_type2,...,cell_typeN ] [--no_tmr_i cell_instance1,cell_instance2,...,cell_instanceN ] [--notmrFeedback] [--notmrInputToFeedback] [--notmrFeedBackOutput] [--notmrFeedForward] [--noInoutCheck] [--SCCSortType <{1|2|3}>] [--doSCCDecomposition] [--inputAdditionType <{1|2|3}>] [--outputAdditionType <{1|2|3}>] [--mergeFactor <mergeFactor>] [--optimizationFactor <optimizationFactor>] [--factorType <{DUF|UEF|ASUF}>] [--factorValue <factorValue>] [--low <low>] [--high <high>] [--inc <inc>] [--removeHL] [--hlConst <{0|1}>] [--hlUsePort <hlPortName>] [--technology <{virtex|virtex2}>] [(-p|--part) <part>] [--summary] [--log <logfile>] [--domainReport <domainReport>] [--writeConfig[:<config_file>]] [-h|--help] [-v|--version] For detailed usage, try `--help'

Sample Execution [brian@tiger:test] java -cp ~/jars/BLTmr.jar byucc.edif.tools.tmr.FlattenTMR ../no_tmr/synth/counters80.edf --removeHL --full_tmr --technology virtex -p xcv1000fg680 --log counters80.log BLTmr Tool version 0.2.3, 12 Oct 2006 Search for EDIF files in these directories: [.] Parsing file ../no_tmr/synth/counters80.edf Removing half-latches... Flattening Flattened circuit contains 3451 primitives, 3461 nets, and 13692 net connections Processing: ASUF 1.0 Forcing triplication of instance safeConstantCell_zero Analyzing design . . . Full TMR requested. Triplicating design . . . domainreport=BLTmr_domain_report.txt Added 1931 voters. 3431 instances out of 3451 cells triplicated (99% coverage) 6862 new instances added to design. 3431 nets triplicated (6862 new nets added). 0 ports triplicated.

% Increase in Critical Path Cost of TMR Size Increase Critical Path Before TMR After TMR % Increase in Critical Path blowfish 3.1X 28.3 ns 31.7 ns 12.0% des3 3.4X 11.1 ns 13.6 ns 22.5% qpsk 80.0 ns 83.9 ns 4.9% free6502 3.3X 29.6 ns 33.1 ns 11.8% T80 27.8 ns 33.7 ns 21.2% macfir 3.9X 14.4 ns 19.5 ns 35.4% serial_divide 4.1X 9.2 ns 12.2 ns 32.6% planet 10.9 ns 12.6 ns 15.6% s1488 9.9 ns 12.0 ns s1494 10.4 ns 17.3% s298 15.8 ns 19.1 ns 20.9% tbk 10.3 ns 12.9 ns 25.2% synthetic 4.0X 5.1% lfsrs 6.3X 9.0 ns 12.7 ns 41.1% ssra_core 3.5X 6.1 ns 7.2 ns 18.0% mean 3.6X 8.17 ns 12.08 ns 16.0% Mike Wirthlin, BYU

BL-TMR Incremental Results This graph shows the sensitivity and persistence of 200 partially-mitigated designs (from the same source design) Very fine-grain addition of redundant logic (in this case about 76 LUTs per increment, can do finer) No matter where your area constraint is, this fine-grained addition of logic gets very near the line Maximum reliability for the given area cost Notice the BL-TMR tool gives priority to the persistent sections of the design, eliminating persistence first Persistence eliminated after adding 61% redundancy (floor of ~670 bits) Sensitiity eliminated after adding 210% redundancy (floor of ~2,400 bits) Overhead Logic = (LUTs + FFs) above the amount the original design has 200 Incremental outputs of the BL-TMR tool 76 (LUTs + FFs) between design points 1262 sensitive bits between design points Mike Wirthlin, BYU

3. Design Flow

Design Flow RTL RTL Synthesis EDIF Netlist Signal List pTMR Parameters pTMR Tool pTMR Property Tags Modified Netlist Tagged EDIF Netlist Xilinx Map, Par, etc. FPGA bitfile

pTMR Steps Component Merging Design Flattening Graph Creation and Analysis IOB Analysis Clock Domain Analysis Instance Removal Feedback Analysis Illegal Crossing identification TMR Prioritization & Selection Voter Selection Netlist generation

11. Netlist Generation Circuit generated from pTMR rules Cells triplicated Voters inserted Netlist created for new circuit

3. Verifying BL-TMR

Fault Injection Configure user design onto two identical FPGAs Compare results of two designs using Comparator FPGA Insert configuration SEUs into design under test (FPGA2) and compare results If discrepancies between FPGAs are found, record configuration error FPGA 1 FPGA 2 Comparator Mike Wirthlin, BYU

SEU Insertion Example #1 Apply test vector to circuit input x FPGA 1 FPGA 2 x Insert configuration SEU into FPGA #2 FPGA1  FPGA2 x Compare circuit results Comparator Mike Wirthlin, BYU

Experimental Results – Design #2 Synthetic (LFSR/Mult) FPGA Editor Layout Sensitivity Map Persistence Map Unmitigated 3,005 slices (24%) 254,840 (4.39%) 46,368 (0.80%) top row -> Unmitigated Synthetic Design original: 24% of logic resources persistent is 5.5 times smaller than dynamic bottom row -> Synthetic design w/ TMR applied to as much of the design as possible (excluding clock, I/O) 97% utilization or roughly 370% more resources dynamic (pers & non-pers) is over 99% smaller than original Again, persistent x-section decreased by a factor of 100! (also sensitive) Caveats Inputs, Outputs and Clock were not triplicated Full TMR Applied 12,165 slices (99%) 2,395 (0.041%) 671 (0.005%) Mike Wirthlin, BYU

LANL Cibola Flight Experiment Los Alamos National Laboratory technology pathfinder validate FPGAs for high performance computing Investigate SEU behavior of Xilinx Virtex FPGAs Several BYU experiments validated in orbit TMR (including BL-TMR tool) Duplication with Compare DRAM controllers Cibola Flight Experiment 560 km, 35.4º inclination Mike Wirthlin, BYU

Sandia MISSE-8 BYU Experiments on ISS Under direction of Sandia National Laboratory BYU Experiments on ISS TMR PicoBlaze (Successful mitigation event!) Smart signal detection Reduced Precision Redundancy BRAM Scrubbing & BRAM ECC Endeavor (STS-134) May 16, 2012 Photo courtesy of NASA V4 FX60 V5QV (SIRF) Dr. Wirthlin to present Photo courtesy of Sandia National Labs Photo courtesy of NASA Mike Wirthlin, BYU

Radiation Testing Apply Ionizing Radiation to Design with TMR Verify accuracy of artificial simulator Identify upset in non-configuration state Identify other failure modes UC Davis, Crocker Nuclear Laboratory Medium-energy particle accelerator (76-inch cyclotron) 63 MeV proton source Flux: 1e7 particles/cm2/second: (~1 upset/second) 16 hour test (~25,000 upsets) Proton Beam FPGA Board Mike Wirthlin, BYU

5. TMR Summary Pros: Cons Significant improvements in reliability Easy to apply (limited design effort) Can be applied selectively Cons Requires significant hardware resources Negative impact on timing Difficult to verify Mike Wirthlin, BYU

Alternatives to TMR Exploit specific circuit structures/styles Memories, state machines, processors, etc. Arithmetic structures Detection+ Detecting a fault quickly opens up many lower cost mitigation strategies Temporal Redundancy Duplication with Compare Mike Wirthlin, BYU

Future Plans Clock domain aware TMR Timing aware TMR Improved support for clock and I/O resources Integrated Duplication with Compare (DWC) More frequent voting NMR (5-MR, 7-MR, etc.) Support for New FPGA Architectures Improved verification (formal verification) GUI support Improved partial TMR selection (Algorithmic pTMR)

Questions? Mike Wirthlin, BYU