A Triple Module Redundancy Scheme for SEU Mitigation of Static Latch-Based FPGAs (“Birds-of-a-Feather”)

Presentation transcript:

L189/MAPLD2004Carmichael 1
A Triple Module Redundancy Scheme for SEU Mitigation of Static Latch-Based FPGAs (“Birds-of-a-Feather”)
Carl Carmichael (1), Brendan Bridgford (1), Gary Swift (2), Matt Napier (3)
(1) Xilinx Corporation, San Jose CA
(2) Jet Propulsion Laboratory, Pasadena CA
(3) Sandia National Laboratories, Albuquerque NM
"This work was carried out in part by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration."
"Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology."

XTMR SEU Mitigation
Xilinx Triple Module Redundancy (XTMR)
– Single points of failure are eliminated by triplicating every logic node (gates and nets).
– XTMR confers SEU and SET immunity.
– XTMR does not protect against SEFIs!
– Any digital design can be XTMRed by:
   – “Triplication” of throughput (combinational and sequential) logic
   – “Triplication” of feedback logic and inserting majority voters
   – Adding redundant I/O (outputs with minority voters)
   – Design cleanup (removing half-latches, SRL16s, etc.)

XTMR State-Machines
[Figure: state machine “Pre-TMR” and “Post-XTMR”]
XTMR provides autonomous resynchronization of the separate redundant domains of a state machine by inserting majority voters at the origin of any registered feedback (“looped”) path.
When a configuration upset disables one domain, the other two domains continue to operate, providing a correct majority representation of state data and functionality.
When “scrubbing” fixes the configuration of the upset domain, the embedded redundant voters automatically correct the state of the upset domain without any external intervention.
As long as the scrub rate is greater than the upset rate, a single bit upset cannot disturb more than one redundant domain.
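The resynchronization behaviour described on this slide can be illustrated with a small behavioural model. This is a sketch in Python rather than HDL, and it is not output of the XTMR tool: the 4-bit counter, the `step` function, and the way an upset domain is modelled (its register forced to 0) are assumptions made purely for illustration.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant values."""
    return (a & b) | (b & c) | (a & c)


def step(states, upset_domain=None):
    """Advance a triplicated 4-bit counter by one clock.

    The voter sits on the registered feedback path: the next state of every
    domain is computed from the voted state, not from the domain's own register.
    `upset_domain` (hypothetical) marks a domain whose configuration is broken;
    its register is forced to 0 to stand in for arbitrary wrong behaviour.
    """
    voted = majority(*states)
    next_states = [(voted + 1) & 0xF] * 3
    if upset_domain is not None:
        next_states[upset_domain] = 0
    return next_states


states = [5, 5, 5]
states = step(states, upset_domain=1)  # domain 1 is upset, domains 0 and 2 carry on
states = step(states, upset_domain=1)
states = step(states)                  # configuration scrubbed: domain 1 works again
print(states)                          # all three domains agree again
```

Because the repaired domain computes its next state from the voted value, it rejoins the other two on the first clock after the scrub, with no reset or external intervention.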

XTMR Inputs
– Effective SEU mitigation requires the use of triple redundant input pins for every input signal.
– Not triplicating global input signals (clk, rst, etc.) can seriously compromise SEU resistance.
– Triplication of input data paths can be traded for EDAC.
– SEU resistance is sometimes a trade-off for resource utilization.

XTMR Outputs with Minority Voters
– Outputs can be triplicated, using three pins for each output signal.
– Minority voters monitor each of the triplicated design modules.
– If one module is different from the others, its output pin is driven to High-Z.
– Voters are triplicated.
[Figure labels: Minority Voter, TR0, TR1, TR2, P. Convergence point is outside the FPGA, at the trace.]
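The minority-voter rule above can be sketched behaviourally for a single-bit output. The function names, the use of `None` to represent High-Z, and the `trace_value` model of the external convergence point are all assumptions for illustration; the real voters are FPGA logic, not software.

```python
from typing import Optional


def minority_voter(own: int, other1: int, other2: int) -> Optional[int]:
    """Return the bit this domain drives on its pin, or None for High-Z.

    A domain is in the minority when its bit differs from both other domains.
    """
    in_minority = (own != other1) and (own != other2)
    return None if in_minority else own


def trace_value(tr0: int, tr1: int, tr2: int) -> int:
    """Resolve the board trace where the three output pins converge."""
    pins = [
        minority_voter(tr0, tr1, tr2),
        minority_voter(tr1, tr0, tr2),
        minority_voter(tr2, tr0, tr1),
    ]
    driven = {p for p in pins if p is not None}
    if len(driven) != 1:
        raise ValueError("bus contention or floating trace")
    return driven.pop()


print(trace_value(1, 0, 1))  # domain TR1 disagrees and is tri-stated
```

With single-bit signals at most one domain can be in the minority, so the two remaining pins always drive the same value onto the trace.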

Previous SEE Test Methodology for Mitigation
The assertion behind the combined mitigation method of XTMR and scrubbing is that completely removing Single Event Functional Errors in the user logic reduces any user design to an overall error rate determined by the remaining Single Event Functional Interrupts (SEFIs).
Therefore, a successful mitigation test is expected to produce zero errors other than SEFIs.
Since the effectiveness of TMR depends on errors not accumulating in the configuration, experiments attempted to maintain an upset rate that did not exceed the scrub rate.
This methodology had two significant flaws:
– Testing at such low fluxes is impractical: it requires unreasonably long run times and so cannot reach sufficient fluence for acceptable statistical significance of the data.
– A zero error rate result is not useful for making any calculations or extrapolations.
These issues raise concerns over the validity of any results.

Improved SEE Test Methodology for Mitigation
The functional error rate of a mitigated system is expected to depend physically on the upset rate.
The expected relationship is a function that predicts the increasing probability of upsetting bit combinations that will cause a mitigated (TMR) system to fail as a function of bit upset rate:

$$\mathrm{MER} = \frac{1}{2}\,\frac{N_B\,C_A}{T_S}\,R_U^{2}$$

– MER = Mitigation Error Rate
– N_B = Number of Relevant Bits
– C_A = Average Cluster Size
– T_S = Scrub Time
– R_U = Upset Rate of Relevant Bits

Therefore, testing at extremely high fluxes, varied over several orders of magnitude, can be performed to reveal this functional relationship between mitigation error rate and bit upset rate.
This function can then be extrapolated to make predictions at the much lower upset rates of earth orbits.
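The practical consequence of the squared upset-rate term can be shown numerically. The sketch below simply evaluates the MER expression from this slide; the parameter values are hypothetical placeholders, not measurements from the presentation, and the function name is an assumption.

```python
def mitigation_error_rate(n_b: float, c_a: float, t_s: float, r_u: float) -> float:
    """MER = (1/2) * (N_B * C_A / T_S) * R_U**2, as stated on the slide."""
    return 0.5 * (n_b * c_a / t_s) * r_u ** 2


# Hypothetical values, chosen only to show the scaling behaviour.
N_B, C_A, T_S = 1.0e6, 2.0, 0.5

low = mitigation_error_rate(N_B, C_A, T_S, r_u=1.0e-3)
high = mitigation_error_rate(N_B, C_A, T_S, r_u=1.0e-2)
ratio = high / low
print(f"10x higher upset rate gives {ratio:.0f}x higher mitigation error rate")
```

This quadratic dependence is what makes high-flux testing useful: mitigation failures become frequent enough to measure, and the fitted curve can then be followed down to orbital upset rates.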

Plot Definitions
– Predicted SEFI cross-section: Static and dynamic SEE characterization of the Virtex-II FPGA revealed several Single Event Functional Interrupt modes: POR (2.5E-06), SMAP (1.72E-06), IOB (4.2E-06). These combined cross-sections represent the minimum functional error cross-section for a single Virtex-II (XQR2V6000) device on orbit.
– Worst Case Orbital Upset Rate: The CREME96 calculation of the worst case orbital upset rate for an XQR2V6000 is 7,740 bit-errors/day (9E-02 bit-errors/sec) in a GEO orbit at 36,000 km during the worst day of an Anomalously Large Solar Flare, accounting for both heavy ions and protons. In a 40 MeV Kr beam the same upset rate is achieved with a flux of 1.25E-01 p/cm²/s. Equivalent upset rates for all other orbits and solar conditions therefore lie to the LEFT of this line.
– Single Event Functional Interrupts: The average cross-section of the SEFIs observed while collecting the data represented in the plot. This cross-section is not flux dependent. Variations from the predicted value are due to the statistical significance of the total fluence accumulated during each test.
– Functional Errors: Data plot of the observed events when the Device Under Test (DUT) returned an incorrect result. The cross-section is the number of error events divided by the total fluence at the specified flux. TMR denotes that the DUT design was fully mitigated with XTMR and scrubbing. The Unmitigated results were obtained with a functionally identical design without XTMR; scrubbing was also used for the unmitigated test.
– Extrapolation: A derived function describing mitigation failure as a function of upset rate. Extending the function predicts functional error cross-sections at worst case orbital upset rates to be less than SEFI cross-sections.
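The arithmetic behind these definitions is short enough to check directly. The sketch below uses only numbers quoted on this slide; the variable and function names are assumptions, and the `cross_section` helper is a plain restatement of "error events divided by total fluence".

```python
SECONDS_PER_DAY = 86_400

# Worst-case orbital upset rate from the CREME96 calculation quoted above.
worst_case_per_day = 7_740
worst_case_per_sec = worst_case_per_day / SECONDS_PER_DAY  # about 9E-02

# SEFI mode cross-sections quoted above; their sum is the predicted
# minimum functional error cross-section for the device.
sefi_modes = {"POR": 2.5e-06, "SMAP": 1.72e-06, "IOB": 4.2e-06}
predicted_sefi = sum(sefi_modes.values())  # about 8.42E-06


def cross_section(error_events: int, fluence: float) -> float:
    """Cross-section = number of error events / total fluence (particles/cm^2)."""
    return error_events / fluence


print(f"worst case: {worst_case_per_sec:.2e} bit-errors/sec")
print(f"predicted SEFI cross-section: {predicted_sefi:.2e}")
```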

PLOT 1
[Plot not reproduced. Axis: Configuration Bit Errors per Scrub Cycle, 3.5E-02 to 3.5E+03. Beam: 40 MeV Kr, LET = 22.3 MeV·cm²/mg. Annotations: “36,000 km GEO Orbit, Worst Day Solar Flare, 8,000 bit-errors/day” and “All other orbits”.]
SEFIs drive the error rate for all designs and all orbits. Mitigation errors on orbit are always less than SEFI errors by orders of magnitude.

PLOT 2
[Plot not reproduced. Axis: Configuration Bit Errors per Scrub Cycle, 3.5E-02 to 3.5E+03. Beam: 40 MeV Kr, LET = 22.3 MeV·cm²/mg. Annotations: “36,000 km GEO Orbit, Worst Day Solar Flare, 8,000 bit-errors/day” and “All other orbits”.]
SEFIs drive the error rate for all designs and all orbits. Mitigation errors on orbit are always less than SEFI errors by orders of magnitude.

PLOT 3
[Plot not reproduced. Axis: Configuration Bit Errors per Scrub Cycle, 3.5E-02 to 3.5E+03. Beam: 40 MeV Kr, LET = 22.3 MeV·cm²/mg. Annotations: “36,000 km GEO Orbit, Worst Day Solar Flare, 8,000 bit-errors/day” and “All other orbits”.]
SEFIs drive the error rate for all designs and all orbits. Mitigation errors on orbit are always less than SEFI errors by orders of magnitude.

SEE Test Analysis
The experiments were conducted over a flux range of 7E+00 to 4E+04 p/cm²/s. The flux rates have been normalized in the secondary (top) x-axis of the plots to “average bit upsets per scrub cycle” (R_S).
Each experiment demonstrated a drop in failure cross-section over several orders of magnitude, crossing the SEFI cross-section at upset rates that are still several orders of magnitude above worst case orbital upset rates.
Extrapolating this data for each experiment clearly demonstrates a mitigation error cross-section at least one order of magnitude below the SEFI cross-section at worst case orbital upset rates.
By superposition of the data fit functions, the total effective mitigated error cross-section is

$$\sigma_{\mathrm{TOTAL}} = \sigma_{\mathrm{BRAM}} + \sigma_{\mathrm{CLB}} + \sigma_{\mathrm{MULT}} + \sigma_{\mathrm{SEFI}}$$

$$\sigma_{\mathrm{TOTAL}} = 5.0\mathrm{E}{-8}\,(1.4\,R_S)^{2} + 5.0\mathrm{E}{-6}\,(0.7\,R_S)^{0.5} + [\ldots]\mathrm{E}{-6}\,(1.4\,R_S)^{0.35} + [\ldots]\mathrm{E}{-6} \quad (\mathrm{cm}^2)$$

(The leading digits of the last two coefficients are missing from the transcript.)
Therefore, at the worst case orbital upset rate of 9E-2 upsets/sec (R_S = 4.5E-2 upsets/scrub), the effective total cross-section for functional error is calculated as:

$$\sigma_{\mathrm{TOTAL}} = 1.05\mathrm{E}{-5}\ \mathrm{cm}^2/\mathrm{device} \quad \{\text{Orbital Worst Case}\}$$
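Each fit function on this slide has the same power-law shape, coefficient × (scale × R_S)^exponent, so the legible terms can be evaluated at the orbital worst case. The sketch below does this for the first two terms only, because the coefficients of the last two terms are incomplete in the transcript and the total is therefore not recomputed here. The function name is an assumption, and the scrub period is inferred from the two rates quoted on the slide rather than stated in the source.

```python
def power_law_term(coeff: float, scale: float, exponent: float, r_s: float) -> float:
    """Evaluate one fit term: coeff * (scale * r_s) ** exponent, in cm^2."""
    return coeff * (scale * r_s) ** exponent


R_S_WORST = 4.5e-2       # upsets per scrub cycle at orbital worst case (from the slide)
UPSETS_PER_SEC = 9.0e-2  # upsets per second at orbital worst case (from the slide)

# Inferred, not stated in the source: scrub period implied by the two rates above.
implied_scrub_period_s = R_S_WORST / UPSETS_PER_SEC

first_term = power_law_term(5.0e-8, 1.4, 2.0, R_S_WORST)   # square-law term
second_term = power_law_term(5.0e-6, 0.7, 0.5, R_S_WORST)  # square-root term

print(f"implied scrub period: {implied_scrub_period_s:.2f} s")
print(f"first term:  {first_term:.3e} cm^2")
print(f"second term: {second_term:.3e} cm^2")
```

At this upset rate the square-law term is several orders of magnitude smaller than the quoted total of 1.05E-5 cm²/device, so the total must come mainly from the terms whose coefficients are incomplete in the transcript, including the flux-independent SEFI term.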

Conclusions
– The efficiency and accuracy of validating mitigation techniques are greatly improved by demonstrating the upset rate dependency of the mitigation method, testing at flux rates that overwhelm the mitigation.
– The static SEFI cross-section is the dominating factor for calculating orbital error rates for any Virtex-II design when mitigated with full XTMR and scrubbing.
Future Work
– The authors recognize an anomaly in the data fit functions in that they were not all expressed as a square function. It is anticipated that this is due to the complexity of the bit clusters of the experimental designs. Additional research is called for to derive the separate coefficients of the MER equation for each design and explain their functional associations.