
1 Maintaining Data Integrity in Programmable Logic in Atmospheric Environments through Error Detection Joel Seely Technical Marketing Manager Military & Aerospace Business Unit

2 Single Event Upset (SEU) Overview for SRAM-Based FPGAs

3 Definitions
- SEU: Single Event Upset
  - Unwanted Change in State of a Latch or a Memory Cell
- SER: Soft Error Rate
  - SEU Rate
- SEFI: Single Event Functional Interrupt
  - Functional Failure by SEU
  - Not All SEUs Are SEFIs
  - Generally Takes 5-10 SEUs to Cause SEFI
Copyright © 2004 Altera Corporation

4 Circuit Components of SRAM-Based FPGAs
- I/O Registers & I/O Configuration
  - No Issue, Very Robust Registers, < 1 FIT
- Logic Registers (LEs)
  - No Issues, Very Robust Registers, < Hard Error Rate
- User Memory
  - Typically On-Chip Memories Are “By 9” for Parity Checking
  - IP Available for ECC
- Configuration RAM (CRAM) for LUTs & Routing
  - Area of Focus

5 Upset of a CRAM Cell
[Figure: 6-transistor cell schematic (Data In, Add, Vcc, Vss, Clear, Data Out) with a voltage-versus-time waveform; plot of noise current for 10 fC collected charge, Current (µA) versus Time (ps), both axes running 0 to 200]

6 SEU Induced Failure Rate*

Device   LE Count   SEU Rate (FIT)   SEFI Rate (FIT)   MTBF** (Years)
EP1C6    6K         250              60                1,900
EP1C20   20K        730              180               634
EP1S25   26K        1950             400               285
EP1S80   79K        6000             1200              95

* Data at Sea Level
** MTBF: Mean Time Between Functional Interrupt
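
Note on the MTBF column of slide 6: the figures are consistent with converting the SEFI rate from FIT (failures per 10^9 device-hours) into years. The sketch below shows that arithmetic; the 8,760-hour year and the function name `mtbf_years` are assumptions made for illustration, not something stated in the slides.

```python
HOURS_PER_YEAR = 24 * 365   # assumption: 8,760-hour year, leap days ignored
FIT_HOURS = 1e9             # 1 FIT = one failure per 10^9 device-hours


def mtbf_years(fit_rate: float) -> float:
    """Mean time between events, in years, for a rate expressed in FIT."""
    if fit_rate <= 0:
        raise ValueError("FIT rate must be positive")
    return FIT_HOURS / fit_rate / HOURS_PER_YEAR


# SEFI rates (FIT) as listed on slide 6
sefi_fit = {"EP1C6": 60, "EP1C20": 180, "EP1S25": 400, "EP1S80": 1200}

for device, fit in sefi_fit.items():
    print(f"{device}: {mtbf_years(fit):,.0f} years")
```

Under these assumptions the results match the slide's MTBF values to within rounding (the EP1C6 entry appears to be rounded to 1,900).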

7 Number of CRAM Bit Upsets for Each Occurrence of Functional Upset
[Figure: two distributions of CRAM bit upsets per functional upset, one with Median ~6 and one with Median 5]

8 Addressing System-Level Issues

9 SER Improvements/Mitigation
- Chip Design Enhancements
  - New Materials & Process Enhancements
  - Larger CRAM Structure
  - Increase in Capacitance on Critical Node
  - Smaller Process => Smaller Die => Lower SEU Probability
  - Built-In Error Detection/Correction Circuitry

10 SER Per SRAM Bit Trend
[Figure: SER per SRAM MBit (scale marks at 100 FITs and 1,000 FITs) versus process technology and year, from 0.5 µm (1995) to 0.13 µm (2002), with a 90 nm projection]

11 System-Level Improvements/Mitigation
- ECC for User Memory
- Use Detection/Correction Feature
- Triple Module Redundancy (TMR)
- To Achieve Lower Error Rate & Less Downtime, Migrate to Structured ASIC

12 Soft Error Detection Methods
- Configuration RAM Readout
  - Read-Out Full Bitstream
  - Compare with Stored Bitstream
  - Can Determine Where in Configuration Error Occurred
- Caveat: Security Issues with Reading Out Bitstream
[Figure: a microprocessor or CPLD compares the CRAM data read from the FPGA with the stored CRAM data: same or different?]
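
Note on slide 12: the readout-and-compare step can be sketched in software as a bitwise comparison of the read-back image against the stored one. This is only an illustration of the idea; `find_upset_bits`, the sample bytes, and the LSB-first bit numbering are all assumptions, and a real device would supply its own readback mechanism and frame layout.

```python
def find_upset_bits(golden: bytes, readback: bytes) -> list[int]:
    """Return the bit offsets where the readback differs from the stored image.

    Bit offsets are counted LSB-first within each byte (an assumption made
    for this sketch, not a device convention).
    """
    if len(golden) != len(readback):
        raise ValueError("images must be the same length")
    upsets = []
    for byte_index, (g, r) in enumerate(zip(golden, readback)):
        diff = g ^ r  # set bits mark mismatches within this byte
        bit = 0
        while diff:
            if diff & 1:
                upsets.append(byte_index * 8 + bit)
            diff >>= 1
            bit += 1
    return upsets


# Hypothetical 4-byte "configuration image" used only for the demonstration
golden = bytes([0x3C, 0xA5, 0x00, 0xFF])
readback = bytearray(golden)
readback[1] ^= 0x10  # inject a single-bit upset: byte 1, bit 4

print(find_upset_bits(golden, bytes(readback)))  # [12]
```

Because the full image is compared, the location of each upset is recovered, which is the advantage the slide attributes to this method.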

13 Soft Error Detection Methods
- On-Chip SEU Detection
  - Dedicated Comparison Circuitry, e.g. CRC Engine Comparing Stored CRC with That Calculated from Configuration RAM
  - Detection Circuitry Running Continuously
  - Error Detection Rate Variable Based on Implementation of Hardware, Number of CRAM Bits & Input Clock Frequency
  - Error Signal Available Internally or Externally
- Caveat: Cannot Determine Where in Configuration Error Occurred
[Figure: inside the FPGA, the computed value is compared (=) with the stored value and the result is passed to the core]
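
Note on slide 13: a software analogue of the stored-versus-computed CRC comparison is sketched below. `zlib.crc32` is used purely as a stand-in 32-bit CRC; the slides do not give the polynomial or framing used by the on-chip engine, and the names `crc_of` and `crc_error` are hypothetical.

```python
import zlib


def crc_of(config: bytes) -> int:
    """32-bit CRC of a configuration image (stand-in polynomial, see note)."""
    return zlib.crc32(config) & 0xFFFFFFFF


def crc_error(config: bytes, stored_crc: int) -> bool:
    """True when the recomputed CRC no longer matches the stored value."""
    return crc_of(config) != stored_crc


# Hypothetical configuration contents; the CRC is stored once at "configuration time"
config = bytearray(b"\x3c\xa5\x00\xff" * 64)
stored_crc = crc_of(bytes(config))

print(crc_error(bytes(config), stored_crc))  # no upset yet

config[100] ^= 0x01                          # inject a single-bit upset
print(crc_error(bytes(config), stored_crc))  # mismatch is flagged
```

The check returns only a pass/fail flag, which mirrors the slide's caveat: the CRC comparison says that an error occurred, not where.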

14 On-Chip Detection Example
- Dedicated CRC Circuit
  - Configuration RAM Verification Capability
    - 32-Bit Cyclic Redundancy Code Check
    - Verified Against Internally Stored Value
    - Runs in the Background Without Impacting Device Performance
  - Close to Real-Time Detection
    - Variable Clock Frequency
    - Depends on Number of CRAM Bits
  - Multi-Event Detection
    - Up to 3-Bit for 32-Bit CRC
  - Result Output to Either Core or Pin
    - Use with Either Internal or External Hardware for Error Correction

15 Correction Methods
- FPGA Detection, System-Level Correction
  - Lower Total Cost
  - Downtime Is Limited & Manageable
  - Used in Non-Critical Applications
- Triple Module Redundancy
  - Two Flavors
    - All On-Chip in FPGA
    - Separate Chips & Voter
  - Correction Can Be Real-Time
  - Used in Critical Applications

16 Single System Detection & Correction
- Step One: Detect the Soft Error
  - 75% of Reported Errors Are “Don’t Care” Errors
- Step Two: Alert the System
- Step Three: Fix the Error
  - In Some Cases, Re-Program the FPGA
  - In Some Cases, Reboot the Sub-System
  - In Some Cases, Reboot the System
- Need to Focus on System “Downtime”
  - Each System Has Unique Requirements
  - Re-Programming FPGA Takes < 250 ms
  - Rebooting Time Varies & Can Be Fast “by Design”

17 TMR Method 1
- Identical Hardware in FPGAs
- Use Voter Implemented in FPGA or CPLD
- Utilize Either Hardware Output or CRC Error Pin
- Voter Also Used to Signal Reconfiguration on Difference or Error
[Figure: FPGA Hardware 1, FPGA Hardware 2 and FPGA Hardware 3 feed an FPGA or CPLD (Voting)]
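
Note on slide 17: the voter's behavior can be modeled as a bitwise two-out-of-three majority plus a disagreement flag that would trigger reconfiguration. This is a behavioral sketch only; the function name `vote` and the 4-bit sample values are assumptions, and an actual voter would be implemented in FPGA or CPLD logic rather than software.

```python
def vote(a: int, b: int, c: int) -> tuple[int, bool]:
    """Bitwise 2-of-3 majority vote over three redundant outputs.

    Returns (majority_value, disagreement). The disagreement flag models the
    signal the voter would raise to request reconfiguration.
    """
    majority = (a & b) | (a & c) | (b & c)
    disagreement = not (a == b == c)
    return majority, disagreement


# Hypothetical 4-bit outputs from three copies of the same hardware;
# the third copy has one bit flipped by an upset.
value, reconfigure = vote(0b1011, 0b1011, 0b0011)
print(bin(value), reconfigure)  # 0b1011 True
```

The majority output stays correct while a single copy is upset, so correction is effectively real-time; the flag identifies that a copy needs to be reconfigured before a second upset accumulates.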

18 TMR Method 2
- Multiple Instantiations of Hardware in Single FPGA
- For Low-Rate SEUs
- SEU Events May Occur Much More Frequently than Functional Error (De-Rating)
- Voter Signals Reconfiguration of FPGA
- FPGA Must Be Reconfigured
[Figure: Hardware 1, Hardware 2 and Hardware 3 inside a single FPGA feed a Voting Circuit]

19 De-Rating Methodology
- Only a Fraction of Configuration Bits Are Actually Programmed
  - e.g. Using Only Two Inputs of 4-Input LUT Leaves 75% of LUT as “Don’t Care”
  - Only About 20% of Routing Is Used
  - Depends on Utilization & Application
- Some Un-Programmed Bits Still Matter
  - Flipping Could Change Function of the Device
- Extensive Experimentation Shows a Range From 1/8 to 1/3 of the Bits Matter
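
Note on slide 19: the two numbers on this slide can be checked with simple arithmetic. A 4-input LUT holds 2^4 = 16 configuration bits, and a function of two inputs needs only 2^2 = 4 of them, leaving 12/16 = 75% as "don't care". The sketch below also applies the 1/8 to 1/3 range to a raw SEU rate, assuming (this is an assumption of the sketch, not a statement from the slides) that de-rating is a simple multiplication of the raw rate by the fraction of bits that matter.

```python
from fractions import Fraction


def lut_dont_care_fraction(lut_inputs: int, used_inputs: int) -> Fraction:
    """Fraction of LUT configuration bits left unused by a smaller function."""
    if not 0 <= used_inputs <= lut_inputs:
        raise ValueError("used_inputs must be between 0 and lut_inputs")
    total_bits = 2 ** lut_inputs
    used_bits = 2 ** used_inputs
    return Fraction(total_bits - used_bits, total_bits)


def derated_fit(raw_seu_fit: int, sensitive_fraction: Fraction) -> Fraction:
    """Raw SEU rate scaled by the fraction of bits that matter (assumed model)."""
    return raw_seu_fit * sensitive_fraction


print(lut_dont_care_fraction(4, 2))  # 3/4

# Raw SEU rate for EP1S80 from slide 6, scaled by the slide's 1/8 to 1/3 range
low = derated_fit(6000, Fraction(1, 8))
high = derated_fit(6000, Fraction(1, 3))
print(low, high)  # 750 2000
```

For comparison, the SEFI rate listed for the EP1S80 on slide 6 (1200 FIT) is one fifth of its raw SEU rate, which falls inside this range.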

20 Structured ASIC: Ultimate SEU Protection
- No Configuration Memory = Estimated SER Is Below Hard Failure Rate for the Device
[Figure: FPGA compared with Structured ASIC (PLD Architecture with ASIC Routing)]

21 Summary
- SEU Is a Well Understood Phenomenon
- Many Chip Level Enhancements Mitigate SEUs
  - Process
  - Design
  - Manufacturing Techniques
- Easy Detection of SEU Events Is Key
- After Detection, Other Methods Must Be Employed to Deal with the Event
  - Critical Nature of Application Determines Level of SEU Response
- Structured ASICs from FPGA Designs Offer a Much More Robust Solution Due to Removal of All CRAM

