P173/MAPLD 2005 Swift1 Upset Susceptibility and Design Mitigation of PowerPC405 Processors Embedded in Virtex II-Pro FPGAs.


Authors
- Gary Swift, Jet Propulsion Laboratory/California Institute of Technology
- Gregory Allen, Jet Propulsion Laboratory/California Institute of Technology
- Jeffrey George, The Aerospace Corporation
- Sana Rezgui, Xilinx Corporation
- Carl Carmichael, Xilinx Corporation
- Fayez Chayab, MDRobotics

Abstract
We show recent results for the upset susceptibility of the registers and caches in the embedded PowerPC405 in the Xilinx V2P40 FPGA. For critical flight designs where configuration upsets are mitigated effectively, these upsets can dominate the system error rate. We consider several techniques for implementing various levels of redundancy to reduce system errors, including single-, dual- and triple-chip options. We conclude that the dual-chip option may often be the best choice and warrants further study.

Background - Reconfigurable FPGA Upsets
The basic building blocks of a reconfigurable FPGA are “soft” (susceptible) to upset [Ref. 1].

Background - Upset Mitigation
Critical applications require design-level upset mitigation.
- Design Triplication
  - The use of TMR (triple modular redundancy) in a design allows correct function through triplicated majority voters even when a configuration element is upset.
  - The extra design effort is now largely automated by new software (TMRtool).
- Active Configuration Scrubbing
  - Upsets in the configuration must not be allowed to accumulate, or TMR will “break”.
  - Scrubbing uses some resources, but can be implemented so that it is transparent to system operation.
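The voting idea behind TMR can be illustrated with a minimal software sketch. This is an illustration only, not how TMRtool builds voters in the FPGA fabric; the function names and the 8-bit test word are assumptions made for the example. It also shows why accumulated upsets "break" TMR: once two copies carry the same wrong bit, the majority is wrong.

```python
from typing import Tuple


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant words."""
    return (a & b) | (a & c) | (b & c)


def tmr_step(a: int, b: int, c: int) -> Tuple[int, bool]:
    """Return the voted word and whether any copy disagreed with it."""
    voted = majority_vote(a, b, c)
    mismatch = (a != voted) or (b != voted) or (c != voted)
    return voted, mismatch


good = 0b10110010             # hypothetical 8-bit value held by all three copies
upset = good ^ 0b00001000     # the same value with one bit flipped

# One upset copy: the voter masks the error and flags the mismatch.
voted, mismatch = tmr_step(good, upset, good)
print(voted == good, mismatch)

# Two copies upset in the same bit (accumulated upsets): the vote is wrong.
print(majority_vote(upset, upset, good) == good)
```

The second case is the situation that active scrubbing is meant to prevent, by repairing the first upset before a second one lands in another copy.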

Embedded “Hard-Core” Processor(s) Upset
PowerPC 405 cores in Virtex II-Pro family FPGAs offer unprecedented computational power inside an FPGA, but include additional upsetable storage elements.

Processor Upsets - Data Cache
Processor caches are very important features for increased performance; however, upsets in the caches can lead to system errors.

Processor Upset Mitigation
The “obvious” solution of implementing TMR with three processor cores is not available as a single-chip option because the maximum number of processors per FPGA is currently two. Tradeoffs between upset robustness and system complexity, possibly spanning multiple FPGAs, must be considered.

One-Chip Solution
Running two processors in lockstep is conceptually simple, especially as they can reside in a single FPGA. A fast TMR-ed comparison block is required to contain errors and keep them from propagating into the rest of the system. A processor upset appears to the comparison block as a disagreement, so both processors must be stopped within the current clock cycle. Both must then be forced to roll back to a known good software “bookmark” or, alternatively, to reboot.
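The lockstep-and-rollback behavior described above can be sketched as a small simulation. Everything here is a simplifying assumption for illustration: each processor is reduced to a single accumulator, the bookmark is a saved copy of that accumulator, and `upset_at` is a hypothetical fault-injection hook. The comparison block cannot tell which core is wrong, so both are restored.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Toy stand-in for a processor: its whole state is one accumulator."""
    acc: int = 0

    def step(self, value: int) -> int:
        self.acc = (self.acc + value) & 0xFFFFFFFF
        return self.acc


def run_lockstep(inputs, upset_at=None, bookmark_every=4):
    """Run two cores in lockstep and roll both back on any disagreement.

    upset_at: hypothetical cycle at which one bit of core B flips, once.
    Returns (final accumulator, number of rollbacks).
    """
    a, b = Core(), Core()
    bookmark_cycle, bookmark_acc = 0, 0
    rollbacks = 0
    cycle = 0
    while cycle < len(inputs):
        if upset_at is not None and cycle == upset_at:
            b.acc ^= 0x10      # inject a single-bit upset into core B
            upset_at = None    # transient: it does not recur after rollback
        out_a = a.step(inputs[cycle])
        out_b = b.step(inputs[cycle])
        if out_a != out_b:
            # Comparison block: contain the error, restore both cores.
            a.acc = b.acc = bookmark_acc
            cycle = bookmark_cycle
            rollbacks += 1
            continue
        cycle += 1
        if cycle % bookmark_every == 0:
            bookmark_cycle, bookmark_acc = cycle, a.acc
    return a.acc, rollbacks


inputs = list(range(1, 11))
print(run_lockstep(inputs))              # (55, 0): no upset, no rollback
print(run_lockstep(inputs, upset_at=6))  # (55, 1): one upset, one rollback
```

The result is correct in both runs, but the second run repeats the cycles since the last bookmark, which is the system outage the one-chip scheme pays on every error.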

[Figure: Flow Chart, One-Chip Solution]

Advantages (One-Chip Solution)
- Contained in one chip
  - No chip-to-chip interconnects (minimal latency and propagation delay)
  - Lower power consumption
  - Less board area
  - No chip-to-chip synchronization
- Technology is more developed and tested [Ref. 2]

Disadvantages (One-Chip Solution)
- More system outages
  - Reboot or rollback on every error
  - Not suitable for some critical real-time applications
- Twice as many errors as on a single processor, but at least they are detected
- Note: Requires an extra device, either a watchdog timer or an external configuration scrubber

Two-Chip Solution
With four processors in lockstep (necessitating two chips), a solution as robust as full TMR is possible. In this scheme, a pair of processors that gets into a disagreement due to an upset is stopped while the system runs without interruption on the pair that is in agreement. Correct internal state information is available in the working pair, so the stopped pair can be re-synchronized from it, preferably soon. It is thus possible to re-synchronize almost transparently and rapidly return to full four-processor lockstep operation with minimal intrusion. As a side effect of using two separate FPGAs, additional robustness is possible by adding cross-strapped configuration control.
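A rough sketch of the two-pair scheme follows, under the same toy assumptions as a single-accumulator processor model: the `Pair` class, the `upsets` injection map, and the fixed `resync_every` interval are all hypothetical and stand in for whatever the real design would use. A pair that disagrees is stopped, the system keeps producing output from the other pair, and the stopped pair is reloaded from the working pair at the next re-synchronization opportunity.

```python
from typing import Dict, List, Optional, Tuple


class Pair:
    """Two toy cores in lockstep on one chip (each core is one accumulator)."""

    def __init__(self) -> None:
        self.acc = [0, 0]
        self.active = True

    def step(self, value: int) -> Optional[int]:
        """Advance both cores; return the agreed output, or None if stopped."""
        if not self.active:
            return None
        self.acc = [(x + value) & 0xFFFFFFFF for x in self.acc]
        if self.acc[0] != self.acc[1]:
            self.active = False   # disagreement: stop this pair
            return None
        return self.acc[0]

    def resync_from(self, other: "Pair") -> None:
        """Reload both cores from the working pair and rejoin lockstep."""
        self.acc = [other.acc[0], other.acc[0]]
        self.active = True


def run_two_chip(inputs: List[int],
                 upsets: Optional[Dict[int, Tuple[int, int]]] = None,
                 resync_every: int = 4) -> Tuple[List[int], int]:
    """upsets maps cycle -> (pair index, core index) for a one-bit flip.

    Returns (system outputs, cycles spent running on a single pair).
    """
    upsets = upsets or {}
    pairs = [Pair(), Pair()]
    outputs: List[int] = []
    degraded_cycles = 0
    for cycle, value in enumerate(inputs):
        if cycle in upsets:
            p, c = upsets[cycle]
            pairs[p].acc[c] ^= 0x10
        results = [pair.step(value) for pair in pairs]
        good = [r for r in results if r is not None]
        if not good:
            raise RuntimeError("both pairs lost: reboot required")
        outputs.append(good[0])
        if len(good) < 2:
            degraded_cycles += 1
        if (cycle + 1) % resync_every == 0:
            for i, pair in enumerate(pairs):
                if not pair.active:
                    pair.resync_from(pairs[1 - i])
    return outputs, degraded_cycles


outputs, degraded = run_two_chip(list(range(1, 11)), upsets={5: (0, 1)})
print(outputs[-1], degraded)
```

With a single upset the output stream is uninterrupted; only a second upset in the other pair before re-synchronization raises the reboot condition.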

[Figure: Flow Chart, Two-Chip Solution]

Advantages (Two-Chip Solution)
- Reboots are rare; they require simultaneous errors in two separate processors
- Processor upsets are handled transparently, without system outage, until a convenient re-synchronization opportunity
- Enhanced robustness: outages lowered to less than the SEFI rate of ~1 in 80 years per device
- Allows added configuration robustness
  - Chips check each other (not self-checking)
  - Eliminates need for external watchdog timer
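Why reboots become rare can be seen with a back-of-the-envelope estimate. This is a simplified model, not a result from the talk: it assumes independent upsets at a constant per-processor rate `lam` and a fixed re-synchronization window `window`, and the numeric values below are purely hypothetical placeholders rather than measured rates.

```python
def one_chip_outage_rate(lam: float) -> float:
    """Lockstep pair: an upset in either core forces a rollback or reboot."""
    return 2.0 * lam


def two_chip_outage_rate(lam: float, window: float) -> float:
    """Two lockstep pairs: an outage needs a second upset, in the other pair,
    before the first pair has been re-synchronized.

    First upset: any of the 4 cores, rate 4 * lam.
    Second upset: either core of the surviving pair within `window`,
    probability about 2 * lam * window when that product is small.
    """
    return 4.0 * lam * (2.0 * lam * window)


# Hypothetical numbers for illustration only (units: per hour, hours).
lam = 1e-3      # assumed upset rate per processor
window = 1e-2   # assumed time until the stopped pair is re-synchronized

single = one_chip_outage_rate(lam)
dual = two_chip_outage_rate(lam, window)
print(single, dual, single / dual)
```

Under these assumptions the outage rate scales with `lam` squared times the window, so shortening the time to re-synchronization directly reduces the chance of a reboot.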

Disadvantages (Two-Chip Solution)
- Complicated
  - Inter-chip communication/synchronization
  - Transparent reboot/resynchronization of both processors in the chip with the error
- Twice the power consumption
- In-beam testing is not yet done (although planned for the near future)

Three-Chip Solution
The three-chip implementation (also known as the “virtual FPGA” solution [Ref. 3]) takes the responsibility of error detection out of the hands of the upsetable FPGAs by adding a Radiation-Hardened ASIC. Note that only one processor per FPGA is needed. The ASIC handles stopping error propagation and re-synchronizing an upset processor. Additionally, the ASIC can be used for configuration control of all three FPGAs.
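The role of the external voter can be sketched in a few lines. This is a toy model, not the design of the ASIC in [Ref. 3]: each processor is again reduced to one accumulator, and `vote_and_locate` and `controller_step` are hypothetical names. The point is that a 2-of-3 vote both masks the error and identifies which processor needs to be re-synchronized.

```python
from typing import List, Optional, Tuple


def vote_and_locate(outputs: List[int]) -> Tuple[int, Optional[int]]:
    """Return (voted output, index of the dissenting processor or None)."""
    a, b, c = outputs
    if a == b == c:
        return a, None
    if a == b:
        return a, 2
    if a == c:
        return a, 1
    if b == c:
        return b, 0
    raise RuntimeError("no majority: all three outputs differ")


def controller_step(states: List[int], value: int) -> Tuple[int, Optional[int]]:
    """One voted cycle: advance three cores, vote, re-sync any dissenter."""
    states[:] = [(s + value) & 0xFFFFFFFF for s in states]
    voted, bad = vote_and_locate(states)
    if bad is not None:
        states[bad] = voted   # reload the upset processor from the majority
    return voted, bad


states = [10, 10, 10]
states[1] ^= 0x4              # hypothetical single-bit upset in processor 1
voted, bad = controller_step(states, 5)
print(voted, bad, states)     # 15 1 [15, 15, 15]
```

Because the voted output is always available, the system sees no outage while the dissenting processor is reloaded.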

[Figure: Flow Chart, Three-Chip Solution]

Advantages (Three-Chip Solution)
- Maximum robustness to upsets
- Only three processors in lockstep (but in 3 chips)
- More fabric available for other functions
- No system outages; errors and SEFIs are handled transparently
- Most implementation details are confined to the ASIC and don’t affect the IP in the FPGAs significantly

Disadvantages (Three-Chip Solution)
- Complex ASIC development for the controller that votes outputs and re-loads/re-syncs an upset processor
- ASIC development cost (currently funded, though)
- Board area

Conclusions
- The two-chip and three-chip solutions have about the same robustness, power consumption, and system complexity, and both handle upsets better than the one-chip solution.
- The two- vs. three-chip decision mostly boils down to the familiar FPGA vs. ASIC debate.
- The three-chip solution may use less power than the two-chip solution. (Is the ASIC’s power consumption less than that of one processor core?)
- At present, the JPL-preferred approach is the two-chip implementation, which achieves maximum flexibility and near-maximum robustness to upsets.

References
[1] J. George et al., “Initial Single-Event Effects Testing and Mitigation in the Xilinx Virtex II-Pro FPGA,” Paper 211, MAPLD.
[2] M. Wang and G. Bolotin, “SEU Mitigation Techniques for Xilinx Virtex-II Pro FPGA,” Paper D110, MAPLD 2004, 1_d110_wang_s.ppt
[3] J. Lyke and B. Marty, Virtual Field Programmable Gate Array Triple Modular Redundant Cell Design, Air Force Research Laboratory: Space Vehicles Directorate, AFRL-VS-PS-TR, April 28, 2004.