Presentation is loading. Please wait.

Presentation is loading. Please wait.

144_C4 / MAPLD04Swift and Roosta1 Tradeoffs in Flight-Design Upset Mitigation in State-of-the-Art FPGAs Hardened By Design vs. Design-Level Hardening Gary.

Similar presentations


Presentation on theme: "144_C4 / MAPLD04Swift and Roosta1 Tradeoffs in Flight-Design Upset Mitigation in State-of-the-Art FPGAs Hardened By Design vs. Design-Level Hardening Gary."— Presentation transcript:

1 144_C4 / MAPLD04Swift and Roosta1 Tradeoffs in Flight-Design Upset Mitigation in State-of-the-Art FPGAs Hardened By Design vs. Design-Level Hardening Gary M. Swift and Ramin Roosta Jet Propulsion Laboratory / California Institute of Technology The research done in this paper was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration (NASA) and was partially sponsored by the NASA Electronic Parts and Packaging Program. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology.

2 144_C4 / MAPLD04Swift and Roosta2 In the beginning was Actel … Leveraging from a commercial product line  ONO anti-fuse based  one-time programmable (OTP) “beginning” = 1993  Reference: Katz, R.; Barto, R.; McKerracher, P.; Carkhuff, B.; Koga, R.; “SEU hardening of field programmable gate arrays (FPGAs) for space applications and device characterization,” IEEE Transactions on Nuclear Science, Dec. 1994

3 144_C4 / MAPLD04Swift and Roosta3 Later, Xilinx Leveraging from a commercial product line  SRAM based  reconfigurable “later” = 1998  Reference: Guertin, S.M.; Swift, G.M.; Nguyen, D.; “Single-event upset test results for the Xilinx XQ1701L PROM”, Radiation Effects Data Workshop Record, 1999  Quote: (Xilinx SRAM-based FPGAs)… “do appear suited to a broad range of other (non-critical) applications, such as sensor and camera controllers.”

4 144_C4 / MAPLD04Swift and Roosta4 OUTLINE FPGAs: A key enabling technology for modern spacecraft Background in radiation testing of FPGAs ▫Earlier, Katz/Swift collaboration ▫Recently, Xilinx Consortium Feature Comparison Triple Modular Redundancy (TMR) - hardware approach vs. software approach Concluding Remarks

5 144_C4 / MAPLD04Swift and Roosta5 FPGAs: A key “enabling technology” Like custom ASICs, FPGAs can replace whole boards  Saving mass, volume, power  Achieving extra functionality FPGAs are much cheaper than ASICs  Design efforts can be later in the schedule  Design mistakes don’t require a re-spin through the foundry

6 144_C4 / MAPLD04Swift and Roosta6 MER Pyro-Controller Used self-checking of configuration to initiate a re- configuration after spotting an upset

7 144_C4 / MAPLD04Swift and Roosta7 MER Pyro-Controller Nearing Mars Xilinx XQR4062XL

8 144_C4 / MAPLD04Swift and Roosta8 My Background Actel experience is older  No direct involvement in radiation tests since the ONO anti-fuse was replaced  Results here are from others’ work Xilinx experience is recent  Active participant in Xilinx Rad Test Consortium  Currently, finishing two+ year test campaign targeting the Virtex II family

9 144_C4 / MAPLD04Swift and Roosta9 Currently Available Devices Actel RT54SX-S family vs. Xilinx Virtex II family (-SU) Note: both are essentially immune to single-event latchup and have good total ionizing dose tolerance, [ Actel > 135 krad(Si); Xilinx > 200 krad(Si) ]

10 144_C4 / MAPLD04Swift and Roosta10 Main Feature Comparison Actel Xilinx RT54SX72S XQR2V6000 Gates: 72,000~6M ( /~3.2 ) flip/flops: 2012 67,584 / 3.2 = ~20k I/O Pins:360824 / 3 = 274 Speed external : 230 MHz 622 Mb/s (I-mode LVDS) Speed internal : 310 MHz360 MHz

11 144_C4 / MAPLD04Swift and Roosta11 Extra Features Comparison ActelXilinx RT54SX72S XQR2V6000 Block RAM:no2.5 Mb I/O Standards:manymany Others: hardwired TMR Clock Manager Multipliers

12 144_C4 / MAPLD04Swift and Roosta12 Actel: What bits can upset? User flip-flops only  Direct hits of same flip/flop in multiple domains ▫Very unlikely due to layout  Clock domain hits SEFI modes essentially eliminated

13 144_C4 / MAPLD04Swift and Roosta13 Xilinx: What bits can upset? Configuration Bits  Logical Function  Routing  User Options Block RAM User Flip-flops Control Registers × Type of I/O × Mode of Block RAM Access × Clock Manager × etc… × NAND × Ex-OR × Flip-Flop type × etc…

14 144_C4 / MAPLD04Swift and Roosta14 Xilinx: Heavy Ion Test Results Resulting in fairly low in-space rates: ~6 per day for 2V6000 in GCR min. Low Threshold (soft) Low Susceptibility (hard)

15 144_C4 / MAPLD04Swift and Roosta15 Actel: Heavy Ion Test Results Very low in-space rates (assume LET th > 40 achieved): ~1 per 6800 years for SX72-S in GCR min. Where’s Threshold ??? Low Susceptibility (~100x harder) Data for two RTAX2000S prototypes at 1 MHz using checkerboard pattern from Fig. 12, JJ Wang et al., NSREC 2003 [Ref. 1]

16 144_C4 / MAPLD04Swift and Roosta16 Actel-style TMR SX-A “R” cell triplicates to: RTSX-S “R” cell

17 144_C4 / MAPLD04Swift and Roosta17 Actel-style TMR Actel-style TMR is fairly straightforward:  Each flip-flop is replaced by three plus feedback voter  Triplicated elements spread out physically  Uses one clock/inverse-clock domain  No external parts needed

18 144_C4 / MAPLD04Swift and Roosta18 Xilinx-style TMR Xilinx-style TMR is more complicated:  First, it’s not too useful without configuration scrubbing  Whole functional blocks are triplicated, not individual flip-flops  Three voters are used  Three clock domains  Elimination of: ▫Weak keepers (aka half latches) ▫Use of configuration cells as part of the design -For example, SRL16  Needs some external circuitry (at least, a watchdog timer + PROMs)

19 144_C4 / MAPLD04Swift and Roosta19 Xilinx-style TMR

20 144_C4 / MAPLD04Swift and Roosta20 Xilinx-style TMR In Xilinx-style TMR, I/O’s use three pins tied externally : Minority Voter P P P D0 D1 D2 D Pins Board Traces

21 144_C4 / MAPLD04Swift and Roosta21 Xilinx TMRtool Xilinx-style TMR done by hand is difficult and tedious An automated tool which integrates into the design flow has been developed (“now” available) In-beam testing shows tool is very effective

22 144_C4 / MAPLD04Swift and Roosta22 Upset Comparison ATMR now has eliminated:  Upsets of static storage elements, and  SEFIs ATMR upsets from:  Transients that are clocked into storage  Clock tree hits Xilinx FPGAs have a small susceptibility to two types of SEFIs  Reset (sometimes only partial)  Disable scrub port XTMR in combination with scrubbing can lower system upset rates below the SEFI rate

23 144_C4 / MAPLD04Swift and Roosta23 Rate Comparison GCR = Galactic Cosmic Ray background (interplanetary space) almost identical to geosynchronous orbit Actel Dominated by transients Roughly one system error per thousand years (GCR min ) Xilinx Dominated by SEFI rate Expect one SEFI per ~65 years in GCR min Expect one system error ~5-20x less often

24 144_C4 / MAPLD04Swift and Roosta24 CONCLUSIONS For the present – Both can achieve very acceptable radiation tolerance Actel wins on: ▫Less burden on the designer ▫No auxiliary components ▫Lower SEFI susceptibility Xilinx wins on: ▫Designer control of the resources vs. hardness tradeoff ▫On-chip feature set ▫Re-configurability Competition is good.

25 144_C4 / MAPLD04Swift and Roosta25 Acronyms FPGA - Field Programmable Gate Array ASIC - Application Specific Integrated Circuit SEU - Single Event Upset SEFI - Single Event Functionality Interrupt TMR - Triple Modular Redundancy ATMR - Actel-style TMR XTMR - Xilinx-style TMR LET - Linear Energy Transfer (proportional to deposited charge per micron for a heavy ion strike on an active node) GCR min - Galactic Cosmic Ray background (highest during “solar minimum” period of ~11-yr cycle of sunspots) MER - Mars Exploration Rovers (i.e., Spirit and Opportunity)

26 144_C4 / MAPLD04Swift and Roosta26 Additional References [1] J.J. Wang, W. Wong, S. Wolday, B. Cronquist, J. McCollum, R. Katz, I. Kleyner, “Single event upset and hardening in 0.15  antifuse-based field programmable gate array,” IEEE Transactions on Nuclear Science, Dec. 2003 [2] Jih-Jong Wang, R.B. Katz, F. Dhaoui, J.L. McCollum, W. Wong, B.E. Cronquist, R.T. Lambertson, E. Hamdy, I. Kleyner, W. Parker, “Clock buffer circuit soft errors in antifuse-based field programmable gate arrays,” IEEE Transactions on Nuclear Science, Dec. 2000 [3] R. Katz, J.J. Wang, R. Koga, K.A. LaBel, J. McCollum, R. Brown, R.A. Reed, B. Cronquist, S. Crain, T. Scott, W. Paolini, B. Sin, “Current radiation issues for programmable elements and devices,” IEEE Transactions on Nuclear Science, Dec. 1998


Download ppt "144_C4 / MAPLD04Swift and Roosta1 Tradeoffs in Flight-Design Upset Mitigation in State-of-the-Art FPGAs Hardened By Design vs. Design-Level Hardening Gary."

Similar presentations


Ads by Google