Analytical Approach for Soft Error Rate Estimation of SRAM-Based FPGAs


Ghazanfar (Hossein) Asadi
Test & Reliability Group (TRG)
Department of Electrical & Computer Engineering
Northeastern University

Problem Statement
  Estimating the soft error rate in FPGAs
    The probability of system failure due to soft errors, for a given mapped design
    Mean time to manifest a corrupted configuration bit at primary outputs or flip-flops

Motivation
  Need for soft error rate estimation
    Exponential growth of vulnerable bits due to Moore's law
    High cost of error-tolerant schemes
    To make appropriate cost/reliability trade-offs: where to put redundancy
  Previous work: fault injection
    Time-consuming / incomplete / expensive
    Needs a physical prototype board
    Cannot be used in design phases
    Prototype board can be damaged → hard error

Error Models in FPGAs
  Memory resources:
    User bits: flip-flops, RAMs, …
    Configuration bits: mux select bits, LUT bits, PIPs, …
  User bits → transient errors
  Configuration bits → permanent errors

SER Estimation
  Traversing structural paths [Asadi04]
    From error sites to outputs
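
A minimal sketch of the traversal idea, assuming the netlist is available as a map from each node to its fan-out nodes. The toy netlist, the node names, and the function `reachable_outputs` are hypothetical illustrations, not the method of [Asadi04]. The sketch only finds which outputs or flip-flops are structurally reachable from an error site; it does not compute any probability.

```python
from collections import deque


def reachable_outputs(fanout, error_site, outputs):
    """Return the outputs structurally reachable from error_site.

    fanout:     dict mapping a node name to the list of nodes it drives
    error_site: name of the node where the error originates
    outputs:    set of node names treated as primary outputs or flip-flops
    """
    seen = {error_site}
    queue = deque([error_site])
    reached = set()
    while queue:
        node = queue.popleft()
        if node in outputs:
            reached.add(node)
        for succ in fanout.get(node, ()):
            if succ not in seen:  # visit each node once
                seen.add(succ)
                queue.append(succ)
    return reached


# Hypothetical toy netlist: a drives g1 and g2, b drives g2.
fanout = {"a": ["g1", "g2"], "b": ["g2"], "g1": ["out1"], "g2": ["ff1"]}
outputs = {"out1", "ff1"}

print(sorted(reachable_outputs(fanout, "a", outputs)))
print(sorted(reachable_outputs(fanout, "b", outputs)))
```

A breadth-first search is used here only because it is simple; any traversal that visits each node once would serve the same purpose.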

SER Estimation in ASIC Designs
  S(n): system failure probability (SFP) vector
    $S_i$: SFP given that node i is erroneous
    n: total number of fault sites
  Experiments on ISCAS89 show that:
    Three orders of magnitude faster than random-input simulation
    Accuracy: more than 90%

FPGA vs. ASIC in SER Estimation
  ASIC: transient errors
    Only requires propagation probability
  FPGA: both transient and permanent errors
    Transient errors: the same as in ASICs
    Permanent errors: need activation as well
  More error sites in FPGAs
    Routing signals

FPGA vs. ASIC in SER Estimation
  Nodes with different error rates in FPGAs
  No attenuation in FPGAs during propagation

SER Estimation of FPGAs: Steps
  Compute permanent error rates for all nodes
    $PR_i$: the permanent error rate of node i
    n: total number of fault sites
  Compute the netlist failure probability vector
    $N_i$: failure probability given that node i is erroneous
  System failure rate vector: $S = PR \times N$ (element-wise)
    $S_i = PR_i \times N_i$
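
The last step can be sketched as an element-wise product of the two vectors. This is a minimal illustration, not the authors' implementation: the function name `system_failure_rates` and all numeric values are hypothetical.

```python
def system_failure_rates(pr, n):
    """Element-wise product S_i = PR_i * N_i.

    pr: permanent error rate of each node (e.g. in FIT)
    n:  failure probability given that the node is erroneous (0..1)
    """
    if len(pr) != len(n):
        raise ValueError("PR and N must have one entry per fault site")
    return [pr_i * n_i for pr_i, n_i in zip(pr, n)]


# Hypothetical values for three fault sites.
pr = [0.06, 0.02, 0.01]
n = [0.5, 0.25, 1.0]

for site, s_i in enumerate(system_failure_rates(pr, n)):
    print(f"site {site}: S_i = {s_i:.4f}")
```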

How to Compute $N_i$?
  Open & stuck-at errors:
    $N_i = SP_i \times PP_i(0) + (1 - SP_i) \times PP_i(1) = PP_i$
    $PP_i$: propagation probability (the method used for ASICs)
    $SP$: signal probability, used as the activation probability
  Bridging wired-AND & wired-OR errors (nets i and j):
    $N_i(\mathrm{Wand}) = SP_i (1 - SP_j) PP_i(0) + (1 - SP_i) SP_j PP_j(0)$
    $N_i(\mathrm{Wor}) = SP_i (1 - SP_j) PP_j(1) + (1 - SP_i) SP_j PP_i(1)$
  LUT bit-flip:
    $N_i$ = activation probability (cell) $\times$ propagation probability (LUT output)
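
The four expressions above can be transcribed directly into code. This is a sketch under stated assumptions: the function names and the numeric inputs are hypothetical, and the arguments `pp_i0`, `pp_i1`, `pp_j0`, `pp_j1` stand for $PP_i(0)$, $PP_i(1)$, $PP_j(0)$, $PP_j(1)$ exactly as written on the slide.

```python
def n_open_or_stuck(sp_i, pp_i0, pp_i1):
    """N_i = SP_i * PP_i(0) + (1 - SP_i) * PP_i(1)."""
    return sp_i * pp_i0 + (1 - sp_i) * pp_i1


def n_wired_and(sp_i, sp_j, pp_i0, pp_j0):
    """N_i(Wand) = SP_i (1 - SP_j) PP_i(0) + (1 - SP_i) SP_j PP_j(0)."""
    return sp_i * (1 - sp_j) * pp_i0 + (1 - sp_i) * sp_j * pp_j0


def n_wired_or(sp_i, sp_j, pp_i1, pp_j1):
    """N_i(Wor) = SP_i (1 - SP_j) PP_j(1) + (1 - SP_i) SP_j PP_i(1)."""
    return sp_i * (1 - sp_j) * pp_j1 + (1 - sp_i) * sp_j * pp_i1


def n_lut_bit_flip(activation_prob, pp_lut_output):
    """N_i = activation prob. (cell) * propagation prob. (LUT output)."""
    return activation_prob * pp_lut_output


# Hypothetical signal and propagation probabilities for nets i and j.
sp_i, sp_j = 0.5, 0.25
pp_i0, pp_i1 = 0.8, 0.6
pp_j0, pp_j1 = 0.4, 0.2

print("open/stuck-at:", n_open_or_stuck(sp_i, pp_i0, pp_i1))
print("wired-AND:    ", n_wired_and(sp_i, sp_j, pp_i0, pp_j0))
print("wired-OR:     ", n_wired_or(sp_i, sp_j, pp_i1, pp_j1))
print("LUT bit-flip: ", n_lut_bit_flip(0.125, 0.8))
```

Keeping one function per error type mirrors the slide and makes it easy to check each term against its formula.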

How to Compute $PR_i$?
  PR(n): permanent error rate vector
  $PR_i = r \times f$
    r: raw error rate of an SRAM cell
    f: number of all possible errors at node i
    n: total number of error sites
  $PR_{AB} = 6 \times r$
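
As a small worked example, the rate for the node AB follows from the raw rate r = 0.01 FIT/bit quoted in the experimental setup. The function name `permanent_error_rate` is a hypothetical helper for illustration.

```python
RAW_RATE_FIT_PER_BIT = 0.01  # r, taken from the experimental setup slide


def permanent_error_rate(num_possible_errors, raw_rate=RAW_RATE_FIT_PER_BIT):
    """PR_i = r * f, where f is the number of possible errors at node i."""
    if num_possible_errors < 0:
        raise ValueError("f must be non-negative")
    return raw_rate * num_possible_errors


# Node AB from the slide has f = 6 possible errors.
pr_ab = permanent_error_rate(6)
print(f"PR_AB = {pr_ab:.2f} FIT")
```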

System Failure Rate
  For the first clock:
  For c clock cycles:
    The same probability is valid for the next clock cycles
    c: number of clock cycles for which the state of the circuit is checked after a particle hit

Error List
  Mux open
  PIP open
  Buffer off
  Bit-flip in a LUT
  Control/clocking bit-flip

Experimental Setup
  Xilinx Virtex 300 (XCV300)
  Xilinx Design Language (XDL)
  Benchmark: some ISCAS89 circuits
  r: raw failure rate of an SRAM cell, r = 0.01 FIT/bit
  1000 clocks executed for each SEU
  Platform: Sun Solaris Ultra-10, 256 MB main memory

Results: Sensitive Bits
  Number of sensitive SRAM bits for each part

  Circuit            s27   s298   s344   s349   s382   s386
  Routing             64    459    536    650    807    714
  LUT                 68    418    392    520    712    660
  Control/Clocking    40    140    168    187    207    160
  Total              172   1017   1096   1357   1726   1534
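
The table can be sanity-checked with a few lines of code: each Total should equal the sum of the three categories. The sketch below uses only the numbers from the table; the dictionary layout and the helper `category_sum` are illustrative choices.

```python
# Sensitive SRAM bits per circuit, copied from the table above.
sensitive_bits = {
    "s27":  {"routing": 64,  "lut": 68,  "control_clocking": 40,  "total": 172},
    "s298": {"routing": 459, "lut": 418, "control_clocking": 140, "total": 1017},
    "s344": {"routing": 536, "lut": 392, "control_clocking": 168, "total": 1096},
    "s349": {"routing": 650, "lut": 520, "control_clocking": 187, "total": 1357},
    "s382": {"routing": 807, "lut": 712, "control_clocking": 207, "total": 1726},
    "s386": {"routing": 714, "lut": 660, "control_clocking": 160, "total": 1534},
}


def category_sum(row):
    """Sum of the routing, LUT, and control/clocking bit counts."""
    return row["routing"] + row["lut"] + row["control_clocking"]


for circuit, row in sensitive_bits.items():
    consistent = category_sum(row) == row["total"]
    routing_share = row["routing"] / row["total"]
    print(f"{circuit}: total consistent={consistent}, "
          f"routing share={routing_share:.0%}")
```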

Results: SFR & Estimation Time
  System failure rate & estimation time

  Circuit             s27    s298   s344   s349   s382   s386
  SFR (FIT)           1.71   9.87   9.99  12.77  16.04  12.11
  SP Time (sec)       0.15   0.76   0.91   1.09   1.25   1.05
  SFR Time (sec)      0.02   0.09   0.13   0.14   0.19   0.25
  Total Time (sec)    0.17   0.85   1.04   1.23   1.44   1.30

  Number of clock cycles: 1000
  SP Time: signal probability computation time
  SFR Time: system failure rate computation time

Results: Manifestation Time
  Mean Time To Manifest (MTTM) errors to outputs (results are in cycles)

  Circuit     s27    s298   s344   s349   s382   s386
  Routing     2.07   2.86   2.58   2.91   3.30   3.82
  LUT        14.49  20.75  17.33  20.48  22.08  30.07

  Control/Clocking (only five values survive in the transcript; their column assignment is unknown): 1.18 1.31 1.36 1.40 1.77

Summary & Conclusions
  A new approach for SER estimation for SRAM-based FPGAs
    No physical implementation required
    Can be used in early design stages
    Very fast simulation time
    Can cover all possible faults
  Mean Time To Manifest (MTTM) errors to outputs:
    MTTM(control/clocking) < MTTM(routing)
    MTTM(routing) << MTTM(LUT)

Questions?
Thanks