CMP238: Projeto e Teste de Sistemas VLSI (VLSI Systems Design and Test). Marcelo Lubaszewski. Lecture 3 - Test. PPGC - UFRGS, 2005/I.


1 CMP238: Projeto e Teste de Sistemas VLSI (VLSI Systems Design and Test). Marcelo Lubaszewski. Lecture 3 - Test. PPGC - UFRGS, 2005/I

2 Lecture 3 - Fault Simulation and Fault Injection
- Fault simulation
- Fault simulation algorithms: serial, parallel, deductive, concurrent
- Random fault sampling
- Fault injection: ASIC, FPGAs

3 Problem and Motivation
Fault simulation problem:
Given
- a circuit
- a sequence of test vectors
- a fault model
Determine
- fault coverage: the fraction (or percentage) of modeled faults detected by the test vectors
- the set of undetected faults
Motivation
- Determine test quality and, in turn, product quality
- Find undetected fault targets to improve tests

4 Fault Simulator in a VLSI Design Process
[Flowchart. Boxes: verified design netlist, verification input stimuli, fault simulator, test vectors, modeled fault list, test generator, test compactor. Edge labels: remove tested faults, delete vectors, add vectors. Decision "Fault coverage?" with outcomes Low and Adequate (stop).]

5 Fault Simulation Scenario
Circuit model: mixed-level
- Mostly logic, with some switch-level for high-impedance (Z) and bidirectional signals
- High-level models (memory, etc.) with pin faults
Signal states: logic
- Two (0, 1) or three (0, 1, X) states for purely Boolean logic circuits
- Four states (0, 1, X, Z) for sequential MOS circuits
Timing
- Zero-delay for combinational and synchronous circuits
- Mostly unit-delay for circuits with feedback

6 Fault Simulation Scenario (continued)
Faults:
- Mostly single stuck-at faults
- Sometimes stuck-open, transition, and path-delay faults; analog circuit fault simulators are not yet in common use
- Equivalence fault collapsing of single stuck-at faults
- Fault dropping: a fault, once detected, is dropped from consideration as more vectors are simulated; fault dropping may be suppressed for diagnosis
- Fault sampling: a random sample of faults is simulated when the circuit is large

7 Fault Simulation Algorithms
- Serial
- Parallel
- Deductive
- Concurrent

8 Serial Algorithm
Algorithm:
- Simulate the fault-free circuit and save the responses.
- Repeat the following steps for each fault in the fault list:
  - Modify the netlist by injecting one fault
  - Simulate the modified netlist, vector by vector, comparing responses with the saved responses
  - If a response differs, report fault detection and suspend simulation of the remaining vectors
Advantages:
- Easy to implement; needs only a simulator and less memory
- Most faults, including analog faults, can be simulated

9 Serial Algorithm (Cont.)
Disadvantage: much repeated computation; CPU time is prohibitive for VLSI circuits
Alternative: simulate many faults together
[Figure: the test vectors drive the fault-free circuit and the circuits with faults f1, f2, ..., fn; one comparator per faulty circuit reports whether f1, f2, ..., fn is detected.]
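The serial algorithm of slides 8 and 9 can be sketched as follows. This is a minimal illustration, not the lecture's simulator: the circuit is an assumption (inputs a and b, fanout branches c and d of b, e = AND(a, c), f = NOT(d), g = OR(e, f), with g as the only output), inferred from the fault-list equations of the later deductive example, and the names `simulate` and `serial_fault_sim` are hypothetical.

```python
LINES = ["a", "b", "c", "d", "e", "f", "g"]

def simulate(a, b, fault=None):
    """Evaluate the assumed circuit for one vector.

    fault is None (fault-free) or a (line, stuck_value) pair.
    """
    v = {}

    def drive(line, value):
        # A stuck-at fault overrides the value driven on its line.
        if fault is not None and fault[0] == line:
            value = fault[1]
        v[line] = value

    drive("a", a)
    drive("b", b)
    drive("c", v["b"])           # fanout branch of b
    drive("d", v["b"])           # fanout branch of b
    drive("e", v["a"] & v["c"])  # AND gate
    drive("f", 1 - v["d"])       # inverter
    drive("g", v["e"] | v["f"])  # OR gate, primary output
    return v["g"]

def serial_fault_sim(vectors, faults):
    # Step 1: simulate the fault-free circuit and save the responses.
    good = [simulate(a, b) for a, b in vectors]
    detected = {}
    # Step 2: one fault at a time, vector by vector.
    for fault in faults:
        for i, (a, b) in enumerate(vectors):
            if simulate(a, b, fault) != good[i]:
                detected[fault] = i  # index of the first detecting vector
                break                # skip the remaining vectors for this fault
    coverage = len(detected) / len(faults)
    undetected = [f for f in faults if f not in detected]
    return coverage, detected, undetected

faults = [(line, stuck) for line in LINES for stuck in (0, 1)]
coverage, detected, undetected = serial_fault_sim([(1, 1)], faults)
print(f"coverage = {coverage:.2%}, detected = {sorted(detected)}")
```

The two return values correspond to the outputs named on slide 3: the fault coverage and the set of undetected faults.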

10 Parallel Fault Simulation
- Exploits the inherent bit-parallelism of logic operations on computer words
- Storage: one word per line for two-state simulation
- Multi-pass simulation: each pass simulates w-1 new faults, where w is the machine word length (one bit of each word carries the fault-free circuit)
- Speed-up over the serial method: about w-1
- Not suitable for circuits with timing-critical and non-Boolean logic

11 Parallel Fault Sim. Example
[Figure: example circuit with lines a to g and a 3-bit word per line. Bit 0: fault-free circuit. Bit 1: circuit with c s-a-0. Bit 2: circuit with f s-a-1. At the output, c s-a-0 is detected.]
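A minimal sketch of the bit-parallel idea, using the bit assignment given on the slide (bit 0 fault-free, bit 1 c s-a-0, bit 2 f s-a-1) and input vector a = 1, b = 1. The circuit topology (c and d are fanout branches of b, e = AND(a, c), f = NOT(d), g = OR(e, f)) is an assumption inferred from the deductive example, and the helper names are hypothetical.

```python
WIDTH = 3                # bit 0: fault-free, bits 1..2: one faulty circuit each
ALL = (1 << WIDTH) - 1   # mask that keeps words WIDTH bits wide

def spread(bit):
    """Replicate a scalar input value into every bit position."""
    return ALL if bit else 0

def inject(word, line, faults):
    """Force the stuck value into the bit position owned by each fault on this line."""
    for pos, fault_line, stuck in faults:
        if fault_line == line:
            if stuck:
                word |= 1 << pos
            else:
                word &= ~(1 << pos)
    return word & ALL

def parallel_sim(a, b, faults):
    w = {}
    w["a"] = inject(spread(a), "a", faults)
    w["b"] = inject(spread(b), "b", faults)
    w["c"] = inject(w["b"], "c", faults)
    w["d"] = inject(w["b"], "d", faults)
    w["e"] = inject(w["a"] & w["c"], "e", faults)  # one AND evaluates all circuits
    w["f"] = inject(~w["d"] & ALL, "f", faults)
    w["g"] = inject(w["e"] | w["f"], "g", faults)
    return w

faults = [(1, "c", 0), (2, "f", 1)]  # (bit position, line, stuck value)
words = parallel_sim(1, 1, faults)
good = words["g"] & 1
detected = [(line, stuck) for pos, line, stuck in faults
            if ((words["g"] >> pos) & 1) != good]
print(detected)
```

Each gate is evaluated once per pass with ordinary word-wide logic operations, which is where the speed-up of about w-1 over the serial method comes from.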

12 Deductive Fault Simulation
- One-pass simulation
- Each line k contains a list L_k of faults detectable on k
- Following simulation of each vector, the fault lists of all gate output lines are updated using set-theoretic rules, signal values, and gate input fault lists
- PO (primary output) fault lists provide detection data
- Limitations:
  - Set-theoretic rules are difficult to derive for non-Boolean gates
  - Gate delays are difficult to use

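13 Deductive Fault Simulation Example
[Figure: example circuit with lines a to g, annotated with signal values and fault lists.]
Notation: $L_k$ is the fault list for line k; $k_n$ is the s-a-n fault on line k.
Fault lists shown in the figure: $\{a_0\}$, $\{b_0\}$, $\{b_0, c_0\}$, $\{b_0, d_0\}$, $\{b_0, d_0, f_1\}$
$L_e = L_a \cup L_c \cup \{e_0\} = \{a_0, b_0, c_0, e_0\}$
$L_g = (L_e \cap \overline{L_f}) \cup \{g_0\} = \{a_0, c_0, e_0, g_0\}$
$L_g$ contains the faults detected by the input vector.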
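The set-theoretic rules can be sketched with Python sets. The circuit below is an assumption inferred from the equations above (c and d are fanout branches of b, e = AND(a, c), f = NOT(d), g = OR(e, f)); `propagate` and `deductive_sim` are hypothetical names. The rule used for AND and OR gates is the usual one: if no input is at the controlling value, take the union of the input lists; otherwise take the faults common to all controlling inputs that are absent from every non-controlling input.

```python
def propagate(in_vals, in_lists, controlling):
    """Fault list reaching a gate output from its inputs (AND: controlling=0, OR: controlling=1)."""
    ctrl = [L for v, L in zip(in_vals, in_lists) if v == controlling]
    non_ctrl = [L for v, L in zip(in_vals, in_lists) if v != controlling]
    if not ctrl:
        return set().union(*in_lists)
    return set.intersection(*ctrl) - set().union(*non_ctrl)

def deductive_sim(a, b):
    val, L = {}, {}

    def line(name, value, propagated):
        val[name] = value
        # The local fault on a line is stuck-at the complement of its value.
        L[name] = set(propagated) | {f"{name}{1 - value}"}

    line("a", a, set())
    line("b", b, set())
    line("c", val["b"], L["b"])
    line("d", val["b"], L["b"])
    line("e", val["a"] & val["c"],
         propagate([val["a"], val["c"]], [L["a"], L["c"]], controlling=0))
    line("f", 1 - val["d"], L["d"])  # an inverter passes its input list through
    line("g", val["e"] | val["f"],
         propagate([val["e"], val["f"]], [L["e"], L["f"]], controlling=1))
    return val, L

val, L = deductive_sim(1, 1)
print(sorted(L["e"]), sorted(L["g"]))
```

For the vector a = 1, b = 1 this reproduces the two fault lists derived on the slide, in a single pass over the circuit.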

14 Concurrent Fault Simulation
- Simulates the fault-free circuit and only those parts of the faulty circuits that differ in signal states from the fault-free circuit.
- Each gate has a list containing copies of the gate from all faulty circuits in which this gate differs. A list element contains the fault ID, the gate input and output values, and internal states, if any.
- All events of the fault-free circuit and of all faulty circuits are implicitly simulated.
- Faults can be simulated in any modeling style or detail supported in simulation (offers the most flexibility).
- Faster than the other methods, but uses the most memory.

15 Conc. Fault Sim. Example
[Figure: example circuit with lines a to g; each gate is shown with its fault-free values and a list of faulty gate copies labeled a0, b0, c0, d0, e0, f1, g0.]

16 [Figure: the same circuit with an input changing 1 -> 0; faulty gate copies labeled a0, b0, c0, d0, e0, f1, g0.]

17 Conc. Fault Sim. Example
[Figure: the same circuit after the input change; faulty gate copies labeled a1, b0, c0, d0, e1, f1, g1.]
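The per-gate lists of faulty copies can be illustrated with a simplified sketch. Two assumptions are made loudly here: the circuit topology (c and d are fanout branches of b, e = AND(a, c), f = NOT(d), g = OR(e, f)) is inferred from the deductive example, and the sketch rebuilds the lists from scratch for every vector, whereas a real concurrent simulator is event-driven and updates only the copies affected by a change. `GATES` and `concurrent_sim` are hypothetical names.

```python
# (output line, gate function, input lines), in topological order
GATES = [
    ("c", lambda b: b, ("b",)),
    ("d", lambda b: b, ("b",)),
    ("e", lambda a, c: a & c, ("a", "c")),
    ("f", lambda d: 1 - d, ("d",)),
    ("g", lambda e, f: e | f, ("e", "f")),
]

def concurrent_sim(inputs):
    good = dict(inputs)
    # diverged[line] maps fault ID -> faulty value, only where it differs from good.
    diverged = {pi: {f"{pi}{1 - v}": 1 - v} for pi, v in inputs.items()}
    copies = {}
    for out, fn, ins in GATES:
        good[out] = fn(*(good[i] for i in ins))
        gate_copies = {}
        # A faulty copy exists for every fault that changes at least one gate input.
        for fault in sorted(set().union(*(diverged[i] for i in ins))):
            f_in = tuple(diverged[i].get(fault, good[i]) for i in ins)
            gate_copies[fault] = (f_in, fn(*f_in))
        # Local stuck-at fault on the gate output line.
        local = f"{out}{1 - good[out]}"
        gate_copies[local] = (tuple(good[i] for i in ins), 1 - good[out])
        copies[out] = gate_copies
        # Only copies whose output differs from the good output propagate further.
        diverged[out] = {f: o for f, (_, o) in gate_copies.items() if o != good[out]}
    return good, copies, set(diverged["g"])

_, copies_11, detected_11 = concurrent_sim({"a": 1, "b": 1})
_, copies_01, detected_01 = concurrent_sim({"a": 0, "b": 1})
print(sorted(detected_11), sorted(detected_01))
```

Each list element stores the fault ID together with the gate input and output values, as described on slide 14; copies whose output matches the fault-free output stay in the gate's list but do not propagate.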

18 Fault Sampling
- A randomly selected subset (sample) of faults is simulated.
- The coverage measured in the sample is used to estimate the fault coverage of the entire circuit.
- Advantage: savings in computing resources (CPU time and memory).
- Disadvantage: limited data on undetected faults.

19 Motivation for Sampling
Complexity of fault simulation depends on:
- Number of gates
- Number of faults
- Number of vectors
Complexity of fault simulation with fault sampling depends on:
- Number of gates
- Number of vectors

20 Random Sampling Model
[Figure: the population of all faults, each either detected or undetected, with a fixed but unknown coverage; a sample is drawn from it by random picking.]
- N_p = total number of faults (population size)
- C = fault coverage (unknown)
- N_s = sample size, N_s << N_p
- c = sample coverage (a random variable)

21 Example
A circuit with 39,096 faults has an actual fault coverage of 87.1%. The measured coverage in a random sample of 1,000 faults is 88.7%. The sampling formula gives an estimate of 88.7% ± 3%. CPU time for sample simulation was about 10% of that for all faults.
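The formula this example refers to does not appear in the transcript. As a stand-in, the sketch below uses a three-standard-deviation interval under a binomial approximation of the sample coverage, which is reasonable when N_s << N_p. This is an assumption and not necessarily the formula used in the lecture, but it gives an interval of about ±3% for the numbers on this slide. The function name is hypothetical.

```python
import math

def coverage_interval(sample_coverage, sample_size, sigmas=3.0):
    """Interval for the true coverage C given the sample coverage c.

    Assumes c is approximately normal with variance c * (1 - c) / N_s
    (binomial approximation; no finite-population correction).
    """
    std = math.sqrt(sample_coverage * (1.0 - sample_coverage) / sample_size)
    half_width = sigmas * std
    return sample_coverage - half_width, sample_coverage + half_width

low, high = coverage_interval(0.887, 1000)
print(f"{low:.3f} .. {high:.3f}")  # the actual coverage of 0.871 falls inside
```

The half-width shrinks with the square root of the sample size, so quadrupling the sample halves the uncertainty.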

22 Fault Injection
ASIC (general)
- performed by simulation, or
- by emulation to speed up the process
FPGAs
- directly into the bitstream

23 General: ASIC
Performed by simulation, or by emulation to speed up the process

24 When should a fault be inserted? Where should a fault be inserted?
[Block diagram: a Fault Injection Control Block with a pseudo-random number generator (an 8-bit pseudo-random register with positions [0], [4], [5], [7] marked) that produces the time, location, and bit of each fault. Signals: clock, reset, Reset_core, fault injection enable signals (Enable_registers, Enable_memory), fault mask, memory location. Connected blocks: Fault Injection Block, application core.]
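A rough sketch of how such a control block might pick the time, location, and bit of each fault. Only the tap positions [0], [4], [5], [7] and the 8-bit width come from the figure; the shift direction, the feedback form, the mapping of random values to fields, and all function names are assumptions made for illustration.

```python
TAPS = (0, 4, 5, 7)  # positions marked on the 8-bit pseudo-random register

def lfsr_step(state):
    """One step of an 8-bit Fibonacci LFSR (left shift, XOR feedback into bit 0)."""
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> tap) & 1
    return ((state << 1) | feedback) & 0xFF

def fault_schedule(seed, count, mem_size=256):
    """Draw (time, location, mask) triples from successive LFSR states."""
    state = seed  # must be nonzero, otherwise the register stays at zero
    plan = []
    for _ in range(count):
        fields = []
        for _ in range(3):
            state = lfsr_step(state)
            fields.append(state)
        time, location, bit = fields
        plan.append({
            "time": time,                    # when: cycle offset (assumed unit)
            "location": location % mem_size, # where: register or memory address
            "mask": 1 << (bit % 8),          # which bit: one-hot fault mask
        })
    return plan

def inject(data, mask, fi_enable):
    """Flip the masked bit only while fault injection is enabled."""
    return data ^ mask if fi_enable else data

plan = fault_schedule(seed=0x5A, count=5)
print(plan[0], bin(inject(0b1010, 0b0010, True)))
```

A one-hot mask XORed into the stored word models a single bit-flip, and the enable signal keeps the data path transparent whenever no fault is being injected.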

25 Device Under Test Core: Registers
[Block diagram: registers with CLK, RST, and EN inputs, Hamming Code and Hamming Decode blocks, and the fault injection signals mask and FI_EN.]
- Special paths are required in the high-level description to ensure full controllability of fault insertion.
- No major performance penalties are observed.

26 Device Under Test Core: Memory
[Block diagram: dual-port memory with port A signals WEA, ENA, RSTA, CLKA, ADDRA, DIA, DOA and port B signals WEB, ENB, RSTB, CLKB, ADDRB, DIB, DOB, Hamming Code and Hamming Decode blocks, and the fault injection signals FI_memory location, mask, and FI_enable_memory.]
- Dual-port memories allow high flexibility with little extra logic.
- Good fault model approximation: faults can be injected into the memory at any time (in all micro-controller states) without perturbing normal operation.

27 Monitor Block
- Provides observability of an error in response to a fault injected in the DUT.
- What is the best method to analyze the results?
- Assumption: application results are stored in the internal memory, implemented using BlockRAM in the Virtex FPGA.
- The monitor block performs two main tasks:
  - Clear all the internal memory, to avoid accumulation of faults
  - Read all the internal memory, to check whether there is an error

28 Fault Injection Platform
[Block diagram: inside the Virtex FPGA, a fault injection block, a fault injection soft-core, a “golden” soft-core, and monitor blocks (signals: data, status, loss of sequence, error); inputs clock, reset, Reset_core; the ChipScope Analyzer connects to a computer over JTAG.]
- The “golden” chip approach was used to observe the faults in the system.
- The two cores are compared on:
  - data memory contents (error)
  - the Application_end signal at the end of the run (loss of sequence)
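The golden-chip comparison can be pictured as a small classification step. This is only a software illustration of the comparison described above, not the hardware monitor block; the function name, the argument types, and the "no effect" label are assumptions.

```python
def classify_run(golden_mem, dut_mem, golden_end, dut_end):
    """Classify one fault injection run against the golden core.

    golden_mem, dut_mem: data memory contents read back after the run.
    golden_end, dut_end: whether Application_end was seen when expected.
    """
    if dut_end != golden_end:
        return "loss of sequence"  # the application did not finish like the golden core
    if dut_mem != golden_mem:
        return "error"             # finished, but the results in memory differ
    return "no effect"

golden = [21, 42, 63, 84, 105, 126]
print(classify_run(golden, golden, True, True))
print(classify_run(golden, [21, 42, 63, 84, 105, 127], True, True))
print(classify_run(golden, golden, True, False))
```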

29 [Timing diagram. Signals: Reset, Reset_MEM, Read_MEM, Clear_MEM, Fault Injection Signals, Reset_DUT, Application_end, Loss of sequence, Error. Phases: clear memory, application, read memory. A time range for fault injection is marked.]

30 Case Study: 8051 Fault Injection Experiment
- Registers: datapath, control block, state machine (16 registers)
- Memory: 256 bytes of application memory (RAM)
- 8051 application: 6x6 matrix multiplication
  - Multiplication is performed by successive additions and left shifts of a register.
- Depending on the injection time, a fault in a memory location below 52h can generate an error or a loss of sequence.
[Figure: memory map with regions variables, Matrix 1, Matrix 2, Matrix Result and address marks 0h, 0Ah, 0Bh, 1Fh, 2Fh, 52h, 53h, 76h, 100h. Memory contents:]
000000010000001000000011000001000000010100000110
000000010000000100000001000000010000000100000001
000000100000001000000010000000100000001000000010
000000110000001100000011000000110000001100000011
000001000000010000000100000000100000010000000100
000001010000010100000101000000101000010100000101
000001100000011000000110000000110000011000000110
000101010010101000111111010101000110100101111110
000000000000000000000000000000000000000000000000
….
000000000000000000000000000000000000000000000000

31 Case Study: 8051 Fault Injection Experiment - Virtex Platform
- The fault injection system and the device under test (DUT) were implemented in a Virtex FPGA platform.
- Thousands of faults were injected within a few seconds.
- Emulation in the FPGA showed an acceleration of about 120,000 times.
- Component: Virtex XCV300 - 240p
- The 8051-like micro-controller runs at 10 MHz
- Fault injection, standard 8051: 28% of LUTs
- Fault injection, hardened 8051: 30% of LUTs

32 Case Study: 8051 Fault Injection Experiment - ChipScope Analyzer
The results of thousands of injected faults were analyzed on the computer with the ChipScope Analyzer software (ISE).

33 Fault Injection FPGAs (Directly into the bitstream)

34 Introduction
Digital designs synthesized in SRAM-based FPGAs are sensitive to upsets in the FPGA customization cells.
[Figure: a design (clk, E1, E2, E3) is synthesized into customization bits …10101011110001101010…; an upset (bit-flip) changes them to …10101011100001101010…, which appears as a fault in the implemented design.]

35 Motivation
- One way to test the reliability of a (fault-tolerant) design is to perform fault injection, simulating the effects of SEUs, and see how the design responds.
- The estimated time for injecting a one-bit fault in an FPGA bitstream is around 1 to 3 seconds.
- The main steps of the fault injection are:
  - read the bit file
  - flip one bit
  - write the new bit file with the correct CRC
  - download it to the board through the SelectMAP configuration interface
  - check the output signal
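The "flip one bit" step can be sketched on a raw byte buffer as below. The bit numbering (most significant bit first within each byte) and the function name are assumptions; recomputing the CRC and downloading through SelectMAP depend on the Virtex bitstream format and the board tooling and are deliberately not shown.

```python
def flip_bit(bitstream: bytes, bit_index: int) -> bytes:
    """Return a copy of the bitstream with exactly one bit inverted.

    bit_index counts from the start of the buffer, MSB first within a byte
    (an assumed convention).
    """
    data = bytearray(bitstream)
    byte_index, offset = divmod(bit_index, 8)
    data[byte_index] ^= 0x80 >> offset
    return bytes(data)

original = bytes([0b10101011, 0b11000110])
upset = flip_bit(original, 9)
print(f"{original[1]:08b} -> {upset[1]:08b}")
```

Flipping the same bit a second time restores the original contents, which is how a campaign can move from one bit to the next without accumulating faults.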

36 Fault Injection Cost Analysis
XCV300 analysis:
- CLB rows x CLB columns: 32 x 48, a total of 1,536 CLBs and 1,327,104 CLB bits: 31 days!
- Total of 470 CLBs used, 406,080 CLB bits: 9.5 days!
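The order of magnitude of these figures can be checked with a few lines of arithmetic. The value of 2 seconds per injected bit is an assumption taken from the middle of the 1 to 3 second range quoted earlier; with it, the totals come out close to the 31 and 9.5 days on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def injection_days(bits, seconds_per_bit=2.0):
    """Time to inject one fault per bit, in days (seconds_per_bit is assumed)."""
    return bits * seconds_per_bit / SECONDS_PER_DAY

BITS_PER_CLB = 1_327_104 // 1_536      # configuration bits per CLB implied by the slide
all_clb_bits = 32 * 48 * BITS_PER_CLB  # every CLB in the XCV300
used_clb_bits = 470 * BITS_PER_CLB     # only the CLBs used by the design

print(BITS_PER_CLB, all_clb_bits, used_clb_bits)
print(f"{injection_days(all_clb_bits):.1f} days, {injection_days(used_clb_bits):.1f} days")
```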

37 CLB View in the FPGA Editor
- Which bit in the bitstream corresponds to this connection?
- Which bit in the bitstream corresponds to the logic?

38 Virtex Bitstream CLB BRAM IOB 586 bits What does this bit do?

39 CLB View in the FPGA Editor
- Which bit in the bitstream corresponds to this connection? Bit: ?, frame: ?
- Which bit in the bitstream corresponds to the logic? Bit: ?, frame: ?

40 CLBclass Tool CLB Map

41 SEU Effects on FPGAs I
Bit flip in a single line (bit: 03, frame: 29)
Fault effect:
- Open circuit

42 SEU Effects on FPGAs II
Bit flip in a hex line mux (bit: 01, frame: 32)
Fault effect:
- Open circuit
- Short circuit

43 Fault Injection Tool Used at Xilinx
[Figure: the tool works on the bitstream of the XCV300 240p (DUT), 1,663,200 bits, and monitors an error flag; label "3-modes".]
Read: …011001010101000001100001…
Step 1: load …01100 0 010101000001100001…
Step 2: load …01100 1 010101000001100001…
Step 3: reset flip-flops
Next: …011001 1 00101000001100001…
Output: list of bits that cause an error, given as major address, frame, byte in frame, and bit (the bit location in the bitstream).

44 Fault Injection Tool (under development at UFRGS)

45 Fault Injection Time Reduction
Bits to be tested of the 1,536 CLBs:

Bits            Number of bits   Estimated time   Time saved (%)
All bits        1,327,104        31 days          -
LUTs            98,304           1.1 days         96.45
GRM             638,976          7.4 days         76.13
Single matrix   466,944          5.4 days         82.58
Hex matrix      172,032          2 days           93.55

Annotation: responsible for the majority of the errors due to SEUs (open circuits combined with short circuits).
One can choose where to inject the faults! This can reduce the fault injection time.
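The "time saved" column follows from the estimated times, taking the 31 days for all bits as the baseline. A short check, with a hypothetical function name:

```python
def time_saved_percent(days, baseline_days=31.0):
    """Percentage of the full-campaign time saved by testing only a subset of bits."""
    return round(100.0 * (1.0 - days / baseline_days), 2)

estimated_days = {"LUTs": 1.1, "GRM": 7.4, "Single matrix": 5.4, "Hex matrix": 2.0}
savings = {name: time_saved_percent(days) for name, days in estimated_days.items()}
print(savings)
```

The single matrix and hex matrix bit counts add up to the GRM count (466,944 + 172,032 = 638,976), so restricting injection to one of the two matrices is a further subdivision of the routing bits.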

