Online and Operand-Aware Detection of Failures Utilizing False Alarm Vectors. Amir Yazdanbakhsh, David Palframan, Azadeh Davoodi, Mikko Lipasti, Nam Sung Kim.


1 Online and Operand-Aware Detection of Failures Utilizing False Alarm Vectors
Amir Yazdanbakhsh, David Palframan, Azadeh Davoodi, Mikko Lipasti, Nam Sung Kim
Department of Electrical and Computer Engineering

2 Introduction
Technology scaling beyond 32nm degrades the manufacturing yield
 o Can be addressed by imposing restrictive design rules or by using regular fabrics [T. Jhaveri, SPIE’06]
 o Can be addressed by using configurable logic blocks to make post-silicon corrections [Y. Ran, TVLSI’06]
 o Redundancy-based techniques can also be used
 o Exploiting existing redundancy in high-performance processors [P. Shivakumar, ICCD’12][S. Shyam, ASPLOS’06][J. Srinivasan, ISCA’05]
 o Incorporating redundancy at the granularity of a bit slice [K. Namba, PRDC’05]

3 Motivation
[Figure: an adder with bit positions 0 to 7 in which a defective prefix node impacts carries C4 and C7; the minimized faulty vectors (shown as 0000…11010…0110) are stored in a checker that drives the Recover signal.]
The checker detects a match with the faulty vectors and a small number of false alarm vectors at runtime.

4 Contributions
Checker Unit Module
 – Flexible option for online and operand-level fault detection
 – Faulty vectors can be updated over time
 – TCAM-based implementation, which can store cubes with don’t-care bits
 – No extra logic on the critical paths
False Alarm Vectors
 – Efficient use of false alarm vectors to reduce the number of vectors to be checked, thus reducing the TCAM area
 – False alarm insertion is integrated into the ESPRESSO two-level logic minimization tool
 – The recovery flag is not falsely activated too frequently

5 Checker Unit: Comparison with a Redundancy-Based Alternative
[Figure: the operand is fed to a TCAM-based checker holding cubes such as 0xxx11x0; a match raises the Recover signal.]
Checker unit
 – Does not affect the critical path
 – Flexible checker unit (can update faulty vectors)
 – Online and operand-aware detection of failures
Redundancy-based
 – Affects the critical path (large muxes)
 – Fixed design approach (cannot be updated)
 – Two out of three adders should always be fault-free

6 Overview of TCAM
A TCAM can store test cubes, which have don’t-care bits
A conventional TCAM needs to support random access to a specific entry in order to update the key value at runtime
 – Requires a log N-to-N decoder for a TCAM with N entries
 – The checker unit does not need such a decoder
Each entry needs to be written only once, every time the chip is turned on
 – Supporting sequential access to write the test cubes to the TCAM is sufficient
In our framework, the TCAM can get impractically large if all the faulty vectors are stored individually
 – We propose to add a few false alarm vectors to reduce the number of entries in the TCAM, and therefore the TCAM size
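The ternary matching described on this slide can be illustrated in software. The sketch below is not the hardware design from the talk: the class name `SequentialTcam`, its methods, and the value/mask encoding are assumptions made for illustration. It stores each cube as a (value, care-mask) pair, fills entries strictly in order (mirroring the point that no random-access decoder is needed), and reports a match when any stored cube agrees with the operand on all cared-for bits. A real TCAM compares all entries in parallel; the loop here only models the result.

```python
from typing import List, Tuple


def cube_to_value_mask(cube: str) -> Tuple[int, int]:
    """Encode a ternary cube such as '0xxx11x0' as (value, care_mask) integers.

    The leftmost character is treated as the most significant bit (an assumption).
    """
    value = 0
    mask = 0
    for ch in cube:
        value <<= 1
        mask <<= 1
        if ch in "01":
            mask |= 1          # this bit position is compared
            value |= int(ch)   # expected bit value
        elif ch not in "xX-":
            raise ValueError(f"invalid cube character: {ch!r}")
    return value, mask


class SequentialTcam:
    """Hypothetical software model of a write-once, sequentially filled TCAM."""

    def __init__(self, num_entries: int, width: int) -> None:
        self.num_entries = num_entries
        self.width = width
        self.entries: List[Tuple[int, int]] = []

    def write_next(self, cube: str) -> None:
        """Append the next cube; entries are filled in order, never by address."""
        if len(cube) != self.width:
            raise ValueError("cube width does not match TCAM width")
        if len(self.entries) >= self.num_entries:
            raise OverflowError("TCAM is full")
        self.entries.append(cube_to_value_mask(cube))

    def match(self, operand: int) -> bool:
        """Return True (raise 'recover') if any stored cube matches the operand."""
        return any((operand & mask) == value for value, mask in self.entries)


tcam = SequentialTcam(num_entries=2, width=8)
tcam.write_next("0xxx11x0")
print(tcam.match(0b01011100))  # operand agrees on every cared-for bit
print(tcam.match(0b11011100))  # top bit differs, so no match
```

The entry limit in `write_next` is what makes the number of cubes matter: every cube that survives minimization costs one TCAM entry.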

7 False Alarm Insertion to Minimize the TCAM Size: Example
Identify cubes which excite the fault:
       ABCDE
  V1   x10xx
  V2   0110x
  V3   11100
  V4   11101
  V5   x00x1
  V6   x0101

8 False Alarm Insertion to Minimize the TCAM Size: Example
Cubes which excite the fault:
       ABCDE
  V1   x10xx
  V2   0110x
  V3   11100
  V4   11101
  V5   x00x1
  V6   x0101
After test cube minimization:
       ABCDE
  V1   xx0x1
  V2   x10xx
  V3   xxx01
  V4   x1x0x

9 False Alarm Insertion to Minimize the TCAM Size: Example
Steps: identify cubes which excite the fault, test cube minimization, further minimization with false alarm insertion
After test cube minimization:
       ABCDE
  V1   xx0x1
  V2   x10xx
  V3   xxx01
  V4   x1x0x
After further minimization with false alarm insertion:
       ABCDE
  V1   xx0x1
  V2   x10xx
  V3   xxx0x
We reduce the number of test cubes from 6 to 3
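The example on slides 7 to 9 can be checked by brute force. The following sketch (helper names `minterms` and `cover` are illustrative, not from the talk) expands each cube into the binary vectors it covers, confirms that the four minimized cubes cover exactly the same vectors as the original six, and lists the extra vectors that the three-cube cover admits as false alarms.

```python
from itertools import product


def minterms(cube):
    """Expand a ternary cube (e.g. 'x10xx') into the set of binary vectors it covers."""
    choices = ["01" if ch == "x" else ch for ch in cube]
    return {"".join(bits) for bits in product(*choices)}


def cover(cubes):
    """Union of the vectors covered by a list of cubes."""
    covered = set()
    for cube in cubes:
        covered |= minterms(cube)
    return covered


faulty = ["x10xx", "0110x", "11100", "11101", "x00x1", "x0101"]
minimized = ["xx0x1", "x10xx", "xxx01", "x1x0x"]
with_false_alarms = ["xx0x1", "x10xx", "xxx0x"]

on_set = cover(faulty)

# Plain minimization must not change the set of detected vectors.
assert cover(minimized) == on_set

# False alarm insertion may only add vectors, never lose a faulty one.
assert on_set <= cover(with_false_alarms)

false_alarms = sorted(cover(with_false_alarms) - on_set)
print(false_alarms)  # ['00000', '00100', '10000', '10100']
```

In this small example, going from four cubes to three costs four false alarm vectors out of the 32 possible inputs, all with B = D = E = 0.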

10 False Alarm Insertion
Problem definition
 – Reduce the number of cubes below the given threshold by adding as few false alarm vectors as possible
Why do we need false alarms?
 – Due to the area budget, the number of entries in the TCAM is limited
 – The number of test cubes translates to the number of entries in the TCAM
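To make the problem definition concrete, here is a deliberately naive greedy baseline. It is not the EXPAND-FA procedure from the talk: it simply merges, at each step, the pair of cubes whose supercube introduces the fewest new false alarm vectors, until the cube count fits the budget. The function names and the `max_entries` parameter are assumptions for illustration, and the brute-force enumeration of vectors is only practical for very small input widths.

```python
from itertools import combinations, product


def minterms(cube):
    """Set of binary vectors covered by a ternary cube."""
    choices = ["01" if ch == "x" else ch for ch in cube]
    return {"".join(bits) for bits in product(*choices)}


def cover(cubes):
    covered = set()
    for cube in cubes:
        covered |= minterms(cube)
    return covered


def supercube(a, b):
    """Smallest cube containing both a and b: differing positions become 'x'."""
    return "".join(x if x == y else "x" for x, y in zip(a, b))


def greedy_false_alarm_insertion(cubes, max_entries):
    """Merge cubes until at most max_entries remain (naive baseline, not EXPAND-FA)."""
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    on_set = cover(cubes)
    current = list(cubes)
    while len(current) > max_entries:
        already_covered = cover(current)
        best = None
        for a, b in combinations(current, 2):
            merged = supercube(a, b)
            # Cost: vectors newly matched that are neither faulty nor already admitted.
            cost = len(minterms(merged) - on_set - already_covered)
            if best is None or cost < best[0]:
                best = (cost, merged)
        merged = best[1]
        merged_set = minterms(merged)
        # Drop every cube swallowed by the merged cube (this includes the merged pair).
        current = [c for c in current if not minterms(c) <= merged_set]
        current.append(merged)
    return current


minimized = ["xx0x1", "x10xx", "xxx01", "x1x0x"]
result = greedy_false_alarm_insertion(minimized, max_entries=3)
print(result, len(cover(result) - cover(minimized)))
```

On the four minimized cubes of the running example, several merges tie at four false alarm vectors, so this baseline may pick a different three-cube cover than the one on the slide while paying the same cost.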

11 Using Two-Level Logic Minimization
Two-level logic minimization can be used to minimize the number of test cubes
We extend the ESPRESSO* tool by inserting false alarm vectors to achieve higher minimization
*ESPRESSO. http://embedded.eecs.berkeley.edu/pubs/downloads/espresso/.

12 False Alarm Insertion by Extending ESPRESSO
Overview of the main loop of ESPRESSO (input: test cubes; output: minimized test cubes)
  F = IRREDUNDANT (F_ON, F_DC)
  F = REDUCE (F_ON, F_DC)
  F = EXPAND (F_ON, F_OFF)
  Loop exit test: Stop minimization?
Extension with false alarm insertion (input: minimized cubes; output: minimized cubes with false alarms)
  F = EXPAND-FA (F_ON, F_OFF)
  F = IRREDUNDANT (F_ON, F_DC)
  F = REDUCE (F_ON, F_DC)
  F = EXPAND (F_ON, F_OFF)
  Loop exit test: # vectors < threshold

13 False Alarm Insertion Example
[Figure: worked example with panels labeled EXPAND-FA, IRREDUNDANT, REDUCE, and EXPAND.]

14 False Alarm Insertion Procedure
Each call to the EXPAND-FA function expands multiple test cubes
 – How are the on-set cubes traversed sequentially? (see the paper)
 – Which cube is selected to be expanded? (same section of the paper)
 – Stopping criterion: the target number of cubes is reached

15 False Alarm Insertion for One Cube
Variables: A0 A1 A2 A3
On-set cube to expand: ON1 = 000x
Off-set cubes: OFF1 = xx1x, OFF2 = x1xx, OFF3 = 1xx1
[Figure: the offset matrix and the false alarm matrix for this cube, with rows B1 = 002-, B2 = 020-, B3 = 100- and a final row 122-, plus Karnaugh maps over A0A1 / A2A3 showing ON1, ON2, and the expanded ON’2.]
False Alarm Matrix (i, j)
 – Entry (i, j) indicates the false alarms between off-set cube i and the (expanded) cube when literal j is dropped
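One way to read the false alarm matrix is as a table of counts: entry (i, j) is the number of vectors shared by off-set cube i and the on-set cube after literal j has been dropped. The sketch below computes the matrix under that interpretation, which is an assumption and not a definition taken from the talk. For ON1 = 000x and the three off-set cubes on the slide it yields rows 0 0 2 -, 0 2 0 -, and 1 0 0 - with column totals 1 2 2 -, which agrees with the digit strings that survive from the slide. The function names are illustrative.

```python
def intersection_size(a, b):
    """Number of binary vectors covered by both ternary cubes (0 if they conflict)."""
    free = 0
    for x, y in zip(a, b):
        if x != "x" and y != "x" and x != y:
            return 0           # opposite fixed bits: the cubes are disjoint
        if x == "x" and y == "x":
            free += 1          # position unconstrained in both cubes
    return 2 ** free


def false_alarm_matrix(on_cube, off_cubes):
    """Entry [i][j]: false alarms against off_cubes[i] if literal j of on_cube is dropped.

    None marks a position that is already a don't-care in on_cube.
    """
    matrix = []
    for off in off_cubes:
        row = []
        for j, literal in enumerate(on_cube):
            if literal == "x":
                row.append(None)
                continue
            expanded = on_cube[:j] + "x" + on_cube[j + 1:]
            row.append(intersection_size(expanded, off))
        matrix.append(row)
    return matrix


matrix = false_alarm_matrix("000x", ["xx1x", "x1xx", "1xx1"])
for row in matrix:
    print(row)

# Column totals: a rough per-literal cost of dropping that literal.
totals = [None if col[0] is None else sum(col) for col in zip(*matrix)]
print(totals)
```

Under this reading, dropping A0 is the cheapest expansion for ON1, since it admits a single off-set vector (1001). Because off-set cubes can overlap, a column total may count the same vector more than once.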

16 Simulation Configuration
Single-failure scenarios in various nodes of a 32-bit Brent-Kung adder (prefix adder)
Generate all the test vectors for two failing cases, modeled as stuck-at-0 and stuck-at-1, using the ATALANTA* ATPG toolset
Use the SPEC2006 suite for the workload-dependent case
Record the input arguments to the adder by running each benchmark on an x86 simulator
Analyze the area overhead in 2-issue and 4-issue microprocessors
*H.K. Lee and D.S. Ha. Atalanta: an efficient ATPG for combinational circuits. Technical Report 93-12, Department of Electrical Engineering, Virginia Polytechnic Institute and State University, 1993.

17 Comparison of Probability of Detection
Probability of detection (PoD): percentage of times that the checker unit activates the recovery signal (either a false alarm or a true positive)
Average PoD degrades as the number of test cubes decreases
Average PoD after inserting false alarms does not degrade significantly in FA-128, FA-64, or FA-32 compared to W/O FA
This behavior holds for both the workload-dependent and random cases

18 Comparison of False Alarm Insertion Algorithms
FA-Ag algorithm
 – At each iteration, all the cubes are expanded using the EXPAND-FA procedure
Each entry indicates the fraction of false alarms out of the total number of detections, FA / (FA + TP)
 – FA denotes the number of false alarm minterms and TP the number of true positives, when a fault is truly happening
On average, FA-Ag results in more overhead than FA as the number of test cubes increases
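The two metrics used in the evaluation, probability of detection and the fraction of false alarms among all detections, can be computed over an operand trace as sketched below. This is an illustrative reading and not the evaluation code behind the reported results: it counts occurrences in a trace rather than distinct minterms, and the function names and the toy trace are assumptions.

```python
def matches(cube, vector):
    """True if the binary vector agrees with the cube on every non-'x' position."""
    return all(c == "x" or c == v for c, v in zip(cube, vector))


def detection_stats(operands, checker_cubes, faulty_cubes):
    """Return (PoD, false alarm fraction) for a trace of operand bit strings.

    checker_cubes: cubes loaded into the checker (may include false alarms).
    faulty_cubes:  cubes that truly excite the fault.
    """
    true_positives = 0
    false_alarms = 0
    for vector in operands:
        if any(matches(c, vector) for c in checker_cubes):
            if any(matches(c, vector) for c in faulty_cubes):
                true_positives += 1
            else:
                false_alarms += 1
    detections = true_positives + false_alarms
    pod = detections / len(operands) if operands else 0.0
    fa_fraction = false_alarms / detections if detections else 0.0
    return pod, fa_fraction


faulty = ["xx0x1", "x10xx", "xxx01", "x1x0x"]   # exact cover from the running example
checker = ["xx0x1", "x10xx", "xxx0x"]           # cover with false alarms inserted
trace = ["00000", "01000", "11111", "00101"]    # made-up operands, for illustration only

pod, fa_fraction = detection_stats(trace, checker, faulty)
print(pod, fa_fraction)
```

In the toy trace, three of the four operands trigger the checker and one of those three is a false alarm, so the recovery signal fires more often than strictly necessary but no faulty operand is missed.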

19 Area Overhead
Implemented approaches
 – Baseline k+1 (for k = 2, 4): k-issue processor with 1 redundant component
 – K+TCAM: k-issue processor with the checker implemented as a TCAM
 – K+FPGA: k-issue processor with the checker implemented as an FPGA
2+TCAM has better area than 2+1 for 32 and 48 cubes
2+FPGA always has more area than the baseline
Similar behavior for 4+TCAM and 4+FPGA

20 Conclusion
A new framework for online detection of failures at the operand level of granularity
 – Design a flexible TCAM-based checker unit
 – Propose a false alarm insertion algorithm to reduce the number of vectors below the given threshold
 – Incorporate the false alarm insertion algorithm into the ESPRESSO two-level logic minimization tool
Future work
 – Use the checker unit for other existing modules inside the processor
 – Utilize online and operand-aware detection for other types of faults, such as path delay faults

21 Questions?

