Reducing Test Application Time Through Test Data Mutation Encoding Sherief Reda and Alex Orailoglu Computer Science Engineering Dept. University of California,

Presentation transcript:

Reducing Test Application Time Through Test Data Mutation Encoding Sherief Reda and Alex Orailoglu Computer Science Engineering Dept. University of California, San Diego

Outline
- Introduction
- Test Data Mutation Encoding
- Time Reduction Analysis
- Experimental Results
- Conclusions
Sub-topics: Motivation, Scheme overview, Overlap exploration, Computational aspects, Hardware challenges, Don’t care handling

Introduction
- Advancements in VLSI device fabrication → unprecedented integration levels
- High integration manufacturing → increased test application time
- Testing multiple cores on a System-on-a-Chip (SoC) → increased test application time
- Increased test application time hinders volume manufacturing in today’s demanding market.

Motivation
[Figure: a scan chain between TDI and TDO, with an LFSR; Test Vector I is mutated into Test Vector II by flipping bits. Labels: scan chain length, test time increase.]
- Mutation reduces test time by specifying only the bits to be flipped.
- Problem: test responses destroy the scan cells’ content!

Motivation
- Decompose the scan chain: large test vectors are transformed into small horizontal test slices.
- A small test slice needs only a small number of bits to specify an inversion.
[Figure: the scan chain between TDI and TDO, with an LFSR, shown with the vector 0111XXX0X1XX11XX and its decomposition. Labels: scan chain length, bits to specify inversion, small test slice.]

Test Data Mutation Encoding
[Figure: hardware organization with TDI, DSR, 2x4 decoder, DOR, ENABLE, Flip, MISR, TDO and CLK. The decoder outputs are labeled 0 (00), 1 (01), 2 (10), 3 (11) and select the bit of the mutated test slice to flip. The animation shifts the stream 00110 in one bit at a time.]
- TDI: Test Data Input
- DSR: Decoder Shift Register
- DOR: Decoder Output Register
- TDO: Test Data Output
Example: bits 2 & 3 need to be flipped. Injecting 10 and 11 separately costs 4 bits; overlap can reduce this to just 11, i.e. 2 bits.
Result: 7 clock cycles are needed to inject 21 bits through 3 parallel streams to mutate the test vector, a 57% reduction in test application time.
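The overlap example (flipping bits 2 and 3 by injecting just 11) can be replayed with a short Python sketch. It is an illustration under assumptions not stated on the slides: the DSR starts at 00, each new bit enters at the leftmost position, ENABLE is sampled after every shift, and slice bit 0 is the leftmost character. The names `shift_in` and `apply_stream` are hypothetical.

```python
def shift_in(state: str, bit: str) -> str:
    """Shift one bit into the DSR; the new bit enters at the leftmost position."""
    return bit + state[:-1]


def apply_stream(slice_bits: str, stream: str, enable: str, initial_state: str = "00"):
    """Clock `stream` into the DSR, one bit per cycle.

    After each shift, if the matching ENABLE bit is '1', the slice bit whose
    index is encoded by the DSR is inverted. Returns (mutated slice, final DSR state).
    """
    bits = list(slice_bits)
    state = initial_state
    for bit, en in zip(stream, enable):
        state = shift_in(state, bit)
        if en == "1":
            index = int(state, 2)  # decoder output selected by the DSR
            bits[index] = "1" if bits[index] == "0" else "0"
    return "".join(bits), state


# With overlap: two cycles visit states 10 and 11 back to back.
print(apply_stream("0000", "11", "11"))      # ('0011', '11')
# Without overlap: each 2-bit index is loaded separately, four cycles.
print(apply_stream("0000", "0111", "0101"))  # ('0011', '11')
```

Both calls flip the same two bits; the overlapped stream simply reuses the bit already sitting in the register.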

Fundamental Challenges
- Input test data encodes the indices of flip locations, i.e. the flips needed to mutate test slices.
- Optimal ordering of the indices → maximal overlap → minimal test application time.
- Problem: what is the flipping order that attains the minimal number of clock cycles?

Overlap Exploration
[Figure: TDI, 3x8 decoder, MISR and TDO; the state transition diagram of the DSR (De Bruijn diagram) and its distance matrix.]
Objective: mutating an 8-bit test slice through flipping bits 2 & [...]
- First option: yields 3 clock cycles
- Second option: yields 4 clock cycles
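A distance matrix of this kind can be derived directly from the shift behaviour of the register. The sketch below is one possible construction, assuming a 3-bit DSR whose new bit enters at the leftmost position and an initial state of 000. The flip pair (2, 5) is chosen here only for illustration, and the function names are hypothetical.

```python
from itertools import product


def shift_distance(src: str, dst: str) -> int:
    """Fewest shifts (at least 1) that take the DSR from state src to state dst.

    After d shifts the register holds d new bits followed by the first
    k - d bits of src, so dst is reachable when dst[d:] == src[:k - d].
    """
    k = len(src)
    for d in range(1, k + 1):
        if dst[d:] == src[:k - d]:
            return d
    raise ValueError("states must have the same non-zero width")


def distance_matrix(k: int) -> dict:
    """Shift distance between every ordered pair of k-bit DSR states."""
    states = ["".join(bits) for bits in product("01", repeat=k)]
    return {(s, t): shift_distance(s, t) for s in states for t in states}


def trip_cost(start: str, flip_order: list) -> int:
    """Total clock cycles needed to visit the flip indices in the given order."""
    k = len(start)
    cost, state = 0, start
    for index in flip_order:
        target = format(index, f"0{k}b")
        cost += shift_distance(state, target)
        state = target
    return cost


print(trip_cost("000", [2, 5]))  # 3
print(trip_cost("000", [5, 2]))  # 4
```

The two orderings visit the same states but differ by one clock cycle, which is the effect the slide attributes to choosing the flipping order.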

Computational Aspects
- Optimal number of test bits → enumerate all the possible trips and pick the one that achieves the minimal total distance.
- If there are n bits to flip, then there are n! trips to consider in order to find the optimal trip.
- Large number of flips → enumeration of all trips is computationally infeasible → a greedy strategy is utilized.
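The exhaustive search described here can be sketched in Python as follows. This is a minimal illustration, assuming a DSR whose new bit enters at the leftmost position; the flip set {2, 5, 6}, the start state 000 and the helper names are hypothetical.

```python
from itertools import permutations


def shift_distance(src: str, dst: str) -> int:
    """Fewest shifts (at least 1) from DSR state src to state dst."""
    k = len(src)
    for d in range(1, k + 1):
        if dst[d:] == src[:k - d]:
            return d
    raise ValueError("states must have the same non-zero width")


def trip_cost(start: str, flip_order) -> int:
    """Total clock cycles needed to visit the flip indices in the given order."""
    k = len(start)
    cost, state = 0, start
    for index in flip_order:
        target = format(index, f"0{k}b")
        cost += shift_distance(state, target)
        state = target
    return cost


def optimal_order(start: str, flips):
    """Try all n! orderings and return (best order, clock cycles)."""
    best = min(permutations(flips), key=lambda order: trip_cost(start, order))
    return list(best), trip_cost(start, best)


print(optimal_order("000", [2, 5, 6]))  # ([2, 5, 6], 4)
```

Three flips mean only 6 orderings, but 10 flips already mean 3,628,800, which is why the deck turns to a greedy strategy.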

Computational Aspects
Objective: mutating an 8-bit test slice through flipping bits 5, 6 & [...]
A greedy strategy is applied to visit the three states:
- Move from the initial state to the closest state.
- Repeat until the test slice is mutated: move from the current state to the closest next state corresponding to a bit index still to be flipped.
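The greedy rule can be sketched as a nearest-neighbour walk over DSR states. The sketch assumes a DSR whose new bit enters at the leftmost position and breaks ties by picking the lowest bit index; the tie-break rule, the flip set {1, 3, 4} and the function names are assumptions made for illustration, not details from the slides.

```python
def shift_distance(src: str, dst: str) -> int:
    """Fewest shifts (at least 1) from DSR state src to state dst."""
    k = len(src)
    for d in range(1, k + 1):
        if dst[d:] == src[:k - d]:
            return d
    raise ValueError("states must have the same non-zero width")


def greedy_order(start: str, flips):
    """Repeatedly move to the closest state whose bit index still needs flipping.

    Returns (visit order, total clock cycles).
    """
    k = len(start)
    state = start
    remaining = sorted(flips)
    order, cycles = [], 0
    while remaining:
        # Closest remaining target; ties go to the lowest bit index.
        nxt = min(remaining,
                  key=lambda i: (shift_distance(state, format(i, f"0{k}b")), i))
        target = format(nxt, f"0{k}b")
        cycles += shift_distance(state, target)
        state = target
        order.append(nxt)
        remaining.remove(nxt)
    return order, cycles


print(greedy_order("000", [1, 3, 4]))  # ([4, 1, 3], 6)
```

For this particular set the order 4, 3, 1 would need only 4 cycles, so the greedy walk trades optimality for a cost that grows polynomially rather than factorially with the number of flips.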

Don’t Care Handling
[Figure: five consecutive test slices, Test Slice A to Test Slice E, consisting mostly of don’t cares; surviving patterns include x0xxx0xx, xxxxxxxx, xxxxxxxx and 1x0xx0xx.]
There are 2 cases:
- A run of don’t cares in between two identical specified bits
- A run of don’t cares in between two distinctly specified bits
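The two cases can be told apart by scanning each bit position across consecutive slices. The sketch below is an illustration only: a run of don’t cares between identical specified bits produces no flip, while a run between distinct specified bits produces one flip that may be scheduled anywhere in a window of slices. The slices used, the `x` notation for don’t cares and the name `flip_windows` are hypothetical, and the sketch assumes the first specified value of each bit position is already in place.

```python
def flip_windows(slices):
    """Return (bit, first_slice, last_slice) for every flip that is required.

    A flip of `bit` must happen while mutating into some slice numbered
    first_slice..last_slice (0-based positions in `slices`).
    """
    width = len(slices[0])
    windows = []
    for bit in range(width):
        last_value, last_pos = None, None
        for pos, test_slice in enumerate(slices):
            value = test_slice[bit]
            if value in "xX":
                continue  # don't care: keep whatever the cell holds
            if last_value is not None and value != last_value:
                # Distinct specified bits: one flip somewhere in the run.
                windows.append((bit, last_pos + 1, pos))
            # Identical specified bits need no flip at all.
            last_value, last_pos = value, pos
    return windows


slices = ["0x10", "xxxx", "0xx1", "1x0x"]
print(flip_windows(slices))  # [(0, 3, 3), (2, 1, 3), (3, 1, 2)]
```

A wide window, such as bit 2 here, is what leaves room to pick the slice in which the flip overlaps best with other flips.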

Don’t Care Handling
Assume we have the 3 test slices A, B and C (surviving patterns: x0xxx0xx, 0x0xxxxx).
- Mutating the slices one after another: A → B takes 3 clock cycles and B → C takes 3 clock cycles, 6 clock cycles in total.
- While mutating test slice A to test slice B, we can flip bit 5 in anticipation of test slice C. This saves 2 bits in mutating test slice B to C: A → B still takes 3 clock cycles, and B → C takes 1 clock cycle.

Reducing I/O Pin Requirements
To alleviate the requirement of adding an extra ENABLE pin, one of the I/O pins can be multiplexed, or the TDO can be multiplexed (TDO/Enable).
[Figure: TDI, CLK, 2x4 decoder and MISR, with Control and Enable signals and a shared TDO/Enable pin.]

Fundamental Issues
- How many scan chains should the original scan chain be decomposed into?
- What is the decoder size to be used?
- What is the attainable time reduction for various scan chain configurations?
- What is the relation between the number of flips to be performed in mutating test slices and the achievable time reduction?


Time Reduction Analysis
If we only need to flip one bit to mutate the current test slice to the next test slice, how many shift clock cycles are needed?
[Figure: hardware organization (TDI, 3x8 decoder, MISR, TDO) and the state transition diagram of the decoder shift register.]

Initial state 000:
Shifts   Next state   New reachable states
1        X00          1
2        XX0          2
3        XXX          4

Initial state 001:
Shifts   Next state   New reachable states
1        X00          2
2        XX0          2
3        XXX          3

A weighted average number of shifts is computed for state 000 and for state 001 (the cycle values did not survive in this transcript).
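The reachability counts for a given initial state can be tabulated programmatically. The sketch below assumes a DSR whose new bit enters at the leftmost position and weights every different target state equally; that weighting is an assumption, so the averages it prints need not match the figures quoted in the deck. Function names are hypothetical.

```python
from collections import Counter
from itertools import product


def shift_distance(src: str, dst: str) -> int:
    """Fewest shifts (at least 1) from DSR state src to state dst."""
    k = len(src)
    for d in range(1, k + 1):
        if dst[d:] == src[:k - d]:
            return d
    raise ValueError("states must have the same non-zero width")


def new_states_per_shift(initial: str) -> dict:
    """Map each shift count to the number of states first reached at that count."""
    k = len(initial)
    states = ["".join(bits) for bits in product("01", repeat=k)]
    counts = Counter(shift_distance(initial, t) for t in states if t != initial)
    return dict(sorted(counts.items()))


def weighted_average_shifts(initial: str) -> float:
    """Average shifts to reach a different state, all targets equally likely."""
    counts = new_states_per_shift(initial)
    return sum(d * n for d, n in counts.items()) / sum(counts.values())


print(new_states_per_shift("000"))  # {1: 1, 2: 2, 3: 4}
print(new_states_per_shift("001"))  # {1: 2, 2: 2, 3: 3}
print(round(weighted_average_shifts("000"), 2))  # 2.43
```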

Time Reduction Analysis
Given an initial state, what is the average number of clock cycles needed to reach a different state? 1.84 clock cycles.
The average number of clock cycles needed to reach a combination of states is a function not only of the initial state but also of the particular combination of states to be visited.

Time Reduction Analysis
In general, what is the average number of clock cycles needed to mutate test slices of various sizes?
[Plot: horizontal axis is the test slice size.]

Time Reduction Analysis
Time Reduction Ratio = (Test Slice Size) / (Average number of shift clock cycles)
[Plot: horizontal axis is the test slice size.]

Time Reduction Analysis
In this experiment, we assume that the number of bits to be flipped to mutate a 32-bit test slice is 8 times the number of bits to be flipped to mutate a 4-bit slice (bits to flip x test slice size / 4).
[Plot: horizontal axis is the test slice size.]
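The two quantities used in this analysis reduce to simple arithmetic, sketched below. The reading of the ratio as test slice size divided by the average number of shift clock cycles, the function names and the sample numbers are assumptions made for illustration, not measurements from the deck.

```python
def time_reduction_ratio(slice_size: int, avg_shift_cycles: float) -> float:
    """Test slice size divided by the average shift clock cycles per slice."""
    return slice_size / avg_shift_cycles


def scaled_flips(flips_for_4_bit_slice: float, slice_size: int) -> float:
    """Scale the flip count linearly with slice size, relative to a 4-bit slice."""
    return flips_for_4_bit_slice * slice_size / 4


# One flip in a 4-bit slice scales to eight flips in a 32-bit slice.
print(scaled_flips(1, 32))          # 8.0
# Hypothetical: an 8-bit slice mutated in 4 cycles on average.
print(time_reduction_ratio(8, 4))   # 2.0
```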

Experimental Results
MinTest fully specified vectors are compressed using test data mutation. Compressing MinTest vectors results in an average time reduction ratio of 2.4 for the 5 benchmark circuits.
Methods compared:
- MinTest: Hamzaoglu & Patel, ITC, 1998
- Virtual Scan Chains: Jas & Touba, VTS, 2000
- Golomb Coding: Chandra & Chakrabarty, VTS, 2000
- Test Data Mutation using MinTest fully specified vectors

Experimental Results
Test Data Mutation is applied to incompletely specified test vectors obtained from Atalanta. Compressing the incompletely specified test vectors, using the last flip heuristic, results in an average time reduction ratio of 6.7 in comparison with MinTest for the 5 benchmark circuits.
Methods compared:
- MinTest: Hamzaoglu & Patel, ITC, 1998
- Test Data Mutation using MinTest fully specified vectors
- Test Data Mutation using incompletely specified vectors

Experimental Results
Augmenting Scan Chain Concealment with test data mutation increases the test time reduction by a factor of 1.8.
Methods compared:
- Scan Chain Concealment: Bayraktaroglu & Orailoglu, DAC, 2001
- Test Data Mutation using scan chain concealment fully specified vectors

Conclusions
- A new methodology to reduce test application time through test data mutation is presented.
- Thorough analysis of the proposed method identifies configurations and conditions for optimal test time reduction.
- Reduced hardware overhead.
- Experimental results on ISCAS’89 benchmarks confirm drastic reductions in test application time.
- Effective overlapping of test data yields huge reductions in test application time.