Integrated Test Data Compression and Core Wrapper Design for Low-Cost System-on-a-Chip Testing
Paul Theo Gonciari, Bashir Al-Hashimi (Electronic Systems Design Group, University of Southampton, UK)
Nicola Nicolici (Electrical and Computer Engineering, McMaster University, Canada)

Overview
Low-cost system-on-a-chip test
Single vs. multiple scan chains compression
Proposed add-on architecture
–TAM add-on architecture
  Core wrapper design
  Reduce control and area overhead
–Design flow integration
Experimental results
Conclusion

Low-cost SOC test
Problems
–High volume of test data
–Increased chip/ATE frequency ratio
–Increased chip/ATE pin number ratio
–Increased scan-power dissipation
High ATE costs and yield loss

Low-cost SOC test
Solutions
–Test data reduction
–Reuse existing ATE technology
–Exploit chip/ATE frequency ratio
–Reduced pin count testing (RPCT)
–Scan chain partitioning

TAM add-on architecture
Low-cost solution for core-based SOC test
[Figure labels: SOC, Core, TAM add-on]

Single scan chain TDC
[Figure labels: ATE, Head, sync, decoder, counter, SISR, 5 FF, Core (si, so), SOC]

Single scan chain TDC (cont)
Exploit test set regularities (e.g., runs of 0s)
Based on coding schemes
Exploit frequency ratio
Synchronization overhead – temporal deserialization [Gonciari, ETW02]
–External clock synchronization
–FIFO-like structures
High scan power due to the long scan chain

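To make "runs of 0s" and "coding schemes" concrete, here is a minimal sketch of run-length compression with a Golomb code. The slides do not name a specific code, so the choice of Golomb coding, the group size `m`, and all function names are assumptions made for illustration only.

```python
def split_runs(bits):
    """Return the lengths of the 0-runs that end in a 1, plus any trailing 0s."""
    runs, count = [], 0
    for b in bits:
        if b == "0":
            count += 1
        else:
            runs.append(count)
            count = 0
    return runs, count


def golomb_codeword(length, m):
    """Unary prefix (length // m ones, then a 0) followed by a log2(m)-bit tail."""
    assert m > 0 and m & (m - 1) == 0, "this sketch assumes m is a power of two"
    k = m.bit_length() - 1
    q, r = divmod(length, m)
    tail = format(r, f"0{k}b") if k else ""
    return "1" * q + "0" + tail


def compress(bits, m=4):
    runs, trailing = split_runs(bits)
    if trailing:
        # Encode trailing 0s as one more run; the decoder trims the extra 1.
        runs.append(trailing)
    return "".join(golomb_codeword(r, m) for r in runs)


def decompress(code, m, n):
    """Rebuild the first n bits of the original stream from its codewords."""
    k = m.bit_length() - 1
    out, i = [], 0
    while i < len(code):
        q = 0
        while code[i] == "1":
            q += 1
            i += 1
        i += 1  # skip the 0 that ends the unary prefix
        r = int(code[i:i + k], 2) if k else 0
        i += k
        out.append("0" * (q * m + r) + "1")
    return "".join(out)[:n]


bits = "0000000100000000000010001"   # 25 bits, mostly 0s
code = compress(bits, m=4)           # 13 bits for this input
assert decompress(code, 4, len(bits)) == bits
```

The compression only pays off when the stream really is dominated by long runs of 0s; a stream with many isolated 1s would grow instead.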
Multiple scan chain TDC
[Figure labels: data in, ctrl, XOR Network, WSC, Core scan chain, Core, SISR]

Multiple scan chain TDC (cont)
Exploit care-bit sparseness
Uses XOR-based spreading networks
Temporal pattern lockout
–Extra control line
–Doubles the volume of test data
–Influences test application time
Structural pattern lockout
–Can influence fault coverage
High scan power due to driving of all scan chains
Extend single scan chain TDC to multiple scan chains

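The idea of an XOR-based spreading network can be sketched as a linear expansion over GF(2): a few input bits drive many scan chains, and a slice of care bits is usable only if some input word produces it. The sketch below assumes that "pattern lockout" refers to a slice with no such input word. The network layout, the function names, and that reading of lockout are illustrative assumptions, not details taken from the slides.

```python
def expand(network, x):
    """network[c] is a bitmask of the input bits XORed into scan chain c."""
    return [bin(mask & x).count("1") & 1 for mask in network]


def solve_slice(network, care_bits):
    """Find an input word that satisfies care_bits ({chain: 0 or 1}).

    Returns an integer input word, or None when no input word can
    produce the requested care bits (treated here as lockout).
    """
    pivots = {}  # pivot bit position -> (row mask, right-hand side)
    for chain, value in care_bits.items():
        mask, rhs = network[chain], value
        # Eliminate every known pivot from the new equation.
        for p, (pmask, prhs) in pivots.items():
            if mask >> p & 1:
                mask ^= pmask
                rhs ^= prhs
        if mask == 0:
            if rhs:
                return None  # 0 = 1: the care bits conflict
            continue         # redundant equation
        pivots[mask.bit_length() - 1] = (mask, rhs)

    # Back-substitute in reverse order; free input bits stay at 0.
    x = 0
    for p, (mask, rhs) in reversed(list(pivots.items())):
        if rhs ^ (bin(mask & x).count("1") & 1):
            x |= 1 << p
    return x


# Hypothetical network: 2 input bits spread over 4 scan chains.
network = [0b01, 0b10, 0b11, 0b01]
word = solve_slice(network, {0: 1, 2: 0})
assert expand(network, word) == [1, 1, 0, 1]
assert solve_slice(network, {0: 1, 3: 0}) is None  # chains 0 and 3 share inputs
```

Because care bits are sparse, most slices impose only a few equations and are solvable; the conflict case shows why a fixed network can still block some patterns.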
Extend single scan chain TDC …
Use one decoder and shift register [Chandra, DATE02]
[Figure labels: decoder, shift register, scan chain, Core]

Use one decoder and shift register
Loosened the ATE timing constraint
–Exploitation of frequency ratio
Reduced peak scan-power
–Shift register buffering
Synchronization overhead
Decrease in compression ratio
–Unbalanced scan chains
–Test set rotation

Extend single scan chain TDC … (cont)
Use one decoder per scan chain [Chandra, TCAD01] [Gonciari, ETW02]
[Figure labels: ctrl, distr, dec1, dec2, dec3, scan chain, Core]

Use one decoder per scan chain
Loosened the ATE timing constraint
–Exploitation of frequency ratio
Reduced scan-power
–Scan chain partitioning
Good compression ratio
–No test set rotation
Reduced synchronization overhead
Increased area and control overhead
Large number of scan chains
Unbalanced scan chains

Low-cost SOC test
Solutions
–Test data reduction
–Reuse existing ATE technology
–Exploit chip/ATE frequency ratio
–Reduced pin count testing (RPCT)
–Scan chain partitioning
Use one decoder per scan chain
Increased area and control overhead
Large number of scan chains
Unbalanced scan chains

Core wrapper design
[Figure labels: Core, WSC1, WSC2, WSC3, WSC4, tb1, tb2, tb3, tb4]
Why core wrapper design?
WSC partitioning [Gonciari, VTS02]
–Useless memory reduction
–Easy control

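One way to see why wrapper scan chain (WSC) balance matters is to count padding bits. The sketch below assumes that "useless memory" means the padding needed to bring every WSC up to the length of the longest one. It builds the WSCs with a greedy longest-first heuristic, which is a stand-in chosen for illustration and not the method of [Gonciari, VTS02]; the function names and scan chain lengths are hypothetical.

```python
import heapq


def build_wrapper_chains(chain_lengths, n_wsc):
    """Assign internal scan chains to n_wsc wrapper scan chains.

    Greedy rule: take the longest remaining chain and append it to the
    currently shortest WSC.
    """
    heap = [(0, i) for i in range(n_wsc)]  # (current WSC length, WSC index)
    members = [[] for _ in range(n_wsc)]
    for length in sorted(chain_lengths, reverse=True):
        total, i = heapq.heappop(heap)
        members[i].append(length)
        heapq.heappush(heap, (total + length, i))
    return members


def padding_bits(members):
    """Bits per test pattern spent padding shorter WSCs up to the longest."""
    totals = [sum(m) for m in members]
    return sum(max(totals) - t for t in totals)


# Hypothetical core with five internal scan chains and a 2-bit test bus.
wscs = build_wrapper_chains([8, 7, 6, 5, 4], n_wsc=2)
print(wscs, padding_bits(wscs))  # [[8, 5, 4], [7, 6]] 4
```

With these numbers the two WSCs end up 17 and 13 bits long, so 4 padding bits per pattern carry no test information.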
Reducing control and area overhead
Instead of one decoder per WSC
[Figure labels: ctrl, distr, dec1, dec2, dec3, dec4, WSC, Core]

Reducing control and area overhead …
WSC partitioning
–2 partitions
–1 control unit per partition
–1 decoder per partition
Exploit WSC partitioning for area and control reduction
[Figure labels: WSC, Core]

Reducing control and area overhead …
Control
–Length of the longest scan chain
–Number of scan chains
–Difference in partition lengths
Easy control per partition
[Figure labels: WSC, length, no WSCs, diff]

Extended decoder (xDec) – input
[Figure labels: WSC, dec1, dec, scan clk, data, length, no WSCs, diff]

Extended decoder (xDec) – output
[Figure labels: WSC, dec, no WSCs, mux, SISR]

Extended distribution architecture
[Figure labels: distr, xDistr, xDec1, xDec2, mux, SISR, WSC, Core]

Extended distribution architecture …
Unequal partition sizes for some cores!
[Figure labels: Core, WSC]

Extended distribution architecture
[Figure labels: add-on-xDistr, xDec1, xDec2, mux, WSC, Core]

Multiple TAM SOC test
[Figure labels: SOC, Core, add-on, 2xSISR]

Design flow integration
Overview
Low-cost system-on-a-chip test
–Test data reduction
–Synchronization overhead
Single vs. multiple scan chains compression
Proposed add-on architecture
–TAM add-on architecture
  Core wrapper design
  Reduce control and area overhead
–Design flow integration
Experimental results
Conclusion

Minimum VTD vs. equal partitions: test bus = 16, frequency ratio = 2
Minimum VTD vs. equal partitions: test bus = 16, frequency ratio = 4
add-on-xDistr vs. SSC: core s35932, frequency ratio = 2
add-on-xDistr vs. SSC: core s35932, frequency ratio = 4
add-on-xDistr vs. SSC: System 1, frequency ratio = 2, test bus = 24, reduction 19.29%
add-on-xDistr vs. SSC: System 2, frequency ratio = 2, test bus = 24, reduction 26.88%

Conclusion
Low-cost solution for core-based SOC test
TAM add-on architecture
Design flow integration
Exploited core wrapper design features
–Reduced control overhead
–Reduced area overhead
Reduced scan power through partitioning
Small area overhead (3-4%) for Systems 1 and 2

Test data reduction
Aims
–Volume of test data
–Area overhead
–Test application time
[Figure labels: ATE, Head, DIB, SO, dec, CUT, SOC]

Generic on-chip decoder
Serial decoder
–PG and CI cannot work independently
–Implicit communication between PG and CI
Parallel decoder
–PG and CI can work independently
–Explicit communication between PG and CI
[Figure labels: ATE, CI, PG, sync, scan clk, ate clk, data in, data out]

Synchronization overhead
Extensions to the DIB
–Multiple ATE channels
–Deserialization units
–Latency FIFOs
–Clock synchronization

Synchronization overhead (cont)
New ATEs
Source synchronous buses
Require programming
[Figure labels: ATE, DIB, SO, dec, CUT, SOC]

Synchronization overhead (cont)
[Figure labels: ATE, DIB, SO, dec, CUT, SOC]

Synchronization overhead (cont)
Low-cost test through ATE reuse
–Small area overhead increase
–Solution for entire chip test
–Test application time reduction
[Figure labels: ATE, DIB, SO, dec, CUT, SOC]

Synchronization overhead
Old ATEs
–Latency FIFO
–Clock synchronization
[Timing diagram labels: ATE clk, Chip clk, PG, CI, STOP]

On-chip SO solution
[Timing diagram labels: ATE clk, Chip clk, PG, CI, STOP]

On-chip SO solution (cont)
Increased VTD and TAT
Exploit DUMMY bits and reduce VTD and TAT
[Timing diagram labels: ATE clk, Chip clk, PG, CI, DUMMY]

On-chip SO solution (cont)
Distribution unit
–Any number of cores
–Self synchronous architecture
[Figure labels: ATE clk, Chip clk, PG, PG1, CI1, CI2, distr, dec1, dec2]

XOR-network %tpl
s38417: VTD / TAT for w = 32
s35932: VTD / TAT for w = 32