Presenter: Jyun-Yan Li A software-based self-test methodology for in-system testing of processor cache tag arrays G. Theodorou, N. Kranitis, A. Paschalis.


A software-based self-test methodology for in-system testing of processor cache tag arrays
G. Theodorou, N. Kranitis, A. Paschalis, Department of Informatics & Telecommunications, University of Athens, Greece
D. Gizopoulos, Department of Informatics, University of Piraeus, Greece
2010 IEEE 16th International On-Line Testing Symposium (IOLTS)
Presenter: Jyun-Yan Li

Software-Based Self-Test (SBST) has emerged as an effective alternative for processor manufacturing and in-system testing. For small memory arrays that lack BIST circuitry, such as cache tag arrays, SBST can be a flexible and low-cost solution for March test application and thus a viable supplement to hardware approaches. In this paper, a generic SBST program development methodology is proposed for periodic in-system (on-line) testing of L1 data and instruction cache tag arrays (for both direct-mapped and set-associative organizations) based on contemporary March test algorithms. The proposed SBST methodology utilizes existing special-purpose instructions and performance monitoring mechanisms of modern processors to overcome cache tag testability challenges.

Experimental results on the OpenRISC 1200 processor core demonstrate that the high test quality of contemporary March test algorithms is preserved, while low-cost in-system testing is achieved in terms of test duration and test code size.

Memory Built-In Self-Test (MBIST)
- Runs at speed and supports several tests
- Has an impact on chip area and performance
- SBST, in contrast, is non-intrusive and flexible
Defective tag arrays may cause erroneous cache misses
- Tag arrays have no MBIST circuit because of their small size

[Figure: related work. Boxes: This paper (implement SBST); March SS [2]; Direct mapped D$ SBST [14]; Performance counter [16]; MBIST [5] (compare); Traditional March [1]; March Minimal SS [3]; RAMSES [19] (memory fault simulator: simulation engine & fault descriptors). Labels: March; SBST; impact on chip area & performance; detect erroneous cache miss.]

March SS
Data cache algorithm
- Create_address(DB:i:B)
Instruction cache algorithm
- Create_address(DB:i:0)
- Fetch the first instruction in the cache line for alignment
Notation
- r0 (r1): read, expecting the data (inverted data)
- DB: N-bit word
- B: word offset
Fault labels shown in the figure: TF, RDF
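As a rough illustration of how a March test such as March SS exposes a fault, the sketch below runs the March SS element sequence as it is commonly stated in the memory-test literature (22 operations per cell) on a simulated bit-oriented memory. The `Memory` class, the injected fault (a cell that cannot make the 0-to-1 transition) and all names are assumptions made for this example; the slides apply the test word-wise to tag arrays through cache operations, which this sketch does not model.

```python
# March SS as commonly stated: six march elements, 22 operations per cell.
# Each element is (address order, operations); "any" is run in ascending order here.
MARCH_SS = [
    ("any",  ["w0"]),
    ("up",   ["r0", "r0", "w0", "r0", "w1"]),
    ("up",   ["r1", "r1", "w1", "r1", "w0"]),
    ("down", ["r0", "r0", "w0", "r0", "w1"]),
    ("down", ["r1", "r1", "w1", "r1", "w0"]),
    ("any",  ["r0"]),
]


class Memory:
    """Hypothetical 1-bit-wide memory with an optional transition fault."""

    def __init__(self, size, tf_cell=None):
        self.cells = [0] * size
        self.tf_cell = tf_cell  # this cell cannot go from 0 to 1 (illustrative fault)

    def write(self, addr, value):
        if addr == self.tf_cell and self.cells[addr] == 0 and value == 1:
            return  # the faulty cell silently keeps its old value
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def run_march(mem, size, test=MARCH_SS):
    """Apply the march test; return a list of (element, address, op) mismatches."""
    failures = []
    for element, (order, ops) in enumerate(test):
        addrs = range(size - 1, -1, -1) if order == "down" else range(size)
        for addr in addrs:
            for op in ops:
                value = int(op[1])
                if op[0] == "w":
                    mem.write(addr, value)
                elif mem.read(addr) != value:
                    failures.append((element, addr, op))
    return failures


ops_per_cell = sum(len(ops) for _, ops in MARCH_SS)
good = run_march(Memory(16), 16)
bad = run_march(Memory(16, tf_cell=5), 16)
```

In this model the fault-free memory produces no mismatches, while the faulty cell is first caught by the read-1 operations of the third element, right after the failed write of 1.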

[Flowchart: data cache tag test. Start. M0: create address (A), Prefetch block(A) (W tag), repeat until i = Nd. M1: create address (A), Load(A) (R hit), Prefetch block(A) (W tag), Load(A) (R hit), repeat until i = Nd. M2, M3, M4, M5. Performance counter = 0? No: error. Yes: test successful. K times? No: repeat. Yes: end.]

[Flowchart: instruction cache tag test. Start. M0: create address (A), Prefetch block(A) (W tag), repeat until i = Nd. M1: Disable cache, create address (A), Enable cache, Call(A) (R hit), Disable cache, Prefetch block(A), Enable cache, Call(A), Disable cache, repeat until i = Nd. M2, M3, M4, M5. Performance counter = 0? No: error. Yes: test successful. K times? No: repeat. Yes: end.]

OpenRISC 1200 [15]

        Type                    Write policy    Tag array
Cache   4 Kbyte direct mapped   Write through   256*20

Result:
n: total number of bits of the array
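The tag array size in the table can be cross-checked with a short calculation. This is a sketch under stated assumptions: 32-bit addresses, one tag entry per cache line, and no extra status bits (such as a valid bit) counted in the tag width. The variable names are illustrative.

```python
# Assumptions: 32-bit addresses, one tag per line, status bits not counted.
ADDRESS_BITS = 32
CACHE_BYTES = 4 * 1024   # 4 Kbyte direct-mapped cache (from the table)
TAG_ENTRIES = 256        # tag array depth (from the table)

line_bytes = CACHE_BYTES // TAG_ENTRIES          # bytes covered by one tag entry
offset_bits = line_bytes.bit_length() - 1        # log2 of the line size
index_bits = TAG_ENTRIES.bit_length() - 1        # log2 of the number of lines
tag_bits = ADDRESS_BITS - index_bits - offset_bits
n = TAG_ENTRIES * tag_bits                       # total number of bits of the array

print(line_bytes, offset_bits, index_bits, tag_bits, n)
```

Under these assumptions the line size is 16 bytes, the address splits into 20 tag bits, 8 index bits and 4 offset bits, and the 256*20 tag array holds n = 5120 bits.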

March C- algorithm
Write operation
- [14]: 6 instructions
- This paper: 2 instructions
Read verification
- [14]: 7 instructions
- This paper: 2 instructions

The method applies the March SS algorithm and reduces code size by using the performance counter and the prefetch instruction.
My comments
- It assumes a cache hit in order to detect cache behavior after a prefetch operation
- Some jargon is not explained, e.g., CF
- Definition of the fault descriptions
  - Some March tests do not guarantee complete fault coverage in all fault models