Presenter: Cheng-Ta Wu
David Lin¹, Ted Hong¹, Farzan Fallah¹, Nagib Hakim³, Subhasish Mitra¹,²
¹Department of EE and ²Department of CS, Stanford University, Stanford, CA, USA
³Intel Corporation, Santa Clara, CA, USA
DAC'12, June 3–7, 2012, San Francisco, CA, USA

We present a new technique for systematically creating post-silicon validation tests that quickly detect bugs in processor cores and uncore components (cache controllers, memory controllers, on-chip networks) of multi-core System on Chips (SoCs). Such quick detection is essential because long error detection latency, the time elapsed between the occurrence of an error due to a bug and its manifestation as an observable failure, severely limits the effectiveness of existing post-silicon validation approaches. In addition, we provide a list of realistic bug scenarios abstracted from “difficult” bugs that occurred in commercial multi-core SoCs. Our results for an OpenSPARC T2-like multi-core SoC demonstrate:
1. Error detection latencies of “typical” post-silicon validation tests can be very long, up to billions of clock cycles, especially for bugs in uncore components.
2. Our new technique shortens error detection latencies by several orders of magnitude to only a few hundred cycles for most bug scenarios.
3. Our new technique enables a 2-fold increase in bug coverage.
An important feature of our technique is its software-only implementation without any hardware modification. Hence, it is readily applicable to existing designs.

Typical post-silicon validation tests:
- Error detection latencies are very long.
- It is difficult to trace far enough back in the execution history to localize a bug.
- Expected output values are not checked in time.
This paper presents a new Proactive Load and Check (PLC) technique to shorten bug detection latencies.

[Hong 10] QED: Quick Error Detection Tests for Effective Post-Silicon Validation
This paper: extends QED with the PLC transformation.

Step 1: Initialization
- Transform the existing validation tests into new tests with PLC.
- Use the EDDI-V (“Error Detection by Duplicated Instructions for Validation”) transformation.
- Perform loads from selected variables.
- Insert self-consistency checks on those variables.

- Create PLC_List.
- Protect the listed variables against race conditions.
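The load-and-check idea in Step 1 can be sketched roughly as follows. This is an illustrative Python model, not the paper's implementation: the class name `PLCList`, its methods, and the choice to store each listed variable next to a duplicate copy (in the spirit of EDDI-V) are all assumptions made for the example. A lock stands in for whatever mechanism protects the listed variables against race conditions.

```python
import threading


class PLCList:
    """Hypothetical model of PLC_List: each selected variable is kept
    together with a duplicate copy (an assumption, not the paper's layout)."""

    def __init__(self):
        self._lock = threading.Lock()  # protects listed variables against races
        self._entries = {}             # name -> [original, duplicate]

    def store(self, name, value):
        # Write both copies under the lock so a check never observes
        # a half-updated pair.
        with self._lock:
            self._entries[name] = [value, value]

    def plc_check(self):
        """Load every listed variable and compare it with its duplicate.

        Returns the names whose two copies disagree; an empty list means
        the listed variables are self-consistent.
        """
        with self._lock:
            return [name for name, (orig, dup) in self._entries.items()
                    if orig != dup]

    def corrupt(self, name, bad_value):
        # Demonstration helper only: models a bug corrupting one copy.
        with self._lock:
            self._entries[name][0] = bad_value


plc_list = PLCList()
plc_list.store("a", 1)
plc_list.store("b", 2)
clean = plc_list.plc_check()       # no mismatch expected
plc_list.corrupt("b", 99)
mismatches = plc_list.plc_check()  # "b" should now be reported
```

In this model a mismatch is reported at the next check rather than when a corrupted value eventually reaches an observable output, which is the intuition behind the shorter detection latency.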

Step 2: PLC Operation Insertion
- The PLC transformation inserts PLC operations in each thread on each processor core.
- PLC_inst_min
  - Minimizes possible intrusiveness due to PLC operations.
  - The minimum number of instructions in the same thread that must execute before a PLC operation is inserted.
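A minimal sketch of the insertion step, assuming a thread's code can be modeled as a flat list of instruction strings. The function name `insert_plc_ops` and the marker `PLC_CHECK` are hypothetical. The source only states that PLC_inst_min is a minimum spacing; inserting a PLC operation exactly every PLC_inst_min instructions is a simplification chosen for the example.

```python
def insert_plc_ops(instructions, plc_inst_min, plc_op="PLC_CHECK"):
    """Return a copy of one thread's instruction list with a PLC operation
    inserted once at least plc_inst_min instructions have executed since
    the previous PLC operation (hypothetical model of the transformation)."""
    if plc_inst_min < 1:
        raise ValueError("plc_inst_min must be at least 1")

    transformed = []
    since_last_plc = 0  # instructions emitted since the last PLC operation
    for inst in instructions:
        transformed.append(inst)
        since_last_plc += 1
        if since_last_plc >= plc_inst_min:
            transformed.append(plc_op)
            since_last_plc = 0
    return transformed


# One thread with seven placeholder instructions and PLC_inst_min = 3.
thread_code = [f"inst{i}" for i in range(7)]
transformed = insert_plc_ops(thread_code, plc_inst_min=3)
```

A larger PLC_inst_min inserts fewer PLC operations and so perturbs the original test less, at the cost of checking less often.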

Environment:
- 8 processor cores, 64 threads.
- Private split L1 data and instruction caches.
- Crossbar-based interconnect.
- 8-way banked L2 cache using a directory-based cache coherence protocol.
- 4 memory controllers.

Benchmarks
- SPLASH-2: FFT, LU
- Proprietary industrial post-silicon validation test targeting memory bugs.
Results
- OERT (Original Equivalent RunTime tests)

- Several orders of magnitude improvement in error detection latencies.
- The error detection latencies of PLC tests are within a few hundred cycles.
- 2-fold improvement in the coverage of bug scenarios.

Post-silicon validation involves three activities:
- detecting a problem by applying proper stimuli
- localizing the problem to a small region inside the chip
- fixing the problem through software patches, circuit editing, or silicon re-spin.
The effort to localize the problem from an observed failure often dominates the cost of post-silicon validation.

Bug scenarios were obtained by analyzing “difficult” bugs (from proprietary bug databases) that occurred in the latest commercial multi-core SoCs (OpenSPARC T2-like). These bug scenarios are considered “difficult” because of the very long debug times indicated in the bug reports. Each bug scenario is decomposed into a bug activation criterion and a bug effect.

Bug activation criterion: the condition that must be satisfied to activate a bug.
- Criteria 1-4 correspond to cache controller bugs.
- Criterion 5 corresponds to bugs inside cache/memory controllers and on-chip networks.
- Criteria 6-8 correspond to processor core bugs.

Bug effect: the incorrect behavior resulting from bug activation.
- Effects A-E correspond to cache controller bugs.
- Effect F corresponds to memory controller bugs.
- Effect G corresponds to interconnection network bugs.
- Effects H-J correspond to bugs inside processor cores.
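The decomposition of a bug scenario into an activation criterion and an effect can be sketched as a small data structure. The component groupings below are taken from the two lists above. The class `BugScenario` and the helper names are hypothetical, and the actual criterion and effect descriptions (which live in the paper's tables) are not reproduced here.

```python
from dataclasses import dataclass


def criterion_component(criterion: int) -> str:
    """Map an activation criterion number (1-8) to the component it targets."""
    if 1 <= criterion <= 4:
        return "cache controller"
    if criterion == 5:
        return "cache/memory controller and on-chip network"
    if 6 <= criterion <= 8:
        return "processor core"
    raise ValueError(f"unknown activation criterion: {criterion}")


def effect_component(effect: str) -> str:
    """Map a bug effect letter (A-J) to the component it targets."""
    if effect in {"A", "B", "C", "D", "E"}:
        return "cache controller"
    if effect == "F":
        return "memory controller"
    if effect == "G":
        return "interconnection network"
    if effect in {"H", "I", "J"}:
        return "processor core"
    raise ValueError(f"unknown bug effect: {effect}")


@dataclass(frozen=True)
class BugScenario:
    """Hypothetical record: one activation criterion paired with one effect."""
    criterion: int
    effect: str

    def describe(self) -> str:
        return (f"criterion {self.criterion} ({criterion_component(self.criterion)}) "
                f"with effect {self.effect} ({effect_component(self.effect)})")


scenario = BugScenario(criterion=2, effect="A")
summary = scenario.describe()
```

Keeping the criterion and the effect as separate fields makes it straightforward to enumerate many scenarios from a small set of each.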

Families of bug scenarios are created by adjusting the integer parameters X and Y in Tables 1a and 1b. For example, pairing bug activation criterion 2, for X=10, with bug effect A produces the following bug scenario: