Architectural Complexity: Opening the Black Box. Methods for Exposing Internal Functionality of Complex Single and Multiple Processor Systems. EECC-756.

Presentation transcript:

1 Architectural Complexity: Opening the Black Box
Methods for Exposing Internal Functionality of Complex Single and Multiple Processor Systems
EECC-756

2 Modern Design Trends
Larger on-chip caches
Extended levels of cache
System-on-a-chip integration
Overall increasing design complexity
All of these make designs harder to debug

3 The Good News
Automated design tools are minimizing design errors
IP reuse minimizes bugs
Simulation tools discover most logic errors before fabrication
Massive test suites allow comprehensive testing
So what happened to Intel with the Pentium FPU (FDIV) flaw?

4 Past Methods for Debugging
Signal probing
Bus monitoring
Software debugging

5 Past Methods for Debugging (cont’d)
Signal probing
  – More internal logic per pin means less information is visible on each pin
  – Pins are inaccessible in modern packages (e.g. sockets, BGAs)
Bus monitoring
  – Caches hide data accesses
Software debugging
  – Impractical for real-time applications
  – Little or no hardware support in the past

6 Solutions
Test Access Port (TAP)
  – Uses the JTAG (IEEE 1149.1) specification for boundary scan
Probe Mode
  – Allows step-by-step analysis of the code's impact on internal registers
In-Circuit Emulation (ICE)
  – Allows execution tracing
  – Real-time applicability

7 Test Access Port (TAP)
Implements boundary scan per the JTAG (IEEE 1149.1) specification
Allows access to all internal flip-flops in a boundary scan chain
Numerous chains serve different functions (e.g. I/O flip-flops)
Allows a non-destructive snapshot of internal state at any point in time

8 Test Access Port (cont’d)
Single instruction register
Multiple data registers (scan chains)
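The relationship between the single instruction register and the multiple scan chains can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the IEEE 1149.1 state machine: the class `TapModel`, its method names, and the chain names are all hypothetical, and the snapshot recirculates bits from TDO back to TDI, whereas real boundary-scan cells capture a copy of the state before shifting.

```python
from collections import deque


class TapModel:
    """Toy model: one instruction register selects one of several scan chains."""

    def __init__(self, chains):
        # chains: hypothetical mapping of instruction name -> flip-flop bits,
        # listed from the TDI end (index 0) to the TDO end (last index).
        self.chains = {name: deque(bits) for name, bits in chains.items()}
        self.instruction = None

    def load_instruction(self, name):
        """Select which data register sits between TDI and TDO."""
        if name not in self.chains:
            raise KeyError(f"unknown instruction: {name}")
        self.instruction = name

    def shift_dr(self, tdi_bit):
        """One shift clock: a bit leaves on TDO, tdi_bit enters at TDI."""
        chain = self.chains[self.instruction]
        tdo_bit = chain.pop()
        chain.appendleft(tdi_bit)
        return tdo_bit

    def snapshot(self):
        """Read the selected chain without changing it (TDO fed back to TDI)."""
        observed = []
        for _ in range(len(self.chains[self.instruction])):
            chain = self.chains[self.instruction]
            observed.append(self.shift_dr(chain[-1]))
        return observed  # bits in the order they appeared on TDO
```

After a full-length recirculating shift the chain holds its original contents again, which is the sense in which the snapshot in this sketch is non-destructive.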

9 Probe Mode
Special processor mode halts program execution
Uses the TAP interface to receive instructions and output internal data
Allows read/write access to any internal register
Allows memory accesses to test cache functionality
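As a rough illustration of the halt-then-inspect idea, the Python sketch below models a processor that accepts probe commands only while halted. Everything in it is an assumption made for illustration: the class `ProbeModeCpu`, the register names `pc` and `r0`, and the command methods do not correspond to any real processor's probe interface.

```python
class ProbeModeCpu:
    """Toy processor with a halt-and-inspect mode (all names hypothetical)."""

    def __init__(self):
        self.registers = {"pc": 0, "r0": 0}
        self.in_probe_mode = False

    def step(self):
        """Run one instruction of normal code; does nothing while halted."""
        if self.in_probe_mode:
            return False
        self.registers["pc"] += 1
        self.registers["r0"] += 2  # stand-in for the program's effect on state
        return True

    def enter_probe_mode(self):
        self.in_probe_mode = True

    def exit_probe_mode(self):
        self.in_probe_mode = False

    def probe_read(self, name):
        self._require_probe_mode()
        return self.registers[name]

    def probe_write(self, name, value):
        self._require_probe_mode()
        if name not in self.registers:
            raise KeyError(f"unknown register: {name}")
        self.registers[name] = value

    def _require_probe_mode(self):
        if not self.in_probe_mode:
            raise RuntimeError("probe commands are only accepted while halted")
```

A debugger built on this model would alternate between `exit_probe_mode()`, a single `step()`, and `enter_probe_mode()` to observe the effect of each instruction on the registers.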

10 Probe Mode (cont’d)

11 In-Circuit Emulation (ICE) Support
Special pins provide branching information
Example: Pentium dual pipeline, 3 dedicated pins
  – IU: asserted when an instruction completes in the U instruction pipeline
  – IV: asserted when an instruction completes in the V instruction pipeline
  – IBT (Instruction Branch Taken): asserted when a branch is taken
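The way external logic might use these three pins can be sketched as follows. This is a minimal Python sketch, assuming the pins are sampled once per clock; the names `PinSample` and `summarize_trace` are hypothetical and the sampling scheme is simplified.

```python
from typing import Iterable, List, NamedTuple, Tuple


class PinSample(NamedTuple):
    """State of the three trace pins on one clock (hypothetical container)."""
    iu: bool   # instruction completed in the U pipeline
    iv: bool   # instruction completed in the V pipeline
    ibt: bool  # a branch was taken


def summarize_trace(samples: Iterable[PinSample]) -> Tuple[int, List[int]]:
    """Count completed instructions and note where taken branches occurred.

    Returns (total_completed, branch_points), where each branch point is the
    running instruction count at the clock on which IBT was asserted.
    """
    completed = 0
    branch_points = []
    for sample in samples:
        completed += int(sample.iu) + int(sample.iv)
        if sample.ibt:
            branch_points.append(completed)
    return completed, branch_points
```

Between two taken branches execution is sequential, so an emulator that also knows the program image can combine these counts with branch targets to follow the code path.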

12 In-Circuit Emulation (cont’d)
Branch signal information provides real-time code tracing
Branch trace message buffers provide further information
Branch trace message buffers in conjunction with Probe Mode allow detailed real-time code tracing

13 Branch Trace Message Buffers
FIFO queue
Can be read through the TAP during program execution
Circular mode (trace back from a breakpoint) vs. Jump-to-Probe Mode (maintain the instruction stream)
Incident counter expands the effective buffer size
Intel processors automatically generate a special BTM cycle on the local bus to export BTM information
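The circular mode and the incident counter can be illustrated with a small Python sketch. It rests on one assumption that the slides do not confirm: the incident counter is read here as a repeat count that collapses consecutive identical branches (for example a loop back-edge) into one entry. The class `BranchTraceBuffer` and its methods are hypothetical.

```python
from collections import deque


class BranchTraceBuffer:
    """Toy circular branch trace buffer with a per-entry repeat count."""

    def __init__(self, depth):
        # A bounded deque drops the oldest entry when full (circular mode).
        self.entries = deque(maxlen=depth)

    def record(self, source, target):
        """Log a taken branch from address `source` to address `target`."""
        if self.entries and self.entries[-1][:2] == (source, target):
            # Assumed incident counter: bump the count, use no new entry.
            src, tgt, count = self.entries[-1]
            self.entries[-1] = (src, tgt, count + 1)
        else:
            self.entries.append((source, target, 1))

    def read_out(self):
        """Return entries oldest first, as a trace-back from a breakpoint."""
        return list(self.entries)
```

With a depth of two, a loop branch taken three times still occupies a single entry, so more history survives than the raw depth would suggest.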

14 Branch Trace Buffer Logic Implementation

15 Multiprocessor Issues
Three methods for opening the “black box” on a single-processor system
  – TAP (boundary scan)
  – Probe Mode
  – Branch tracing methods for ICE
Multiple-processor system design has its own challenges

16 Multiprocessor Challenges
Race conditions due to parallel data accesses
Inconsistent and unpredictable network paths
Differing processor behaviors on heterogeneous networks
Communication patterns that restrict performance or scalability
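The first challenge, race conditions, can be made concrete with a deterministic Python sketch that enumerates every interleaving of two processors incrementing a shared value. The function name and the two-step read/write model are assumptions chosen for illustration; no real threads are involved.

```python
from itertools import permutations


def possible_final_values():
    """Enumerate interleavings of two unsynchronized increments.

    Each processor reads the shared value into a private copy, then writes
    back the copy plus one. Only orders in which a processor reads before
    it writes are legal.
    """
    ops = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
    results = set()
    for order in permutations(ops):
        if order.index(("A", "read")) > order.index(("A", "write")):
            continue
        if order.index(("B", "read")) > order.index(("B", "write")):
            continue
        shared = 0
        private = {}
        for proc, op in order:
            if op == "read":
                private[proc] = shared
            else:
                shared = private[proc] + 1
        results.add(shared)
    return results
```

The same program yields either 1 or 2 depending on timing alone, which is why such bugs are hard to reproduce without visibility into the order of accesses.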

17 Multiprocessor Solutions: Debugging Code
Create a sequential version of the code
Execute parallel tasks on a single computer as separate processes
Visualization tools that create space-time diagrams or animations to show 2-dimensional changes of state
Unified Trace Environment (IBM)
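A space-time diagram places processors on one axis and time on the other. The Python sketch below renders one as text from a list of trace events; the event format `(time, processor, symbol)` and the function name are hypothetical, and this is not the output format of any particular tool.

```python
def space_time_diagram(events, num_procs, duration):
    """Render trace events as one text row per processor.

    events: iterable of (time, processor, symbol) tuples, where symbol is a
    single character such as "S" for send or "R" for receive.
    Idle time steps are shown as ".".
    """
    rows = [["."] * duration for _ in range(num_procs)]
    for time, proc, symbol in events:
        rows[proc][time] = symbol
    return [f"P{proc} " + "".join(row) for proc, row in enumerate(rows)]
```

For example, calling it with events `[(0, 0, "S"), (2, 1, "R"), (3, 1, "S"), (5, 0, "R")]`, two processors, and a duration of six gives the rows `P0 S....R` and `P1 ..RS..`, which makes the round trip between the two processors visible at a glance.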

18 Multiprocessor Solutions: Debugging Designs
Ability to monitor communication packets circumvents most visibility problems
  – Debug messages can be included in a packet
Network protocol simulations
  – Protocol verification programs (e.g. Petri nets)
  – Network communication pattern simulators
However...

19 Multiprocessor Design Trends
Currently, uniprocessor designs are hitting roadblocks
  – Large dies are impractical because of signal transit time
  – Routing increases exponentially with die size
One possible solution: multiple processors on a single die
Re-emergence of visibility problems

20 Conclusion
Several methods are available for internal execution tracing of uniprocessors
  – Test Access Port (JTAG, IEEE 1149.1)
  – Probe Mode extension
  – Branch tracing
Don’t count out TAP, Probe Mode, and ICE for multiprocessors