PAC ISS Zong-Cing Lin PAS lab, CSIE, NTU.



Outline
  - Introduction
  - PAC ISS
  - Architecture
  - Pipeline
  - Instruction packet

Introduction
  - PACDSP: a VLIW DSP
  - High performance and low cost for multimedia applications.
  - Suitable for products that require multi-standard CODEC support:
      - PMP (portable media player)
      - Smart phone
      - TV controller
  - Low cost by VLIW

Architecture
  - CFU: customized function in

Pipeline
  - Instruction Fetch
  - Instruction Memory Access
      - One-cycle latency to access the instruction memory
  - Instruction Dispatch
      - Dispatches the instructions of a VLIW packet into their corresponding slots
  - Instruction Decode

Pipeline (cont'd)
  - Read Operand
  - Execution 1
      - Used by most datapath function units
  - Execution 2
      - Multi-cycle instructions
      - Sends control signals to the data memory
  - Execution 3
      - Processes data loaded from the data memory
  - Register Write-back
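The stage list from the two Pipeline slides can be written down as an ordered enumeration. This is only a sketch: the identifier names and the helper `stage_at` are illustrative, the numbering simply follows the order in which the slides list the stages, and the one-stage-per-cycle, no-stall timing is an assumption that the slides do not state.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Pipeline stages, numbered in the order the slides list them."""
    INSTRUCTION_FETCH = 1
    INSTRUCTION_MEMORY_ACCESS = 2
    INSTRUCTION_DISPATCH = 3
    INSTRUCTION_DECODE = 4
    READ_OPERAND = 5
    EXECUTION_1 = 6
    EXECUTION_2 = 7
    EXECUTION_3 = 8
    REGISTER_WRITE_BACK = 9


def stage_at(issue_cycle: int, cycle: int) -> Optional[Stage]:
    """Return the stage occupied at `cycle` by an instruction fetched at `issue_cycle`.

    Assumption (not from the slides): one stage per cycle and no stalls.
    Returns None when the instruction is not in the pipeline at that cycle.
    """
    offset = cycle - issue_cycle
    if 0 <= offset < len(Stage):
        return Stage(offset + 1)
    return None


if __name__ == "__main__":
    # Walk one instruction through the pipeline, cycle by cycle.
    for cycle in range(len(Stage)):
        print(cycle, stage_at(0, cycle).name)
```

Under that idealized timing, an instruction fetched at cycle 0 reaches Read Operand at cycle 4 and leaves the pipeline after cycle 8.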

PSCU
  - Program Sequence Control Unit
      - Controls the program flow.
      - Dispatches instructions to the scalar unit and the VLIW datapath.
  - In terms of the pipeline architecture, the PSCU covers the first four stages.

Scalar
  - Scalar unit
      - Handles control-based tasks for PACDSP.
      - Simple data-computing capacity, like a RISC machine inside PACDSP.
  - Main functions:
      - Program flow control function
      - Data processing function
      - Memory access function
      - Data transfer function
  - Registers:
      - General purpose scalar registers (R0-R15)
      - System registers (SR0-SR15)
      - Predication registers (P0-P15)

VLIW datapath
  - Two clusters; four-way
  - Arithmetic Unit
      - Arithmetic and comparison instructions
      - Data transfer instructions
      - Bit manipulation instructions
      - Multiplication and accumulation instructions
      - Special instructions
  - Load/Store Unit
      - Arithmetic and comparison instructions
      - Data transfer instructions
      - Bit manipulation instructions
      - Load/Store instructions
      - Supports double load/store instructions
      - Special instructions
  - Speaker note: briefly describe that both units start work at the same time in the fifth pipeline stage, but the Load/Store Unit may finish later.

VLIW datapath (cont'd)
  - Registers:
      - Ping-pong register file (D0-D7 & D8-D15)
      - Accumulator registers (AC0-AC7)
      - Address registers (A0-A7)
      - Constant registers (C0-C7)
      - Control flags (CF0-CF7)
  - Speaker note: the name "ping-pong" may come from the fact that the AU and LS can read and write different register groups.
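The register ranges above can be enumerated mechanically, which is handy when checking names in a dump. The following is a small sketch, not part of PAC ISS: the function names are made up for illustration, and mapping D0-D7 to bank 0 and D8-D15 to bank 1 is an assumed labeling of the two ping-pong groups.

```python
# Number of registers per class in one VLIW cluster, taken from the slide's ranges.
REGISTER_CLASSES = {
    "D": 16,   # ping-pong register file: D0-D7 and D8-D15
    "AC": 8,   # accumulator registers AC0-AC7
    "A": 8,    # address registers A0-A7
    "C": 8,    # constant registers C0-C7
    "CF": 8,   # control flags CF0-CF7
}


def register_names(prefix: str) -> list:
    """List every register name of one class, e.g. 'AC' -> ['AC0', ..., 'AC7']."""
    return [f"{prefix}{i}" for i in range(REGISTER_CLASSES[prefix])]


def ping_pong_bank(name: str) -> int:
    """Return 0 for D0-D7 and 1 for D8-D15 (bank labels are an assumption)."""
    if not name.startswith("D") or not name[1:].isdigit():
        raise ValueError(f"not a ping-pong register: {name!r}")
    index = int(name[1:])
    if not 0 <= index < REGISTER_CLASSES["D"]:
        raise ValueError(f"register index out of range: {name!r}")
    return index // 8


if __name__ == "__main__":
    for reg in register_names("D"):
        print(reg, "bank", ping_pong_bank(reg))
```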

Instruction Packet

  Instruction slot   Instruction types
  0                  PSCU / Scalar instructions
  1                  VLIW Load/Store instructions (cluster 1)
  2                  VLIW Arithmetic instructions (cluster 1)
  3                  VLIW Load/Store instructions (cluster 2)
  4                  VLIW Arithmetic instructions (cluster 2)

  - Speaker note: briefly mention that an instruction broadcast feature is provided, between slots 1 and 3 and between slots 2 and 4.
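The slot layout and the broadcast pairing mentioned in the slide's note can be captured in a small lookup table. This is a hedged sketch: numbering the PSCU / Scalar slot as 0 is an assumption (the transcript gives no number for it), and the names `Slot`, `PACKET_SLOTS`, and `broadcast_partner` are illustrative rather than part of any PAC tool. The sketch records only which slots are paired, not how the hardware performs the broadcast.

```python
from typing import NamedTuple, Optional


class Slot(NamedTuple):
    unit: str               # which unit executes the instruction in this slot
    cluster: Optional[int]  # VLIW cluster number, or None for PSCU / Scalar


PACKET_SLOTS = {
    0: Slot("PSCU / Scalar", None),   # slot number 0 is an assumption
    1: Slot("VLIW Load/Store", 1),
    2: Slot("VLIW Arithmetic", 1),
    3: Slot("VLIW Load/Store", 2),
    4: Slot("VLIW Arithmetic", 2),
}

# Broadcast pairs named in the slide's note: slots 1 and 3, slots 2 and 4.
BROADCAST_PAIRS = {1: 3, 3: 1, 2: 4, 4: 2}


def broadcast_partner(slot: int) -> Optional[int]:
    """Return the slot paired with `slot` for broadcast, or None if it has none."""
    if slot not in PACKET_SLOTS:
        raise ValueError(f"unknown slot: {slot}")
    return BROADCAST_PAIRS.get(slot)


if __name__ == "__main__":
    for number, slot in PACKET_SLOTS.items():
        print(number, slot.unit, slot.cluster, broadcast_partner(number))
```

Each pair joins the same kind of unit in the two clusters, which is why the table can be checked by comparing the `unit` field of a slot with that of its partner.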

PAC ISS
  - Cycle-accurate instruction set simulator (ISS)
  - It can dump register and memory contents cycle by cycle.
  - It can simulate more than 10000 cycles per second on average.
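The quoted simulation speed gives a quick way to budget simulation time. The sketch below treats 10000 cycles per second as a lower bound on the average rate, so the result is a rough upper estimate only; the function name is illustrative, and real run time will vary with the host machine and with the dump options in use.

```python
# "More than 10000 cycles per second on average", taken as a lower bound.
MIN_CYCLES_PER_SECOND = 10_000


def max_expected_seconds(simulated_cycles: int) -> float:
    """Rough upper estimate of wall-clock time for a run of `simulated_cycles`."""
    if simulated_cycles < 0:
        raise ValueError("cycle count must be non-negative")
    return simulated_cycles / MIN_CYCLES_PER_SECOND


if __name__ == "__main__":
    # A one-million-cycle program should need at most about 100 seconds.
    print(max_expected_seconds(1_000_000))
```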

Execution Flow of PAC ISS (I)
  - Set the ISS configuration file to dump register or memory values. See demo!!

  Row   Meaning                                                Range
  0     Dump the registers of cluster 1 if set                 1: dump; 0: don't dump
  1     Dump the registers of cluster 2 if set
  2     Dump the registers of the scalar unit if set
  3     Dump the predicate and branch registers if set
  4     Dump the control registers if set
  5     Dump the data memory if set
  6     Start address of the dumped memory (in hex)            00000000~0000fffe
  7     End address of the dumped memory (in hex)              00000001~0000ffff
  8     Dump the constant registers of cluster 1 if set
  9     Dump the constant registers of cluster 2 if set
  10    Boot-standalone flag                                   Not used now.
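The rows of the configuration file can be modeled as a small record that validates the address ranges before producing the row values. This is a sketch under stated assumptions: the slide does not show the file syntax, so rendering one value per row (flags as 0 or 1, addresses as eight hex digits) is a guess, as are the first row being row 0, the 0/1 encoding for every flag row, and the requirement that the start address lie below the end address. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DumpConfig:
    """One field per configuration row; field names are illustrative."""
    dump_cluster1: bool = False          # row 0 (row number assumed)
    dump_cluster2: bool = False          # row 1
    dump_scalar: bool = False            # row 2
    dump_predicate_branch: bool = False  # row 3
    dump_control: bool = False           # row 4
    dump_data_memory: bool = False       # row 5
    mem_start: int = 0x00000000          # row 6, allowed 00000000~0000fffe
    mem_end: int = 0x0000FFFF            # row 7, allowed 00000001~0000ffff
    dump_const_cluster1: bool = False    # row 8
    dump_const_cluster2: bool = False    # row 9
    boot_standalone: bool = False        # row 10, not used for now

    def to_rows(self) -> List[str]:
        """Return the 11 row values as strings (the textual format is assumed)."""
        if not 0x00000000 <= self.mem_start <= 0x0000FFFE:
            raise ValueError("start address outside 00000000~0000fffe")
        if not 0x00000001 <= self.mem_end <= 0x0000FFFF:
            raise ValueError("end address outside 00000001~0000ffff")
        if self.mem_start >= self.mem_end:
            # Assumption: the two ranges suggest start must lie below end.
            raise ValueError("start address must be below end address")

        def flag(value: bool) -> str:
            return "1" if value else "0"

        return [
            flag(self.dump_cluster1),
            flag(self.dump_cluster2),
            flag(self.dump_scalar),
            flag(self.dump_predicate_branch),
            flag(self.dump_control),
            flag(self.dump_data_memory),
            format(self.mem_start, "08x"),
            format(self.mem_end, "08x"),
            flag(self.dump_const_cluster1),
            flag(self.dump_const_cluster2),
            flag(self.boot_standalone),
        ]


if __name__ == "__main__":
    config = DumpConfig(dump_data_memory=True, mem_start=0x100, mem_end=0x1FF)
    print("\n".join(config.to_rows()))
```

Keeping the range checks in one place means an out-of-range address is rejected before any file is written, instead of surfacing later as a confusing dump.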

Execution Flow of PAC ISS (II)
  - Options:
      o: specify that the input file is in ELF format (essential)
      g: run ISS with GNU GDB
      d: dump the registers and internal memory content cycle by cycle
      s: step-by-step execution
      r: display the register contents on stdout
      m: display the internal memory content on stdout

Execution Flow of PAC ISS (II) (cont'd)
  - Options:
      l: set the memory model to interleaving mode
      p: pre-load data into memory
      c: show the PC value in the dump file
      f: dump the final cycle content
      i: specify the local memory size in MB

Demo (I)
  - See the basic information after simulation
  - Use file to check the input file type
  - Options:
      o: specify that the input file is in ELF format
  - Local memory size

Demo (II)
  - Options:
      s: step-by-step execution
      r: display the register contents on stdout
      o: specify that the input file is in ELF format

Demo (III)
  - Dump cycle by cycle
  - Options:
      c: show the PC value in the dump file
      d: dump the registers and internal memory content cycle by cycle
      o: specify that the input file is in ELF format

Demo (IV)
  - Run ISS with GDB
  - Options: