Lecture 15 VHDL Modeling of Microprocessors.

Slides:



Advertisements
Similar presentations
3-Software Design Basics in Embedded Systems
Advertisements

Chapter 3 General-Purpose Processors: Software
Arithmetic Logic Unit (ALU)
Mehmet Can Vuran, Instructor University of Nebraska-Lincoln Acknowledgement: Overheads adapted from those provided by the authors of the textbook.
Topics covered: CPU Architecture CSE 243: Introduction to Computer Architecture and Hardware/Software Interface.
Computer Organization and Architecture
Chapter 8. Pipelining. Instruction Hazards Overview Whenever the stream of instructions supplied by the instruction fetch unit is interrupted, the pipeline.
Computer Organization and Architecture
Computer Organization and Architecture
Computer Systems. Computer System Components Computer Networks.
Processor Technology and Architecture
Execution of an instruction
Chapter XI Reduced Instruction Set Computing (RISC) CS 147 Li-Chuan Fang.
Chapter 3 General-Purpose Processors: Software
Stored Program Concept: The Hardware View
General Purpose Processors: Software This Week In DIG II  Introduction  Basic Architecture  Operation  Programmer’s view (that would be you !) 
Topics covered: CPU Architecture CSE 243: Introduction to Computer Architecture and Hardware/Software Interface.
Chapter 4 Processor Technology and Architecture. Chapter goals Describe CPU instruction and execution cycles Explain how primitive CPU instructions are.
Recap – Our First Computer WR System Bus 8 ALU Carry output A B S C OUT F 8 8 To registers’ input/output and clock inputs Sequence of control signal combinations.
Embedded Systems Design: A Unified Hardware/Software Introduction 1 Chapter 3 General-Purpose Processors: Software ECE 4330 Embedded System Design.
Henry Hexmoor1 Chapter 10- Control units We introduced the basic structure of a control unit, and translated assembly instructions into a binary representation.
Embedded Systems Design: A Unified Hardware/Software Introduction 1 Chapter 3 General-Purpose Processors: Software.
General Purpose Processors: Software. This Week In DIG II  Introduction  Basic Architecture  Memory  Programmer’s view (that would be you !)  Development.
Computer Organization and Assembly language
Computer ArchitectureFall 2007 © Sep 10 th, 2007 Majd F. Sakr CS-447– Computer Architecture.
General Purpose Processors: Software. This Week In DIG II  Introduction  Basic Architecture  Memory  Programmer’s view (that would be you !)  Development.
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 27: Single-Cycle CPU Datapath Design Instructor: Sr Lecturer SOE Dan Garcia
Group 5 Alain J. Percial Paula A. Ortiz Francis X. Ruiz.
CH12 CPU Structure and Function
Micro-operations Are the functional, or atomic, operations of a processor. A single micro-operation generally involves a transfer between registers, transfer.
An Introduction Chapter Chapter 1 Introduction2 Computer Systems  Programmable machines  Hardware + Software (program) HardwareProgram.
Embedded Systems Design: A Unified Hardware/Software Introduction 1 Microprocessors.
Lecture #32 Page 1 ECE 4110–5110 Digital System Design Lecture #32 Agenda 1.Improvements to the von Neumann Stored Program Computer Announcements 1.N/A.
1 3-Software Design Basics in Embedded Systems. 2 Two Memory Architectures Processor Program memory Data memory Processor Memory (program and data) HarvardPrinceton.
Multiple-bus organization
George Mason University ECE 448 – FPGA and ASIC Design with VHDL Experiment 7 VHDL Modeling of Embedded Microprocessors and Microcontrollers.
Chapter 4 The Von Neumann Model
1 3-Software Design Basics in Embedded Systems Optimizing the design of General Purpose processors.
Embedded Systems Design: A Unified Hardware/Software Introduction, (c) 2000 Vahid/Givargis 1 Basic Architecture Control unit and datapath –Note similarity.
Embedded Systems Design: A Unified Hardware/Software Introduction 1 Chapter 3 General-Purpose Processors: Software.
Computer Architecture Lecture 03 Fasih ur Rehman.
Computer and Information Sciences College / Computer Science Department CS 206 D Computer Organization and Assembly Language.
Processor Structure and Function Chapter8:. CPU Structure  CPU must:  Fetch instructions –Read instruction from memory  Interpret instructions –Instruction.
Digital Computer Concept and Practice Copyright ©2012 by Jaejin Lee Control Unit.
Elements of Datapath for the fetch and increment The first element we need: a memory unit to store the instructions of a program and supply instructions.
1 3-Software Design Basics in Embedded Systems. 2 Two Memory Architectures Processor Program memory Data memory Processor Memory (program and data) HarvardPrinceton.
8085 INTERNAL ARCHITECTURE.  Upon completing this topic, you should be able to: State all the register available in the 8085 microprocessor and explain.
Digital Computer Concept and Practice Copyright ©2012 by Jaejin Lee Control Unit.
Chapter 11 System Performance Enhancement. Basic Operation of a Computer l Program is loaded into memory l Instruction is fetched from memory l Operands.
Recap – Our First Computer WR System Bus 8 ALU Carry output A B S C OUT F 8 8 To registers’ read/write and clock inputs Sequence of control signal combinations.
CPU (Central Processing Unit). The CPU is the brain of the computer. Sometimes referred to simply as the processor or central processor, the CPU is where.
3-Software Design Basics in Embedded Systems
Basic Computer Organization and Design
Control Unit Lecture 6.
William Stallings Computer Organization and Architecture 8th Edition
Chapter 3 General-Purpose Processors: Software
Basic Processing Unit Unit- 7 Engineered for Tomorrow CSE, MVJCE.
Guest Lecturer TA: Shreyas Chand
Chapter 3 General-Purpose Processors: Software
Fundamental Concepts Processor fetches one instruction at a time and perform the operation specified. Instructions are fetched from successive memory locations.
Chapter 3 General-Purpose Processors: Software
Microprocessors.
General Purpose Processor : Software
Course Code 114 Introduction to Computer Science
Chapter 4 The Von Neumann Model
Presentation transcript:

Lecture 15 VHDL Modeling of Microprocessors

Outline VHDL Modeling of Microprocessors Single purpose processors versus general purpose processors Microcontroller architecture overview Simple Microcontroller design Fibonacci sequence on Simple Microcontroller

Resources Vahid and Givargis, Embedded System Design: A Unified Hardware/Software Introduction Chapter 3, General Purpose Processors: Software

Single Versus General Purpose Processors Thus far we have designed single-purpose processors The controller is "hard-coded" to perform one task only Examples: MIN_MAX_AVR, sort, subway ticket dispensing machine A general purpose processor is able to perform many tasks The control unit is not "hard-coded" to perform a specific task Rather it reads instructions from memory and executes them Can implement any function as long as the instruction set supports it Due to its general nature, functions executing on general purpose processors are usually slower than on single-purpose processors "Software implementation"  funciton run on general purpose processor "Hardware implementation"  function run on single purpose processor Many of the following come from the excellent book: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Introduction General-Purpose Processor Processor designed for a variety of computation tasks Low unit cost, in part because manufacturer spreads NRE (non-recurring engineering cost) over large numbers of units Motorola sold half a billion 68HC05 microcontrollers in 1996 alone Carefully designed since higher NRE is acceptable Can yield good performance, size and power Low NRE cost, short time-to-market/prototype, high flexibility User just writes software; no processor design a.k.a. “microprocessor” – “micro” used when they were implemented on one or a few chips rather than entire rooms Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Basic Architecture Control unit and datapath Key differences Note similarity to single-purpose processor Key differences Datapath is general Control unit doesn’t store the algorithm – the algorithm is “programmed” into the memory Processor Control unit Datapath ALU Registers IR PC Controller Memory I/O Control /Status Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Datapath Operations Load Read memory location into register Processor Control unit Datapath ALU Controller +1 Control /Status ALU operation Input certain registers through ALU, store back in register Registers 10 11 Store Write register to memory location PC IR I/O ... Memory 10 11 ... Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Control Unit ... Control unit: configures the datapath operations Sequence of desired operations (“instructions”) stored in memory – “program” Instruction cycle – broken into several sub-operations, each one clock cycle, e.g.: Fetch: Get next instruction into IR Decode: Determine what the instruction means Fetch operands: Move data from memory to datapath register Execute: Move data through the ALU Store results: Write data from register to memory Processor Control unit Datapath ALU Registers IR PC Controller Memory I/O Control /Status 10 ... load R0, M[500] 500 501 100 inc R1, R0 101 store M[501], R1 102 R0 R1 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Control Unit Sub-Operations Fetch Get next instruction into IR PC: program counter, always points to next instruction IR: holds the fetched instruction Processor Control unit Datapath ALU Controller Control /Status Registers PC 100 IR R0 R1 load R0, M[500] I/O ... Memory 100 load R0, M[500] 500 10 101 inc R1, R0 501 ... 102 store M[501], R1 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Control Unit Sub-Operations Decode Determine what the instruction means Processor Control unit Datapath ALU Controller Control /Status Registers PC 100 IR R0 R1 load R0, M[500] I/O ... Memory 100 load R0, M[500] 500 10 101 inc R1, R0 501 ... 102 store M[501], R1 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Control Unit Sub-Operations Fetch operands Move data from memory to datapath register Processor Control unit Datapath ALU Controller Control /Status Registers 10 PC 100 IR R0 R1 load R0, M[500] I/O ... Memory 100 load R0, M[500] 500 10 101 inc R1, R0 501 ... 102 store M[501], R1 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Control Unit Sub-Operations Execute Move data through the ALU This particular instruction does nothing during this sub-operation Processor Control unit Datapath ALU Controller Control /Status Registers 10 PC 100 IR R0 R1 load R0, M[500] I/O ... Memory 100 load R0, M[500] 500 10 101 inc R1, R0 501 ... 102 store M[501], R1 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Control Unit Sub-Operations Store results Write data from register to memory This particular instruction does nothing during this sub-operation Processor Control unit Datapath ALU Controller Control /Status Registers 10 PC 100 IR R0 R1 load R0, M[500] I/O ... Memory 100 load R0, M[500] 500 10 101 inc R1, R0 501 ... 102 store M[501], R1 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Instruction Cycles ... PC=100 Fetch ops Store results Fetch Decode Processor Control unit Datapath ALU Registers IR PC Controller Memory I/O Control /Status 10 ... load R0, M[500] 500 501 100 inc R1, R0 101 store M[501], R1 102 R0 R1 10 Fetch ops Store results Fetch load R0, M[500] Decode Exec. clk 100 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Instruction Cycles ... PC=100 PC=101 Fetch ops Store results Fetch Processor Control unit Datapath ALU Registers IR PC Controller Memory I/O Control /Status 10 ... load R0, M[500] 500 501 100 inc R1, R0 101 store M[501], R1 102 R0 R1 Fetch ops Store results Fetch Decode Exec. clk +1 11 Exec. PC=101 Fetch ops Store results inc R1, R0 Fetch Decode clk 10 101 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Instruction Cycles ... PC=100 PC=101 PC=102 Fetch ops Store results Processor Control unit Datapath ALU Registers IR PC Controller Memory I/O Control /Status 10 ... load R0, M[500] 500 501 100 inc R1, R0 101 store M[501], R1 102 R0 R1 Fetch ops Store results Fetch Decode Exec. clk PC=101 Fetch ops Store results Fetch Decode Exec. Decode clk 10 11 11 Store results 102 store M[501], R1 Fetch PC=102 Fetch ops Exec. clk Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Architectural Considerations N-bit processor N-bit ALU, registers, buses, memory data interface Embedded: 8-bit, 16-bit, 32-bit common Desktop/servers: 32-bit, even 64 PC size determines address space Processor Control unit Datapath ALU Registers IR PC Controller Memory I/O Control /Status Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Architectural Considerations Clock frequency Inverse of clock period Must be longer than longest register to register delay in entire processor Memory access is often the longest Processor Control unit Datapath ALU Registers IR PC Controller Memory I/O Control /Status Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Pipelining: Increasing Instruction Throughput Wash 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 Non-pipelined Pipelined Dry 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 non-pipelined dish cleaning Time pipelined dish cleaning Time Fetch-instr. 1 2 3 4 5 6 7 8 Decode 1 2 3 4 5 6 7 8 Fetch ops. 1 2 3 4 5 6 7 8 Pipelined Execute 1 2 3 4 5 6 7 8 Instruction 1 Store res. 1 2 3 4 5 6 7 8 Time pipelined instruction execution Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Superscalar and VLIW Architectures Performance can be improved by: Faster clock (but there’s a limit) Pipelining: slice up instruction into stages, overlap stages Multiple ALUs to support more than one instruction stream Superscalar Scalar: non-vector operations Fetches instructions in batches, executes as many as possible May require extensive hardware to detect independent instructions VLIW: each word in memory has multiple independent instructions Relies on the compiler to detect and schedule instructions Currently growing in popularity Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Two Memory Architectures Processor Program memory Data memory Memory (program and data) Harvard Princeton Princeton Fewer memory wires Harvard Simultaneous program and data memory access Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Cache Memory Memory access may be slow Cache is small but fast memory close to processor Holds copy of part of memory Hits and misses Processor Memory Cache Fast/expensive technology, usually on the same chip Slower/cheaper technology, usually on a different chip Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Assembly-Level Instructions opcode operand1 operand2 ... Instruction 1 Instruction 2 Instruction 3 Instruction 4 Instruction Set Defines the legal set of instructions for that processor Data transfer: memory/register, register/register, I/O, etc. Arithmetic/logical: move register through ALU and back Branches: determine next PC value when not just PC+1 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

A Simple (Trivial) Instruction Set Assembly instruct. First byte Second byte Operation MOV Rn, direct 0000 Rn direct Rn = M(direct) MOV direct, Rn 0001 Rn direct M(direct) = Rn MOV @Rn, Rm 0010 Rn Rm M(Rn) = Rm MOV Rn, #immed. 0011 Rn immediate Rn = immediate ADD Rn, Rm 0100 Rn Rm Rn = Rn + Rm SUB Rn, Rm 0101 Rn Rm Rn = Rn - Rm JZ Rn, relative 0110 Rn relative PC = PC+ relative (only if Rn is 0) opcode operands Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Addressing Modes Data Immediate Register-direct Register indirect Operand field Register address Memory address Addressing mode Register-file contents Memory Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Equivalent assembly program Sample Program int total = 0; for (int i=10; i!=0; i--) total += i; // next instructions... C program MOV R0, #0; // total = 0 MOV R1, #10; // i = 10 JZ R1, Next; // Done if i=0 ADD R0, R1; // total += i MOV R2, #1; // constant 1 JZ R3, Loop; // Jump always Loop: Next: // next instructions... SUB R1, R2; // i-- Equivalent assembly program MOV R3, #0; // constant 0 1 2 3 5 6 7 Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Running a Program If development processor is different than target, how can we run our compiled code? Two options: Download to target processor Simulate Simulation One method: Hardware description language But slow, not always available Another method: Instruction set simulator (ISS) Runs on development processor, but executes instructions of target processor Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Designing a General Purpose Processor Not something an embedded system designer normally would do But instructive to see how simply we can build one top down Remember that real processors aren’t usually built this way Much more optimized, much more bottom-up design Reset Fetch Decode IR=M[PC]; PC=PC+1 Mov1 RF[rn] = M[dir] Mov2 Mov3 Mov4 Add Sub Jz 0110 0101 0100 0011 0010 0001 op = 0000 M[dir] = RF[rn] M[rn] = RF[rm] RF[rn]= imm RF[rn] =RF[rn]+RF[rm] RF[rn] = RF[rn]-RF[rm] PC=(RF[rn]=0) ?rel :PC to Fetch PC=0; from states below FSMD Declarations: bit PC[16], IR[16]; bit M[64k][16], RF[16][16]; Aliases: op IR[15..12] rn IR[11..8] rm IR[7..4] dir IR[7..0] imm IR[7..0] rel IR[7..0] Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Simple Microprocessor Design

Architecture of a Simple Microprocessor Storage devices for each declared variable register file holds each of the variables Functional units to carry out the FSMD operations One ALU carries out every required operation Connections added among the components’ ports corresponding to the operations required by the FSM Unique identifiers created for every control signal Datapath IR PC Controller (Next-state and control logic; state register) Memory RF (16) RFwa RFwe RFr1a RFr1e RFr2a RFr2e RFr1 RFr2 RFw ALU ALUs 2x1 mux ALUz RFs PCld PCinc PCclr 3x1 mux Ms Mwe Mre To all input control signals From all output control signals Control unit 16 Irld 2 1 A D Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

A Simple Microprocessor Reset PC=0; PCclr=1; Datapath IR PC Controller (Next-state and control logic; state register) Memory RF (16) RFwa RFwe RFr1a RFr1e RFr2a RFr2e RFr1 RFr2 RFw ALU ALUs 2x1 mux ALUz RFs PCld PCinc PCclr 3x1 mux Ms Mwe Mre To all input control signals From all output control signals Control unit 16 Irld 2 1 A D Fetch IR=M[PC]; PC=PC+1 MS=10; Irld=1; Mre=1; PCinc=1; Decode from states below Mov1 RF[rn] = M[dir] RFwa=rn; RFwe=1; RFs=01; Ms=01; Mre=1; op = 0000 to Fetch Mov2 M[dir] = RF[rn] RFr1a=rn; RFr1e=1; Ms=01; Mwe=1; 0001 to Fetch Mov3 M[rn] = RF[rm] RFr1a=rn; RFr1e=1; Ms=10; Mwe=1; 0010 to Fetch Mov4 RF[rn]= imm RFwa=rn; RFwe=1; RFs=10; 0011 to Fetch Add RF[rn] =RF[rn]+RF[rm] RFwa=rn; RFwe=1; RFs=00; RFr1a=rn; RFr1e=1; RFr2a=rm; RFr2e=1; ALUs=00 0100 to Fetch Sub RF[rn] = RF[rn]-RF[rm] RFwa=rn; RFwe=1; RFs=00; RFr1a=rn; RFr1e=1; RFr2a=rm; RFr2e=1; ALUs=01 0101 to Fetch Jz PC=(RF[rn]=0) ?rel :PC PCld= ALUz; RFrla=rn; RFrle=1; 0110 to Fetch FSM operations that replace the FSMD operations after a datapath is created FSMD You just built a simple microprocessor! Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Simple Microprocessor: Architecture Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Simple Microprocessor: VHDL Hierarchy Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"

Simple Microprocessor Example: Fibonacci Sequence

Fibonacci Sequence on Microcontroller Equation: Each number is the summation of the previous two numbers Sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, etc. Instead of implementing single-purpose Fibonacci sequence generator as we have been doing thus far, implement it as a sequence of instructions in memory The simple microprocessor will execute these instructions

Code snippet from memory.vhd showing instructions -- program to generate 10 fibonacci numbers 0 => "0011000000000000", -- mov R0, #0 1 => "0011000100000001", -- mov R1, #1 2 => "0011001000110100", -- mov R2, #52 3 => "0011001100000001", -- mov R3, #1 4 => "0001000000110010", -- mov M[50], R0 5 => "0001000100110011", -- mov M[51], R1 6 => "0001000101100100", -- mov M[100], R1 7 => "0100000100000000", -- add R1, R0 8 => "0000000001100100", -- mov M[100], R0 9 => "0010001000010000", -- mov M[R2], R1 10 => "0100001000110000", -- add R2, R3 11 => "0000010000111011", -- mov R4, M[59] 12 => "0110010000000101", -- jz R4, #6 13 => "0111000000110010", -- output M[50] 14 => "0111000000110011", -- ,, M[51] 15 => "0111000000110100", -- ,, M[52] 16 => "0111000000110101", -- ,, M[53] 17 => "0111000000110110", -- ,, M[54] 18 => "0111000000110111", -- ,, M[55] 19 => "0111000000111000", -- ,, M[56] 20 => "0111000000111001", -- ,, M[57] 21 => "0111000000111010", -- ,, M[58] 22 => "0111000000111011", -- ,, M[59] 23 => "1111000000000000", -- halt Source: Vahid and Givargis, "Embedded System Design: A Unified Hardware/Software Introduction"