Instruction Execution (Load and Store instructions)
ie := (
  (op<4..0>= 1) : R[ra] ← M[disp],    load register (ld)
  (op<4..0>= 2) : R[ra] ← M[rel],     load register relative (ldr)
  (op<4..0>= 3) : M[disp] ← R[ra],    store register (st)
  (op<4..0>= 4) : M[rel] ← R[ra],     store register relative (str)
  (op<4..0>= 5) : R[ra] ← disp,       load displacement address (la)
  (op<4..0>= 6) : R[ra] ← rel,        load relative address (lar)
  . . .                               Other instructions go here
Review
CS501 Advanced Computer Architecture, Lecture 06, Dr. Noor Muhammad Sheikh
Instruction Execution (Branch instructions)
ie := ( . . .
  (op<4..0>= 8) : (cond : PC ← R[rb]),                    conditional branch (br)
  (op<4..0>= 9) : (R[ra] ← PC, cond : (PC ← R[rb]) ),     branch and link (brl)
cond := (
  c3<2..0>=0 : 0,              never
  c3<2..0>=1 : 1,              always
  c3<2..0>=2 : R[rc]=0,        if register is zero
  c3<2..0>=3 : R[rc]≠0,        if register is nonzero
  c3<2..0>=4 : R[rc]<31>=0,    if positive or zero
  c3<2..0>=5 : R[rc]<31>=1 ),  if negative
This simply means that when c3<2..0> is equal to one of these six values, substitute the expression on the right hand side of the : in place of cond
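The cond definition can be read as a small lookup on c3<2..0>. The following is a minimal Python sketch of that reading, not part of the SRC definition itself: the function name cond and the choice to raise an error for the values 6 and 7 (which the slide does not define) are assumptions made for illustration.

```python
def cond(c3: int, r_rc: int) -> bool:
    """Evaluate the branch condition from c3<2..0> and the value of R[rc]."""
    sel = c3 & 0b111            # c3<2..0>
    r_rc &= 0xFFFFFFFF          # treat R[rc] as a 32-bit word
    sign = (r_rc >> 31) & 1     # R[rc]<31>
    if sel == 0:
        return False            # never
    if sel == 1:
        return True             # always
    if sel == 2:
        return r_rc == 0        # if register is zero
    if sel == 3:
        return r_rc != 0        # if register is nonzero
    if sel == 4:
        return sign == 0        # if positive or zero
    if sel == 5:
        return sign == 1        # if negative
    # Assumption: the slide lists only six values, so 6 and 7 are rejected here.
    raise ValueError("c3<2..0> value not defined in the lecture")
```

With this sketch, a br instruction would update PC only when cond(c3, R[rc]) is true.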
Instruction Execution (Arithmetic and Logical instructions)
ie := ( . . .
  (op<4..0>=12) : R[ra] ← R[rb] + R[rc],                       add
  (op<4..0>=13) : R[ra] ← R[rb] + c2<16..0> {sign extend},     addi
  (op<4..0>=14) : R[ra] ← R[rb] - R[rc],                       sub
  (op<4..0>=15) : R[ra] ← - R[rc],                             neg
  (op<4..0>=20) : R[ra] ← R[rb] & R[rc],                       and
  (op<4..0>=21) : R[ra] ← R[rb] & c2<16..0> {sign extend},     andi
  (op<4..0>=22) : R[ra] ← R[rb] ~ R[rc],                       or
  (op<4..0>=23) : R[ra] ← R[rb] ~ c2<16..0> {sign extend},     ori
  (op<4..0>=24) : R[ra] ← ! R[rc],                             not
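The {sign extend} note on the immediate forms means the 17-bit field c2<16..0> is widened to 32 bits by copying bit 16 into the upper bits before the operation. A minimal Python sketch of this for addi follows; the helper names sign_extend_17 and addi are hypothetical, and 32-bit wraparound of the result is assumed.

```python
MASK32 = 0xFFFFFFFF  # assumption: registers are 32 bits wide

def sign_extend_17(c2: int) -> int:
    """Interpret c2<16..0> as a signed 17-bit value."""
    c2 &= 0x1FFFF                  # keep bits 16..0
    if c2 & (1 << 16):             # bit 16 is the sign bit
        return c2 - (1 << 17)
    return c2

def addi(r_rb: int, c2: int) -> int:
    """R[ra] <- R[rb] + c2<16..0> {sign extend}, wrapped to 32 bits."""
    return (r_rb + sign_extend_17(c2)) & MASK32
```

For example, a c2 field of all ones represents -1, so addi(5, 0x1FFFF) gives 4 in this sketch.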
Instruction Execution (Shift instructions)
ie := ( . . .
  (op<4..0>=26) : R[ra]<31..0> ← (n α 0) © R[rb]<31..n>,             shr
  (op<4..0>=27) : R[ra]<31..0> ← (n α R[rb]<31>) © R[rb]<31..n>,     shra
  (op<4..0>=28) : R[ra]<31..0> ← R[rb]<31-n..0> © (n α 0),           shl
  (op<4..0>=29) : R[ra]<31..0> ← R[rb]<31-n..0> © R[rb]<31..32-n>,   shc
where
  n := ( (c3<4..0>=0) : R[rc],
         (c3<4..0>≠0) : c3<4..0> ),
Notation: α means replication, © means concatenation
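The replication and concatenation notation maps directly onto integer shifts and masks. The sketch below is one possible Python rendering of the four shift instructions on 32-bit words. The function names are illustrative, and taking only the low five bits of R[rc] for the shift count is an assumption made so that n stays in the range 0 to 31.

```python
MASK32 = 0xFFFFFFFF

def shift_count(c3: int, r_rc: int) -> int:
    """n := c3<4..0> if nonzero, otherwise taken from R[rc]."""
    n = c3 & 0x1F
    # Assumption: only the low five bits of R[rc] are used as the count.
    return n if n != 0 else (r_rc & 0x1F)

def shr(r_rb: int, n: int) -> int:
    """(n copies of 0) concatenated with R[rb]<31..n>: logical shift right."""
    return (r_rb & MASK32) >> n

def shra(r_rb: int, n: int) -> int:
    """(n copies of R[rb]<31>) concatenated with R[rb]<31..n>: arithmetic shift right."""
    r_rb &= MASK32
    fill = (MASK32 << (32 - n)) & MASK32 if (r_rb >> 31) else 0
    return (r_rb >> n) | fill

def shl(r_rb: int, n: int) -> int:
    """R[rb]<31-n..0> concatenated with (n copies of 0): shift left."""
    return (r_rb << n) & MASK32

def shc(r_rb: int, n: int) -> int:
    """R[rb]<31-n..0> concatenated with R[rb]<31..32-n>: shift circular (rotate left)."""
    r_rb &= MASK32
    return ((r_rb << n) | (r_rb >> (32 - n))) & MASK32
```

In each function the bits that fall off one end are either dropped (shr, shra, shl) or wrapped around to the other end (shc), which is exactly what the concatenation expressions describe.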
Instruction Execution (Miscellaneous instructions)
ie := ( . . .
  (op<4..0>= 0) : ,             No operation (nop)
  (op<4..0>= 31) : Run ← 0,     Halt the processor (Stop)
  ); iF );
Instruction Execution ends here
The basic D Flip-Flop Q output Data input Enable input Active low Clock input Active low clear input
The n-bit register
  Definition: a group of FFs operating synchronously
  Can be formed by using n D flip-flops
  Connect the clock inputs of the DFFs together (and the enables also)
  The clock to the DFFs has to be free running, especially for dynamic gates; use En to load registers
A 4-bit register: circuit
A 4-bit register: our symbol (inputs, clock, enable, outputs)
A 4-bit register: test circuit
A 4-bit register: waveforms (inputs, outputs)
A 4-to-1 MUX: our symbol (inputs, output)
A 4-to-1 MUX: test circuit
A 4-to-1 MUX: waveforms
Tri-state buffers: circuit symbol
Signals: Data Input, ENABLE, Data Output

Data input        Enable   Data output
X (don't care)    0        Z
0                 1        0
1                 1        1
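The behaviour of a tri-state buffer, and the reason it lets several sources share one line, can be sketched in a few lines of Python. This is only an illustrative model: None stands in for the high-impedance state Z, and the names tri_state and resolve_bus are hypothetical.

```python
from typing import List, Optional

def tri_state(data: int, enable: int) -> Optional[int]:
    """Pass data through when enabled; otherwise output None (standing in for Z)."""
    return data if enable == 1 else None

def resolve_bus(drivers: List[Optional[int]]) -> Optional[int]:
    """Value seen on a shared line given the outputs of all buffers attached to it."""
    active = [d for d in drivers if d is not None]
    if len(active) > 1:
        # More than one enabled buffer would fight over the line.
        raise ValueError("bus contention: more than one buffer enabled")
    return active[0] if active else None   # None means the line is floating
```

As long as at most one buffer is enabled at a time, the line simply carries that buffer's data input.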
A 4-bit tri-state buffer unit (our symbol)
A 4-bit tri-state buffer unit (test circuit)
Concept of control signals
  Notice we have added a “control” signal with each register (e.g., LRD with register RD)
  RD will be loaded with a value from its input only when LRD=1
  By selectively enabling these control signals, we can perform register transfers of our choice
  “Cond” will be a Boolean expression in general; in this example Cond will be equivalent to LRD=1
Simple conditional transfer Cond: RD ← RS
Two-way transfers
  To be able to implement
    Cond1: RD ← RS
    Cond2: RS ← RD
  together, we need a path from RD to RS and a path from RS to RD, each having m lines (for m-bit RD and RS)
  We can connect the output of RD to the input of RS in the previous circuit
Connecting multiple registers
  Example: five m-bit registers (R1 to R5) in a “point-to-point” scheme require 20 connections, each with m wires (two connections between every pair of registers)
  In general, n registers in a point-to-point scheme require n(n-1) connections
  FOR LARGE VALUES OF n, THIS IS NOT PRACTICAL
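The growth of the point-to-point scheme is easy to check numerically. The short Python sketch below evaluates the n(n-1) formula; the function names are chosen here for illustration only.

```python
def point_to_point_connections(n: int) -> int:
    """Each of the n registers needs a path to each of the other n - 1 registers."""
    return n * (n - 1)

def point_to_point_wires(n: int, m: int) -> int:
    """Every connection between m-bit registers carries m wires."""
    return point_to_point_connections(n) * m
```

For the five-register example this gives 20 connections, and the count grows roughly with the square of n.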
Register file on a bus
This diagram shows eight 4-bit registers (R0, R1, …, R7) connected to a 4-bit bus using four 1-bit tri-state buffer blocks (called AA_TS4). Control signals are also shown on the diagram, e.g., R1out is used to enable the tri-state buffers at the output of R1, thereby placing the contents of R1 on the bus. Similarly, LR7 is used to load a value from the bus into register R7.
Example: (op=1): R4← R3 + R2;
Implementing (opc=1): R4 ← R3 + R2;

Time step   Operation to be performed (structural RTL)   Control signals to be activated
1           A ← R3                                       LA, R3out
2           C ← A + R2                                   LC, R2out
3           R4 ← C                                       LR4, Cout

These steps have to be performed one after the other.
The table indicates how the add operation is accomplished using the hardware shown before.
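The three time steps can be traced with a small Python model of the single-bus datapath. This is a sketch under stated assumptions rather than a description of the actual circuit: registers are taken to be 4 bits wide, LC is assumed to latch the adder output (A plus the bus value) into C, and the helper names apply_step and add_r4_r3_r2 are hypothetical.

```python
MASK = 0xF  # assumption: 4-bit registers, as in the register file shown earlier

def apply_step(state: dict, out_signal: str, load_signal: str) -> None:
    """Activate one 'out' signal and one 'load' signal for a single time step."""
    source = out_signal[:-len("out")]     # "R3out" -> "R3", "Cout" -> "C"
    bus = state[source]                   # the enabled register drives the bus
    if load_signal == "LA":
        state["A"] = bus                  # A <- bus
    elif load_signal == "LC":
        # Assumption: C latches the adder output, A + bus.
        state["C"] = (state["A"] + bus) & MASK
    else:
        state[load_signal[1:]] = bus      # "LR4" -> R4 <- bus

def add_r4_r3_r2(state: dict) -> dict:
    """Run the three control steps for R4 <- R3 + R2 in order."""
    for out_signal, load_signal in [("R3out", "LA"), ("R2out", "LC"), ("Cout", "LR4")]:
        apply_step(state, out_signal, load_signal)
    return state
```

Because only one register may drive the bus in a time step, the operand R3 has to be parked in A first, which is why the add takes three steps rather than one.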
A 4-bit Barrel Shifter
This circuit performs a rotate right of the input data depending on the pattern applied to the S1, S0 inputs
Symbol and Function Table for the 4-bit Barrel Shifter

S1 S0   Output in terms of the inputs
0  0    In3 In2 In1 In0
0  1    In0 In3 In2 In1
1  0    In1 In0 In3 In2
1  1    In2 In1 In0 In3
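The function table is a rotate right by the number encoded on S1, S0. A minimal Python sketch that reproduces the rows of the table is given below; the function name barrel_shift and the tuple ordering (In3, In2, In1, In0) are choices made for this illustration.

```python
from typing import Tuple

def barrel_shift(inputs: Tuple, s1: int, s0: int) -> Tuple:
    """Rotate (In3, In2, In1, In0) right by the amount encoded on S1, S0."""
    k = 2 * s1 + s0                        # rotate count, 0 to 3
    if k == 0:
        return inputs
    # The k rightmost inputs wrap around to the left end of the output.
    return inputs[-k:] + inputs[:-k]
```

Calling barrel_shift(("In3", "In2", "In1", "In0"), 0, 1) yields ("In0", "In3", "In2", "In1"), matching the second row of the table.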
Example: R4← ror R3 (2 times);
A 4-bit Barrel Shifter with registers on a single bus
A 4-bit Barrel Shifter with registers on a single bus (magnified)
Implementing R4 ← ror R3 (2 times);

Time step   Operation to be performed (structural RTL)   Control signals to be activated
1           C ← R3 (after rotating right twice)          R3out, nb1, LC
2           R4 ← C                                       LR4, Cout

These steps have to be performed one after the other.
The table indicates how the rotate operation is accomplished using the hardware shown before.
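The two time steps can be traced the same way as the add example. The Python sketch below is illustrative only and rests on several assumptions: 4-bit registers, a rotate count encoded on nb1 together with a second signal named nb0 here (nb0 does not appear in the table and is hypothetical), and the helper names ror4 and rotate_r3_into_r4.

```python
MASK = 0xF  # assumption: 4-bit registers

def ror4(value: int, count: int) -> int:
    """Rotate a 4-bit value right by count positions."""
    value &= MASK
    count %= 4
    return ((value >> count) | (value << (4 - count))) & MASK

def rotate_r3_into_r4(state: dict, nb1: int = 1, nb0: int = 0) -> dict:
    """Run the two control steps for R4 <- ror R3 (2 times)."""
    # Step 1: R3out puts R3 on the bus; the shifter rotates it; LC loads C.
    bus = state["R3"]
    state["C"] = ror4(bus, 2 * nb1 + nb0)   # nb1=1, nb0=0 encodes a count of 2
    # Step 2: Cout puts C on the bus; LR4 loads R4.
    bus = state["C"]
    state["R4"] = bus
    return state
```

Starting with R3 = 0011, this sketch leaves 1100 in both C and R4.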