1
ECE243 CPU
2
IMPLEMENTING A SIMPLE CPU
How are machine instructions implemented? What components are there? How are they connected and controlled?
3
MINI ISA:
every instruction is 1 byte wide
data and address values are also 1 byte wide
address space:
  byte addressable (every byte has an address)
  8 addr bits => 256 byte locations
4 registers: r0..r3
PC (resets to $80)
Condition codes: Z (zero), N (negative)
  these are used by branches
4
Some Definitions:
IMM3: a 3-bit signed immediate, 2 parts:
  1 sign bit: sign(IMM3)
  2-bit value: value(IMM3)
IMM4: a 4-bit signed immediate
IMM5: a 5-bit unsigned immediate
OpA, OpB: register variables, each represents one of r0..r3
SE8(X): means sign-extend value X to 8 bits
NOTE: ALL INSTS DO THIS LAST: PC = PC + 1
5
Mini ISA Instructions
load OpA (OpB):
  OpA = mem[OpB]
  PC = PC + 1
store OpA (OpB):
  mem[OpB] = OpA
  PC = PC + 1
add OpA OpB:
  OpA = OpA + OpB
  IF (OpA == 0) Z = 1 ELSE Z = 0
  IF (OpA < 0) N = 1 ELSE N = 0
  PC = PC + 1
sub OpA OpB:
  OpA = OpA - OpB
  PC = PC + 1
6
Mini ISA Instructions
nand OpA OpB:
  OpA = OpA bitwise-NAND OpB
  IF (OpA == 0) Z = 1 ELSE Z = 0
  IF (OpA < 0) N = 1 ELSE N = 0
  PC = PC + 1
ori IMM5:
  r1 = r1 bitwise-OR IMM5
  IF (r1 == 0) Z = 1 ELSE Z = 0
  IF (r1 < 0) N = 1 ELSE N = 0
  PC = PC + 1
shift OpA IMM3:
  IF (sign(IMM3)) OpA = OpA << value(IMM3)
  ELSE OpA = OpA >> value(IMM3)
7
Mini ISA Instructions
bz IMM4:
  IF (Z == 1) PC = PC + SE8(IMM4)
  PC = PC + 1
bnz IMM4:
  IF (Z == 0) PC = PC + SE8(IMM4)
  PC = PC + 1
bpz IMM4:
  IF (N == 0) PC = PC + SE8(IMM4)
  PC = PC + 1
8
ENCODINGS: Inst(opcode)
Load(0000), store(0010), add(0100), sub(0110), nand(1000):
Ori:
[Figure: bit-field diagrams for these instruction formats]
9
ENCODINGS: Inst(opcode)
Shift:
BZ(0101), BNZ(1001), BPZ(1101):
[Figure: bit-field diagrams for these instruction formats]
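The instruction semantics and opcodes above can be exercised with a small interpreter. This is a sketch under stated assumptions, not a reference model from the slides: the bit-field positions (OpA in i7..i6, OpB in i5..i4, IMM5 in i7..i3, IMM3 in i5..i3, IMM4 in i7..i4) are inferred from the opcode widths, sub and shift are assumed to update Z and N the way add does, right shifts are assumed to be logical, and the names `MiniCPU` and `se` are made up for this example.

```python
def se(value, bits):
    """Sign-extend a `bits`-wide field to a Python int (the SE8 idea)."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


class MiniCPU:
    def __init__(self, program, base=0x80):
        self.mem = [0] * 256           # 8 address bits => 256 byte locations
        self.mem[base:base + len(program)] = program
        self.r = [0, 0, 0, 0]          # r0..r3
        self.pc = base                 # PC resets to $80
        self.z = self.n = 0            # condition codes

    def _set(self, reg, value):
        """Write an 8-bit result to a register and update Z and N."""
        value &= 0xFF
        self.r[reg] = value
        self.z = int(value == 0)
        self.n = value >> 7            # bit 7 is the sign bit

    def step(self):
        inst = self.mem[self.pc]
        a, b = inst >> 6, (inst >> 4) & 0b11   # assumed OpA, OpB fields
        op4, op3 = inst & 0b1111, inst & 0b111
        if op3 == 0b111:                       # ori IMM5 (always r1)
            self._set(1, self.r[1] | (inst >> 3))
        elif op3 == 0b011:                     # shift OpA IMM3
            imm3 = (inst >> 3) & 0b111
            amount = imm3 & 0b11               # value(IMM3)
            if imm3 & 0b100:                   # sign(IMM3) set: shift left
                self._set(a, self.r[a] << amount)
            else:
                self._set(a, self.r[a] >> amount)
        elif op4 == 0b0000:                    # load OpA (OpB)
            self.r[a] = self.mem[self.r[b]]
        elif op4 == 0b0010:                    # store OpA (OpB)
            self.mem[self.r[b]] = self.r[a]
        elif op4 == 0b0100:                    # add
            self._set(a, self.r[a] + self.r[b])
        elif op4 == 0b0110:                    # sub (flag update is an assumption)
            self._set(a, self.r[a] - self.r[b])
        elif op4 == 0b1000:                    # nand
            self._set(a, ~(self.r[a] & self.r[b]))
        elif op4 in (0b0101, 0b1001, 0b1101):  # bz, bnz, bpz
            taken = {0b0101: self.z == 1,
                     0b1001: self.z == 0,
                     0b1101: self.n == 0}[op4]
            if taken:
                self.pc = (self.pc + se(inst >> 4, 4)) & 0xFF
        self.pc = (self.pc + 1) & 0xFF         # all insts do this last


# ori 5; add r0,r1; shift r0 left by 1; store r0,(r1)
cpu = MiniCPU([0b00101_111, 0b00_01_0100, 0b00_101_011, 0b00_01_0010])
for _ in range(4):
    cpu.step()
print(cpu.r, cpu.mem[5], hex(cpu.pc))   # prints [10, 5, 0, 0] 10 0x84
```

Checking the 3-bit opcodes (ori, shift) before the 4-bit ones works because no 4-bit opcode listed above ends in 111 or 011.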
10
DESIGNING A CPU
Two main components: datapath and control
datapath:
  registers, functional units, muxes, wires
  must be able to perform all steps of every inst
control:
  a finite state machine (FSM)
  commands the datapath
  performs: fetch, decode, read, execute, write, get next inst
11
ECE243 CPU: basic components
12
REGISTERS
[Figure: 8-bit register with inputs in, REGWrite?, clock and output out]
can always read
we assume falling-edge-triggered
in is stored if REGWrite=1 on falling clock edge
we won’t normally draw the clock input
13
MUXES
[Figure: 8-bit mux with a select input and output out]
‘select’ signal chooses which input to route to output
14
REGISTER FILE
[Figure: register file (r0,r1,r2,r3) with 2-bit inputs OpA, OpB, Rwrite; 8-bit input in; 8-bit outputs Out1, Out2; REGWrite? and clock]
Out1 is the value of reg indexed by OpA
Out2 is the value of reg indexed by OpB
if REGWrite is 1 when clock goes low
  then the value on ‘in’ is written to reg indexed by Rwrite
15
ALU (arithmetic logic unit)
[Figure: ALU with 8-bit inputs In0, In1; 3-bit ALUop; 8-bit output out; flag outputs Z, N]
ALUop:
  add = 000
  sub = 001
  or = 010
  nand = 011
  shift = 100
Z = nor(out7,out6,out5…out0)
N = out bit 7 (implies negative---sign bit)
16
MEMORY
our CPU has two memories for simplicity:
  instruction memory and data memory
  known as a “Harvard architecture”
17
INSTRUCTION MEM
[Figure: instruction memory with 8-bit addr input and 8-bit Iout output]
is read only
Iout is set to the value indexed by the address
18
DATA MEMORY
[Figure: data memory with 8-bit addr, Din, Dout; MEMRead?, MEMWrite? and clock]
can read or write
  but only one in a given clock cycle
on falling clock edge:
  if MEMWrite==1: value on Din is stored at addr
  if MEMRead==1: value at addr is output on Dout
19
SE8(x): SIGN-EXTEND TO 8 BITS
assuming 4-bit input
Recall: want:
  SE8(0100) ->
  SE8(1100) ->
In bits i3,i2,i1,i0; out bits o7…o0
20
ZE8(x): ZERO EXTEND TO 8 BITS
assuming 5-bit input
Recall: want:
  ZE8(00100) ->
  ZE8(11100) ->
In bits i4,i3,i2,i1,i0; out bits o7…o0
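A minimal sketch of the two extension units, working on integers rather than wires. The function names `se8` and `ze8` and the `width` parameter are choices made for this example; the behaviour is ordinary sign extension (copy the top input bit into every new upper bit) and zero extension (fill the upper bits with 0).

```python
def se8(x, width=4):
    """Sign-extend a `width`-bit value to 8 bits by copying its top bit."""
    x &= (1 << width) - 1
    sign = (x >> (width - 1)) & 1
    # Every output bit above the input field takes the value of the sign bit.
    upper = (0xFF << width) & 0xFF if sign else 0
    return upper | x


def ze8(x, width=5):
    """Zero-extend a `width`-bit value to 8 bits: the upper output bits are 0."""
    return x & ((1 << width) - 1)


for bits in (0b0100, 0b1100):
    print(f"SE8({bits:04b}) -> {se8(bits):08b}")
for bits in (0b00100, 0b11100):
    print(f"ZE8({bits:05b}) -> {ze8(bits):08b}")
# prints:
# SE8(0100) -> 00000100
# SE8(1100) -> 11111100
# ZE8(00100) -> 00000100
# ZE8(11100) -> 00011100
```

In hardware terms, SE8 on a 4-bit input wires i3 to o7, o6, o5 and o4, while ZE8 on a 5-bit input ties o7, o6 and o5 to 0.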
21
ECE243 CPU: Single Cycle Implementation
22
SINGLE CYCLE DATAPATH
each instruction executes entirely in one cycle of the cpu clock
registers are triggered by the falling edge
  new values begin propagating through datapath
  some values may be temporarily incorrect
the clock period is large enough to ensure:
  that all values are correct before the next falling edge
23
FETCH
[Datapath figure: PC (with PCwrite?) drives the 8-bit addr of INST MEM, which outputs the 8-bit inst]
needed by every instruction
  i.e., every instruction must be fetched
24
PC = PC + 1
[Datapath figure: PC, PCwrite?, INST MEM, addr, inst]
25
BRANCHES: BZ IMM4
(if branch is taken does: PC = PC + IMM4 + 1)
[Datapath figure: PC, PCwrite?, INST MEM, + 1 adder; inst fields IMM4 and opcode]
26
ADD: add OpA OpB
Does OpA = OpA + OpB
same datapath for sub and nand
Inst: bits i7 i6 i5 i4 i3 i2 i1 i0, with fields OpA and OpB
[Datapath figure: PC, PCwrite?, PCsel, INST MEM, SE8 of IMM4, branch adder, + 1 adder]
27
SHIFT: SHIFT OpA IMM3
Inst: bits i7 i6 i5 i4 i3 i2 i1 i0, with fields OpA and IMM3
[Datapath figure: adds REG FILE (OpA, OpB, Rw, in, Out1, Out2, REGwrite?) and ALU (ALUop, N, Z) to the fetch and branch path]
28
ORI: ORI IMM5
does: r1 <- r1 bitwise-or IMM5
Inst: bits i7 i6 i5 i4 i3 i2 i1 i0, with field IMM5
[Datapath figure: adds ZE8 of the immediate and the ALU2 select on the second ALU input]
29
Store: Store OpA (OpB)
does: mem[OpB] = OpA
Inst: bits i7 i6 i5 i4 i3 i2 i1 i0, with fields OpA, OpB and opcode
[Datapath figure: adds OpASel and a 4-input mux (00, 01, 10, 11) selecting among Out2, ZE8(IMM5), ZE8(IMM3) for the ALU]
30
Load: Load OpA (OpB)
does: OpA = mem[OpB]
Inst: bits i7 i6 i5 i4 i3 i2 i1 i0, with fields OpA, OpB and opcode
[Datapath figure: adds Data MEM with addr, Din, MEMread, MEMwrite]
31
Final Datapath!
[Datapath figure: PC, INST MEM, REG FILE, ALU, Data MEM, SE8/ZE8 units, with control signals PCsel, PCwrite?, REGwrite?, OpASel, ALU2, ALUop, MEMread, MEMwrite, RFin]
32
DESIGNING THE CONTROL UNIT
[Figure: CTRL block with inputs opcode, Z, N and outputs PCsel, …]
CONTROL SIGNALS TO GENERATE:
  PCsel, PCwrite, REGwrite, MEMread, MEMwrite, OpASel, ALUop, ALU2, RFin
33
Control Signals: Load OpA (OpB)
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: LOAD 0000 X
34
Control Signals: Store OpA (OpB)
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: STORE 0010 X
35
Control Signals: Add OpA OpB
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: ADD 0100 X
36
Control Signals: Sub OpA OpB
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: SUB 0110 X
37
Control Signals: Nand OpA OpB
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: NAND 1000 X
38
Control Signals: ori IMM5
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: ORI X111 X
39
Control Signals: Shift OpA IMM3
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: SHIFT X011 X
40
Control Signals: bz IMM4
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: BZ 0101 X 1
41
Control Signals: bnz IMM4
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: BNZ 1001 X 1
42
Control Signals: bpz IMM4
[Figure: final datapath]
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Row: BPZ 1101 X 1
43
All Control Signals
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Rows (only some cell values survive in this transcript):
  LOAD  0000 X 1 XXX
  STORE 0010
  ADD   0100 00 000
  SUB   0110 001
  NAND  1000 011
44
All Control Signals
INPUTS: INST, Inst bits 3-0, N, Z
OUTPUTS: PCSel, PCWrite, RegWrite, MemRead, OpASel, MemWrite, ALU2, RFin, ALUop
Rows (only some cell values survive in this transcript):
  ORI   X111 X 1 01 010
  SHIFT X011 10 100
  BZ    0101 XXX
  BNZ   1001
  BPZ   1101
45
Building Control Logic: MemRead
Inst:            Load  Store Add   Sub   Nand  Ori   Shift Bz    Bnz   BPZ
inst bits i3-i0: 0000  0010  0100  0110  1000  X111  X011  0101  1001  1101
N:               X 1
Z:
MemRead:
46
Building Control Logic: PCSel
Inst:            Load  Store Add   Sub   Nand  Ori   Shift Bz    Bnz   BPZ
inst bits i3-i0: 0000  0010  0100  0110  1000  X111  X011  0101  1001  1101
N:               X 1
Z:
PCSel:
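One way to turn these two tables into logic is to write each output as a function of the low opcode bits and the flags. The sketch below rests on assumptions that the flattened tables do not confirm: MemRead is taken to be 1 only for load, and PCSel = 1 is taken to mean "select the branch target", so it follows the branch conditions given in the ISA definition (bz: Z == 1, bnz: Z == 0, bpz: N == 0). The function and constant names are illustrative.

```python
LOAD, BZ, BNZ, BPZ = 0b0000, 0b0101, 0b1001, 0b1101


def mem_read(op):
    """MemRead as a function of inst bits i3..i0 (assumed: only load reads data memory)."""
    return int(op == LOAD)


def pc_sel(op, n, z):
    """PCSel as a function of inst bits i3..i0 and the flags (assumed: 1 = take branch)."""
    taken = ((op == BZ and z == 1) or
             (op == BNZ and z == 0) or
             (op == BPZ and n == 0))
    return int(taken)


print(mem_read(LOAD), pc_sel(BZ, n=0, z=1), pc_sel(BNZ, n=0, z=1))   # prints 1 1 0
```

Because ori (X111) and shift (X011) never match any of the four constants, they need no special case here.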
47
ECE243 CPU: Multicycle Implementation
48
A Multicycle Datapath
[Figure: multicycle datapath]
49
Key Difference #1: Only 1 Memory
[Figure: multicycle datapath]
50
Key Difference #2: Only 1 ALU
[Figure: multicycle datapath]
51
Key Difference #3: Temp Regs
what benefit do tmp regs / multicycle give?
52
Key Difference #3: Temp Regs
critical path is long => large clock period
53
Key Difference #3: Temp Regs
smaller critical paths => shorter clock period
54
Key Difference #3: Temp Regs
let’s examine these one at a time
55
IR: Instruction Register
holds inst encoding
56
MDR: Memory Data Register
holds the value returned from Memory
57
OpA and OpB
hold values from the register file
58
ALUout
holds the result calculated by the ALU
59
Cycle by Cycle Operation
[Figure: multicycle datapath]
60
All Insts Cycle1: Fetch and Increment PC
IR ← mem[PC]; PC ← PC + 1;
fetch next inst into the IR
increment PC
61
All Insts Cycle2: Decoding Inst & Reading Reg File
OpA ← rx; OpB ← ry
Note: not all insts need OpA and OpB
62
Add, Sub, Nand Cycle3: Calculate
ALUout ← OpA op OpB
63
Add, Sub, Nand Cycle4: Write to Reg File
rx ← ALUout
64
Shift Cycle3: Calculate
ALUout ← OpA op IMM3
65
Shift Cycle4: Write to Reg File
rx ← ALUout
66
ORI Cycle3: Read r1 from Reg File
OpA ← r1
67
ORI Cycle4: Calculate
ALUout ← OpA op IMM5
68
ORI Cycle5: Write to Reg File
r1 ← ALUout
69
Load Cycle3: addr to Mem, value into MDR
MDR ← mem[OpB]
70
Load Cycle4: write value into reg file
rx ← MDR
71
Store Cycle3: addr to Mem, value to Mem
mem[OpB] ← OpA
72
Branches Cycle3
PC ← PC + SE8(IMM4)
73
Summary
Instructions           Single Cycle (Eg: 1 MHz)   Multicycle (Eg: 4 MHz)
Store, BZ, BNZ, BPZ    1 cycle                    3 cycles
Add, Sub, Nand, Load   1 cycle                    4 cycles
ORI                    1 cycle                    5 cycles
Example: total time to execute one of each instruction:
  Single cycle: 1*4 + 1*4 + 1*1 = 9 cycles; 9 cycles / 1MHz = 9us
  Multicycle: 3*4 + 4*4 + 1*5 = 33 cycles; 33 cycles / 4MHz = 8.25us
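The comparison in this summary is just cycles divided by clock rate. A short sketch that redoes the arithmetic, using the instruction groups and the example clock rates from the slide (the variable names are made up for this example):

```python
classes = [
    # (instructions, single-cycle cycles each, multicycle cycles each)
    (["Store", "BZ", "BNZ", "BPZ"], 1, 3),
    (["Add", "Sub", "Nand", "Load"], 1, 4),
    (["ORI"], 1, 5),
]

single_cycles = sum(len(names) * s for names, s, _ in classes)
multi_cycles = sum(len(names) * m for names, _, m in classes)

# At f MHz there are f cycles per microsecond, so time_us = cycles / f.
single_us = single_cycles / 1.0   # example single-cycle clock: 1 MHz
multi_us = multi_cycles / 4.0     # example multicycle clock: 4 MHz

print(single_cycles, single_us, multi_cycles, multi_us)   # prints 9 9.0 33 8.25
```

The multicycle design needs more cycles in total, but each cycle is shorter, which is why it comes out slightly ahead for this instruction mix.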
74
Implementing Multicycle Control
Cycle 1 (all insts): IR = [PC]; PC = PC + 1
Cycle 2 (all insts): OpA = RF[rx]; OpB = RF[ry]
Cycle 3:
  Add, Sub, Nand: ALUout = OpA op OpB
  Shift:          ALUout = OpA shift Imm3
  Ori:            OpA = RF[1]
  Load:           MDR = mem[OpB]
  Store:          Mem[OpB] = OpA
  Bnz, Bz, Bpz:   PC = PC + SE(Imm4)
Cycle 4:
  Add, Sub, Nand: RF[rx] = ALUout
  Shift:          RF[rx] = ALUout
  Ori:            ALUout = OpA OR Imm5
  Load:           RF[rx] = MDR
  Store:          X
  Bnz, Bz, Bpz:   X
Cycle 5:
  Ori:            RF[1] = ALUout
75
Control: An FSM
need a state transition diagram
how many states are there?
how many bits to represent state?
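One way to approach these two questions is to list the per-cycle operations of each instruction class and count the distinct (cycle, operation) pairs. This is a sketch of one possible counting, assuming that classes performing the identical operation in the same cycle share a state; a real design might split or merge states differently. The dictionary below restates the per-cycle steps from the multicycle slides.

```python
FETCH = "IR = mem[PC]; PC = PC + 1"
DECODE = "OpA = RF[rx]; OpB = RF[ry]"

steps = {
    "add/sub/nand": [FETCH, DECODE, "ALUout = OpA op OpB", "RF[rx] = ALUout"],
    "shift":        [FETCH, DECODE, "ALUout = OpA shift Imm3", "RF[rx] = ALUout"],
    "ori":          [FETCH, DECODE, "OpA = RF[1]", "ALUout = OpA OR Imm5", "RF[1] = ALUout"],
    "load":         [FETCH, DECODE, "MDR = mem[OpB]", "RF[rx] = MDR"],
    "store":        [FETCH, DECODE, "mem[OpB] = OpA"],
    "branches":     [FETCH, DECODE, "PC = PC + SE(Imm4)"],
}

# A state is identified by which cycle it is and what happens in it.
states = {(cycle, action)
          for sequence in steps.values()
          for cycle, action in enumerate(sequence, start=1)}

num_states = len(states)
state_bits = (num_states - 1).bit_length()   # bits needed to number the states
print(num_states, state_bits)                # prints 12 4
```

Under this counting the state register needs 4 bits, since 3 bits can only distinguish 8 states.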
76
Multicycle Control as an FSM
77
Multicycle Control Hardware
[Figure: Ctrl logic takes IR:3..0, N, Z and Current_state from the State Register (4 bits); it outputs PCwrite, PCsel, ALUop, … and Next_state]
78
ECE243 CPU: Adding a New Instruction
79
EXAMPLE QUESTION: ADDING A NEW INSTRUCTION
Implement a post-increment load:
  Load rx, (ry)+
Does:
  RF[rx] = MEM[RF[ry]]
  RF[ry] = RF[ry] + 1
ry is permanently changed to be ry+1
80
Implementing: RF[rx] = MEM[RF[ry]]; RF[ry] = RF[ry] + 1
Recall: load rx, (ry)
  IR = mem[PC], PC = PC + 1
  OpA = RF[rx], OpB = RF[ry]
  MDR = mem[OpB]
  RF[rx] = MDR
81
Modifying the Datapath
RF[ry] = RF[ry] + 1
[Figure: multicycle datapath]
82
ECE243 CPU: Pipelining
83
A Fast-Food Sandwich Shop
[Figure: a cook and the steps: take order, select bun, add ingredients, wrap and bag, cash and change]
84
With One Cook
[Figure: one cook; customer1 at each step: take order, select bun, add ingredients, wrap and bag, cash and change]
one customer is serviced at a time
85
Like the single-cycle CPU
[Figure: single-cycle datapath executing Add r1, r2]
one instruction flows through at a time
86
With Two Cooks?
[Figure: two cooks sharing the steps: take order, select bun, add ingredients, wrap and bag, cash and change]
87
Pipelining
Like an assembly line
Doesn’t change the interface or result
improves performance
88
Pipelining a CPU (rough idea)
[Figure: single-cycle datapath]
89
Pipelining Details:
[Figure: single-cycle datapath with its control signals]
90
With Three Cooks?
[Figure: three cooks sharing the steps: take order, select bun, add ingredients, wrap and bag, cash and change]
91
Pipelining a CPU (rough idea)
[Figure: single-cycle datapath]
92
Visualizing Pipelining
Stages: Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)
[Table: cycles 1 to 4 against the Fetch, Decode, Execute stages; cells not captured]
93
Visualizing Pipelining (again)
Stages: Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)
[Table: inst1 to inst4 against cycles 1 to 5; cells not captured]
94
Fast Food Hazards
[Figure: three cooks; customer3, customer2, customer1 in the pipeline]
What if: c1 and c2 are friends, c2 has no money, and c2 needs to know how much change c1 will get before ordering (to ensure c2 can afford his order)?
95
Fast Food Hazards
[Figure: three cooks; customer2 and customer1 in the pipeline]
96
CPU Hazards
Stages: Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)
called a data hazard
must be observed to ensure correct execution
there are two solutions to data hazards
97
Solution1: Stalling
Stages: Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)
[Table: cycles 1 to 5 for the sequence below; cells not captured]
  add r1,r2
  add r3,r1
  sub r0,r2
  add r2,r2
98
How to insert bubbles
option1: hardware stalls the pipeline
  need extra logic to do so
  happens ‘automatically’ for any code
option2: compiler inserts “no-ops”
  a no-op is an instruction that does nothing
  ex: add r0,r0,r0 (NIOS)
  compiler must do it right or wrong results!
example: inserting a bubble with a no-op:
  add r1, r2
  noop
  add r3, r1
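To see where a bubble is needed, the sketch below schedules instructions through a Fetch, Decode, Execute pipeline and delays Decode until every source register is readable. It assumes a register written by Execute in cycle c can first be read by a Decode in cycle c + 1 (consistent with the falling-edge register file, but still an assumption), and it models only register data hazards. The function name `schedule` and the (dest, sources) encoding are made up for this example.

```python
def schedule(program):
    """Return (decode_cycle, execute_cycle) per instruction, stalling on data hazards."""
    readable = {}        # register -> first cycle whose Decode sees the new value
    next_decode = 2      # the first instruction is fetched in cycle 1
    result = []
    for dest, sources in program:
        decode = max([next_decode] + [readable.get(reg, 0) for reg in sources])
        execute = decode + 1
        readable[dest] = execute + 1   # assumed: written at the end of Execute
        result.append((decode, execute))
        next_decode = decode + 1       # in-order: one Decode per cycle
    return result


program = [
    ("r1", ("r1", "r2")),   # add r1,r2
    ("r3", ("r3", "r1")),   # add r3,r1  (needs r1 from the previous add)
    ("r0", ("r0", "r2")),   # sub r0,r2
    ("r2", ("r2", "r2")),   # add r2,r2
]
timing = schedule(program)
bubbles = timing[-1][1] - (len(program) + 2)   # extra cycles versus a stall-free run
print(timing, bubbles)   # prints [(2, 3), (4, 5), (5, 6), (6, 7)] 1
```

The single bubble lands between the first two adds, which matches the one no-op inserted in the example above.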
99
Solution2: Forwarding Lines
Stages: Fetch (inst mem), Decode (reg file), Execute (ALU and data mem)
add “forwarding” logic to pass values directly between stages
[Table: cycles 1 to 5 for the sequence below; cells not captured]
  add r1,r2
  add r3,r1
  sub r0,r2
  add r2,r2
100
Control Hazards
[Table: cycles 1 to 5 for the sequence below; cells not captured]
  add r1,r2
  bnz -2
  add r3,r1
  add r2,r2
cpu predicts each branch is not taken
Better: predict taken
  why?---loops are common, usually taken
More advanced: remember what each branch did last time
  “branch predictor”: a table that remembers what each branch did the last time
  uses this to make a prediction next time
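The three prediction policies can be compared on a made-up branch trace. This is a toy sketch: the two branch addresses, their outcome patterns, and the choice to predict "taken" for a branch not yet in the table are all assumptions for illustration, and `count_correct` is a hypothetical helper.

```python
def count_correct(predict, update, trace):
    """Count correct predictions over a trace of (branch_address, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        correct += predict(pc) == taken
        update(pc, taken)
    return correct


# A loop branch at 0x85 (taken 9 times, then falls through)
# and a rarely taken branch at 0x90 (never taken here).
trace = [(0x85, True)] * 9 + [(0x85, False)] + [(0x90, False)] * 10

not_taken = count_correct(lambda pc: False, lambda pc, t: None, trace)
always_taken = count_correct(lambda pc: True, lambda pc, t: None, trace)

table = {}   # branch address -> what that branch did last time
last_time = count_correct(lambda pc: table.get(pc, True), table.__setitem__, trace)

print(not_taken, always_taken, last_time)   # prints 11 9 18
```

On this trace the table wins because it learns each branch separately: it mispredicts only the loop exit and the first encounter of the never-taken branch.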
101
Some Real CPU Pipelines
21264 Pipeline (Alpha), Microprocessor Report 10/28/96
Pentium IV’s Pipeline: TC nxt IP, TC fetch, Drv, Alloc, Rename, Que, Sch, Disp, RF, Ex, Flgs, BrCk
102
ECE243 CPU: Alternate Architectures
103
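ANOTHER MULTICYCLE CPU
[Figure: bus-based CPU. CONTROL sends control signals to all components; IR, PC, MDR, MAR, Regs r0..r3, Y, Z, the ALU (ALUop) with a Select mux (000 … 111, including constant 1 and Imm3,4,5) sit on an internal bus; MEM has addr, Din, Dout, MEMRead, MEMWrite]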
104
SOME CONTROL SIGNALS
PCout: write PC value to bus
PCin: read bus value into PC
MDRinBus: read value from bus into MDR
MDRinMem: write value from Dout of MEM into MDR
MDRoutBus: write value from MDR onto bus
105
Ex: Ctrl: Add r1, r2 # r1 = r1 + r2
[Figure: bus-based multicycle CPU]
106
Ex: Ctrl: Add r1, r2 # r1 = r1 + r2
[Figure: bus-based multicycle CPU]
107
CHARACTERIZATION OF ISAs
Attribute #1: number of explicit operands
Attribute #2: are registers general purpose?
Attribute #3: Can an operand be a memory location?
Attribute #4: RISC vs CISC
Attribute #5: Relation between instructions and data
108
att1: num of explicit operands
focus on calculation instructions (add,sub…)
running example: A = B + C (C-code)
  assume A, B, C are memory locations
0 operands: eg., stack based (like first calculator CPUs)
  push and pop operations, refer to top of stack
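As a sketch of the 0-operand case, the running example A = B + C can be run on a toy stack machine. The instruction names `push`, `pop` and `add` follow the slide; the tuple encoding and the `run` function are made up for this example. Note that only the calculation instruction has no explicit operands: push and pop still name a memory location.

```python
def run(program, memory):
    """Execute a list of stack-machine instructions against a dict-based memory."""
    stack = []
    for op, *operand in program:
        if op == "push":                 # push mem[X] onto the stack
            stack.append(memory[operand[0]])
        elif op == "pop":                # pop the top of stack into mem[X]
            memory[operand[0]] = stack.pop()
        elif op == "add":                # no explicit operands: uses the top two entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = [("push", "B"), ("push", "C"), ("add",), ("pop", "A")]
memory = run(program, {"A": 0, "B": 2, "C": 3})
print(memory["A"])   # prints 5
```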
109
att1: num of explicit operands
1 operand: eg., accumulator based
  accumulator is a reg inside cpu
  instructions use accum as destination
110
att1: num of explicit operands
2 operands: eg: 68k, ia32
111
att1: num of explicit operands
3 operands: eg: MIPS, SPARC, POWERpc
How many operands is NIOS?
112
Att2: are regs general purpose?
if yes: you can use any register for any purpose
  special registers are by convention only
if no: some registers have hardwired purposes
  ex: in 68k, A7 is hardwired to be stack pointer
  used implicitly for jsr, rts, link instructions
Are NIOS registers general purpose?
113
Att3: operand = mem location?
with respect to calculation insts (add, sub)
if yes: one operand can be in memory, the other in a register
  maybe: can also write result to memory
if no: called a load/store architecture
  only load/store insts can get/put memory values to/from regs
Can a NIOS operand be a mem location?
114
Att4: RISC vs CISC
Are there instructions with many steps?
  a vague and debatable question
CISC: complex instruction set computer
  Many, complex instructions
  can be hard to pipeline!
  ex: 68k, x86, PowerPC?
RISC: reduced instruction set computer
  Fewer, simple instructions
  easy to pipeline
  ex: MIPS, alpha, Powerpc?
Which is NIOS?
Quandary: x86 is a CISC but pentiumIV has a 20-stage pipeline!
  How’d they do it?
115
Att5: Relation bet. insts & data
SISD: single instruction, single data
  everything we have seen so far
  an inst only writes one reg/memory location
SIMD: single instruction, multiple data
  one instruction tells CPU to operate on an array of regs or memory locations
  ex: multimedia extensions: MMX, SSE (intel); 3DNow (AMD); altivec (powerpc)
  ex: IBM/Sony/toshiba Cell processor (vector processor)
MIMD: multiple instruction, multiple data
  ex: Cluster of workstations, SMP servers, multicores, hyperthreading
Which is NIOS?