1
Multicycle and Microcode
CS/COE 0447 Jarrett Billingsley
2
Class announcements
how's the project coming along? :^)
my god, last week of lecture
thank god
CS447
3
Multicycle CPUs
4
Chop chop
not all instructions take the same amount of time, so…
make different instructions take different amounts of time!
and by that, we mean different numbers of clock cycles.
[Figure: instructions j, or, and lw shown with lengths of 3, 4, and 100 cycles]
5
The instructions' steps
remember the five phases of execution?
all instructions have IF, ID, EX, but only some write to memory/regs.
beq/j: F D X
add/sub etc.: F D X W
lw: F D X M …….. W
6
The multicycle datapath from a bird's-eye view
each phase of execution has its own unit, and between phases, we insert registers to hold onto the data for the next phase.
[Figure: multicycle datapath: Instruction Memory (F), Control and Register File (D), X, Data Memory (M), W]
7
Watching an add (animated)
let's watch an add instruction flow through the datapath!
[Animation: the add advances one phase per clock through the datapath; Control sets all control signals, and at the end the data flows back to the registers]
8
CPI (and IPC)
CPI (Cycles Per Instruction) measures the average number of cycles it takes to complete one instruction.
IPC (Instructions Per Cycle) is its reciprocal.
multi-issue CPUs can execute multiple instructions in one clock cycle! WOAH 8O
so, what's the CPI for the single-cycle implementation?
uh, 1. by definition.
what about for a multicycle implementation?
……????? hmmm
9
So what the heck has this bought us?
let's say our clock cycle time decreased from 15ns to 1ns!
that's from about 66 MHz to 1 GHz! :D
...buuut our CPI (cycles per instruction) increased a lot.
with the single-cycle datapath, CPI was always 1.
now the CPI is... well... uh... variable?
[Figure: beq/j, add/sub etc., and lw drawn as different-length sequences of IF, ID, EX, M, WB]
if instructions vary in length, how do we calculate CPI?
(let's just say lw is 10 cycles)
10
Calculating Average CPI
every program is different, and every program has a different instruction mix: how many of each kind of instruction it uses.
let's say we have a program where 60% of the instructions are ALU, 20% are branches, 15% are loads, and 5% are stores.

          ALU    Branches  Loads  Stores
%         60%    20%       15%    5%
Cycles    4      3         10     9
CPI       2.4    0.6       1.5    0.45

1. for each category, multiply the proportion (percentage) by the number of cycles for that category to get the per-category CPI.
2. now sum the CPIs: 2.4 + 0.6 + 1.5 + 0.45 = 4.95
this is the Average CPI for THIS program. different mixes give different CPIs!
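the two steps above are just a weighted sum, so here's a minimal Python sketch of it. the numbers are the ones from this slide; the names `MIX` and `average_cpi` are made up for illustration.

```python
# Instruction mix from the slide: category -> (fraction of instructions, cycles).
MIX = {
    "ALU":      (0.60, 4),
    "Branches": (0.20, 3),
    "Loads":    (0.15, 10),
    "Stores":   (0.05, 9),
}

def average_cpi(mix):
    """Step 1: fraction * cycles per category. Step 2: sum the results."""
    return sum(fraction * cycles for fraction, cycles in mix.values())

if __name__ == "__main__":
    for name, (fraction, cycles) in MIX.items():
        print(f"{name}: {fraction * cycles:.2f}")
    print(f"average CPI: {average_cpi(MIX):.2f}")
```

swap in a different mix (say, more loads) and the average CPI moves with it, which is the point of the slide.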
11
The performance equation
if we have n instructions, and each instruction takes CPI cycles, and each cycle takes t seconds, how long does it take to execute all the instructions?
$$\text{Total time} = n\ \text{instructions} \times \mathit{CPI}\ \frac{\text{cycles}}{\text{instruction}} \times t\ \frac{\text{seconds}}{\text{cycle}} = n \times \mathit{CPI} \times t\ \text{seconds}$$
or in English, it's the product of the instruction count, the CPI, and the length of one clock cycle.
12
So how much better is it?!??!?
say we execute 500 million ($5 \times 10^8$) instructions.
for the single-cycle datapath:
CPI = 1
cycle time = 15ns ($15 \times 10^{-9}$ s)
total time = n × CPI × cycle time = $(5 \times 10^8) \times (1) \times (15 \times 10^{-9})$ = 7.5 seconds.
for the multicycle datapath:
CPI = 4.95 (much higher!)
cycle time = 1ns (much lower!)
total time = $(5 \times 10^8) \times (4.95) \times (1 \times 10^{-9})$ = 2.475 seconds!
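to check the arithmetic, here's a small Python sketch that plugs this slide's numbers into n × CPI × cycle time for both datapaths. `total_time` and the variable names are hypothetical, not from the course materials.

```python
def total_time(n_instructions, cpi, cycle_time_s):
    """Performance equation: instruction count * CPI * seconds per cycle."""
    return n_instructions * cpi * cycle_time_s

N = 500_000_000  # 500 million instructions, as on the slide

single_cycle = total_time(N, cpi=1.0, cycle_time_s=15e-9)   # 15 ns clock
multicycle = total_time(N, cpi=4.95, cycle_time_s=1e-9)     # 1 ns clock
speedup = single_cycle / multicycle

if __name__ == "__main__":
    print(f"single-cycle: {single_cycle:.3f} s")
    print(f"multicycle:   {multicycle:.3f} s")
    print(f"speedup:      {speedup:.2f}x")
```

the clock got 15 times faster, but the speedup comes out to roughly 3, because the higher CPI eats most of the gain.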
13
doo-doo-doo-doo-doo-doo
not bad! I guess?
I mean, we had to increase the clock speed by a factor of 15... and we only got triple the performance out of it.
if our CPI were also close to 1, it'd be 15 times as fast as the single-cycle machine...
HMMMMMMMMMMMMMMMMMMMMMM.
Pipelining!
14
Multi-cycle control
15
Things are more complicated now
each instruction has multiple steps.
[Figure: an add in the multicycle datapath: Instruction Memory (F), Control and Register File (D), X, Data Memory (M), W]
16
One problem solved
loads and stores are no longer trying to access memory at the same time as the instruction fetch!
[Figure: the separate Instruction Memory and Data Memory merge into a single Memory]
we've transformed into a Von Neumann architecture.
17
What's a load look like now? (animated)
the numbers are clock cycles.
1. Fetch using PC
2. Decode
3. Calculate s0 + 4
4. Load value using the computed address
5. Copy into register
[Animation: lw t0, 4(s0) moving through PC, Memory, Control, Register File, and the pipeline registers, carrying the instruction, s0, s0+4, the computed address, and the loaded data]
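here's a rough Python sketch of those five cycles for `lw t0, 4(s0)`. it's a toy model, not the real datapath: the `latches` dict stands in for the registers between phases, one dict plays the single shared memory, and the addresses and the loaded value (17) are made-up example data.

```python
def run_lw(regs, memory, pc):
    """Walk one lw through five clock cycles; returns the list of steps taken."""
    latches = {}  # stands in for the registers between phases
    trace = []

    # Cycle 1: fetch the instruction using PC.
    latches["inst"] = memory[pc]
    trace.append("1 fetch")

    # Cycle 2: decode, and read the base register.
    op, dest, offset, base = latches["inst"]
    if op != "lw":
        raise ValueError("this sketch only handles lw")
    latches["base_value"] = regs[base]
    trace.append("2 decode")

    # Cycle 3: the ALU calculates base + offset (s0 + 4).
    latches["address"] = latches["base_value"] + offset
    trace.append("3 calculate address")

    # Cycle 4: load the value using the computed address.
    latches["data"] = memory[latches["address"]]
    trace.append("4 load")

    # Cycle 5: copy the loaded value into the destination register.
    regs[dest] = latches["data"]
    trace.append("5 write register")
    return trace

# One memory holds both the instruction (at address 0) and the data (at 0x104).
regs = {"s0": 0x100, "t0": 0}
memory = {0: ("lw", "t0", 4, "s0"), 0x104: 17}
trace = run_lw(regs, memory, pc=0)

if __name__ == "__main__":
    print(trace)
    print(regs["t0"])
```

notice the same `memory` gets read in cycle 1 and cycle 4, never in the same cycle, which is why one memory is enough.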
18
So how do we control it all?
what about other instructions? loads, stores, branches… they all have different sequences of steps.
[Figure: Control takes the fetched instruction and a step register as inputs, and outputs MemWrite, ALUSrc, ALUOp, RegDataSrc, etc…]
this "step register" keeps track of what step of the instruction we're on.
oh no this looks familiar
19
these FSMs just will not leave us alone
in a multicycle implementation, control is an FSM.
what's the first step of ANY instruction? Fetch.
and then? Decode.
and then? a different path for each instruction: lw, add, etc…
but eventually… every path leads back to Fetch.
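a minimal Python sketch of that FSM, assuming one state per phase (F, D, X, M, W) and only the instructions named earlier in the deck. the state names and the `next_state` function are illustrative, and a real lw could sit in the memory state for many cycles.

```python
def next_state(state, opcode):
    """Transition function: the current step plus the opcode pick the next step."""
    if state == "FETCH":
        return "DECODE"          # first step of ANY instruction
    if state == "DECODE":
        return "EXECUTE"
    if state == "EXECUTE":
        if opcode == "lw":
            return "MEMORY"      # loads still need to read memory
        if opcode in ("add", "sub"):
            return "WRITEBACK"   # ALU ops go straight to the register write
        if opcode in ("beq", "j"):
            return "FETCH"       # branches and jumps are done after execute
        raise ValueError(f"unhandled opcode: {opcode}")
    if state == "MEMORY":
        return "WRITEBACK"
    if state == "WRITEBACK":
        return "FETCH"
    raise ValueError(f"unknown state: {state}")

def steps_for(opcode):
    """Follow the FSM from FETCH until it is about to fetch again."""
    state = "FETCH"
    visited = []
    while True:
        visited.append(state)
        state = next_state(state, opcode)
        if state == "FETCH":
            return visited

if __name__ == "__main__":
    for op in ("beq", "add", "lw"):
        print(op, steps_for(op))
```

the `state` variable here is doing the job of the step register.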
20
Generalizing
if we were to write a program to describe this control unit…
while(true) {
    inst = memory[PC]
    switch(inst.opcode) {
        case LW:
            addr = REG[inst.rs] + inst.imm
            value = MEM[addr]
            REG[inst.rt] = value
            break
        case ADD:
            ...
    }
}
this isn't just an illustration. we could literally make our control unit run programs.
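for something you can actually run, here's a Python sketch of the same loop. it's a toy under a few assumptions: instructions are dicts instead of encoded words, the loop stops at the end of the program instead of running forever, the ADD case is filled in the obvious way, and the lw destination is named `rt` as in the MIPS I-type format.

```python
def run(program, regs, mem):
    """Fetch an instruction, switch on its opcode, do its steps, repeat."""
    pc = 0
    while pc < len(program):
        inst = program[pc]       # fetch
        pc += 1
        op = inst["op"]          # decode
        if op == "lw":
            addr = regs[inst["rs"]] + inst["imm"]
            value = mem[addr]
            regs[inst["rt"]] = value
        elif op == "add":
            regs[inst["rd"]] = regs[inst["rs"]] + regs[inst["rt"]]
        else:
            raise ValueError(f"unhandled opcode: {op}")
    return regs

# Hypothetical two-instruction program: lw t0, 4(s0) then add t1, t0, t0.
program = [
    {"op": "lw", "rt": "t0", "rs": "s0", "imm": 4},
    {"op": "add", "rd": "t1", "rs": "t0", "rt": "t0"},
]
regs = run(program, {"s0": 0x100, "t0": 0, "t1": 0}, {0x104: 17})

if __name__ == "__main__":
    print(regs)
```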
21
Microprogramming
22
What is microprogramming?
it's making a control unit into a tiny CPU that runs programs.
these programs are called microcode (µCode).
the sequencer decides what step to do next.
the µCode ROM holds the µPrograms for each instruction.
each µInstruction is mostly a set of control signals.
[Figure: the MIPS instruction feeds the Sequencer, which drives the µPC; the µPC addresses the µCode ROM, which outputs MemWrite, ALUOp, RegDataSrc, ALUSrc, etc…]
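here's one way the sequencer / µPC / µCode ROM trio could look as a Python sketch. everything about the ROM layout is an assumption made for illustration: the addresses, the "next" field, the dispatch table, and the `MemRead`, `IRWrite`, and `RegWrite` signals are invented, while `ALUSrc`, `ALUOp`, and `RegDataSrc` are signal names from the slide.

```python
# Hypothetical µCode ROM: each entry is a set of control signals plus a hint
# telling the sequencer how to pick the next µPC.
UCODE_ROM = [
    {"signals": {"MemRead": 1, "IRWrite": 1},            "next": "seq"},       # 0: fetch
    {"signals": {},                                      "next": "dispatch"},  # 1: decode
    {"signals": {"ALUSrc": 1, "ALUOp": "add"},           "next": "seq"},       # 2: lw address
    {"signals": {"MemRead": 1},                          "next": "seq"},       # 3: lw load
    {"signals": {"RegWrite": 1, "RegDataSrc": "mem"},    "next": "fetch"},     # 4: lw write
    {"signals": {"ALUSrc": 0, "ALUOp": "add"},           "next": "seq"},       # 5: add execute
    {"signals": {"RegWrite": 1, "RegDataSrc": "alu"},    "next": "fetch"},     # 6: add write
]

# Where each instruction's µProgram starts after the shared fetch/decode steps.
DISPATCH = {"lw": 2, "add": 5}

def sequencer(upc, next_kind, opcode):
    """Decide what step to do next."""
    if next_kind == "seq":
        return upc + 1
    if next_kind == "dispatch":
        return DISPATCH[opcode]
    return 0  # "fetch": go back to the start for the next instruction

def run_instruction(opcode):
    """Step the µPC through one instruction; returns (µPC, signals) per cycle."""
    upc = 0
    emitted = []
    while True:
        uinst = UCODE_ROM[upc]
        emitted.append((upc, uinst["signals"]))
        upc = sequencer(upc, uinst["next"], opcode)
        if upc == 0:
            return emitted

if __name__ == "__main__":
    for op in ("lw", "add"):
        print(op, [upc for upc, _ in run_instruction(op)])
```

changing what an instruction does means editing entries in `UCODE_ROM`, not rewiring anything.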
23
Read-only memory? Well…
the real power comes when we make it possible to reprogram the µCode ROM!
it's inside the CPU, and it can be made of EEPROM or Flash, so we can change it!
[Figure: firmware.bin being loaded into the µCode ROM inside the CPU]
firmware is software which serves a very important function and is hard to change
(get it? it's softer than hardware but harder than software…)
so what could you do if you could reprogram how your CPU works?
24
The sky ROM size is the limit
add new instructions! vpmultishiftqb phminposuw sqrmaxrstdec
fix gigantic security holes in old ones!
improve performance! CPI before: 3.3, CPI now: 2.8
make add instructions subtract instead!! haha pranked
25
So what's the catch?
microcoded control is flexible, but that comes at a cost: speed.
WHO WOULD WIN? a complicated mini-CPU inside your CPU (Sequencer, µPC, µCode ROM), or a handful of G A T E Y B O Y S?
hardwired control (like your project) can be very fast.
26
The origins of RISC
by the late 70s and early 80s, CISC architectures were reaching their peak of complexity, and that meant big complex control units.
it wasn't unusual for the control to take up the majority of the chip.
[Figure: chip area split between control, registers, and ALU for a CISC CPU and a RISC CPU]
RISC and MIPS took the… risk of simplifying control,
and this was the right risk to take!
27
One tool of many
microcode is absolutely still in use, mostly in CISC architectures.
what the CPU fetches:
    inc [eax + ecx*16 + 4]
translation to "µOps" (microcode assisted)
what the CPU core might actually execute:
    shl tmp1, ecx, 4
    add tmp1, eax
    add tmp1, 4
    load tmp2, [tmp1]
    add tmp2, 1
    store [tmp1], tmp2
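to see that the six µOps really do the same job as the one inc, here's a Python sketch that runs both on a toy register file and memory. it ignores flags and 32-bit wraparound, and the register values and addresses are made-up example data.

```python
def inc_direct(regs, mem):
    """What the programmer wrote: inc [eax + ecx*16 + 4]."""
    addr = regs["eax"] + regs["ecx"] * 16 + 4
    mem[addr] += 1

def inc_uops(regs, mem):
    """The same work as the slide's µOp sequence, one line per µOp."""
    tmp1 = regs["ecx"] << 4     # shl tmp1, ecx, 4
    tmp1 = tmp1 + regs["eax"]   # add tmp1, eax
    tmp1 = tmp1 + 4             # add tmp1, 4
    tmp2 = mem[tmp1]            # load tmp2, [tmp1]
    tmp2 = tmp2 + 1             # add tmp2, 1
    mem[tmp1] = tmp2            # store [tmp1], tmp2

regs = {"eax": 0x1000, "ecx": 3}
mem_a = {0x1034: 41}            # 0x1000 + 3*16 + 4 = 0x1034
mem_b = dict(mem_a)

inc_direct(regs, mem_a)
inc_uops(regs, mem_b)

if __name__ == "__main__":
    print(mem_a == mem_b, mem_b[0x1034])
```

shifting left by 4 is the multiply by 16, which is why the first µOp is a shl.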