CS152 Computer Architecture and Engineering Lecture 12 Introduction to Pipelining: Datapath and Control March 8 th, 2004 John Kubiatowicz (www.cs.berkeley.edu/~kubitron)

Slides:



Advertisements
Similar presentations
1 Pipelining Part 2 CS Data Hazards Data hazards occur when the pipeline changes the order of read/write accesses to operands that differs from.
Advertisements

1 IKI20210 Pengantar Organisasi Komputer Kuliah no. 25: Pipeline 10 Januari 2003 Bobby Nazief Johny Moningka
Pipelining - Hazards.
Instruction-Level Parallelism (ILP)
ECE 232 L22.Pipeline3.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 22 Pipelining,
CS 61C L19 Pipelining II (1) A Carle, Summer 2005 © UCB inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #19: Pipelining II
Mary Jane Irwin ( ) [Adapted from Computer Organization and Design,
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
ECE 361 Computer Architecture Lecture 13: Designing a Pipeline Processor Start X:40.
1  2004 Morgan Kaufmann Publishers Chapter Six. 2  2004 Morgan Kaufmann Publishers Pipelining The laundry analogy.
©UCB CS 162 Computer Architecture Lecture 3: Pipelining Contd. Instructor: L.N. Bhuyan
CS152 / Kubiatowicz Lec13.1 3/17/03©UCB Spring 2003 CS152 Computer Architecture and Engineering Lecture 13 Introduction to Pipelining: Datapath and Control.
Computer ArchitectureFall 2007 © October 24nd, 2007 Majd F. Sakr CS-447– Computer Architecture.
ECE 232 L19.Pipeline2.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 19 Pipelining,
Computer ArchitectureFall 2007 © October 22nd, 2007 Majd F. Sakr CS-447– Computer Architecture.
Pipelining Datapath Adapted from the lecture notes of Dr. John Kubiatowicz (UC Berkeley) and Hank Walker (TAMU)
CS152 / Kubiatowicz Lec13.1 3/17/03©UCB Spring 2003 CS152 Computer Architecture and Engineering Lecture 13 Introduction to Pipelining: Datapath and Control.
Pipelining - II Adapted from CS 152C (UC Berkeley) lectures notes of Spring 2002.
Ceg3420 L1 4.1 DAP Fa97,  U.CB CEG3420 Computer Design Introduction to Pipelining.
CS152 / Kubiatowicz Lec /17/01©UCB Fall 2001 CS152 Computer Architecture and Engineering Lecture 13 Introduction to Pipelining: Datapath and Control.
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
Pipelining - II Rabi Mahapatra Adapted from CS 152C (UC Berkeley) lectures notes of Spring 2002.
Lecture 15: Pipelining and Hazards CS 2011 Fall 2014, Dr. Rozier.
Lecture 12: Pipeline Datapath Design Professor Mike Schulte Computer Architecture ECE 201.
1 Pipelining Reconsider the data path we just did Each instruction takes from 3 to 5 clock cycles However, there are parts of hardware that are idle many.
1 Appendix A Pipeline implementation Pipeline hazards, detection and forwarding Multiple-cycle operations MIPS R4000 CDA5155 Spring, 2007, Peir / University.
CSE 340 Computer Architecture Summer 2014 Basic MIPS Pipelining Review.
CS.305 Computer Architecture Enhancing Performance with Pipelining Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
CMPE 421 Parallel Computer Architecture
1 Designing a Pipelined Processor In this Chapter, we will study 1. Pipelined datapath 2. Pipelined control 3. Data Hazards 4. Forwarding 5. Branch Hazards.
ECE 232 L18.Pipeline.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 18 Pipelining.
CECS 440 Pipelining.1(c) 2014 – R. W. Allison [slides adapted from D. Patterson slides with additional credits to M.J. Irwin]

Cs 152 L1 3.1 DAP Fa97,  U.CB Pipelining Lessons °Pipelining doesn’t help latency of single task, it helps throughput of entire workload °Multiple tasks.
Chap 6.1 Computer Architecture Chapter 6 Enhancing Performance with Pipelining.
CSIE30300 Computer Architecture Unit 04: Basic MIPS Pipelining Hsin-Chou Chi [Adapted from material by and
CSE431 L07 Overcoming Data Hazards.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 07: Overcoming Data Hazards Mary Jane Irwin (
Instructor: Senior Lecturer SOE Dan Garcia CS 61C: Great Ideas in Computer Architecture Pipelining Hazards 1.
CPE 442 hazards.1 Introduction to Computer Architecture CpE 442 Designing a Pipeline Processor (lect. II)
Pipelining CS365 Lecture 9. D. Barbara Pipeline CS465 2 Outline  Today’s topic  Pipelining is an implementation technique in which multiple instructions.
CSIE30300 Computer Architecture Unit 05: Overcoming Data Hazards Hsin-Chou Chi [Adapted from material by and
CSE431 L06 Basic MIPS Pipelining.1Irwin, PSU, 2005 MIPS Pipeline Datapath Modifications  What do we need to add/modify in our MIPS datapath? l State registers.
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
CSE 340 Computer Architecture Spring 2016 Overcoming Data Hazards.
Computer Organization
Exceptions Another form of control hazard Could be caused by…
ECE232: Hardware Organization and Design
Pipelining Lessons 6 PM T a s k O r d e B C D A 30
CpE 442 Designing a Pipeline Processor (lect. II)
Chapter 4 The Processor Part 3
Review: MIPS Pipeline Data and Control Paths
Chapter 4 The Processor Part 2
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
CS 704 Advanced Computer Architecture
CS152 – Computer Architecture and Engineering Lecture 11 –
Pipelining Lessons 6 PM T a s k O r d e B C D A 30
The Processor Lecture 3.6: Control Hazards
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
The Processor Lecture 3.5: Data Hazards
Instruction Execution Cycle
Key to pipelining: smooth flow Hazards limit performance
CS152 Computer Architecture and Engineering Lecture 14 Pipelining Control Continued Introduction to Advanced Pipelining.
Pipeline Control unit (highly abstracted)
Give qualifications of instructors: DAP
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
CMCS Computer Architecture Lecture 20 Pipelined Datapath and Control April 11, CMSC411.htm Mohamed.
Guest Lecturer: Justin Hsia
Recall: Performance Evaluation
Presentation transcript:

CS152 Computer Architecture and Engineering Lecture 12 Introduction to Pipelining: Datapath and Control March 8 th, 2004 John Kubiatowicz ( lecture slides:

CS152 / Kubiatowicz Lec12.2 3/8/04©UCB Spring 2004 °The Five Classic Components of a Computer °Today’s Topics: Recap last lecture/finish datapath Pipelined Control/ Do it yourself Pipelined Control Administrivia Hazards/Forwarding Exceptions Review MIPS R3000 pipeline The Big Picture: Where are We Now? Control Datapath Memory Processor Input Output

CS152 / Kubiatowicz Lec12.3 3/8/04©UCB Spring 2004 Can pipelining get us into trouble? °Yes: Pipeline Hazards structural hazards: attempt to use the same resource two different ways at the same time -E.g., combined washer/dryer would be a structural hazard or folder busy doing something else (watching TV) data hazards: attempt to use item before it is ready -E.g., one sock of pair in dryer and one in washer; can’t fold until get sock from washer through dryer -instruction depends on result of prior instruction still in the pipeline control hazards: attempt to make a decision before condition is evaulated -E.g., washing football uniforms and need to get proper detergent level; need to see after dryer before next load in -branch instructions °Can always resolve hazards by waiting pipeline control must detect the hazard take action (or delay action) to resolve hazards

CS152 / Kubiatowicz Lec12.4 3/8/04©UCB Spring 2004 Recap: Data Hazards I-Fet ch DCD MemOpFetch OpFetch Exec Store IFetch DCD ° ° ° Structural Hazard I-Fet ch DCD OpFetch Jump IFetch DCD ° ° ° Control Hazard IF DCD EX Mem WB IF DCD OF Ex Mem RAW (read after write) Data Hazard WAW Data Hazard (write after write) IF DCD OF Ex RSWAR Data Hazard (write after read) IF DCD EX Mem WB

CS152 / Kubiatowicz Lec12.5 3/8/04©UCB Spring 2004 Recall: Single cycle control! Data Out Clk 5 RwRaRb bit Registers Rd ALU Clk Data In Data Address Ideal Data Memory Instruction Address Ideal Instruction Memory Clk PC 5 Rs 5 Rt 32 A B Next Address Control Datapath Control Signals Conditions

CS152 / Kubiatowicz Lec12.6 3/8/04©UCB Spring 2004 Data Stationary Control °The Main Control generates the control signals during Reg/Dec Control signals for Exec (ExtOp, ALUSrc,...) are used 1 cycle later Control signals for Mem (MemWr Branch) are used 2 cycles later Control signals for Wr (MemtoReg MemWr) are used 3 cycles later IF/ID Register ID/Ex Register Ex/Mem Register Mem/Wr Register Reg/DecExecMem ExtOp ALUOp RegDst ALUSrc Branch MemWr MemtoReg RegWr Main Control ExtOp ALUOp RegDst ALUSrc MemtoReg RegWr MemtoReg RegWr MemtoReg RegWr Branch MemWr Branch MemWr Wr

CS152 / Kubiatowicz Lec12.7 3/8/04©UCB Spring 2004 Datapath + Data Stationary Control Exec Reg. File Mem Acces s Data Mem ABS Reg File PC Next PC IR Inst. Mem D Decode Mem Ctrl WB Ctrl M rsrt op rs rt fun im ex me wb rw v me wb rw v wb rw v

CS152 / Kubiatowicz Lec12.8 3/8/04©UCB Spring 2004 Let’s Try it Out 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 these addresses are octal

CS152 / Kubiatowicz Lec12.9 3/8/04©UCB Spring 2004 Start: Fetch 10 Exec Reg. File Mem Acces s Data Mem ABS Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl M rsrt im 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 IF PC Next PC 10 = nnnn

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 14, Decode 10 Exec Reg. File Mem Acces s Data Mem ABS Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl M 2rt im 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 lw r1, r2(35) ID IF PC Next PC 14 = nnn

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 20, Decode 14, Exec 10 Exec Reg. File Mem Acces s Data Mem r2 BS Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl M 2rt 35 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 lw r1 addI r2, r2, 3 ID IF EX PC Next PC 20 = n n

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 24, Decode 20, Exec 14, Mem 10 Exec Reg. File Mem Acces s Data Mem r2 B r2+35 Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl M lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 lw r1 sub r3, r4, r5 addI r2, r2, 3 ID IF EX M PC Next PC 24 = n

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 30, Dcd 24, Ex 20, Mem 14, WB 10 Exec Reg. File Mem Acces s Data Mem r4 r5 r2+3 Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl M[r2+35] 67 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 lw r1 beq r6, r7 100 addI r2 sub r3 ID IF EX M WB PC Next PC 30 = Note Delayed Branch: always execute ori after beq

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 100, Dcd 30, Ex 24, Mem 20, WB 14 Exec Reg. File Mem Acces s Data Mem r6 r7 r2+3 Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl r1=M[r2+35] 9xx 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 beq addI r2 sub r3 r4-r5 100 ori r8, r9 17 ID IF EX M WB PC Next PC 100 =

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 104, Dcd 100, Ex 30, Mem 24, WB 20 Exec Reg. File Mem Acces s Data Mem Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 ID EX M WB PC Next PC ___ = Fill it in yourself! ?

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 110, Dcd 104, Ex 100, Mem 30, WB 24 Exec Reg. File Mem Acces s Data Mem Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 EX M WB PC Next PC ___ = Fill it in yourself! ?? ? ? ?

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Fetch 114, Dcd 110, Ex 104, Mem 100, WB 30 Exec Reg. File Mem Acces s Data Mem Reg File IR Inst. Mem D Decode Mem Ctrl WB Ctrl 10lw r1, r2(35) 14addI r2, r2, 3 20subr3, r4, r5 24beqr6, r7, orir8, r9, 17 34addr10, r11, r12 100andr13, r14, 15 M WB PC Next PC ___ = Fill it in yourself! ?? ? ? ? ?

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Pipelined Processor °Separate control at each stage °Stalls propagate backwards to freeze previous stages °Bubbles in pipeline introduced by placing “Noops” into local stage, stall previous stages. Exec Reg. File Mem Acces s Data Mem A B S M Reg File Equal PC Next PC IR Inst. Mem Valid IRex Dcd Ctrl IRmem Ex Ctrl IRwb Mem Ctrl WB Ctrl D Stalls Bubbles

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Recap: Data Hazards °Avoid some “by design” eliminate WAR by always fetching operands early (DCD) in pipe eleminate WAW by doing all WBs in order (last stage, static) °Detect and resolve remaining ones stall or forward (if possible) IF DCD EX Mem WB IF DCD OF Ex Mem RAW Data Hazard WAW Data Hazard IF DCD OF Ex RSWAR Data Hazard IF DCD EX Mem WB

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Is CPI = 1 for our pipeline? °Remember that CPI is an “Average # cycles/inst °CPI here is 1, since the average throughput is 1 instruction every cycle. °What if there are stalls or multi-cycle execution? °Usually CPI > 1. How close can we get to 1?? IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Administrivia °Midterm I: Wednesday! (3/10) 310 Soda Hall, 5:30 – 8:30 One sheet of notes (both sides) Afterwards, pizza at LaVals (I’ll buy!)b °Topics: Chapters 1-5, some knowledge about about pipelining Should know material in Appendices A-C Could be questions about state machine design (Prereq quiz material is fair game!) °Review Session Tuesday: 7:00 – 9:00 Location TBA (405?) °Homework 4: Not out yet I may move the deadline °Lab 3 due Thursday! (3/11) Demonstrate to Tas in section Report due by midnight that day

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Hazard Detection °Suppose instruction i is about to be issued and a predecessor instruction j is in the instruction pipeline. °A RAW hazard exists on register  if  Rregs( i )  Wregs( j ) Keep a record of pending writes (for inst's in the pipe) and compare with operand regs of current instruction. When instruction issues, reserve its result register. When on operation completes, remove its write reservation. °A WAW hazard exists on register  if  Wregs( i )  Wregs( j ) °A WAR hazard exists on register  if  Wregs( i )  Rregs( j ) Window on execution: Only pending instructions can cause hazards Inst J Inst I New Inst Instruction Movement:

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 “Forward” result from one stage to another “or” OK if define read/write properly Data Hazard Solution: Forwarding I n s t r. O r d e r Time (clock cycles) add r1,r2,r3 sub r4,r1,r3 and r6,r1,r7 or r8,r1,r9 xor r10,r1,r11 IFIF ID/R F EXEX ME M WBWB ALU Im Reg Dm Reg ALU Im Reg DmReg ALU Im Reg DmReg Im ALU Reg DmReg ALU Im Reg DmReg

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Record of Pending Writes In Pipeline Registers °Current operand registers °Pending writes °hazard <= ((rs == rw ex) & regW ex ) OR ((rs == rw mem) & regW me ) OR ((rs == rw wb) & regW wb ) OR ((rt == rw ex) & regW ex ) OR ((rt == rw mem) & regW me ) OR ((rt == rw wb ) & regW wb ) npc I mem Regs B alu S D mem m IAU PC Regs A imoprwn oprwn oprwn op rw rs rt

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Resolve RAW by “forwarding” (or bypassing) °Detect nearest valid write op operand register and forward into op latches, bypassing remainder of the pipe Increase muxes to add paths from pipeline registers Data Forwarding = Data Bypassing npc I mem Regs B alu S D mem m IAU PC Regs A imoprwn oprwn oprwn op rw rs rt Forward mux

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Question: Critical Path??? °Bypass path is invariably trouble °Options? Make logic really fast Move forwarding after muxes -Problem: screws up branches that require forwarding! -Use same tricks as “carry-skip” adder to fix this? -This option may just push delay around….! Insert an extra cycle for branches that need forwarding? -Or: hit common case of forwarding from EX stage and stall for forward from memory? Regs B alu S D mem m Regs A im Forward mux Equal PC Sel

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 What about memory operations? AB op Rd Ra Rb Rd to reg file R Rd ºIf instructions are initiated in order and operations always occur in the same stage, there can be no hazards between memory operations! º What does delaying WB on arithmetic operations cost? – cycles ? – hardware ? ºWhat about data dependence on loads? R1 <- R4 + R5 R2 <- Mem[ R2 + I ] R3 <- R2 + R1  “Delayed Loads” ºCan recognize this in decode stage and introduce bubble while stalling fetch stage (hint for lab 4!) ºTricky situation: R1 <- Mem[ R2 + I ] Mem[R3+34] <- R1 Handle with bypass in memory stage! D Mem T

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Compiler Avoiding Load Stalls: °Recall: MIPS I had no pipeline stalls “Microprocessor without Interlocking Pipeline Stages Consequently, the “Unscheduled” code above would be wrong % loads stalling pipeline 0%20%40%60%80% tex spice gcc 25% 14% 31% 65% 42% 54% scheduledunscheduled

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 What about Interrupts, Traps, Faults? °External Interrupts: Allow pipeline to drain, Fill with NOPs Load PC with interrupt address °Faults (within instruction, restartable) Force trap instruction into IF disable writes till trap hits WB must save multiple PCs or PC + state °Recall: Precise Exceptions  State of the machine is preserved as if program executed up to the offending instruction All previous instructions completed Offending instruction and all following instructions act as if they have not even started Same system code will work on different implementations

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Exception/Interrupts: Implementation questions 5 instructions, executing in 5 different pipeline stages! °Who caused the interrupt? StageProblem interrupts occurring IFPage fault on instruction fetch; misaligned memory access; memory-protection violation IDUndefined or illegal opcode EXArithmetic exception MEMPage fault on data fetch; misaligned memory access; memory-protection violation; memory error °How do we stop the pipeline? How do we restart it? °Do we interrupt immediately or wait? °How do we sort all of this out to maintain preciseness?

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Exception Handling npc I mem Regs B alu S D mem m IAU PC lw $2,20($5) Regs A imoprwn detect bad instruction address detect bad instruction detect overflow detect bad data address Allow exception to take effect Excp

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Another look at the exception problem °Use pipeline to sort this out! Pass exception status along with instruction. Keep track of PCs for every instruction in pipeline. Don’t act on exception until it reache WB stage °Handle interrupts through “faulting noop” in IF stage °When instruction reaches end of MEM stage: Save PC  EPC, Interrupt vector addr  PC Turn all instructions in earlier stages into noops! Program Flow Time IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB Data TLB Bad Inst Inst TLB fault Overflow

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Examples of stalls/bubbles °Exceptions: Flush everything above (later) Prevent instructions following exception from commiting state Put faulting flag on current instruction Freeze fetch until exception resolved °Stalls: Introduce brief stalls into pipeline Decode stage recognizes that current instruction cannot proceed Freeze fetch stage Introduce “bubble” into EX stage (instead of forwarding stalled inst) Can stall until condition is resolved Examples: -mfhi, mflo: need to wait for multiply/divide unit to finish -“Break” instruction for Lab5: stall until release line received -Load delay slot handled this way as well

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Example: Delayed Load – Freeze above & Bubble Below °Flush accomplished by setting “invalid” bit in pipeline npc I mem Regs B alu S D mem m IAU PC Regs A imoprwn oprwn oprwn op rw rs rt bubble freeze

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Summary °Hazards limit performance Structural: need more HW resources Data: need forwarding, compiler scheduling Control: early evaluation & PC, delayed branch, prediction °Data hazards must be handled carefully: RAW data hazards handled by forwarding WAW and WAR hazards don’t exist in 5-stage pipeline Some hazards handled by stalling °MIPS I instruction set architecture made pipeline visible (delayed branch, delayed load) °Exceptions in 5-stage pipeline recorded when they occur, but acted on only at WB (end of MEM) stage Must flush all previous instructions °Compiler optimizations may be used to avoid stalls Loop unrolling  Multiple iterations of loop in software: -Amortizes loop overhead over several iterations -Gives more opportunity for scheduling around stalls

CS152 / Kubiatowicz Lec /8/04©UCB Spring 2004 Summary: Where this class is going °We’ll build a simple pipeline and look at these issues Lab 4  Pipelined Processor Lab 5  With caches °We’ll talk about modern processors and what’s really hard: Branches (control hazards) are really hard! Exception handling Trying to improve performance with out-of-order execution, etc. Trying to get CPI < 1 (Superscalar execution)