Computer Organization & Design 计算机组成与设计

Slides:



Advertisements
Similar presentations
Morgan Kaufmann Publishers The Processor
Advertisements

1 Pipelining Part 2 CS Data Hazards Data hazards occur when the pipeline changes the order of read/write accesses to operands that differs from.
Mehmet Can Vuran, Instructor University of Nebraska-Lincoln Acknowledgement: Overheads adapted from those provided by the authors of the textbook.
Pipeline Hazards Pipeline hazards These are situations that inhibit that the next instruction can be processed in the next stage of the pipeline. This.
February 2, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 5 - Pipelining II Krste Asanovic Electrical Engineering and Computer.
Chapter 12 CPU Structure and Function. CPU Sequence Fetch instructions Interpret instructions Fetch data Process data Write data.
Computer Organization and Architecture
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Pipelining III Steve Ko Computer Sciences and Engineering University at Buffalo.
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Chapter 12 Pipelining Strategies Performance Hazards.
Pipeline Exceptions & ControlCSCE430/830 Pipeline: Exceptions & Control CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Yifeng.
EENG449b/Savvides Lec 4.1 1/22/04 January 22, 2004 Prof. Andreas Savvides Spring EENG 449bG/CPSC 439bG Computer.
February 2, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 5 - Pipelining II Krste Asanovic Electrical Engineering and Computer.
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
Group 5 Alain J. Percial Paula A. Ortiz Francis X. Ruiz.
What are Exception and Interrupts? MIPS terminology Exception: any unexpected change in the internal control flow – Invoking an operating system service.
Lecture 15: Pipelining and Hazards CS 2011 Fall 2014, Dr. Rozier.
CMPE 421 Parallel Computer Architecture
Constructive Computer Architecture Interrupts/Exceptions/Faults Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Oct. 18, 2000Machine Organization1 Machine Organization (CS 570) Lecture 4: Pipelining * Jeremy R. Johnson Wed. Oct. 18, 2000 *This lecture was derived.
Instructor: Senior Lecturer SOE Dan Garcia CS 61C: Great Ideas in Computer Architecture Pipelining Hazards 1.
CMPE 421 REVIEW: MIDTERM 1. A MODIFIED FIVE-Stage Pipeline PC A Y R MD1 addr inst Inst Memory Imm Ext add rd1 GPRs rs1 rs2 ws wd rd2 we wdata addr wdata.
LECTURE 10 Pipelining: Advanced ILP. EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls,
ECE/CS 552: Pipeline Hazards © Prof. Mikko Lipasti Lecture notes based in part on slides created by Mark Hill, David Wood, Guri Sohi, John Shen and Jim.
Data Hazards Dependent instructions add %g1, %g2, %g3 sub %l1, %g3, %o0 Forwarding helps, but not all hazards can be avoided.
Computer Organization & Design 计算机组成与设计
Interrupts and exceptions
Exceptions Another form of control hazard Could be caused by…
Computer Organization CS224
CS252 Graduate Computer Architecture Fall 2015 Lecture 4: Pipelining
William Stallings Computer Organization and Architecture 8th Edition
Pipelining Wrapup Brief overview of the rest of chapter 3
Chapter 3 Top Level View of Computer Function and Interconnection
Computer Organization & Design 计算机组成与设计
Morgan Kaufmann Publishers The Processor
Peng Liu Lecture 7 Pipelining Peng Liu
CS 152 Computer Architecture and Engineering Lecture 4 - Pipelining
Exceptions & Multi-cycle Operations
CS 152 Computer Architecture and Engineering Lecture 4 - Pipelining
Pipelining: Advanced ILP
Dr. George Michelogiannakis EECS, University of California at Berkeley
Chapter 4 The Processor Part 3
Morgan Kaufmann Publishers The Processor
Computer Organization & Design 计算机组成与设计
Morgan Kaufmann Publishers The Processor
Lecture 6: Advanced Pipelines
The processor: Pipelining and Branching
Lecture 8: ILP and Speculation Contd. Chapter 2, Sections 2. 6, 2
Dept. of Info. Sci. & Elec. Engg.
Pipelining in more detail
Computer Organization & Design 计算机组成与设计
The Processor Lecture 3.6: Control Hazards
BIC 10503: COMPUTER ARCHITECTURE
Control unit extension for data hazards
The Processor Lecture 3.5: Data Hazards
Project Instruction Scheduler Assembler for DLX
Computer Architecture
CS203 – Advanced Computer Architecture
ECE 445 – Computer Organization
Control Hazards Branches (conditional, unconditional, call-return)
Control unit extension for data hazards
Control unit extension for data hazards
Chapter 11 Processor Structure and function
Interrupts and exceptions
Guest Lecturer: Justin Hsia
CMSC 611: Advanced Computer Architecture
ELEC / Computer Architecture and Design Spring 2015 Pipeline Control and Performance (Chapter 6) Vishwani D. Agrawal James J. Danaher.
Presentation transcript:

Computer Organization & Design 计算机组成与设计 Weidong Wang (王维东) wdwang@zju.edu.cn College of Information Science & Electronic Engineering 信息与通信网络工程研究所(ICAN) Zhejiang University

Course Information Instructor: Weidong WANG TA: Email: wdwang@zju.edu.cn Tel(O): 0571-87953170; Office Hours: TBD, Yuquan Campus, Xindian (High-Tech) Building 306 Mobile: 13605812196 TA: mobile,Email: 陈 彬彬 Binbin CHEN, 13071888906; 15091831397@163.com; 陈 佳云 Jiayun CHEN,13161700140; chenjy93@outlook.com; Office Hours: Wednesday & Saturday 14:00-16:30 PM. Xindian (High-Tech) Building 308.(也可以短信邮件联系) 微信号-“2017计组群”

Lecture 8 Pipeline Hazards and Exception Processing

Review: Pipelined Execution: ALU Instructions PC A B Y R MD1 MD2 addr inst Inst Memory 0x4 Add IR Imm Ext ALU rd1 GPRs rs1 rs2 ws wd rd2 we wdata rdata Data IR 31 Not quite correct! We need an Instruction Reg (IR) for each stage CS252 S05

Review: Implementing Control

Review: Pipelined Processor Datapath + Control

Can Pipelining Lead to an Arbitrary Short Clock Cycle?

Dependencies and Hazards

Structural Hazards Structural hazards occurs when two instruction need same hardware resource at same time Can resolve in hardware by stalling newer instruction till older instruction finished with resource A structural hazard can always be avoided by adding more hardware to design E.g., if two instructions both need a port to memory at same time, could avoid hazard by adding second port to memory Our 5-stage pipe has no structural hazards by design Thanks to MIPS ISA, which was designed for pipelining

Avoiding Structural Hazards

Structural Hazard Example

Delayed Write-back

Data Dependencies

Dependency Examples True dependency=> RAW hazard addu $t0, $t1, $t2 subu $t3, $t4, $t0 Output dependency=> WAW hazard subu $t0, $t4, $t5 Anti dependency=> WAR hazard subu $t1, $t4, $t5

Data Hazards数据冲突 r1 is stale. Oops! r4  r1 … r1 … ... r1 r0 + 10 IR 31 PC A B Y R MD1 MD2 addr inst Inst Memory 0x4 Add Imm Ext ALU rd1 GPRs rs1 rs2 ws wd rd2 we wdata rdata Data ... r1 r0 + 10 r4 r1 + 17 r1 is stale. Oops! CS252 S05

RAW Hazards

RAW Hazard Example

Solutions for RAW Hazards

Data Hazards ----Stalls

Resolving Data Hazards (1) Strategy 1: Wait for the result to be available by freezing earlier pipeline stages  interlocks CS252 S05

How to Stall the Pipeline How to Insert a NOP or a Bubble

Interlocks to resolve Data Hazards Stall Condition IR 31 PC A B Y R MD1 MD2 addr inst Inst Memory 0x4 Add Imm Ext ALU rd1 GPRs rs1 rs2 ws wd rd2 we wdata rdata Data nop ... r1 r0 + 10 r4 r1 + 17 CS252 S05

Review: Interlock Control Logic ignoring jumps & branches IR PC A B Y R MD1 MD2 addr inst Inst Memory 0x4 Add Imm Ext ALU rd1 GPRs rs1 rs2 ws wd rd2 we wdata rdata Data 31 nop stall Cstall rs rt ? we ws we Cdest re1 re2 Cre Cdest Should we always stall if the rs field matches some rd? not every instruction writes a register we not every instruction reads a register re CS252 S05

Review: Hazards due to Loads & Stores IR 31 PC A B Y R MD1 MD2 addr inst Inst Memory 0x4 Add Imm Ext ALU rd1 GPRs rs1 rs2 ws wd rd2 we wdata rdata Data nop Stall Condition What if (r1)+7 = (r3)+5 ? ... M[(r1)+7]  (r2) r4  M[(r3)+5] Is there any possible data hazard in this instruction sequence? CS252 S05

Performance Effect

Reducing Stalls

RAW Hazards ---Register file

Reducing Stalls ----one step beyond

Data Hazards ---Forward

Resolving Data Hazards (2) Strategy 2: Route data as soon as possible after it is calculated to the earlier pipeline stage  bypass CS252 S05

Review: Fully Bypassed Datapath ASrc IR PC A B Y R MD1 MD2 addr inst Inst Memory 0x4 Add ALU Imm Ext rd1 GPRs rs1 rs2 ws wd rd2 we wdata rdata Data 31 nop stall D E M W PC for JAL, ... BSrc Is there still a need for the stall signal ? stall = (rsD=wsE). (opcodeE=LWE).(wsE0 ).re1D + (rtD=wsE). (opcodeE=LWE).(wsE0 ).re2D CS252 S05

Performance Effect

Forwarding Limitations

Data Hazards with Forwarding

Hardware Stall

Discovering the Forwarding Datapaths

Discovering the Forwarding Control

Discovering the Forwarding Control

Forwarding path

Forwarding Conditions Example (Stall case not shown)

Double Data Hazard

Revised Forwarding Condition

Datapath with Forwarding

Load-Use Data Hazard

Load-Use Data Hazard Detection

Datapath with Hazard Detection

Example: Load-Use Stall

Example: Load-Use Stall 1 cycle later

Resolving Data Hazards (3) Strategy 3: Speculate推测on the dependence. Two cases: Guessed correctly  do nothing Guessed incorrectly  kill and restart …. We’ll later see examples of this approach in more complex processors. CS252 S05

Compiler Optimizations & Data Hazards

Code Scheduling to Avoid Stalls

Pipeline and CPI Pipelining of instructions is complicated by HAZARDS: Pipelining increases clock frequency, while growing CPI more slowly, hence giving greater performance Time = Instructions Cycles Time Program Program * Instruction * Cycle Reduces because fewer logic gates on critical paths between flip-flops Increases because of pipeline bubbles Pipelining of instructions is complicated by HAZARDS: Structural hazards (two instructions want same hardware resource) Data hazards (earlier instruction produces value needed by later instruction) Control hazards (instruction changes control flow, e.g., branches or exceptions) Techniques to handle hazards: Interlock (hold newer instruction until older instructions drain out of pipeline and write back results) Bypass (transfer value from older instruction to newer instruction as soon as available somewhere in machine) Speculate (guess effect of earlier instruction) CS252 S05

Control Hazards

Control Hazards What do we need to calculate next PC? For Jumps Opcode, offset and PC For Jump Register Opcode and Register value For Conditional Branches Opcode, PC, Register (for condition), and offset For all other instructions Opcode and PC have to know it’s not one of above! CS252 S05

Opcode Decoding Bubble (assuming no branch delay slots for now) time t0 t1 t2 t3 t4 t5 t6 t7 . . . . (I1) r1 (r0) + 10 IF1 ID1 EX1 MA1 WB1 (I2) r3 (r2) + 17 IF2 IF2 ID2 EX2 MA2 WB2 (I3) IF3 IF3 ID3 EX3 MA3 WB3 (I4) IF4 IF4 ID4 EX4 MA4 WB4 time t0 t1 t2 t3 t4 t5 t6 t7 . . . . IF I1 nop I2 nop I3 nop I4 ID I1 nop I2 nop I3 nop I4 EX I1 nop I2 nop I3 nop I4 MA I1 nop I2 nop I3 nop I4 WB I1 nop I2 nop I3 nop I4 Resource Usage nop  pipeline bubble CS252 S05

Speculate next address is PC+4 104 IR PC addr inst Inst Memory 0x4 Add nop E M Jump? PCSrc (pc+4 / jabs / rind/ br) stall I1 096 ADD I2 100 J 304 I3 104 ADD I4 304 ADD A jump instruction kills (not stalls) the following instruction kill How? CS252 S05

Pipelining Jumps PCSrc (pc+4 / jabs / rind/ br) stall To kill a fetched instruction -- Insert a mux before IR Add E M 0x4 nop IR IR Add Jump? I2 I1 304 nop I2 I1 104 Any interaction between stall and jump? nop IRSrcD addr PC inst IR Inst Memory Kill takes precedence over stall. IRSrcD = Case opcodeD J, JAL nop ... IM I1 096 ADD I2 100 J 304 I3 104 ADD I4 304 ADD kill CS252 S05

Jump Pipeline Diagrams time t0 t1 t2 t3 t4 t5 t6 t7 . . . . (I1) 096: ADD IF1 ID1 EX1 MA1 WB1 (I2) 100: J 304 IF2 ID2 EX2 MA2 WB2 (I3) 104: ADD IF3 nop nop nop nop (I4) 304: ADD IF4 ID4 EX4 MA4 WB4 time t0 t1 t2 t3 t4 t5 t6 t7 . . . . IF I1 I2 I3 I4 I5 ID I1 I2 nop I4 I5 EX I1 I2 nop I4 I5 MA I1 I2 nop I4 I5 WB I1 I2 nop I4 I5 Resource Usage nop  pipeline bubble CS252 S05

Branch Control Hazard

Pipelining Conditional Branches 104 stall IR PC addr inst Inst Memory 0x4 Add nop E M PCSrc (pc+4 / jabs / rind / br) IRSrcD BEQZ? A Y ALU zero? Branch condition is not known until the execute stage what action should be taken in the decode stage ? I1 096 ADD I2 100 BEQZ r1 +200 I3 104 ADD 108 … I4 304 ADD CS252 S05

Pipelining Conditional Branches PCSrc (pc+4 / jabs / rind / br) stall BEQZ? ? Add E M 0x4 nop A Y ALU zero? IR IR Add I2 I1 108 I3 nop IRSrcD addr PC inst IR Inst Memory If the branch is taken - kill the two following instructions - the instruction at the decode stage is not valid  stall signal is not valid I1 096 ADD I2 100 BEQZ r1 +200 I3 104 ADD 108 … I4 304 ADD CS252 S05

Pipelining Conditional Branches PCSrc (pc+4/jabs/rind/br) stall Add PC BEQZ? E M IRSrcE 0x4 nop A Y ALU zero? IR IR Add Jump? I2 I1 108 I3 IRSrcD addr PC nop inst IR Inst Memory If the branch is taken - kill the two following instructions - the instruction at the decode stage is not valid  stall signal is not valid I1 096 ADD I2 100 BEQZ r1 +200 I3 104 ADD 108 … I4 304 ADD CS252 S05

New Stall Signal Don’t stall if the branch is taken. Why? stall = ( ((rsD =wsE).weE + (rsD =wsM).weM + (rsD =wsW).weW).re1D + ((rtD =wsE).weE + (rtD =wsM).weM + (rtD =wsW).weW).re2D ) . !((opcodeE=BEQZ).z + (opcodeE=BNEZ).!z) Don’t stall if the branch is taken. Why? Instruction at the decode stage is invalid CS252 S05

Control Equations for PC and IR Muxes PCSrc = Case opcodeE BEQZ.z, BNEZ.!z br ...  Case opcodeD J, JAL  jabs JR, JALR  rind ...  pc+4 Give priority to the older instruction, i.e., execute-stage instruction over decode-stage instruction IRSrcD = Case opcodeE BEQZ.z, BNEZ.!z nop ...  Case opcodeD J, JAL, JR, JALR nop ... IM IRSrcE = Case opcodeE BEQZ.z, BNEZ.!z nop ... stall.nop + !stall.IRD CS252 S05

Branch Pipeline Diagrams (resolved in execute stage) time t0 t1 t2 t3 t4 t5 t6 t7 . . . . (I1) 096: ADD IF1 ID1 EX1 MA1 WB1 (I2) 100: BEQZ +200 IF2 ID2 EX2 MA2 WB2 (I3) 104: ADD IF3 ID3 nop nop nop (I4) 108: xxx IF4 nop nop nop nop (I5) 304: ADD IF5 ID5 EX5 MA5 WB5 time t0 t1 t2 t3 t4 t5 t6 t7 . . . . IF I1 I2 I3 I4 I5 ID I1 I2 I3 nop I5 EX I1 I2 nop nop I5 MA I1 I2 nop nop I5 WB I1 I2 nop nop I5 Resource Usage nop  pipeline bubble CS252 S05

Reducing Branch Penalty损失 (resolve in decode stage) One pipeline bubble can be removed if an extra comparator is used in the Decode stage But might elongate cycle time PCSrc (pc+4 / jabs / rind/ br) Add IR nop E 0x4 Add Zero detect on register file output rd1 GPRs rs1 rs2 ws wd rd2 we nop addr PC inst IR Inst Memory D Pipeline diagram now same as for jumps CS252 S05

Branch Delay Slots (expose control hazard to software) Change the ISA semantics so that the instruction that follows a jump or branch is always executed gives compiler the flexibility to put in a useful instruction where normally a pipeline bubble would have resulted. I1 096 ADD I2 100 BEQZ r1 +200 I3 104 ADD I4 304 ADD Delay slot instruction executed regardless of branch outcome Other techniques include more advanced branch prediction, which can dramatically reduce the branch penalty... to come later CS252 S05

Branch Pipeline Diagrams (branch delay slot) time t0 t1 t2 t3 t4 t5 t6 t7 . . . . (I1) 096: ADD IF1 ID1 EX1 MA1 WB1 (I2) 100: BEQZ +200 IF2 ID2 EX2 MA2 WB2 (I3) 104: ADD IF3 ID3 EX3 MA3 WB3 (I4) 304: ADD IF4 ID4 EX4 MA4 WB4 time t0 t1 t2 t3 t4 t5 t6 t7 . . . . IF I1 I2 I3 I4 ID I1 I2 I3 I4 EX I1 I2 I3 I4 MA I1 I2 I3 I4 WB I1 I2 I3 I4 Resource Usage CS252 S05

Control Hazard Solutions

Optimizing: What if we Execute Branch Earlier?

Reducing Branch Delay

Example: Branch Taken

Example: Branch Taken

Why an Instruction may not be dispatched发送 every cycle (CPI>1) Full bypassing may be too expensive to implement typically all frequently used paths are provided some infrequently used bypass paths may increase cycle time and counteract the benefit of reducing CPI Loads have two-cycle latency Instruction after load cannot use load result MIPS-I ISA defined load delay slots, a software-visible pipeline hazard (compiler schedules independent instruction or inserts NOP to avoid hazard). MIPS:“Microprocessor without Interlocked Pipeline Stages” Removed in MIPS-II (pipeline interlocks added in hardware) Conditional branches may cause bubbles kill following instruction(s) if no delay slots CS252 S05

Performance Impact影响of Branch Stalls The new effective CPI is 1+1*0.15=1.15 The old effective CPI was 1+3*0.15=1.45

Data Hazards for Branch

Data Hazards for Branch

Data Hazards for Branch

Delayed Branches

Iron Law with Software-Visible NOPs Time = Instructions Cycles Time Program Program * Instruction * Cycle If software has to insert NOP instructions for hazard avoidance, instructions/program increases average cycles/instruction decreases - doing nothing fast is easy! But performance (time/program) worse or same as if hardware instead uses interlocks to avoid hazard Hardware-generated interlocks (bubbles) don’t change instructions/program, but only add to cycles/instruction Hardware interlocks don’t take space in instruction cache

Exceptions异常and Interrupters

Exceptions: altering the normal flow of control Ii-1 HI1 exception handler program Ii HI2 Ii+1 HIn An exception transfers control to special handler code run in privileged特权mode. Exceptions are usually unexpected or rare from program’s point of view. CS252 S05

Exception: an event that requests the attention of the processor Causes of Exceptions Exception: an event that requests the attention of the processor Asynchronous: an external interrupt input/output device service request timer expiration power disruptions, hardware failure Synchronous: an internal exception (a.k.a. traps) undefined opcode, privileged instruction arithmetic overflow, FPU exception misaligned memory access virtual memory exceptions: page faults, TLB misses(translation lookaside buffer ), protection violations software exceptions: system calls, e.g., jumps into kernel CS252 S05

Handling Exceptions

History of Exception Handling异常处理 First system with exceptions was Univac-I, 1951 Arithmetic overflow would either 1. trigger the execution a two-instruction fix-up routine at address 0, or 2. at the programmer's option, cause the computer to stop Later Univac 1103, 1955, modified to add external interrupts Used to gather real-time wind tunnel data First system with I/O interrupts was DYSEAC, 1954 Had two program counters, and I/O signal caused switch between two PCs Also, first system with DMA (direct memory access by I/O device) [Courtesy Mark Smotherman] CS252 S05

DYSEAC, first mobile computer! Carried in two tractor trailers, 12 tons + 8 tons Built for US Army Signal Corps [Courtesy Mark Smotherman] CS252 S05

Alternate Machinism

Asynchronous Interrupts: invoking the interrupt handler An I/O device requests attention by asserting one of the prioritized优先化interrupt request lines When the processor decides to process the interrupt It stops the current program at instruction Ii, completing all the instructions up to Ii-1 (a precise interrupt) It saves the PC of instruction Ii in a special register (EPC) It disables interrupts and transfers control to a designated interrupt handler running in the kernel mode CS252 S05

MIPS Interrupt Handler Code Saves EPC before re-enabling interrupts to allow nested interrupts  need an instruction to move EPC into GPRs need a way to mask further interrupts at least until EPC can be saved Needs to read a status register that indicates the cause of the interrupt Uses a special indirect jump instruction RFE (return-from-exception) to resume user code, this: enables interrupts restores the processor to the user mode restores hardware status and control state CS252 S05

Handler Actions

Synchronous Exception A synchronous exception is caused by a particular instruction In general, the instruction cannot be completed and needs to be restarted after the exception has been handled requires undoing the effect of one or more partially executed instructions In the case of a system call trap, the instruction is considered to have been completed syscall is a special jump instruction involving a change to privileged kernel mode特权内核模式 Handler resumes at instruction after system call CS252 S05

Precise Exceptions

Exceptions in a Pipeline

Exception Handling 5-Stage Pipeline PC Inst. Mem D Decode E M Data Mem W + Illegal Opcode Overflow Data address Exceptions PC address Exception Asynchronous Interrupts How to handle multiple simultaneous exceptions in different pipeline stages? How and where to handle external asynchronous interrupts? CS252 S05

Exception Handling 5-Stage Pipeline Commit Point 提交点 PC D E M W Inst. Mem Decode Data Mem + Kill D Stage Kill F Stage Kill E Stage Select Handler PC Kill Writeback Illegal Opcode Overflow Data address Exceptions PC address Exception Exc D Exc E Exc M Cause PC D PC E PC M EPC Asynchronous Interrupts CS252 S05

Exception Handling 5-Stage Pipeline Hold exception flags in pipeline until commit point (M stage) Exceptions in earlier pipe stages override later exceptions for a given instruction Inject external interrupts at commit point (override others) If exception at commit: update Cause and EPC registers, kill all stages, inject handler PC into fetch stage CS252 S05

Nullifying使无效Instructions

5-Stage Pipeline with Exceptions

Exception Example

Exception Example

Exception Example

Dealing Multiple Exceptions

Inprecise Exceptions

Speculating投机on Exceptions Prediction mechanism Exceptions are rare, so simply predicting no exceptions is very accurate! Check prediction mechanism Exceptions detected at end of instruction execution pipeline, special hardware for various exception types Recovery mechanism Only write architectural state at commit point, so can throw away partially executed instructions after exception Launch exception handler after flushing pipeline Bypassing allows use of uncommitted instruction results by following instructions CS252 S05

Exception Pipeline Diagram time t0 t1 t2 t3 t4 t5 t6 t7 . . . . (I1) 096: ADD IF1 ID1 EX1 MA1 nop overflow! (I2) 100: XOR IF2 ID2 EX2 nop nop (I3) 104: SUB IF3 ID3 nop nop nop (I4) 108: ADD IF4 nop nop nop nop (I5) Exc. Handler code IF5 ID5 EX5 MA5 WB5 time t0 t1 t2 t3 t4 t5 t6 t7 . . . . IF I1 I2 I3 I4 I5 ID I1 I2 I3 nop I5 EX I1 I2 nop nop I5 MA I1 nop nop nop I5 WB nop nop nop nop I5 Resource Usage CS252 S05

Beyond 5-stage Pipeline

What to Know More? Advanced Processor Architecture Topics Out-of-order and wide-issue processors Multi-threaded and multi-core processors Data-parallel processors (GPUs) Other advanced topics

Are We Done with Hardware?

HomeWork Readings: / exercises / Read Chapter 4.8-4.16; familiar with SPIM / exercises / http://mypage.zju.edu.cn/wdwd do exercises, 4.8, 4.13, 4.14, 4.17. Computer Organization and Design (COD) (Fifth Edition)

Acknowledgements These slides contain material from courses: UCB CS152 Stanford EE108B Also MIT course 6.823