Tutorial 7: SMIPS Labs and Epochs Constructive Computer Architecture

Slides:

Advertisements

Similar presentations

Constructive Computer Architecture: Data Hazards in Pipelined Processors Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute.

Advertisements

Computer Structure 2014 – Out-Of-Order Execution 1 Computer Structure Out-Of-Order Execution Lihu Rappoport and Adi Yoaz.

Pipeline Hazards Pipeline hazards These are situations that inhibit that the next instruction can be processed in the next stage of the pipeline. This.

Constructive Computer Architecture: Branch Prediction: Direction Predictors Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute.

EECS 470 Pipeline Control Hazards Lecture 5 Coverage: Chapter 3 & Appendix A.

1 Lecture 18: Pipelining Today’s topics:  Hazards and instruction scheduling  Branch prediction  Out-of-order execution Reminder:  Assignment 7 will.

Superscalar SMIPS Processor Andy Wright Leslie Maldonado.

Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.

Constructive Computer Architecture: Branch Prediction: Direction Predictors Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute.

Computer Architecture: A Constructive Approach Branch Direction Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab.

1 Dynamic Branch Prediction. 2 Why do we want to predict branches? MIPS based pipeline – 1 instruction issued per cycle, branch hazard of 1 cycle. –Delayed.

Pipeline Hazards. CS5513 Fall Pipeline Hazards Situations that prevent the next instructions in the instruction stream from executing during its.

Electrical and Computer Engineering University of Cyprus LAB 2: MIPS.

Computer Architecture: A Constructive Approach Branch Prediction - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of.

Computer Architecture: A Constructive Approach Next Address Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts.

11 Pipelining Kosarev Nikolay MIPT Oct, Pipelining Implementation technique whereby multiple instructions are overlapped in execution Each pipeline.

Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration Joel Emer Computer Science & Artificial Intelligence.

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 13: Branch prediction (Chapter 4/6)

COMPSYS 304 Computer Architecture Speculation & Branching Morning visitors - Paradise Bay, Bay of Islands.

Constructive Computer Architecture: Control Hazards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology October.

1 Tutorial: Lab 4 Again Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.

Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,

Constructive Computer Architecture Tutorial 6: Five Details of SMIPS Implementations Andy Wright 6.S195 TA October 7, 2013http://csg.csail.mit.edu/6.s195T05-1.

6.375 Tutorial 4 RISC-V and Final Projects Ming Liu March 4, 2016http://csg.csail.mit.edu/6.375T04-1.

Computer Architecture: A Constructive Approach Data Hazards and Multistage Pipelines Teacher: Yoav Etsion Taken (with permission) from Arvind et al.*,

October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.

Electrical and Computer Engineering University of Cyprus

Computer Architecture Chapter (14): Processor Structure and Function

CS203 – Advanced Computer Architecture

6.175: Constructive Computer Architecture Tutorial 5 Epochs, Debugging, and Caches Quan Nguyen (Troubled by the two biggest problems in computer science…

Control Hazards Constructive Computer Architecture: Arvind

Tutorial 7: SMIPS Epochs Constructive Computer Architecture

PowerPC 604 Superscalar Microprocessor

CS 704 Advanced Computer Architecture

Constructive Computer Architecture Tutorial 6: Discussion for lab6

Branch Prediction Constructive Computer Architecture: Arvind

Multistage Pipelined Processors and modular refinement

in Pipelined Processors

Morgan Kaufmann Publishers The Processor

Constructive Computer Architecture Tutorial 7 Final Project Overview

Modular Refinement Arvind

Modular Refinement Arvind

Constructive Computer Architecture Tutorial 5 Epoch & Branch Predictor

Lab 4 Overview: 6-stage SMIPS Pipeline

in Pipelined Processors

Pipelining Lessons 6 PM T a s k O r d e B C D A 30

Control Hazards Constructive Computer Architecture: Arvind

Branch Prediction Constructive Computer Architecture: Arvind

6.375 Tutorial 5 RISC-V 6-Stage with Caches Ming Liu

Bypassing Computer Architecture: A Constructive Approach Joel Emer

Multistage Pipelined Processors and modular refinement

Modular Refinement Arvind

Realistic Memories and Caches

Branch Prediction: Direction Predictors

Branch Prediction: Direction Predictors

Control unit extension for data hazards

Pipelined Processors Arvind

Control Hazards Constructive Computer Architecture: Arvind

Pipelined Processors Constructive Computer Architecture: Arvind

Branch Prediction: Direction Predictors

Recovery: Redirect fetch unit to T path if actually T.

Tutorial 4: RISCV modules Constructive Computer Architecture

Modeling Processors Arvind

Control unit extension for data hazards

Modular Refinement Arvind

Control Hazards Constructive Computer Architecture: Arvind

Modular Refinement Arvind

Control unit extension for data hazards

Pipelined Processors: Control Hazards

Presentation transcript:

Tutorial 7: SMIPS Labs and Epochs Constructive Computer Architecture Andy Wright 6.S195 TA November 1, 2013 http://csg.csail.mit.edu/6.s195

Introduction Lab 6 Lab 7 6 Stage SMIPS Processor Due today Complex Branch Predictors Posted online Due next Friday November 1, 2013 http://csg.csail.mit.edu/6.s195

Lab 6 Does low IPC => low grade? What is a ‘mistake’? Only if low IPC is from a ‘mistake’ in your processor. What is a ‘mistake’? Not updating your BTB with redirect information Using too small of a scoreboard Having schedule conflicts between pipeline stages November 1, 2013 http://csg.csail.mit.edu/6.s195

Lab 6 What questions do you have? November 1, 2013 http://csg.csail.mit.edu/6.s195

Lab 7 Adding history bits to BTB Implement BHT Synthesize for FPGA Combines target and direction prediction Implement BHT Separates direction prediction from target prediction Synthesize for FPGA Used to calculate IPS November 1, 2013 http://csg.csail.mit.edu/6.s195

Fixed slides from L16 Epoch Management November 1, 2013 http://csg.csail.mit.edu/6.s195

Multiple predictors in a pipeline At each stage we need to take two decisions: Whether the current instruction is a wrong path instruction. Requires looking at epochs Whether the prediction (ppc) following the current instruction is good or not. Requires consulting the prediction data structure (BTB, BHT, …) Fetch stage must correct the pc unless the redirection comes from a known wrong path instruction Redirections from Execute stage are always correct, i.e., cannot come from wrong path instructions October 28, 2013 http://csg.csail.mit.edu/6.S195

Dropping or poisoning an instruction Once an instruction is determined to be on the wrong path, the instruction is either dropped or poisoned Drop: If the wrong path instruction has not modified any book keeping structures (e.g., Scoreboard) then it is simply removed Poison: If the wrong path instruction has modified book keeping structures then it is poisoned and passed down for book keeping reasons (say, to remove it from the scoreboard) Subsequent stages know not to update any architectural state for a poisoned instruction October 28, 2013 http://csg.csail.mit.edu/6.S195

N-Stage pipeline – BTB only fEpoch eEpoch recirect {pc, newPc, taken mispredict, ...} attached to every fetched instruction BTB miss pred? {pc, ppc, epoch} Fetch PC Decode f2d d2e Execute ... At Execute: (pc) if (epoch!=eEpoch) then mark instruction as poisoned (ppc) if (no poisoning) & mispred then change eEpoch; send <pc, newPc, ...> to Fetch At Fetch: msg from execute: train BTB with <pc, newPc, taken, mispredict> if msg from execute indicates misprediction then set pc, change fEpoch In a similar way to FXep and Xep, we distribute Dep into Dep and FDep. October 28, 2013 http://csg.csail.mit.edu/6.S195 9

N-Stage pipeline: Two predictors eEpoch feEpoch eRecirect redirect PC fdEpoch dEpoch dRecirect deEpoch redirect PC miss pred? miss pred? Execute d2e Fetch PC Decode f2d ... Suppose both Decode and Execute can redirect the PC; Execute redirect should have priority, i.e., Execute redirect should never be overruled We will use separate epochs for each redirecting stage feEpoch and deEpoch are estimates of eEpoch at Fetch and Decode, respectively fdEpoch is Fetch’s estimates of dEpoch Initially set all epochs to 0 October 28, 2013 http://csg.csail.mit.edu/6.S195

N-Stage pipeline: Two predictors Redirection logic eEpoch feEpoch eRecirect {pc, newPc, taken mispredict, ...} fdEpoch dEpoch dRecirect deEpoch {pc, newPc, idEp, ideEp...} miss pred? miss pred? {pc, ppc, ieEp, idEp} Execute d2e {..., ieEp} Fetch PC Decode f2d ... At execute: (pc) if (ieEp!=eEp) then poison the instruction (ppc) if (no poisoning) & mispred then change eEp; (ppc) for every control instruction send <pc, target pc, taken, mispred…> to fetch At fetch: msg from execute: if (mispred) set pc, change feEp, msg from decode: If (no redirect message from Execute) if (ideEp=feEp) then set pc, change fdEp to idEp At decode: … make sure that the msg from Decode is not from a wrong path instruction October 28, 2013 http://csg.csail.mit.edu/6.S195

Decode stage Redirection logic eEpoch feEpoch eRecirect {pc, newPc, taken mispredict, ...} fdEpoch dEpoch dRecirect deEpoch {pc, newPc, idEp, ideEp...} miss pred? miss pred? {pc, ppc, ieEp, idEp} Execute d2e {..., ieEp} Fetch PC Decode f2d ... Is ieEp = deEp ? yes no Is idEp = dEp ? Current instruction is OK but Execute has redirected the pc; Set <deEp, dEp> to <ieEp, idEp> check the ppc prediction via BHT, Switch dEp if misprediction yes no Current instruction is OK; check the ppc prediction via BHT, Switch dEp if misprediction Wrong path instruction; drop it October 28, 2013 http://csg.csail.mit.edu/6.S195

Another way to manage epochs Write the rules as simple as possible (guarded atomic actions), then add EHRs if necessary October 28, 2013 http://csg.csail.mit.edu/6.S195

Fetch Rule fInst.pc = pc; fInst.ppc = prediction( pc ); fInst.eEpoch = eEpoch; fInst.dEpoch = dEpoch; … pc <= fInst.ppc; f2dFifo.enq( fInst ); October 28, 2013 http://csg.csail.mit.edu/6.S195

Decode Rule if( dInst.eEpoch != eEpoch ) kill fInst else if( dInst.dEpoch != dEpoch ) else begin let newpc = prediction( dInst ); if( newpc != dInst.ppc ) begin pc <= newpc dEpoch <= !dEpoch; end … October 28, 2013 http://csg.csail.mit.edu/6.S195

Execute Rule if( eInst.eEpoch != eEpoch ) poison eInst else begin if( mispredict ) begin pc <= newpc; eEpoch <= !eEpoch; train branch predictors end … October 28, 2013 http://csg.csail.mit.edu/6.S195

Conflicts PC read < PC write dEpoch read < dEpoch write fetch < {decode, execute} dEpoch read < dEpoch write fetch < decode eEpoch read < eEpoch write {fetch, decode} < execute PC write C PC write fetch C decode C execute C fetch None of these stages can execute in the same clock cycle! October 28, 2013 http://csg.csail.mit.edu/6.S195

Now add EHRs Choose an ordering between the rules and assign the corresponding EHR ports (fetch, decode, execute) Change conflicting registers into EHRs (pc) Ehr#(3, Addr) pc -> mkEhr(?); October 28, 2013 http://csg.csail.mit.edu/6.S195

Fetch Rule – port 0 fInst.pc = pc[0]; fInst.ppc = prediction( pc[0] ); fInst.eEpoch = eEpoch; fInst.dEpoch = dEpoch; … pc[0] <= fInst.ppc; f2dFifo.enq( fInst ); October 28, 2013 http://csg.csail.mit.edu/6.S195

Decode Rule – port 1 if( dInst.eEpoch != eEpoch ) kill fInst else if( dInst.dEpoch != dEpoch ) else begin let newpc = prediction( dInst ); if( newpc != dInst.ppc ) begin pc[1] <= newpc; dEpoch <= !dEpoch; end … October 28, 2013 http://csg.csail.mit.edu/6.S195

Execute Rule – port 2 if( eInst.eEpoch != eEpoch ) poison eInst else begin if( mispredict ) begin pc[2] <= newpc; eEpoch <= !eEpoch; train branch predictors end … October 28, 2013 http://csg.csail.mit.edu/6.S195

Another Ordering Choose an ordering between the rules and assign the corresponding EHR ports (execute, decode, fetch) Change conflicting registers into EHRs (pc, dEpoch, eEpoch) Ehr#(3, Addr) pc -> mkEhr(?); Ehr#(3, Bool) dEpoch -> mkEhr(False); Ehr#(3, Bool) eEpoch -> mkEhr(False); October 28, 2013 http://csg.csail.mit.edu/6.S195

Fetch Rule – port 2 fInst.pc = pc[2]; fInst.ppc = prediction( pc[2] ); fInst.eEpoch = eEpoch[2]; fInst.dEpoch = dEpoch[2]; … pc[2] <= fInst.ppc; f2dFifo.enq( fInst ); October 28, 2013 http://csg.csail.mit.edu/6.S195

Decode Rule – port 1 if( dInst.eEpoch != eEpoch[1] ) kill fInst else if( dInst.dEpoch != dEpoch[1] ) else begin let newpc = prediction( dInst ); if( newpc != dInst.ppc ) begin pc[1] <= newpc; dEpoch[1] <= !dEpoch[1]; end … October 28, 2013 http://csg.csail.mit.edu/6.S195

Execute Rule – port 0 if( eInst.eEpoch != eEpoch[0] ) poison eInst else begin if( mispredict ) begin pc[0] <= newpc; eEpoch[0] <= !eEpoch[0]; train branch predictors end … October 28, 2013 http://csg.csail.mit.edu/6.S195

Different View of EHR This transformation makes more sense when you think of an EHR as sub-cycle register. This is explained more in the paper “The Ephemeral History Register: Flexible Scheduling for Rule-Based Designs” by Daniel L. Rosenband October 28, 2013 http://csg.csail.mit.edu/6.S195

Questions? November 1, 2013 http://csg.csail.mit.edu/6.s195

November 1, 2013 http://csg.csail.mit.edu/6.s195

6 stage SMIPS pipeline Register File Scoreboard DMem IMem eEpoch IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect November 1, 2013 http://csg.csail.mit.edu/6.s195

Poisoning Pipeline Register File Scoreboard DMem IMem eEpoch fEpoch PC IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect Poison Kill November 1, 2013 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode and Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch feEpoch PC Redirect fdEpoch 6 5 4 3 2 1 Decoding Executing Write Back dEpoch feEpoch Fetch has local estimates of eEpoch and dEpoch Decode has a local estimate of eEpoch November 1, 2013 http://csg.csail.mit.edu/6.s195

fEpoch and PC feedback Register File Scoreboard DMem IMem Epoch [0] IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem Epoch [0] Epoch [1] PC [1] PC [0] Make the PC an EHR too! Whenever Execute sees a misprediction, IFetch reads the correct next instruction in the same cycle! November 1, 2013 http://csg.csail.mit.edu/6.s195

RFile and SB feedback Bypass Register File Pipeline Scoreboard DMem IFetch Decode WB RFetch Exec Memory Bypass Register File Pipeline Scoreboard DMem IMem eEpoch fEpoch PC Redirect You can use a scoreboard that removes before searching (called a pipeline scoreboard because it is similar to pipeline fifo’s deq<enq behavior) November 1, 2013 http://csg.csail.mit.edu/6.s195