1 Tutorial: Lab 4 Again Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.

Slides:

Advertisements

Similar presentations

Constructive Computer Architecture: Data Hazards in Pipelined Processors Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute.

Advertisements

Multi-cycle Implementations Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology January 13, 2012L6-1

Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.

Modeling Processors Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2011L07-1

Computer Architecture: A Constructive Approach Branch Direction Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab.

March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.

Realistic Memories and Caches Li-Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology March 21, 2012L13-1

1 Tutorial: Lab 4 Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.

B10001 Pipelining Hazards ENGR xD52 Eric VanWyk Fall 2012.

Constructive Computer Architecture: Branch Prediction Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology October.

Elastic Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 28, 2011L08-1http://csg.csail.mit.edu/6.375.

Computer Architecture: A Constructive Approach Branch Prediction - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of.

Computer Architecture: A Constructive Approach Next Address Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts.

11 Pipelining Kosarev Nikolay MIPT Oct, Pipelining Implementation technique whereby multiple instructions are overlapped in execution Each pipeline.

Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration Joel Emer Computer Science & Artificial Intelligence.

Constructive Computer Architecture: Control Hazards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology October.

February 20, 2009http://csg.csail.mit.edu/6.375L08-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,

October 22, 2009http://csg.csail.mit.edu/korea Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.

Realistic Memories and Caches – Part III Li-Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology April 4, 2012L15-1.

Constructive Computer Architecture Tutorial 6: Five Details of SMIPS Implementations Andy Wright 6.S195 TA October 7, 2013http://csg.csail.mit.edu/6.s195T05-1.

6.375 Tutorial 4 RISC-V and Final Projects Ming Liu March 4, 2016http://csg.csail.mit.edu/6.375T04-1.

Computer Architecture: A Constructive Approach Data Hazards and Multistage Pipelines Teacher: Yoav Etsion Taken (with permission) from Arvind et al.*,

October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.

Modeling Processors Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 1, 2010

Elastic Pipelines: Concurrency Issues

Bluespec-3: A non-pipelined processor Arvind

Control Hazards Constructive Computer Architecture: Arvind

Bluespec-6: Modeling Processors

Bluespec-6: Modules and Interfaces

Tutorial 7: SMIPS Epochs Constructive Computer Architecture

Constructive Computer Architecture Tutorial 6: Discussion for lab6

Branch Prediction Constructive Computer Architecture: Arvind

Caches-2 Constructive Computer Architecture Arvind

Multistage Pipelined Processors and modular refinement

in Pipelined Processors

Performance Specifications

Modular Refinement Arvind

Modular Refinement Arvind

Constructive Computer Architecture Tutorial 5 Epoch & Branch Predictor

Lab 4 Overview: 6-stage SMIPS Pipeline

Non-Pipelined and Pipelined Processors

in Pipelined Processors

Control Hazards Constructive Computer Architecture: Arvind

Modeling Processors: Concurrency Issues

Branch Prediction Constructive Computer Architecture: Arvind

6.375 Tutorial 5 RISC-V 6-Stage with Caches Ming Liu

Bypassing Computer Architecture: A Constructive Approach Joel Emer

Multistage Pipelined Processors and modular refinement

Modular Refinement Arvind

Elastic Pipelines: Concurrency Issues

Bluespec-3: A non-pipelined processor Arvind

Branch Prediction: Direction Predictors

Modular Refinement - 2 Arvind

Elastic Pipelines: Concurrency Issues

Pipelined Processors Arvind

Control Hazards Constructive Computer Architecture: Arvind

Pipelined Processors Constructive Computer Architecture: Arvind

Tutorial 4: RISCV modules Constructive Computer Architecture

Modeling Processors Arvind

Modeling Processors Arvind

Modular Refinement Arvind

Control Hazards Constructive Computer Architecture: Arvind

Implementing for Correct Concurrency

Modular Refinement Arvind

Tutorial 7: SMIPS Labs and Epochs Constructive Computer Architecture

Branch Predictor Interface

Bluespec-8: Modules and Interfaces

Pipelined Processors: Control Hazards

Presentation transcript:

1 Tutorial: Lab 4 Again Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology

June 3, What you’ve been given pc epoch PCGe n Exec WB ICache DCache RFile Only one instruction at a time PCGenExecWBPCGen

June 3, First Task: Pipelined CPU pc epoch PCGe n Exec WB ICache DCache RFile pred

June 3, Isolate Tasks to stages Initially writeback and execute both write to the register file This will cause a conflict between the two stages Need to all delay register file writes to the writeback stage

June 3, Add Interstage Buffering Problem: Multiple stages read/write the same stage Only one stage should read each data element Write each data element

June 3, Stage stage usage PC Gen: PC/epoch: read IMem: request pc2execQ (enq) Execute PC/epoch: write pc2execQ: (first/deq) IMem: response DMem: request exec2wbQ(enq) rf (read) Writeback exec2wbQ: (first/deq) DMem: response rf:write

June 3, Add Stalling Logic: We can’t read if there is data in the WB stage waiting to be executed Data in exec2wbQ SFIFO: WWb{.dst,.val} WLd{.dst} WStvoid WNopvoid Stall execute if we need to read one of those registers

June 3, Now we can allow exec & WB to happen concurrently No longer go to the Writeback stage. Go straight to PCGen stage

June 3, Speculation of PC We want PCGen and Exec to happen concurrently but: pcGen reads PC/epoch Exec writes PC/epoch Have PCGen guess the next PC Pass guess to exec to verify

June 3, Original Rules rule pcgen(stage == PCGen); imem.req(pc); pc2execQ.enq(tuple2(pc, epoch)); stage <= Execute; endrule rule exec(!stall(exec2wbQ) && stage == Execute); match {.pc,.epoch} = pc2execQ.first(); pc2execQ.deq(); let inst <- imem.response(); Addr nextPC = epc + 4; case (inst) matches … pc <= nextPC; endrule

June 3, New Rules rule pcgen(True); imem.req(pc); let predPC = pc + 4; pc <= predPC; pc2execQ.enq(tuple2(pc, predPC, epoch)); endrule match {.epc,.predPC,.eepoch} = pc2execQ.first(); rule exec(!stall(exec2wbQ) && eepoch == execEpoch); pc2execQ.deq(); let inst <- imem.response(); Addr nextPC = epc + 4; case (inst) matches … if (nextPC != predPC) begin pc <= nextPC; epoch <= epoch+1; execEpoch <= execEpoch+1; end endrule rule discard(eepoch != execEpoch); pc2execQ.deq(); let inst <- imem.response(); endrule

June 3, Getting Performance Expected Order per cycle WB < Exec < PCGen Requires: rf: write < read exec2wbQ: first,deq < enq, find pc2execQ: first,deq < enq epoch: write < read pc: write < read < write Write from exec, then read & write from pcGen

June 3, The new PC module module mkPCReg(ExtendedReg#(a)); Reg#(a) r <- mkReg(0); RWire#(a) rw1 <- mkRWire(); RWire#(a) rw2 <- mkRWire(); rule doWrite(isJust(rw1.wget) || isJust(rw2.wget)); r <= fromMaybe(rw2.wget, fromMaybe(rw1.wget, ?)); endrule method Action write1(x); rw.wset(x); endmethod method a read(); return fromMaybe(rw1.wget, r); endmethod method Action write2(x); rw2.wset(x); endmethod endmodule

June 3,

June 3, First Task: Pipelined CPU pc epoch PCGe n Exec WB ICache DCache RFile New Instruction can start before the first finishes Need more buffering

June 3, Branch Prediction Pipeline has simple speculation rule pcGen (True); pc <= pc + 4; otherActions; endrule Simplest prediction: Always not-taken

June 3, Branch Prediction interface BranchPredictor; method Addr getNextPC(Addr pc); method Action update (Addr pc, Addr correct_next_pc); endinterface rule pcGen (True); pc <= pred.getNextPC(pc); otherActions; endrule rule execute … if (nextPC != correctPC) pred.update(curPc, nextPC); case (instr) matches … BzTaken: if (mispredicted) … endrule Update predictions Make prediction