Tutorial: Lab 4 Again. Nirav Dave, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology.

Presentation transcript:


June 3, What you’ve been given
[Diagram: pc and epoch registers; PCGen, Exec and WB stages; ICache, DCache and RFile]
Only one instruction at a time: PCGen, Exec, WB, then back to PCGen

First Task: Pipelined CPU
[Diagram: pc, epoch and pred; PCGen, Exec and WB stages; ICache, DCache and RFile]

Isolate Tasks to Stages
Initially, writeback and execute both write to the register file.
This causes a conflict between the two stages.
Need to delay all register file writes to the writeback stage.

Add Interstage Buffering
Problem: multiple stages read/write the same state.
Only one stage should read each data element.
Only one stage should write each data element.

Stage state usage
PCGen:
  PC/epoch: read
  IMem: request
  pc2execQ: enq
Execute:
  PC/epoch: write
  pc2execQ: first/deq
  IMem: response
  DMem: request
  exec2wbQ: enq
  rf: read
Writeback:
  exec2wbQ: first/deq
  DMem: response
  rf: write
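The "one reader, one writer per element" rule can be checked mechanically against the stage usage listed above. The following is a minimal Python sketch rather than Bluespec; the `USAGE` table is transcribed from the slide, while the function name `conflicts` and the table layout are assumptions made for illustration.

```python
from collections import defaultdict

# Stage -> {state element: access kind}, transcribed from the slide.
USAGE = {
    "PCGen": {"pc/epoch": "read", "IMem": "request", "pc2execQ": "enq"},
    "Execute": {
        "pc/epoch": "write", "pc2execQ": "first/deq", "IMem": "response",
        "DMem": "request", "exec2wbQ": "enq", "rf": "read",
    },
    "Writeback": {"exec2wbQ": "first/deq", "DMem": "response", "rf": "write"},
}


def conflicts(usage):
    """Return {(element, access): [stages]} for accesses shared by several stages."""
    users = defaultdict(list)
    for stage, accesses in usage.items():
        for element, access in accesses.items():
            users[(element, access)].append(stage)
    return {key: stages for key, stages in users.items() if len(stages) > 1}


# An empty result means every access kind on every element has a single owner.
print(conflicts(USAGE))
```

Adding a second writer for `rf` (for example, letting Execute write it as in the initial design) makes that entry show up in the result.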

Add Stalling Logic
We can’t read a register while an instruction in the WB stage is still waiting to write it.
Data in exec2wbQ (an SFIFO):
  WWb {.dst, .val}
  WLd {.dst}
  WSt void
  WNop void
Stall execute if we need to read one of those registers.
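The stall condition amounts to a search over the entries still waiting in exec2wbQ. Below is a small Python sketch of that check, not the lab's Bluespec code; the `WBEntry` class and the `stall` signature (a queue plus a set of source registers) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WBEntry:
    kind: str                  # "WWb", "WLd", "WSt" or "WNop"
    dst: Optional[int] = None  # destination register (WWb and WLd only)
    val: Optional[int] = None  # value to write (WWb only)


def stall(exec2wbq, sources):
    """True if any queued entry will write a register listed in `sources`."""
    return any(
        entry.kind in ("WWb", "WLd") and entry.dst in sources
        for entry in exec2wbq
    )


# A pending load into r3 blocks an instruction that reads r3 and r4.
print(stall([WBEntry("WLd", dst=3)], {3, 4}))
```

Stores and no-ops carry no destination, so they never cause a stall in this model.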

Now we can allow Exec & WB to happen concurrently
After Execute, no longer go to the Writeback stage.
Go straight to the PCGen stage.

Speculation of PC
We want PCGen and Exec to happen concurrently, but:
  PCGen reads PC/epoch
  Exec writes PC/epoch
Have PCGen guess the next PC.
Pass the guess to Exec to verify.

Original Rules
rule pcgen(stage == PCGen);
  imem.req(pc);
  pc2execQ.enq(tuple2(pc, epoch));
  stage <= Execute;
endrule

rule exec(!stall(exec2wbQ) && stage == Execute);
  // epc/eepoch: the PC and epoch that travelled with this instruction
  match {.epc, .eepoch} = pc2execQ.first();
  pc2execQ.deq();
  let inst <- imem.response();
  Addr nextPC = epc + 4;
  case (inst) matches …
  pc <= nextPC;
endrule

New Rules
rule pcgen(True);
  imem.req(pc);
  let predPC = pc + 4;
  pc <= predPC;
  pc2execQ.enq(tuple3(pc, predPC, epoch)); // three fields, so tuple3
endrule

match {.epc, .predPC, .eepoch} = pc2execQ.first();

rule exec(!stall(exec2wbQ) && eepoch == execEpoch);
  pc2execQ.deq();
  let inst <- imem.response();
  Addr nextPC = epc + 4;
  case (inst) matches …
  if (nextPC != predPC) begin // misprediction: redirect and start a new epoch
    pc <= nextPC;
    epoch <= epoch + 1;
    execEpoch <= execEpoch + 1;
  end
endrule

rule discard(eepoch != execEpoch); // wrong-path instruction: drop it
  pc2execQ.deq();
  let inst <- imem.response();
endrule
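To see how the epoch tag squashes wrong-path instructions, here is a cycle-level Python sketch of the pcgen, exec and discard rules. It is a behavioural model under stated assumptions, not the lab code: both rules read start-of-cycle state, an Exec redirect wins over PCGen's write to pc, epochs are unbounded integers, and `next_pc_of` is a hypothetical stand-in for decoding and executing an instruction.

```python
from collections import deque


def run(next_pc_of, start_pc, cycles):
    """Simulate PCGen/Exec with pc+4 prediction and epoch-based squashing.

    next_pc_of maps a PC to its true successor; other PCs fall through to pc+4.
    Returns (executed PCs, discarded PCs).
    """
    pc, epoch, exec_epoch = start_pc, 0, 0
    pc2exec = deque()
    executed, discarded = [], []
    for _ in range(cycles):
        # Exec/discard sees the queue as it was at the start of the cycle.
        head = pc2exec.popleft() if pc2exec else None

        # PCGen: fetch at pc, predict pc+4, tag the entry with the current epoch.
        pred_pc = pc + 4
        pc2exec.append((pc, pred_pc, epoch))
        new_pc = pred_pc

        if head is not None:
            epc, head_pred, eepoch = head
            if eepoch != exec_epoch:
                discarded.append(epc)          # rule discard
            else:
                executed.append(epc)           # rule exec
                next_pc = next_pc_of.get(epc, epc + 4)
                if next_pc != head_pred:       # misprediction: redirect
                    new_pc = next_pc
                    epoch += 1
                    exec_epoch += 1
        pc = new_pc
    return executed, discarded


# The instruction at PC 4 jumps to 16; the fetch at PC 8 is on the wrong path.
print(run({4: 16}, start_pc=0, cycles=6))
```

In this run the instruction fetched at PC 8 carries the old epoch, so it is dropped when it reaches Exec, and execution resumes at 16.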

Getting Performance
Expected order per cycle: WB < Exec < PCGen
Requires:
  rf: write < read
  exec2wbQ: first, deq < enq, find
  pc2execQ: first, deq < enq
  epoch: write < read
  pc: write < read < write (write from Exec, then read & write from PCGen)
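The per-element requirements follow directly from the rule order: for each shared element, list the methods each rule calls, in rule order. The Python sketch below illustrates that derivation; the `METHODS` table is an assumption reconstructed from the earlier slides (which rule touches which element), and `required_orders` is a hypothetical helper name.

```python
RULE_ORDER = ["WB", "Exec", "PCGen"]

# Methods each rule calls on each shared element (reads listed before writes).
METHODS = {
    "WB": {"rf": ["write"], "exec2wbQ": ["first", "deq"]},
    "Exec": {
        "rf": ["read"], "exec2wbQ": ["enq", "find"],
        "pc2execQ": ["first", "deq"], "epoch": ["write"], "pc": ["write"],
    },
    "PCGen": {"pc2execQ": ["enq"], "epoch": ["read"], "pc": ["read", "write"]},
}


def required_orders(rule_order, methods):
    """For each element, return the method groups in the order the rules fire."""
    orders = {}
    for rule in rule_order:
        for element, calls in methods[rule].items():
            orders.setdefault(element, []).append(calls)
    return orders


for element, groups in required_orders(RULE_ORDER, METHODS).items():
    print(element + ": " + " < ".join(", ".join(group) for group in groups))
```

Each group comes from a single rule, so the final `read, write` group for `pc` corresponds to PCGen reading and then writing the register after Exec's write.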

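The new PC module
// provisos added so that mkReg(0) type-checks; asz is just a size variable
module mkPCReg(ExtendedReg#(a)) provisos (Bits#(a, asz), Literal#(a));
  Reg#(a) r <- mkReg(0);
  RWire#(a) rw1 <- mkRWire();
  RWire#(a) rw2 <- mkRWire();

  // Commit at the end of the cycle; write2 takes priority over write1
  rule doWrite(isJust(rw1.wget) || isJust(rw2.wget));
    r <= fromMaybe(fromMaybe(?, rw1.wget), rw2.wget);
  endrule

  method Action write1(a x);
    rw1.wset(x);
  endmethod

  // read sees a same-cycle write1, otherwise the stored value
  method a read();
    return fromMaybe(r, rw1.wget);
  endmethod

  method Action write2(a x);
    rw2.wset(x);
  endmethod
endmodule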
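The intended behaviour of this register (write1 < read < write2 within one cycle) can be sketched in Python as follows. This is a behavioural model for illustration only; the class name `PCReg` and the explicit `end_cycle` call, which stands in for the clock edge, are assumptions.

```python
class PCReg:
    """Register with two write ports ordered as write1 < read < write2."""

    def __init__(self, init=0):
        self._r = init
        self._w1 = None  # models rw1: value set by write1 this cycle, if any
        self._w2 = None  # models rw2: value set by write2 this cycle, if any

    def write1(self, x):
        self._w1 = x

    def read(self):
        # A same-cycle write1 is bypassed to the reader; write2 is not.
        return self._w1 if self._w1 is not None else self._r

    def write2(self, x):
        self._w2 = x

    def end_cycle(self):
        # write2 has priority over write1 when both happened this cycle.
        if self._w2 is not None:
            self._r = self._w2
        elif self._w1 is not None:
            self._r = self._w1
        self._w1 = self._w2 = None


pc = PCReg(0x100)
pc.write1(0x200)         # Exec redirects after a misprediction
pc.write2(pc.read() + 4) # PCGen fetches from the redirected PC and predicts +4
pc.end_cycle()
print(hex(pc.read()))
```

In the usage above, PCGen fetches from the redirected address in the same cycle as the redirect, which is the point of ordering Exec before PCGen.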


First Task: Pipelined CPU
[Diagram: pc and epoch registers; PCGen, Exec and WB stages; ICache, DCache and RFile]
A new instruction can start before the first finishes.
Need more buffering.

Branch Prediction
Pipeline has simple speculation:
rule pcGen (True);
  pc <= pc + 4;
  otherActions;
endrule
Simplest prediction: always not-taken

Branch Prediction
interface BranchPredictor;
  method Addr getNextPC(Addr pc);
  method Action update(Addr pc, Addr correct_next_pc);
endinterface

rule pcGen (True);
  pc <= pred.getNextPC(pc); // make prediction
  otherActions;
endrule

rule execute …
  if (nextPC != correctPC) pred.update(curPc, nextPC); // update predictions
  case (instr) matches …
    BzTaken: if (mispredicted) …
endrule
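The slide only fixes the predictor's interface, not its implementation. As one possible illustration, the Python sketch below models a table-based predictor with the same two operations; the class `TablePredictor` and its policy (remember the last corrected target per PC, fall back to pc + 4) are assumptions and not the predictor used in the lab.

```python
class TablePredictor:
    """Hypothetical predictor: remembers the last corrected next PC per PC."""

    def __init__(self):
        self._targets = {}

    def get_next_pc(self, pc):
        # Predict the remembered target, or fall through (not-taken) by default.
        return self._targets.get(pc, pc + 4)

    def update(self, pc, correct_next_pc):
        # Called by the execute stage after a misprediction.
        if correct_next_pc == pc + 4:
            self._targets.pop(pc, None)   # the default already covers this case
        else:
            self._targets[pc] = correct_next_pc


pred = TablePredictor()
print(pred.get_next_pc(8))   # no history yet: predicts pc + 4
pred.update(8, 32)           # execute found that the branch at 8 goes to 32
print(pred.get_next_pc(8))   # now predicts the learned target
```

With an empty table this behaves exactly like the always not-taken predictor from the previous slide, so it can be dropped in behind the same interface.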