Download presentation
Presentation is loading. Please wait.
Published byMyra Nichols Modified over 8 years ago
1
1 Tutorial: Lab 4 Again Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology
2
June 3, 2008 2 What you’ve been given pc epoch PCGe n Exec WB ICache DCache RFile Only one instruction at a time PCGenExecWBPCGen
3
June 3, 2008 3 First Task: Pipelined CPU pc epoch PCGe n Exec WB ICache DCache RFile pred
4
June 3, 2008 4 Isolate Tasks to stages Initially writeback and execute both write to the register file This will cause a conflict between the two stages Need to all delay register file writes to the writeback stage
5
June 3, 2008 5 Add Interstage Buffering Problem: Multiple stages read/write the same stage Only one stage should read each data element Write each data element
6
June 3, 2008 6 Stage stage usage PC Gen: PC/epoch: read IMem: request pc2execQ (enq) Execute PC/epoch: write pc2execQ: (first/deq) IMem: response DMem: request exec2wbQ(enq) rf (read) Writeback exec2wbQ: (first/deq) DMem: response rf:write
7
June 3, 2008 7 Add Stalling Logic: We can’t read if there is data in the WB stage waiting to be executed Data in exec2wbQ SFIFO: WWb{.dst,.val} WLd{.dst} WStvoid WNopvoid Stall execute if we need to read one of those registers
8
June 3, 2008 8 Now we can allow exec & WB to happen concurrently No longer go to the Writeback stage. Go straight to PCGen stage
9
June 3, 2008 9 Speculation of PC We want PCGen and Exec to happen concurrently but: pcGen reads PC/epoch Exec writes PC/epoch Have PCGen guess the next PC Pass guess to exec to verify
10
June 3, 2008 10 Original Rules rule pcgen(stage == PCGen); imem.req(pc); pc2execQ.enq(tuple2(pc, epoch)); stage <= Execute; endrule rule exec(!stall(exec2wbQ) && stage == Execute); match {.pc,.epoch} = pc2execQ.first(); pc2execQ.deq(); let inst <- imem.response(); Addr nextPC = epc + 4; case (inst) matches … pc <= nextPC; endrule
11
June 3, 2008 11 New Rules rule pcgen(True); imem.req(pc); let predPC = pc + 4; pc <= predPC; pc2execQ.enq(tuple2(pc, predPC, epoch)); endrule match {.epc,.predPC,.eepoch} = pc2execQ.first(); rule exec(!stall(exec2wbQ) && eepoch == execEpoch); pc2execQ.deq(); let inst <- imem.response(); Addr nextPC = epc + 4; case (inst) matches … if (nextPC != predPC) begin pc <= nextPC; epoch <= epoch+1; execEpoch <= execEpoch+1; end endrule rule discard(eepoch != execEpoch); pc2execQ.deq(); let inst <- imem.response(); endrule
12
June 3, 2008 12 Getting Performance Expected Order per cycle WB < Exec < PCGen Requires: rf: write < read exec2wbQ: first,deq < enq, find pc2execQ: first,deq < enq epoch: write < read pc: write < read < write Write from exec, then read & write from pcGen
13
June 3, 2008 13 The new PC module module mkPCReg(ExtendedReg#(a)); Reg#(a) r <- mkReg(0); RWire#(a) rw1 <- mkRWire(); RWire#(a) rw2 <- mkRWire(); rule doWrite(isJust(rw1.wget) || isJust(rw2.wget)); r <= fromMaybe(rw2.wget, fromMaybe(rw1.wget, ?)); endrule method Action write1(x); rw.wset(x); endmethod method a read(); return fromMaybe(rw1.wget, r); endmethod method Action write2(x); rw2.wset(x); endmethod endmodule
14
June 3, 2008 14
15
June 3, 2008 15 First Task: Pipelined CPU pc epoch PCGe n Exec WB ICache DCache RFile New Instruction can start before the first finishes Need more buffering
16
June 3, 2008 16 Branch Prediction Pipeline has simple speculation rule pcGen (True); pc <= pc + 4; otherActions; endrule Simplest prediction: Always not-taken
17
June 3, 2008 17 Branch Prediction interface BranchPredictor; method Addr getNextPC(Addr pc); method Action update (Addr pc, Addr correct_next_pc); endinterface rule pcGen (True); pc <= pred.getNextPC(pc); otherActions; endrule rule execute … if (nextPC != correctPC) pred.update(curPc, nextPC); case (instr) matches … BzTaken: if (mispredicted) … endrule Update predictions Make prediction
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.