A Configurable Simulator for OOO Speculative Execution: Design & Implementation
Presented by Mustafa Imran Ali, ID#230203
Fall 2004, COE 501
Architecture Modeled
- Fetch logic
  - Trace-driven execution; branch outcomes are explicitly specified
- Issue logic
  - Issue width configurable
- Functional units' reservation stations (RS)
  - RS count configurable
- Execution units modeled after the MIPS R4000 pipeline (Hennessy & Patterson, Computer Architecture, 3rd ed.)
  - Number of pipeline stages configurable
- Common data buses (CDBs)
  - Number of CDBs configurable
- ROB and commit logic
  - ROB size and commit capacity configurable
Simulation Methodology
- A program trace file written in comma-separated values (CSV) format
- A configuration file that specifies the values of the configurable parameters
- The trace file and the configuration file are the inputs to the simulator
Architectural Assumptions
- Only load misses are supported
- Stores are committed in a single cycle
- Stores use a direct bus to transfer the calculated effective address (EA) into the ROB
- Branch outcomes are written to the ROB over the CDB
- A branch mispredict is handled when the branch instruction reaches the head of the ROB
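The commit rule above (a mispredict is acted on only when the branch reaches the head of the ROB) can be sketched as follows. This is a minimal illustration, not the simulator's code: `RobEntry`, its fields, and `commitCycle` are hypothetical names, and the sketch assumes that a mispredicted branch squashes every younger ROB entry.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical ROB entry; the field names are illustrative only.
struct RobEntry {
    bool done;          // result (or branch outcome) has been written to the ROB
    bool isBranch;
    bool mispredicted;  // meaningful only for completed branches
};

// Commits up to commitSize completed entries from the head of the ROB.
// When a mispredicted branch reaches the head, it is committed and all
// younger entries are discarded (assumed recovery policy).
// Returns the number of entries committed in this cycle.
std::size_t commitCycle(std::deque<RobEntry>& rob, std::size_t commitSize,
                        bool& flushed) {
    flushed = false;
    std::size_t committed = 0;
    while (committed < commitSize && !rob.empty() && rob.front().done) {
        RobEntry head = rob.front();
        rob.pop_front();
        ++committed;
        if (head.isBranch && head.mispredicted) {
            rob.clear();  // squash speculative work behind the branch
            flushed = true;
            break;
        }
    }
    return committed;
}
```

Checking for the mispredict only at the head keeps recovery simple, since everything older than the branch has already committed, at the cost of letting wrong-path instructions occupy the ROB until then.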
Architectural Assumptions (cont.)
- Dynamic memory disambiguation is implemented with a store EA cache
  - A load is allowed to proceed only if there are no pending stores with the same effective address
- Reservation stations issue the first ready instruction detected
  - Not necessarily the oldest instruction
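The two rules on this slide can be illustrated with a short sketch. The names `StoreEaCache`, `loadMayProceed`, `RsEntry`, and `selectFirstReady` are hypothetical, and the sketch assumes that the effective address of every pending store is already in the cache when the load is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical store EA cache: effective addresses of stores that are
// still pending (not yet committed).
struct StoreEaCache {
    std::vector<std::uint64_t> pending;

    bool hasPendingStore(std::uint64_t ea) const {
        for (std::uint64_t s : pending)
            if (s == ea) return true;
        return false;
    }
};

// A load may proceed only if no pending store has the same effective address.
bool loadMayProceed(const StoreEaCache& cache, std::uint64_t loadEa) {
    return !cache.hasPendingStore(loadEa);
}

// Hypothetical reservation station slot.
struct RsEntry {
    bool busy;           // slot holds an instruction
    bool operandsReady;  // all source operands are available
};

// Scans the slots in order and returns the index of the first ready
// instruction, or -1 if none is ready. Slot order is unrelated to program
// order, so the selected instruction is not necessarily the oldest.
int selectFirstReady(const std::vector<RsEntry>& rs) {
    for (std::size_t i = 0; i < rs.size(); ++i)
        if (rs[i].busy && rs[i].operandsReady) return static_cast<int>(i);
    return -1;
}
```

A first-ready scan avoids tracking instruction age inside each reservation station, which keeps the selection logic simple.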
Architectural Assumptions (cont.)
- The available CDBs are arbitrated
- When requests for a CDB arrive, they are granted in the following priority order:
  1. Branch FU
  2. DIV FU
  3. LD/ST
  4. MULT FU
  5. FPADD FU
  6. INT ALU FU
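A fixed-priority arbiter of this kind can be sketched as below. The enumeration `Fu` and the function `arbitrate` are hypothetical names, and the sketch assumes the list above runs from highest to lowest priority and that at most `numCdb` requests are granted per cycle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Enumerator order mirrors the priority list on the slide.
// Assumption: a lower value means a higher priority.
enum class Fu { Branch = 0, Div, LoadStore, Mult, FpAdd, IntAlu };

// Grants at most numCdb of the pending requests, highest priority first.
// Requests of equal priority keep their arrival order (stable sort).
std::vector<Fu> arbitrate(std::vector<Fu> requests, std::size_t numCdb) {
    std::stable_sort(requests.begin(), requests.end(), [](Fu a, Fu b) {
        return static_cast<int>(a) < static_cast<int>(b);
    });
    if (requests.size() > numCdb) requests.resize(numCdb);
    return requests;
}
```

With two CDBs and requests from the INT ALU, MULT, Branch, and LD/ST units in the same cycle, this sketch grants the Branch and LD/ST requests; the other two must request again in a later cycle.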
List of Configurable Parameters
- ISSUE SIZE: the maximum number of instructions examined for parallel issue
- COMMIT SIZE: the maximum number of instructions examined in the ROB for commit
- ROB SIZE: the number of entries in the reorder buffer
- NUM CDB: the number of common data buses
- LSQ SIZE: the number of entries in the load/store buffer
- STORE CACHE SIZE: the number of entries in the store EA lookup table
List of Configurable Parameters (cont.)
- NUMRSBU: the number of reservation stations in the branch prediction unit
- NUMRSINTALU: the number of reservation stations in the integer ALU
- NUMRSMULT: the number of reservation stations in the integer multiplier
- MULTSTAGES: the number of pipeline stages in the integer multiplier
- NUMRSDIV: the number of reservation stations in the integer division unit
List of Configurable Parameters (cont.)
- DIVCYCLES: the number of stages in integer division
- NUMRSFPADD: the number of reservation stations in the floating-point adder
- FPADDSTAGES: the number of pipeline stages in the floating-point adder
- MISSPROB: the load miss probability
- MPPROB: the branch mispredict probability
Simulator Structure

main()
{
    readtracefile();
    readconfigfile();
    while (NOT EXIT) {
        /* stages are evaluated from the back of the pipeline to the front */
        commit();
        ROB_update();
        RS_update();
        CDB_Arbiter();
        writeback();
        execute();
        issue();
        fetch();
    }
    printStatistics();
}
Block Diagram
[Figure: block diagram showing the instruction trace, issue unit, INT ALU RS, BR unit RS, LSQ, DIV unit RS, MULT unit RS, functional units, arbiter, CDB, ROB, and RF]
Metrics Measured
- Cycles to complete
- Issue stall cycles
  - Cycles in which no instruction can be issued to an RS
- FU utilization (for each FU)
  - Number of instructions of that FU type / total cycles
- CDB utilization (for each CDB)
  - Number of broadcasts / total cycles
- Cycles per instruction (CPI)
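The ratio metrics above reduce to simple divisions over per-run counters. The sketch below is illustrative only: the `Counters` struct and the function names are hypothetical, and CPI is assumed to be total cycles divided by the number of committed instructions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-run counters collected by the simulation loop.
struct Counters {
    std::uint64_t totalCycles;
    std::uint64_t committedInstructions;
    std::uint64_t issueStallCycles;
};

// Division helper that returns 0 instead of dividing by zero.
double ratio(std::uint64_t num, std::uint64_t den) {
    return den == 0 ? 0.0 : static_cast<double>(num) / static_cast<double>(den);
}

// FU utilization: instructions executed on this FU type / total cycles.
double fuUtilization(std::uint64_t fuInstructions, const Counters& c) {
    return ratio(fuInstructions, c.totalCycles);
}

// CDB utilization: broadcasts on this CDB / total cycles.
double cdbUtilization(std::uint64_t broadcasts, const Counters& c) {
    return ratio(broadcasts, c.totalCycles);
}

// CPI: total cycles / committed instructions (assumed definition).
double cyclesPerInstruction(const Counters& c) {
    return ratio(c.totalCycles, c.committedInstructions);
}
```

For example, a run of 200 cycles that commits 100 instructions has a CPI of 2.0, and an FU that executed 50 of them has a utilization of 0.25.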
Metrics Measured (cont.)
- Frequency of each issue count over all execution cycles
- Frequency of each commit count over all execution cycles
- RS occupancy
  - Frequency over all cycles
- ROB occupancy
  - Frequency over all cycles
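Each of these frequency metrics is a histogram with one sample recorded per cycle. A minimal sketch, assuming a hypothetical `FrequencyCounter` class whose largest observable value (for example the issue width or the ROB size) is known in advance:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts in how many cycles each value in 0..maxValue was observed.
class FrequencyCounter {
public:
    explicit FrequencyCounter(std::size_t maxValue) : bins_(maxValue + 1, 0) {}

    // Call once per simulated cycle with the value seen in that cycle.
    void record(std::size_t value) {
        assert(value < bins_.size());
        ++bins_[value];
        ++samples_;
    }

    // Number of cycles in which `value` was observed.
    std::size_t count(std::size_t value) const { return bins_.at(value); }

    // Fraction of all recorded cycles in which `value` was observed.
    double fraction(std::size_t value) const {
        return samples_ == 0
                   ? 0.0
                   : static_cast<double>(bins_.at(value)) / static_cast<double>(samples_);
    }

private:
    std::vector<std::size_t> bins_;
    std::size_t samples_ = 0;
};
```

One such counter per metric (issue count, commit count, and the occupancy of each RS and of the ROB) would be updated at the end of every cycle and reported with the other statistics.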
Simulator Design
- Coded in C++
- Compiled using MS VC++ 6.0
Execution Demonstration

Register state initializations:
    REGS[1].valid=1
    REGS[2].valid=1
    REGS[3].valid=1
    REGS[8].valid=1
    REGS[9].valid=1
    REGS[11].valid=1
    REGS[12].valid=1
    REGS[15].valid=1
    REGS[16].valid=1
    REGS[17].valid=1

Sample program:
    ADD R0,R1,R2;
    ADD R4,R0,R3;
    ADD R7,R4,R0;
    ADD R10,R11,R12;
    ADD R13,R10,R15;
    ADD R13,R16,R17;
    ADD R15,R11,R12;
    ADD R17,R15,R12;
    EXIT

The program contains RAW, WAR, and WAW dependences. For example, the second ADD reads R0 written by the first (RAW), the fifth and sixth ADDs both write R13 (WAW), and the seventh ADD writes R15, which the fifth reads (WAR).
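The dependences in the sample program can be found mechanically by comparing register numbers. The sketch below is an illustration, not part of the simulator: `Instr`, `Hazard`, `findHazards`, and `countKind` are hypothetical names, and every earlier/later pair is compared by register name, without checking whether an intervening write hides the dependence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One three-operand instruction: dest = src1 op src2 (register numbers).
struct Instr { int dest; int src1; int src2; };

// A dependence between two instructions, identified by 1-based position.
struct Hazard { std::string kind; std::size_t earlier; std::size_t later; int reg; };

// Compares every earlier/later instruction pair by register name.
std::vector<Hazard> findHazards(const std::vector<Instr>& prog) {
    std::vector<Hazard> found;
    for (std::size_t i = 0; i < prog.size(); ++i) {
        for (std::size_t j = i + 1; j < prog.size(); ++j) {
            const Instr& a = prog[i];
            const Instr& b = prog[j];
            if (b.src1 == a.dest || b.src2 == a.dest)   // later reads what earlier writes
                found.push_back({"RAW", i + 1, j + 1, a.dest});
            if (b.dest == a.dest)                       // both write the same register
                found.push_back({"WAW", i + 1, j + 1, a.dest});
            if (b.dest == a.src1 || b.dest == a.src2)   // later writes what earlier reads
                found.push_back({"WAR", i + 1, j + 1, b.dest});
        }
    }
    return found;
}

// Counts the hazards of one kind ("RAW", "WAW", or "WAR").
std::size_t countKind(const std::vector<Hazard>& hazards, const std::string& kind) {
    std::size_t n = 0;
    for (const Hazard& h : hazards)
        if (h.kind == kind) ++n;
    return n;
}
```

Encoding the eight ADD instructions as `{dest, src1, src2}` triples, for example `{0, 1, 2}` for `ADD R0,R1,R2`, this pairwise check reports five RAW, one WAW, and two WAR dependences.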
Results: Cycles
Present Implementation
- Completely configurable simulator
Immediate Extensions