Modeling CPU’s using Different MOC’s: a Case Study Trevor C. Meyerowitz Advisor: Alberto Sangiovanni-Vincentelli 290n Final Presentation May 15 2002.

2 Outline
- Introduction
  - Motivation
  - The Simple CPU to be modeled
  - The Domains Investigated
- Modeling a Non-Pipelined Processor
- Modeling a Pipelined Processor
- Demo
- Conclusions

3 Motivation
- Processor designs are becoming much larger and more complicated
  - Many instructions in flight at a single time
  - Strange orderings, speculation
  - This can be very hard to verify
  - We are developing a methodology to help alleviate these problems.
- Using different Models of Computation (MOCs) can potentially simplify the design task
- PtolemyII allows us to compare a variety of these MOCs in a unified framework

4 The Simple CPU
- Processor Statistics
  - Small instruction set
  - ADD, SUB, ADDI, SUBI, and BNE
  - Only integer operations
  - 128 registers, 128-entry instruction memory
- This is enough to be interesting
  - Data dependencies
  - Control flow
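To make the processor description concrete, here is a minimal Java sketch of an interpreter for the five instructions. The slides do not give an encoding, so the `Inst` field layout, the PC-relative meaning of BNE, and all class and method names are assumptions made for illustration only.

```java
// Sketch of the five-instruction CPU. The field layout and the PC-relative
// BNE semantics are assumptions for illustration, not taken from the slides.
final class SimpleCpu {
    enum Op { ADD, SUB, ADDI, SUBI, BNE }

    // ADD/SUB:   regs[a] = regs[b] op regs[c]
    // ADDI/SUBI: regs[a] = regs[b] op c   (c is an immediate)
    // BNE:       if regs[a] != regs[b], jump to pc + c (assumed)
    static final class Inst {
        final Op op;
        final int a, b, c;
        Inst(Op op, int a, int b, int c) {
            this.op = op; this.a = a; this.b = b; this.c = c;
        }
    }

    static final int SIZE = 128;          // 128 registers, 128-entry instruction memory
    final int[] regs = new int[SIZE];
    final Inst[] imem = new Inst[SIZE];
    int pc = 0;

    // Executes one instruction; returns false once pc leaves the program.
    boolean step() {
        if (pc < 0 || pc >= SIZE || imem[pc] == null) return false;
        Inst i = imem[pc];
        int next = pc + 1;
        switch (i.op) {
            case ADD:  regs[i.a] = regs[i.b] + regs[i.c]; break;
            case SUB:  regs[i.a] = regs[i.b] - regs[i.c]; break;
            case ADDI: regs[i.a] = regs[i.b] + i.c; break;
            case SUBI: regs[i.a] = regs[i.b] - i.c; break;
            case BNE:  if (regs[i.a] != regs[i.b]) next = pc + i.c; break;
        }
        pc = next;
        return true;
    }

    // Runs until the program ends or maxSteps is reached; returns steps executed.
    int run(int maxSteps) {
        int n = 0;
        while (n < maxSteps && step()) n++;
        return n;
    }

    public static void main(String[] args) {
        SimpleCpu cpu = new SimpleCpu();
        cpu.imem[0] = new Inst(Op.ADDI, 1, 0, 3);   // R1 = R0 + 3 (loop counter)
        cpu.imem[1] = new Inst(Op.ADDI, 2, 2, 5);   // R2 = R2 + 5
        cpu.imem[2] = new Inst(Op.SUBI, 1, 1, 1);   // R1 = R1 - 1
        cpu.imem[3] = new Inst(Op.BNE, 1, 0, -2);   // back to index 1 while R1 != R0
        cpu.run(1000);
        System.out.println("R2 = " + cpu.regs[2]);
    }
}
```

The small loop in `main` exercises both features the slide calls interesting: the SUBI/BNE pair is a data dependency, and the BNE itself is control flow.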

5 The Domains Investigated
- Process Networks
  - Untimed model
  - Kahn-MacQueen semantics
  - Infinite queues
  - Blocking reads
  - Fully deterministic
  - Schedule independent
- Synchronous Reactive
  - Untimed model
  - Instantaneous communication and computation
  - Iterates until a fixed point is found
  - Signals must be monotonic
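The fixed-point iteration of the Synchronous Reactive domain can be sketched in a few lines of Java. This is not PtolemyII code: the `Actor` interface, the `solve` method, and the use of `null` for an unknown signal are all hypothetical names chosen for illustration. Each actor may only turn an unknown signal into a known one, which is the monotonicity requirement from the slide.

```java
import java.util.Arrays;

// Sketch of one SR "tick": every signal starts unknown (null) and actors are
// re-fired until no signal changes. All names here are illustrative only.
final class SrTick {
    // An actor may only move a signal from unknown to known (monotonic).
    interface Actor { void fire(Integer[] signals); }

    // Re-fires all actors until a fixed point is reached; returns the pass count.
    static int solve(Integer[] signals, Actor[] actors) {
        int passes = 0;
        boolean changed = true;
        while (changed) {
            Integer[] before = Arrays.copyOf(signals, signals.length);
            for (Actor a : actors) a.fire(signals);
            changed = !Arrays.equals(before, signals);
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        // Signals: index 0 = a, 1 = b, 2 = c, with b = a + 1 and c = b * 2.
        // The actors are deliberately listed in the "wrong" order.
        Integer[] s = new Integer[3];
        Actor[] actors = {
            sig -> { if (sig[1] != null && sig[2] == null) sig[2] = sig[1] * 2; },
            sig -> { if (sig[0] != null && sig[1] == null) sig[1] = sig[0] + 1; },
            sig -> { if (sig[0] == null) sig[0] = 4; }
        };
        int passes = solve(s, actors);
        System.out.println(Arrays.toString(s) + " after " + passes + " passes");
    }
}
```

Because the actors are monotonic, the order in which they are fired affects only the number of passes, not the final signal values.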

6 The Nonpipelined Processor
- Code and netlist reused for both domains (i.e., these are domain-polymorphic actors)
- Represented in PtolemyII as: Fetch, Regfile, Execute, and a Delay.
- Fetch only after the previous instruction has completed

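7 Non-Pipelined Processor Pseudocode (Fetch + Regfile)

    public class Fetch … {
        …
        public void fire() {
            // Wait for the next PC, then fetch and decode one instruction.
            pc = input_pc.get(0);
            (inst, rs, rt) = readIMEM(pc);
            output_inst.send(0, inst);
            output_regs.send(rs, rt);
            …
        }
        …
    }

    public class Reg … {
        …
        public void fire() {
            if (read_mode) {
                // Read phase: look up the source register values.
                inst = input_get_op_codes();
                (rs_v, rt_v) = read_regs();
                output_regs.send(0, inst);
                output_regs.send(rs_v, rt_v);
            } else {
                // Write phase: commit the result of the instruction.
                rd_v = input_get_write_vals();
                write_values();
            }
            read_mode = !read_mode;   // alternate between the two phases
        }
        …
    }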

8 Non-Pipelined Processor Pseudocode (Execute)

    public class Exec … {
        …
        public void fire() {
            if (!write_mode) {
                // Execute phase: compute the result from the register values.
                reg_vals = input_reg_vals();
                inst_type = read_inst();
                results = exec_inst(inst_type, reg_vals);
            } else {
                // Write phase: send the result and the next PC.
                write_values(rd, results);
                write_next_pc(results);
            }
            write_mode = !write_mode;   // alternate between the two phases
        }
        …
    }

9 Non-Pipelined Processor: Differences between Domains
- SR required that we put the register read and the register write in different iterations, and that we split execution from the writing of its results
- Process networks cannot query port status
- SR requires use of prefire and postfire conditions
- We shared code between the two domains; SR probably has more flexibility.
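The Process Networks restriction can be illustrated with plain Java threads and unbounded queues. This is a sketch under stated assumptions, not the PtolemyII PN director: `PnSketch`, `run`, and the two-process pipeline are hypothetical. The point is that each process only ever calls the blocking `take()` and never asks whether a queue is empty, which is what keeps the result independent of the thread schedule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of a two-process Kahn-style network: source -> doubler -> collector.
// Queues are unbounded and every read blocks; no process polls a queue.
final class PnSketch {
    static List<Integer> run(final int count) {
        final BlockingQueue<Integer> a = new LinkedBlockingQueue<>();
        final BlockingQueue<Integer> b = new LinkedBlockingQueue<>();

        Thread source = new Thread(() -> {
            for (int i = 1; i <= count; i++) a.add(i);   // writes never block
        });
        Thread doubler = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) b.add(2 * a.take());  // blocking read
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Start order is irrelevant: the doubler simply blocks until data arrives.
        doubler.start();
        source.start();

        List<Integer> out = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) out.add(b.take());
            source.join();
            doubler.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while collecting", e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

A process that used a non-blocking check such as `isEmpty()` could take different branches under different schedules, which is why the shared actor code cannot query port status in PN.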

10 Pipelined Processor
- Only required recoding of fetch behavior
  - Fetch every “iteration”
  - Only stall after branches (no branch prediction)
- No forwarding logic is required!?
  - This is because two register reads can’t occur without a register write happening between them
  - Due to PN deterministic requirement
  - Also true in SR, because of states
  - Probably could structure SR to require forwarding logic (lower level of abstraction!!)

11 Pipelined Processor – Fetch pseudo-code

    public class Fetch … {
        …
        public void fire() {
            // Read input_pc only on the first firing or after a branch.
            if (initial_firing || prev_inst_is_branch) {
                pc = input_pc.get(0);
            }
            (inst, rs, rt) = readIMEM(pc);
            output_inst.send(0, inst);
            output_regs.send(rs, rt);
            …
            pc = pc+1;
        }
        …
    }

- Reading input_pc causes Fetch to stall until the branch is finished.
- Fetch immediately fires again if there is no branch!

12 Pipelining and Forwarding (t=0)
Pipeline: Fetch, Reg File, Exec
In flight:
  Inst_2: R3 = R1(?) + R1(?)
  Inst_1: R1 = R2(4) + R3(5)

Program:
  Inst. id   Assembly code    Logical meaning
  Inst_1     ADD R1, R2, R3   R1 = R2 + R3
  Inst_2     ADD R3, R1, R1   R3 = R1 + R1

Register File State: R1 = 2, R2 = 4, R3 = 5

13 Pipelining and Forwarding (t=1)
Pipeline: Fetch, Reg File, Exec
In flight:
  Inst_2: R3 = R1(2) + R1(2)
  Inst_1: R1(9) = R2(4) + R3(5)

Program:
  Inst. id   Assembly code    Logical meaning
  Inst_1     ADD R1, R2, R3   R1 = R2 + R3
  Inst_2     ADD R3, R1, R1   R3 = R1 + R1

Register File State: R1 = 2, R2 = 4, R3 = 5

- This is an error!! Inst_2 should read R1 as 9.
- We can solve this by adding forwarding logic, or by stalling the pipeline.
- The PN and SR models don’t have this problem because they enforce the order: read inst_1, write inst_1, read inst_2.
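The arithmetic behind this hazard can be worked through in a short Java sketch that uses the register values from the slide. The class and method names are hypothetical. `overlapped` lets Inst_2 read its operands before Inst_1 writes back, as in the t=1 picture; `ordered` enforces the read, write, read order that the PN and SR models impose.

```java
import java.util.Arrays;

// Worked example of the read-after-write hazard. Index k holds register Rk;
// index 0 is unused. Initial state from the slide: R1 = 2, R2 = 4, R3 = 5.
final class HazardDemo {
    // Inst_2 reads R1 before Inst_1 has written it (no forwarding, no stall).
    static int[] overlapped() {
        int[] r = {0, 2, 4, 5};
        int inst1Result = r[2] + r[3];        // Inst_1 executes: 4 + 5 = 9
        int inst2Rs = r[1], inst2Rt = r[1];   // Inst_2 reads stale R1 = 2
        r[1] = inst1Result;                   // Inst_1 writes R1 = 9
        r[3] = inst2Rs + inst2Rt;             // Inst_2 writes R3 = 2 + 2
        return r;
    }

    // Order enforced by the PN and SR models: read 1, write 1, read 2, write 2.
    static int[] ordered() {
        int[] r = {0, 2, 4, 5};
        r[1] = r[2] + r[3];                   // Inst_1: R1 = 9
        r[3] = r[1] + r[1];                   // Inst_2 reads the updated R1
        return r;
    }

    public static void main(String[] args) {
        System.out.println("overlapped: " + Arrays.toString(overlapped()));
        System.out.println("ordered:    " + Arrays.toString(ordered()));
    }
}
```

With the overlapped reads R3 ends up as 4, while the enforced order gives the intended R3 = 18; forwarding logic or a stall would make the overlapped pipeline produce the same result as `ordered`.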

14 Pipelined Processor with Branch Prediction
- Still in order, but branches are predicted instead of stalling.
- Requires recoding of Fetch and the Register File
  - Fetch
    - Performs branch prediction
    - Handles mispredicts
  - Register File
    - Keeps a queue of instructions
    - Stall on dependencies
    - Only write resolved instructions to regfile
- This represents one refinement path
  - Biased towards Process Networks

15 Program Code:
  Inst   RD   RS   RT (Val)
  ADD     5    5    5
  ADD     6    5    5
  BNE     5   20   -3
  ADD     7    6    6
  ADD     8    7    7
  ADD     9    8    8
  ADD    10    9    9
  SUB    11   10   50

16 Outline
- Introduction
- Modeling a Non-Pipelined Processor
- Modeling a Pipelined Processor
- Demo
- Conclusions
  - Other Architectural Features
  - Observations
  - Future Work

17 Other Architectural Features
- Out of Order Execution
  - Requires “breaking” of PN model
- Superscalar execution
  - Multiple fetches at once. Might be problematic to do in PN.
- Memory systems
  - Initially simple, more complicated when refinements are added.

18 Observations
- Process Networks are relatively easy to use and are quite predictable.
- Process Networks are great for initial abstract models.
- Synchronous Reactive is simpler to work with than DE (Discrete Event), but more complicated to design with than PN.
- PN doesn’t deal well with ordering refinements, but SR can handle them better.
- We envision a methodology where you start with a PN model and then move to an SR model.

19 Future Work
- Look at implementing other architectural features
- Examine relaxing PN’s requirements
- Look at domain-specific actors
- Examine composing different MOCs
- Introduce timing
