1
Sunshine Slam
Khian Hao Lim, Haywood Ho, Soe Myint, Leo Ting, Ka Hou Chan
2
Overview
- Datapath of the 10-stage deep pipeline
- Cache
- Branch prediction and jump target prediction
- Critical path
- Xilinx tools, testing methodology
- Performance
- Current status, conclusion
3
Datapath I
Stages: IF0, IF1, IF2
Blocks: PC, next-PC logic, instruction cache, branch prediction
4
Datapath II
Stages: ID, FO, EX1, EX2
Blocks: control, register file, decode, forwarding logic, Mux A, Mux B, ALU, branch verify, FO muxes
5
Datapath III
Stages: ME1, ME2, WB
Blocks: data cache, mux, FO muxes, monitor, statistics
6
Cache Architecture
Diagram labels: processor datapath (EX2, ME1, ME2), BlockRAMs (tags, data), tag comparators (=), write buffer, Cache Meister, SDRAM
- 2-way set-associative
- Random cache-line replacement policy
- Cache miss detected in ME2
- Write buffer congestion detected in ME1
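The two cache properties named on this slide, 2-way set associativity and random line replacement, can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `TwoWayCache`, the set count, and the line size are hypothetical and are not taken from the slides, and the model tracks tags only (no data, write buffer, or SDRAM timing).

```python
import random


class TwoWayCache:
    """Toy model of a 2-way set-associative cache with random replacement.

    num_sets and line_bytes are illustrative assumptions, not the real design.
    """

    def __init__(self, num_sets=4, line_bytes=16, seed=0):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        # Each set has two ways; a way holds the resident line's tag or None.
        self.tags = [[None, None] for _ in range(num_sets)]
        self.rng = random.Random(seed)
        self.hits = 0
        self.misses = 0

    def _split(self, addr):
        """Split a byte address into (set index, tag)."""
        line = addr // self.line_bytes
        return line % self.num_sets, line // self.num_sets

    def access(self, addr):
        """Return True on a hit; on a miss, fill the line and return False."""
        index, tag = self._split(addr)
        ways = self.tags[index]
        if tag in ways:
            self.hits += 1
            return True
        self.misses += 1
        # Fill an empty way if there is one; otherwise evict a random way.
        victim = ways.index(None) if None in ways else self.rng.randrange(2)
        ways[victim] = tag
        return False


cache = TwoWayCache()
for addr in (0, 4, 64, 0, 128):
    cache.access(addr)
print(cache.hits, cache.misses)
```

Addresses 0, 64, and 128 all map to set 0 in this configuration, so the third distinct line forces a random eviction of one of the two resident lines.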
7
Branch, Jump Prediction
8
Critical Path
Path               Logic delay (ns)   Route delay (ns)   Total delay (ns)
Write buffer       6.192 (28.7%)      15.353 (71.3%)     21.545
Stalling logic     4.019 (19.8%)      16.256 (80.2%)     20.275
ALU                7.177 (36.5%)      12.476 (63.5%)     19.653
Forwarding logic   6.715 (35.1%)      12.406 (64.9%)     19.121
Branch verifier    7.130 (37.7%)      11.790 (62.3%)     18.920
Branch predictor   4.765 (24.7%)      14.545 (75.3%)     19.310
Forwarding muxes   3.158 (22.8%)      10.666 (77.2%)     13.824
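The percentages in the table follow directly from the logic and route delays. As a quick check, the short script below recomputes each path's total delay and routing share from the figures on the slide; the variable and function names are illustrative only.

```python
# (logic delay ns, route delay ns) per path, copied from the table above.
paths = {
    "Write buffer": (6.192, 15.353),
    "Stalling logic": (4.019, 16.256),
    "ALU": (7.177, 12.476),
    "Forwarding logic": (6.715, 12.406),
    "Branch verifier": (7.130, 11.790),
    "Branch predictor": (4.765, 14.545),
    "Forwarding muxes": (3.158, 10.666),
}


def total_delay(name):
    """Total path delay in ns (logic + routing)."""
    logic, route = paths[name]
    return logic + route


def route_share(name):
    """Percentage of the total delay spent in routing."""
    _, route = paths[name]
    return 100.0 * route / total_delay(name)


for name in paths:
    print(f"{name:18s} {total_delay(name):7.3f} ns  route {route_share(name):4.1f}%")

# The longest path in the table.
slowest = max(paths, key=total_delay)
print("Slowest path:", slowest)
```

In every listed path, routing accounts for more than 60% of the delay, which is consistent with the emphasis on constraining fanout elsewhere in the deck.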
9
Xilinx Tools
- Read up on tutorials on the Xilinx website to become more familiar with the tools
- Added timing constraints to the clock and other paths
- Critical path shortened (37 ns → 25 ns) after adding constraints and constraining fanout
- Guide design files
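To put the 37 ns to 25 ns improvement in perspective, the critical-path delay can be converted to an upper bound on clock frequency. The sketch below assumes the clock period is limited only by the critical path (ignoring clock skew and setup margins, which the slides do not quantify); the function name is hypothetical.

```python
def max_clock_mhz(critical_path_ns):
    """Upper bound on clock frequency in MHz for a given critical path in ns."""
    return 1000.0 / critical_path_ns


before = max_clock_mhz(37.0)  # before adding constraints
after = max_clock_mhz(25.0)   # after constraints and fanout limits
print(f"{before:.1f} MHz -> {after:.1f} MHz ({after / before:.2f}x)")
```

Under that assumption, the bound rises from roughly 27 MHz to 40 MHz.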
10
Testing Methodology
- Black-box tests for each module
- Verified memory controller functionality on the board
- Replaced caches with block RAMs
- Tested the entire processor in simulation
- Made changes to help alleviate clock skew problems on the board
- Added a "shadow" register file so we could debug more easily on the board
11
Performance
Measurement Results
Test program   CPI    Branch prediction correct (%)
Quick sort     2.2    64
Extra          3.84   70.11
Base           3.06   69.06
How did we measure our processor's performance?
- Added a statistics module
- Diagram labels: I-cache and D-cache, WB stage, stalling logic, statistics module
- Collect data from different modules, e.g. in the branch predictor: count the numbers of correctly and wrongly predicted branches
- Count the total number of cycles
- Count the number of valid instructions executed
- CPI = total cycles / number of valid instructions executed
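The counters described on this slide can be modeled in a few lines of software. The following is a minimal sketch, not the hardware module itself: the class name `Statistics`, the `tick` interface, and the sample trace are all hypothetical. Only the CPI formula and the idea of counting correct and wrong predictions come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Statistics:
    """Software model of the counters a statistics module might keep."""

    cycles: int = 0
    valid_instructions: int = 0
    branches_correct: int = 0
    branches_wrong: int = 0

    def tick(self, instruction_valid, branch_outcome=None):
        """Record one clock cycle.

        instruction_valid: True if a valid instruction completed this cycle.
        branch_outcome: None if no branch was resolved this cycle, otherwise
        True for a correct prediction and False for a wrong one.
        """
        self.cycles += 1
        if instruction_valid:
            self.valid_instructions += 1
        if branch_outcome is True:
            self.branches_correct += 1
        elif branch_outcome is False:
            self.branches_wrong += 1

    @property
    def cpi(self):
        """CPI = total cycles / number of valid instructions executed."""
        return self.cycles / self.valid_instructions

    @property
    def prediction_accuracy(self):
        """Percentage of resolved branches that were predicted correctly."""
        total = self.branches_correct + self.branches_wrong
        return 100.0 * self.branches_correct / total


# Made-up trace: (instruction_valid, branch_outcome) for six cycles.
stats = Statistics()
trace = [(True, None), (False, None), (True, True),
         (False, None), (False, None), (True, False)]
for valid, outcome in trace:
    stats.tick(valid, outcome)
print(stats.cpi, stats.prediction_accuracy)
```

Both properties divide by a counter, so they are only meaningful once at least one valid instruction and one branch have been recorded.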
12
Lessons / Evaluation / Further Improvements
- Simpler write buffer design
  - Use a smaller write buffer to reduce logic
  - Use random replacement instead of FIFO
- Pipeline the stalling logic
  - Do the necessary computation in the IF2 stage, then decide whether to stall in the ID stage
- Pipeline the branch verifier
  - Do the computation in the EX1 or FO stage, then compare or look up a table in the next stage
Diagram stages: IF2, ID, FO, EX1, EX2