Sunshine Slam Khian Hao Lim Haywood Ho Soe Myint Leo Ting Ka Hou Chan.

Presentation transcript:

Sunshine Slam Khian Hao Lim Haywood Ho Soe Myint Leo Ting Ka Hou Chan

Overview
- Datapath of the 10-stage pipeline
- Cache
- Branch prediction and jump target prediction
- Critical path
- Xilinx tools, testing methodology
- Performance
- Current status, conclusion

Datapath I
Stages: IF0, IF1, IF2
[Figure: PC, next-PC logic, instruction cache, branch prediction]

Datapath II
Stages: ID, FO, EX1, EX2
[Figure: control, register file, decode, forwarding logic, Mux A, Mux B, ALU, branch verify, FO muxes]

Datapath III
Stages: ME1, ME2, WB
[Figure: data cache, mux, FO muxes, monitor, statistics]

Cache Architecture
[Figure: processor datapath (EX2, ME1, ME2), BlockRAMs holding tags and data, tag comparators, write buffer, SDRAM]
Cache Meister
- 2-way set associativity
- Random cache-line replacement policy
- Cache miss detected in ME2
- Write buffer congestion detected in ME1
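The cache organization above (2-way set associative, random line replacement) can be illustrated with a small software model. This is only a sketch, not the project's hardware: the class name `TwoWayCache`, the set count, the line size, and the seeded random generator are all assumptions made for illustration, and the write buffer and SDRAM are not modeled.

```python
import random


class TwoWayCache:
    """Toy model of a 2-way set-associative cache with random replacement.

    Hypothetical parameters: num_sets and line_bytes are illustrative only.
    """

    def __init__(self, num_sets=64, line_bytes=32, seed=0):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        # Each set has two ways; a way holds the tag of its line, or None if empty.
        self.sets = [[None, None] for _ in range(num_sets)]
        self.rng = random.Random(seed)
        self.hits = 0
        self.misses = 0

    def _split(self, addr):
        """Split a byte address into (set index, tag)."""
        line = addr // self.line_bytes
        return line % self.num_sets, line // self.num_sets

    def access(self, addr):
        """Look up addr; return True on a hit, False on a miss (line is then filled)."""
        index, tag = self._split(addr)
        ways = self.sets[index]
        if tag in ways:
            self.hits += 1
            return True
        self.misses += 1
        if None in ways:
            victim = ways.index(None)      # fill an empty way first
        else:
            victim = self.rng.randrange(2)  # otherwise evict a random way
        ways[victim] = tag
        return False
```

Random replacement needs no per-set usage history, which is presumably part of its appeal in an FPGA design where logic and routing delay are tight.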

Branch, Jump Prediction

Critical Path
Path                Logic delay (ns)    Route delay (ns)    Total delay (ns)
Write buffer        (28.7%)             (71.3%)
Stalling logic      (19.8%)             (80.2%)
ALU                 7.177 (36.5%)       (63.5%)
Forwarding logic    (35.1%)             (64.9%)
Branch verifier     (37.7%)             (62.3%)
Branch predictor    (24.7%)             (75.3%)
Forwarding muxes    (22.8%)             (77.2%)             13.824

Xilinx Tools
- Read up on tutorials on the Xilinx website to become more familiar with the tools
- Added timing constraints to the clock and other paths
- Critical path shortened (37 ns to 25 ns) after adding constraints and constraining fanout
- Guide design files
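To put the critical-path numbers above in terms of clock rate: the maximum clock frequency is the reciprocal of the critical-path delay. The helper name `max_clock_mhz` below is hypothetical, and the calculation ignores clock skew and setup margins, so treat the results as upper bounds rather than measured figures.

```python
def max_clock_mhz(critical_path_ns):
    """Upper bound on clock frequency (MHz) for a given critical path (ns)."""
    return 1000.0 / critical_path_ns


before = max_clock_mhz(37)  # about 27.03 MHz
after = max_clock_mhz(25)   # 40.0 MHz
print(f"{before:.2f} MHz -> {after:.2f} MHz, ratio {after / before:.2f}x")
```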

Testing Methodology
- Black-box tests for each module
- Verify memory controller functionality on board
- Replaced caches with block RAMs
- Tested entire processor in simulation
- Made changes to help alleviate clock skew problems on board
- “Shadow” register file so we could more easily debug on board

Performance
Measurement Results
Test program    CPI / Branch prediction correct %
Quick sort      2.264%
Extra           %
Base            %

How did we measure our processor’s performance?
- Add a statistics module
  - Count the number of correctly and incorrectly predicted branches
  - Count the total number of cycles
  - Count the number of valid instructions executed
  - CPI = total cycles / number of valid instructions executed
- Collect data from different modules
  - In the branch predictor….
[Figure: I-cache and D-cache, WB stage, and stalling logic feeding the statistics module]
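The counters and the CPI formula above can be sketched in software as follows. This is an illustrative model, not the project's statistics module: the class `Stats` and its method names are assumptions, and the real design gathers these counts from hardware signals rather than method calls.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Hypothetical software stand-in for the statistics module's counters."""
    cycles: int = 0
    valid_instructions: int = 0
    branches_correct: int = 0
    branches_wrong: int = 0

    def tick(self, instruction_completed):
        """Advance one clock cycle; count the instruction if a valid one completed."""
        self.cycles += 1
        if instruction_completed:
            self.valid_instructions += 1

    def record_branch(self, predicted_taken, actually_taken):
        """Count a branch as correctly or incorrectly predicted."""
        if predicted_taken == actually_taken:
            self.branches_correct += 1
        else:
            self.branches_wrong += 1

    def cpi(self):
        """CPI = total cycles / number of valid instructions executed."""
        if self.valid_instructions == 0:
            raise ValueError("no valid instructions executed yet")
        return self.cycles / self.valid_instructions

    def prediction_accuracy(self):
        """Fraction of branches that were predicted correctly."""
        total = self.branches_correct + self.branches_wrong
        if total == 0:
            raise ValueError("no branches recorded yet")
        return self.branches_correct / total
```

Counting only valid instructions keeps stall bubbles and squashed instructions out of the denominator, so their cost shows up as a higher CPI.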

Lessons / Evaluation / Further Improvements
- Simpler write buffer design
  - Use a smaller write buffer to reduce logic
  - Use random replacement instead of FIFO
- Pipeline the stalling logic
  - Do the necessary computation in the IF2 stage, then decide whether to stall in the ID stage
- Pipeline the branch verifier
  - Do the computation in the EX1 or FO stage, then compare or look up a table in the next stage
[Figure: stages IF2, ID, FO, EX1, EX2]