Ken Barr, Ken Conley and Serhii Zhak Checkpoint I: October 19, 2000

Slides:



Advertisements
Similar presentations
Machine cycle.
Advertisements

Computer Organization and Architecture
Dynamic History-Length Fitting: A third level of adaptivity for branch prediction Toni Juan Sanji Sanjeevan Juan J. Navarro Department of Computer Architecture.
SimpleScalar v3.0 Tutorial U. of Wisconsin, CS752, Fall 2004 Andrey Litvin (main source: Austin & Burger) (also Dana Vantrease’ slides)
Sim-alpha: A Validated, Execution-Driven Alpha Simulator Rajagopalan Desikan, Doug Burger, Stephen Keckler, Todd Austin.
Computer Architecture 2011 – Out-Of-Order Execution 1 Computer Architecture Out-Of-Order Execution Lihu Rappoport and Adi Yoaz.
1 Improving Branch Prediction by Dynamic Dataflow-based Identification of Correlation Branches from a Larger Global History CSE 340 Project Presentation.
VLSI Project Neural Networks based Branch Prediction Alexander ZlotnikMarcel Apfelbaum Supervised by: Michael Behar, Spring 2005.
Computer Architecture 2011 – out-of-order execution (lec 7) 1 Computer Architecture Out-of-order execution By Dan Tsafrir, 11/4/2011 Presentation based.
Energy Efficient Instruction Cache for Wide-issue Processors Alex Veidenbaum Information and Computer Science University of California, Irvine.
Lec 17 Nov 2 Chapter 4 – CPU design data path design control logic design single-cycle CPU performance limitations of single cycle CPU multi-cycle CPU.
Computer ArchitectureFall 2007 © October 3rd, 2007 Majd F. Sakr CS-447– Computer Architecture.
5.2 Mathematical Power, Convenience, and Cost The set of operations represents a tradeoff among the cost of the hardware, the convenience for a programmer,
1 Lecture 8: Instruction Fetch, ILP Limits Today: advanced branch prediction, limits of ILP (Sections , )
Copyright © 1998 Wanda Kunkle Computer Organization 1 Chapter 2.1 Introduction.
1 Lecture 18: Pipelining Today’s topics:  Hazards and instruction scheduling  Branch prediction  Out-of-order execution Reminder:  Assignment 7 will.
Computer Architecture 2010 – Out-Of-Order Execution 1 Computer Architecture Out-Of-Order Execution Lihu Rappoport and Adi Yoaz.
Requirements Determine processor core Determine the number of hardware profiles and the benefits of each profile Determine functionality of each profile.
Analysis of Branch Predictors
COSC 3430 L08 Basic MIPS Architecture.1 COSC 3430 Computer Architecture Lecture 08 Processors Single cycle Datapath PH 3: Sections
Instruction Issue Logic for High- Performance Interruptible Pipelined Processors Gurinder S. Sohi Professor UW-Madison Computer Architecture Group University.
CS1104 – Computer Organization PART 2: Computer Architecture Lecture 12 Overview and Concluding Remarks.
1/25 June 28 th, 2006 BranchTap: Improving Performance With Very Few Checkpoints Through Adaptive Speculation Control BranchTap Improving Performance With.
1  1998 Morgan Kaufmann Publishers Where we are headed Performance issues (Chapter 2) vocabulary and motivation A specific instruction set architecture.
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
Combining Software and Hardware Monitoring for Improved Power and Performance Tuning Eric Chi, A. Michael Salem, and R. Iris Bahar Brown University Division.
Fundamentals of Programming Languages-II
CPU Overview Computer Organization II 1 February 2009 © McQuain & Ribbens Introduction CPU performance factors – Instruction count n Determined.
Exploiting Value Locality in Physical Register Files Saisanthosh Balakrishnan Guri Sohi University of Wisconsin-Madison 36 th Annual International Symposium.
CMSC 611: Advanced Computer Architecture Performance & Benchmarks Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some.
Application Domains for Fixed-Length Block Structured Architectures ACSAC-2001 Gold Coast, January 30, 2001 ACSAC-2001 Gold Coast, January 30, 2001.
Out-of-order execution Lihu Rappoport 11/ MAMAS – Computer Architecture Out-Of-Order Execution Dr. Lihu Rappoport.
Simulator Outline of MIPS Simulator project  Write a simulator for the MIPS five-stage pipeline that does the following: Implements a subset of.
1/25 HIPEAC 2008 TurboROB TurboROB A Low Cost Checkpoint/Restore Accelerator Patrick Akl 1 and Andreas Moshovos AENAO Research Group Department of Electrical.
§ Georgia Institute of Technology, † Intel Corporation Initial Observations of Hardware/Software Co-simulation using FPGA in Architecture Research Taeweon.
CSE431 L13 SS Execute & Commit.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 13: SS Backend (Execute, Writeback & Commit) Mary Jane.
Computer Organization and Architecture Lecture 1 : Introduction
Dynamic Scheduling Why go out of style?
ATLAS Pre-Production ROD Status SCT Version
Amir Roth and Gurindar S. Sohi University of Wisconsin-Madison
Introduction CPU performance factors
Systems Architecture I
Morgan Kaufmann Publishers The Processor
Morgan Kaufmann Publishers
CS2100 Computer Organisation
A Review of Processor Design Flow
Taeweon Suh § Hsien-Hsin S. Lee § Shih-Lien Lu † John Shen †
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
Lecture 8: ILP and Speculation Contd. Chapter 2, Sections 2. 6, 2
Understanding Performance Counter Data - 1
Lecture 18: Pipelining Today’s topics:
Intro to Architecture & Organization
Yingmin Li Ting Yan Qi Zhao
Checkpoint 2: 11/21/00 Kenneth Barr Kenneth Conley Serhii Zhak
Alpha Microarchitecture
Lecture: Out-of-order Processors
15-740/ Computer Architecture Lecture 5: Precise Exceptions
Morgan Kaufmann Publishers The Processor
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
Systems Architecture I
Patrick Akl and Andreas Moshovos AENAO Research Group
October 29 Review for 2nd Exam Ask Questions! 4/26/2019
Systems Architecture I
rePLay: A Hardware Framework for Dynamic Optimization
October 9, 2003.
Conceptual execution on a processor which exploits ILP
Predication ECE 721 Prof. Rotenberg.
Srinivas Neginhal Anantharaman Kalyanaraman CprE 585: Survey Project
CS2100 Computer Organisation
Presentation transcript:

Ken Barr, Ken Conley and Serhii Zhak Checkpoint I: October 19, 2000 6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Correlation Between State and Control Signals in Out-of-Order Issue Logic Ken Barr, Ken Conley and Serhii Zhak Checkpoint I: October 19, 2000

Modifications to Project Goals 6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Modifications to Project Goals More data mining, less implementation Abstracted from particular hardware architecture Correlation statistics What control signals occur at moment instruction is fetched? How consistent is instruction’s path through pipeline? Register renaming statistics Quantitative measure of how quickly register resources freed Use as metric in correlation statistics

6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Preliminary Results Correlation statistic does not include register renaming, yet Initial statistics show 50-70% rate of consistency for instruction path Run with SpecInt95 benchmark suite Measurement based on mathematical mode Bimodal distribution giving lower-than-expected results

Register Renaming Issues 6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Register Renaming Issues Register renaming or large register file necessary for OOO performance Should/does register renaming need to be included in prediction? Prediction for register renaming depends on consistent renaming Perfectly consistent renaming equivalent to no renaming (ignoring ISA benefits) SimpleScalar register renaming not as versatile as we had hoped

6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Examining Renaming Simplescalar RUU scheme ties renaming to all aspects of OOO execution. Hard to isolate renamer We extend Simplescalar to manage a “free list” of physical registers Also, add per-instruction renaming statistics Min/max/avg length of “renamedness” Number of “remaps” Re-execution counter Following slides: 500K instructions of a Perl benchmark with 16 RUU_Stations.

6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Reservation Length Every instruction needs physical register for 3 cycles. 80% relinquish their mapping within 7 cycles. This gives hope that a scheme could quickly reassign a mapping We need to compare this with average CPI

Remap Rate remap rate = samename / repetitions Discouraging chart! 6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Remap Rate remap rate = samename / repetitions Discouraging chart! But there’s some good news: “Samename” counts only mappings that match the first mapping. Perhaps two-bit-counter or confidence mechanism could enhance this.

Progress to Checkpoint 2 and Beyond! 6.893 Project Checkpoint I Ken Barr, Ken Conley, and Serhii Zhak Progress to Checkpoint 2 and Beyond! Final version of data mining engines Investigate bimodal distributions Build custom test patterns Run more, longer simulations Continue work on SimpleScalar for collection of statistics