7. Microarchitecture of Superscalars (5) Dynamic Instruction Issue


7. Microarchitecture of Superscalars (5) Dynamic Instruction Issue. Dezső Sima, Fall 2006. © D. Sima, 2006

Overview
1 The principle of dynamic instruction issue
2 Design space
  2.2 Types of issue buffers
  2.3 Operand fetch policies
3 Principle of operation of dynamic instruction issue
  3.1 Dispatch bound operand fetching
  3.2 Issue bound operand fetching
4 Implementation of dynamic instruction issue in superscalars
  4.1 The introduction of dynamic instruction issue
  4.2 Basic implementation schemes
5 Case examples

1. Principle of dynamic instruction issue (1) Aim: To eliminate the issue bottleneck of early (first generation) superscalars

1. Principle of dynamic instruction issue (2) The issue bottleneck
Figure 1.1: The principle of dynamic instruction issue. (a) Simplified structure of the microarchitecture assuming unbuffered issue: I-cache, I-buffer, instruction window (3), decode/check/issue, EUs. (b) The issue process: dependent instructions block instruction issue.

1. Principle of dynamic instruction issue (3) Eliminating the issue bottleneck: dynamic instruction issue (shelving, buffered issue)
Figure 1.2: Principle of dynamic instruction issue. (a) Simplified structure of the microarchitecture assuming buffered issue (shelving). (b) The issue process.
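The contrast between Figures 1.1 and 1.2 can be sketched in a few lines of Python. This is a toy model rather than anything taken from the slides: the instruction tuples, register names, and the two function names are hypothetical, and the sketch only checks read-after-write dependencies within one cycle for a window of three instructions.

```python
def unbuffered_issue(window, ready_regs):
    """In-order issue: stop at the first instruction with a missing operand."""
    issued = []
    for dest, srcs in window:
        if not all(s in ready_regs for s in srcs):
            break  # a dependent instruction blocks everything behind it
        issued.append(dest)
    return issued


def buffered_issue(window, ready_regs):
    """Shelving: all instructions are dispatched to issue buffers, and any
    buffered instruction whose operands are ready may be issued."""
    return [dest for dest, srcs in window
            if all(s in ready_regs for s in srcs)]


# Hypothetical three-instruction window: (destination, source registers).
window = [("r1", ["r2", "r3"]),   # operands ready
          ("r4", ["r1", "r5"]),   # depends on r1, which is not produced yet
          ("r6", ["r7", "r8"])]   # independent, operands ready
ready = {"r2", "r3", "r5", "r7", "r8"}

print(unbuffered_issue(window, ready))  # ['r1']
print(buffered_issue(window, ready))    # ['r1', 'r6']
```

With unbuffered issue the dependent second instruction holds back the independent third one; with shelving the third instruction can proceed while the second waits in its buffer.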

2. Design space of dynamic instruction issue 2.1 Overview
Dynamic instruction issue: scope of dynamic instruction issue; layout of the issue buffers; operand fetch policy; instruction issue scheme; types of issue buffers.

2.2 Types of issue buffers
Types of issue buffers: reservation stations (RS) and issue buffers in the ROB. Reservation stations are further divided into individual RSs, group RSs, and a central RS.
[Figure: block diagrams of FX and FP EUs fed by individual, group, and central reservation stations.]
Processors listed on the slide, in slide order: Power1 (1990), PowerPC 603 (1993), PowerPC 604 (1995), Power4 (2001), Power5 (2004), K5 (1995), K7 (1999), K8 (2003), ES/9000 (1992), Power2 (1993), R10000 (1996), PM1 (Sparc64) (1995), Alpha 21264 (1997), Pentium Pro (1995), Pentium II (1997), Pentium III (1999), Pentium IV (2000), Pentium M (2003), Core (2006), Lightning (1991)p, K6 (1997)

Dynamic instruction issue: scope of buffered issue; layout of the issue buffers; operand fetch policy; instruction issue scheme; types of issue buffers.

Operand fetch policies: dispatch bound operand fetch policy and issue bound operand fetch policy
Figure 2.1: Operand fetch policies. Dispatch bound: the source register identifiers go from Decode / Issue to the register file at dispatch; the issue buffers (IB) hold OC, Rd, Op1/Rs1 and Op2/Rs2; the EUs return Rd and the result. Issue bound: the issue buffers (IB) hold OC, Rd, Rs1 and Rs2; the source register identifiers go to the register file at issue, and the register file supplies the source 1 and source 2 operands to the EUs.

3 Principle of operation of dynamic instruction issue 3.1 Dispatch bound operand fetching (1) Checking the availability of operands
[Figure: I-buffer, Decode / Issue, Dispatch, register file with V bits, issue buffers (IB) holding OC, Rd, Op1/Rs1 and Op2/Rs2 with V bits, Issue, and EUs returning Rd and the result.]

3.1 Dispatch bound operand fetching (2) Updating the issue buffers
[Figure: same structure as the previous slide; the Rd and result returned by the EUs update the register file and the issue buffer entries.]
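The two steps of dispatch bound operand fetching (checking operand availability at dispatch, then updating the issue buffers when a result arrives) can be sketched as follows. This is a minimal model under stated assumptions, not the slides' own design: the V bits are read as valid bits, register identifiers double as tags (no register renaming, so the sketch assumes each destination register has at most one pending writer), and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operand:
    """One source field of an issue buffer entry (Op/Rs plus its V bit)."""
    reg: str                     # source register identifier (Rs)
    value: Optional[int] = None  # operand value (Op) once available
    valid: bool = False          # V bit


class RegFile:
    def __init__(self, values):
        self.values = dict(values)              # register -> value
        self.valid = {r: True for r in values}  # register -> V bit


def dispatch(rf, opcode, rd, rs1, rs2):
    """Fetch operands at dispatch; unavailable ones keep only the register id."""
    def fetch(rs):
        if rf.valid[rs]:
            return Operand(rs, rf.values[rs], True)
        return Operand(rs)
    entry = {"oc": opcode, "rd": rd, "src": [fetch(rs1), fetch(rs2)]}
    rf.valid[rd] = False  # the destination now awaits a result
    return entry


def ready(entry):
    """An entry may be issued once both of its V bits are set."""
    return all(op.valid for op in entry["src"])


def write_result(rf, buffers, rd, result):
    """Update the register file and every issue buffer entry waiting for rd."""
    rf.values[rd], rf.valid[rd] = result, True
    for entry in buffers:
        for op in entry["src"]:
            if not op.valid and op.reg == rd:
                op.value, op.valid = result, True


rf = RegFile({"r1": 0, "r2": 10, "r3": 20, "r4": 0, "r5": 5})
buffers = [dispatch(rf, "add", "r1", "r2", "r3"),
           dispatch(rf, "mul", "r4", "r1", "r5")]
print([ready(e) for e in buffers])  # [True, False]
write_result(rf, buffers, "r1", 30)
print([ready(e) for e in buffers])  # [True, True]
```

The search over all waiting entries in write_result is what makes this policy costly in hardware: every result has to be compared against every source field in the issue buffers.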

3.2 Issue bound operand fetching: Checking the availability of operands
[Figure: I-buffer, Decode / Issue, Dispatch, issue buffers (IB) holding OC, Rd, Rs1 and Rs2, Issue, register file with V bits addressed by the source register identifiers, source 1 and source 2 operands delivered to the EUs.]
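Issue bound operand fetching can be sketched in the same style. Again this is an illustrative model with hypothetical names, assuming that the V bits are valid bits kept with the register file and that there is no register renaming: the issue buffers hold only identifiers, availability is checked against the register file at issue time, and a result updates the register file alone.

```python
class RegFile:
    def __init__(self, values):
        self.values = dict(values)              # register -> value
        self.valid = {r: True for r in values}  # register -> V bit


def dispatch(rf, buffers, opcode, rd, rs1, rs2):
    """Shelve identifiers only; no operand values are read at dispatch."""
    buffers.append((opcode, rd, rs1, rs2))
    rf.valid[rd] = False  # the destination now awaits a result


def issue(rf, buffers):
    """Issue the oldest buffered instruction whose source registers are valid,
    reading the operand values from the register file at that moment."""
    for i, (opcode, rd, rs1, rs2) in enumerate(buffers):
        if rf.valid[rs1] and rf.valid[rs2]:
            del buffers[i]
            return opcode, rd, rf.values[rs1], rf.values[rs2]
    return None  # nothing is ready in this cycle


def write_result(rf, rd, result):
    """Only the register file is updated; the issue buffers are untouched."""
    rf.values[rd], rf.valid[rd] = result, True


rf = RegFile({"r1": 0, "r2": 10, "r3": 20, "r4": 0, "r5": 5})
buffers = []
dispatch(rf, buffers, "add", "r1", "r2", "r3")
dispatch(rf, buffers, "mul", "r4", "r1", "r5")

first = issue(rf, buffers)    # add: r2 and r3 are valid
blocked = issue(rf, buffers)  # mul waits: r1 has no result yet
write_result(rf, "r1", first[2] + first[3])
second = issue(rf, buffers)   # mul now reads r1 and r5 from the register file
print(first, blocked, second)
```

Compared with the dispatch bound sketch, the buffer entries are narrower and no buffer update is needed on a result, at the price of a register file read on the issue path.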

4. Implementation of dynamic instruction issue in superscalars 4.1 The introduction of dynamic instruction issue
Figure 4.1: The introduction of dynamic instruction issue

4.2 Basic implementation schemes
Basic issue buffer schemes, by type of issue buffer: reservation stations (individual RSs, group RSs, central RS) and issue buffers in the ROB. Each type is split by operand fetch policy into dispatch bound and issue bound.
Processors listed on the slide, in slide order: PowerPC 603 (1993), PowerPC 604 (1995), K5 (1995), PM1 (Sparc64) (1995), Pentium Pro (1995), Pentium II (1997), Pentium III (1999), Power1 (1990), Power4 (2001), Power5 (2004), Nx586 (1994), K7 (1999), K8 (2003), ES/9000 (1992), Power2 (1993), R10000 (1996), Alpha 21264 (1997), Pentium IV (2000), Pentium M (2003), Core (2006), Lightning (1991)p, K6 (1997)

5. Case example (1) Individual issue buffers
Figure 5.1: The microarchitecture of the Athlon

5. Case example (1) Individual issue buffers (2)
Figure 5.2: Integer issue buffers of the K8L (decoders, issue buffers, EUs)
Source: Malich, Y., "AMD's Next Generation Microarchitecture Preview: from K8 to K8L", Aug. 2006.

5. Case example (2) Group issue buffers
Figure 5.3: The microarchitecture of the Alpha 21264
Source: Kessler, R.E. et al., "The Alpha 21264 Microprocessor Architecture", h18002.www1.hp.com/alphaserver

5. Case example (3) Central reservation station (1)
Figure 5.3: The microarchitecture of the Core processor
Source: Kanter, D., "Intel's next Generation Microarchitecture Unveiled", Real World Tech., 2006 March 9.