Unit II: Intel IA-64 and Itanium Processor
By N.R. Rejin Paul, Lecturer/VIT/CSE
CS2354 Advanced Computer Architecture



2 4.7 Intel IA-64 and Itanium Processor
Designed to gain the benefits of the VLIW approach
IA-64 Register Model
– 128 64-bit general-purpose registers (GPRs), actually 65 bits wide
– 128 82-bit floating-point registers: two extra exponent bits over the standard 80-bit IEEE format
– 64 1-bit predicate registers
– 8 64-bit branch registers, used for indirect branches
– a variety of registers used for system control, etc.
Other support:
– register stack frame: like the register windows in SPARC
  – current frame marker (CFM)
  – register stack engine

3 Five Execution Unit Slots in IA-64

4 Instruction Groups and Bundles
Two concepts achieve the benefits of implicit parallelism and ease of instruction decode:
Instruction group: a sequence of consecutive instructions without register data dependences
– instructions in the group can be executed in parallel
– arbitrarily long, but the compiler explicitly indicates the boundary by placing a stop
Bundle: a fixed format holding multiple (3) instructions
– IA-64 instructions are encoded in bundles
– 128 bits wide: a 5-bit template field and three 41-bit instructions
– the template field describes the presence of stops and specifies the type of execution unit for each instruction
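The bundle layout described above (5 + 3 × 41 = 128 bits) can be sketched in a few lines of Python. This is only an illustration: the function names `pack_bundle` and `unpack_bundle` are hypothetical, and placing the template in the low-order bits followed by slots 0, 1 and 2 is an assumption stated here, to be checked against Intel's manuals before relying on it.

```python
TEMPLATE_BITS = 5    # width of the template field
SLOT_BITS = 41       # width of each instruction slot
SLOTS_PER_BUNDLE = 3
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, slots):
    """Pack a template and three instruction slots into one 128-bit integer.

    Assumed layout (hypothetical helper): template in the low 5 bits,
    then slot 0, slot 1 and slot 2 in increasing bit positions.
    """
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for i, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each instruction must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = [
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
        for i in range(SLOTS_PER_BUNDLE)
    ]
    return template, slots


# Round trip with made-up field values (not real IA-64 encodings).
example = pack_bundle(0x10, [1, 2, 3])
print(unpack_bundle(example))
```

Because the field widths add up to exactly 128 bits, the decoder can find every instruction at a fixed offset, which is the "ease of instruction decode" the slide refers to.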

5 24 Possible Template Values and Formats
– 8 possible values are reserved
– Stops are indicated by heavy lines
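The count of 24 follows from the width of the template field. A minimal arithmetic check, assuming only the 5-bit field width and the 8 reserved values stated in the slides (the variable names are illustrative):

```python
TEMPLATE_FIELD_BITS = 5   # width of the template field in a bundle
RESERVED_VALUES = 8       # encodings the slides list as reserved

total_encodings = 2 ** TEMPLATE_FIELD_BITS          # every 5-bit pattern
defined_templates = total_encodings - RESERVED_VALUES

print(total_encodings, defined_templates)
```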

6 Example: Unroll x[i]=x[i]+s Seven Times
– Schedule with the fewest bundles: 9 bundles, 21 cycles, 85% of slots filled (23/27)
– Schedule with the fewest cycles: 11 bundles, 12 cycles, 70% of slots filled (23/33)
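The slot-fill percentages follow from three instruction slots per bundle. The sketch below reproduces the arithmetic for the two schedules; `slot_utilization` is a hypothetical helper, and the count of 23 instructions is taken from the slide's fractions.

```python
SLOTS_PER_BUNDLE = 3  # each IA-64 bundle carries three instructions


def slot_utilization(instructions, bundles):
    """Fraction of bundle slots that hold a useful instruction."""
    return instructions / (bundles * SLOTS_PER_BUNDLE)


# 23 instructions in both schedules, per the slide's 23/27 and 23/33.
fewest_bundles = slot_utilization(23, 9)    # 23 / 27
fewest_cycles = slot_utilization(23, 11)    # 23 / 33

print(f"{fewest_bundles:.0%} {fewest_cycles:.0%}")
```

The comparison shows the trade-off: the faster schedule (12 cycles instead of 21) spreads the same 23 instructions over more bundles, so more slots are left empty and the code is larger.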

7 Some Instruction Formats of IA-64
See the textbook or Intel’s manuals for more information

8 Predication and Speculation Support
Itanium Support for Explicit Parallelism

Traditional architecture with branch barrier:
  instr 1
  instr 2
  :
  br
  ld r1=…
  use …=r1

LOAD moved above branch by compiler:
  ld.s r1=…
  instr 1
  instr 2
  :
  br
  chk.s r1
  use …=r1

Uses moved above branch by compiler:
  ld.s r1=…
  instr 1
  use …=r1
  instr 2
  :
  br
  chk.s use
Recovery code:
  ld r1=…
  use …=r1
  br
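The ld.s / chk.s pattern can be modelled in software to show why hoisting a load above a branch stays safe. The following is a toy Python model, not Itanium semantics in full. All names (`Register`, `ld_s`, `ld`, `chk_s`, `run`) are hypothetical, memory is a plain dictionary, and a missing address stands in for a faulting load. The `nat` flag plays the role of the extra 65th bit on each general-purpose register, which Intel calls the NaT ("Not a Thing") bit.

```python
class Register:
    """A general-purpose register with a deferred-exception flag."""

    def __init__(self):
        self.value = 0
        self.nat = False  # models the 65th bit: set when a speculative load failed


def ld_s(reg, memory, addr):
    """Speculative load: on a fault, set the flag instead of raising."""
    if addr in memory:
        reg.value = memory[addr]
        reg.nat = False
    else:
        reg.value = 0
        reg.nat = True   # exception is deferred, not delivered


def ld(reg, memory, addr):
    """Ordinary load: a bad address faults immediately (KeyError here)."""
    reg.value = memory[addr]
    reg.nat = False


def chk_s(reg, recovery):
    """Check: run the recovery code only if the speculative load failed."""
    if reg.nat:
        recovery()


def run(memory, addr, branch_taken):
    """Model the 'LOAD moved above branch' column of the slide."""
    r1 = Register()
    ld_s(r1, memory, addr)           # hoisted above the branch by the compiler
    if branch_taken:
        return None                  # result never needed, so no fault appears

    def recovery():
        ld(r1, memory, addr)         # redo the load non-speculatively

    chk_s(r1, recovery)              # stays at the load's original position
    return r1.value + 1              # the "use" of r1


print(run({8: 41}, 8, branch_taken=False))
print(run({}, 8, branch_taken=True))
```

In this model a bad address only causes an error when the branch falls through and the value is really needed, which matches the behaviour of the original, unhoisted code.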

9 Summary
Chapter 4 Exploiting Instruction-Level Parallelism with Software Approaches
4.1 Basic Compiler Techniques for Exposing ILP
4.2 Static Branch Prediction
4.3 Static Multiple Issue: The VLIW Approach
4.4 Advanced Compiler Support for ILP
4.7 Intel IA-64 Architecture and Itanium Processor