
IA-64 Architecture
- RISC-style architecture designed to cooperate with the compiler in order to expose as much ILP as possible
- 128 GPRs, 128 FPRs
- 64 predicate registers of 1 bit each
- Similarities with VLIW
- Instructions are issued in bundles of three, each bundle tagged with a template
- The compiler marks the boundaries (stops) between groups of instructions that can be executed in parallel
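As a rough illustration of the bundle format, the sketch below packs three instruction slots and a template into one 128-bit word, using the IA-64 layout of a 5-bit template in the low bits followed by three 41-bit slots. The function names are hypothetical, and the slot and template values are placeholder integers rather than real encodings.

```python
SLOT_BITS = 41      # width of one instruction slot
TEMPLATE_BITS = 5   # width of the template field (low bits of the bundle)
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slots):
    """Pack a template and three slots into one 128-bit integer."""
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instruction slots")
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template does not fit in 5 bits")
    word = template
    for i, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("slot does not fit in 41 bits")
        # Slot i sits above the template and the i previous slots.
        word |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return word


def unpack_bundle(word):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = word & TEMPLATE_MASK
    slots = [(word >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
             for i in range(3)]
    return template, slots
```

The template is what lets the compiler tell the hardware which execution unit types the three slots need and where a group of parallel instructions ends, so the hardware does not have to rediscover the dependences itself.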

Predication in IA-64
- Particularly useful for branches that are difficult to predict
- Increases ILP, but can also increase the critical path length
- Instructions are predicated on the predicate registers, which hold the results of the conditions that conditional branches would test
- Allows several branch paths to execute in parallel
- IA-64 allows almost all of its instructions to be predicated

Example
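The slide's original example did not survive in the transcript. As a stand-in, here is a minimal sketch of how predication removes a branch: a toy interpreter (hypothetical, not IA-64 semantics in full) runs an absolute-difference computation in which both arms of the `if` are issued, and each one commits only when its qualifying predicate is true. The tuple format and the `run_predicated` name are assumptions made for illustration; the opcode names merely imitate IA-64 mnemonics.

```python
def run_predicated(program, regs):
    """Issue every instruction; commit only those whose predicate is true."""
    preds = {"p0": True}  # p0 always reads as true in IA-64
    for qp, op, dst, src1, src2 in program:
        if not preds.get(qp, False):
            continue  # issued, but has no architectural effect
        a, b = regs[src1], regs[src2]
        if op == "cmp.gt":
            # A compare writes a predicate and its complement.
            p_true, p_false = dst
            preds[p_true] = a > b
            preds[p_false] = not (a > b)
        elif op == "sub":
            regs[dst] = a - b
        else:
            raise ValueError(f"unknown op {op}")
    return regs


# if (r1 > r2) r3 = r1 - r2; else r3 = r2 - r1;   with no branch at all
ABS_DIFF = [
    ("p0", "cmp.gt", ("p1", "p2"), "r1", "r2"),
    ("p1", "sub", "r3", "r1", "r2"),   # then-arm
    ("p2", "sub", "r3", "r2", "r1"),   # else-arm
]
```

Both `sub` instructions are independent of each other, so they can sit in the same instruction group. The cost is that the work of both arms is always issued, which is how predication can lengthen the critical path.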

Memory Reference Speculation (I)
- Speculation on data dependences
- A load is moved ahead of a store that could write to the same memory location
- The speculative (advanced) load, ld.a, records the accessed address and the destination register in the ALAT (Advanced Load Address Table)
- The addresses written by store operations are checked against the ALAT
- If a store address matches the address in any entry, that entry is marked

Memory Reference Speculation (II)
- Before a non-speculative use of the data produced by the speculative load, the ALAT is checked
- Entry is not marked → correct execution; clear the entry
- Entry is marked → speculation failed; the action taken depends on the check instruction used
- Two kinds of check:
  - ld.c: on failure, reloads the data, now with the correct value. It is used when the data has not been used yet
  - chk.a: also provides the address of a recovery routine that repairs the effects of the failure. The routine re-executes the load and the additional instructions that used the preloaded value

Example (figure panels: Source Code, Standard Load, Advanced Load)
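The figure for this example is missing from the transcript, so the following is only a sketch of the mechanism, following the slides' description in which a conflicting store marks the ALAT entry. The `Machine` class and its method names are hypothetical; memory and registers are plain dictionaries.

```python
class Machine:
    """Toy model of advanced loads checked through an ALAT."""

    def __init__(self, memory):
        self.mem = dict(memory)
        self.regs = {}
        self.alat = {}  # destination register -> [address, marked]

    def ld_a(self, reg, addr):
        """Advanced load: load early and record (register, address)."""
        self.regs[reg] = self.mem[addr]
        self.alat[reg] = [addr, False]

    def st(self, addr, value):
        """Store: write memory and mark every ALAT entry it conflicts with."""
        self.mem[addr] = value
        for entry in self.alat.values():
            if entry[0] == addr:
                entry[1] = True

    def _speculation_held(self, reg):
        # The entry is cleared by the check whether or not it was marked.
        entry = self.alat.pop(reg, None)
        return entry is not None and not entry[1]

    def ld_c(self, reg, addr):
        """Check load: on failure, simply reload the value."""
        ok = self._speculation_held(reg)
        if not ok:
            self.regs[reg] = self.mem[addr]
        return ok

    def chk_a(self, reg, recovery):
        """Check: on failure, run recovery code that redoes dependent work."""
        ok = self._speculation_held(reg)
        if not ok:
            recovery(self)
        return ok
```

A possible use: call `ld_a("r1", addr)` before a store to an unknown pointer, then `ld_c("r1", addr)` at the load's original position. If instructions that consume `r1` were hoisted too, `chk_a` with a recovery function is needed instead, because a plain reload would leave their results stale.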

Control Speculation for Loads
- Speculation on control dependences
- A special bit (NaT = Not a Thing) associated with the destination register records whether the load succeeded
- In the FP registers, the equivalent is a special value called NaTVal
- The potential exceptions raised by these loads, ld.s, are deferred until the control dependences are resolved
- This is not the case for memory reference speculation
- chk.s checks the NaT bits / NaTVal values
- Both kinds of load speculation can be combined using the instructions ld.sa and chk.a

Example (figure panels: Source Code, Speculative Load)
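Since the figure itself is not in the transcript, here is a hedged sketch of the idea: a load is hoisted above the branch that guards it, a fault is deferred by tagging the result as NaT, and the check at the load's original position triggers recovery only if the value is actually needed. All names are hypothetical, and a faulting address is modelled as a missing dictionary key, which is an assumption of this sketch.

```python
NAT = object()  # stands in for a register whose NaT bit is set


def ld_s(memory, addr):
    """Speculative load: defer any fault by returning the NaT token."""
    try:
        return memory[addr]
    except KeyError:
        return NAT


def add(a, b):
    """NaT propagates through computations that consume it."""
    if a is NAT or b is NAT:
        return NAT
    return a + b


def chk_s(value, recovery):
    """Check: if the value carries NaT, run the recovery code instead."""
    return recovery() if value is NAT else value


def guarded_sum(memory, ptr, base):
    # Source: if ptr is not None: return base + memory[ptr]; else: return base
    spec = ld_s(memory, ptr)   # hoisted above the branch
    total = add(base, spec)    # dependent work, also speculative
    if ptr is None:
        return base            # speculative result is discarded, no fault seen
    # Recovery re-executes the load non-speculatively, so a real fault
    # surfaces here, at the point where the program would have raised it.
    return chk_s(total, lambda: base + memory[ptr])
```

With `ptr` set to `None` the deferred fault never becomes visible, which is exactly what makes it safe to move the load above the branch.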