ENGS 116 Lecture 11: ILP: Software Approaches 2. Vincent H. Berk, October 14th. Reading for Monday: 3.10 – 3.15, 4.7 – 4.11. Reading for today: 4.2 – 4.6.

Presentation transcript:

ENGS 116 Lecture 11, slide 1: ILP: Software Approaches 2
Vincent H. Berk
October 14th
Reading for Monday: 3.10 – 3.15
Reading for today: 4.2 – 4.6
Homework #2 due now.

ENGS 116 Lecture 11, slide 2: Trace Scheduling
Focus on critical path (trace selection)
– Compiler has to decide what the critical path (the trace) is
– Most likely basic blocks are put in the trace
– Loops are unrolled in the trace
Now speed it up (trace compaction)
– Focus on limiting instruction count
– Branches are seen as jumps into or out of the trace
Problem:
– Significant overhead for parts that are not in the trace
– Unclear if it is feasible in practice
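The trace selection step above can be sketched as a greedy walk over a control-flow graph. This is a minimal illustration, not the algorithm a real compiler uses: the graph `CFG`, its block names, the branch probabilities, and the helper names `select_trace` and `off_trace_blocks` are all assumptions made up for the example.

```python
# Hypothetical CFG: block -> list of (successor, estimated branch probability).
# Block names and probabilities are invented for illustration.
CFG = {
    "entry": [("test", 1.0)],
    "test":  [("then", 0.9), ("else", 0.1)],
    "then":  [("join", 1.0)],
    "else":  [("join", 1.0)],
    "join":  [],
}

def select_trace(cfg, start):
    """Greedily follow the most likely successor until the path ends or repeats."""
    trace, seen = [], set()
    block = start
    while block is not None and block not in seen:
        trace.append(block)
        seen.add(block)
        successors = cfg[block]
        # Pick the successor with the highest estimated probability, if any.
        block = max(successors, key=lambda s: s[1])[0] if successors else None
    return trace

def off_trace_blocks(cfg, trace):
    """Blocks left out of the trace; these are the parts that pay the overhead."""
    return sorted(set(cfg) - set(trace))

trace = select_trace(CFG, "entry")
print(trace)                         # blocks chosen for the trace
print(off_trace_blocks(CFG, trace))  # blocks outside the trace
```

Here the unlikely `else` block ends up off the trace, which is where the overhead noted on the slide would land.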

ENGS 116 Lecture 11, slide 3: Superblocks
Similar to Trace Scheduling but:
– Single entrance, multiple exits
Tail duplication:
– Handle cases that exited the superblock
– Residual loop handling
– Could in itself be a superblock
Problem:
– Code size
– Worth the hassle?
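Tail duplication can be sketched on a small control-flow graph. The sketch below is a simplified illustration under stated assumptions: the graph is acyclic, side entrances come only from blocks outside the trace, and the function name `tail_duplicate`, the example graph, and the `'` suffix for copied blocks are all hypothetical.

```python
def tail_duplicate(cfg, trace):
    """Return a new CFG in which `trace` has a single entrance.

    cfg maps block -> list of successor names. Trace blocks from the first
    side entrance onward are copied (suffix "'"), and off-trace predecessors
    are redirected to the copies.
    """
    in_trace = set(trace)
    first = None
    for i, block in enumerate(trace[1:], start=1):
        preds = [p for p, succs in cfg.items() if block in succs]
        if any(p not in in_trace for p in preds):
            first = i  # first trace block entered from outside the trace
            break
    if first is None:
        return {b: list(s) for b, s in cfg.items()}  # already a superblock
    tail = trace[first:]
    copy = {b: b + "'" for b in tail}
    new_cfg = {}
    for block, succs in cfg.items():
        if block in in_trace:
            new_cfg[block] = list(succs)  # on-trace edges are unchanged
        else:
            # Off-trace blocks now jump to the duplicated tail.
            new_cfg[block] = [copy.get(s, s) for s in succs]
    for b in tail:
        new_cfg[copy[b]] = [copy.get(s, s) for s in cfg[b]]
    return new_cfg

# Hypothetical example: "else" enters the trace at "join" (a side entrance).
cfg = {"entry": ["test"], "test": ["then", "else"], "then": ["join"],
       "else": ["join"], "join": ["exit"], "exit": []}
superblock_cfg = tail_duplicate(cfg, ["entry", "test", "then", "join", "exit"])
print(superblock_cfg)
```

The result has two extra blocks (`join'` and `exit'`), which is the code-size cost listed on the slide.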

ENGS 116 Lecture 11, slide 4

ENGS 116 Lecture 11, slide 5: Conditional instructions
An instruction whose effect depends on one of its arguments: the instruction is always executed, but its result is not always written.
Should only be used for very small sequences; otherwise use a normal branch.

Branch version:
        BNEZ   R1, L
        ADDU   R2, R3, R0
L:

Conditional-move equivalent:
        CMOVZ  R2, R3, R1
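The equivalence of the two sequences on this slide can be checked with a small model. This is only a sketch in Python, not MIPS code: the function names `branch_version` and `cmovz_version` are hypothetical, registers are modeled as plain integers, and R0 is assumed to be the hardwired zero register as in MIPS.

```python
def branch_version(r1, r2, r3):
    """Models BNEZ R1, L / ADDU R2, R3, R0 / L: and returns the final R2."""
    if r1 != 0:        # BNEZ R1, L: skip the move when R1 is nonzero
        return r2
    return r3 + 0      # ADDU R2, R3, R0: copy R3 into R2 (R0 reads as zero)

def cmovz_version(r1, r2, r3):
    """Models CMOVZ R2, R3, R1: always executes, writes R2 only if R1 == 0."""
    candidate = r3                        # result is produced unconditionally
    return candidate if r1 == 0 else r2   # only the write-back is conditional

# Both versions leave the same value in R2 for zero and nonzero R1.
for r1 in (0, 1, -3):
    print(r1, branch_version(r1, 10, 20), cmovz_version(r1, 10, 20))
```

The conditional-move form has no branch to predict, which is why it pays off only for very short sequences like this one.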

ENGS 116 Lecture 11, slide 6: Speculation
Compiler moves instructions before branch if:
– Data flow is not affected (optionally with use of renaming)
– Preserve exception behavior
– Avoid load/store address conflicts (no renaming for memory loc.)
Preserving exception behavior
– Mechanism to indicate an instruction is speculative
– Poison bit: raise exception when value is used
– Using conditional instructions: requires
   In-order instruction commit
   Register renaming
   Writeback at commit
   Forwarding
   Raise exceptions at commit
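The poison-bit idea can be illustrated with a toy model. This is a sketch under assumptions, not a description of any real hardware: the names `Register`, `DeferredFault`, `speculative_load`, and `use` are hypothetical, and a missing dictionary key stands in for a faulting memory access.

```python
class DeferredFault(Exception):
    """Raised when a poisoned value is finally used non-speculatively."""

class Register:
    def __init__(self, value=0, poisoned=False):
        self.value = value
        self.poisoned = poisoned  # the poison bit travels with the register

def speculative_load(memory, address):
    """A speculative load never faults; it poisons its destination instead."""
    if address not in memory:
        return Register(poisoned=True)
    return Register(memory[address])

def use(reg):
    """A non-speculative use raises the deferred exception if the bit is set."""
    if reg.poisoned:
        raise DeferredFault("speculative load faulted earlier")
    return reg.value

memory = {100: 7}
good = speculative_load(memory, 100)
bad = speculative_load(memory, 999)   # would fault, but nothing is raised yet
print(use(good))
print(bad.poisoned)
```

If the branch goes the other way and `bad` is never used, the fault never surfaces, so the exception behavior of the original program is preserved.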

ENGS 116 Lecture 11, slide 7: Speculation
if (A==0) A=B; else A=A+4;

Original code:
        LD     R1, 0(R3)     ; load A
        BNEZ   R1, L1        ; test A
        LD     R1, 0(R2)     ; then
        J      L2            ; skip else
L1:     DADDI  R1, R1, #4    ; else
L2:     SD     R1, 0(R3)     ; store A

With speculation:
        LD     R1, 0(R3)     ; load A
        LD     R14, 0(R2)    ; load B (speculative)
        BEQZ   R1, L3        ; branch if
        DADDI  R14, R1, #4   ; else
L3:     SD     R14, 0(R3)    ; store A
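A quick way to convince yourself that the speculative sequence computes the same result is to model both in a few lines. This is a sketch only: memory is a Python dictionary, the addresses `0x10` and `0x20` and the function names `original` and `speculative` are assumptions for the example, and exceptions from the speculative load are ignored.

```python
def original(mem, r2, r3):
    r1 = mem[r3]        # LD    R1, 0(R3)    ; load A
    if r1 == 0:         # BNEZ  R1, L1       ; falls through when A == 0
        r1 = mem[r2]    # LD    R1, 0(R2)    ; then: A = B
    else:
        r1 = r1 + 4     # L1: DADDI R1, R1, #4 ; else: A = A + 4
    mem[r3] = r1        # L2: SD R1, 0(R3)   ; store A

def speculative(mem, r2, r3):
    r1 = mem[r3]        # LD    R1, 0(R3)    ; load A
    r14 = mem[r2]       # LD    R14, 0(R2)   ; load B before the branch
    if r1 != 0:         # BEQZ  R1, L3       ; skip the else part when A == 0
        r14 = r1 + 4    # DADDI R14, R1, #4  ; else: overwrite the loaded B
    mem[r3] = r14       # L3: SD R14, 0(R3)  ; store A

def run(version, a, b):
    """Run one version with A at 0x10 and B at 0x20; return the final A."""
    mem = {0x10: a, 0x20: b}
    version(mem, 0x20, 0x10)
    return mem[0x10]

for a in (0, 5):
    print(a, run(original, a, 42), run(speculative, a, 42))
```

The speculative version uses the extra register R14 so that the early load of B cannot clobber A, which is the renaming mentioned on the previous slide.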