Optimizing Compilers for Modern Architectures Syllabus Allen and Kennedy, Preface Optimizing Compilers for Modern Architectures.

Slides:



Advertisements
Similar presentations
Code Optimization and Performance Chapter 5 CS 105 Tour of the Black Holes of Computing.
Advertisements

Optimizing Compilers for Modern Architectures Compiler Improvement of Register Usage Chapter 8, through Section 8.4.
Optimizing Compilers for Modern Architectures Coarse-Grain Parallelism Chapter 6 of Allen and Kennedy.
Compiler Support for Superscalar Processors. Loop Unrolling Assumption: Standard five stage pipeline Empty cycles between instructions before the result.
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
Optimizing Compilers for Modern Architectures Allen and Kennedy, Chapter 13 Compiling Array Assignments.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Parallelism & Locality Optimization.
Course Outline Traditional Static Program Analysis Software Testing
Lecture 11: Code Optimization CS 540 George Mason University.
Programmability Issues
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
Software & Services Group, Developer Products Division Copyright© 2010, Intel Corporation. All rights reserved. *Other brands and names are the property.
Compiler Challenges for High Performance Architectures
Enhancing Fine-Grained Parallelism Chapter 5 of Allen and Kennedy Optimizing Compilers for Modern Architectures.
The Last Lecture Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission.
Optimizing Compilers for Modern Architectures Preliminary Transformations Chapter 4 of Allen and Kennedy.
Introduction to Advanced Topics Chapter 1 Mooly Sagiv Schrierber
Introductory Courses in High Performance Computing at Illinois David Padua.
Parallel and Cluster Computing 1. 2 Optimising Compilers u The main specific optimization is loop vectorization u The compilers –Try to recognize such.
TM Pro64™: Performance Compilers For IA-64™ Jim Dehnert Principal Engineer 5 June 2000.
Stanford University CS243 Winter 2006 Wei Li 1 Loop Transformations and Locality.
Enhancing Fine- Grained Parallelism Chapter 5 of Allen and Kennedy Mirit & Haim.
Compiler Challenges, Introduction to Data Dependences Allen and Kennedy, Chapter 1, 2.
Compiler Optimizations for Memory Hierarchy Chapter 20 High Performance Compilers.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
2015/6/21\course\cpeg F\Topic-1.ppt1 CPEG 421/621 - Fall 2010 Topics I Fundamentals.
Introduction to Program Optimizations Chapter 11 Mooly Sagiv.
Case Studies of Compilers and Future Trends Chapter 21 Mooly Sagiv.
Optimizing Compilers for Modern Architectures Coarse-Grain Parallelism Chapter 6 of Allen and Kennedy.
CMPUT Compiler Design and Optimization1 CMPUT680 - Winter 2006 Topic B: Loop Restructuring José Nelson Amaral
Parallelizing Compilers Presented by Yiwei Zhang.
Data-Flow Analysis (Chapter 11-12) Mooly Sagiv Make-up class 18/ :00 Kaplun 324.
Center for Embedded Computer Systems University of California, Irvine SPARK: A High-Level Synthesis Framework for Applying.
Lecture 1CS 380C 1 380C Last Time –Course organization –Read Backus et al. Announcements –Hadi lab Q&A Wed 1-2 in Painter 5.38N –UT Texas Learning Center:
Introduction & Overview CS4533 from Cooper & Torczon.
CPSC614 Lec 6.1 Exploiting Instruction-Level Parallelism with Software Approach #1 E. J. Kim.
Dependence: Theory and Practice Allen and Kennedy, Chapter 2 Liza Fireman.
Optimizing Compilers for Modern Architectures Dependence: Theory and Practice Allen and Kennedy, Chapter 2.
Procedure Optimizations and Interprocedural Analysis Chapter 15, 19 Mooly Sagiv.
Optimizing Compilers Nai-Wei Lin Department of Computer Science and Information Engineering National Chung Cheng University.
Overview of the Course. Critical Facts Welcome to CISC 672 — Advanced Compiler Construction Instructor: Dr. John Cavazos Office.
Exploiting SIMD parallelism with the CGiS compiler framework Nicolas Fritz, Philipp Lucas, Reinhard Wilhelm Saarland University.
1 Advance Computer Architecture CSE 8383 Ranya Alawadhi.
CSc 453 Final Code Generation Saumya Debray The University of Arizona Tucson.
1 Optimizing compiler tools and building blocks project Alexander Drozdov, PhD Sergey Novikov, PhD.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2011 Dependence Analysis and Loop Transformations.
1 CS 201 Compiler Construction Introduction. 2 Instructor Information Rajiv Gupta Office: WCH Room Tel: (951) Office.
High-Level Transformations for Embedded Computing
Lexical analyzer Parser Semantic analyzer Intermediate-code generator Optimizer Code Generator Postpass optimizer String of characters String of tokens.
CS 460/660 Compiler Construction. Class 01 2 Why Study Compilers? Compilers are important – –Responsible for many aspects of system performance Compilers.
FPGA Hardware Synthesis Jessica Baxter. Reference M. Haldar, A. Nayak, N. Shenoy, A. Choudhary and P. Banerjee, “FPGA Hardware Synthesis from MATLAB”,
High Performance Embedded Computing © 2007 Elsevier Lecture 10: Code Generation Embedded Computing Systems Michael Schulte Based on slides and textbook.
Optimizing Compilers for Modern Architectures Enhancing Fine-Grained Parallelism Part II Chapter 5 of Allen and Kennedy.
3/2/2016© Hal Perkins & UW CSES-1 CSE P 501 – Compilers Optimizing Transformations Hal Perkins Autumn 2009.
Memory-Aware Compilation Philip Sweany 10/20/2011.
Lecture 38: Compiling for Modern Architectures 03 May 02
Topics to be covered Instruction Execution Characteristics
Code Optimization Overview and Examples
High-level optimization Jakub Yaghob
Code Optimization.
Computer Architecture Principles Dr. Mike Frank
Introduction to Advanced Topics Chapter 1 Text Book: Advanced compiler Design implementation By Steven S Muchnick (Elsevier)
Interprocedural Analysis Chapter 19
Presented by: Huston Bokinsky Ying Zhang 25 April, 2013
Preliminary Transformations
Optimizing Transformations Hal Perkins Winter 2008
Lecture 19: Code Optimisation
The SGI Pro64 Compiler Infrastructure
Presentation transcript:

Optimizing Compilers for Modern Architectures Syllabus Allen and Kennedy, Preface Optimizing Compilers for Modern Architectures

Dependence-Based Compilation Vectorization and Parallelization require a deeper analysis than optimization for scalar machines —Must be able to determine whether two accesses to the same array might be to the same location Dependence is the theory that makes this possible —There is a dependence between two statements if they might access the same location, there is a path from one to the other, and one access is a write Dependence has other applications —Memory hierarchy management—restructuring programs to make better use of cache and registers –Includes input dependences —Scheduling of instructions

Optimizing Compilers for Modern Architectures Syllabus I Introduction —Parallel and vector architectures. The problem of parallel programming. Bernstein's conditions and the role of dependence. Compilation for parallel machines and automatic detection of parallelism. Dependence Theory and Practice —Fundamentals, types of dependences. Testing for dependence: separable, gcd and Banerjee tests. Exact dependence testing. Construction of direction and distance vectors. Preliminary Transformations —Loop normalization, scalar data flow analysis, induction variable substitution, scalar renaming.

Optimizing Compilers for Modern Architectures Syllabus II Fine-Grain Parallel Code Generation —Loop distribution and its safety. The Kuck vectorization principle. The layered vector code-generation algorithm and its complexity. Loop interchange. Coarse-Grain Parallel Code Generation —Loop Interchange. Loop Skewing. Scalar and array expansion. Forward substitution. Alignment. Code replication. Array renaming. Node splitting. Pattern recognition. Threshold analysis. Symbolic dependence tests. Parallel code generation and its problems. Control Dependence —Types of branches. If conversion. Control dependence. Program dependence graph.

Optimizing Compilers for Modern Architectures Syllabus III Memory Hierarchy Management —The use of dependence in scalar register allocation and management of the cache memory hierarchy. Scheduling for Superscalar and Parallel Machines Machines —Role of dependence. List Scheduling. Software Pipelining. Work scheduling for parallel systems. Guided Self-Scheduling Interprocedural Analysis and Optimization —Side effect analysis, constant propagation and alias analysis. Flow- insensitive and flow-sensitive problems. Side effects to arrays. Inline substitution, linkage tailoring and procedure cloning. Management of interprocedural analysis and optimization. Compilation of Other Languages. —C, Verilog, Fortran 90, HPF.