Enhancing the Role of Inlining in Effective Interprocedural Parallelization. Jichi Guo, Mike Stiles, Qing Yi, Kleanthis Psarris.



Problem
- Inter-procedural parallelization
  o Parallelize after inlining: gain more parallelizable loops
- Loss of parallelized loops
  o Inlining messes up the caller / callee
- Missed parallel opportunities
  o Inlining increases code complexity

Goal
- Keep the gained parallelizable loops
- Prevent the loss of parallelism
- Discover the missed opportunities

Solution
- Summarize the code using annotations
  o Express the underlying information
- Inline the annotations before parallelization
  o Pass the summarized information to the compiler
- Reverse-inline after parallelization
  o Revert inlining side effects
  o Maintain equivalence

Outline
- Innovations
- Problems of the parallel + inline strategy
- Annotation language
- Annotation-based inlining technique
- Experiments
- Summary

Outline
- Innovations
- Problems of the parallel + inline strategy
- Annotation language
- Annotation-based inlining technique
- Experiments
- Summary

Problems of parallel + inlining
- Parallel + inlining
  o Conventional inlining with heuristics and pre-transformations
    - Heuristics: code size
    - Transformations: linearization, forward substitution
  o Intra-procedural loop parallelization (Fortran do-all loops)
- Goal
  o Gain loops in the caller
- Problems
  o Lost loops in the caller / callee
  o Missed loops in the caller
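
A minimal sketch of the intended gain, in free-form Fortran with hypothetical subroutine names (driver, update): the loop in driver cannot be analyzed while the callee body is hidden behind a call, but becomes an obvious do-all loop once update is inlined.

    subroutine update(a, i, n)
      integer :: i, n
      real :: a(n)
      a(i) = a(i) * 2.0
    end subroutine update

    subroutine driver(a, n)
      integer :: i, n
      real :: a(n)
      do i = 1, n
        ! Opaque call: an intra-procedural parallelizer must assume that
        ! update may read or write any element of a, so the loop stays serial.
        call update(a, i, n)
      end do
      ! After inlining, a(i) = a(i) * 2.0 appears in place of the call and the
      ! loop is recognizably a do-all loop.
    end subroutine driver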

Problems of parallel + inlining
- Loss of parallelizable loops in the caller/callee
  o Transformations that cause the loss
    - Forward substitution
    - Linearization
- Forward substitution of non-linear subscripts
  o Creates indirect array references
- Linearization of array dimensions
  o Messes up array shapes

Problems of parallel + inlining
- Forward substitution of non-linear subscripts
  o Creates indirect array references
    - X2(I) ⇒ T(IX(7) + I)
    - Y2(I) ⇒ T(IX(8) + I)
    - Z2(I) ⇒ T(IX(9) + I)
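
A hedged sketch of how this happens, with a hypothetical callee smooth; the X2/T/IX names follow the slide. The loop is parallel inside the callee, but forward substitution of the non-linear actual arguments turns its subscripts into indirect ones.

    subroutine smooth(x2, y2, z2, n)
      integer :: i, n
      real :: x2(n), y2(n), z2(n)
      ! The three dummy arrays look independent, so this loop is trivially parallel.
      do i = 1, n
        x2(i) = y2(i) + z2(i)
      end do
    end subroutine smooth

    ! A caller that carves the three arrays out of one workspace t through an
    ! index table ix, e.g.
    !     call smooth(t(ix(7)+1), t(ix(8)+1), t(ix(9)+1), n)
    ! ends up, after inlining with forward substitution, with the loop body
    !     t(ix(7)+i) = t(ix(8)+i) + t(ix(9)+i)
    ! The subscripts are now indirect through ix, so the dependence test can no
    ! longer prove that the loop that was parallel inside smooth is still parallel.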

Problems of parallel + inlining
- Linearization of array dimensions
  o Messes up array shapes
    - PP(i, j, k) ⇒ PP(i + j*4 + k*16)
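
A hedged sketch of the shape problem, using a hypothetical callee relax (the pp name follows the slide): the callee sees a 4 x 4 x 4 array, but the inliner flattens the subscripts when it substitutes the body into the caller.

    subroutine relax(pp)
      integer :: i, j, k
      real :: pp(4, 4, 4)
      do k = 1, 4
        do j = 1, 4
          do i = 1, 4
            pp(i, j, k) = 0.25 * pp(i, j, k)   ! clean per-dimension subscripts
          end do
        end do
      end do
    end subroutine relax

    ! If the caller declares the storage with a different (or flat) shape, the
    ! inliner typically linearizes the reference into something like
    !     pp(i + (j-1)*4 + (k-1)*16)
    ! (abbreviated on the slide as PP(i + j*4 + k*16)). The dependence analyzer
    ! loses the array shape and may fail to prove the loop nest independent.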

Problems of parallel + inlining
- Missed parallelizable loops in the caller
  o Coding styles that cause the misses
    - Opaque compositional subroutines: A calls B, B calls C, C calls D, …
    - Array access: when it is difficult to determine which part of an array is killed
    - Debugging and error checking: the statement that breaks the dependence (e.g., an I/O statement) is never executed
    - Indirect array references: ID = IDX(I), X = A(ID)

Problems of parallel + inlining
- Opaque compositional subroutines
  o A calls B, B calls C, C calls D, …
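
A minimal sketch of this coding style with hypothetical names: each level merely forwards to the next, so a caller-side analysis sees nothing useful until the whole chain has been inlined.

    subroutine a(v, n)
      integer :: n
      real :: v(n)
      call b(v, n)          ! a only delegates to b
    end subroutine a

    subroutine b(v, n)
      integer :: n
      real :: v(n)
      call c(v, n)          ! b only delegates to c
    end subroutine b

    subroutine c(v, n)
      integer :: i, n
      real :: v(n)
      do i = 1, n           ! the only real computation, three calls deep
        v(i) = v(i) + 1.0
      end do
    end subroutine c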

Problems of parallel + inlining
- Array access
  o Difficult to determine which part of the array is killed when, e.g., CTR is computed at runtime
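
A hedged sketch of the kill problem, using a hypothetical fill routine with a runtime counter in the spirit of the slide's CTR.

    subroutine fill(buf, ctr, m)
      integer :: ctr, m, i
      real :: buf(*)
      do i = 1, m
        ctr = ctr + 1
        buf(ctr) = real(i)
      end do
      ! The call writes buf(ctr0+1 : ctr0+m), where ctr0 is the value of ctr on
      ! entry. Because ctr0 is only known at run time, a static analysis cannot
      ! tell which section of buf is killed (fully overwritten) by this call.
    end subroutine fill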

Problems of parallel + inlining
- Debugging and error checking
  o The statement that breaks the dependence (e.g., an I/O statement) is never executed
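
A hedged sketch with a hypothetical scale routine: the print in the error branch is never executed on valid inputs, yet its presence alone forces the loop to stay serial.

    subroutine scale(x, n)
      integer :: i, n
      real :: x(n)
      do i = 1, n
        if (x(i) < 0.0) then
          ! Error checking only: never reached when the input is valid, but the
          ! I/O statement introduces a dependence that blocks parallelization
          ! of the whole loop.
          print *, 'warning: negative input at ', i
        end if
        x(i) = sqrt(max(x(i), 0.0))
      end do
    end subroutine scale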

Problems of parallel + inlining
- Indirect array references
  o IN => NODE, NODE => IREL, IREL => RHSB
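
A hedged sketch of such a chained indirection, loosely following the IN => NODE => IREL => RHSB chain named on the slide (all declarations are illustrative).

    subroutine gather(rhsb, irel, node, in, n)
      integer :: i, id, n
      integer :: in(n), node(*), irel(*)
      real :: rhsb(*)
      do i = 1, n
        id = irel(node(in(i)))
        ! The loop is parallel only if the composed map i -> id is one-to-one;
        ! no static dependence test can establish that from this code alone.
        rhsb(id) = rhsb(id) + 1.0
      end do
    end subroutine gather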

Outline
- Innovations
- Problems of the parallel + inline strategy
- Annotation language
- Annotation-based inlining technique
- Experiments
- Summary

The annotation language
- Goal
  o Summarize information
  o Avoid ambiguity

The annotation language
- Restricted grammar
- Special operators
- Writing annotations

The annotation language
- Restricted grammar
  o Do-all loops only
  o No goto

The annotation language
- Special operators: y = operator(x1, x2, …, xn)
- Purpose: abstract a relation
  o Unknown operator
    - The relation is unknown (e.g., generic functions)
  o Unique operator
    - The relation is one-to-one, from X to Y
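
The transcript does not show the concrete annotation syntax, so the following commented pseudo-Fortran only illustrates the two relations these operators are meant to capture.

    ! y = unknown(x1, ..., xn)
    !   y is some unspecified function of x1..xn: the compiler may assume
    !   nothing beyond the data dependence of y on those inputs, which is
    !   enough to summarize the result of a generic function call.
    !
    ! id = unique(i)
    !   id is a one-to-one function of i. In a loop such as
    !       do i = 1, n
    !         id = unique(i)
    !         a(id) = ...
    !       end do
    !   distinct iterations write distinct elements of a, so the loop can be
    !   treated as a do-all even though the exact mapping stays unspecified.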

The annotation language
- Writing annotations
  o Eliminating adverse side effects
    - Preserve the caller and callee if inlining breaks a dependence
  o Summarize opaque subroutines
    - Eliminate nested function calls
  o Array access
    - Specify the exact range that gets read/modified
  o Debugging and error handling
    - Aggressive strategy: ignore checking statements
  o Indirect array references
    - Discover the unique relation

The annotation language
- Summarize opaque subroutines
  o Eliminate nested function calls

The annotation language
- Array access
  o Specify the exact range that gets read/modified
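
A purely hypothetical rendering of the kind of side-effect summary such an annotation has to convey (the concrete syntax is not shown in the transcript); the written region is given in Fortran array-section notation.

    ! Summary of a routine that appends m elements to a buffer, starting after a
    ! runtime counter ctr:
    !
    !   reads:   ctr, m
    !   writes:  buf(ctr+1 : ctr+m)
    !   updates: ctr  (new value: ctr + m)
    !
    ! With the written section pinned down symbolically, a caller loop whose
    ! iterations touch disjoint sections of buf can still be parallelized, which
    ! a "may write anywhere in buf" summary would forbid.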

The annotation language
- Debugging and error handling
  o Aggressive strategy: ignore checking statements

The annotation language
- Indirect array references
  o Discover the unique relation

Outline
- Innovations
- Problems of the parallel + inline strategy
- Annotation language
- Annotation-based inlining technique
- Experiments
- Summary

Annotation-based inlining
- Goal
  o Pass the annotated information to the compiler
  o Eliminate inlining side effects
- Flow
  o Inline before parallelization
  o Reverse-inline after parallelization
  o Verify and evaluate at the end
- Implementation
  o POLARIS compiler for parallelization
  o ROSE compiler for parsing
  o POET transformer
  o PERFECT benchmarks

Annotation-based inlining
- Workflow
  o Annotation inlining ⇒ Parallelization ⇒ Reverse inlining

Annotation-based inlining
- Inlining the annotation
  o Steps
    - Translate the annotation into the source language (translating special operators)
    - Inline the generated source (avoiding linearization)
  o Translating special operators
    - Unknown: use uninitialized global arrays
    - Unique: use a linear expression
  o Avoiding linearization
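
A hedged sketch of what this lowering could look like in plain Fortran; the names annot_unknown and u_tab are invented here, and the actual translation used by the tool chain may differ.

    subroutine lowered_example(a, idx, n)
      integer :: i, id, n
      integer :: idx(n)
      real :: a(*)
      real :: u_tab(1000)
      common /annot_unknown/ u_tab   ! uninitialized global array
      real :: y
      do i = 1, n
        ! unknown(idx(i)) lowered to a read of the uninitialized global array:
        ! the only fact a dependence test can extract is that y depends on idx(i).
        y = u_tab(idx(i))
        ! unique(i) lowered to a linear (hence one-to-one) expression in i, so
        ! the write below is provably collision-free across iterations.
        id = i
        a(id) = y
      end do
    end subroutine lowered_example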

Annotation-based inlining
- Inlining the annotation

Annotation-based inlining
- Parallelize do-all loops

Annotation-based inlining
- Reverse inlining

Annotation-based inlining
- Reverse inlining is indispensable
  o Inlined code is restored to a function call
    - Avoids loss of parallelism in the caller / callee
    - Enables the abstraction operators (unknown, unique)
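
A hedged before/after sketch, reusing the hypothetical driver/update pair from the earlier gain example; the directive dialect is OpenMP for illustration only, since POLARIS emits its own parallel annotations.

    subroutine driver(a, n)
      integer :: i, n
      real :: a(n)
      ! During analysis the loop contained the inlined body of update and was
      ! proved parallel; reverse inlining restores the call while keeping the
      ! parallel verdict on the loop.
      !$omp parallel do
      do i = 1, n
        call update(a, i, n)
      end do
    end subroutine driver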

Annotation-based inlining
- Verification and evaluation
  o Correctness, efficiency, and generality

Outline
- Innovations
- Problems of the parallel + inline strategy
- Annotation language
- Annotation-based inlining technique
- Experiments
- Summary

Experiment
- Purpose
  o What does conventional inlining bring to parallelization? Gains? Losses? Misses?
  o How well does annotation-based inlining avoid the above issues?
- Design
  o PERFECT benchmarks (except SPEC77)
  o Two machines: 8-core Intel Mac, 4-core AMD Opteron
  o Back-end compilers: GFortran, IFort 11.1
- Results
  o Loop counts
  o Performance

Experiment
- Result: loops
  o Conventional inlining: incurs losses
  o Annotation-based inlining: no loss, more gain

Experiment
- Result: performance
  o Average speedup is limited
  o Annotation-based inlining is always better

Summary
- Inter-procedural parallelization
- Summarized the effects of conventional inlining
  o Gained loops
  o Lost loops
  o Missed loops
- Proposed annotation-based inlining
  o Annotation summaries
  o Enhanced inlining strategy
  o Reverse inlining

Thanks! Questions?