Optimizing Compilers for Modern Architectures Interprocedural Analysis and Optimization Chapter 11, through Section 11.2.4.

Presentation transcript:

Optimizing Compilers for Modern Architectures
Interprocedural Analysis and Optimization
Chapter 11, through Section 11.2.4

Introduction
Interprocedural Analysis
—Gathering information about the whole program instead of a single procedure
Interprocedural Optimization
—Program transformation modifying more than one procedure, using interprocedural analysis

Overview: Interprocedural Analysis
—Examples of interprocedural problems
—Classification of interprocedural problems
—Solving two interprocedural problems
–Side-effect Analysis
–Alias Analysis

Some Interprocedural Problems
Modification and reference side effects:

      COMMON X,Y
      ...
      DO I = 1, N
S0:     CALL P
S1:     X(I) = X(I) + Y(I)
      ENDDO

Can vectorize if P
1. neither modifies nor uses X
2. does not modify Y

Modification and Reference Side Effects
MOD(s): set of variables that may be modified as a side effect of the call at s
REF(s): set of variables that may be referenced as a side effect of the call at s

      DO I = 1, N
S0:     CALL P
S1:     X(I) = X(I) + Y(I)
      ENDDO

Can vectorize S1 if X ∉ MOD(S0) ∪ REF(S0) and Y ∉ MOD(S0)
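
The test above reduces to a few set operations once the summary sets are available. The sketch below is illustrative only; it assumes MOD(S0) and REF(S0) have already been computed and are represented as Python sets of variable names.

      # Minimal sketch: deciding whether S1 can be vectorized across the
      # call at S0, given interprocedural summary sets for that call site.
      def can_vectorize_s1(mod_s0, ref_s0):
          # S1: X(I) = X(I) + Y(I) may be vectorized only if the call
          # neither modifies nor uses X, and does not modify Y.
          return "X" not in (mod_s0 | ref_s0) and "Y" not in mod_s0

      print(can_vectorize_s1(mod_s0={"Z"}, ref_s0={"W"}))   # True
      print(can_vectorize_s1(mod_s0={"Y"}, ref_s0=set()))   # False: P modifies Y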

Alias Analysis

      SUBROUTINE S(A,X,N)
      COMMON Y
      DO I = 1, N
S0:     X = X + Y*A(I)
      ENDDO
      END

Could have kept X and Y in different registers and stored into X outside the loop
What happens when there is a call, CALL S(A,Y,N)?
—Then Y is aliased to X on entry to S
—Can't delay the update to X in the loop any more
ALIAS(p,x): set of variables that may refer to the same location as formal parameter x on entry to p

Call Graph Construction
Call graph G = (N,E)
—N: one vertex for each procedure
—E: one edge for each possible call
–Edge (p,q) is in E if procedure p may call procedure q
Looks easy, but construction is difficult in the presence of procedure parameters

Call Graph Construction

      SUBROUTINE S(X,P)
S0:   CALL P(X)
      RETURN
      END

P is a procedure parameter of S
What values can P have on entry to S?
CALL(s): set of all procedures that may be invoked at s
This closely resembles the alias analysis problem
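
Once CALL(s) is known for every call site, assembling G = (N,E) is a single pass over the call sites. The sketch below is a hypothetical illustration; the data layout (call_sites per procedure, CALL per site) is assumed, not taken from the book.

      # Minimal sketch: building the call graph from CALL(s) sets.
      # A direct call has a singleton CALL(s); a call through a procedure
      # parameter may have several possible targets.
      def build_call_graph(procedures, call_sites, CALL):
          N = set(procedures)
          E = set()
          for p in procedures:
              for s in call_sites.get(p, []):
                  for q in CALL[s]:        # every procedure possibly invoked at s
                      E.add((p, q))        # edge (p, q): p may call q
          return N, E

      # Example: S calls through its procedure parameter P, which may be A or B.
      N, E = build_call_graph(
          procedures={"MAIN", "S", "A", "B"},
          call_sites={"MAIN": ["s1"], "S": ["s0"]},
          CALL={"s1": {"S"}, "s0": {"A", "B"}},
      )
      print(sorted(E))   # [('MAIN', 'S'), ('S', 'A'), ('S', 'B')]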

Live and Use Analysis

      DO I = 1, N
        T = X(I)*C
        A(I) = T + B(I)
        C(I) = T + D(I)
      ENDDO

This loop can be parallelized by making T a local variable in the loop:

      PARALLEL DO I = 1, N
        LOCAL t
        t = X(I)*C
        A(I) = t + B(I)
        C(I) = t + D(I)
        IF (I.EQ.N) T = t
      ENDDO

Copying the local version of T back to the global version of T is required to ensure correctness.
What if T were not live outside the loop?

Live and Use Analysis
Solve live analysis using use analysis
USE(s): set of variables having an upward-exposed use in the procedure p called at s
If a call site s is in basic block b, then x is live at s if either
—x ∈ USE(s), or
—the procedure called at s does not assign a new value to x, and x is live in some control-flow successor of b

Kill Analysis

      DO I = 1, N
S0:     CALL INIT(T,I)
        T = T + B(I)
        A(I) = A(I) + T
      ENDDO

To parallelize the loop:
—INIT must not create a recurrence with respect to the loop
—T must not be upward exposed (otherwise it cannot be privatized)

Kill Analysis

      DO I = 1, N
S0:     CALL INIT(T,I)
        T = T + B(I)
        A(I) = A(I) + T
      ENDDO

T has to be assigned before being used on every path through INIT.

      SUBROUTINE INIT(T,I)
      REAL T
      INTEGER I
      COMMON X(100)
      T = X(I)
      END

If INIT is of this form, we can see that T can be privatized.

Kill Analysis
KILL(s): set of variables assigned on every path through the procedure p called at s and through the procedures invoked in p
T in the previous example can be privatized if T ∈ KILL(S0) and T ∉ USE(S0)
We can also express LIVE(s) as
  LIVE(s) = USE(s) ∪ (LIVE(after the call at s) − KILL(s))
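
As a small illustration (not from the book), both decisions become set tests once USE(s) and KILL(s) are available; the function and variable names below are hypothetical.

      # Sketch: liveness across a call and the privatization test for a scalar v.
      def live_before_call(v, USE_s, KILL_s, live_after):
          # v is live just before the call at s if the callee has an
          # upward-exposed use of v, or it does not kill v and v is live
          # after the call.
          return v in USE_s or (v not in KILL_s and v in live_after)

      def privatizable(v, USE_s, KILL_s):
          # Sufficient condition used for T above: every path through the
          # callee assigns v before any use of it.
          return v in KILL_s and v not in USE_s

      # INIT assigns T = X(I) and never reads T first:
      print(privatizable("T", USE_s=set(), KILL_s={"T"}))                          # True
      print(live_before_call("T", USE_s={"T"}, KILL_s=set(), live_after=set()))    # True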

Constant Propagation

      SUBROUTINE S(A,B,N,IS,I1)
      REAL A(*), B(*)
      DO I = 0, N-1
S0:     A(IS*I+I1) = A(IS*I+I1) + B(I+1)
      ENDDO
      END

If IS = 0, the loop around S0 is a reduction
If IS ≠ 0, the loop can be vectorized
CONST(p): set of variables with known constant values on every invocation of p
Knowledge of CONST(p) is useful for interprocedural constant propagation

Interprocedural Problem Classification
May and must problems
—MOD, REF, and USE are 'may' problems
—KILL is a 'must' problem
Flow-sensitive and flow-insensitive problems
—Flow sensitive: control flow information is important
—Flow insensitive: control flow information is unimportant

Flow Sensitive vs Flow Insensitive
[Figure: regions R1 and R2 built from blocks A and B; one composes A and B sequentially, the other composes them as alternatives on two branches.]

Classification (continued)
A problem is flow insensitive iff the solution for both sequentially and alternately composed regions is obtained by taking the union of the solutions for the subregions
Side-effect vs propagation problems
—MOD, REF, KILL, and USE are side-effect problems
—ALIAS, CALL, and CONST are propagation problems

Flow-Insensitive Side-Effect Analysis
Assumptions
—No procedure nesting
—All parameters passed by reference
—Size of each parameter list bounded by a constant (call it μ)
We will formulate and solve the MOD(s) problem

Solving MOD
DMOD(s): set of variables that may be directly modified as a side effect of the call at s, i.e., the actual parameters and globals bound at s to variables in GMOD(p), where p is the procedure called at s
GMOD(p): set of global variables and formal parameters of p that may be modified, either directly or indirectly, as a result of an invocation of p

Example: DMOD and GMOD

S0:   CALL P(A,B,C)
      ...

      SUBROUTINE P(X,Y,Z)
      INTEGER X,Y,Z
      X = X*Z
      Y = Y*Z
      END

GMOD(P) = {X,Y}
DMOD(S0) = {A,B}
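
The example can be reproduced mechanically: DMOD(s) is obtained by mapping each variable in GMOD of the callee back through the binding at the call site. The sketch below is illustrative; the dictionary-based representation is an assumption.

      # Sketch: deriving DMOD(s) from GMOD(callee) and the binding at s.
      # binding maps each formal of the callee to the actual passed at s;
      # globals visible at the call pass through under their own names.
      def dmod(gmod_callee, binding, globals_at_call=frozenset()):
          result = set()
          for v in gmod_callee:
              if v in binding:               # modified formal -> its actual
                  result.add(binding[v])
              elif v in globals_at_call:     # modified global -> itself
                  result.add(v)
          return result

      # CALL P(A,B,C) with GMOD(P) = {X, Y}:
      print(dmod({"X", "Y"}, binding={"X": "A", "Y": "B", "Z": "C"}))  # DMOD(S0) = {A, B}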

Solving GMOD
GMOD(p) contains two kinds of variables
—Variables explicitly modified in the body of p: these constitute the set IMOD(p)
—Variables modified as a side effect of some procedure invoked in p
–Global variables are viewed as parameters to a called procedure

Solving GMOD
The previous iterative method may take a long time to converge
—Problem with recursive calls:

      SUBROUTINE P(F0,F1,F2,...,Fn)
      INTEGER X,F0,F1,F2,...,Fn
      ...
S0:   F0 = ...
S1:   CALL P(F1,F2,...,Fn,X)
      ...
      END

(each iteration of the analysis discovers the modification of only one more parameter position, so convergence can take n iterations)

Solving GMOD
Decompose GMOD(p) differently to get an efficient solution
Key: treat side effects to global variables and to reference formal parameters separately

Solving for IMOD+
Formal definition: x ∈ IMOD+(p) if
—x ∈ IMOD(p), or
—x ∈ RMOD(p) and x is a formal parameter of p
RMOD(p): set of formal parameters of p that may be modified in p, either directly or by assignment to a reference formal parameter of q as a side effect of a call of q in p

Solving for RMOD
RMOD(p) construction needs the binding graph G_B = (N_B, E_B)
—One vertex for each formal parameter of each procedure
—A directed edge from formal parameter f1 of p to formal parameter f2 of q if there exists a call site s = (p,q) in p at which f1 is bound to f2
Use a marking algorithm to solve for RMOD

Solving for RMOD

      SUBROUTINE A(X,Y,Z)
      INTEGER X,Y,Z
      X = Y + Z
      Y = Z + 1
      END

      SUBROUTINE B(P,Q)
      INTEGER P,Q,I
      I = 2
      CALL A(P,Q,I)
      CALL A(Q,P,I)
      END

[Binding graph: vertices X, Y, Z (formals of A) and P, Q (formals of B); the two calls in B bind each of P and Q to X and to Y, and the local I to Z.]
IMOD(A) = {X,Y}, IMOD(B) = {I}
—Initially mark the directly modified formals X and Y; Worklist = {X,Y}
—Propagate marks backward across binding edges: P and Q are each bound to a marked formal (X or Y) at some call site, so they are marked as well
Result: RMOD(A) = {X,Y}, RMOD(B) = {P,Q}
Complexity: each binding-graph vertex is marked at most once and each edge is examined at most once
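
The marking pass is easy to phrase as a worklist algorithm over the binding graph. The sketch below is a reconstruction for illustration, not the book's code; the data layout (vertices as (procedure, formal) pairs) is assumed.

      # Sketch: worklist marking for RMOD on the binding graph.
      # bind_edges[(p, f1)] is the set of (q, f2) vertices that f1 is bound
      # to at some call site in p.
      def compute_rmod(formals, imod, bind_edges):
          marked = {(p, f) for p, fs in formals.items()
                           for f in fs if f in imod.get(p, set())}
          # Reverse the binding edges: if f1 is bound to f2 and f2 may be
          # modified, then f1 may be modified as well.
          preds = {}
          for src, dsts in bind_edges.items():
              for dst in dsts:
                  preds.setdefault(dst, set()).add(src)
          worklist = list(marked)
          while worklist:
              v = worklist.pop()
              for u in preds.get(v, ()):
                  if u not in marked:
                      marked.add(u)
                      worklist.append(u)
          rmod = {p: set() for p in formals}
          for p, f in marked:
              rmod[p].add(f)
          return rmod

      # The example above: B binds P and Q to X and Y at its two calls to A
      # (I is a local of B, not a formal, so it has no binding-graph vertex).
      rmod = compute_rmod(
          formals={"A": ["X", "Y", "Z"], "B": ["P", "Q"]},
          imod={"A": {"X", "Y"}, "B": {"I"}},
          bind_edges={("B", "P"): {("A", "X"), ("A", "Y")},
                      ("B", "Q"): {("A", "X"), ("A", "Y")}},
      )
      print(rmod)   # RMOD(A) = {X, Y}, RMOD(B) = {P, Q}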

Solving for IMOD+
After gathering RMOD(p) for all procedures, update RMOD(p) to IMOD+(p) using the equation
  IMOD+(p) = IMOD(p) ∪ RMOD(p)
This can be done in O(NV+E) time

Solving for GMOD
After gathering IMOD+(p) for all procedures, calculate GMOD(p) according to the equation
  GMOD(p) = IMOD+(p) ∪ ⋃_{(p,q) ∈ E} (GMOD(q) ∩ {global variables})
This can be solved using a DFS algorithm based on Tarjan's SCR algorithm on the call graph

Solving for GMOD
[Figure: two snapshots of a depth-first traversal over a small call graph with procedures p, q, r, and s.]
—Initialize GMOD(p) to IMOD+(p) when p is first discovered
—Update GMOD(p) while backing up, adding the global variables in GMOD of each procedure called from p
—For each node u in an SCR, update GMOD(u) around the cycle until the SCR converges
O((N+E)V) algorithm
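
The sketch below replaces the Tarjan-based single-pass DFS with a plain fixpoint iteration, which is simpler to read and computes the same sets under the equation above; names and data layout are assumptions for illustration.

      # Simplified sketch of GMOD propagation over the call graph.
      def compute_gmod(procedures, call_edges, imod_plus, globals_):
          gmod = {p: set(imod_plus.get(p, set())) for p in procedures}
          changed = True
          while changed:                    # terminates: sets only grow
              changed = False
              for (p, q) in call_edges:
                  # Global side effects of the callee are side effects of
                  # the caller too; formal-parameter effects were already
                  # folded into IMOD+ via RMOD.
                  add = (gmod[q] & globals_) - gmod[p]
                  if add:
                      gmod[p] |= add
                      changed = True
          return gmod

      # Tiny example: MAIN calls P, and P assigns the COMMON variable Y.
      print(compute_gmod(procedures={"MAIN", "P"},
                         call_edges={("MAIN", "P")},
                         imod_plus={"MAIN": set(), "P": {"Y"}},
                         globals_={"Y"}))   # GMOD(MAIN) = GMOD(P) = {Y}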

Alias Analysis
Recall the definition of MOD(s):
  MOD(s) = DMOD(s) ∪ ⋃_{x ∈ DMOD(s)} ALIAS(p,x), where p is the procedure containing call site s
The task is to
—Compute ALIAS(p,x)
—Update DMOD to MOD using ALIAS(p,x)

Update DMOD to MOD

      SUBROUTINE P
      INTEGER A
S0:   CALL S(A,A)
      END

      SUBROUTINE S(X,Y)
      INTEGER X,Y
S1:   CALL Q(X)
      END

      SUBROUTINE Q(Z)
      INTEGER Z
      Z = 0
      END

GMOD(Q) = {Z}
DMOD(S1) = {X}
MOD(S1) = {X,Y}, since Y ∈ ALIAS(S,X) because of the call CALL S(A,A)

Naïve Update of DMOD to MOD

      procedure findMod(N, E, n, IMOD+, LOCAL)
        for each call site s do begin
          MOD[s] := DMOD[s];
          for each x in DMOD[s] do
            for each v in ALIAS[p,x] do
              MOD[s] := MOD[s] ∪ {v};
        end
      end findMod

O(EV²) algorithm; we need to do better

Update of DMOD to MOD
Key observations
—Two global variables can never be aliases of each other; a global variable can only be aliased to a formal parameter
—ALIAS(p,x) contains at most μ entries if x is global (one per formal parameter of p)
—ALIAS(p,f) for a formal parameter f can contain any global variable or other formal parameter, so it may have O(V) entries

Update of DMOD to MOD
Break the update into two parts
—If x is global, we need to add at most μ elements, but may need to do it O(V) times
—If x is a formal parameter, we may need to add O(V) elements, but need to do it only O(μ) times
This gives O(μV) = O(V) update time per call site, implying O(VE) for all calls
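
In code the refined update looks like the naïve loop; the gain comes from bounding the two kinds of alias sets separately. The sketch below is a hypothetical illustration of that accounting.

      # Sketch of the MOD update at one call site, with DMOD(s) split into
      # global variables and formal parameters of the caller p.
      def mod_at_call_site(p, dmod_s, alias, formals_of_p):
          mod_s = set(dmod_s)
          glob = [x for x in dmod_s if x not in formals_of_p]  # up to O(V) of them
          form = [x for x in dmod_s if x in formals_of_p]      # at most mu of them
          for x in glob:
              mod_s |= alias.get((p, x), set())   # each set has at most mu entries
          for x in form:
              mod_s |= alias.get((p, x), set())   # each set may have O(V) entries
          return mod_s   # total work O(mu*V + mu*V) = O(V) per call site

      # The earlier example: DMOD(S1) = {X}, and Y is aliased to X in S.
      print(mod_at_call_site("S", {"X"}, {("S", "X"): {"Y"}}, formals_of_p={"X", "Y"}))
      # MOD(S1) = {X, Y}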

Computing Aliases
Compute A(f): the set of global variables that formal parameter f can be aliased to
—Use the binding graph (with cycles reduced to single nodes)
—Compute ALIAS(p,g) for all global variables g
Expand A(f) to ALIAS(p,f) for formal parameters
—Find alias introduction sites (e.g., CALL P(X,X))
—Propagate aliases
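
A small reconstruction of the first step, assuming aliases introduced at a call (a global passed as an actual) flow forward along binding edges to the formals of called procedures; the representation is hypothetical, not the book's.

      # Sketch: computing A(f), the globals each formal may be aliased to.
      # intro[(q, f)] holds globals passed directly as the actual for f at
      # some call of q; bind_edges[(p, f1)] holds the (q, f2) formals that
      # f1 is passed on to.
      def compute_A(intro, bind_edges):
          A = {v: set(gs) for v, gs in intro.items()}
          worklist = [v for v in A if A[v]]
          while worklist:
              v = worklist.pop()
              for w in bind_edges.get(v, ()):
                  before = len(A.setdefault(w, set()))
                  A[w] |= A[v]          # aliases of the actual reach the formal
                  if len(A[w]) != before:
                      worklist.append(w)
          return A

      # Global Y passed to formal X of S (CALL S(A,Y,N) from the earlier
      # slide), and S passes X on to formal Z of Q.
      print(compute_A(intro={("S", "X"): {"Y"}},
                      bind_edges={("S", "X"): {("Q", "Z")}}))
      # A(X in S) = {Y};  A(Z in Q) = {Y}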

Summary
—Examined some interprocedural problems and their application in parallelization and vectorization
—Classified interprocedural problems
—Solved the modification side-effect problem (MOD) and the alias analysis problem