Optimizing Compilers for Modern Architectures More Interprocedural Analysis Chapter 11, Sections 11.2.5 to end.

Review

We analyzed several interprocedural problems:
- MOD: the set of variables that may be modified by a procedure.
- ALIAS: the set of variables that may be aliased in a procedure.
The problem of aliasing makes interprocedural problems harder than global problems.

Constant Propagation

Propagating constants between procedures can cause significant improvements: dependence testing can be made more precise.

SUBROUTINE FOO(N)
INTEGER N,M
CALL INIT(M,N)
DO I = 1,P
   B(M*I + 1) = 2*B(1)
ENDDO
END

SUBROUTINE INIT(M,N)
M = N
END

If N = 0 on entry to FOO, then M = 0, every iteration references B(1), and the loop is a reduction. Otherwise, we can vectorize the loop.

Constant Propagation

Definition: Let s = (p,q) be a call site in procedure p, and let x be a parameter of q. Then the jump function J_x^s for x at s gives the value of x at s in terms of the parameters of p. The support of J_x^s is the set of p-parameters that J_x^s depends on.

Constant Propagation

Instead of a def-use graph, we construct an interprocedural value graph:
- Add a node to the graph for each jump function J_x^s.
- If x belongs to the support of J_y^t, where t lies in procedure q, then add an edge between J_x^s and J_y^t for every call site s = (p,q) for some p.
We can now apply the constant-propagation algorithm to this graph.
- Might want to iterate with global propagation.
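A minimal runnable sketch of that worklist propagation over a value graph. The three-point lattice (TOP = not yet known, a constant, BOT = not constant), the dict-based graph encoding, and the node names are illustrative choices, not from the chapter:

```python
# Sketch: constant propagation over an interprocedural value graph.
TOP, BOT = "top", "bot"

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

def propagate(supports, jump):
    """supports[n]: jump functions feeding node n (the value-graph edges).
    jump[n]: evaluates n's jump function given its support values."""
    val = {n: TOP for n in jump}
    work = list(jump)
    while work:
        n = work.pop()
        args = {s: val[s] for s in supports.get(n, ())}
        if TOP in args.values():
            continue                      # an input is still unknown
        new = BOT if BOT in args.values() else jump[n](args)
        if new != val[n]:
            val[n] = meet(val[n], new)    # values only move down the lattice
            # re-examine every node this one feeds
            work.extend(m for m in jump if n in supports.get(m, ()))
    return val

# e.g. constant leaves A = 1 and B = 2 feeding Z = A + B and W = A - B:
supports = {"Z": ("A", "B"), "W": ("A", "B")}
jump = {"A": lambda a: 1, "B": lambda a: 2,
        "Z": lambda a: a["A"] + a["B"],
        "W": lambda a: a["A"] - a["B"]}
vals = propagate(supports, jump)
# vals == {"A": 1, "B": 2, "Z": 3, "W": -1}
```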

Example

PROGRAM MAIN
INTEGER A,B
A = 1
B = 2
CALL S(A,B)
END

SUBROUTINE S(X,Y)
INTEGER X,Y,Z,W
Z = X + Y
W = X - Y
CALL T(Z,W)
END

SUBROUTINE T(U,V)
PRINT U,V
END

The constant-propagation algorithm eventually converges to A = 1, B = 2, Z = 3, W = -1 (it might need to iterate with global propagation).

Jump Functions

PROGRAM MAIN
INTEGER A
CALL PROCESS(15,A)
PRINT A
END

SUBROUTINE PROCESS(N,B)
INTEGER N,B,I
CALL INIT(I,N)
CALL SOLVE(B,I)
END

SUBROUTINE INIT(X,Y)
INTEGER X,Y
X = 2*Y
END

SUBROUTINE SOLVE(C,T)
INTEGER C,T
C = T*10
END

We also need a way of building return jump functions: for x an output of p, define R_x^p to be the value of x on return from p in terms of p-parameters. The support of R_x^p is defined as above.

Kill Analysis

The NotKilled problem is easily solved globally. To extend NotKilled to the procedure level, construct the reduced control-flow graph for the procedure:
- Vertices consist of the procedure entry, the procedure exit, and the call sites.
- Every edge (x,y) in the graph is annotated with the set THRU(x,y) of variables not killed on that edge.

Reduced Control Flow Graph

ComputeReducedCFG(G)
  remove back edges from G
  for each successor s of the entry node, add (entry, s) to the worklist
  while the worklist isn't empty
    remove a ready element (b,s) from the worklist
    if s is a call site
      add (s,t) to the worklist for each successor t of s
    else if s isn't the exit node
      for each successor t of s
        if THRU[b,t] is undefined then THRU[b,t] ← {}
        THRU[b,t] ← THRU[b,t] ∪ (THRU[b,s] ∩ THRU[s,t])
        add (b,t) to the worklist
end

Kill Analysis

ComputeNKILL(p)
  for each b in the reduced graph in reverse topological order
    if b is the exit node then
      NKILL[b] ← {all variables}
    else
      NKILL[b] ← {}
      for each successor s of b
        NKILL[b] ← NKILL[b] ∪ (NKILL[s] ∩ THRU[b,s])
      if b is a call site (p,q) then
        NKILL[b] ← NKILL[b] ∩ NKILL[q]
  NKILL[p] ← NKILL[entry node]
end
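The same computation as a runnable sketch. THRU is assumed to be precomputed; the node names, the variable universe, and the THRU sets in the example are made up for illustration:

```python
# Sketch: NKILL over an acyclic reduced control-flow graph.
def compute_nkill(rev_topo, succs, thru, universe, callee_nkill, calls):
    """rev_topo: nodes in reverse topological order (exit node first).
    succs[b]: successors of b; thru[(b, s)]: variables not killed on b->s.
    callee_nkill[q]: NKILL set of procedure q; calls[b] = q if b calls q."""
    nkill = {}
    for b in rev_topo:
        if not succs.get(b):                 # the exit node
            nkill[b] = set(universe)
            continue
        nkill[b] = set()
        for s in succs[b]:
            nkill[b] |= nkill[s] & thru[(b, s)]
        if b in calls:                       # the callee can kill, too
            nkill[b] &= callee_nkill[calls[b]]
    return nkill

# Entry -> call site (calls Q, which kills x) -> exit, over universe {x, y}:
nk = compute_nkill(
    rev_topo=["exit", "call", "entry"],
    succs={"entry": ["call"], "call": ["exit"]},
    thru={("entry", "call"): {"x", "y"}, ("call", "exit"): {"x", "y"}},
    universe={"x", "y"},
    callee_nkill={"Q": {"y"}},
    calls={"call": "Q"})
# nk["entry"] == {"y"}: only y is guaranteed not killed by the procedure
```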

Kill Analysis

ComputeNKILL only works in the absence of formal parameters. A more complicated algorithm takes parameter binding into account using a binding graph. That algorithm ignores aliasing for efficiency.

Symbolic Analysis

Prove facts about variables other than constancy:
- Find a symbolic expression for a variable in terms of other variables.
- Establish a relationship between pairs of variables at some point in the program.
- Establish a range of values for a variable at a given point.

Range Analysis: Symbolic Analysis

Range analysis and symbolic evaluation can be solved using a lattice framework whose elements are ranges such as [1:100], [50:∞], [-∞:100], [1:∞], and [-∞:∞]. Jump functions and return jump functions return ranges. The meet operation is now more complicated. If we can bound the number of times an upper bound increases and a lower bound decreases, the finite-descending-chain property is satisfied.
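One standard way to enforce such a bound is widening: after a fixed number of growths, push the moving bound to infinity. A small sketch, where the (lo, hi) tuple representation and the cutoff k are illustrative choices:

```python
import math

NEG_INF, POS_INF = -math.inf, math.inf

def join(a, b):
    """Smallest range containing both ranges (lo, hi)."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new, counts, k=3):
    """After k growths of a bound, jump it to infinity so the chain of
    range updates is finite (the bound discussed in the text)."""
    lo, hi = new
    if lo < old[0]:
        counts["lo"] += 1
        if counts["lo"] > k:
            lo = NEG_INF
    if hi > old[1]:
        counts["hi"] += 1
        if counts["hi"] > k:
            hi = POS_INF
    return (lo, hi)

# A loop counter growing by one each iteration stabilizes at [1, +inf):
r, counts = (1, 1), {"lo": 0, "hi": 0}
for _ in range(10):
    r = widen(r, join(r, (1, r[1] + 1)), counts)
# r == (1, math.inf)
```

Without the widening step, the analysis would update the upper bound forever on an unbounded loop; the cutoff trades precision for termination.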

Array Section Analysis

Consider the following code:

DO I = 1,N
   CALL SOURCE(A,I)
   CALL SINK(A,I)
ENDDO

We want to know if this loop carries a dependence. MOD and USE are of some use here, but not much. Let MOD(I) be the set of locations in array A modified on iteration I and USE(I) the set of locations used on iteration I. Then the loop carries a true dependence iff MOD(I1) ∩ USE(I2) ≠ ∅ for some iterations I1 < I2.

Array Section Analysis

One possible lattice representation is sections of the form A(I,L), A(I,*), A(*,L), A(*,*). The depth of the lattice is then on the order of the number of array subscripts, and the meet operation is efficient. A better representation is one in which upper and lower bounds for each subscript are allowed.
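A sketch of the meet for the bounded representation, merging dimension by dimension. The encoding (a tuple with one (lo, hi) pair or '*' per subscript) is an illustrative choice:

```python
STAR = "*"  # the whole extent of one dimension

def meet_dim(a, b):
    """Merge one subscript: two (lo, hi) bounds, or '*' if either side
    already covers the whole dimension."""
    if a == STAR or b == STAR:
        return STAR
    return (min(a[0], b[0]), max(a[1], b[1]))

def meet_section(s1, s2):
    """Merge two sections of the same array, one entry per subscript."""
    return tuple(meet_dim(a, b) for a, b in zip(s1, s2))

# A(1:1, 2:5) merged with A(3:3, *) gives A(1:3, *):
merged = meet_section(((1, 1), (2, 5)), ((3, 3), STAR))
# merged == ((1, 3), "*")
```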

Call Graph Construction

This problem is complicated by procedure parameters:

SUBROUTINE SUB1(X,Y,P)
INTEGER X,Y
CALL P(X,Y)
END

What procedure names can be passed into P? To avoid loss of precision, we need to record sets of procedure-name tuples when a subroutine has multiple procedure parameters.

Call Graph Construction

Procedure ComputeProcParms
  for each procedure p
    ProcParms[p] ← {}
  for each call site s = (p,q) passing in procedure names
    let t = the tuple of procedure names passed in
    worklist ← worklist ∪ {⟨t,q⟩}
  while the worklist isn't empty
    remove ⟨t,p⟩ from the worklist
    ProcParms[p] ← ProcParms[p] ∪ {t}
    bind the elements of t to p's procedure parameters Pi
    for each call site (p,q) passing in some Pi
      let u = the tuple of procedure names and instances of Pi passed into q
      if u is not in ProcParms[q] then
        worklist ← worklist ∪ {⟨u,q⟩}
end
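The intent of the algorithm as a runnable sketch. Encoding each call-site argument as either a literal procedure name or a forwarded parameter position, and the procedure names in the example, are illustrative choices:

```python
# Sketch: propagating tuples of procedure names through call sites.
def compute_procparms(call_sites, roots):
    """call_sites: (caller, callee, args) triples, where each arg is
    ("lit", name) for a literal procedure name or ("fwd", i) forwarding
    the caller's i-th procedure parameter. roots: procedures that take
    no procedure parameters (bound to the empty tuple)."""
    parms = {r: {()} for r in roots}
    work = list(roots)
    while work:
        p = work.pop()
        for caller, callee, args in call_sites:
            if caller != p:
                continue
            for t in list(parms[p]):
                # substitute forwarded parameters from the current tuple t
                bound = tuple(v if kind == "lit" else t[v]
                              for kind, v in args)
                if bound not in parms.setdefault(callee, set()):
                    parms[callee].add(bound)
                    work.append(callee)
    return parms

# MAIN passes FOO to SUB1; SUB1 forwards its parameter to SUB2:
sites = [("MAIN", "SUB1", [("lit", "FOO")]),
         ("SUB1", "SUB2", [("fwd", 0)])]
pp = compute_procparms(sites, roots=["MAIN"])
# pp["SUB2"] == {("FOO",)}
```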

Inlining

Inlining procedure calls has several advantages:
- Eliminates procedure-call overhead.
- Allows more optimizations to take place.
However, a study by Mary Hall showed that overuse of inlining can cause slowdowns:
- It breaks assumptions the compiler makes about procedure boundaries.
- Larger inlined bodies increase register pressure, adding spills.
- Changing one function forces global recompilation.

Procedure Cloning

Often specific values of procedure parameters enable better optimizations.

PROCEDURE UPDATE(A,N,IS)
REAL A(N)
DO I = 1,N
   A(I*IS-IS+1) = A(I*IS-IS+1) + PI
ENDDO
END

If we knew that IS != 0 at a call, the loop could be vectorized. If we know that IS != 0 at specific call sites, we clone a vectorized version of the procedure and use it at those sites.

Hybrid Optimizations

Combinations of procedure optimizations can have benefit. One example is loop embedding, which moves a loop from the caller into the callee.

Before:
DO I = 1,N
   CALL FOO()
ENDDO

PROCEDURE FOO()
…
END

After:
CALL FOO()

PROCEDURE FOO()
DO I = 1,N
   …
ENDDO
END

Whole-Program Compilation

We want to do interprocedural analysis efficiently, without constant recompilation; this is a software-engineering problem. Basic idea: every time the optimizer passes over a component, calculate information telling which other procedures must be rescanned, and include a feedback loop that optimizes until no procedure is out of date.
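That feedback loop can be sketched as follows. The dependence bookkeeping, the procedure names, and the toy "facts" in the example are illustrative, not from the text:

```python
# Sketch: recompile procedures until no one's interprocedural inputs change.
def recompile_until_stable(procs, depends_on, compile_proc):
    """depends_on[p]: procedures whose interprocedural facts p's
    compilation reads. compile_proc(p, facts) recomputes p's facts.
    Whenever p's facts change, every dependent is marked out of date."""
    facts = {}
    out_of_date = set(procs)
    while out_of_date:
        p = min(out_of_date)            # deterministic pick for the sketch
        out_of_date.discard(p)
        new = compile_proc(p, facts)
        if new != facts.get(p):
            facts[p] = new
            out_of_date |= {q for q in procs if p in depends_on[q]}
    return facts

# B's facts depend on A's; the loop stops once both are up to date:
facts = recompile_until_stable(
    ["A", "B"],
    depends_on={"A": set(), "B": {"A"}},
    compile_proc=lambda p, f: 1 if p == "A" else f.get("A", 0) * 2)
# facts == {"A": 1, "B": 2}
```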

Summary

Solution of flow-sensitive problems:
- Constant propagation
- Kill analysis
Solutions to related problems such as symbolic analysis and array section analysis. Other optimizations, and ways to integrate these into whole-program compilation.