June 4, 2016, Azusa, CA. Sheldon X. Liang, Ph.D., Azusa Pacific University, Azusa, CA 91702, Tel: (800) 825-5278.

Presentation transcript:

1 CS400 Compiler Construction
June 4, 2016, Azusa, CA. Sheldon X. Liang, Ph.D., Department of Computer Science, Azusa Pacific University, Azusa, CA 91702, Tel: (800)

2 Optimization - various goals
Compiler optimization is the process of tuning the output of a compiler to minimize or maximize some attribute of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the amount of memory occupied. The growth of portable computers has also created a market for minimizing the power consumed by a program.

3 Optimization - factors
It has been shown that some code optimization problems are NP-complete, or even undecidable. In practice, factors such as the programmer's willingness to wait for the compiler to complete its task place upper limits on the optimizations that a compiler implementor might provide. (Optimization is generally a very CPU- and memory-intensive process.) In the past, computer memory limitations were also a major factor in limiting which optimizations could be performed.

4 Keep in mind the following questions
Machine-dependent optimization; local optimization; superlocal optimization; global optimization.
Means of optimization: algebraic simplification, constant folding, value numbering, value graph, congruence, SSA form, partitioning.

5 Optimization
Optimization = transformation that improves the performance of the target code.
An optimization must not change the output, must not cause errors that were not present in the original program, and must be worth the effort (profiling often helps).
Which optimizations are most important depends on the program, but generally loop optimizations, register allocation, and instruction scheduling are the most critical.
Local optimizations: within basic blocks. Superlocal optimizations: within extended basic blocks. Global optimizations: within the flow graph.

6 Means of Optimization
Algebraic simplification; constant folding / constant propagation; redundancy elimination; value numbering / value graph.

7 Extended Basic Block
An Extended Basic Block (EBB) is a maximal sequence of instructions beginning with a leader that contains no join nodes other than its leader. Some local optimizations are more effective when applied to EBBs. Such optimizations tend to treat each path through an EBB as if it were a single block.
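
The definition above can be turned into a small grouping routine. The following is only a sketch: it assumes the control flow graph is given as a dictionary of successor lists (a hypothetical encoding, not one from the slides) and that every block appears as a key. A block is treated as a leader if it is the entry block or a join node, meaning it has two or more predecessors.

```python
from collections import defaultdict

def extended_basic_blocks(succ, entry):
    """Return {leader: sorted member blocks} for a CFG given as successor lists."""
    preds = defaultdict(set)
    for block, targets in succ.items():
        for t in targets:
            preds[t].add(block)
    # A block leads an EBB if it is the entry or a join node (2+ predecessors).
    leaders = {b for b in succ if b == entry or len(preds[b]) > 1}
    ebbs = {}
    for leader in leaders:
        members, work = [], [leader]
        while work:
            block = work.pop()
            members.append(block)
            # Grow the EBB through successors that are not leaders themselves.
            for t in succ[block]:
                if t not in leaders:
                    work.append(t)
        ebbs[leader] = sorted(members)
    return ebbs

# Diamond-shaped CFG: A branches to B and C, which join at D, followed by E.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
ebbs = extended_basic_blocks(cfg, "A")
# ebbs == {"A": ["A", "B", "C"], "D": ["D", "E"]}
```

Block D starts a new EBB because it is a join node, while B and C each extend a path that begins at A.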

8 Algebraic simplifications
These include:
Taking advantage of algebraic identities: (x*1) is x.
Strength reduction: (x*2) is (x << 1).
Simplifications such as: -(-x) is x; (1 || x) is true; (1 && x) is x (as a truth value); *(&x) is x.
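
As an illustration, some of the rewrites listed above can be applied bottom-up to an expression tree. This is a sketch only: the tuple encoding and the operator names "neg", "deref", and "addr" are assumptions made for the example, and the rules only match a constant in the right-hand operand of a multiplication.

```python
def simplify(e):
    """Apply a few algebraic simplifications to a tuple-encoded expression."""
    if not isinstance(e, tuple):
        return e                       # a variable name or an integer constant
    op, *args = e
    args = [simplify(a) for a in args] # simplify subexpressions first
    if op == "*" and args[1] == 1:
        return args[0]                 # x * 1  ->  x
    if op == "*" and args[1] == 2:
        return ("<<", args[0], 1)      # x * 2  ->  x << 1 (strength reduction)
    if op == "neg" and isinstance(args[0], tuple) and args[0][0] == "neg":
        return args[0][1]              # -(-x)  ->  x
    if op == "||" and args[0] == 1:
        return 1                       # 1 || x ->  true
    if op == "deref" and isinstance(args[0], tuple) and args[0][0] == "addr":
        return args[0][1]              # *(&x)  ->  x
    return (op, *args)

print(simplify(("*", ("neg", ("neg", "x")), 1)))   # prints x
print(simplify(("*", "y", 2)))                     # prints ('<<', 'y', 1)
```

Simplifying children before the parent lets one rewrite expose another, as in the first call, where removing the double negation leaves a plain x * 1.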

9 Constant folding
Definition: the evaluation at compile time of expressions whose values are known to be constant.
Is it always safe?
Booleans: yes.
Integers: almost always (issues: division by zero, overflow).
Floating point: usually no (issues: compiler's vs. processor's floating-point arithmetic, exceptions, etc.).
May be combined with constant propagation.
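
The integer case can be sketched as a folding helper that declines to fold whenever one of the issues above would arise. The 32-bit range and the C-style truncating division are assumptions chosen for the example; a real compiler would use the rules of its target.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1   # assumed 32-bit target integers

def fold(op, a, b):
    """Return the folded constant, or None when folding is not known to be safe."""
    if not (isinstance(a, int) and isinstance(b, int)):
        return None                    # an operand is not a known constant
    if op == "+":
        result = a + b
    elif op == "-":
        result = a - b
    elif op == "*":
        result = a * b
    elif op == "/":
        if b == 0:
            return None                # keep the division so it traps at run time
        q = abs(a) // abs(b)           # truncate toward zero, as C does
        result = q if (a < 0) == (b < 0) else -q
    else:
        return None                    # unknown operator: do not fold
    if not INT_MIN <= result <= INT_MAX:
        return None                    # folding would hide an overflow
    return result

print(fold("+", 1, 4))        # prints 5
print(fold("/", 7, 0))        # prints None
print(fold("*", 2**30, 4))    # prints None
```

Returning None rather than raising keeps the decision with the caller, which can simply leave the original instruction in place.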

10 Constant Propagation
Constant propagation is a well-known static compiler technique in which the values of variables that are determined to be constants are passed to the expressions that use them. Code size reduction, bounds propagation, and dead-code elimination are some of the optimizations that benefit from this analysis.
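
Within a single basic block, the idea can be sketched as one forward pass that remembers which variables currently hold a constant. The instruction encoding (dest, op, a, b), with "=" for a copy, is an assumption made for this example, and overflow checks are left out for brevity.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def propagate_constants(block):
    """Forward pass over one basic block of (dest, op, a, b) tuples."""
    known = {}                         # variable -> constant it holds right now
    out = []
    for dest, op, a, b in block:
        a = known.get(a, a)            # substitute operands known to be constant
        b = known.get(b, b)
        if op == "=" and isinstance(a, int):
            known[dest] = a
        elif op in OPS and isinstance(a, int) and isinstance(b, int):
            a, op, b = OPS[op](a, b), "=", None   # fold into a constant copy
            known[dest] = a
        else:
            known.pop(dest, None)      # dest no longer holds a known constant
        out.append((dest, op, a, b))
    return out

block = [("i", "=", 2, None), ("j", "*", "i", 2),
         ("k", "+", "i", 2), ("m", "+", "k", "n")]
result = propagate_constants(block)
# result == [("i", "=", 2, None), ("j", "=", 4, None),
#            ("k", "=", 4, None), ("m", "+", 4, "n")]
```

The last instruction is only partially rewritten: k is replaced by 4, but n is unknown, so the addition stays.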

11 Redundancy elimination
Redundancy elimination = determining that two computations are equivalent and eliminating one. There are several types of redundancy elimination:
Value numbering: associates symbolic values with computations and identifies expressions that have the same value.
Common subexpression elimination: identifies expressions that have operands with the same name.
Constant/copy propagation: identifies variables that have constant/copy values and uses the constants/copies in place of the variables.
Partial redundancy elimination: inserts computations in paths to convert partial redundancy to full redundancy.

12 Redundancy elimination
read(i); j = i+1; k = i; n = k+1
i = 2; j = i*2; k = i+2
a = b * c; x = b * c

13 Value numbering
Goal: assign a symbolic value (called a value number) to each expression. Two expressions should be assigned the same value number if the compiler can prove that they will be equal for all inputs. Use the value numbers to find and eliminate redundant computations.
Extensions:
Take algebraic identities into consideration. Example: x*1 should be assigned the same value number as x.
Take commutativity into consideration. Example: x+y should be assigned the same value number as y+x.

14 Value numbering
How does it work? Supporting data structure: hash table.
For expression x+y, look up x and y to get their value numbers, xv and yv. At this stage, we can order the operands by value number (to take advantage of commutativity), apply algebraic simplifications, or even fold constants.
Look up (+, xv, yv) in the hash table.
If it is not there, insert it and give it a new value number. If the expression has a lhs, assign that value number to it. If the expression has no lhs, create a temporary, assign the value number to it, and insert a new instruction t=x+y into the basic block.
If it is there, then it already has a value number. Replace its computation by a reference to the variable with that value.
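
The lookup procedure above might be sketched as follows for a single basic block. The (dest, op, a, b) instruction encoding and the helper names are assumptions made for this example; every instruction here has a lhs, so the temporary-creation case is not shown. The sketch tracks the current value number of each variable, so a variable that has been redefined is never reused as the holder of an older value.

```python
def local_value_numbering(block):
    """Rewrite redundant (dest, op, a, b) instructions of one block as copies."""
    var_vn = {}     # variable or constant -> its current value number
    expr_vn = {}    # (op, vn_a, vn_b) -> value number of that expression
    holder = {}     # value number -> a variable that was assigned that value
    counter = 0

    def vn_of(operand):
        nonlocal counter
        if operand not in var_vn:      # first sight: give it a fresh number
            var_vn[operand] = counter
            counter += 1
        return var_vn[operand]

    out = []
    for dest, op, a, b in block:
        if op == "=":                  # a copy shares the source's value number
            var_vn[dest] = vn_of(a)
            out.append((dest, op, a, b))
            continue
        va, vb = vn_of(a), vn_of(b)
        if op in ("+", "*"):
            va, vb = sorted((va, vb))  # commutative: canonical operand order
        key = (op, va, vb)
        if key in expr_vn:
            vn = expr_vn[key]
            h = holder.get(vn)
            if h is not None and var_vn.get(h) == vn:
                out.append((dest, "=", h, None))   # reuse the earlier result
            else:
                out.append((dest, op, a, b))       # holder was redefined
                holder[vn] = dest
        else:
            vn = expr_vn[key] = counter
            counter += 1
            holder[vn] = dest
            out.append((dest, op, a, b))
        var_vn[dest] = vn
    return out

print(local_value_numbering([("a", "+", "x", "y"), ("b", "+", "y", "x")]))
# prints [('a', '+', 'x', 'y'), ('b', '=', 'a', None)]
```

Sorting the operand value numbers is what lets y+x match the earlier x+y.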

15 Value numbering
Consider this situation: z = x+y; z = w; v = x+y. The second x+y should not be replaced by z, because z was redefined since it was assigned x+y. How do we deal with this?
Option 1: Do not store the lhs of a computed expression in the ST, but its value number instead. Then, if the lhs is redefined, its value number will be different, so we will not do an invalid replacement.
Option 2: Every time an expression is evaluated, create a temporary to hold the result. The temporary will never be redefined, so the problem is avoided. The code shown above would be converted to: t1 = x+y; z = t1; z = w; v = t1.
Option 3: Apply the algorithm to the SSA form of that block. Then this problem is no longer an issue: z1 = x0+y0; z2 = w0; v0 = z1.
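
For Option 3, renaming a straight-line block into SSA form needs only a version counter per variable, since there are no joins and therefore no phi functions. The sketch below assumes (dest, op, a, b) tuples with string variable names. Its numbering convention is also an assumption: values live on entry get version 0 and the first definition inside the block gets version 1, so it produces v1 where the slide writes v0.

```python
def to_ssa(block):
    """Rename a straight-line block of (dest, op, a, b) tuples into SSA form."""
    version = {}                       # variable -> latest version number

    def use(operand):
        if not isinstance(operand, str):
            return operand             # constants and None are left alone
        # A variable read before any definition in the block is version 0.
        return f"{operand}{version.setdefault(operand, 0)}"

    out = []
    for dest, op, a, b in block:
        a, b = use(a), use(b)          # rename uses before the new definition
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}{version[dest]}", op, a, b))
    return out

block = [("z", "+", "x", "y"), ("z", "=", "w", None), ("v", "+", "x", "y")]
ssa = to_ssa(block)
# ssa == [("z1", "+", "x0", "y0"), ("z2", "=", "w0", None), ("v1", "+", "x0", "y0")]
```

After renaming, z1 is never redefined, so value numbering can safely replace the second x0+y0 with z1.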

16 Local value numbering
Algorithm sketch for local value numbering. Processing of instruction inst located at BB[n,i]:
hashval = Hash(inst.opd, inst.opr1, inst.op2)
If inst matches instruction inst2 in HT[hashval]: if inst2 has a lhs, use that in inst.
If inst has a lhs: remove all instructions in HT that use inst's lhs.
If inst has no lhs: create a new temp, insert temp=inst.rhs before inst, and replace inst with temp.
Add i to the equivalence class at hashval.

17 Local value numbering

18 Local value numbering

19 Local value numbering

20 Local value numbering

21 Local value numbering
Adding an is_constant entry to the value table, along with the value of the constant, would allow us to incorporate constant folding. We will use SSA numbering for a variable's value number and the actual value for a constant's value number.
Before: s1: a = 1 + 4; s2: b = 4 + 1; s3: c = a + i; s4: d = b + i; s5: a = a * d; s6: e = a + 2
After: s1: a = 5; s2: b = 5; s3: c = a + i; s4: d = c; s5: a = a * d; s6: e = a + 2
Hash table: (+,1,4) [s1, s2]; (+,a1,2) [s6]; (*,5,c0) [s5]; (+,5,i0) [s4]
Value table (name, value number, is_constant, value):
a   a1   F   5
b   5    T   5
i   i0   F   -
c   c0   F   -
d        F   -
e   e0   F   -

22 Local value numbering
With a bit of extra work, we might also do some local constant propagation on the fly.
Before: s1: a = 1 + 4; s2: b = 4 + 1; s3: c = a + i; s4: d = b + i; s5: a = a * d; s6: e = a + 2
After: s1: a = 5; s2: b = 5; s3: c = 5 + i; s4: d = c; s5: a = 5 * d; s6: e = a + 2
Hash table: (+,1,4) [s1, s2]; (+,a1,2) [s6]; (*,5,c0) [s5]; (+,5,i0) [s4]
Value table (name, value number, is_constant, value):
a   a1   F   5
b   5    T   5
i   i0   F   -
c   c0   F   -
d        F   -
e   e0   F   -
Applying the same algorithm to a BB that is in SSA form will simplify things.

23 Superlocal value numbering
Each path through the EBB should be handled separately. However, some blocks are prefixes of more than one path, and we would like to avoid recomputing the values in those blocks.
Possible solutions: use a mechanism similar to those for lexical scope handling; save the state of the table at the end of each BB.
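
The first solution, a table with nested scopes, might look like the sketch below. To keep it short, the example assumes the code is already in SSA form, so an expression key such as ("+", "a0", "b0") can be compared by name without tracking redefinitions. The class and function names, and the encoding of the EBB as a tree of blocks, are hypothetical.

```python
class ScopedTable:
    """Hash table with nested scopes, in the style of a scoped symbol table."""
    def __init__(self):
        self.scopes = [{}]
    def push(self):
        self.scopes.append({})
    def pop(self):
        self.scopes.pop()
    def lookup(self, key):
        for scope in reversed(self.scopes):    # innermost scope first
            if key in scope:
                return scope[key]
        return None
    def insert(self, key, value):
        self.scopes[-1][key] = value

def walk_ebb(block, children, exprs, table, redundant):
    """Visit one block, then its children, sharing the entries of the prefix."""
    table.push()
    for dest, key in exprs[block]:
        seen = table.lookup(key)
        if seen is not None:
            redundant.append((block, dest, seen))  # dest can reuse seen
        else:
            table.insert(key, dest)
    for child in children.get(block, []):
        walk_ebb(child, children, exprs, table, redundant)
    table.pop()                                    # discard this block's entries

children = {"A": ["B", "C"]}
exprs = {
    "A": [("t1", ("+", "a0", "b0"))],
    "B": [("t2", ("+", "a0", "b0")), ("t3", ("+", "c0", "d0"))],
    "C": [("t4", ("+", "c0", "d0")), ("t5", ("+", "a0", "b0"))],
}
redundant = []
walk_ebb("A", children, exprs, ScopedTable(), redundant)
# redundant == [("B", "t2", "t1"), ("C", "t5", "t1")]
```

Block A is processed once and its entries are visible on both paths, while t3 from block B is discarded before block C is visited, so t4 is correctly not treated as redundant.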

24 Global value numbering
Main idea: variable equivalence.
Two variables are equivalent at point P iff they are congruent and their defining assignments dominate P.
Two variables are congruent iff their definitions have identical operators and congruent operands.

25 Global value numbering
Data structure: the Value Graph.
Nodes are labeled with operators, function symbols, or constant values.
Nodes are named using SSA-form variables.
Edges point from operators or functions to operands.
Edges are labeled with numbers that indicate operand position.

26 Global value numbering -- SSA form with Value graph
In compiler design, static single assignment form (often abbreviated as SSA form or SSA) is an intermediate representation (IR) in which every variable is assigned exactly once. Existing variables in the original IR are split into versions, with new variables typically indicated by the original name with a subscript, so that every definition gets its own version. In SSA form, use-def chains are explicit and each contains a single element.

27 Global value numbering
In the Value Graph, two nodes are congruent iff they are the same node, OR their labels are constants and the constants have the same value, OR their labels are the same operator and their operands are congruent.
Algorithm sketch:
Partition the nodes into congruent sets.
The initial partition is optimistic: nodes with the same label are placed together. Note: an alternative would be a pessimistic version, where the initial sets are empty and then fill up in a monotonic way.
Iterate to a fixed point, splitting partitions where operands are not congruent.
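
The optimistic partitioning can be sketched as repeated refinement. The value graph encoding below, a dictionary from node name to (label, operand names), is an assumption made for the example, as is the small loop-shaped graph with a "phi" label. This simple version recomputes every signature on each round, which is easy to follow but not the efficient worklist formulation.

```python
def congruence_classes(graph):
    """graph maps node name -> (label, tuple of operand node names)."""
    # Optimistic start: all nodes with the same label share a class.
    cls = {n: label for n, (label, _) in graph.items()}
    while True:
        # Signature: own class plus the classes of the operands, by position.
        sig = {n: (cls[n], tuple(cls[o] for o in ops))
               for n, (_, ops) in graph.items()}
        if len(set(sig.values())) == len(set(cls.values())):
            break                      # no class was split: fixed point
        cls = sig
    groups = {}
    for n, c in cls.items():
        groups.setdefault(c, set()).add(n)
    return sorted(sorted(g) for g in groups.values())

graph = {
    "one": (1, ()), "i1": (1, ()), "j1": (1, ()),
    "i2": ("phi", ("i1", "i3")), "j2": ("phi", ("j1", "j3")),
    "i3": ("+", ("i2", "one")), "j3": ("+", ("j2", "one")),
}
print(congruence_classes(graph))
# prints [['i1', 'j1', 'one'], ['i2', 'j2'], ['i3', 'j3']]

graph["j3"] = ("-", ("j2", "one"))     # relabel one node
print(congruence_classes(graph))
# prints [['i1', 'j1', 'one'], ['i2'], ['i3'], ['j2'], ['j3']]
```

Because each signature includes the node's previous class, a round can only split classes, never merge them, so comparing class counts is enough to detect the fixed point. The second call shows how a single differing label propagates through the cycle and separates i2 from j2.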

28 Global value numbering

30 Initially, nodes that have the same label are placed in the same set. The initial partition is shown on the left. Nodes that are in the same set have the same color. i4 and j4 are congruent because their operands are congruent. Similarly, i5 and j5 are congruent. However, i4 and i5 are not, so the "red" partition needs to be split.
Exercise: How would the partitions change if i5 contained a minus?

31 The initial partition is shown on the left. Nodes that are in the same set have the same color. As you can see, i5 and j5 are not congruent this time, since they are labeled differently. This, in turn, means that i2 and j2 are not congruent, so that set should be split. As a result, i3 and j3 are no longer congruent, which causes i4 and j4 not to be congruent either. The final partition is shown on the next slide.

32 Got it? Check against the following questions
Machine-dependent optimization; local optimization; superlocal optimization; global optimization.
Means of optimization: algebraic simplification, constant folding, value numbering, value graph, congruence, SSA form, partitioning.

33 Thank you very much! Questions?