Common Sub-expression Elim Want to compute when an expression is available in a var Domain:

Presentation transcript:

Common Sub-expression Elim Want to compute when an expression is available in a var Domain:

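Flow functions (in = facts on the incoming edge, out = facts on the outgoing edge)
Statement X := Y op Z: $F_{X := Y \,op\, Z}(in) = in - \{X \to *\} - \{* \to \ldots X \ldots\} \cup \{X \to Y \,op\, Z \mid X \neq Y \land X \neq Z\}$
Statement X := Y: $F_{X := Y}(in) = in - \{X \to *\} - \{* \to \ldots X \ldots\} \cup \{X \to E \mid Y \to E \in in\}$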
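The two flow functions above can be sketched in Python as follows. This is only an illustration under stated assumptions: a fact X → E is modelled as a pair `(X, E)`, an expression as a tuple of tokens, and the names `kill`, `flow_binop` and `flow_copy` are hypothetical. The extra `x not in e` guard in `flow_copy` is not on the slide; it is added here so that the sketch never records a fact whose expression mentions the variable being overwritten.

```python
from typing import FrozenSet, Tuple

# A fact (X, E) means "variable X currently holds the value of expression E".
# An expression is a tuple of tokens, e.g. ("a", "+", "b").
Expr = Tuple[str, ...]
Fact = Tuple[str, Expr]
Facts = FrozenSet[Fact]


def kill(facts: Facts, x: str) -> Facts:
    """Remove facts X -> * and facts * -> ... X ... (anything invalidated by writing X)."""
    return frozenset((v, e) for (v, e) in facts if v != x and x not in e)


def flow_binop(facts: Facts, x: str, y: str, op: str, z: str) -> Facts:
    """Flow function for X := Y op Z."""
    out = kill(facts, x)
    if x != y and x != z:
        # Only record X -> Y op Z when the expression does not read X itself.
        out = out | {(x, (y, op, z))}
    return out


def flow_copy(facts: Facts, x: str, y: str) -> Facts:
    """Flow function for X := Y: X now holds every expression Y was known to hold."""
    out = kill(facts, x)
    # Assumption of this sketch: skip expressions that mention X.
    gen = {(x, e) for (v, e) in facts if v == y and x not in e}
    return out | gen
```

For instance, starting from no facts, `flow_binop(frozenset(), "v", "a", "+", "b")` records that v holds a + b, and a later assignment to a removes that fact again.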

Example (statements of the flow graph): x := read(); v := a + b; x := x + 1; w := x + 1; a := w; v := a + b; z := x + 1; t := a + b

Direction of analysis
Although constraints are not directional, flow functions are.
All flow functions we have seen so far are in the forward direction.
In some cases, the constraints are of the form in = F(out).
These are called backward problems.
Example: live variables
– compute the set of variables that may be live

Example: live variables
Set $D = 2^{Vars}$
Lattice: $(D, \sqsubseteq, \bot, \top, \sqcup, \sqcap) = (2^{Vars}, \subseteq, \emptyset, Vars, \cup, \cap)$
Statement x := y op z: $F_{x := y \,op\, z}(out) = (out - \{x\}) \cup \{y, z\}$
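As a rough illustration of this backward flow function, the sketch below iterates it to a fixed point over a small control-flow graph. The statement encoding (a pair of def and use sets per statement, plus a successor list) and the names `live_in` and `liveness` are assumptions made for this example, not part of the slides.

```python
def live_in(out, defs, uses):
    """Backward flow function: F(out) = (out - defs) | uses."""
    return (out - defs) | uses


def liveness(stmts, succ):
    """stmts[i] = (defs, uses) as sets; succ[i] = list of successor indices.

    Returns (lin, lout): the live sets before and after each statement,
    computed by iterating until nothing changes.
    """
    n = len(stmts)
    lin = [set() for _ in range(n)]
    lout = [set() for _ in range(n)]
    changed = True
    while changed:
        changed = False
        # Visiting statements in reverse order tends to converge faster
        # for a backward problem, but any order reaches the same result.
        for i in reversed(range(n)):
            new_out = set().union(*(lin[s] for s in succ[i]))  # join = union
            new_in = live_in(new_out, *stmts[i])
            if new_out != lout[i] or new_in != lin[i]:
                lout[i], lin[i] = new_out, new_in
                changed = True
    return lin, lout


# Hypothetical loop:  0: i := 0   1: if i < n goto 2 else 3
#                     2: i := i + 1 (back to 1)   3: return i
stmts = [({"i"}, set()), (set(), {"i", "n"}), ({"i"}, {"i"}), (set(), {"i"})]
succ = [[1], [2, 3], [1], []]
lin, lout = liveness(stmts, succ)
```

In this example only n is live before statement 0, since i is defined there before any use.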

Example: live variables
x := 5; y := x + 2; x := x + 1; y := x; ... y ...
How can we remove the x := x + 1 stmt?

Revisiting assignment
Statement x := y op z: $F_{x := y \,op\, z}(out) = (out - \{x\}) \cup \{y, z\}$
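One common way to use liveness for removing a dead assignment is sketched below for straight-line code. It assumes every statement is a side-effect-free assignment `target := f(operands)`; the function name `remove_dead_assignments` and the example program are hypothetical. The refinement shown (a dead assignment is dropped and contributes none of its operands to the live set) is a standard one and is not necessarily the exact revision intended on the slide.

```python
def remove_dead_assignments(stmts, live_at_exit):
    """stmts: list of (target, operands) in program order.

    Walks the code backward, keeping the set of live variables.
    An assignment whose target is not live afterwards is dropped,
    and its operands are NOT added to the live set.
    """
    live = set(live_at_exit)
    kept = []
    for target, operands in reversed(stmts):
        if target not in live:
            continue  # dead assignment: nothing reads this value
        live = (live - {target}) | set(operands)
        kept.append((target, operands))
    kept.reverse()
    return kept


# Hypothetical program: x := 5; y := x + 2; x := x + 1, with only y used afterwards.
program = [("x", []), ("y", ["x"]), ("x", ["x"])]
optimized = remove_dead_assignments(program, live_at_exit={"y"})
```

Here the final `x := x + 1` is removed because x is not live after it, while `x := 5` survives because `y := x + 2` reads it.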

Theory of backward analyses
Can formalize backward analyses in two ways:
Option 1: reverse flow graph, and then run forward problem
Option 2: re-develop the theory, but in the backward direction

Precision
Going back to constant prop, in what cases would we lose precision?
– if (p) { x := 5; } else { x := 4; } ... if (p) { y := x + 1 } else { y := x + 2 } ... y ...
– if (...) { x := -1; } else { x := 1; } y := x * x; ... y ...
– x := 5; if ( ) { x := 6 } ... x ..., where the condition of the if is equiv to false

Precision
The first problem: Unreachable code
– solution: run unreachable code removal before
– the unreachable code removal analysis will do its best, but may not remove all unreachable code
The other two problems are path-sensitivity issues
– Branch correlations: some paths are infeasible
– Path merging: can lead to loss of precision

MOP: meet over all paths
Information computed at a given point is the meet of the information computed by each path to the program point
if (...) { x := -1; } else { x := 1; } y := x * x; ... y ...

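MOP
For a path p, which is a sequence of statements $[s_1, \ldots, s_n]$, define: $F_p(in) = F_{s_n}(\ldots F_{s_1}(in) \ldots)$
In other words: $F_p = F_{s_n} \circ \cdots \circ F_{s_1}$
Given an edge e, let paths-to(e) be the (possibly infinite) set of paths that lead to e
Given an edge e, $MOP(e) = \bigsqcup_{p \in \text{paths-to}(e)} F_p(entry)$, where entry is the information at the start of the program
For us, should be called JOP (ie: join, not meet)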
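The definition can be made concrete with a small sketch that only works when the set of paths is finite and given explicitly (in general it is not, which is why MOP is not computable by enumeration). The names `path_function`, `mop` and `gen` are assumptions of this example, and facts are modelled as frozensets joined by union.

```python
from functools import reduce


def path_function(flow_fns):
    """Build F_p by applying F_{s_1} first and F_{s_n} last."""
    def f_p(info):
        for f in flow_fns:
            info = f(info)
        return info
    return f_p


def mop(paths, entry_info, join):
    """Join of F_p(entry_info) over an explicit, finite, non-empty list of paths."""
    return reduce(join, (path_function(p)(entry_info) for p in paths))


def gen(fact):
    """Hypothetical flow function that just adds one fact."""
    return lambda info: info | {fact}


# Two paths reach the same edge: [s_a] and [s_b, s_c].
paths = [[gen("a")], [gen("b"), gen("c")]]
result = mop(paths, frozenset(), frozenset.union)
```

With union as the join, `result` contains the facts generated along either path.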

MOP vs. dataflow
MOP is the “best” possible answer, given a fixed set of flow functions
– This means that $MOP \sqsubseteq dataflow$ at every edge in the CFG
In general, MOP is not computable (because there can be infinitely many paths)
– vs dataflow, which is generally computable (if flow fns are monotonic and height of lattice is finite)
And we saw in our example that, in general, $MOP \neq dataflow$

MOP vs. dataflow
However, it would be great if by imposing some restrictions on the flow functions, we could guarantee that dataflow is the same as MOP. What would this restriction be?
Dataflow: x := -1; on one branch and x := 1; on the other, Merge, then y := x * x; ... y ...
MOP: x := -1; y := x * x; ... y ... on one path and x := 1; y := x * x; ... y ... on the other, then Merge

MOP vs. dataflow
However, it would be great if by imposing some restrictions on the flow functions, we could guarantee that dataflow is the same as MOP. What would this restriction be?
Distributive problems. A problem is distributive if: $\forall a, b.\ F(a \sqcup b) = F(a) \sqcup F(b)$
If flow function is distributive, then MOP = dataflow
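To see why constant propagation is not distributive, the sketch below evaluates the flow function for y := x * x both ways on the x := -1 / x := 1 example. The lattice encoding (`BOT` for no information, `TOP` for "not a constant") and the helper names are assumptions of this illustration.

```python
TOP = "T"    # variable is not a known constant
BOT = None   # no information yet


def join(a, b):
    """Join of two abstract values in the flat constant lattice."""
    if a is BOT:
        return b
    if b is BOT:
        return a
    return a if a == b else TOP


def join_env(e1, e2):
    """Pointwise join of two maps from variables to abstract values."""
    return {v: join(e1.get(v, BOT), e2.get(v, BOT)) for v in e1.keys() | e2.keys()}


def square_x_into_y(env):
    """Flow function for y := x * x."""
    x = env.get("x", BOT)
    out = dict(env)
    out["y"] = x if (x is BOT or x == TOP) else x * x
    return out


a, b = {"x": -1}, {"x": 1}
mop_result = join_env(square_x_into_y(a), square_x_into_y(b))   # F(a) join F(b)
dataflow_result = square_x_into_y(join_env(a, b))               # F(a join b)
```

Applying the flow function per path and joining afterwards keeps y = 1, whereas joining first turns x into `TOP` and therefore loses the value of y, so F(a ⊔ b) ≠ F(a) ⊔ F(b) here.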

Summary of precision
Dataflow is the basic algorithm
To basic dataflow, we can add path-separation
– Get MOP, which is same as dataflow for distributive problems
– Variety of research efforts to get closer to MOP for non-distributive problems
To basic dataflow, we can add path-pruning
– Get branch correlation
To basic dataflow, can add both:
– meet over all feasible paths

Program Representations

Representing programs
Primary goals
– analysis is easy and effective: just a few cases to handle, directly link related things
– transformations are easy to perform
– general, across input languages and target machines
Additional goals
– compact in memory
– easy to translate to and from
– tracks info from source through to binary, for source-level debugging, profiling, typed binaries
– extensible (new opts, targets, language features)
– displayable

Option 1: high-level syntax based IR
Represent source-level structures and expressions directly
Example: Abstract Syntax Tree

Option 2: low-level IR
Translate input programs into low-level primitive chunks, often close to the target machine
Examples: assembly code, virtual machine code (e.g. stack machines), three-address code, register-transfer language (RTL)
Standard RTL instrs:

Comparison
Advantages of high-level rep
– analysis can exploit high-level knowledge of constructs
– easy to map to source code (debugging, profiling)
Advantages of low-level rep
– can do low-level, machine specific reasoning
– can be language-independent
Can mix multiple reps in the same compiler

Components of representation
Control dependencies: sequencing of operations
– evaluation of if & then
– side-effects of statements occur in right order
Data dependencies: flow of definitions from defs to uses
– operands computed before operations
Ideal: represent just those dependencies that matter
– dependencies constrain transformations
– fewest dependences $\Rightarrow$ flexibility in implementation

Control dependencies
Option 1: high-level representation
– control implicit in semantics of AST nodes
Option 2: control flow graph (CFG)
– nodes are individual instructions
– edges represent control flow between instructions
Option 2b: CFG with basic blocks
– basic block: a sequence of instructions that contains no branches and has a single entry point
– BB can make analysis more efficient: compute flow functions for an entire BB before start of analysis
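A minimal sketch of how instructions might be grouped into basic blocks, using the usual "leader" rule: the first instruction, every branch target, and every instruction following a branch start a new block. The instruction encoding `(opcode, target_index)` and the opcode names `jmp`, `br` and `ret` are assumptions of this example.

```python
BRANCH_OPS = {"jmp", "br", "ret"}  # hypothetical control-transfer opcodes


def basic_blocks(instrs):
    """instrs: list of (opcode, target_index_or_None).

    Returns a list of half-open (start, end) index ranges, one per basic block.
    """
    if not instrs:
        return []
    leaders = {0}  # the first instruction always starts a block
    for i, (op, target) in enumerate(instrs):
        if op in BRANCH_OPS:
            if target is not None:
                leaders.add(target)      # a branch target starts a block
            if i + 1 < len(instrs):
                leaders.add(i + 1)       # so does the instruction after a branch
    starts = sorted(leaders)
    return list(zip(starts, starts[1:] + [len(instrs)]))


# Hypothetical loop:
# 0: i := 0   1: if i >= n goto 4   2: i := i + 1   3: goto 1   4: return i
code = [("mov", None), ("br", 4), ("add", None), ("jmp", 1), ("ret", None)]
blocks = basic_blocks(code)
```

Each resulting range has a single entry point (its leader) and contains at most one branch, as its last instruction.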

Control dependencies
CFG does not capture loops very well
Some fancier options include:
– the Control Dependence Graph
– the Program Dependence Graph
More on this later. Let’s first look at data dependencies

Data dependencies Simplest way to represent data dependencies: def/use chains

Def/use chains
Directly captures dataflow
– works well for things like constant prop
But...
Ignores control flow
– misses some opt opportunities since conservatively considers all paths
– not executable by itself (for example, need to keep CFG around)
– not appropriate for code motion transformations
Must update after each transformation
Space consuming

SSA
Static Single Assignment
– invariant: each use of a variable has only one def

SSA
Create a new variable for each def
Insert $\phi$ pseudo-assignments at merge points
Adjust uses to refer to appropriate new names
Question: how can one figure out where to insert $\phi$ nodes using a liveness analysis and a reaching defns analysis?
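The first and third steps (a new name per def, uses adjusted to the latest name) can be sketched for straight-line code, where no merge points exist and so no φ pseudo-assignments are needed. The function name `rename_straight_line`, the `name_version` naming scheme and the example program are assumptions of this sketch; it does not attempt φ placement.

```python
def rename_straight_line(stmts):
    """stmts: list of (target, operands) in program order.

    Gives each definition a fresh versioned name and rewrites each use
    to the most recent version. Operands never defined in `stmts`
    (inputs, constants) are left unchanged.
    """
    version = {}

    def use(name):
        return f"{name}_{version[name]}" if name in version else name

    renamed = []
    for target, operands in stmts:
        new_operands = [use(o) for o in operands]   # uses see the old version
        version[target] = version.get(target, 0) + 1
        renamed.append((f"{target}_{version[target]}", new_operands))
    return renamed


# Hypothetical program: x := 5; y := x + 2; x := x + 1
program = [("x", ["5"]), ("y", ["x", "2"]), ("x", ["x", "1"])]
ssa = rename_straight_line(program)
```

After renaming, the second assignment to x defines `x_2` and reads `x_1`, so every use refers to exactly one definition.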

Converting back from SSA
Semantics of $x_3 := \phi(x_1, x_2)$
– set $x_3$ to $x_i$ if execution came from ith predecessor
How to implement $\phi$ nodes?
– Insert assignment $x_3 := x_1$ along 1st predecessor
– Insert assignment $x_3 := x_2$ along 2nd predecessor
If register allocator assigns $x_1$, $x_2$ and $x_3$ to the same register, these moves can be removed
– $x_1 .. x_n$ usually have non-overlapping lifetimes, so this kind of register assignment is legal
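The copy-insertion scheme can be sketched as follows. The data layout is an assumption of this example: `blocks` maps a block name to a list of statement strings without explicit terminators, and `phis` maps a block name to its φ nodes, each given as a target plus one `(predecessor, source)` pair per incoming edge. The sketch appends copies at the end of each predecessor and ignores complications such as critical edges or several φ nodes that read each other's targets.

```python
def eliminate_phis(blocks, phis):
    """Replace each phi by a copy at the end of every predecessor block.

    blocks: dict block_name -> list of statement strings
    phis:   dict block_name -> list of (target, [(pred_block, source_var), ...])
    Returns a new dict; the inputs are not modified.
    """
    out = {name: list(body) for name, body in blocks.items()}
    for _block, phi_list in phis.items():
        for target, sources in phi_list:
            for pred, src in sources:
                # x3 := phi(x1, x2) becomes "x3 := x1" on the first
                # predecessor and "x3 := x2" on the second.
                out[pred].append(f"{target} := {src}")
    return out


# Hypothetical diamond: two branches define x1 and x2, the join block reads x3.
blocks = {"then": ["x1 := -1"], "else": ["x2 := 1"], "join": ["y1 := x3 * x3"]}
phis = {"join": [("x3", [("then", "x1"), ("else", "x2")])]}
lowered = eliminate_phis(blocks, phis)
```

If a register allocator later places x1, x2 and x3 in the same register, the inserted copies become no-ops and can be deleted.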