Slide 1: Data Flow Analysis Foundations
School of EECS, Peking University, "Advanced Compiler Techniques" (Fall 2011)
Guo, Yao
Part of the slides are adapted from MIT 6.035 "Computer Language Engineering".

Slide 2: Dataflow Analysis
Compile-time reasoning about run-time values of variables or expressions at different program points:
- Which assignment statements produced the value of a variable at this point?
- Which variables contain values that are no longer used after this program point?
- What is the range of possible values of a variable at this program point?

Slide 3: Program Representation
Control flow graph (CFG):
- Nodes N: the statements of the program
- Edges E: the flow of control
- pred(n): the set of all predecessors of n
- succ(n): the set of all successors of n
- Start node n0
- Set of final nodes N_final
- Could also add special ENTRY and EXIT nodes

Slide 4: Program Points
- One program point before each node
- One program point after each node
- Join point: a point with multiple predecessors
- Split point: a point with multiple successors

Slide 5: Basic Idea
- Information about the program is represented using values from an algebraic structure called a lattice
- The analysis produces a lattice value for each program point
- Two flavors of analysis: forward dataflow analysis and backward dataflow analysis

Slide 6: Forward Dataflow Analysis
- The analysis propagates values forward through the control flow graph, with the flow of control
- Each node has a transfer function f
  - Input: the value at the program point before the node
  - Output: the new value at the program point after the node
- Values flow from the program points after predecessor nodes to the program points before successor nodes
- At join points, values are combined using a merge function
- Canonical example: reaching definitions

Slide 7: Backward Dataflow Analysis
- The analysis propagates values backward through the control flow graph, against the flow of control
- Each node has a transfer function f
  - Input: the value at the program point after the node
  - Output: the new value at the program point before the node
- Values flow from the program points before successor nodes to the program points after predecessor nodes
- At split points, values are combined using a merge function
- Canonical example: live variables
Slide 8: Partial Orders
- A set P
- A partial order ⊑ such that for all x, y, z ∈ P:
  - x ⊑ x (reflexivity)
  - x ⊑ y and y ⊑ x implies x = y (antisymmetry)
  - x ⊑ y and y ⊑ z implies x ⊑ z (transitivity)
- The partial order can be used to define upper and lower bounds, the least upper bound, and the greatest lower bound
Slide 9: Upper Bounds
If S ⊆ P, then:
- x ∈ P is an upper bound of S if ∀y ∈ S. y ⊑ x
- x ∈ P is the least upper bound of S if
  - x is an upper bound of S, and
  - x ⊑ y for all upper bounds y of S
- ⊔: join, least upper bound (lub), supremum, sup
  - ⊔S is the least upper bound of S
  - x ⊔ y is the least upper bound of {x, y}

Slide 10: Lower Bounds
If S ⊆ P, then:
- x ∈ P is a lower bound of S if ∀y ∈ S. x ⊑ y
- x ∈ P is the greatest lower bound of S if
  - x is a lower bound of S, and
  - y ⊑ x for all lower bounds y of S
- ⊓: meet, greatest lower bound (glb), infimum, inf
  - ⊓S is the greatest lower bound of S
  - x ⊓ y is the greatest lower bound of {x, y}

Slide 11: Covering
- x ⊏ y if x ⊑ y and x ≠ y
- x is covered by y (y covers x) if
  - x ⊏ y, and
  - x ⊑ z ⊏ y implies x = z
- Conceptually, y covers x if there are no elements between x and y

Slide 12: Example
- P = {000, 001, 010, 011, 100, 101, 110, 111} (the standard Boolean lattice, also called the hypercube)
- x ⊑ y if (x bitwise-and y) = x
- Hasse diagram: if y covers x, draw a line from y to x, with y above x in the diagram
- (The slide shows the Hasse diagram: 111 at the top; 011, 101, 110 below it; 001, 010, 100 below those; 000 at the bottom.)
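To make the hypercube example concrete, here is a minimal Python sketch (mine, not from the slides) of the three-bit Boolean lattice, with the order defined exactly as above: x ⊑ y iff x & y == x, so the join is bitwise or and the meet is bitwise and.

    # The Boolean lattice {000, ..., 111}, encoded as the integers 0..7.
    P = range(8)

    def leq(x, y):
        # x is below y in the lattice iff (x bitwise-and y) == x
        return (x & y) == x

    def join(x, y):   # least upper bound of {x, y}
        return x | y

    def meet(x, y):   # greatest lower bound of {x, y}
        return x & y

    # Sanity checks: join and meet really are the lub and glb with respect to leq,
    # and the connecting lemma of the later slides holds.
    for x in P:
        for y in P:
            assert leq(x, join(x, y)) and leq(y, join(x, y))
            assert leq(meet(x, y), x) and leq(meet(x, y), y)
            assert leq(x, y) == (join(x, y) == y) == (meet(x, y) == x)

    print(format(join(0b010, 0b001), "03b"))  # 011
    print(format(meet(0b011, 0b101), "03b"))  # 001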
Slide 13: Lattices
- If x ⊔ y and x ⊓ y exist for all x, y ∈ P, then P is a lattice
- If ⊔S and ⊓S exist for all S ⊆ P, then P is a complete lattice
- All finite lattices are complete

Slide 14: Lattices
- If x ⊔ y and x ⊓ y exist for all x, y ∈ P, then P is a lattice
- If ⊔S and ⊓S exist for all S ⊆ P, then P is a complete lattice
- All finite lattices are complete
- Example of a lattice that is not complete: the integers I
  - For any x, y ∈ I, x ⊔ y = max(x, y) and x ⊓ y = min(x, y)
  - But ⊔I and ⊓I do not exist
  - I ∪ {+∞, -∞} is a complete lattice

Slide 15: Lattice Examples
- Lattices
- Non-lattices
- (The slide shows Hasse diagrams of example lattices and non-lattices; the diagrams are not reproduced here.)
Slide 16: Semi-Lattice
- Only one of the two binary operations (meet or join) exists
- Meet-semilattice: x ⊓ y exists for all x, y ∈ P
- Join-semilattice: x ⊔ y exists for all x, y ∈ P
Slide 17: Top and Bottom
- The greatest element of P (if it exists) is top (⊤)
- The least element of P (if it exists) is bottom (⊥)

Slide 18: Connection Between ⊑, ⊔, and ⊓
The following three properties are equivalent:
- x ⊑ y
- x ⊔ y = y
- x ⊓ y = x
Will prove:
- x ⊑ y implies x ⊔ y = y and x ⊓ y = x
- x ⊔ y = y implies x ⊑ y
- x ⊓ y = x implies x ⊑ y
Then, by transitivity, we can obtain:
- x ⊔ y = y implies x ⊓ y = x
- x ⊓ y = x implies x ⊔ y = y

Slide 19: Connecting Lemma Proofs
Proof of x ⊑ y implies x ⊔ y = y:
- x ⊑ y implies y is an upper bound of {x, y}
- Any upper bound z of {x, y} must satisfy y ⊑ z
- So y is the least upper bound of {x, y}, and x ⊔ y = y
Proof of x ⊑ y implies x ⊓ y = x:
- x ⊑ y implies x is a lower bound of {x, y}
- Any lower bound z of {x, y} must satisfy z ⊑ x
- So x is the greatest lower bound of {x, y}, and x ⊓ y = x

Slide 20: Connecting Lemma Proofs
Proof of x ⊔ y = y implies x ⊑ y:
- y is an upper bound of {x, y}, which implies x ⊑ y
Proof of x ⊓ y = x implies x ⊑ y:
- x is a lower bound of {x, y}, which implies x ⊑ y

Slide 21: Lattices as Algebraic Structures
- Have defined ⊔ and ⊓ in terms of ⊑
- Will now define ⊑ in terms of ⊔ and ⊓
  - Start with ⊔ and ⊓ as arbitrary algebraic operations that satisfy the associative, commutative, idempotence, and absorption laws
  - Will define ⊑ using ⊔ and ⊓
  - Will show that ⊑ is a partial order
- Intuitive concept of ⊔ and ⊓ as information-combination operators (or, and)
Slide 22: Lattice as Algebra
Theorem: Let <S, ∗, ∘> be an algebraic system with two binary operations ∗ and ∘. If ∗ and ∘ satisfy the commutative, associative, and absorption laws, then a partial order ≼ can be suitably defined on S such that <S, ≼> forms a lattice, and for all a, b ∈ S, a ∧ b = a ∗ b and a ∨ b = a ∘ b.
Definition: Let <S, ∗, ∘> be an algebraic system in which ∗ and ∘ are binary operations. If ∗ and ∘ satisfy the commutative, associative, and absorption laws, then <S, ∗, ∘> forms a lattice.
Slide 23: Algebraic Properties of Lattices
Assume arbitrary operations ⊔ and ⊓ such that:
- (x ⊔ y) ⊔ z = x ⊔ (y ⊔ z) (associativity of ⊔)
- (x ⊓ y) ⊓ z = x ⊓ (y ⊓ z) (associativity of ⊓)
- x ⊔ y = y ⊔ x (commutativity of ⊔)
- x ⊓ y = y ⊓ x (commutativity of ⊓)
- x ⊔ x = x (idempotence of ⊔)
- x ⊓ x = x (idempotence of ⊓)
- x ⊔ (x ⊓ y) = x (absorption of ⊔ over ⊓)
- x ⊓ (x ⊔ y) = x (absorption of ⊓ over ⊔)

Slide 24: Connection Between ⊓ and ⊔
x ⊔ y = y if and only if x ⊓ y = x.
Proof of x ⊔ y = y implies x = x ⊓ y:
  x = x ⊓ (x ⊔ y)   (by absorption)
    = x ⊓ y         (by assumption)
Proof of x ⊓ y = x implies y = x ⊔ y:
  y = y ⊔ (y ⊓ x)   (by absorption)
    = y ⊔ (x ⊓ y)   (by commutativity)
    = y ⊔ x         (by assumption)
    = x ⊔ y         (by commutativity)

Slide 25: Properties of ⊑
Define x ⊑ y if x ⊔ y = y.
Proof of the transitivity property. Must show that x ⊔ y = y and y ⊔ z = z implies x ⊔ z = z:
  x ⊔ z = x ⊔ (y ⊔ z)   (by assumption)
        = (x ⊔ y) ⊔ z   (by associativity)
        = y ⊔ z         (by assumption)
        = z             (by assumption)
Slide 26: Properties of ⊑
Proof of the antisymmetry property. Must show that x ⊔ y = y and y ⊔ x = x implies x = y:
  x = y ⊔ x   (by assumption)
    = x ⊔ y   (by commutativity)
    = y       (by assumption)
Proof of the reflexivity property. Must show that x ⊑ x:
  x ⊔ x = x   (by idempotence)

Slide 27: Monotonic Function and Fixed Point
- Let L be a lattice. A function f : L → L is monotonic if ∀x, y ∈ L: x ⊑ y ⇒ f(x) ⊑ f(y)
- Let A be a set, f : A → A a function, and a ∈ A. If f(a) = a, then a is called a fixed point of f on A
Slide 28: Existence of Fixed Points
- The height of a lattice is defined to be the length of the longest path from ⊥ to ⊤
- In a complete lattice L with finite height, every monotonic function f : L → L has a unique least fixed point:
    fix(f) = ⊔_{i ≥ 0} f^i(⊥)
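As an illustration of this theorem (a sketch of mine, not code from the course), the following Python function walks up the lattice from ⊥, computing f^i(⊥) until the value stabilizes; the example lattice is the powerset of {1, 2, 3} ordered by inclusion, and the function f is a made-up monotonic map.

    def least_fixed_point(f, bottom):
        # Iterate bottom, f(bottom), f(f(bottom)), ... until the value stabilizes.
        # This terminates when the lattice has finite height and f is monotonic.
        x = bottom
        while True:
            nxt = f(x)
            if nxt == x:
                return x
            x = nxt

    # Example lattice: the powerset of {1, 2, 3}, ordered by subset inclusion.
    bottom = frozenset()

    def f(s):
        # A monotonic function: always add 1, and add 3 once 1 is present.
        out = set(s) | {1}
        if 1 in s:
            out.add(3)
        return frozenset(out)

    print(least_fixed_point(f, bottom))   # frozenset({1, 3})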
Slide 29: Another Existence Theorem
- A function f : L → L on a complete lattice L is called continuous if for any subset X of L, f(⊔X) = ⊔f(X)
- Let L be a complete lattice. If f : L → L is monotonic and continuous, then f has a unique least fixed point:
    fix(f) = ⊔_{i ≥ 0} f^i(⊥)

Slide 30: Knaster-Tarski Fixed Point Theorem
- Suppose (L, ⊑) is a complete lattice and f : L → L is a monotonic function
- Then the least fixed point m of f can be defined as
    m = ⊓ { x ∈ L | f(x) ⊑ x }
Slide 31: Calculating Fixed Points
The time complexity of computing a fixed point depends on three factors:
- The height of the lattice, since this provides a bound for the number of iterations i
- The cost of computing f
- The cost of testing equality
The computation of a fixed point can be illustrated as a walk up the lattice starting at ⊥.
Slide 32: Fixed Point Solution
Let L be a lattice with finite height. An equation system has the form
  x1 = F1(x1, ..., xn)
  x2 = F2(x1, ..., xn)
  ...
  xn = Fn(x1, ..., xn)
where the xi are variables and the Fi : L^n → L are monotone functions. The solution can be obtained from the least fixed point of the function F : L^n → L^n defined by
  F(x1, ..., xn) = (F1(x1, ..., xn), ..., Fn(x1, ..., xn))

Slide 33: Algorithm
The naive algorithm computes the fixed point x as follows:
  x = (⊥, ..., ⊥);
  do {
    t = x;
    x = F(x);
  } while (x ≠ t);
A better algorithm computes the fixed point (x1, ..., xn) as:
  x1 = ⊥; ...; xn = ⊥;
  do {
    t1 = x1; ...; tn = xn;
    x1 = F1(x1, ..., xn);
    ...
    xn = Fn(x1, ..., xn);
  } while (x1 ≠ t1 ∨ ... ∨ xn ≠ tn);
Why better? Each Fi in the second version already sees the components updated earlier in the same round, so information propagates faster and the iteration typically needs fewer rounds; a sketch of both versions follows.
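The difference between the two algorithms can be seen in a short Python sketch (mine, not from the slides) that solves a made-up two-variable equation system over the powerset of {1}; the in-place version converges in one fewer round because F2 already sees the updated x1.

    BOTTOM = frozenset()

    def F1(x1, x2):
        return x1 | {1}        # monotone in both arguments

    def F2(x1, x2):
        return x2 | x1         # monotone in both arguments

    # Naive algorithm: update the whole tuple at once.
    x = (BOTTOM, BOTTOM)
    rounds_naive = 0
    while True:
        rounds_naive += 1
        t = x
        x = (F1(*t), F2(*t))
        if x == t:
            break

    # Better algorithm: update the components in place, reusing new values.
    x1 = x2 = BOTTOM
    rounds_better = 0
    while True:
        rounds_better += 1
        t1, t2 = x1, x2
        x1 = F1(x1, x2)
        x2 = F2(x1, x2)        # already sees the updated x1
        if x1 == t1 and x2 == t2:
            break

    print(rounds_naive, rounds_better)   # 3 2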
Slide 34: Chains
- A set S is a chain if ∀x, y ∈ S. y ⊑ x or x ⊑ y
- P has no infinite chains if every chain in P is finite
- P satisfies the ascending chain condition if for all sequences x1 ⊑ x2 ⊑ ... there exists n such that xn = xn+1 = ...

Slide 35: Application to Dataflow Analysis
- Dataflow information will be lattice values
- Transfer functions operate on lattice values
- The solution algorithm will generate an increasing sequence of values at each program point
- The ascending chain condition will ensure termination
- Will use ⊔ to combine values at control-flow join points

Slide 36: Transfer Functions
- A transfer function f : P → P for each node in the control flow graph
- f models the effect of the node on the program information

Slide 37: Transfer Functions
Each dataflow analysis problem has a set F of transfer functions f : P → P:
- The identity function i ∈ F
- F must be closed under composition: ∀f, g ∈ F, the function h = λx. f(g(x)) is in F
- Each f ∈ F must be monotone: x ⊑ y implies f(x) ⊑ f(y)
- Sometimes all f ∈ F are distributive: f(x ⊔ y) = f(x) ⊔ f(y)
- Distributivity implies monotonicity

Slide 38: Distributivity Implies Monotonicity
Proof that distributivity implies monotonicity:
- Assume f(x ⊔ y) = f(x) ⊔ f(y)
- Must show: x ⊔ y = y implies f(x) ⊔ f(y) = f(y)
    f(y) = f(x ⊔ y)     (by assumption)
         = f(x) ⊔ f(y)  (by distributivity)

Slide 39: Putting the Pieces Together
Forward dataflow analysis framework: simulates execution of the program forward, with the flow of control.

Slide 40: Forward Dataflow Analysis
- Simulates execution of the program forward, with the flow of control
- For each node n, we have
  - in_n: the value at the program point before n
  - out_n: the value at the program point after n
  - f_n: the transfer function for n (given in_n, computes out_n)
- Require that the solution satisfy
  - ∀n. out_n = f_n(in_n)
  - ∀n ≠ n0. in_n = ⊔ { out_m | m ∈ pred(n) }
  - in_n0 = I, where I summarizes the information at the start of the program

Slide 41: Dataflow Equations
- The compiler processes the program to obtain a set of dataflow equations:
    out_n := f_n(in_n)
    in_n := ⊔ { out_m | m ∈ pred(n) }
- Conceptually separates the analysis problem from the program

Slide 42: Worklist Algorithm for Solving Forward Dataflow Equations
  for each n do out_n := f_n(⊥)
  in_n0 := I; out_n0 := f_n0(I)
  worklist := N - {n0}
  while worklist ≠ ∅ do
    remove a node n from worklist
    in_n := ⊔ { out_m | m ∈ pred(n) }
    out_n := f_n(in_n)
    if out_n changed then
      worklist := worklist ∪ succ(n)
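A direct Python transcription of this worklist algorithm (a sketch with my own parameter names, not code from the course): the caller supplies the CFG as predecessor and successor maps, one transfer function per node, the join, ⊥, and the entry value I.

    def forward_worklist(nodes, entry, pred, succ, transfer, join, bottom, entry_value):
        # nodes: iterable of CFG nodes; entry: the start node n0
        # pred/succ: dicts mapping each node to its predecessor/successor lists
        # transfer: dict mapping each node n to its transfer function f_n
        # join: binary least upper bound; bottom: the least lattice element
        # entry_value: the value I at the program point before the entry node
        out = {n: transfer[n](bottom) for n in nodes}
        inn = {n: bottom for n in nodes}
        inn[entry] = entry_value
        out[entry] = transfer[entry](entry_value)
        worklist = set(nodes) - {entry}
        while worklist:
            n = worklist.pop()
            value = bottom
            for m in pred[n]:
                value = join(value, out[m])
            inn[n] = value
            new_out = transfer[n](value)
            if new_out != out[n]:
                out[n] = new_out
                worklist |= set(succ[n])
        return inn, out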
Slide 43: Correctness Argument
Why the result satisfies the dataflow equations:
- Whenever the algorithm processes a node n, it sets out_n := f_n(in_n), so it ensures that out_n = f_n(in_n)
- Whenever out_m changes, succ(m) is put on the worklist. Consider any node n ∈ succ(m). It will eventually come off the worklist, and the algorithm will set
    in_n := ⊔ { out_m | m ∈ pred(n) }
  to ensure that in_n = ⊔ { out_m | m ∈ pred(n) }
- So the final solution will satisfy the dataflow equations

Slide 44: Termination Argument
Why does the algorithm terminate?
- The sequence of values taken on by in_n or out_n is a chain. If the values stop increasing, the worklist empties and the algorithm terminates.
- If the lattice has the ascending chain property, the algorithm terminates
  - The algorithm terminates for finite lattices
  - For lattices without the ascending chain property, use a widening operator

Slide 45: Reaching Definitions
- P = powerset of the set of all definitions in the program (all subsets of the set of definitions in the program)
- ⊔ = ∪ (the order is ⊆)
- ⊥ = ∅
- I = in_n0 = ∅
- F = all functions f of the form f(x) = a ∪ (x - b)
  - b is the set of definitions that the node kills
  - a is the set of definitions that the node generates
- General pattern for many transfer functions: f(x) = GEN ∪ (x - KILL)
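A minimal sketch of this GEN/KILL instantiation on a hypothetical four-node CFG (an entry A, a loop head B, a loop body C that redefines x, and an exit D); the node names and definition labels d1..d3 are mine, and a simple round-robin iteration stands in for the worklist.

    # Hypothetical definitions:  d1: x = 0 (in A)   d2: y = 1 (in A)   d3: x = x + y (in C)
    GEN   = {"A": {"d1", "d2"}, "B": set(),      "C": {"d3"}, "D": set()}
    KILL  = {"A": {"d3"},       "B": set(),      "C": {"d1"}, "D": set()}
    PRED  = {"A": [],           "B": ["A", "C"], "C": ["B"],  "D": ["B"]}
    NODES = ["A", "B", "C", "D"]

    def transfer(n, x):
        return GEN[n] | (x - KILL[n])              # f(x) = GEN ∪ (x - KILL)

    inn = {n: set() for n in NODES}                # I = ∅ at the entry
    out = {n: set() for n in NODES}
    changed = True
    while changed:                                 # iterate until a fixed point is reached
        changed = False
        for n in NODES:
            inn[n] = set().union(*(out[m] for m in PRED[n]))
            new_out = transfer(n, inn[n])
            if new_out != out[n]:
                out[n] = new_out
                changed = True

    print(sorted(inn["B"]))   # ['d1', 'd2', 'd3']: both definitions of x (d1, d3) reach the loop head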
Slide 46: Does the Reaching Definitions Framework Satisfy the Properties?
- ⊆ satisfies the conditions for a partial order
  - x ⊆ y and y ⊆ z implies x ⊆ z (transitivity)
  - x ⊆ y and y ⊆ x implies y = x (antisymmetry)
  - x ⊆ x (reflexivity)
- F satisfies the transfer-function conditions
  - λx. ∅ ∪ (x - ∅) = λx. x ∈ F (identity)
  - Will show f(x ∪ y) = f(x) ∪ f(y) (distributivity):
      f(x) ∪ f(y) = (a ∪ (x - b)) ∪ (a ∪ (y - b))
                  = a ∪ (x - b) ∪ (y - b)
                  = a ∪ ((x ∪ y) - b)
                  = f(x ∪ y)
Slide 47: Does the Reaching Definitions Framework Satisfy the Properties?
What about composition?
- Given f1(x) = a1 ∪ (x - b1) and f2(x) = a2 ∪ (x - b2)
- Must show f1(f2(x)) can be expressed as a ∪ (x - b):
    f1(f2(x)) = a1 ∪ ((a2 ∪ (x - b2)) - b1)
              = a1 ∪ ((a2 - b1) ∪ ((x - b2) - b1))
              = (a1 ∪ (a2 - b1)) ∪ ((x - b2) - b1)
              = (a1 ∪ (a2 - b1)) ∪ (x - (b2 ∪ b1))
- Let a = a1 ∪ (a2 - b1) and b = b2 ∪ b1
- Then f1(f2(x)) = a ∪ (x - b)

Slide 48: Available Expressions
- P = powerset of the set of all expressions in the program (all subsets of the set of expressions)
- ⊔ = ∩ (the order is ⊇)
- ⊥ = P (the set of all expressions)
- I = in_n0 = ∅
- F = all functions f of the form f(x) = a ∪ (x - b)
  - b is the set of expressions that the node kills
  - a is the set of expressions that the node generates
- Another GEN/KILL analysis
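Relative to the reaching-definitions sketch above, a must analysis like this one only changes the merge and the initialization: joins use intersection, and every point except the entry starts at the full set (the ⊥ of this lattice, since the order is ⊇). A hypothetical diamond-shaped CFG (entry E, branches L and R, join J) with made-up expression names shows the effect.

    ALL   = {"a+b", "a*b", "x+1"}                  # all expressions in the program
    GEN   = {"E": {"a+b"}, "L": {"a*b"}, "R": {"a*b", "x+1"}, "J": set()}
    KILL  = {"E": set(),   "L": set(),   "R": set(),          "J": set()}
    PRED  = {"E": [], "L": ["E"], "R": ["E"], "J": ["L", "R"]}
    NODES = ["E", "L", "R", "J"]

    def transfer(n, x):
        return GEN[n] | (x - KILL[n])

    inn = {n: set(ALL) for n in NODES}
    out = {n: set(ALL) for n in NODES}             # start high: intersection only removes
    inn["E"] = set()                               # nothing is available at the entry
    changed = True
    while changed:
        changed = False
        for n in NODES:
            if PRED[n]:
                value = set(ALL)
                for m in PRED[n]:
                    value &= out[m]                # must analysis: intersect at joins
                inn[n] = value
            new_out = transfer(n, inn[n])
            if new_out != out[n]:
                out[n] = new_out
                changed = True

    print(sorted(inn["J"]))   # ['a*b', 'a+b']: available on both paths; 'x+1' only on one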
Slide 49: Concept of Conservatism
- Reaching definitions uses ∪ as the join
  - Optimizations must take into account all definitions that reach along ANY path
- Available expressions uses ∩ as the join
  - Optimization requires the expression to reach along ALL paths
- Optimizations must conservatively take all possible executions into account

Slide 50: Backward Dataflow Analysis
- Simulates execution of the program backward, against the flow of control
- For each node n, we have
  - in_n: the value at the program point before n
  - out_n: the value at the program point after n
  - f_n: the transfer function for n (given out_n, computes in_n)
- Require that the solution satisfy
  - ∀n. in_n = f_n(out_n)
  - ∀n ∉ N_final. out_n = ⊔ { in_m | m ∈ succ(n) }
  - ∀n ∈ N_final. out_n = O, where O summarizes the information at the end of the program

Slide 51: Worklist Algorithm for Solving Backward Dataflow Equations
  for each n do in_n := f_n(⊥)
  for each n ∈ N_final do out_n := O; in_n := f_n(O)
  worklist := N - N_final
  while worklist ≠ ∅ do
    remove a node n from worklist
    out_n := ⊔ { in_m | m ∈ succ(n) }
    in_n := f_n(out_n)
    if in_n changed then
      worklist := worklist ∪ pred(n)

Slide 52: Live Variables
- P = powerset of the set of all variables in the program (all subsets of the set of variables in the program)
- ⊔ = ∪ (the order is ⊆)
- ⊥ = ∅
- O = ∅
- F = all functions f of the form f(x) = a ∪ (x - b)
  - b is the set of variables that the node kills
  - a is the set of variables that the node reads
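A backward GEN/KILL sketch (mine, not from the slides) for liveness on a hypothetical three-statement program, where USE is the set of variables a statement reads and DEF the set it writes; values flow from successors to predecessors.

    # Hypothetical program:   S1: a = input    S2: b = a + 1    S3: output b
    USE   = {"S1": set(),  "S2": {"a"}, "S3": {"b"}}
    DEF   = {"S1": {"a"},  "S2": {"b"}, "S3": set()}
    SUCC  = {"S1": ["S2"], "S2": ["S3"], "S3": []}
    NODES = ["S3", "S2", "S1"]             # process in reverse order for fast convergence

    inn = {n: set() for n in NODES}
    out = {n: set() for n in NODES}
    changed = True
    while changed:
        changed = False
        for n in NODES:
            out[n] = set().union(*(inn[m] for m in SUCC[n]))
            new_in = USE[n] | (out[n] - DEF[n])    # f(x) = USE ∪ (x - DEF)
            if new_in != inn[n]:
                inn[n] = new_in
                changed = True

    print(sorted(inn["S2"]), sorted(inn["S1"]))    # ['a'] []: a is live before S2, nothing before S1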
Slide 53: Very Busy Expressions
- An expression is very busy if it will definitely be evaluated again before its value changes
- Also called "anticipated" expressions
- We need the same lattice and auxiliary functions as for available expressions
- For every CFG node v, the analysis variable for v denotes the set of expressions that are definitely very busy at the program point before the node
Slide 54: Formulate the Analysis
- P = powerset of the set of all expressions in the program (all subsets of the set of expressions)
- ⊔ = ∩ (the order is ⊇)
- ⊥ = P
- I = in_Exit = ∅
- F = all functions f of the form f(x) = a ∪ (x - b)
  - b is the set of expressions that the node kills
  - a is the set of expressions that the node evaluates
Slide 55: Example
Before:
  var x,a,b;
  x = input;
  a = x-1;
  b = x-2;
  while (x>0) {
    output a*b-x;
    x = x-1;
  }
  output a*b;
After hoisting the very busy expression a*b:
  var x,a,b,atimesb;
  x = input;
  a = x-1;
  b = x-2;
  atimesb = a*b;
  while (x>0) {
    output atimesb-x;
    x = x-1;
  }
  output atimesb;
Slide 56: Comparisons of Four Analyses
- Reaching definitions
- Available expressions
- Live variables
- Very busy expressions

Slide 57: Forwards vs. Backwards
- A forwards analysis is one that for each program point computes information about the past behavior
  - Examples: available expressions and reaching definitions
  - Calculation: uses the predecessors of CFG nodes
- A backwards analysis is one that for each program point computes information about the future behavior
  - Examples: liveness and very busy expressions
  - Calculation: uses the successors of CFG nodes

Slide 58: May vs. Must
- A may analysis describes information that may possibly be true and thus computes an upper approximation
  - Examples: liveness and reaching definitions
  - Calculation: the union operator
- A must analysis describes information that must definitely be true and thus computes a lower approximation
  - Examples: available expressions and very busy expressions
  - Calculation: the intersection operator

Slide 59: Classification of Analyses
        | Forwards              | Backwards
  May   | Reaching definitions  | Live variables
  Must  | Available expressions | Very busy expressions
Slide 60: Example: Initialization
Try to define an analysis that ensures that variables are initialized before they are read.
- Lattice? The powerset of the variables in the given program
- Forwards or backwards? Forwards
- May or must? Must
A sketch of one way to instantiate this analysis follows.
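One possible instantiation (a sketch of mine; the slide only fixes the lattice and the forward/must choices): merge with intersection, start the entry at the empty set, let each node add the variables it assigns, and flag any read of a variable that is not in the in-value at that node. The program, node names, and variable names below are hypothetical.

    # Hypothetical CFG:  E: x = input    L: y = x    R: (no statement)    J: output y
    ASSIGNS = {"E": {"x"}, "L": {"y"}, "R": set(), "J": set()}
    READS   = {"E": set(), "L": {"x"}, "R": set(), "J": {"y"}}
    PRED    = {"E": [], "L": ["E"], "R": ["E"], "J": ["L", "R"]}
    NODES   = ["E", "L", "R", "J"]
    ALLVARS = {"x", "y"}

    inn = {n: set(ALLVARS) for n in NODES}        # start high in the must lattice
    out = {n: set(ALLVARS) for n in NODES}
    inn["E"] = set()                              # nothing is initialized at the entry
    changed = True
    while changed:
        changed = False
        for n in NODES:
            if PRED[n]:
                value = set(ALLVARS)
                for m in PRED[n]:
                    value &= out[m]               # must: initialized on ALL incoming paths
                inn[n] = value
            new_out = inn[n] | ASSIGNS[n]         # the node initializes what it assigns
            if new_out != out[n]:
                out[n] = new_out
                changed = True

    for n in NODES:
        for v in sorted(READS[n] - inn[n]):
            print(f"warning: {v} may be read uninitialized at {n}")
    # Prints a warning for y at J: y is initialized on only one of the two incoming paths.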
Slide 61: Meaning of Dataflow Results
- Concept of a program state s for control flow graphs
  - The program point n where execution is located (n is the node that will execute next)
  - The values of the variables in the program
- Each execution generates a trajectory of states s0; s1; ...; sk, where each si ∈ ST
- si+1 is generated from si by executing the basic block to
  - update the variable values
  - obtain a new program point n

Slide 62: Relating States to Analysis Results
- The meaning of the analysis results is given by an abstraction function AF : ST → P
- Correctness condition: require that for all states s, AF(s) ⊑ in_n, where n is the next statement to execute in state s

Slide 63: Sign Analysis Example
- Sign analysis: compute the sign of each variable v
- Base lattice: P = the flat lattice on {-, 0, +}
  - (BOT at the bottom, the incomparable elements -, 0, + in the middle, TOP at the top)
- The actual lattice records a value for each variable
- Example element: [a ↦ +, b ↦ 0, c ↦ -]

Slide 64: Interpretation of Lattice Values
If the value of v in the lattice is:
- BOT: no information about the sign of v
- -: variable v is negative
- 0: variable v is 0
- +: variable v is positive
- TOP: v may be positive or negative
What is the abstraction function AF?
- AF([x1, ..., xn]) = [sign(x1), ..., sign(xn)]
- where sign(x) = 0 if x = 0, + if x > 0, and - if x < 0

Slide 65: Transfer Functions
- If n is of the form v = c
  - f_n(x) = x[v ↦ +] if c is positive
  - f_n(x) = x[v ↦ 0] if c is 0
  - f_n(x) = x[v ↦ -] if c is negative
- If n is of the form v1 = v2 * v3
  - f_n(x) = x[v1 ↦ x[v2] ⊗ x[v3]], where ⊗ is the abstract (sign) multiplication tabulated on the next slide
- I = TOP (uninitialized variables may have any sign)
Slide 66: Operation ⊗ on the Lattice
   ⊗   | BOT | -   | 0 | +   | TOP
  -----+-----+-----+---+-----+-----
  BOT  | BOT | BOT | 0 | BOT | BOT
   -   | BOT | +   | 0 | -   | TOP
   0   | 0   | 0   | 0 | 0   | 0
   +   | BOT | -   | 0 | +   | TOP
  TOP  | BOT | TOP | 0 | TOP | TOP
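A small Python sketch of this sign domain (mine, not the course's code): strings for the lattice elements, the join of the flat lattice, the ⊗ table above, and the situation from the next slide, where a = 1 meets b = -1 on one branch and b = 1 on the other before c = a*b.

    TOP, BOT = "TOP", "BOT"

    def join(x, y):                        # least upper bound in the flat lattice
        if x == BOT: return y
        if y == BOT: return x
        return x if x == y else TOP

    def mult(x, y):                        # abstract multiplication (the table above)
        if x == "0" or y == "0": return "0"
        if x == BOT or y == BOT: return BOT
        if x == TOP or y == TOP: return TOP
        return "+" if x == y else "-"      # both arguments are '+' or '-' here

    left  = {"a": "+", "b": "-"}           # after the branch with b = -1
    right = {"a": "+", "b": "+"}           # after the branch with b = 1
    merged = {v: join(left[v], right[v]) for v in left}
    merged["c"] = mult(merged["a"], merged["b"])
    print(merged)                          # {'a': '+', 'b': 'TOP', 'c': 'TOP'}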
Slide 67: Example
a = 1 gives [a ↦ +]. On one branch, b = -1 gives [a ↦ +, b ↦ -]; on the other, b = 1 gives [a ↦ +, b ↦ +]. At the join the analysis has [a ↦ +, b ↦ TOP], and after c = a*b it has [a ↦ +, b ↦ TOP, c ↦ TOP].
Slide 68: Imprecision in the Example
Same control flow graph as the previous slide (a = 1; then b = -1 on one branch and b = 1 on the other; join; c = a*b).
- Abstraction imprecision: [a ↦ 1] is abstracted as [a ↦ +]
- Control flow imprecision: [b ↦ TOP] summarizes the results of all executions; in any individual execution state s, AF(s)[b] is strictly below TOP
Slide 69: General Sources of Imprecision
- Abstraction imprecision
  - Concrete values (integers) are abstracted as lattice values (-, 0, and +)
  - Lattice values are less precise than execution values
  - The abstraction function throws away information
- Control flow imprecision
  - One lattice value for all possible control flow paths
  - The analysis result has a single lattice value to summarize the results of multiple concrete executions
  - The join operation moves up in the lattice to combine values from different execution paths
  - Typically, if x ⊑ y, then x is more precise than y

Slide 70: Why Have Imprecision?
To make the analysis tractable:
- Unbounded sets of values in execution are typically abstracted by a finite set of lattice values
- Execution may visit an unbounded set of states, abstracted by computing joins of different paths

Slide 71: Abstraction Function
- AF(s)[v] = sign of v
  - AF(n, [a ↦ 5, b ↦ 0, c ↦ -2]) = [a ↦ +, b ↦ 0, c ↦ -]
- Establishes the meaning of the analysis results: if the analysis says a variable has a given sign, it always has that sign in the actual execution
- Correctness condition: ∀v. AF(s)[v] ⊑ in_n[v] (n is the node for s)
- Reflects the possibility of imprecision

Slide 72: Abstraction Function Soundness
Will show ∀v. AF(s)[v] ⊑ in_n[v] (n is the node for s) by induction on the length of the computation that produced s.
Base case:
- ∀v. in_n0[v] = TOP, which implies that ∀v. AF(s)[v] ⊑ TOP
Slide 73: Induction Step
- Assume ∀v. AF(s)[v] ⊑ in_n[v] for computations of length k
- Prove it for computations of length k+1
Proof:
- Given s (the state), n (the node to execute next), and in_n
- Find p (the node that just executed), s_p (the previous state), and in_p
- By the induction hypothesis, ∀v. AF(s_p)[v] ⊑ in_p[v]
- Case analysis on the form of the node p that just executed:
  - If p is of the form v = c, then s[v] = c and out_p[v] = sign(c), so AF(s)[v] = sign(c) = out_p[v] ⊑ in_n[v]
  - If x ≠ v, then s[x] = s_p[x] and out_p[x] = in_p[x], so AF(s)[x] = AF(s_p)[x] ⊑ in_p[x] = out_p[x] ⊑ in_n[x]
  - Similar reasoning if p is of the form v1 = v2*v3
Slide 74: Augmented Execution States
- Abstraction functions for some analyses require augmented execution states
  - Reaching definitions: states are augmented with the definition that created each value
  - Available expressions: states are augmented with the expression for each value
Slide 75: Meet Over Paths Solution
- What solution would be ideal for a forward dataflow analysis problem?
- Consider a path p = n0, n1, ..., nk, n to a node n (note that for all i, ni ∈ pred(ni+1))
- The solution must take this path into account:
    f_p(⊥) = f_nk(f_nk-1(... f_n1(f_n0(⊥)) ...)) ⊑ in_n
- So the solution must have the property that ⊔ { f_p(⊥) | p is a path to n } ⊑ in_n, and ideally ⊔ { f_p(⊥) | p is a path to n } = in_n
Slide 76: Soundness Proof of the Analysis Algorithm
- Property to prove: for all paths p to n, f_p(⊥) ⊑ in_n
- The proof is by induction on the length of p
  - Uses the monotonicity of the transfer functions
  - Uses the following lemma
- Lemma: the worklist algorithm produces a solution such that
    f_n(in_n) = out_n
    if n ∈ pred(m) then out_n ⊑ in_m
Slide 77: Proof
- Base case: p is of length 1
  - Then p = n0 and f_p(⊥) = ⊥ = in_n0 (taking the entry value I to be ⊥)
- Induction step:
  - Assume the theorem for all paths of length k
  - Show it for an arbitrary path p of length k+1
Slide 78: Induction Step Proof
- p = n0, ..., nk, n
- Must show f_nk(f_nk-1(... f_n1(f_n0(⊥)) ...)) ⊑ in_n
- By induction, f_nk-1(... f_n1(f_n0(⊥)) ...) ⊑ in_nk
- Applying f_nk to both sides, by monotonicity we get f_nk(f_nk-1(... f_n1(f_n0(⊥)) ...)) ⊑ f_nk(in_nk)
- By the lemma, f_nk(in_nk) = out_nk
- By the lemma, out_nk ⊑ in_n
- By transitivity, f_nk(f_nk-1(... f_n1(f_n0(⊥)) ...)) ⊑ in_n

Slide 79: Distributivity
- Distributivity preserves precision
- If the framework is distributive, then the worklist algorithm produces the meet-over-paths solution: for all n,
    ⊔ { f_p(⊥) | p is a path to n } = in_n

Slide 80: Lack of Distributivity Example
- Constant calculator
- Flat lattice on the integers
  - (BOT at the bottom; the incomparable elements ..., -2, -1, 0, 1, 2, ... in the middle; TOP at the top)
- The actual lattice records a value for each variable
- Example element: [a ↦ 3, b ↦ 2, c ↦ 5]

Slide 81: Transfer Functions
- If n is of the form v = c: f_n(x) = x[v ↦ c]
- If n is of the form v1 = v2 + v3: f_n(x) = x[v1 ↦ x[v2] + x[v3]]
- Lack of distributivity: consider the transfer function f for c = a + b
  - f([a ↦ 3, b ↦ 2]) ⊔ f([a ↦ 2, b ↦ 3]) = [a ↦ TOP, b ↦ TOP, c ↦ 5]
  - f([a ↦ 3, b ↦ 2] ⊔ [a ↦ 2, b ↦ 3]) = f([a ↦ TOP, b ↦ TOP]) = [a ↦ TOP, b ↦ TOP, c ↦ TOP]
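The failure of distributivity can be reproduced in a few lines of Python (a sketch; the flat-lattice handling and the names are mine):

    TOP, BOT = "TOP", "BOT"

    def join_val(x, y):                    # least upper bound in the flat lattice on integers
        if x == BOT: return y
        if y == BOT: return x
        return x if x == y else TOP

    def join_env(e1, e2):
        return {v: join_val(e1[v], e2[v]) for v in e1}

    def add(x, y):                         # abstract addition
        if x == BOT or y == BOT: return BOT
        if x == TOP or y == TOP: return TOP
        return x + y

    def f(env):                            # transfer function for c = a + b
        out = dict(env)
        out["c"] = add(env["a"], env["b"])
        return out

    e1 = {"a": 3, "b": 2}
    e2 = {"a": 2, "b": 3}
    print(join_env(f(e1), f(e2)))          # {'a': 'TOP', 'b': 'TOP', 'c': 5}
    print(f(join_env(e1, e2)))             # {'a': 'TOP', 'b': 'TOP', 'c': 'TOP'}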
Slide 82: Lack of Distributivity Anomaly
On one branch, a = 2; b = 3 gives [a ↦ 2, b ↦ 3]; on the other, a = 3; b = 2 gives [a ↦ 3, b ↦ 2]. At the join the analysis has [a ↦ TOP, b ↦ TOP], and after c = a+b it has [a ↦ TOP, b ↦ TOP, c ↦ TOP].
- Lack-of-distributivity imprecision: [a ↦ TOP, b ↦ TOP, c ↦ 5] is more precise
- What is the meet over all paths solution?

Slide 83: How to Make the Analysis Distributive
Keep combinations of values from different paths:
- One branch, a = 2; b = 3, gives {[a ↦ 2, b ↦ 3]}; the other branch, a = 3; b = 2, gives {[a ↦ 3, b ↦ 2]}
- At the join: {[a ↦ 2, b ↦ 3], [a ↦ 3, b ↦ 2]}
- After c = a+b: {[a ↦ 2, b ↦ 3, c ↦ 5], [a ↦ 3, b ↦ 2, c ↦ 5]}

Slide 84: Issues
- Basically simulating all combinations of values in all executions
  - Exponential blowup
  - Nontermination because of infinite ascending chains
- Nontermination solution
  - Use a widening operator to eliminate the blowup (can make it work at the granularity of variables)
  - Loses precision in many cases

Slide 85: Summary
- Formal dataflow analysis framework
  - Lattices, partial orders
  - Transfer functions, joins and splits
  - Dataflow equations and fixed point solutions
- Connection with the program
  - Abstraction function AF : S → P
  - For any state s and program point n, AF(s) ⊑ in_n
  - Meet over all paths solutions, distributivity

Slide 86: Next Time
- Partial Redundancy Elimination
- Dragon Book §9.5