CS 598 Scripting Languages Design and Implementation 9. Constant propagation and Type Inference.


Must and may define
An assignment statement of the form x := E, or a read statement of the form read x, must define x.
A function call fun(x) where x is passed by reference, or an assignment of the form *q := y where nothing is known about the pointer q, may define x.

Reaching definitions
The reaching definitions algorithm computes, for each block b, the set REACHES(b) of the statements that reach b.
We say that a statement s that must or may define a variable x reaches b if
– there is a path from s to b in the control flow graph such that none of the statements on the path must define x.
The reaching definitions algorithm must be conservative to guarantee correct transformations,
– i.e. it must assume that a definition can reach a block b unless it is absolutely certain that it does not.

Computing reaching definitions
Consider the following language:
  S ::= id := expression
      | S ; S
      | if expression then S else S
      | do S while expression
We use the following terms to compute reaching definitions:
– gen[S] is the set of definitions "generated" by S.
– kill[S] is the set of definitions "killed" by S.
– in[S] is the set of definitions reaching S (meaning reaching the top of S).
– out[S] is the set of definitions that reach the bottom of S.

Computing reaching definitions
When S has the form a := expression and label d:
– gen[S] = {d}
– kill[S] = D_a - {d}. Here D_a is the set of all definitions of a.
– out[S] = gen[S] ∪ (in[S] - kill[S])
When S is of the form S1 ; S2:
– gen[S] = gen[S2] ∪ (gen[S1] - kill[S2])
– kill[S] = kill[S2] ∪ (kill[S1] - gen[S2])
– in[S1] = in[S]
– in[S2] = out[S1]
– out[S] = out[S2]

Computing reaching definitions
When S is of the form if ... then S1 else S2:
– gen[S] = gen[S1] ∪ gen[S2]
– kill[S] = kill[S1] ∩ kill[S2]
– in[S1] = in[S]
– in[S2] = in[S]
– out[S] = out[S1] ∪ out[S2]
When S is of the form do S1 while ...:
– gen[S] = gen[S1]
– kill[S] = kill[S1]
– in[S1] = in[S] ∪ gen[S1]
– out[S] = out[S1]
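The gen and kill rules for the four statement forms can be sketched as one bottom-up recursion. This is a minimal illustration, not the course's implementation: the class names (Assign, Seq, If, DoWhile), the function gen_kill, and the defs_of map (variable name to the set of all its definition labels, i.e. D_a) are all assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical AST node types for the four statement forms.
@dataclass
class Assign:
    label: str  # definition label, e.g. "d1"
    var: str    # variable being defined

@dataclass
class Seq:
    first: object
    second: object

@dataclass
class If:
    then: object
    orelse: object

@dataclass
class DoWhile:
    body: object

def gen_kill(s, defs_of):
    """Return (gen[S], kill[S]); defs_of[a] plays the role of D_a."""
    if isinstance(s, Assign):
        return {s.label}, defs_of[s.var] - {s.label}
    if isinstance(s, Seq):
        g1, k1 = gen_kill(s.first, defs_of)
        g2, k2 = gen_kill(s.second, defs_of)
        return g2 | (g1 - k2), k2 | (k1 - g2)
    if isinstance(s, If):
        g1, k1 = gen_kill(s.then, defs_of)
        g2, k2 = gen_kill(s.orelse, defs_of)
        return g1 | g2, k1 & k2
    if isinstance(s, DoWhile):
        return gen_kill(s.body, defs_of)
    raise TypeError(f"unknown statement: {s!r}")

# d1: a := 1 ; if ... then d2: a := 2 else d3: b := 3
defs_of = {"a": {"d1", "d2"}, "b": {"d3"}}
prog = Seq(Assign("d1", "a"), If(Assign("d2", "a"), Assign("d3", "b")))
gen, kill = gen_kill(prog, defs_of)
```

In this example d1 survives in gen because only one branch of the if redefines a, so the intersection of the branch kill sets is empty.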

Computing reaching definitions
Use an abstract syntax tree instead of a control flow graph.
Gen[S] and kill[S] are computed bottom up on that tree.
The "real" gen[S] is a subset of the computed gen[S]. For example, when S is an if statement that always takes the "true" branch, the real gen[S] = gen[S1] is a subset of the computed gen[S1] ∪ gen[S2]. At the same time, the real kill[S] is a superset of the computed kill[S].
Notice that out[S] is not the same as gen[S]. The former contains all the definitions that reach the bottom of S, while the latter includes only those definitions within S that reach the bottom of S.
In[S] and out[S] are computed starting at the statement S0 representing the whole program.

Algorithm IN-OUT: Compute in[S] and out[S] for all statements S
Input: An abstract syntax tree of program S0 and the gen and kill sets for all the statements within the program.
Output: in[S] and out[S] for all statements within the program.
Method:
  computeOut(S, INS):
    case S of
      a := ...:
        return (out[S] = gen[S] ∪ (INS - kill[S]))
      S1 ; S2:
        in[S1] = INS
        in[S2] = out[S1] = computeOut(S1, in[S1])
        return (out[S] = computeOut(S2, in[S2]))
      if ... then S1 else S2:
        in[S1] = in[S2] = INS
        return (out[S] = computeOut(S1, in[S1]) ∪ computeOut(S2, in[S2]))
      do S1 while ...:
        in[S1] = INS ∪ gen[S1]
        return (out[S] = computeOut(S1, in[S1]))
    end case
  end

  in[S0] = ∅
  computeOut(S0, in[S0])
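Algorithm IN-OUT can be sketched as a short recursive function. The tuple encoding of statements ("assign", "seq", "if", "do"), the names compute_out and gen_kill, and the in_sets/out_sets dictionaries are assumptions made for this illustration only.

```python
def gen_kill(s, defs_of):
    """Bottom-up gen/kill for a statement encoded as a tuple."""
    kind = s[0]
    if kind == "assign":                 # ("assign", label, var)
        _, label, var = s
        return {label}, defs_of[var] - {label}
    if kind == "do":                     # ("do", body)
        return gen_kill(s[1], defs_of)
    g1, k1 = gen_kill(s[1], defs_of)
    g2, k2 = gen_kill(s[2], defs_of)
    if kind == "seq":                    # ("seq", s1, s2)
        return g2 | (g1 - k2), k2 | (k1 - g2)
    if kind == "if":                     # ("if", s1, s2)
        return g1 | g2, k1 & k2
    raise ValueError(kind)

def compute_out(s, ins, defs_of, in_sets, out_sets):
    """Record in[S] and out[S] for S and its sub-statements; return out[S]."""
    in_sets[s] = frozenset(ins)
    kind = s[0]
    if kind == "assign":
        g, k = gen_kill(s, defs_of)
        out = g | (ins - k)
    elif kind == "seq":
        mid = compute_out(s[1], ins, defs_of, in_sets, out_sets)
        out = compute_out(s[2], mid, defs_of, in_sets, out_sets)
    elif kind == "if":
        out = (compute_out(s[1], ins, defs_of, in_sets, out_sets)
               | compute_out(s[2], ins, defs_of, in_sets, out_sets))
    elif kind == "do":
        g, _ = gen_kill(s[1], defs_of)   # in[S1] = INS ∪ gen[S1]
        out = compute_out(s[1], ins | g, defs_of, in_sets, out_sets)
    else:
        raise ValueError(kind)
    out_sets[s] = frozenset(out)
    return out

# d1: i := 0 ; do d2: i := i + 1 while ...
body = ("assign", "d2", "i")
prog = ("seq", ("assign", "d1", "i"), ("do", body))
defs_of = {"i": {"d1", "d2"}}
in_sets, out_sets = {}, {}
result = compute_out(prog, set(), defs_of, in_sets, out_sets)
```

Both d1 and d2 reach the top of the loop body (d2 through the back edge), while only d2 reaches the end of the program.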

Computing reaching definitions
The sets of statements can be represented with bit vectors. Unions and intersections then take the form of bitwise or and and operations.
Only the statements that assign values to program variables have to be taken into account. That is, statements (or instructions) assigning to compiler-generated temporaries can be ignored.
In an implementation it is better to do the computations for basic blocks instead of statements. The kill and gen of a basic block and the reaching sets for each statement can then be obtained by applying the rules above.
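As a small illustration of the bit-vector representation, Python integers can stand in for bit vectors, with one bit per definition. The labels d1..d4 and the helper names to_bits, to_labels, and transfer are assumptions for this sketch.

```python
# One bit position per definition label (hypothetical labels).
DEFS = ["d1", "d2", "d3", "d4"]
BIT = {d: 1 << i for i, d in enumerate(DEFS)}

def to_bits(labels):
    """Encode a set of definition labels as an integer bit vector."""
    v = 0
    for label in labels:
        v |= BIT[label]
    return v

def to_labels(v):
    """Decode an integer bit vector back into a set of labels."""
    return {d for d in DEFS if v & BIT[d]}

def transfer(gen, kill, ins):
    """out = gen ∪ (in - kill), using bitwise or, and, and not."""
    return gen | (ins & ~kill)

out = transfer(to_bits({"d2"}), to_bits({"d1"}), to_bits({"d1", "d3"}))
out_labels = to_labels(out)
```

Set difference becomes an and with the complement of kill, so the whole transfer function costs a handful of word operations per machine word of definitions.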

Computing reaching definitions iteratively
Reaching definitions were computed above assuming a structured program represented as an abstract syntax tree.
An alternative approach that works even when the flow graph is not reducible is to use iterative algorithms on the control flow graph.
These algorithms seek a solution to a system of equations:
  in[S] = ∪_{T ∈ PRED(S)} out[T]
  out[S] = gen[S] ∪ (in[S] \ kill[S])
or, in terms of in alone:
  in[S] = ∪_{T ∈ PRED(S)} (gen[T] ∪ (in[T] \ kill[T]))
For the CFG entry node, S0, it is assumed that in[S0] = ∅.
Initially in[S] = ∅ for all S in the program. In this way, we get the smallest solution to these equations (which typically have more than one solution) for maximum accuracy.

Computing reaching definitions iteratively
Input: A flow graph for which kill[S] and gen[S] have been computed for each assignment statement S
Output: in[S] and out[S] for each assignment statement S
Method:
  for each statement S ∈ PROG
    out[S] = gen[S] // compute out assuming in[S] = ∅
  change := true
  while change
    change := false
    for each statement S ∈ PROG
      in[S] := ∪_{T ∈ PRED(S)} out[T]
      oldout := out[S]
      out[S] := gen[S] ∪ (in[S] \ kill[S])
      if out[S] ≠ oldout then change := true
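The iterative method can be sketched with Python sets. The function name reaching_definitions and the three-node flow graph (B1, B2 with a self loop, B3) are assumptions chosen for illustration.

```python
def reaching_definitions(nodes, preds, gen, kill):
    """Round-robin iteration to the smallest fixed point of the in/out equations."""
    out = {n: set(gen[n]) for n in nodes}   # out assuming in[n] = ∅
    ins = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            # in[n] is the union of out[p] over all predecessors p
            ins[n] = set().union(*(out[p] for p in preds[n]))
            new_out = gen[n] | (ins[n] - kill[n])
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return ins, out

# B1: d1: i := 0     B2: d2: i := i + 1 (loops to itself)     B3: d3: j := i
nodes = ["B1", "B2", "B3"]
preds = {"B1": [], "B2": ["B1", "B2"], "B3": ["B2"]}
gen = {"B1": {"d1"}, "B2": {"d2"}, "B3": {"d3"}}
kill = {"B1": {"d2"}, "B2": {"d1"}, "B3": set()}
ins, out = reaching_definitions(nodes, preds, gen, kill)
```

Because every set starts small and only grows, the loop stops at the least solution; visiting nodes in a different order changes the number of passes but not the result.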

Use-definition chains
Data interconnections may be expressed in a pure form which directly links instructions that produce values to instructions that use them.
– For each statement S with input variable v, DEFS(v,S) = in[S] ∩ D_v. If v is not an input to S, DEFS(v,S) = ∅.
– For each definition S with output variable v, USES(S) = {T | S is in DEFS(v,T)}.
Once DEFS is computed, USES can be computed by simple inversion as follows:
Algorithm US: USES Computation
Input: DEFS, a program PROG
Output: USES
Method:
  for each statement T in PROG
    USES(T) := ∅
  for each statement T in PROG
    for each input variable v of T
      for each statement S ∈ DEFS(v,T)
        USES(S) := USES(S) ∪ {T}
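A minimal sketch of both steps, DEFS from the reaching sets and USES by inversion, might look as follows. The dictionary layout (DEFS keyed by (variable, statement) pairs) and the names compute_defs, compute_uses, reach_in, and defs_of_var are assumptions for this example.

```python
def compute_defs(prog, inputs, reach_in, defs_of_var):
    """DEFS(v, S) = in[S] ∩ D_v for every input variable v of S."""
    return {(v, s): reach_in[s] & defs_of_var[v]
            for s in prog for v in inputs[s]}

def compute_uses(prog, inputs, defs):
    """Invert DEFS: USES(S) = {T | S ∈ DEFS(v, T)}."""
    uses = {s: set() for s in prog}          # initialise every USES set first
    for t in prog:
        for v in inputs[t]:
            for s in defs.get((v, t), ()):
                uses[s].add(t)
    return uses

# s1: x := 1 (then branch)   s2: x := 2 (else branch)   s3: y := x + 1
prog = ["s1", "s2", "s3"]
inputs = {"s1": [], "s2": [], "s3": ["x"]}
reach_in = {"s1": set(), "s2": set(), "s3": {"s1", "s2"}}
defs_of_var = {"x": {"s1", "s2"}, "y": {"s3"}}

defs = compute_defs(prog, inputs, reach_in, defs_of_var)
uses = compute_uses(prog, inputs, defs)
```

All USES sets are emptied in a separate first pass so that a later statement cannot wipe out entries already added for it.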

Algorithm MK: Mark Useful Definitions
Input:
– A program, PROG,
– DEFS,
– CRIT, a set of critical statements which are useful by definition (e.g. writes).
Output:
– MARK(S). For each definition S, MARK(S) = true iff S is useful.
Method:
  for each statement T ∈ PROG
    MARK(T) = false
  PILE = CRIT
  while PILE ≠ ∅
    S from PILE // from removes one element from set PILE
    MARK(S) = true
    for each input variable v of S
      for each T ∈ DEFS(v,S)
        if ¬MARK(T) then PILE = PILE ∪ {T}
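The marking pass is a plain worklist traversal backwards along DEFS. The sketch below assumes DEFS is a dictionary keyed by (variable, statement) pairs; the function name mark_useful and the four-statement program are illustrative only.

```python
def mark_useful(prog, inputs, defs, crit):
    """Mark every statement whose value can flow into a critical statement."""
    mark = {s: False for s in prog}
    pile = set(crit)
    while pile:
        s = pile.pop()                       # remove one element from the pile
        mark[s] = True
        for v in inputs[s]:
            for t in defs.get((v, s), ()):
                if not mark[t]:              # only unmarked definitions are queued
                    pile.add(t)
    return mark

# s1: a := 1   s2: b := 2   s3: c := a + 1   s4: write c
prog = ["s1", "s2", "s3", "s4"]
inputs = {"s1": [], "s2": [], "s3": ["a"], "s4": ["c"]}
defs = {("a", "s3"): {"s1"}, ("c", "s4"): {"s3"}}
mark = mark_useful(prog, inputs, defs, crit={"s4"})
```

Statement s2 stays unmarked because nothing that leads to the write reads b, so it is a candidate for dead-code elimination.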

Algorithm CP: Constant Propagation
Input:
– A program PROG
– A flag CONST(v,S) for each statement S and input variable v of S. Initially, CONST(v,S) is false for all v and S.
– CONST(S) for the output variable of S. Initially, CONST(S) is true if the rhs of S is a constant.
– USES and DEFS
Output:
– The modified CONST flags
– The mapping VAL(v,S), which provides the run-time constant value of input variable v at statement S. VAL(v,S) is defined only if CONST(v,S) is true. VAL(S) is the value of the output of S. VAL(S) is defined only if CONST(S) is true.

Algorithm CP: Constant Propagation
Method:
  PILE = {S ∈ PROG | the rhs of S is a constant} // trivially constant statements
  while PILE ≠ ∅
    T from PILE
    v = LHS(T)
    for each S ∈ USES(T)
      // check for constant inputs
      for each W in DEFS(v,S) - {T} // check that all inputs are constant
        if ¬CONST(W) or VAL(W) ≠ VAL(T) then next(2) // continue with the next S
      // if they are constant
      CONST(v,S) = true
      VAL(v,S) = VAL(T)
      // is the statement now computing a constant?
      if CONST(w,S) is true for all inputs w of S then
        CONST(S) = true
        VAL(S) = evaluateRHS(S)
        PILE = PILE ∪ {S}
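Algorithm CP can be sketched in Python as below. The statement encoding (lhs, list of input variables, and a function that evaluates the rhs, with None standing for a non-constant source such as a read) is an assumption of this example, as are all names. One small deviation from the pseudocode: a statement is re-queued only the first time it becomes constant.

```python
def constant_propagation(stmts, uses, defs):
    """stmts[s] = (lhs, inputs, fn); fn is None for non-constant sources."""
    trivially = {s for s, (_, ins, fn) in stmts.items() if not ins and fn is not None}
    const_out = {s: s in trivially for s in stmts}
    val_out = {s: stmts[s][2]() for s in trivially}
    const_in, val_in = {}, {}
    pile = set(trivially)
    while pile:
        t = pile.pop()
        v = stmts[t][0]
        for s in uses[t]:
            others = defs.get((v, s), set()) - {t}
            # every other definition of v reaching s must agree with t
            if any(not const_out[w] or val_out[w] != val_out[t] for w in others):
                continue
            const_in[(v, s)] = True
            val_in[(v, s)] = val_out[t]
            _, ins, fn = stmts[s]
            if fn is not None and all(const_in.get((w, s), False) for w in ins):
                if not const_out[s]:         # assumption: queue only on first success
                    const_out[s] = True
                    val_out[s] = fn(*(val_in[(w, s)] for w in ins))
                    pile.add(s)
    return const_out, val_out

# s1: a := 2   s2: b := 3   s3: c := a + b   s4: read d   s5: e := c + d
stmts = {
    "s1": ("a", [], lambda: 2),
    "s2": ("b", [], lambda: 3),
    "s3": ("c", ["a", "b"], lambda a, b: a + b),
    "s4": ("d", [], None),
    "s5": ("e", ["c", "d"], lambda c, d: c + d),
}
uses = {"s1": {"s3"}, "s2": {"s3"}, "s3": {"s5"}, "s4": {"s5"}, "s5": set()}
defs = {("a", "s3"): {"s1"}, ("b", "s3"): {"s2"},
        ("c", "s5"): {"s3"}, ("d", "s5"): {"s4"}}
const_out, val_out = constant_propagation(stmts, uses, defs)
```

Here s3 folds to the constant 5, while s5 stays non-constant because its input d comes from a read.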

Algorithm TA: Type Analysis
In a dynamic language, run-time type checks are needed unless the types of the operands can be determined at compile time.
We need an algebra of types with:
– The atomic type symbols: I (integer), R (real), N (number, i.e. real or integer), UD (undefined), NS (set of arbitrary elements), Z (error), etc.
– A transition function F⊕ for each operation ⊕ which, for input types t1, t2, ..., tn of the operands, produces the type of the left-hand side: t0 = F⊕(t1, t2, ..., tn). For example, real + real is real, real + integer is also real, integer + integer is integer, and real + error is error.
– A "merging operation" on types.

Input:
– A program PROG
– A mapping TYPE, such that TYPE(v,S) is the best initial estimate of the type of the variable v at the top of statement S (for most variables this is 'UD').
– TYPE(S), the type of the output (LHS) of S.
– DEFS and USES
Output:
– For each instruction S and input or output variable v, TYPE(v,S), a conservative approximation to the most specific type information provably true at S.

Algorithm TA: Type Analysis
Method:
  PILE = {S ∈ PROG | no variable in the rhs is of type 'UD'}
  while PILE ≠ ∅
    S from PILE
    v = LHS(S)
    for each T ∈ USES(S)
      // recompute type
      oldtype = TYPE(v,T)
      TYPE(v,T) = merge of TYPE(S') over all S' ∈ DEFS(v,T)
      if TYPE(v,T) ≠ oldtype then
        // a type refinement
        TYPE(T) = F⊕(types on RHS of T) // ⊕ is the operation on RHS of T
        PILE = PILE ∪ {T}

Algorithm CP2: A second algorithm for constant propagation
It is also possible to do constant propagation (and type inference) without using reaching definitions.
The approach would follow an iterative procedure where, for each statement S and until convergence, we compute
– in[S] as the merge of the out of the predecessors of S in the control flow graph, and
– out[S] in terms of the value of in[S], as specified next.
For constant propagation, we use not a set but a map, as follows.

Algorithm CP2: Compute out[S] in terms of in[S]
For constant propagation the values of in[S] and out[S] are maps:
– in[S]: V → ℝ ∪ {nonconstant, undefined}
– out[S]: V → ℝ ∪ {nonconstant, undefined}
– V is the set of variables in the program.
The transfer functions are created from the type of operation in the flow graph:
– if S contains no definitions then out[S] = in[S].
– if S is x = c then out[S](w) = in[S](w) ∀ w ≠ x and out[S](x) = c.
– if S is x = y + z then out[S](w) = in[S](w) ∀ w ≠ x and out[S](x) = in[S](y) + in[S](z).
  Here + is extended to ℝ ∪ {nonconstant, undefined} as follows: nonconstant + a = nonconstant, undefined + a = undefined, nonconstant + undefined = nonconstant.
– if S is read(x) then out[S](w) = in[S](w) ∀ w ≠ x and out[S](x) = nonconstant.
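The transfer functions can be sketched with a dictionary standing in for the map from variables to values. The tuple encoding of statements ("nop", "const", "add", "read") and the string sentinels NONCONST and UNDEF are assumptions of this example.

```python
NONCONST = "nonconstant"   # sentinel: value varies at run time
UNDEF = "undefined"        # sentinel: no value assigned yet

def add(a, b):
    """+ extended to numbers plus the two sentinels."""
    if a == NONCONST or b == NONCONST:
        return NONCONST        # includes nonconstant + undefined
    if a == UNDEF or b == UNDEF:
        return UNDEF
    return a + b

def transfer(stmt, env):
    """Return out[S] given in[S] = env; env itself is not modified."""
    out = dict(env)
    kind = stmt[0]
    if kind == "const":        # ("const", x, c) means x = c
        _, x, c = stmt
        out[x] = c
    elif kind == "add":        # ("add", x, y, z) means x = y + z
        _, x, y, z = stmt
        out[x] = add(env[y], env[z])
    elif kind == "read":       # ("read", x) means read(x)
        out[stmt[1]] = NONCONST
    elif kind != "nop":        # ("nop",) has no definitions
        raise ValueError(kind)
    return out

env = {v: UNDEF for v in ("x", "y", "z", "w")}
env = transfer(("const", "x", 2), env)
env = transfer(("read", "y"), env)
env = transfer(("add", "z", "x", "x"), env)
env = transfer(("add", "w", "x", "y"), env)
```

Checking for nonconstant before undefined in add is what makes nonconstant + undefined come out as nonconstant, as the rule above requires.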

Algorithm CP2: Merge operation
The merge operation is associative, so we only need to define it for pairs:
– M = merge(out[S1], out[S2])
– where M(x) is defined by the table (rows: out[S1](x), columns: out[S2](x)):

  out[S1](x) \ out[S2](x)   nonconstant   d ∈ ℝ                              undefined
  nonconstant               nonconstant   nonconstant                        nonconstant
  c ∈ ℝ                     nonconstant   if c = d then c else nonconstant   c
  undefined                 nonconstant   d                                  undefined
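A pairwise merge, folded over any number of predecessors, might be sketched as follows. It assumes the usual reading of the table, in which nonconstant merged with anything is nonconstant; the sentinel strings and the names merge_values, merge, and merge_all are assumptions for illustration.

```python
from functools import reduce

NONCONST = "nonconstant"
UNDEF = "undefined"

def merge_values(a, b):
    """Merge the values of one variable from two predecessors."""
    if a == NONCONST or b == NONCONST:
        return NONCONST
    if a == UNDEF:
        return b               # undefined merged with d gives d
    if b == UNDEF:
        return a               # c merged with undefined gives c
    return a if a == b else NONCONST

def merge(env1, env2):
    """Pointwise merge of two maps over the same set of variables."""
    return {x: merge_values(env1[x], env2[x]) for x in env1}

def merge_all(envs):
    """Associativity lets us fold the pairwise merge over all predecessors."""
    return reduce(merge, envs)

out1 = {"x": 1, "y": 2, "z": UNDEF, "w": NONCONST}
out2 = {"x": 1, "y": 3, "z": 7, "w": UNDEF}
m = merge(out1, out2)
m3 = merge_all([out1, out2, {"x": 1, "y": 2, "z": 7, "w": 0}])
```

In the example x stays the constant 1 because both predecessors agree, y becomes nonconstant because they disagree, and z takes the only defined value.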
