Compiler Principles (Fall)
Lecture 7: Lowering Correctness
Roman Manevich, Ben-Gurion University of the Negev
Tentative syllabus
Front End: Scanning; Top-down Parsing (LL); Bottom-up Parsing (LR)
Intermediate Representation: Lowering; Lowering Correctness
Optimizations: Dataflow Analysis; Loop Optimizations
Code Generation: Register Allocation; Instruction Selection
Previously
The role of intermediate representations
Two example languages
– A high-level language
– An intermediate language
Lowering
Correctness
– Formal meaning of programs
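While syntax
$A ::= n \mid x \mid A \; ArithOp \; A \mid (A)$
$ArithOp ::= - \mid + \mid * \mid /$
$B ::= true \mid false \mid A = A \mid A \le A \mid \lnot B \mid B \land B \mid (B)$
$S ::= x := A \mid skip \mid S ; S \mid \{ S \} \mid if \; B \; then \; S \; else \; S \mid while \; B \; S$
$n \in Num$: numerals
$x \in Var$: program variables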
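IL syntax
$V ::= n \mid x$
$R ::= V \; Op \; V \mid V$
$Op ::= - \mid + \mid * \mid / \mid = \mid \le \mid > \mid \dots$
$C ::= l\colon skip \mid l\colon x := R \mid l\colon Goto \; l' \mid l\colon IfZ \; x \; Goto \; l' \mid l\colon IfNZ \; x \; Goto \; l'$
$IR ::= C^+$
$n \in Num$: numerals
$l \in Num$: labels
$x \in Temp \cup Var$: temporaries and variables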
Translation rules for expressions
$cgen(n) = (l\colon t := n,\; t)$ where $l$ and $t$ are fresh
$cgen(x) = (l\colon t := x,\; t)$ where $l$ and $t$ are fresh
If $cgen(e_1) = (P_1, t_1)$ and $cgen(e_2) = (P_2, t_2)$ then
$cgen(e_1 \; op \; e_2) = (P_1 \cdot P_2 \cdot l\colon t := t_1 \; op \; t_2,\; t)$ where $l$ and $t$ are fresh
Translation rules for statements
$cgen(skip) = l\colon skip$ where $l$ is fresh
If $cgen(e) = (P, t)$ then $cgen(x := e) = P \cdot l\colon x := t$ where $l$ is fresh
If $cgen(S_1) = P_1$ and $cgen(S_2) = P_2$ then $cgen(S_1 ; S_2) = P_1 \cdot P_2$
Translation rules for conditions
If $cgen(b) = (P_b, t)$, $cgen(S_1) = P_1$ and $cgen(S_2) = P_2$ then $cgen(if \; b \; then \; S_1 \; else \; S_2)$ is:
$P_b$
$l_b\colon IfZ \; t \; Goto \; label(P_2)$
$P_1$
$l_{finish}\colon Goto \; l_{after}$
$P_2$
$l_{after}\colon skip$
where $l_b$, $l_{finish}$, $l_{after}$ are fresh
Translation rules for loops
If $cgen(b) = (P_b, t)$ and $cgen(S) = P$ then $cgen(while \; b \; S)$ is:
$l_{before}\colon skip$
$P_b$
$IfZ \; t \; Goto \; l_{after}$
$P$
$l_{loop}\colon Goto \; l_{before}$
$l_{after}\colon skip$
where $l_{after}$, $l_{before}$, $l_{loop}$ are fresh
Translation example
While program:
y := 137+3; if x=0 z := y; else x := y;
IL program:
1: t1 := 137
2: t2 := 3
3: t3 := t1 + t2
4: y := t3
5: t4 := x
6: t5 := 0
7: t6 := t4=t5
8: IfZ t6 Goto 12
9: t7 := y
10: z := t7
11: Goto 14
12: t8 := y
13: x := t8
14: skip
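The translation rules can be sketched as a small Python generator. This is only an illustration under assumptions that are not in the lecture: the tuple encoding of While terms, the class name `Lowering`, and the choice to use consecutive line numbers as fresh labels are all hypothetical. Forward labels such as label(P2) and l_after are patched once they are known.

```python
import itertools


class Lowering:
    """Toy lowering of While to IL; labels are consecutive line numbers (an assumption)."""

    def __init__(self):
        self.code = []                      # code[i] is the command labelled i + 1
        self.temps = itertools.count(1)

    def emit(self, command):
        self.code.append(command)
        return len(self.code)               # the fresh label just used

    def expr(self, e):
        """Emit code for e (int, variable name, or (op, e1, e2)); return its temporary."""
        if isinstance(e, tuple):
            op, e1, e2 = e
            t1, t2 = self.expr(e1), self.expr(e2)
            rhs = f"{t1} {op} {t2}"
        else:
            rhs = str(e)
        t = f"t{next(self.temps)}"
        self.emit(f"{t} := {rhs}")
        return t

    def stmt(self, s):
        kind = s[0]
        if kind == "skip":
            self.emit("skip")
        elif kind == "assign":              # ("assign", x, e)
            t = self.expr(s[2])
            self.emit(f"{s[1]} := {t}")
        elif kind == "seq":                 # ("seq", s1, s2)
            self.stmt(s[1])
            self.stmt(s[2])
        elif kind == "if":                  # ("if", b, s1, s2)
            t = self.expr(s[1])
            lb = self.emit(None)            # patched once label(P2) is known
            self.stmt(s[2])
            l_finish = self.emit(None)      # patched once l_after is known
            self.stmt(s[3])
            l_after = self.emit("skip")
            self.code[lb - 1] = f"IfZ {t} Goto {l_finish + 1}"
            self.code[l_finish - 1] = f"Goto {l_after}"
        elif kind == "while":               # ("while", b, s)
            l_before = self.emit("skip")
            t = self.expr(s[1])
            l_test = self.emit(None)        # patched once l_after is known
            self.stmt(s[2])
            self.emit(f"Goto {l_before}")
            l_after = self.emit("skip")
            self.code[l_test - 1] = f"IfZ {t} Goto {l_after}"
        else:
            raise ValueError(f"unknown statement: {kind}")


def lower(s):
    g = Lowering()
    g.stmt(s)
    return [f"{i}: {c}" for i, c in enumerate(g.code, start=1)]


# y := 137+3; if x=0 then z := y else x := y
program = ("seq", ("assign", "y", ("+", 137, 3)),
           ("if", ("=", "x", 0), ("assign", "z", "y"), ("assign", "x", "y")))
listing = lower(program)
```

With this encoding, `listing` has 14 commands laid out like the IL program of the translation example (up to spacing around `=`).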
Agenda
Operational semantics of While
Operational semantics of IL
Formalizing the correctness of lowering
Proving correctness
Correctness
Compiler correctness
Intuitively, a compiler translates programs in one language (usually high-level) to another language (usually lower-level) such that the source and target programs are equivalent.
Our goal is to formally define the meaning of this equivalence.
But first, we must define the meaning of a programming language.
Formal semantics
Operational semantics of While
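While syntax reminder
$A ::= n \mid x \mid A \; ArithOp \; A \mid (A)$
$ArithOp ::= - \mid + \mid * \mid /$
$B ::= true \mid false \mid A = A \mid A \le A \mid \lnot B \mid B \land B \mid (B)$
$S ::= x := A \mid skip \mid S ; S \mid \{ S \} \mid if \; B \; then \; S \; else \; S \mid while \; B \; S$
$n \in Num$: numerals
$x \in Var$: program variables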
Semantic categories
$Z$: integers $\{0, 1, -1, 2, -2, \dots\}$
$T$: truth values $\{ff, tt\}$
$State = Var \to Z$
Example state: $\sigma = [x \mapsto 5, y \mapsto 7, z \mapsto 0]$
Lookup: $\sigma(x) = 5$
Update: $\sigma[x \mapsto 6] = [x \mapsto 6, y \mapsto 7, z \mapsto 0]$
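As a small sketch, a state can be modelled as a Python dictionary from variable names to integers. The helper name `update` is an assumption made for illustration; it returns a new state, so the original state is left unchanged, as in the mathematical notation.

```python
def update(sigma, x, value):
    """Return the state sigma[x -> value] without modifying sigma."""
    new_sigma = dict(sigma)
    new_sigma[x] = value
    return new_sigma


sigma = {"x": 5, "y": 7, "z": 0}      # the example state
lookup = sigma["x"]                   # lookup: sigma(x)
sigma2 = update(sigma, "x", 6)        # update: sigma[x -> 6]
```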
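Semantics of expressions
Semantics of arithmetic expressions
Semantic function $\mathcal{A}[\![A]\!] : State \to Z$
Defined by induction on the syntax tree:
$[\![n]\!]\sigma = n$
$[\![x]\!]\sigma = \sigma(x)$
$[\![a_1 + a_2]\!]\sigma = [\![a_1]\!]\sigma + [\![a_2]\!]\sigma$
$[\![a_1 - a_2]\!]\sigma = [\![a_1]\!]\sigma - [\![a_2]\!]\sigma$
$[\![a_1 * a_2]\!]\sigma = [\![a_1]\!]\sigma \cdot [\![a_2]\!]\sigma$
$[\![(a_1)]\!]\sigma = [\![a_1]\!]\sigma$ (not needed)
$[\![-a]\!]\sigma = 0 - [\![a]\!]\sigma$
Compositional
Expressions in While are side-effect free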
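Semantics of Boolean expressions
Semantic function $\mathcal{B}[\![B]\!] : State \to T$
Defined by induction on the syntax tree:
$[\![true]\!]\sigma = tt$
$[\![false]\!]\sigma = ff$
$[\![a_1 = a_2]\!]\sigma = tt$ if $\mathcal{A}[\![a_1]\!]\sigma = \mathcal{A}[\![a_2]\!]\sigma$, and $ff$ otherwise
$[\![a_1 \le a_2]\!]\sigma = tt$ if $\mathcal{A}[\![a_1]\!]\sigma \le \mathcal{A}[\![a_2]\!]\sigma$, and $ff$ otherwise
$[\![b_1 \land b_2]\!]\sigma = tt$ if $[\![b_1]\!]\sigma = tt$ and $[\![b_2]\!]\sigma = tt$, and $ff$ otherwise
$[\![\lnot b]\!]\sigma = tt$ if $[\![b]\!]\sigma = ff$, and $ff$ otherwise
Compositional
Expressions in While are side-effect free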
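The two semantic functions translate directly into recursive evaluators. The sketch below assumes a hypothetical tuple encoding of syntax trees (integers for numerals, strings for variables, `(op, left, right)` for compound terms) and covers only `+`, `-`, `*`; Python's `True`/`False` stand in for tt/ff.

```python
def A(a, sigma):
    """Value of arithmetic expression a in state sigma, by induction on the syntax tree."""
    if isinstance(a, int):                   # numeral n
        return a
    if isinstance(a, str):                   # variable x
        return sigma[a]
    op, a1, a2 = a                           # compound: combine the meanings of the parts
    v1, v2 = A(a1, sigma), A(a2, sigma)
    if op == "+":
        return v1 + v2
    if op == "-":
        return v1 - v2
    if op == "*":
        return v1 * v2
    raise ValueError(f"unknown operator: {op}")


def B(b, sigma):
    """Truth value of Boolean expression b in state sigma."""
    if b == "true":
        return True
    if b == "false":
        return False
    op = b[0]
    if op == "=":
        return A(b[1], sigma) == A(b[2], sigma)
    if op == "<=":
        return A(b[1], sigma) <= A(b[2], sigma)
    if op == "not":
        return not B(b[1], sigma)
    if op == "and":
        return B(b[1], sigma) and B(b[2], sigma)
    raise ValueError(f"unknown operator: {op}")


sigma = {"x": 5, "y": 7, "z": 0}
value = A(("+", "x", ("*", "y", 2)), sigma)                       # x + y*2
truth = B(("and", ("<=", "x", "y"), ("not", ("=", "z", 1))), sigma)
```

Neither function modifies `sigma`, which mirrors the remark that expressions in While are side-effect free.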
Natural operational semantics
Developed by Gilles Kahn [STACS 1987]
Configurations
– $\langle S, \sigma \rangle$: statement $S$ is about to execute on state $\sigma$
– $\sigma$: terminal (final) state
Transitions
– $\langle S, \sigma \rangle \to \sigma'$: execution of $S$ from $\sigma$ will terminate with the result state $\sigma'$
– Ignores non-terminating computations
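Natural operational semantics
Defined by rules of the form
$\dfrac{\langle S_1, \sigma_1 \rangle \to \sigma_1', \; \dots, \; \langle S_n, \sigma_n \rangle \to \sigma_n'}{\langle S, \sigma \rangle \to \sigma'}$ if …
The part above the line is the premise, the part below the line is the conclusion, and "if …" is the side condition.
The meaning of a compound statement is defined using the meaning of its immediate constituent statements.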
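Natural semantics for While
Axioms:
[ass ns] $\langle x := a, \sigma \rangle \to \sigma[x \mapsto \mathcal{A}[\![a]\!]\sigma]$
[skip ns] $\langle skip, \sigma \rangle \to \sigma$
Rules:
[comp ns] $\dfrac{\langle S_1, \sigma \rangle \to \sigma', \quad \langle S_2, \sigma' \rangle \to \sigma''}{\langle S_1 ; S_2, \sigma \rangle \to \sigma''}$
[if tt ns] $\dfrac{\langle S_1, \sigma \rangle \to \sigma'}{\langle if \; b \; then \; S_1 \; else \; S_2, \sigma \rangle \to \sigma'}$ if $\mathcal{B}[\![b]\!]\sigma = tt$
[if ff ns] $\dfrac{\langle S_2, \sigma \rangle \to \sigma'}{\langle if \; b \; then \; S_1 \; else \; S_2, \sigma \rangle \to \sigma'}$ if $\mathcal{B}[\![b]\!]\sigma = ff$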
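Natural semantics for While
[while tt ns] $\dfrac{\langle S, \sigma \rangle \to \sigma', \quad \langle while \; b \; S, \sigma' \rangle \to \sigma''}{\langle while \; b \; S, \sigma \rangle \to \sigma''}$ if $\mathcal{B}[\![b]\!]\sigma = tt$
[while ff ns] $\langle while \; b \; S, \sigma \rangle \to \sigma$ if $\mathcal{B}[\![b]\!]\sigma = ff$
The [while tt ns] rule is non-compositional: its premise mentions the whole loop again.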
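Read top-down, the rules describe a recursive interpreter: each rule becomes one branch, and each premise becomes one recursive call. The sketch below is a minimal illustration under assumptions not made in the lecture: statements are encoded as hypothetical tuples, and expressions are given directly as Python functions from states to values rather than as syntax trees.

```python
def run(S, sigma):
    """Return sigma' such that <S, sigma> -> sigma', following the natural semantics rules."""
    kind = S[0]
    if kind == "skip":                          # [skip ns]
        return sigma
    if kind == "assign":                        # [ass ns]
        _, x, a = S
        return {**sigma, x: a(sigma)}
    if kind == "seq":                           # [comp ns]
        _, S1, S2 = S
        return run(S2, run(S1, sigma))
    if kind == "if":                            # [if tt ns] / [if ff ns]
        _, b, S1, S2 = S
        return run(S1 if b(sigma) else S2, sigma)
    if kind == "while":
        _, b, body = S
        if not b(sigma):                        # [while ff ns]
            return sigma
        return run(S, run(body, sigma))         # [while tt ns]: the loop itself recurs
    raise ValueError(f"unknown statement: {kind}")


# y := 1; while not (x = 1) { y := y * x; x := x - 1 }
factorial = ("seq",
             ("assign", "y", lambda s: 1),
             ("while", lambda s: s["x"] != 1,
              ("seq", ("assign", "y", lambda s: s["y"] * s["x"]),
                      ("assign", "x", lambda s: s["x"] - 1))))
result = run(factorial, {"x": 3, "y": 0})
```

A statement that loops has no derivation tree, so the recursion never bottoms out; in Python this eventually surfaces as a `RecursionError` rather than a result.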
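Executing the semantics
Example
Let $\sigma_0$ be the state which assigns zero to all program variables.
Axiom instances:
$\langle x := x+1, \sigma_0 \rangle \to \sigma_0[x \mapsto 1]$
$\langle skip, \sigma_0 \rangle \to \sigma_0$
Rule instances:
$\dfrac{\langle skip, \sigma_0 \rangle \to \sigma_0, \quad \langle x := x+1, \sigma_0 \rangle \to \sigma_0[x \mapsto 1]}{\langle skip ; x := x+1, \sigma_0 \rangle \to \sigma_0[x \mapsto 1]}$
$\dfrac{\langle x := x+1, \sigma_0 \rangle \to \sigma_0[x \mapsto 1]}{\langle if \; x=0 \; then \; x := x+1 \; else \; skip, \sigma_0 \rangle \to \sigma_0[x \mapsto 1]}$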
Derivation trees
Using axioms and rules to derive a transition $\langle S, \sigma \rangle \to \sigma'$ gives a derivation tree
– Root: $\langle S, \sigma \rangle \to \sigma'$
– Leaves: axioms
– Internal nodes: conclusions of rules; their immediate children are the matching rule premises
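Derivation tree example 1
Assume
$\sigma_0 = [x \mapsto 5, y \mapsto 7, z \mapsto 0]$
$\sigma_1 = [x \mapsto 5, y \mapsto 7, z \mapsto 5]$
$\sigma_2 = [x \mapsto 7, y \mapsto 7, z \mapsto 5]$
$\sigma_3 = [x \mapsto 7, y \mapsto 5, z \mapsto 5]$
Leaves, by [ass ns]: $\langle z := x, \sigma_0 \rangle \to \sigma_1$, $\langle x := y, \sigma_1 \rangle \to \sigma_2$, $\langle y := z, \sigma_2 \rangle \to \sigma_3$
By [comp ns]: $\dfrac{\langle z := x, \sigma_0 \rangle \to \sigma_1, \quad \langle x := y, \sigma_1 \rangle \to \sigma_2}{\langle (z := x ; x := y), \sigma_0 \rangle \to \sigma_2}$
Root, by [comp ns]: $\dfrac{\langle (z := x ; x := y), \sigma_0 \rangle \to \sigma_2, \quad \langle y := z, \sigma_2 \rangle \to \sigma_3}{\langle (z := x ; x := y) ; y := z, \sigma_0 \rangle \to \sigma_3}$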
Top-down evaluation via derivation trees
Given a statement $S$ and an input state $\sigma$, find an output state $\sigma'$ such that $\langle S, \sigma \rangle \to \sigma'$
Start with the root and repeatedly apply rules until the axioms are reached
– Inspect different alternatives in order
Theorem: In While, $\sigma'$ and the derivation tree are unique
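Top-down evaluation example
Factorial program on an input state $\sigma$ with $\sigma(x) = 2$
Shorthand: $W = while \; \lnot(x=1) \; \{ y := y*x ; x := x-1 \}$
Derivation tree, root first, with the premises of each node listed below it:
[comp ns] $\langle y := 1 ; W, \sigma \rangle \to \sigma[y \mapsto 2][x \mapsto 1]$
– [ass ns] $\langle y := 1, \sigma \rangle \to \sigma[y \mapsto 1]$
– [while tt ns] $\langle W, \sigma[y \mapsto 1] \rangle \to \sigma[y \mapsto 2, x \mapsto 1]$
[while tt ns] has the premises:
– [comp ns] $\langle y := y*x ; x := x-1, \sigma[y \mapsto 1] \rangle \to \sigma[y \mapsto 2][x \mapsto 1]$
– [while ff ns] $\langle W, \sigma[y \mapsto 2][x \mapsto 1] \rangle \to \sigma[y \mapsto 2, x \mapsto 1]$
The inner [comp ns] has the premises:
– [ass ns] $\langle y := y*x, \sigma[y \mapsto 1] \rangle \to \sigma[y \mapsto 2]$
– [ass ns] $\langle x := x-1, \sigma[y \mapsto 2] \rangle \to \sigma[y \mapsto 2][x \mapsto 1]$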
Properties of natural semantics
Program termination
Given a statement $S$ and input $\sigma$
– $S$ terminates on $\sigma$ if there exists a state $\sigma'$ such that $\langle S, \sigma \rangle \to \sigma'$
– $S$ loops on $\sigma$ if there is no state $\sigma'$ such that $\langle S, \sigma \rangle \to \sigma'$
Given a statement $S$
– $S$ always terminates if for every input state $\sigma$, $S$ terminates on $\sigma$
– $S$ always loops if for every input state $\sigma$, $S$ loops on $\sigma$
Semantic equivalence
$S_1$ and $S_2$ are semantically equivalent if for all $\sigma$ and $\sigma'$: $\langle S_1, \sigma \rangle \to \sigma'$ if and only if $\langle S_2, \sigma \rangle \to \sigma'$
Simple example: while b do S is semantically equivalent to: if b then (S; while b S) else skip
– Read proof in pages
Properties of natural semantics
Equivalence of program constructs
– skip; skip is semantically equivalent to skip
– $((S_1 ; S_2) ; S_3)$ is semantically equivalent to $(S_1 ; (S_2 ; S_3))$
– (x:=5; y:=x*8) is semantically equivalent to (x:=5; y:=40)
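The last equivalence can be checked by a short worked derivation. This is a sketch using the [ass ns] and [comp ns] rules and the semantics of arithmetic expressions; $\sigma$ is an arbitrary input state.

```latex
\begin{align*}
\langle x := 5, \sigma \rangle &\to \sigma[x \mapsto 5]
  && \text{[ass ns]} \\
\mathcal{A}[\![x * 8]\!](\sigma[x \mapsto 5]) &= 5 \cdot 8 = 40 \\
\langle y := x * 8, \sigma[x \mapsto 5] \rangle &\to \sigma[x \mapsto 5][y \mapsto 40]
  && \text{[ass ns]} \\
\mathcal{A}[\![40]\!](\sigma[x \mapsto 5]) &= 40 \\
\langle y := 40, \sigma[x \mapsto 5] \rangle &\to \sigma[x \mapsto 5][y \mapsto 40]
  && \text{[ass ns]}
\end{align*}
```

By [comp ns], both $(x := 5 ; y := x * 8)$ and $(x := 5 ; y := 40)$ take $\sigma$ to $\sigma[x \mapsto 5][y \mapsto 40]$. Since the semantics is deterministic, this is the only result state for either statement, so the two are semantically equivalent.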
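Equivalence of $\{S_1 ; S_2\} ; S_3$ and $S_1 ; \{S_2 ; S_3\}$
Assume $\langle \{S_1 ; S_2\} ; S_3, \sigma \rangle \to \sigma'$. Then the following unique derivation tree exists:
$\dfrac{\dfrac{\langle S_1, \sigma \rangle \to \sigma_1, \quad \langle S_2, \sigma_1 \rangle \to \sigma_{12}}{\langle \{S_1 ; S_2\}, \sigma \rangle \to \sigma_{12}}, \quad \langle S_3, \sigma_{12} \rangle \to \sigma'}{\langle \{S_1 ; S_2\} ; S_3, \sigma \rangle \to \sigma'}$
Using the rule applications above, we can construct the following derivation tree:
$\dfrac{\langle S_1, \sigma \rangle \to \sigma_1, \quad \dfrac{\langle S_2, \sigma_1 \rangle \to \sigma_{12}, \quad \langle S_3, \sigma_{12} \rangle \to \sigma'}{\langle \{S_2 ; S_3\}, \sigma_1 \rangle \to \sigma'}}{\langle S_1 ; \{S_2 ; S_3\}, \sigma \rangle \to \sigma'}$
And vice versa.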
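Deterministic semantics for While
Theorem: for all statements $S$ and states $\sigma$, $\sigma_1$, $\sigma_2$: if $\langle S, \sigma \rangle \to \sigma_1$ and $\langle S, \sigma \rangle \to \sigma_2$ then $\sigma_1 = \sigma_2$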
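The semantics of statements
The meaning of a statement $S$ is defined as
$[\![S]\!]\sigma = \sigma'$ if $\langle S, \sigma \rangle \to \sigma'$, and undefined ($\bot$) otherwise
Examples:
$[\![skip]\!]\sigma = \sigma$
$[\![x := 1]\!]\sigma = \sigma[x \mapsto 1]$
$[\![while \; true \; do \; skip]\!]\sigma$ = undefined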
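Operational semantics of IL
IL syntax reminder
$V ::= n \mid x$
$R ::= V \; Op \; V \mid V$
$Op ::= - \mid + \mid * \mid / \mid = \mid \le \mid > \mid \dots$
$C ::= l\colon skip \mid l\colon x := R \mid l\colon Goto \; l' \mid l\colon IfZ \; x \; Goto \; l'$
$IR ::= C^+$
$n \in Num$: numerals
$l \in Num$: labels
$x \in Var \cup Temp$: variables and temporaries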
Intermediate program states
$Z$: integers $\{0, 1, -1, 2, -2, \dots\}$
$IState = (Var \cup Temp \cup \{pc\}) \to Z$
– $Var$, $Temp$, and $\{pc\}$ are all disjoint
– For every state $m$ and program $P = 1\colon c_1, \dots, n\colon c_n$ we have that $1 \le m(pc) \le n+1$
We can check that the labels used in $P$ are within the range 1..n
Rules for executing commands
We will use rules of the following form:
$\dfrac{m(pc) = l \quad P(l) = C}{m \to m'}$
Here $m$ is the pre-state, in which the command indicated by the program counter is scheduled to be executed, and $m'$ is the post-state.
The rules specialize for the particular type of command $C$ and possibly other conditions.
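Rules for executing commands
$\dfrac{m(pc) = l \quad P(l) = skip}{m \to m[pc \mapsto l+1]}$
$\dfrac{m(pc) = l \quad P(l) = Goto \; l'}{m \to m[pc \mapsto l']}$
$\dfrac{m(pc) = l \quad P(l) = x := v_1 \; op \; v_2}{m \to m[pc \mapsto l+1, \; x \mapsto M(v_1) \; op \; M(v_2)]}$
$\dfrac{m(pc) = l \quad P(l) = x := v}{m \to m[pc \mapsto l+1, \; x \mapsto M(v)]}$
where $M(v) = m(v)$ if $v \in Var \cup Temp$, and $M(v) = v$ otherwise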
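Rules for executing commands
$\dfrac{m(pc) = l \quad P(l) = IfZ \; x \; Goto \; l' \quad m(x) = 0}{m \to m[pc \mapsto l']}$
$\dfrac{m(pc) = l \quad P(l) = IfZ \; x \; Goto \; l' \quad m(x) \ne 0}{m \to m[pc \mapsto l+1]}$
$\dfrac{m(pc) = l \quad P(l) = IfNZ \; x \; Goto \; l' \quad m(x) \ne 0}{m \to m[pc \mapsto l']}$
$\dfrac{m(pc) = l \quad P(l) = IfNZ \; x \; Goto \; l' \quad m(x) = 0}{m \to m[pc \mapsto l+1]}$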
Executing programs
For a program $P = 1\colon c_1, \dots, n\colon c_n$ we define executions as finite or infinite sequences $m_1 \to m_2 \to \dots \to m_n \to \dots$
We write $m \to^* m'$ if there is a finite execution starting at $m$ and ending at $m'$: $m = m_1 \to m_2 \to \dots \to m_n = m'$
Semantics of a program
For a program $P = 1\colon c_1, \dots, n\colon c_n$ and a state $m$ such that $m(pc) = 1$, we define the result of executing $P$ on $m$ as
$[\![P]\!]m = m'$ if $m \to^* m'$ and $m'(pc) = n+1$, and undefined ($\bot$) otherwise
Lemma: the function is well-defined (i.e., at most one output state)
Execution example
Execute the following intermediate language program on a state where all variables evaluate to 0:
1: t1 := 137
2: y := t1 + 3
3: IfZ x Goto 7
4: t2 := y
5: z := t2
6: Goto 9
7: t3 := y
8: x := t3
9: skip
$m = [pc \mapsto 1, t1 \mapsto 0, t2 \mapsto 0, t3 \mapsto 0, x \mapsto 0, y \mapsto 0]$
$\to m[pc \mapsto 2, t1 \mapsto 137]$
$\to m[pc \mapsto 3, t1 \mapsto 137, y \mapsto 140]$
$\to m[pc \mapsto 7, t1 \mapsto 137, y \mapsto 140]$
$\to m[pc \mapsto 8, t1 \mapsto 137, t3 \mapsto 140, y \mapsto 140]$
$\to m[pc \mapsto 9, t1 \mapsto 137, t3 \mapsto 140, x \mapsto 140, y \mapsto 140]$
$\to m[pc \mapsto 10, t1 \mapsto 137, t3 \mapsto 140, x \mapsto 140, y \mapsto 140]$
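The execution rules can be sketched as a small-step interpreter. Everything about the encoding is an assumption made for illustration: commands are hypothetical tuples, a program is a dictionary from labels to commands, `=` yields 1 or 0, and a `fuel` bound stands in for detecting non-termination. The example program assumes that command 2 is `y := t1 + 3`, which is what the value 140 for y in the trace suggests.

```python
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "=": lambda a, b: int(a == b)}


def step(P, m):
    """One transition m -> m' of program P, chosen by the command at m(pc)."""
    M = lambda v: m[v] if isinstance(v, str) else v   # variables are looked up, numerals are not
    l = m["pc"]
    c = P[l]
    kind = c[0]
    if kind == "skip":
        return {**m, "pc": l + 1}
    if kind == "goto":                                # ("goto", l')
        return {**m, "pc": c[1]}
    if kind == "copy":                                # ("copy", x, v) is x := v
        return {**m, "pc": l + 1, c[1]: M(c[2])}
    if kind == "binop":                               # ("binop", x, v1, op, v2)
        _, x, v1, op, v2 = c
        return {**m, "pc": l + 1, x: OPS[op](M(v1), M(v2))}
    if kind in ("ifz", "ifnz"):                       # ("ifz", x, l') / ("ifnz", x, l')
        _, x, target = c
        jump = (m[x] == 0) == (kind == "ifz")
        return {**m, "pc": target if jump else l + 1}
    raise ValueError(f"unknown command: {kind}")


def run(P, m, fuel=10_000):
    """Result state with pc = n + 1, or None if no such state is reached within `fuel` steps."""
    n = len(P)
    while m["pc"] != n + 1:
        if fuel == 0:
            return None
        m = step(P, m)
        fuel -= 1
    return m


P = {1: ("copy", "t1", 137), 2: ("binop", "y", "t1", "+", 3), 3: ("ifz", "x", 7),
     4: ("copy", "t2", "y"), 5: ("copy", "z", "t2"), 6: ("goto", 9),
     7: ("copy", "t3", "y"), 8: ("copy", "x", "t3"), 9: ("skip",)}
m0 = {"pc": 1, "t1": 0, "t2": 0, "t3": 0, "x": 0, "y": 0}
final = run(P, m0)
```

Returning `None` when the fuel runs out only approximates the undefined result: a long but terminating execution would also be cut off.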
Proof methodology
Structural induction
To prove a property of a derivation tree
– Prove that the property holds for leaves
– Assume that the property holds on all sub-trees of a given node and establish that it holds for the node
Conclude that the property holds for every derivation tree
Defining and proving equivalence for expressions
Exercise 1
Are the following equivalent in your opinion?
While: x := 137
IL:
1: t1 := 137
2: x := t1
Exercise 2
Are the following equivalent in your opinion?
While: x := 137
IL:
1: y := 137
2: x := y
Exercise 3
Are the following equivalent in your opinion?
While: x := 137
IL:
2: t2 := 138
3: t1 := 137
4: x := t1
Exercise 4
Are the following equivalent in your opinion?
While: x := 137
IL:
2: t2 := 138
3: t1 := 137
4: x := t1
5: t1 := 138
Equivalence of arithmetic expressions
Define $TVar = Var \cup Temp$
While state: $\sigma \in Var \to Z$
IL state: $m \in (Var \cup Temp \cup \{pc\}) \to Z$
An arithmetic While expression $a$ is equivalent to an IL program $P$ and $x \in TVar$ iff for every input state $m$ such that $m(pc) = label(P)$:
$\mathcal{A}[\![a]\!] \, m|_{Var} = ([\![P]\!]m) \, x$
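Defining equivalence for expressions
cgen maps a While expression $a$ to an IL pair $(P, t)$.
Definition: $a \approx (P, t)$ if $\forall m$ such that $m(pc) = label(P)$: $\mathcal{A}[\![a]\!] \, m|_{Var} = ([\![P]\!]m) \, t$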
Equivalence of Boolean expressions
A Boolean While expression $b$ is equivalent to an IL program $P$ and $x \in TVar$ iff for every input state $m$ such that $m(pc) = label(P)$:
$\mathcal{B}[\![b]\!] \, m|_{Var} = tt$ and $([\![P]\!]m) \, x = 1$, or
$\mathcal{B}[\![b]\!] \, m|_{Var} = ff$ and $([\![P]\!]m) \, x = 0$
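Equivalence of atomic expressions
Recall: $cgen(n) = (l\colon t := n, \; t)$ and $cgen(x) = (l\colon t := x, \; t)$, where $l$ and $t$ are fresh
Claim: $n \approx (l\colon t := n, \; t)$
Proof: choose w.l.o.g. $m$ such that $m(pc) = l$
$\mathcal{A}[\![n]\!] \, m|_{Var} = n$
$([\![l\colon t := n]\!]m) \, t = (m[pc \mapsto l+1, t \mapsto n]) \, t = n$
Q.E.D.
Claim: $x \approx (l\colon t := x, \; t)$
Proof: choose w.l.o.g. $m$ such that $m(pc) = l$
$\mathcal{A}[\![x]\!] \, m|_{Var} = m(x)$ since $x \in Var$
$([\![l\colon t := x]\!]m) \, t = (m[pc \mapsto l+1, t \mapsto m(x)]) \, t = m(x)$
Q.E.D.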
Lemmas for expressions
Recall: if $cgen(e_1) = (P_1, t_1)$ and $cgen(e_2) = (P_2, t_2)$ then $cgen(e_1 \; op \; e_2) = (P_1 \cdot P_2 \cdot l\colon t := t_1 \; op \; t_2, \; t)$ where $l$ and $t$ are fresh
Lemma 1: Let $cgen(e_1 \; op \; e_2) = (P, t)$; then $P$ always terminates
Lemma 2 (sequential execution): Let $P_e$ and $P$ be two IL programs such that $(P_e, t) = cgen(e)$, $labels(P_e) \cap labels(P) = \emptyset$ and $Temps(P_e) \cap Temps(P) = \{t\}$. Then for every $m$ such that $m(pc) = label(P_e)$, $[\![P_e]\!]m = m'$ such that $m'(pc) = label(P)$. Moreover, let $\sigma = m|_{Var}$; then $([\![P_e \cdot P]\!]m)|_{Var} = ([\![P]\!] \, m[t \mapsto \mathcal{A}[\![e]\!]\sigma])|_{Var}$.
Lemma 3: Let $Temps(P)$ be the set of temporaries appearing in $P$. Let $P_1$ and $P_2$ be two programs such that $Temps(P_1) \cap Temps(P_2) = \emptyset$ and […]. Then for every $m$ such that $m(pc) = label(P)$: $([\![P_1 \cdot P_2]\!]m)|_{Var} = ([\![P_1]\!]m_1) \, t_1$
Lemmas for expressions
Lemma 4: Let $P_1$ and $P_2$ be two IL programs such that $cgen(e_1 \; op \; e_2) = (P_1 \cdot P_2 \cdot l\colon t := t_1 \; op \; t_2, \; t)$. Then for every $m_1$, $m_2$ such that $m_1(pc) = label(P_1)$ and $m_2(pc) = label(P_2)$:
$([\![P_1 \cdot P_2]\!]m_1) \, t_1 = ([\![P_1]\!]m_1) \, t_1$
$([\![P_1 \cdot P_2]\!]m_1) \, t_2 = ([\![P_2]\!]m_1) \, t_2$
Lemma 3: Let $Temps(P)$ stand for the set of temporaries appearing in $P$. If $cgen(e_1 \; op \; e_2) = (P_1 \cdot P_2 \cdot l\colon t := t_1 \; op \; t_2, \; t)$ then $Temps(P_1) \cap Temps(P_2) = \emptyset$
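Equivalence of compound expressions
Recall: if $cgen(e_1) = (P_1, t_1)$ and $cgen(e_2) = (P_2, t_2)$ then $cgen(e_1 \; op \; e_2) = (P_1 \cdot P_2 \cdot l\colon t := t_1 \; op \; t_2, \; t)$ where $l$ and $t$ are fresh
Claim: Let $P = P_1 \cdot P_2 \cdot l\colon t := t_1 \; op \; t_2$; then $e_1 \; op \; e_2 \approx (P, t)$
Proof: choose w.l.o.g. $m$ such that $m(pc) = label(P_1)$ and let $\sigma = m|_{Var}$
$\mathcal{A}[\![e_1 \; op \; e_2]\!]\sigma = \mathcal{A}[\![e_1]\!]\sigma \; op \; \mathcal{A}[\![e_2]\!]\sigma$
By the induction hypothesis: $e_1 \approx (P_1, t_1)$ and $e_2 \approx (P_2, t_2)$
Denote $m' = [\![P_1 \cdot P_2]\!]m$
By Lemma 4, we have that
$([\![P_1 \cdot P_2]\!]m) \, t_1 = ([\![P_1]\!]m) \, t_1 = \mathcal{A}[\![e_1]\!]\sigma = m'(t_1)$
$([\![P_1 \cdot P_2]\!]m) \, t_2 = ([\![P_2]\!]m) \, t_2 = \mathcal{A}[\![e_2]\!]\sigma = m'(t_2)$
Therefore, by Lemma 2,
$([\![P]\!]m) \, t = ([\![l\colon t := t_1 \; op \; t_2]\!]m') \, t = (m'[pc \mapsto l+1, t \mapsto m'(t_1) \; op \; m'(t_2)]) \, t = \mathcal{A}[\![e_1]\!]\sigma \; op \; \mathcal{A}[\![e_2]\!]\sigma = \mathcal{A}[\![e_1 \; op \; e_2]\!]\sigma$
Q.E.D.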
Conclusion
Theorem 1: for each While expression $e$ we have that $cgen(e) \approx e$
Missing: the proof for Boolean expressions (exercise for home)
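Theorem 1 can also be exercised empirically. The sketch below is a test, not a proof, and all of it is an illustrative assumption: expressions use a hypothetical tuple encoding, only `+`, `-`, `*` are covered, and because cgen produces straight-line code for expressions, the program counter is left implicit in `run`.

```python
import itertools
import random

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def A(e, sigma):
    """While semantics of arithmetic expression e (int, variable name, or (op, e1, e2))."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return sigma[e]
    op, e1, e2 = e
    return OPS[op](A(e1, sigma), A(e2, sigma))


def cgen(e, fresh):
    """Return (P, t); P is a list of commands (t, v) or (t, v1, op, v2)."""
    if isinstance(e, tuple):
        op, e1, e2 = e
        p1, t1 = cgen(e1, fresh)
        p2, t2 = cgen(e2, fresh)
        t = next(fresh)
        return p1 + p2 + [(t, t1, op, t2)], t
    t = next(fresh)
    return [(t, e)], t


def run(P, m):
    """Execute straight-line IL code; the pc simply advances by one per command."""
    m = dict(m)
    M = lambda v: m[v] if isinstance(v, str) else v
    for cmd in P:
        if len(cmd) == 2:
            t, v = cmd
            m[t] = M(v)
        else:
            t, v1, op, v2 = cmd
            m[t] = OPS[op](M(v1), M(v2))
    return m


def check(e, trials=100):
    """Compare A[[e]] with the value left in t by cgen(e) on random states over x, y, z."""
    P, t = cgen(e, (f"t{i}" for i in itertools.count(1)))
    for _ in range(trials):
        sigma = {x: random.randint(-50, 50) for x in "xyz"}
        if run(P, sigma)[t] != A(e, sigma):
            return False
    return True


ok = check(("+", ("*", "x", 3), ("-", "y", "z")))
```

Random testing of this kind can expose a faulty translation rule, but only the inductive argument establishes the claim for every expression and every state.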
Defining and proving equivalence for statements
State equivalence
Define $m \approx \sigma$ iff $\sigma = m|_{Var}$
– That is, for each $x \in Var$: $\sigma(x) = m(x)$
Statement equivalence
A While statement $S$ is equivalent to an IL program $P$ iff for every input state $m$ such that $m(pc) = label(P)$:
$[\![S]\!] \, m|_{Var} = ([\![P]\!]m)|_{Var}$
Defining equivalence for statements
cgen maps a While statement $S$ to an IL program $P$.
Definition: $S \approx P$ if $\forall m, \sigma$ such that $m(pc) = label(P)$ and $m \approx \sigma$: $[\![P]\!]m \approx [\![S]\!]\sigma$
Equivalence of skip
Recall: $cgen(skip) = l\colon skip$ where $l$ is fresh
Claim: $skip \approx l\colon skip$
Proof: choose w.l.o.g. $m$ such that $m(pc) = l$ and $\sigma = m|_{Var}$
$[\![skip]\!] \, m|_{Var} = m|_{Var}$
$([\![l\colon skip]\!]m)|_{Var} = m[pc \mapsto l+1]|_{Var} = m|_{Var}$
Q.E.D.
Equivalence of assignments
Recall: if $cgen(a) = (P, t)$ then $cgen(x := a) = P \cdot l\colon x := t$ where $l$ is fresh
Lemma 5: Let $DefVars(P)$ denote the variables being assigned to in $P$. If $DefVars(P) = \emptyset$ then for every $m$ such that $m(pc) = label(P)$: $([\![P]\!]m)|_{Var} = m|_{Var}$
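Equivalence of assignments
Recall: if $cgen(a) = (P, t)$ then $cgen(x := a) = P \cdot l\colon x := t$ where $l$ is fresh
Claim: $x := a \approx P \cdot l\colon x := t$
Proof: choose w.l.o.g. $m$ such that $m(pc) = label(P)$ and $\sigma = m|_{Var}$
$[\![x := a]\!]\sigma = \sigma[x \mapsto \mathcal{A}[\![a]\!]\sigma]$
Let $m' = [\![P]\!]m$. Then from Theorem 1: $a \approx (P, t)$. That is, $\mathcal{A}[\![a]\!]\sigma = m'(t)$
From Lemma 2: $[\![P \cdot l\colon x := t]\!]m = [\![l\colon x := t]\!]m' = m'[pc \mapsto l+1, x \mapsto m'(t)] = m'[pc \mapsto l+1, x \mapsto \mathcal{A}[\![a]\!]\sigma]$
Now, since $DefVars(P) = \emptyset$, by Lemma 5 we have that $m'|_{Var} = m|_{Var}$
Therefore, $\sigma[x \mapsto \mathcal{A}[\![a]\!]\sigma] = ([\![P \cdot l\colon x := t]\!]m)|_{Var}$
Q.E.D.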
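Natural semantics for sequencing
[comp ns] $\dfrac{\langle S_1, \sigma \rangle \to \sigma', \quad \langle S_2, \sigma' \rangle \to \sigma''}{\langle S_1 ; S_2, \sigma \rangle \to \sigma''}$
Lemma 6: $[\![S_1 ; S_2]\!]\sigma = [\![S_2]\!]([\![S_1]\!]\sigma)$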
Helper lemmas for sequencing
Recall: if $cgen(S_1) = P_1$ and $cgen(S_2) = P_2$ then $cgen(S_1 ; S_2) = P_1 \cdot P_2$
Lemma 7: Let $P_1$ and $P_2$ be two IL programs such that $cgen(S_1 ; S_2) = P_1 \cdot P_2$. Then $labels(P_1) \cap labels(P_2) = \emptyset$ and $Temps(P_1) \cap Temps(P_2) = \emptyset$, and for every $m$ such that $m(pc) = label(P_1)$, $[\![P_1]\!]m = m'$ such that $m'(pc) = label(P_2)$. Moreover, $[\![P_1 \cdot P_2]\!]m = [\![P_2]\!]m'$
Equivalence of sequencing
Recall: if $cgen(S_1) = P_1$ and $cgen(S_2) = P_2$ then $cgen(S_1 ; S_2) = P_1 \cdot P_2$
Claim: assume $S_1 \approx P_1$ and $S_2 \approx P_2$; then $S_1 ; S_2 \approx P_1 \cdot P_2$
Proof: choose w.l.o.g. $m$ such that $m(pc) = label(P_1)$ and $\sigma = m|_{Var}$
By Lemma 7, we have that $[\![P_1]\!]m = m'$ such that $m'(pc) = label(P_2)$
By the induction hypothesis, $m'|_{Var} = [\![S_1]\!]\sigma$
By Lemma 7, we have that $[\![P_1 \cdot P_2]\!]m = [\![P_2]\!]m'$
Let $[\![S_1]\!]\sigma = \sigma'$. By the induction hypothesis, since $m' \approx \sigma'$, we have that $[\![P_2]\!]m' \approx [\![S_2]\!]\sigma'$
By the definition of the natural semantics (Lemma 6), we have that $[\![S_1 ; S_2]\!]\sigma \approx [\![P_1 \cdot P_2]\!]m$
Q.E.D.
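Equivalence of conditions
Recall: if $cgen(b) = (P_b, t)$, $cgen(S_1) = P_1$ and $cgen(S_2) = P_2$ then
$cgen(if \; b \; then \; S_1 \; else \; S_2) = P_b \cdot l_b\colon IfZ \; t \; Goto \; label(P_2) \cdot P_1 \cdot l_{finish}\colon Goto \; l_{after} \cdot P_2 \cdot l_{after}\colon skip$
where $l_b$, $l_{finish}$, $l_{after}$ are fresh
Lemma 8: for all states $\sigma$ we have that
If $\mathcal{B}[\![b]\!]\sigma = tt$ then $[\![if \; b \; then \; S_1 \; else \; S_2]\!]\sigma = [\![S_1]\!]\sigma$
If $\mathcal{B}[\![b]\!]\sigma = ff$ then $[\![if \; b \; then \; S_1 \; else \; S_2]\!]\sigma = [\![S_2]\!]\sigma$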
Equivalence of conditions: ff
Claim: Let $P = cgen(if \; b \; then \; S_1 \; else \; S_2)$; then $if \; b \; then \; S_1 \; else \; S_2 \approx P$
Notation: $P = P_b \cdot P'$ where $P' = l_b\colon IfZ \; t \; Goto \; label(P_2) \cdot P_1 \cdot l_{finish}\colon Goto \; l_{after} \cdot P_2 \cdot l_{after}\colon skip$
Proof: choose w.l.o.g. $m$ such that $m(pc) = label(P_b)$ and $\sigma = m|_{Var}$
From Theorem 1: $b \approx (P_b, t)$
Case ff: assume $\mathcal{B}[\![b]\!]\sigma = ff$; therefore $([\![P_b]\!]m) \, t = 0$
By Lemma 8: $[\![if \; b \; then \; S_1 \; else \; S_2]\!]\sigma = [\![S_2]\!]\sigma = \sigma_2$
By Lemma 2: $[\![P_b \cdot P']\!]m = [\![P']\!](m[t \mapsto 0]) = [\![P']\!](m[pc \mapsto label(P_2)])$
By the induction hypothesis and Lemma 2: let $[\![P_2]\!] \, m[pc \mapsto label(P_2)] = m_2$; then $m_2 \approx \sigma_2$
$[\![P']\!](m_2[pc \mapsto l_{after}]) = m_2[pc \mapsto l_{after}+1] \approx \sigma_2$, since updating $pc$ does not change $m_2|_{Var}$
Equivalence of conditions: tt
Claim: Let $P = cgen(if \; b \; then \; S_1 \; else \; S_2)$; then $if \; b \; then \; S_1 \; else \; S_2 \approx P$
Notation: $P = P_b \cdot P'$ where $P' = l_b\colon IfZ \; t \; Goto \; label(P_2) \cdot P_1 \cdot l_{finish}\colon Goto \; l_{after} \cdot P_2 \cdot l_{after}\colon skip$
Proof: choose w.l.o.g. $m$ such that $m(pc) = label(P_b)$ and $\sigma = m|_{Var}$
From Theorem 1: $b \approx (P_b, t)$
Case tt: assume $\mathcal{B}[\![b]\!]\sigma = tt$; therefore $([\![P_b]\!]m) \, t = 1$
By Lemma 8: $[\![if \; b \; then \; S_1 \; else \; S_2]\!]\sigma = [\![S_1]\!]\sigma = \sigma_1$
By Lemma 2: $[\![P_b \cdot P']\!]m = [\![P']\!](m[t \mapsto 1]) = [\![P']\!](m[pc \mapsto label(P_1)])$
By the induction hypothesis and Lemma 2: let $[\![P_1]\!] \, m[pc \mapsto label(P_1)] = m_1$; then $m_1 \approx \sigma_1$
$[\![P']\!](m_1[pc \mapsto l_{finish}]) = [\![P']\!](m_1[pc \mapsto l_{after}]) = m_1[pc \mapsto l_{after}+1] \approx \sigma_1$, since updating $pc$ does not change $m_1|_{Var}$
Equivalence for loops
Recall: if $cgen(b) = (P_b, t)$ and $cgen(S) = P$ then
$cgen(while \; b \; S) = l_{before}\colon skip \cdot P_b \cdot IfZ \; t \; Goto \; l_{after} \cdot P \cdot l_{loop}\colon Goto \; l_{before} \cdot l_{after}\colon skip$
where $l_{after}$, $l_{before}$, $l_{loop}$ are fresh
Proof outline
Let $\sigma$ be a state. We will split the proof into two cases:
1. while b S terminates on $\sigma$ (there is a derivation tree)
2. while b S loops on $\sigma$ (no derivation tree)
Case 1: while b S terminates
Case 2: while b S loops
Next lecture: Dataflow-based Optimization