Proving Optimizations Correct using Parameterized Program Equivalence
Sudipta Kundu, Zachary Tatlock, Sorin Lerner
University of California, San Diego

Compiler Correctness
- Difficult to develop reliable compilers: large, complex programs that take a long time to mature
- Consequence: buggy compilers, but also …
  - hinders development of new languages, architectures
  - discourages users from extending the compiler
- Focus on correctness of compiler optimizations
  - many intricate opts, unexpected interactions
  - turning off optimizations often no longer an option

Existing Techniques
- Translation Validation
  - prove equivalence for each transformation
  - every compiler execution
- A priori Correctness
  - prove correctness before compiler runs
  - once and for all
- Systems shown: TVOC [Zuck et al.], Rhodium [Lerner et al.], Compcert [Leroy et al.], [Necula 00], Verified TV [Tristan et al.], [Pnueli et al.]
- Focus on Automated Techniques

Scope of Guarantee vs. Expressive Power
[Figure: automated techniques placed by Scope of Guarantee (Verify Compiler Run, Verify Compiler) and Expressive Power (One-to-one Rewrites, Complex Loop Opts). Systems shown: TVOC [Zuck et al.], Rhodium [Lerner et al.], [Necula 00].]
- Focus on Automated Techniques
- Our Approach: complex loop opts + once-and-for-all correctness
- Key Insight: adapt Translation Validation to the once-and-for-all setting

Generalize to Parameterized Progs
- Translation Validation: Equivalence Checking of Input Prog against Output Prog (one Optimization Instance)
- Generalize to Parameterized Programs: Parameterized Equivalence Checking of Input PProg against Output PProg (the Optimization itself)
- Concrete program:
    k := 0
    while(k<100){
      a[k] += k
      k++
    }
- Parameterized program, obtained by generalizing variable k to I, bound 100 to E, and the body statement a[k] += k to S:
    I := 0
    while(I<E){
      S
      I++
    }

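The relation between a parameterized program and one of its instances can be sketched as plain substitution. This is only an illustration, not the PEC implementation: the names `PARAM_PROG`, `instantiate`, and `binding` are hypothetical, and programs are treated as text rather than as syntax trees.

```python
from string import Template

# Parameterized program from the slide, with $I, $E, $S as the parameters.
# Treating the program as a text template is an assumption made for brevity.
PARAM_PROG = Template("$I := 0\nwhile($I<$E){ $S; $I++ }")


def instantiate(pprog, binding):
    """Substitute concrete program fragments for the parameters I, E, S."""
    return pprog.substitute(binding)


# Binding that recovers the concrete loop from the slide.
binding = {"I": "k", "E": "100", "S": "a[k] += k"}
concrete = instantiate(PARAM_PROG, binding)
print(concrete)
```

Proving the parameterized pair equivalent once covers every instance that such a binding can produce, provided the side conditions hold for that binding.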
Contributions
- Parameterized Equivalence Checking (PEC)
  - proves opts correct statically and automatically
  - can reason about many-to-many opts
- Expressed and proved a variety of opts correct
  - which Rhodium could not have proven correct
  - software pipelining and other complex loop opts
[Figure: the Input PProg and Output PProg of an Optimization are passed to Parameterized Equivalence Checking.]

Parameterized Rewrite Rules
Loop Peeling: move iter out of loop (here, shift final iteration after loop)

    I := 0
    while(I < E){
      S
      I++
    }

  rewrites to

    I := 0
    while(I<E-1){
      S
      I++
    }
    S
    I++

  where: E > 0
         S does not modify I, E

- ids range over: I : variable, E : expression, S : statement
- Side conditions indicate when rewrite safe

Applying Rewrite Rules
Apply Rewrite:
1. Match parameters
2. Check side conds
3. Rewrite

Rule (loop peeling):

    I := 0
    while(I < E){ S I++ }

  rewrites to

    I := 0
    while(I<E-1){ S I++ }
    S
    I++

  where: E > 0
         S does not modify I, E

Instance:

    k := 0
    while(k<100){
      a[k] += k
      k++
    }

  rewrites to

    k := 0
    while(k<99){
      a[k] += k
      k++
    }
    a[k] += k
    k++

  where: 100 > 0
         a[k] += k does not modify k, 100

Enable loop unrolling:
- Before: 100 iterations, not divisible by 3, hard to unroll by 3
- After: 99 iterations in the loop, divisible by 3, directly unroll by 3

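As a quick sanity check of this one instance (testing a single input, which is much weaker than the once-and-for-all proof the talk is about), the two concrete loops can be run side by side. The function names `original` and `peeled` and the choice of initial array are assumptions made for illustration.

```python
def original(a):
    """k := 0; while (k < 100) { a[k] += k; k++ }"""
    a = list(a)  # work on a copy so both versions see the same input
    k = 0
    while k < 100:
        a[k] += k
        k += 1
    return a, k


def peeled(a):
    """Same loop after peeling: the final iteration runs after the loop."""
    a = list(a)
    k = 0
    while k < 99:
        a[k] += k
        k += 1
    # Peeled final iteration (the trailing "S; I++" of the rewrite rule).
    a[k] += k
    k += 1
    return a, k


initial = [7] * 100
same_final_state = original(initial) == peeled(initial)
print(same_final_state)

# The peeled loop runs 99 times, which is divisible by 3; the original 100 is not.
print(100 % 3, 99 % 3)
```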
Checking Correctness
- OPT is checked by Parameterized Equivalence Checking, using one of two modules:
  - Relate: generalization of [Necula 00]
  - OR Permute: generalization of [Zuck et al. 04]

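Checking Rewrite Rules
Rule to check (loop peeling):

    I := 0
    while(I < E){ S I++ }

  rewrites to

    I := 0
    while(I<E-1){ S I++ }
    S
    I++

  where: E > 0
         S does not modify I, E

Programs equivalent:
- Consider CFGs
- Start in equal states: σ1 = σ2
- End in equal states: σ1 = σ2
[Figure: CFGs of the original program (branches I<E, I≥E) and the transformed program (branches I<E-1, I≥E-1), annotated with σ1 = σ2 at entry and at exit.]
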
Checking Rewrite Rules
Relate Executions:
1. Find synchronization points
2. Generate invariants
3. Check invariants preserved
   - Each invariant must imply successor invariants
   - Use auto theorem prover
   - Strengthen if inv too weak
[Figure: the two CFGs with synchronization points A and B, and σ1 = σ2 at entry and at exit.]

1. Find Synchronization Points
- Traverse in lockstep
- Stop at stmt params
- Prune infeasible paths
  - Path in the original program: I:=0 then I≥E
  - From Path: E ≤ 0
  - Side Conds: E > 0
  - Path never executes

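The pruning argument above can be illustrated with a bounded search. This is a sketch under strong assumptions: E is treated as an integer value, the helper names are hypothetical, and enumerating a finite range only illustrates the contradiction that an automated theorem prover would establish for all values.

```python
def path_condition(e):
    """Condition for taking the path "I := 0; I >= E" in the original program."""
    i = 0  # effect of I := 0
    return i >= e  # simplifies to E <= 0


def side_condition(e):
    """Side condition of the loop peeling rule."""
    return e > 0


def path_feasible(candidates):
    """True if some candidate value of E satisfies both conditions at once."""
    return any(side_condition(e) and path_condition(e) for e in candidates)


# Bounded enumeration stands in for the theorem prover (an assumption).
feasible = path_feasible(range(-50, 51))
print(feasible)  # False: no value satisfies E <= 0 and E > 0 together
```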
2. Generate Invariants
- Invariants: predicates over σ1, σ2
- Gen initial invariant: σ1 = σ2 AND strongest post cond
- At entry and at exit: σ1 = σ2
- At synchronization point A (reached via I<E in the original, I<E-1 in the transformed program):
    A(σ1,σ2):  σ1 = σ2 ∧ eval(σ1, I < E) ∧ eval(σ2, I < E-1)
- At synchronization point B (reached via I<E in the original, I≥E-1 in the transformed program):
    B(σ1,σ2):  σ1 = σ2 ∧ eval(σ1, I < E) ∧ eval(σ2, I ≥ E-1)

3. Check Invariants
- Each invariant must imply successor invariants
- Query Theorem Prover
- Transitions to check: Entry → A, Entry → B, A → B, A → A, B → Exit
- Invariants:
    A(σ1,σ2):  σ1 = σ2 ∧ eval(σ1, I < E) ∧ eval(σ2, I < E-1)
    B(σ1,σ2):  σ1 = σ2 ∧ eval(σ1, I < E) ∧ eval(σ2, I ≥ E-1)

3. Check Invariants
Transition from A to B: the original program executes S; I++; I<E and the transformed program executes S; I++; I≥E-1, taking σ1 to σ1' and σ2 to σ2'.

ATP Query:

    ∀ σ1 σ2.  A(σ1,σ2)
            ∧ σ1' = step(σ1, S;I++;I<E)
            ∧ σ2' = step(σ2, S;I++;I≥E-1)
            ⇒ B(σ1',σ2')

With A and B expanded:

      σ1 = σ2 ∧ eval(σ1, I < E) ∧ eval(σ2, I < E-1)
    ∧ σ1' = step(σ1, S;I++;I<E)
    ∧ σ2' = step(σ2, S;I++;I≥E-1)
    ⇒ σ1' = σ2' ∧ eval(σ1', I < E) ∧ eval(σ2', I ≥ E-1)

- Strengthen A if the theorem prover fails

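The shape of the A to B query can be sketched as a bounded check on one instance. Everything concrete here is an assumption: the statement S is taken to be "x += I", E is an integer constant, states are small tuples, and exhaustive enumeration over a finite domain replaces the automated theorem prover, so this illustrates the query rather than proving it.

```python
from itertools import product


def run_body(state):
    """Execute S; I++ for the assumed instance S = 'x += I'. State is (I, x, E)."""
    i, x, e = state
    return (i + 1, x + i, e)


def step(state, guard):
    """Successor state along a path, or None if the path's final guard fails."""
    nxt = run_body(state)
    return nxt if guard(nxt) else None


def i_lt_e(s):
    return s[0] < s[2]


def i_lt_e_minus_1(s):
    return s[0] < s[2] - 1


def i_ge_e_minus_1(s):
    return s[0] >= s[2] - 1


def inv_a(s1, s2):
    return s1 == s2 and i_lt_e(s1) and i_lt_e_minus_1(s2)


def inv_b(s1, s2):
    return s1 == s2 and i_lt_e(s1) and i_ge_e_minus_1(s2)


def check_a_to_b(states):
    """Check A(s1,s2) and both steps defined implies B(s1',s2') on a finite domain."""
    checked = 0
    for s1, s2 in product(states, repeat=2):
        if not inv_a(s1, s2):
            continue
        n1 = step(s1, i_lt_e)          # original program: S; I++; I<E
        n2 = step(s2, i_ge_e_minus_1)  # transformed program: S; I++; I>=E-1
        if n1 is None or n2 is None:
            continue  # this pair of paths is not taken from these states
        checked += 1
        if not inv_b(n1, n2):
            return False, checked
    return True, checked


# Small finite domain: I in 0..5, x in 0..2, E in 1..5 (an arbitrary choice).
states = list(product(range(0, 6), range(0, 3), range(1, 6)))
ok, checked = check_a_to_b(states)
print(ok, checked)
```

The remaining transitions listed in the talk (Entry to A, Entry to B, A to A, B to Exit) would each need an analogous query with their own path guards.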
Checking Correctness
- OPT is checked by Relate OR Permute
- Relate: generalization of [Necula 00]; works well for structure-preserving optimizations
- Permute: generalization of [Zuck et al. 04]; works well for non structure-preserving optimizations

Permute Module
Loop Interchange:

    for (I in R1){
      for (J in R2){
        S[I, J];
      }
    }

  rewrites to

    for (N in R2){
      for (M in R1){
        S[M, N];
      }
    }

  where: ⟨side condition⟩

- Generate correspondence between loop indices
- Ask ATP to show that ⟨side condition⟩ implies:
  - correspondence is one-to-one
  - For all I, I' ∊ R1 and J, J' ∊ R2, S[I, J] commutes with S[I', J']

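The two obligations of the Permute module can be illustrated on small concrete ranges. This sketch makes several assumptions: R1 and R2 are small integer ranges, the correspondence maps iteration (I, J) to (N, M) = (J, I), and the two sample statements are hypothetical stand-ins for S. Enumeration over finite ranges replaces the theorem prover.

```python
from itertools import product

R1 = range(3)  # assumed range for I and M
R2 = range(4)  # assumed range for J and N


def correspondence(i, j):
    """Map iteration (I, J) of the original nest to (N, M) of the interchanged nest."""
    return (j, i)


# Order in which the instances S[., .] execute in each loop nest.
original_order = [(i, j) for i in R1 for j in R2]
interchanged_order = [(m, n) for n in R2 for m in R1]

# Obligation 1: the correspondence is one-to-one and covers every (N, M).
images = [correspondence(i, j) for (i, j) in original_order]
one_to_one = len(set(images)) == len(images) and set(images) == set(product(R2, R1))


def run_commuting(order):
    """Assumed S[I, J] that only updates cell (I, J), so any two instances commute."""
    grid = [[0] * len(R2) for _ in R1]
    for i, j in order:
        grid[i][j] += 10 * i + j
    return grid


def run_order_sensitive(order):
    """Assumed S[I, J] that appends to a shared log, so instances do not commute."""
    log = []
    for i, j in order:
        log.append((i, j))
    return log


# Obligation 2 holds for the first statement, so interchange preserves the result.
same_result = run_commuting(original_order) == run_commuting(interchanged_order)
# It fails for the second, and the interchanged loop produces a different log.
log_differs = run_order_sensitive(original_order) != run_order_sensitive(interchanged_order)
print(one_to_one, same_result, log_differs)
```

The second statement shows why the commutativity obligation matters: the correspondence is still one-to-one, yet the reordered execution is observably different.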
Optimizations Proved Correct
- Category 1: PEC and Rhodium formulation equivalent
  - Copy propagation
  - Constant propagation
  - Common sub-expression elim
  - Partial redundancy elim
- Category 2: PEC formulation easier, more general
  - Loop invariant code hoisting
  - Conditional speculation
  - Speculation
- Category 3: Expressible in PEC, no Rhodium formulation possible
  - Software pipelining
  - Loop unswitching
  - Loop unrolling
  - Loop peeling
  - Loop splitting
  - Loop alignment
  - Loop interchange
  - Loop reversal
  - Loop skewing
  - Loop fusion
  - Loop distribution

Conclusion
- Parameterized Equivalence Checking (PEC)
  - proves opts correct statically and automatically
  - can reason about many-to-many opts
- Expressed and proved a variety of opts correct
  - which Rhodium could not have proven correct
  - software pipelining and other complex loop opts