Automated Soundness Proofs for Dataflow Analyses and Transformations via Local Rules
Sorin Lerner*, Todd Millstein**, Erika Rice*, Craig Chambers*
* University of Washington, ** UCLA [graduating this year!]
A traditional compiler
[Diagram: a Compiler box containing Parser, Opt, and Code Gen components]
Using a domain specific language
[Diagram: a Compiler box containing Parser, Code Gen, and a DSL Execution engine that runs several DSL Opts]
Checking correctness automatically
[Diagram: the DSL-based compiler with a Checker attached to each DSL Opt; later builds zoom in on a single DSL Opt and its Checker]
Checking correctness automatically
[Diagram: inside the Checker, VCGen takes a DSL Opt and produces a Verification Condition (VC), which is sent to an Automatic Theorem Prover; parts of the diagram are labeled opt-independent and opt-specific]
Lemma: VC implies correctness
Cobalt
The Cobalt DSL is an instantiation of this architecture
–An opt written in Cobalt is a rewrite rule triggered by a declarative global condition over the CFG
In Cobalt we expressed a variety of intraprocedural optimizations and automatically proved them correct, including:
–const prop and folding, branch folding, CSE, PRE, DAE, partial DAE [PLDI 03]
In this talk: the Rhodium DSL
Increased expressiveness
–New model for expressing opts: local propagation rules with explicit dataflow facts
–Heap summaries
–Infinite analysis domains
–Flow-sensitive and -insensitive
–Intraprocedural and interprocedural
Some Rhodium opts not expressible in Cobalt:
–Arithmetic invariant detection, integer range analysis, loop-induction-variable strength reduction, Andersen's may-point-to analysis with allocation-site summaries
Outline
Overview
Rhodium by example
Checking correctness automatically
Future work, related work and conclusion
MustPointTo analysis
[Diagram: the code a := &b; c := a; *c := d, with a points-to graph on each edge: c→d before a := &b; a→b and c→d after it; a→b and c→b after c := a]
MustPointTo info in Rhodium
define fact mustPointTo(X:Var,Y:Var)
The same code with the graphs written as facts: mustPointTo(a, b) and mustPointTo(c, d) hold after a := &b; mustPointTo(a, b) and mustPointTo(c, b) hold after c := a
Propagating facts
define fact mustPointTo(X:Var,Y:Var)
if currStmt = [X := &Y] then mustPointTo(X,Y)@out
–On a := &b, this rule generates mustPointTo(a, b)
if mustPointTo(X,Y)@in ∧ currStmt = [Z := &W] ∧ X ≠ Z then mustPointTo(X,Y)@out
–On a := &b, this rule preserves mustPointTo(c, d)
if mustPointTo(X,Y)@in ∧ currStmt = [Z := X] then mustPointTo(Z,Y)@out
–On c := a, this rule derives mustPointTo(c, b) from mustPointTo(a, b)
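The propagation rules can be read as a per-statement flow function from facts on the incoming edge to facts on the outgoing edge. The following is a minimal Python sketch of that reading, not Rhodium itself: the names `flow`, `ADDR`, and `COPY` are hypothetical, facts are modeled as `(X, Y)` pairs, and the last rule (preserving facts about other variables across a copy) is an assumption added so that the example reproduces the slide's facts after `c := a`. It is not one of the three rules shown.

```python
import re

ADDR = re.compile(r"(\w+) := &(\w+)")  # statement shape X := &Y
COPY = re.compile(r"(\w+) := (\w+)")   # statement shape Z := X


def flow(stmt, facts_in):
    """Map mustPointTo facts on the incoming edge to facts on the outgoing edge.

    Each fact is a pair (x, y) standing for mustPointTo(x, y).
    Statement shapes without a rule propagate nothing.
    """
    out = set()
    m = ADDR.fullmatch(stmt)
    if m:
        z, w = m.groups()
        # Rule 1: X := &Y generates mustPointTo(X, Y).
        out.add((z, w))
        # Rule 2: Z := &W preserves mustPointTo(X, Y) when X != Z.
        out |= {(x, y) for x, y in facts_in if x != z}
        return out
    m = COPY.fullmatch(stmt)
    if m:
        z, w = m.groups()
        # Rule 3: Z := X turns mustPointTo(X, Y) into mustPointTo(Z, Y).
        out |= {(z, y) for x, y in facts_in if x == w}
        # ASSUMED rule (not on the slide): a copy into Z preserves facts
        # about every other variable.
        out |= {(x, y) for x, y in facts_in if x != z}
    return out


# Start from mustPointTo(c, d), as the first points-to graph suggests.
facts = {("c", "d")}
for stmt in ["a := &b", "c := a"]:
    facts = flow(stmt, facts)
    print(stmt, sorted(facts))
```

Keeping each rule as one set comprehension mirrors the "if antecedent then consequent" shape of the rules, so each line can be checked against its rule in isolation.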
Transformations
define fact mustPointTo(X:Var,Y:Var)
if mustPointTo(X,Y)@in ∧ currStmt = [*X := Z] then transform to [Y := Z]
–With mustPointTo(c, b) holding before *c := d, the statement is transformed to b := d
Semantics of a Rhodium opt
Run all the propagation rules using optimistic iterative analysis, starting with the complete set of facts, until the best fixed point is reached
Then run all transformation rules
For better precision, combine analyses and transformations using our previous composition framework [POPL 02]
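The two phases described here can be sketched in Python for the mustPointTo example. This is an illustrative sketch under stated assumptions, not the Rhodium execution engine: `flow`, `analyze`, and `transform` are hypothetical names, facts are merged by intersection where edges join (an assumption appropriate for a must-analysis), the entry node starts with no facts, and the copy-preservation rule in `flow` is assumed rather than taken from the slides. The CFG adds a loop so that the optimistic starting point matters.

```python
import re

ADDR = re.compile(r"(\w+) := &(\w+)")    # X := &Y
COPY = re.compile(r"(\w+) := (\w+)")     # Z := X
STORE = re.compile(r"\*(\w+) := (\w+)")  # *X := Z


def flow(stmt, facts_in):
    """Apply the mustPointTo propagation rules to one statement."""
    out = set()
    if m := ADDR.fullmatch(stmt):
        z, w = m.groups()
        out.add((z, w))                                   # generate
        out |= {(x, y) for x, y in facts_in if x != z}    # preserve
    elif m := COPY.fullmatch(stmt):
        z, w = m.groups()
        out |= {(z, y) for x, y in facts_in if x == w}    # copy
        out |= {(x, y) for x, y in facts_in if x != z}    # ASSUMED preserve
    return out


def analyze(stmts, edges, variables):
    """Optimistic iteration: every outgoing edge starts with all facts."""
    top = {(x, y) for x in variables for y in variables}
    preds = {n: [s for s, d in edges if d == n] for n in range(len(stmts))}
    facts_out = [set(top) for _ in stmts]
    facts_in = [set() for _ in stmts]
    changed = True
    while changed:
        changed = False
        for n, stmt in enumerate(stmts):
            if preds[n]:
                # A fact holds on entry only if it holds on every incoming edge.
                new_in = set.intersection(*(facts_out[p] for p in preds[n]))
            else:
                new_in = set()  # entry node: nothing is known
            new_out = flow(stmt, new_in)
            if new_in != facts_in[n] or new_out != facts_out[n]:
                facts_in[n], facts_out[n] = new_in, new_out
                changed = True
    return facts_in


def transform(stmts, facts_in):
    """Rewrite *X := Z to Y := Z wherever mustPointTo(X, Y) holds on entry."""
    result = []
    for stmt, facts in zip(stmts, facts_in):
        m = STORE.fullmatch(stmt)
        targets = {y for x, y in facts if m and x == m.group(1)}
        result.append(f"{min(targets)} := {m.group(2)}" if targets else stmt)
    return result


stmts = ["a := &b", "c := a", "e := c", "*c := d"]
edges = [(0, 1), (1, 2), (2, 1), (2, 3)]  # 2 -> 1 is a loop back edge
facts_in = analyze(stmts, edges, "abcde")
print(transform(stmts, facts_in))
```

In this sketch, starting the back edge from the empty set instead of the complete set would leave no facts at `c := a`, because the intersection at the loop head would discard `mustPointTo(a, b)` on the first pass and never recover it.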
More in Rhodium (see paper for details)
Mixing facts
Heap summaries
MayPointTo analysis via MustNotPointTo
Infinite domains
Flow-sensitive and -insensitive
Intraprocedural and interprocedural
Rhodium correctness checker
[Diagram: inside the Checker, VCGen takes a Rhodium optimization (define fact …; if … then …; if … then transform …) and produces Local VCs, which are sent, together with the IL semantics axioms, to an automatic theorem prover; the diagram carries the label opt-independent]
Lemma: Local VCs ⇒ correctness
Local correctness of prop. rules
define fact mustPointTo(X:Var,Y:Var) with meaning ⟦X == &Y⟧
if mustPointTo(X,Y)@in ∧ currStmt = [Z := X] then mustPointTo(Z,Y)@out
[Diagram: statement Z := X with X→Y on the incoming edge (in) and Z→Y, marked "?", on the outgoing edge (out)]
Local VC sent to ATP: if ⟦X == &Y⟧(in) ∧ (Z := X takes state in to state out) then ⟦Z == &Y⟧(out)
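To make the shape of this local condition concrete, the sketch below checks it by brute force over a tiny state space. This is only a bounded check under assumed semantics, not what the theorem prover does: the state model (three variables, each holding an address or the integer 0) and the names `states`, `must_point_to`, `step_copy`, and `step_addr` are hypothetical. It also checks a preservation rule for `Z := &W` with and without the `X ≠ Z` side condition, to show how a missing side condition would surface as a failed condition.

```python
from itertools import product

VARS = ["a", "b", "c"]
# ASSUMED value domain: the address of a variable, or the integer 0.
VALUES = [("addr", v) for v in VARS] + [("int", 0)]


def states():
    """Enumerate every assignment of values to the three variables."""
    for vals in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, vals))


def must_point_to(state, x, y):
    """Meaning of mustPointTo(x, y) in a state: x == &y."""
    return state[x] == ("addr", y)


def step_copy(state, z, x):
    """Execute z := x."""
    out = dict(state)
    out[z] = state[x]
    return out


def step_addr(state, z, w):
    """Execute z := &w."""
    out = dict(state)
    out[z] = ("addr", w)
    return out


def copy_rule_holds():
    # If X == &Y holds in `in` and Z := X takes `in` to `out`,
    # then Z == &Y must hold in `out`.
    return all(
        must_point_to(step_copy(s, z, x), z, y)
        for s in states()
        for x, y, z in product(VARS, repeat=3)
        if must_point_to(s, x, y)
    )


def preserve_rule_holds(require_distinct):
    # If X == &Y holds in `in` and Z := &W takes `in` to `out`,
    # then X == &Y must hold in `out`; sound only when X != Z.
    return all(
        must_point_to(step_addr(s, z, w), x, y)
        for s in states()
        for x, y, z, w in product(VARS, repeat=4)
        if must_point_to(s, x, y) and (x != z or not require_distinct)
    )


print(copy_rule_holds(), preserve_rule_holds(True), preserve_rule_holds(False))
```

Each check looks at a single statement and a single pair of states, which is what makes the condition local: no reasoning about the rest of the CFG is involved.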
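Local correctness of trans. rules
define fact mustPointTo(X:Var,Y:Var) with meaning ⟦X == &Y⟧
if mustPointTo(X,Y)@in ∧ currStmt = [*X := Z] then transform to [Y := Z]
[Diagram: statement *X := Z with X→Y on the incoming edge (in), next to the replacement statement Y := Z; the question is whether both reach the same out state]
Local VC sent to ATP: if ⟦X == &Y⟧(in) ∧ (*X := Z takes state in to state out) then (Y := Z takes state in to state out)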
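The condition for a transformation rule compares the original and the replacement statement on the same input state. As a rough illustration, the Python sketch below checks that on a small assumed state model. The names `run_store`, `run_assign`, and `transformation_vc_holds` are hypothetical, a store through a non-pointer is modeled as having no successor state, and a bounded enumeration like this is not a proof.

```python
from itertools import product

VARS = ["a", "b", "c"]
# ASSUMED value domain: the address of a variable, or the integer 0.
VALUES = [("addr", v) for v in VARS] + [("int", 0)]


def states():
    """Enumerate every assignment of values to the three variables."""
    for vals in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, vals))


def run_store(state, x, z):
    """Execute *x := z; returns None if x does not hold an address."""
    kind, target = state[x]
    if kind != "addr":
        return None
    out = dict(state)
    out[target] = state[z]
    return out


def run_assign(state, y, z):
    """Execute y := z."""
    out = dict(state)
    out[y] = state[z]
    return out


def transformation_vc_holds(assume_fact):
    """Check that *X := Z and Y := Z agree on every enumerated input state.

    With assume_fact=True, only states satisfying X == &Y are considered,
    which is the antecedent supplied by mustPointTo(X, Y) on the in edge.
    """
    for s in states():
        for x, y, z in product(VARS, repeat=3):
            if assume_fact and s[x] != ("addr", y):
                continue
            if run_store(s, x, z) != run_assign(s, y, z):
                return False
    return True


print(transformation_vc_holds(True), transformation_vc_holds(False))
```

Dropping the fact from the antecedent makes the check fail, which illustrates why the transformation is only justified where the analysis has established `mustPointTo(X, Y)`.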
More on correctness (see paper for details)
Heap summaries
Separating profitability from correctness
Theorem stating soundness of the framework for creating interprocedural and flow-insensitive analyses
Current and future work
Backward optimizations
Infer rules from just the dataflow fact declarations and their meanings
Debugging
Efficient execution engine
Some related work
Proving correctness by hand
–Abstract interpretation [Cousot and Cousot 77, 79]
–Partial equivalence relations [Benton 04]
–Temporal logic [Lacey et al. 02]
Proving correctness with interactive theorem prover
–Using Coq proof assistant [Cachera et al. 04]
Testing correctness one compilation at a time
–Translation validation [Pnueli et al. 98, Necula 00]
–Credible compilation [Rinard 99]
Execution engines
–Incremental execution of transformations [Sittampalam et al. 04]
–Running opts specified with temporal logic [Steffen 91]
Conclusion
Local rules in Rhodium are more expressive than Cobalt’s global condition
The correctness checker found subtle bugs in our Rhodium opts
A good step towards pushing more of the burden of writing compilers onto the computer