Provably Correct Compilers (Part 2) Nazrul Alam and Krishnaprasad Vikram April 21, 2005.



Today's focus: Automated Soundness Proofs for Dataflow Analyses and Transformations via Local Rules (the Rhodium paper), by Sorin Lerner, Todd Millstein, Erika Rice, and Craig Chambers.

Rhodium: Successor of Cobalt
Increased expressiveness:
– New model for expressing opts: local propagation rules with explicit dataflow facts
– Heap summaries
– Infinite analysis domains
– Flow-sensitive and flow-insensitive
– Intraprocedural and interprocedural
Some Rhodium opts not expressible in Cobalt:
– Arithmetic invariant detection
– Integer range analysis
– Loop-induction-variable strength reduction
– Andersen's may-point-to analysis with allocation-site summaries

Recap from Last Class
An optimization needs a supporting analysis. In Cobalt, each analysis is formulated as a global path condition. In Rhodium, it is expressed with local rules.

Similarities with Cobalt (also helpful for refreshing your memory)

[Figure: Rhodium system overview. Written by the programmer: the Rhodium optimizations (Rdm Opt). Given: the checker and the Rhodium execution engine. The execution engine runs the Rdm Opts inside the compiler, which turns a source program such as if (…) { x := …; } else { y := …; } …; into an executable.]

Verification Task
Show that for any original program: behavior of original program = behavior of optimized program.
[Figure: the checker turns each Rdm Opt into a verification task, which is handed to an automatic theorem prover.]

Three techniques to simplify the Verification Task
1. Rhodium is declarative
– no loops, no branches, no program counter
– declare intent using rules
– execution engine takes care of the rest
2. Factor out heuristics
– legal transformations (the part that must be reasoned about)
– vs. profitable transformations (heuristics not affecting correctness)
3. Split verification task
– opt-dependent
– vs. opt-independent
Result: expressive language, automated correctness checking

Where is the difference? Rhodium's local rules differ from Cobalt's global conditions. So how exactly does Rhodium work?

MustPointTo analysis
Example program: a = &b; c = a; d = *c
[Figure: a and c both point to b, so d = *c can be replaced by d = b.]

MustPointTo info in Rhodium
define fact mustPointTo(X:Var,Y:Var) with meaning σ(X) == σ(&Y)
Facts on the edges of the example program:
– after a = &b: mustPointTo(a, b)
– after c = a: mustPointTo(a, b), mustPointTo(c, b)
– d = *c therefore sees mustPointTo(c, b) on its incoming edge
Fact correct on edge if: whenever program execution reaches the edge, the meaning of the fact evaluates to true in the program state.

Propagating facts
define fact mustPointTo(X:Var,Y:Var) with meaning σ(X) == σ(&Y)
if currStmt == [X = &Y] then mustPointTo(X, Y)@out
if mustPointTo(X, Y)@in ∧ currStmt == [Z = X] then mustPointTo(Z, Y)@out
On the example program: the first rule fires at a = &b and yields mustPointTo(a, b); the second rule fires at c = a and yields mustPointTo(c, b).

Transformations
if mustPointTo(X, Y)@in ∧ currStmt == [Z = *X] then transform to [Z = Y]
On the example program: mustPointTo(c, b) holds on the edge into d = *c, so the statement is transformed to d = b.

Semantics of a Rhodium opt
1. Run propagation rules in a loop until there are no more changes (optimistic iterative analysis)
2. Then run transformation rules
3. Then run profitability heuristics
For better precision, combine propagation rules and transformation rules.
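
As a rough sketch of these semantics on the running example (a = &b; c = a; d = *c), the Python below runs the two mustPointTo propagation rules to a fixpoint and then applies the transformation rule. The encoding is an assumption made for illustration only: statements are tuples, the program is straight-line (so there are no merges), facts are (X, Y) pairs, and the preservation step (a fact about X survives any statement that does not assign X) is not taken from the slides. Profitability heuristics are omitted. This is not the Rhodium execution engine.

```python
# Hypothetical statement encoding (not Rhodium syntax):
#   ("addr", X, Y)  means  X = &Y
#   ("copy", Z, X)  means  Z = X
#   ("load", Z, X)  means  Z = *X

def flow(stmt, facts_in):
    """Apply the propagation rules to one statement; return facts on its outgoing edge."""
    kind, lhs, rhs = stmt
    # Assumed preservation rule: mustPointTo(X, Y) survives if the statement does not assign X.
    facts_out = {(x, y) for (x, y) in facts_in if x != lhs}
    if kind == "addr":
        # currStmt == [X = &Y]  gives  mustPointTo(X, Y) on the outgoing edge
        facts_out.add((lhs, rhs))
    elif kind == "copy":
        # mustPointTo(X, Y) on entry and currStmt == [Z = X]  gives  mustPointTo(Z, Y)
        facts_out |= {(lhs, y) for (x, y) in facts_in if x == rhs}
    return frozenset(facts_out)

def analyze(program):
    """Run the propagation rules in a loop until no edge changes."""
    # edges[i] holds the facts on the edge before program[i]; edges[-1] is the final edge.
    edges = [frozenset()] * (len(program) + 1)
    changed = True
    while changed:
        changed = False
        for i, stmt in enumerate(program):
            new_facts = flow(stmt, edges[i])
            if new_facts != edges[i + 1]:
                edges[i + 1] = new_facts
                changed = True
    return edges

def transform(program, edges):
    """Rewrite Z = *X to Z = Y wherever mustPointTo(X, Y) holds on the incoming edge."""
    result = []
    for i, (kind, lhs, rhs) in enumerate(program):
        targets = [y for (x, y) in edges[i] if x == rhs]
        if kind == "load" and len(targets) == 1:
            result.append(("copy", lhs, targets[0]))
        else:
            result.append((kind, lhs, rhs))
    return result

program = [("addr", "a", "b"), ("copy", "c", "a"), ("load", "d", "c")]
edges = analyze(program)
optimized = transform(program, edges)
```

Here `edges[2]` holds both mustPointTo(a, b) and mustPointTo(c, b), and the last statement of `optimized` is the copy d = b.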

Rhodium is more expressive

Cobalt: Simple Pointer Analysis

Rhodium version

And Rhodium can do more: some optimizations cannot be expressed in Cobalt

Arithmetic Simplification Optimization

Arithmetic Simplification Optimization (Cont.)

Loop induction-variable strength reduction

Loop induction-variable strength reduction (cont.)

Checking Rhodium optimizations

Rhodium correctness checker
[Figure: a Rhodium optimization (define fact …; if … then …; if … then transform …), together with its profitability heuristics, runs in the Rhodium execution engine inside the compiler. The same optimization is fed to the checker. The checker has an opt-dependent part, VCGen, which generates Local VCs that are discharged by an automatic theorem prover, and an opt-independent part, a lemma with its proof.]
Lemma: For any Rhodium opt: if the Local VCs are true, then the opt is correct.

Local correctness of prop. rules
define fact mustPointTo(X,Y) with meaning σ(X) == σ(&Y)
if mustPointTo(X, Y)@in ∧ currStmt == [Z = X] then mustPointTo(Z, Y)@out
Fact correct on edge iff: whenever program execution reaches the edge, the meaning of the fact evaluates to true in the program state.
Obligation for a propagation rule:
– Assume: all incoming facts are correct
– Show: the propagated fact is correct
Local VC (generated and proven automatically):
– Assume: X == &Y (in σ_in) and σ_out = step(σ_in, [Z = X])
– Show: Z == &Y (in σ_out)
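
The checker discharges this local VC with an automatic theorem prover. Purely as an illustration of what the obligation says, the sketch below tests it by brute force over a tiny state space. The state model (three variables, values that are either small integers or the address of a variable) and the names `must_point_to`, `step_copy`, and `check_copy_rule` are assumptions made for this example. Exhaustive testing of a bounded model is not a proof and is not how Rhodium works.

```python
from itertools import product

VARS = ["x", "y", "z"]
# Assumed value domain: two integers plus the address of each variable.
VALUES = [0, 1] + [("addr", v) for v in VARS]

def all_states():
    """Enumerate every assignment of a value to each variable."""
    for combo in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, combo))

def must_point_to(state, x, y):
    """Meaning of mustPointTo(X, Y): sigma(X) == sigma(&Y)."""
    return state[x] == ("addr", y)

def step_copy(state, z, x):
    """step(sigma_in, [Z = X]): copy the value of X into Z."""
    out = dict(state)
    out[z] = state[x]
    return out

def check_copy_rule():
    """Return a counterexample (sigma_in, X, Y, Z), or None if the VC holds on every enumerated state."""
    for x, y, z in product(VARS, repeat=3):
        for sigma_in in all_states():
            if not must_point_to(sigma_in, x, y):
                continue  # the VC only assumes states where the incoming fact is correct
            sigma_out = step_copy(sigma_in, z, x)
            if not must_point_to(sigma_out, z, y):
                return (sigma_in, x, y, z)
    return None

counterexample = check_copy_rule()
```

On this bounded model `counterexample` is `None`: whenever X holds the address of Y before Z = X, Z holds the address of Y afterwards.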

Evaluation

Dimensions of evaluation
– Correctness guarantees
– Usefulness of the checker
– Expressiveness

Correctness guarantees
Once checked, optimizations are guaranteed to be correct.
Caveat: trusted computing base
– execution engine
– checker implementation
– proofs done by hand once by Lerner
Adding a new optimization does not increase the size of the trusted computing base.

Usefulness of the checker
Found subtle bugs in Lerner's initial implementation of various optimizations.
define fact equals(X:Var, E:Expr) with meaning σ(X) == σ(E)
Initial rule: if currStmt == [X = E] then equals(X, E)@out
– Bug: for x = x + 1 the rule propagates equals(x, x + 1), which does not hold after the statement.
First fix: if currStmt == [X = E] ∧ "X does not appear in E" then equals(X, E)@out
– Still a bug: for x = *y + 1 the rule propagates equals(x, *y + 1), which does not hold when y points to x.
Final fix: if currStmt == [X = E] ∧ "E does not use X" then equals(X, E)@out
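
The two bugs can be reproduced by running the statements concretely and evaluating the meaning of equals(X, E) afterwards. The sketch below is an illustration under assumed encodings: expressions are tuples, states are dictionaries, pointer values are ("addr", name) pairs, and the helper names are hypothetical. It is not taken from the Rhodium implementation.

```python
# Hypothetical expression encoding:
#   ("const", 1), ("var", "x"), ("deref", "y") for *y, ("add", e1, e2) for e1 + e2

def evaluate(expr, state):
    """Evaluate an expression in a state (a dict from variable name to value)."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return state[expr[1]]
    if kind == "deref":
        _, target = state[expr[1]]  # the pointer variable must hold ("addr", name)
        return state[target]
    if kind == "add":
        return evaluate(expr[1], state) + evaluate(expr[2], state)
    raise ValueError(f"unknown expression kind: {kind}")

def assign(state, x, expr):
    """Execute X = E and return the resulting state."""
    out = dict(state)
    out[x] = evaluate(expr, state)
    return out

def equals_holds(state, x, expr):
    """Meaning of equals(X, E): sigma(X) == sigma(E)."""
    return state[x] == evaluate(expr, state)

def appears_in(x, expr):
    """Syntactic side condition of the first fix: does X occur textually in E?"""
    kind = expr[0]
    if kind in ("var", "deref"):
        return expr[1] == x
    if kind == "add":
        return appears_in(x, expr[1]) or appears_in(x, expr[2])
    return False

x_plus_1 = ("add", ("var", "x"), ("const", 1))
star_y_plus_1 = ("add", ("deref", "y"), ("const", 1))

# Bug 1: x = x + 1. The propagated fact equals(x, x + 1) is false afterwards.
after_1 = assign({"x": 5}, "x", x_plus_1)
bug_1 = not equals_holds(after_1, "x", x_plus_1)

# Bug 2: x = *y + 1 with y pointing to x. X does not appear in E, yet the fact is false.
aliased = {"x": 5, "y": ("addr", "x")}
after_2 = assign(aliased, "x", star_y_plus_1)
bug_2 = (not appears_in("x", star_y_plus_1)) and not equals_holds(after_2, "x", star_y_plus_1)

# No aliasing: y points to z, so E does not use x and the fact holds.
unaliased = {"x": 5, "y": ("addr", "z"), "z": 2}
after_3 = assign(unaliased, "x", star_y_plus_1)
ok_3 = equals_holds(after_3, "x", star_y_plus_1)
```

Both `bug_1` and `bug_2` come out true, and `ok_3` is true as well. The syntactic check accepts x = *y + 1 even though the expression reads x through y, which is why the final side condition is phrased in terms of what E uses.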

Rhodium expressiveness
Traditional optimizations:
– const prop and folding, branch folding, dead assignment elim, common sub-expression elim, partial redundancy elim, partial dead assignment elim, arithmetic invariant detection, and integer range analysis
Pointer analyses:
– must-point-to analysis, Andersen's may-point-to analysis with heap summaries
Loop opts:
– loop-induction-variable strength reduction, code hoisting, code sinking
Array opts:
– constant propagation through array elements, redundant array load elimination

Expressiveness limitations
May not be able to express your optimization in Rhodium:
– opts that build complicated data structures
– opts that perform complicated many-to-many transformations (e.g., loop fusion, loop unrolling)
A correct Rhodium optimization may be rejected by the correctness checker:
– limitations of the theorem prover
– limitations of first-order logic

Summary
Rhodium system:
– makes it easier to write optimizations
– provides correctness guarantees
– is expressive enough for realistic optimizations
Rhodium system provides a foundation for safe extensible program manipulators.

Future work
Overcome the limitations of Rhodium:
– opts that build complicated data structures
– opts that perform complicated many-to-many transformations (e.g., loop fusion, loop unrolling)
Overcome the limitations of the theorem prover:
– Simplify is conservative
– Use higher-order logic?

Future work (cont.)
Automatically infer the whole compiler from a high-level specification:
– Infer analyses
– Infer transformations
– Program optimizations by demonstration
– Automatically pick good data representations
– Automatically explore the tradeoffs between scalability and precision

References
Correctness:
– By hand [Cousot and Cousot 77, 79, Benton 04, Lacey et al. 02]
– With interactive theorem prover [Cachera et al. 04]
– One compilation at a time [Pnueli et al. 98, Necula 00, Rinard 99]
Declarative languages for writing transformations:
– Attribute grammars [Reps and Teitelbaum 88]
– Temporal logic [Steffen 91, Lacey et al. 02]
Execution engines:
– Incremental execution of transformations [Sittampalam et al. 04]
– Running opts specified with temporal logic [Steffen 91]

More facts
define fact mustNotPointTo(X:Var,Y:Var) with meaning σ(X) ≠ σ(&Y)
define fact hasConstantValue(X:Var,C:Const) with meaning σ(X) == C
define fact doesNotPointIntoHeap(X:Var) with meaning ∃ Y:Var . σ(X) == σ(&Y)

More rules
if currStmt == [X = *A] ∧ … B:Var. … mustNotPointTo(B,Y) then …