1 Regression Verification: Proving the equivalence of similar programs Benny Godlin Ofer Strichman Technion, Haifa, Israel Recently joined: Yossi Levhari Looking for an additional PhD student for this project.
2 Functional Verification The main pillar of the grand challenge [H’03]. Suppose we ignore completeness. Still, there are two major problems: Specification Complexity
3 Words of wisdom “For every problem that you cannot solve, there is an easier problem that you cannot solve” * * George Polya, in How To Solve It
4 A more modest challenge: Regression Verification Develop a method for formally verifying the equivalence of two similar programs. Pros: Default specification = earlier version. Computationally easier than functional verification. Ideally, the complexity should depend on the semantic difference between the programs, and not on their size. Cons: Defines a weaker notion of correctness.
5 Regression-based technologies In Testing: Regression testing is the most popular automatic testing method In hardware verification: Equivalence checking is the most used verification method in the industry
6 Hoare’s 1969 paper has it all.... …and 6 pages later:
7 Previous work In the theorem-proving world (ACL2 community): Not industrial programming languages Not utilizing the similarity between the two programs Industrial / realistic programs: Code free of: loops, recursion, dynamic-memory allocation Intel [AEFMMSSTVZ-05], embedded Feng & Hu [FH-05], symbolic Matsumoto et al. [TSF-06]
8 Goals Develop notions of equivalence Develop corresponding proof rules Present a prototype for verifying equivalence of C programs, that incorporates the proof rules sensitive to the magnitude of change rather than the magnitude of the programs
9 Notions of equivalence (1 / 6) Partial equivalence Executions of P1 and P2 on equal inputs …which terminate, result in equal outputs. Undecidable
10 Notions of equivalence (2 / 6) Mutual termination Given equal inputs, P1 terminates if and only if P2 terminates. Undecidable
11 Notions of equivalence (3 / 6) Reactive equivalence Let P1 and P2 be reactive programs. Executions of P1 and P2 …which read the same input sequence, emit the same output sequence. Undecidable
12 Notions of equivalence (4 / 6) k -equivalence Executions of P1 and P2 on equal inputs …where loops and recursions are restricted to k iterations, result in equal output. Decidable
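To make the bound k concrete, here is a minimal C sketch. It assumes two hypothetical gcd variants (`gcd1_k`, `gcd2_k`) that carry an explicit recursion budget, and it compares them by plain enumeration over a small input range. A bounded model checker would explore the bounded executions symbolically instead, so this only illustrates the definition; it is not a decision procedure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bounded variants: `budget` is the number of recursive calls
   still allowed. When it runs out, *cut is set and the result is ignored. */
static unsigned gcd1_k(unsigned a, unsigned b, unsigned budget, bool *cut) {
    if (b == 0) return a;
    if (budget == 0) { *cut = true; return 0; }
    return gcd1_k(b, a % b, budget - 1, cut);
}

static unsigned gcd2_k(unsigned x, unsigned y, unsigned budget, bool *cut) {
    unsigned z = x;
    if (y > 0) {
        if (budget == 0) { *cut = true; return 0; }
        z = gcd2_k(y, z % y, budget - 1, cut);
    }
    return z;
}

/* Returns true if no input pair in [0, limit) x [0, limit) distinguishes the
   two functions on executions that need at most k recursive calls. */
bool k_equivalent_on_range(unsigned k, unsigned limit) {
    for (unsigned a = 0; a < limit; a++) {
        for (unsigned b = 0; b < limit; b++) {
            bool cut1 = false, cut2 = false;
            unsigned r1 = gcd1_k(a, b, k, &cut1);
            unsigned r2 = gcd2_k(a, b, k, &cut2);
            if (cut1 || cut2) continue;  /* exceeds the bound: not considered */
            if (r1 != r2) return false;
        }
    }
    return true;
}
```

Executions that exceed the budget are skipped rather than reported, which is why k-equivalence says nothing about deeper recursions.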
13 Notions of equivalence (5 / 6) Full equivalence P1 and P2 are partially equivalent and mutually terminate Undecidable
14 Notions of equivalence (6 / 6) Total equivalence P1 and P2 are partially equivalent and both terminate Undecidable
15 Notions of equivalence: summary 1. Partial equivalence 2. Mutual termination 3. Reactive equivalence 4. k-equivalence 5. Full equivalence* = Partial equivalence + mutual termination 6. Total equivalence** = partial equivalence + both terminate * Appeared in Luckham, Park, and M. Paterson [LPP-70] / Pratt [P-71] ** Appeared in Bouge and D. Cachera [BC-97]
16 Partial equivalence Consider the call graphs: [Figure: Side 1 consists of a recursive function A, Side 2 of a recursive function B] … where A, B have: same prototype no loops Prove partial equivalence of A, B How shall we handle the recursion ?
17 Hoare’s Rule for Recursion Let A be a recursive function. “… The solution... is simple and dramatic: to permit the use of the desired conclusion as a hypothesis in the proof of the body itself.” [H’71]
18 Hoare’s Rule for Recursion
// {p}
A(...) {
  ...
  // {p}
  call A(...);
  // {q}
  ...
}
// {q}
19 Rule 1: Proving partial equivalence
Side 1:
//in[A]
A(...) {
  ...
  //in[call A]
  call A(...);
  //out[call A]
  ...
}
//out[A]
Side 2:
//in[B]
B(...) {
  ...
  //in[call B]
  call B(...);
  //out[call B]
  ...
}
//out[B]
20 Rule 1: Proving partial equivalence Q: What can a verification condition for the premise look like? A: Replace the recursive calls with calls to functions that over-approximate A, B, and are partially equivalent by construction Natural candidates: Uninterpreted Functions
21 Proving partial equivalence Let A^UF, B^UF be A, B after replacing the recursive call with a call to (the same) uninterpreted function. We can now rewrite the rule with partial equivalence of A^UF and B^UF as its premise. The premise is decidable
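The formula on this slide did not survive extraction. Based on the surrounding slides, the rewritten rule presumably has the shape below. The predicate name partial-equiv is an assumption, and the label follows the (PART-EQ-1) name used on the next slide.

```latex
\[
\frac{\textit{partial-equiv}\left(A^{UF},\, B^{UF}\right)}
     {\textit{partial-equiv}\left(A,\, B\right)}
\;\;(\text{PART-EQ-1})
\]
```

Since A^UF and B^UF contain neither loops nor recursion, the premise reduces to a validity check over a formula with uninterpreted functions.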
22 Using (PART-EQ-1): example
unsigned gcd1_UF(unsigned a, unsigned b) {
  unsigned g;
  if (b == 0)
    g = a;
  else {
    a = a % b;
    g = U(b, a);      /* recursive call gcd1(b, a) replaced by U */
  }
  return g;
}
unsigned gcd2_UF(unsigned x, unsigned y) {
  unsigned z;
  z = x;
  if (y > 0)
    z = U(y, z % y);  /* recursive call gcd2(y, z % y) replaced by U */
  return z;
}
Transition functions: T_gcd1, T_gcd2. Inputs: a, b and x, y. Outputs: g and z. Question: given equal inputs, is g = z ?
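23 Rule 1: example Transition functions: T_gcd1, T_gcd2. Inputs: a, b (gcd1) and x, y (gcd2). Outputs: g (gcd1) and z (gcd2). Check: equal inputs (a = x and b = y), together with T_gcd1 and T_gcd2, imply equal outputs (g = z).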
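The check for the gcd example can be approximated in plain C as follows. This is a sketch under stated assumptions: the function `U` below is a hypothetical stand-in that picks a few arbitrary deterministic interpretations (selected by `seed`) for the uninterpreted function, and inputs are enumerated over a small range. A real proof must hold for every interpretation and every input, which is what the symbolic check establishes.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the uninterpreted function U (an assumption of this sketch).
   Each seed gives a different, arbitrary but deterministic interpretation. */
static unsigned U(unsigned seed, unsigned p, unsigned q) {
    return (p * 2654435761u) ^ (q * 40503u) ^ (seed * 97u);
}

/* gcd1 with its recursive call replaced by U. */
static unsigned gcd1_uf(unsigned seed, unsigned a, unsigned b) {
    unsigned g;
    if (b == 0) {
        g = a;
    } else {
        a = a % b;
        g = U(seed, b, a);
    }
    return g;
}

/* gcd2 with its recursive call replaced by the same U. */
static unsigned gcd2_uf(unsigned seed, unsigned x, unsigned y) {
    unsigned z = x;
    if (y > 0) {
        z = U(seed, y, z % y);
    }
    return z;
}

/* Equal inputs must give equal outputs for every tested interpretation. */
bool part_eq_premise_holds(unsigned limit, unsigned seeds) {
    for (unsigned s = 0; s < seeds; s++)
        for (unsigned a = 0; a < limit; a++)
            for (unsigned b = 0; b < limit; b++)
                if (gcd1_uf(s, a, b) != gcd2_uf(s, a, b))
                    return false;
    return true;
}
```

The check passes because both sides call U with the same arguments (b, a % b) whenever b is nonzero, and return a otherwise.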
24 Partial equivalence: Generalization Assume: no loops; 1-1 mapping map between the recursive functions of both sides Mapped functions have the same prototype Define: For a function f, UF( f ) is an uninterpreted function such that f and UF( f ) have the same prototype, and for every ( f, g ) ∈ map, UF( f ) = UF( g ).
25 Partial equivalence: Generalization Definition: is called in A]
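26 Partial equivalence: Example (1 / 3) Side 1: f and g call each other. Side 2: f’ and g’ call each other. (g,g’), (f,f’) ∈ map Need to prove: f with its call to g replaced by UF(g) is partially equivalent to f’ with its call to g’ replaced by UF(g’), and g with its call to f replaced by UF(f) is partially equivalent to g’ with its call to f’ replaced by UF(f’).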
27 Partial equivalence: Example (2 / 3) An improvement: Find a map that intersects all cycles, e.g., (g,g’) Only when calling functions in this map replace with uninterpreted functions [Figure: call graphs of Side 1 (f, g) and Side 2 (f’, g’) with UF abstractions]
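Whether a candidate set of functions intersects all cycles can be checked on the call graph of one side. The sketch below is illustrative only: the `CallGraph` type, the function names, and the demo are hypothetical and not taken from RVT. It drops every call edge into an abstracted function, since such calls become calls to an uninterpreted function, and then looks for a remaining cycle with a depth-first search.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 8

typedef struct {
    int n;                              /* number of functions */
    bool calls[MAX_FUNCS][MAX_FUNCS];   /* calls[f][g]: f calls g */
} CallGraph;

/* DFS with three states: 0 = unvisited, 1 = on the stack, 2 = finished. */
static bool has_cycle_from(const CallGraph *cg, const bool *abstracted,
                           int f, int *state) {
    state[f] = 1;
    for (int g = 0; g < cg->n; g++) {
        if (!cg->calls[f][g] || abstracted[g]) continue; /* call replaced by UF */
        if (state[g] == 1) return true;                  /* back edge: cycle */
        if (state[g] == 0 && has_cycle_from(cg, abstracted, g, state)) return true;
    }
    state[f] = 2;
    return false;
}

/* True if abstracting the marked functions leaves no recursion to follow. */
bool breaks_all_cycles(const CallGraph *cg, const bool *abstracted) {
    int state[MAX_FUNCS] = {0};
    for (int f = 0; f < cg->n; f++)
        if (state[f] == 0 && has_cycle_from(cg, abstracted, f, state))
            return false;
    return true;
}

/* Demo for the f <-> g example: abstracting only g is enough. */
bool demo_abstracting_g_suffices(void) {
    CallGraph cg = { .n = 2 };          /* index 0 = f, index 1 = g */
    cg.calls[0][1] = true;
    cg.calls[1][0] = true;
    bool only_g[MAX_FUNCS] = { false, true };
    bool nothing[MAX_FUNCS] = { false };
    return breaks_all_cycles(&cg, only_g) && !breaks_all_cycles(&cg, nothing);
}
```

In the two-function example, f still calls g, but that call is abstracted, so f can be checked with g's body never being entered recursively.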
28 Partial equivalence: Example (3 / 3) Connected SCCs… Prove bottom-up Abstract partially-equivalent functions Inline [Figure: call graphs of Side 1 (f, g, h) and Side 2 (f’, g’, h’) with UF abstractions]
29 RVT: Decomposition algorithm [Figure: call graphs of A (f1() to f6()) and B (f1’() to f5’() and f7’()); some nodes are marked U, one pair is marked "check"] Legend: Equivalent pair; Syntactically equivalent pair; Equivalence undecided yet; Could not prove equivalent; Unpaired function
30 RVT: Decomposition algorithm (with SCCs) [Figure: call graphs of A (f1() to f6()) and B (f1’() to f6’()); some nodes are marked U, one pair is marked "check"] Legend: Equivalent pair; Syntactically equivalent pair; Equivalence undecided yet; Could not prove equivalent; Equivalent if MSCC
31 PART-EQ: Soundness Proved soundness for a simple programming language (LPL) Covers most features of modern imperative languages …but does not allow call by reference, and address manipulation.
32 PART-EQ: Completeness We show a sufficient condition for completeness.
33 PART-EQ: Completeness Definition: A,B are reach-equivalent if… (by example) A() { … if (cond 1 ) f(a 1,b 1 ) else … … } B() { … if (cond 2 ) g(a 2,b 2 ) else ….. }
34 PART-EQ: Completeness Definition: A,B are reach-equivalent if… For equal inputs: For callees F,G s.t. (F,G) ∈ map: F is called if and only if G is called; F and G are called with the same arguments. Completeness: (PART-EQ) is complete for reach-equivalent functions.* * under some “reasonable” assumptions (e.g. no dead-code)
35 What (PART-EQ) cannot prove... Not reach-equivalent: calling with different arguments. returns n + nondet() returns n + n -1 + nondet()
36 What (PART-EQ) cannot prove... Not reach-equivalent: calling under different conditions: returns 1 returns 1 + nondet() when n == 1 : (news)
37 Notions of equivalence: summary 1. Partial equivalence 2. Mutual termination 3. Reactive equivalence 4. k-equivalence 5. Full equivalence* = Partial equivalence + mutual termination 6. Total equivalence** = partial equivalence + both terminate * Appeared in Luckham, Park, and M. Paterson [LPP-70] / Pratt [P-71] ** Appeared in Bouge and D. Cachera [BC-97]
38 Rule 2: Proving Mutual termination Prove with (PART-EQ)
39 Using (M-TERM): example
unsigned gcd1_UF(unsigned a, unsigned b) {
  unsigned g;
  if (b == 0)
    g = a;
  else {
    a = a % b;
    g = U(b, a);      /* recursive call gcd1(b, a) replaced by U */
  }
  return g;
}
unsigned gcd2_UF(unsigned x, unsigned y) {
  unsigned z;
  z = x;
  if (y > 0)
    z = U(y, z % y);  /* recursive call gcd2(y, z % y) replaced by U */
  return z;
}
Highlighted on the slide: the inputs (a, b) and (x, y), the conditions (b == 0) and (y > 0), and the call arguments (b, a) and (y, z % y).
40 Proving reach-equiv(gcd 1 UF, gcd 2 UF ) Valid. gcd 1,gcd 2 mutually terminate. Using (M-TERM) : example Equal inputs Equal guards if called then equal arguments
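The three conditions (equal inputs, equal guards, equal arguments when called) can be illustrated for the gcd pair with a small C sketch. The `CallSite` record and the helper names are hypothetical, and the check enumerates a finite input range, so it only illustrates the condition rather than proving it.

```c
#include <assert.h>
#include <stdbool.h>

/* What one side does at its recursive call site for a given input. */
typedef struct {
    bool called;           /* is the recursive call reached? */
    unsigned arg0, arg1;   /* arguments passed, valid only if called */
} CallSite;

static CallSite gcd1_call(unsigned a, unsigned b) {
    CallSite c = { false, 0, 0 };
    if (b == 0) return c;          /* no call when b == 0 */
    a = a % b;
    c.called = true;
    c.arg0 = b;
    c.arg1 = a;
    return c;
}

static CallSite gcd2_call(unsigned x, unsigned y) {
    CallSite c = { false, 0, 0 };
    unsigned z = x;
    if (y > 0) {                   /* call only when y > 0 */
        c.called = true;
        c.arg0 = y;
        c.arg1 = z % y;
    }
    return c;
}

/* Equal inputs must lead to equal guards and, if called, equal arguments. */
bool reach_equivalent_on_range(unsigned limit) {
    for (unsigned a = 0; a < limit; a++) {
        for (unsigned b = 0; b < limit; b++) {
            CallSite c1 = gcd1_call(a, b);
            CallSite c2 = gcd2_call(a, b);
            if (c1.called != c2.called) return false;
            if (c1.called && (c1.arg0 != c2.arg0 || c1.arg1 != c2.arg1))
                return false;
        }
    }
    return true;
}
```

For unsigned values, the guards `!(b == 0)` and `y > 0` coincide, and both sides pass (b, a % b), which is why the condition is valid for this pair.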
41 The Regression Verification Tool (RVT) Given two C programs: Convert loops to recursive functions. Map functions, globals, etc. After that: Decompose to the granularity of pairs of functions Use a C verification engine (CBMC) to discharge the resulting equivalence checks
42 The Regression Verification Tool (RVT) CBMC: a C bounded model checker by Daniel Kroening Our use: No loops or recursion to unroll... Use “assume(…)” construct to enforce equal inputs. Use assert() to demand equal outputs. Uninterpreted functions are implemented as C functions: Return consistent nondeterministic values.
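One way to picture an uninterpreted function as a C function that returns consistent nondeterministic values is a memo table: the first call with given arguments draws an arbitrary value, and later calls with the same arguments return it again. The sketch below is an assumption about the general idea, not RVT's actual encoding. The names `uf_gcd` and `UfEntry` are hypothetical, and `rand()` merely stands in for the model checker's nondeterministic choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define UF_TABLE_SIZE 64

typedef struct {
    unsigned p, q;      /* arguments seen so far */
    unsigned result;    /* value chosen for those arguments */
} UfEntry;

static UfEntry uf_table[UF_TABLE_SIZE];
static int uf_entries = 0;

/* Shared by both sides, since mapped functions use the same UF. */
unsigned uf_gcd(unsigned p, unsigned q) {
    /* Same arguments as an earlier call: return the same value. */
    for (int i = 0; i < uf_entries; i++)
        if (uf_table[i].p == p && uf_table[i].q == q)
            return uf_table[i].result;

    /* New arguments: pick an arbitrary value and remember it. */
    assert(uf_entries < UF_TABLE_SIZE);  /* sketch limit: table must not overflow */
    unsigned fresh = (unsigned)rand();   /* stand-in for a nondeterministic value */
    uf_table[uf_entries].p = p;
    uf_table[uf_entries].q = q;
    uf_table[uf_entries].result = fresh;
    uf_entries++;
    return fresh;
}
```

Because both programs call the same table-backed function, equal arguments on the two sides are guaranteed equal results, which is the only property an uninterpreted function provides.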
43 The Regression Verification Tool (RVT) The premise of ( PART-EQ ) requires comparing arguments. What if these arguments are pointers ? What our system does: Dynamic structures: creates an unrolled nondeterministic structure Arrays: attempts to find references to cells in the array.
44 RVT: User-defined equivalence specification The user can define pairs of ‘checkpoints’, one on side 1 and one on side 2. In each side: update an array with the value of exp each time it reaches label and condition holds. Assert that when executed on the same input, these arrays are equivalent: the sequence exp1, exp2, ... recorded by P1 must equal the sequence exp’1, exp’2, ... recorded by P2, element by element.
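The checkpoint mechanism can be sketched in C as follows. Everything here is hypothetical illustration: the `Trace` type, the `checkpoint` helper, and the two sides (a running sum versus a closed form) are not RVT's actual syntax. Each side appends the value of its expression to an array whenever the checkpoint is reached and its condition holds, and the two arrays are then compared.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CP 32

typedef struct {
    int values[MAX_CP];   /* recorded values of exp, in order */
    int count;
} Trace;

/* Record exp if the checkpoint's condition holds. */
static void checkpoint(Trace *t, bool cond, int exp) {
    if (cond && t->count < MAX_CP)
        t->values[t->count++] = exp;
}

/* Side 1: running sum, checkpointed on even iterations. */
static void side1(int n, Trace *t) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum += i;
        checkpoint(t, i % 2 == 0, sum);
    }
}

/* Side 2: closed form for the same sum, same checkpoint condition. */
static void side2(int n, Trace *t) {
    for (int i = 1; i <= n; i++)
        checkpoint(t, i % 2 == 0, i * (i + 1) / 2);
}

static bool traces_equal(const Trace *a, const Trace *b) {
    if (a->count != b->count) return false;
    for (int i = 0; i < a->count; i++)
        if (a->values[i] != b->values[i]) return false;
    return true;
}

/* Same input on both sides must produce equal checkpoint arrays. */
bool checkpoints_agree(int n) {
    Trace t1 = { {0}, 0 }, t2 = { {0}, 0 };
    side1(n, &t1);
    side2(n, &t2);
    return traces_equal(&t1, &t2);
}
```

Comparing recorded sequences rather than final outputs lets the user state equivalence at intermediate points of the two programs.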
45 RVT [Figure: architecture. Version A and Version B are given to RVT, which renames identical globals, enforces equality of inputs, asserts equality of outputs, and adds checkpoints. RVT passes the resulting C program to CBMC; the result or a counterexample is fed back.] Supports: Decomposition, Abstraction, some static analysis, …
47 RVT: Experiments We tested the Regression Verification Tool (RVT) with: Automatically generated sizable programs with complex recursive structures and loops, up to thousands of lines of code. Limited-size industrial programs: Parts of TCAS - Traffic Alert and Collision Avoidance System. Core of MicroC/OS - real-time kernel for embedded systems. Matlab examples: parts of engine-fuel-injection simulation.
48 Testing RVT on programs: Conclusions For equivalent programs, partial-equivalence checks were very fast: proving equivalence in minutes. For non-equivalent programs: RVT first attempts to prove partial equivalence; when that fails, it tries to prove k-equivalence
49 Summary Regression verification is an important problem A solution to this problem has a better chance to succeed in the industry than functional verification A grand challenge in its own right… Lots of future research...
50 More Challenges Q1: How can we generalize counterexamples ? Q2: What is the ideal gap between two versions of the same program, that makes Regression Verification most effective ? Q3: How can functional verification and equivalence verification benefit from each other ?
51 The end … Thank you
52 Partial equivalence: Example 2 Side 1: f(...) {... g(...)...} g(...) {... f(...)...} Side 2: f’(...) {... g’(...)...} g’(...) {... f’(...)...} map = {(g,g’),(f,f’)} Need to prove: 1. partial-eq(f(...) {... UF(g)(...)...}, f’(...) {... UF(g’)(...)...}) 2. partial-eq(g(...) {... UF(f)(...)...}, g’(...) {... UF(f’)(...)...}) (Later we show that it is sufficient just to break all cycles)
53 Some generalizations So far we only considered call graphs that are SCCs, with a 1-1 mapping between the functions [Figure: call graphs of Side 1 and Side 2]
54 Some generalizations What about… 1. Prove bottom-up 2. Abstract equivalent functions
55 Some generalizations What about… inlining
56 Some generalizations According to (PART-EQ) we should prove… [Figure: call graphs with a UF abstraction]
57 Some generalizations And then … … and so forth [Figure: call graphs with a UF abstraction]
58 Some generalizations But it is sufficient to … Find a map that intersects all cycles Only when calling functions in this map replace with uninterpreted functions In all other cases - inline
59 Rule 3: Proving reactive equivalence The rule: If all paired procedures satisfy 1. given the same arguments and the same input sequences, they return the same values, 2. they consume the same number of inputs, and 3. the interleaved sequence of procedure calls and output statements inside the mapped procedures is the same (and the procedure calls are made with the same arguments) then all paired procedures are reactive equivalent.
60 Rule 2: Proving Mutual termination The rule: If all paired procedures satisfy : 1. Partial equivalence, 2. The conditions under which they call each pair of mapped procedures are equal, and 3. The read arguments of the called procedures are the same when they are called, then all paired procedures mutually terminate.
61 Mutual Recursion – an optimization Mutually Recursive Functions: Recognized by MSCCs in the call graph Need to prove enough pairs such that all cycles are broken.
62 Long list of future work Finish implementation of: Rules 2/3, arrays (especially in UFs), slicing,… Future work: How to strengthen the rules with invariants. Test on real changes (CVS repository). Use other program verification tools for C to prove equivalence.
63 Long list of future work Future directions: Strengthen the rules with invariants. Support larger parts of C Use other program verification tools for C to prove equivalence. Q: What is the ideal gap between two versions of the same program, that makes Regression Verification most effective ?
64 Long list of future work Future directions: Strengthen the rules with invariants. Support larger parts of C
66 Outline Inference rules for proving equivalence Hoare’s rule of recursion Rule 1: partial equivalence Rule 2: mutual termination Rule 3: reactive equivalence Regression verification for C programs CBMC – the back engine Architecture Decomposition Handling dynamic data structures
67 Advantages of Regression Verification Specification by reference to an earlier version. Can be applied from early development stages. Equivalence proofs are potentially easier than functional verification. Ideally, the complexity should depend on the semantic difference between the programs, and not on their size.
68 C normal form... Every two C programs can be brought to the form: [Figure: a single main function]... but this will hardly ever work, and it contradicts our goal to decompose the problem
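69 Checkpoints with channels To allow more flexibility in comparison, allow channels. Channel C1: the sequence exp1, exp2, ... recorded by P1 must equal the sequence exp’1, exp’2, ... recorded by P2. Channel C2: likewise, the sequence recorded by P1 on C2 must equal the sequence recorded by P2 on C2.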