Rahul Sharma, Eric Schkufza, Berkeley Churchill, Alex Aiken
Goal: prove that two programs are equivalent.
Uses: compiler optimizations, validating refactorings, cross-checking different implementations.
Old and well-studied problem; undecidable in general.
Major challenge: proving equivalence of loops. Straight-line programs are relatively easy.
Prove equivalence of two binaries built from the same loop code (… while …): one from a trustworthy compiler (CompCert, gcc -O0), one from an optimizing compiler (gcc -O3, icc -O3).
Figure: straight-line code from a trustworthy compiler (CompCert, gcc -O0) is rewritten by STOKE (ASPLOS 13) using random mutations into straight-line code; the same picture is drawn for code with loops (… while …).
Do not support “while” loops: [CHR00], [FH02], [FH05], [AEF+05], [SBC+05], [MSF06].
Do not reason about termination: [SDE+08], [GS09], [RE11], [LHM+12], [PY13], [LMS+13].
Translation validation: [Nec00], [GZB05], …; needs information from the compiler.
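Decompose the proof at cut points a/a’, b/b’, c/c’.
Unprimed program (cut points a, b, c):
    movq 8(rsp), rdi
    # rdi != 0
    movq 8(rsp), rdi
    decq rdi
    movq rdi, 8(rsp)
    retq
Primed program (cut points a’, b’, c’):
    movq 8(rsp), r9
    # r9 != 0
    decq r9
    retq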
Given a simulation relation, proofs for loops reduce to proofs for loop-free fragments, which can be discharged with decision procedures.
Main challenge: infer a simulation relation, that is, infer synchronization points and infer invariants.
We use compilers as black boxes and mine relations from concrete executions.
Run some tests to get data: from executions, unit tests, random tests, etc.
Figure: a loop with body B followed by retq, and a loop with body B’ followed by retq. Labels in the figure: 2n, n, B;B, n.
Attempt to detect synchronization points: compare the number of times program points are executed, and check that values align. The two loops from the previous example are annotated with execution counts (figure labels: n 1n n+1 n).
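The counting idea can be sketched as follows. This is a minimal illustration, not the authors' tool: the two Python functions are hand-written models of the two example loops, the labels a, b, c and a', b', c' are assumed to sit at entry, loop head, and exit, and `candidate_sync_points` is a hypothetical helper that pairs program points whose execution counts agree on every test.

```python
from collections import Counter

def visits_unprimed(n):
    """Hand-written model of the first loop; counts visits per program point."""
    visits = Counter()
    mem = n                      # stands in for 8(rsp)
    visits["a"] += 1             # entry
    rdi = mem                    # movq 8(rsp), rdi
    while True:
        visits["b"] += 1         # loop head
        if rdi == 0:
            break
        rdi = mem                # movq 8(rsp), rdi
        rdi -= 1                 # decq rdi
        mem = rdi                # movq rdi, 8(rsp)
    visits["c"] += 1             # exit (retq)
    return visits

def visits_primed(n):
    """Hand-written model of the second loop."""
    visits = Counter()
    visits["a'"] += 1            # entry
    r9 = n                       # movq 8(rsp), r9
    while True:
        visits["b'"] += 1        # loop head
        if r9 == 0:
            break
        r9 -= 1                  # decq r9
    visits["c'"] += 1            # exit (retq)
    return visits

def candidate_sync_points(tests):
    """Pair points whose visit counts are identical on every test input."""
    runs1 = [visits_unprimed(n) for n in tests]
    runs2 = [visits_primed(n) for n in tests]
    points1 = sorted(set().union(*runs1))
    points2 = sorted(set().union(*runs2))
    sig1 = {p: tuple(r[p] for r in runs1) for p in points1}
    sig2 = {q: tuple(r[q] for r in runs2) for q in points2}
    return [(p, q) for p in points1 for q in points2 if sig1[p] == sig2[q]]

if __name__ == "__main__":
    for pair in candidate_sync_points([1, 2, 5]):
        print(pair)
```

Counts alone cannot tell entry from exit here, since both are visited once per run; this is where the second criterion on the slide, that values align, would be needed to prune the candidates.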
Invariants are restricted to equalities. Infer invariants from observed data values. For the unprimed loop, the observed columns are 8(rsp) and rdi.
For the primed loop, the column r9’ is added, so the observations range over 8(rsp), rdi, and r9’.
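A later slide names a nullspace computation for this step. The sketch below shows why a nullspace yields equalities: each observation becomes a row (1, 8(rsp), rdi, r9’), and every vector in the nullspace of that matrix is a linear equality satisfied by all observations. The data rows are hypothetical values made up for illustration, and the pure-Python rational elimination is a stand-in for an integer matrix library, not the authors' implementation.

```python
from fractions import Fraction

def nullspace(rows):
    """Return a basis of {v : A v = 0} using exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0])
    pivots = []                  # pivot column of each reduced row, in order
    r = 0
    for c in range(ncols):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue             # no pivot here: column c is free
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in range(ncols):
        if free in pivots:
            continue
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -m[row_idx][free]
        basis.append(v)
    return basis

# Hypothetical observations at the loop head; columns: 1, 8(rsp), rdi, r9'
observations = [
    [1, 3, 3, 3],
    [1, 2, 2, 2],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
]

if __name__ == "__main__":
    names = ["1", "8(rsp)", "rdi", "r9'"]
    for v in nullspace(observations):
        terms = [f"({coef})*{name}" for coef, name in zip(v, names) if coef != 0]
        print(" + ".join(terms), "= 0")
```

On this data the basis encodes rdi = 8(rsp) and r9’ = 8(rsp). These are only candidate invariants consistent with the tests; they still have to be checked.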
Check the inferred relation on the two loops at cut points a/a’, b/b’, c/c’: the executions are synchronized and the invariants are maintained. States are equal at a/a’; live outs are equal at c/c’.
Checking that the executions are synchronized and the invariants are maintained produces queries in quantifier-free bitvector arithmetic, for which SMT solvers are complete.
Counter-examples from failed checks are incorporated into the relations.
The method is sound but not complete: if checking succeeds then the programs are equivalent, but inference can fail to find a sound simulation relation.
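The shape of one such check can be sketched without a solver. The code below is an illustration under stated assumptions: it models one loop iteration of each example program on 4-bit values and enumerates all states exhaustively, which stands in for the bitvector SMT query; the function names and the two candidate invariants are hypothetical.

```python
BITS = 4                         # small width so exhaustive search is cheap
MASK = (1 << BITS) - 1

def step_unprimed(mem, rdi):
    """One iteration of the first loop: reload, decrement, store back."""
    rdi = mem
    rdi = (rdi - 1) & MASK
    mem = rdi
    return mem, rdi

def step_primed(r9):
    """One iteration of the second loop: decrement the register."""
    return (r9 - 1) & MASK

def strong_invariant(mem, rdi, r9):
    return mem == rdi and rdi == r9

def weak_invariant(mem, rdi, r9):
    return rdi == r9             # says nothing about memory

def find_counterexample(invariant):
    """Search for a state where the invariant holds at the loop head but
    the loops take different branches or the invariant breaks after a step."""
    for mem in range(1 << BITS):
        for rdi in range(1 << BITS):
            for r9 in range(1 << BITS):
                if not invariant(mem, rdi, r9):
                    continue
                if (rdi != 0) != (r9 != 0):
                    return (mem, rdi, r9)      # not synchronized
                if rdi != 0:
                    mem2, rdi2 = step_unprimed(mem, rdi)
                    r92 = step_primed(r9)
                    if not invariant(mem2, rdi2, r92):
                        return (mem, rdi, r9)  # invariant not maintained
    return None

if __name__ == "__main__":
    print("strong:", find_counterexample(strong_invariant))
    print("weak:  ", find_counterexample(weak_invariant))
```

The weaker candidate fails because the first loop reloads rdi from memory; a state returned this way is the kind of counter-example that can be fed back into inference as extra data.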
Insufficient data to infer a sound relation.
Expressiveness of invariants: inequalities, quantifiers, etc.
Expressiveness of the SMT solver: floating point, multiply, divide, etc.
Run tests and generate data.
Nullspace computation: libIML, an integer matrix library.
SMT solver: Z3.
Compute kernel inside OpenSSL.
Validating CompCert against gcc.
Stochastic optimization for loops.
Multiplication kernel with extensive performance tests.
Run the kernel ~15 million times; choose 16 random tests for inference.
Compile with gcc -O0 and gcc -O3.
Successfully prove equivalence.
Program   STOKE vs gcc -O0   STOKE vs gcc -O3
Bansal    1.58X              1.04X
SAXPY     9.22X              1.48X
Prove equivalence of loops in two stages: infer a simulation relation, then check the inferred relation using SMT solvers.
Use runtime data for inference.
No change required to the compilers.
Better verifiers lead to better optimizers.
M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3):35–45, 2007.
T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover polynomial and array invariants. ICSE 2012.
P. Garg, C. Löding, P. Madhusudan, and D. Neider. Learning Universally Quantified Invariants of Linear Data Structures. CAV 2013.
R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, and A. V. Nori. A Data Driven Approach for Algebraic Loop Invariants. ESOP 2013.
R. Sharma, S. Gupta, B. Hariharan, A. Aiken, and A. V. Nori. Verification as Learning Geometric Concepts. SAS 2013.
A. V. Nori and R. Sharma. Termination proofs from tests. ESEC/SIGSOFT FSE 2013.