Revisiting Generalizations
Ken McMillan, Microsoft Research
Aws Albarghouthi, University of Toronto

Generalization

Interpolants are generalizations: we use them as a way of forming conjectures and lemmas. Many proof search methods use interpolants as generalizations, among them SAT/SMT solvers, CEGAR and lazy abstraction, and interpolant-based invariant generation and IC3. There is widespread use of the idea that interpolants are generalizations, and that generalizations help to form proofs.

In this talk

We will consider measures of the quality of these generalizations. I will argue that the evidence for these generalizations is often weak, which motivates a retrospective approach: revisiting prior generalizations in light of new evidence. This suggests new approaches to interpolation and to the proof search methods that use it. We'll consider how these ideas may apply in SMT and IC3.

Criteria for generalization

A generalization is an inference that in some way covers a particular case; an example is a learned clause in a SAT solver. We require two properties of a generalization:
- Correctness: it must be true.
- Utility: it must make our proof task easier. A useful inference is one that occurs in a simple proof.
Let us consider what evidence we might produce for the correctness and utility of a given inference.

What evidence can we provide?

Evidence for correctness:
- a proof (best)
- a bounded proof (pretty good)
- truth in a few cases (weak)
Evidence for utility:
- useful for one truth assignment
- useful for one program path

Interpolants as generalizations

Consider a bounded model checking formula, with an interpolant taken at a cut in the unrolling. Evidence for the interpolant as a conjectured invariant:
- Correctness: it is true after three steps.
- Utility: it implies the property after three steps.
Note the duality: in invariant generation, correctness and utility are time-reversal duals.

CDCL SAT solvers

A learned clause is a generalization (an interpolant).
- Evidence for correctness: a proof by resolution (strong!)
- Evidence for utility: it simplifies the proof of the current assignment (weak!)
In fact, CDCL solvers produce many clauses of low utility that are later deleted. CDCL does have a mechanism for revisiting prior generalizations in the light of new evidence: this is called "non-chronological backtracking".
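The correctness evidence for a learned clause is a resolution derivation, which is easy to sketch. Below is a minimal Python illustration (the clauses are hypothetical, chosen only to show the step): resolving the conflicting clause with the antecedent of one of its literals yields a new clause that is implied by the input formula.

```python
# A clause is a frozenset of literals; the integer 3 denotes x3, -3 denotes NOT x3.

def resolve(c1, c2, var):
    """Resolve c1 (containing var) with c2 (containing -var).
    The resolvent is implied by the conjunction of c1 and c2,
    which is why a learned clause is correct by construction."""
    assert var in c1 and -var in c2
    return (c1 - {var}) | (c2 - {-var})

# Hypothetical conflict-analysis step:
# the conflict clause is (-x1 or -x2); the antecedent of x2 is (-x3 or x2).
conflict = frozenset({-1, -2})
antecedent = frozenset({-3, 2})
learned = resolve(antecedent, conflict, 2)   # the clause (-x1 or -x3)
```

Iterating this step along the antecedents of the conflict is exactly how CDCL derives its learned clause.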

Retrospection in CDCL

CDCL can replace an old generalization with a new one that covers more cases; that is, the utility evidence of the new clause is better. The slide illustrated this on the decision stack, at the point where a conflict is found.

Retrospection in CEGAR

CEGAR considers one counterexample path at a time. The example path shown has two possible refutations of equal utility.

Retrospection and Lazy SMT

Diamond example with lazy SMT

Theory lemmas correspond to program paths: ... (16 lemmas, exponential in the number of diamonds). The lemmas have low utility because each covers only one case, and the lazy SMT framework does not allow higher-utility inferences.

Diamond example (cont.)

We can produce higher-utility inferences by structurally decomposing the problem. Each such inference covers many paths, and the proof is linear in the number of diamonds.

Compositional SMT

We use the SMT solver as a black box. With each new sample, we reconsider the interpolant to maximize utility.

Example in linear rational arithmetic

Compositional approach

After the refinement steps, the interpolant covers all the disjuncts. Notice that we reconsidered our first interpolant choice in light of further evidence.

Comparison to lazy SMT

Consider the interpolant obtained from a lazy SMT solver's proof: each half-space corresponds to a theory lemma. The theory lemmas have low utility: four lemmas cover six cases.

Why is the simpler proof better?

A simple fact that covers many cases may indicate an emerging pattern, whereas greater complexity allows overfitting. This is especially important in invariant generation.

Finding simple interpolants

We break this problem into two parts:
1. Search for large subsets of the samples that can be separated by linear half-spaces.
2. Synthesize an interpolant as a Boolean combination of these separators.
The first part can be accomplished by well-established methods, using an LP solver and Farkas' lemma. The Boolean function synthesis problem is also well studied, though we may wish to use more lightweight methods.
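As a stand-in for step 1 (the talk uses an LP solver and Farkas' lemma; this sketch substitutes the simpler perceptron rule, which also finds a separating half-space whenever one exists), with hypothetical sample points:

```python
def find_separator(pos, neg, max_iters=10000):
    """Perceptron rule: find (w, b) with w.x + b > 0 on pos and < 0 on neg.
    Converges whenever the two point sets are linearly separable."""
    dim = len(pos[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(max_iters):
        updated = False
        for x, label in [(p, 1) for p in pos] + [(n, -1) for n in neg]:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if label * score <= 0:          # misclassified: nudge the hyperplane
                w = [wi + label * xi for wi, xi in zip(w, x)]
                b += label
                updated = True
        if not updated:
            return w, b                     # every sample strictly classified
    raise RuntimeError("no separator found; the sets may not be separable")

# Hypothetical 2-D samples: points below and above the line y = x.
w, b = find_separator(pos=[(2.0, 0.0), (3.0, 1.0)],
                      neg=[(0.0, 2.0), (1.0, 3.0)])
```

The returned half-space w.x + b > 0 is one candidate atom for the Boolean combination built in step 2.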

Farkas' lemma and linear separators

Farkas' lemma says that an inconsistent system of rational linear constraints can be refuted by a nonnegative weighted summation of the constraints (the slide showed the constraints and their refuting combination).
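Checking a Farkas refutation is simple arithmetic: with constraints a_i.x <= b_i and nonnegative multipliers, the weighted sum must cancel every variable while leaving a negative bound. A minimal checker, applied to a hypothetical system:

```python
def check_farkas(constraints, lambdas):
    """Each constraint is (coeffs, bound), meaning coeffs . x <= bound.
    A valid certificate has nonnegative multipliers whose weighted sum
    cancels every variable yet leaves a negative bound: 0 <= (negative),
    a contradiction, so the system is unsatisfiable."""
    if any(l < 0 for l in lambdas):
        return False
    dim = len(constraints[0][0])
    combined = [sum(l * coeffs[j] for l, (coeffs, _) in zip(lambdas, constraints))
                for j in range(dim)]
    total = sum(l * bound for l, (_, bound) in zip(lambdas, constraints))
    return all(c == 0 for c in combined) and total < 0

# Hypothetical system: x >= 1, y >= 1, x + y <= 1,
# rewritten as -x <= -1, -y <= -1, x + y <= 1.
system = [([-1, 0], -1), ([0, -1], -1), ([1, 1], 1)]
assert check_farkas(system, [1, 1, 1])        # summing yields 0 <= -1
```

The multipliers themselves are what an LP solver produces when asked to refute the system.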

Finding separating half-spaces

Half-spaces to interpolants

Each region is a cube over the half-spaces, labeled "must be true", "must be false", or "don't care". In practice, we don't have to synthesize an optimal combination.

Sequential verification

We can extend our notions of evidence and retrospection to sequential verification, where a "case" may be some sequence of program steps. Consider a simple sequential program:

    x = y = 0;
    while (*)      { x++; y++; }
    while (x != 0) { x--; y--; }
    assert (y <= 0);

We wish to discover an inductive invariant (given on the slide).

Choose interpolants at each step, in the hope of obtaining an inductive invariant. Execute the loops twice; one candidate annotation for the unrolled path is:

    x = y = 0;             {y = 0}
    x++; y++;              {y = 1}
    x++; y++;              {y = 2}
    [x != 0]; x--; y--;    {y = 1}
    [x != 0]; x--; y--;    {y = 0}
    [x == 0]; [y > 0]      {False}

These predicates have low utility, since each covers just one case. As a result, we "overfit" and do not discover the emerging pattern. The alternative interpolants shown on the slide cover all the cases with just one predicate; in fact, they are inductive.
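The emerging pattern here can be captured by a single predicate: y <= x is one inductive invariant for this program (our choice for illustration; the slide's formula is not preserved in the transcript). A brute-force check of its inductiveness over a bounded range of states:

```python
import itertools

# Program from the slide:
#   x = y = 0;
#   while (*)      { x++; y++; }
#   while (x != 0) { x--; y--; }
#   assert (y <= 0);
# Candidate invariant (our illustration): y <= x.

def inv(x, y):
    return y <= x

def check_inductive(bound=10):
    """Bounded check (not a proof): initiation, consecution for both loop
    bodies, and the assertion at loop exit, over a finite range of states."""
    assert inv(0, 0)                          # initiation
    for x, y in itertools.product(range(-bound, bound + 1), repeat=2):
        if inv(x, y):
            assert inv(x + 1, y + 1)          # first loop preserves it
            if x != 0:
                assert inv(x - 1, y - 1)      # second loop preserves it
            if x == 0:
                assert y <= 0                 # invariant implies the assertion
    return True

assert check_inductive()
```

The same predicate annotates every step of the unrolling, which is exactly the "one predicate covers all cases" situation the slide describes.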

Sequential interpolation strategy

Step 0: low evidence. Step 1: better evidence.

Are simple interpolants enough?

Occam's razor says simplicity prevents overfitting. This is not the whole story, however: consider different separators of similar complexity. Simplicity of the interpolant may not be sufficient for a good generalization.

Proof simplicity criterion

Empirically, the strange separators seem to be avoided by minimizing proof complexity: maximize the number of zero coefficients in the Farkas proofs, which can be done in a greedy way. Note the difference between applying Occam's razor in the synthetic and the analytic case: here we are generalizing not from data but from an argument, and we want the argument to be as simple as possible.

Value of retrospection

Benchmarks: 30 small programs over integers, with tricky inductive invariants involving linear constraints (some disjunctive, most conjunctive). The slide tabulated the percentage solved by each tool: Compositional SMT, CPAchecker, UFO, InvGen with AI, and InvGen without AI (the numbers are not preserved in the transcript).

Better evidence for generalizations better fits the observed cases and results in better convergence. We must still trade off cost vs. utility in generalizing: avoid excessive search while maintaining convergence.

Value of retrospection (cont.)

The exponential growth in theory lemmas is observed in practice. Computing one interpolant compositionally gives roughly a power-1/2 (square root) speedup.

Incremental inductive methods (IC3)

Compute a sequence interpolant by a local proof. The interpolant successively weakens, with inductive generalization, until the property is proved valid.

Generalization in IC3

The slide illustrated inductive generalization: starting from a clause that blocks a bad state, we generalize it while keeping it inductive relative to the current frame.
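A toy sketch of inductive generalization (real IC3 discharges each check with a SAT query relative to a frame; here, for brevity, the frame is simply all states of a tiny explicit-state system, and all names are illustrative): greedily drop literals from a blocking clause while it remains inductive, so the surviving clause excludes more bad states.

```python
from itertools import product

def sat(clause, state):
    """A clause is a set of (var, polarity) literals; true in a state
    if at least one literal holds."""
    return any(state[v] == p for v, p in clause)

def inductive(clause, init, trans, nvars):
    """Every initial state satisfies the clause, and every transition from
    a clause-satisfying state leads to a clause-satisfying state."""
    if not all(sat(clause, s) for s in init):
        return False
    return all(sat(clause, t)
               for s in product((0, 1), repeat=nvars) if sat(clause, s)
               for t in trans.get(s, []))

def generalize(clause, init, trans, nvars):
    """Greedily drop literals while the clause stays inductive: the smaller
    clause excludes strictly more states, i.e. covers more bad states."""
    for lit in list(clause):
        smaller = clause - {lit}
        if smaller and inductive(smaller, init, trans, nvars):
            clause = smaller
    return clause

# Toy system over variables (x0, x1): a two-state toggle that never sets x1.
init = [(0, 0)]
trans = {(0, 0): [(1, 0)], (1, 0): [(0, 0)]}
blocking = {(0, 0), (1, 0)}       # the clause (not x0 or not x1), blocking bad state (1, 1)
g = generalize(blocking, init, trans, nvars=2)
# g == {(1, 0)}: the one-literal clause (not x1), excluding every state with x1 = 1
```

Dropping the literal (not x0) keeps the clause inductive, so the generalized clause covers both states with x1 = 1 instead of just the one bad state.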

Retrospection in IC3?

Find a relatively inductive clause that covers as many bad states as possible. The search might be inefficient, since each test is a SAT problem. A possible solution is to relax the correctness evidence: sample on both the A and B sides, so that correctness evidence is invariance in a sample of behaviors rather than in all behaviors up to depth k.

Generalization problem

Find a clause to separate the A samples (reachable states) from the B samples (bad states): the clause must be true in all A samples and false in many B samples. This is another two-level logic optimization problem, which suggests that logic optimization techniques could be applied to computing interpolants in the incremental inductive style.
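A simplified sketch of this covering formulation (not the full binate covering algorithm; the samples and names are illustrative): restrict attention to literals that are false on every B sample, then greedily pick literals until every A sample satisfies some chosen literal.

```python
def holds(lit, state):
    v, p = lit
    return state[v] == p

def separating_clause(a_samples, b_samples, nvars):
    """Restrict to literals false on every B sample, then greedily cover
    the A samples: the resulting disjunction is true on every A sample
    and false on every B sample (when such a clause exists)."""
    candidates = [(v, p) for v in range(nvars) for p in (0, 1)
                  if not any(holds((v, p), b) for b in b_samples)]
    clause, uncovered = [], list(a_samples)
    while uncovered:
        best = max(candidates, key=lambda l: sum(holds(l, a) for a in uncovered))
        if not any(holds(best, a) for a in uncovered):
            raise ValueError("no clause of this restricted form separates the samples")
        clause.append(best)
        uncovered = [a for a in uncovered if not holds(best, a)]
    return clause

# Illustrative samples over three boolean state variables.
a_samples = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]   # reachable states
b_samples = [(1, 1, 1), (0, 1, 1)]              # bad states
clause = separating_clause(a_samples, b_samples, nvars=3)
# clause == [(2, 0)]: the single literal (not x2) separates the samples
```

The general problem allows literals that are false on only some B samples, which is where the binate covering structure comes from.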

Questions to ask

For a given inference technique, we can ask:
- Where does generalization occur?
- Is there good evidence for generalizations?
- Can retrospection be applied? What form do separators take? How can I solve for separators, and at what cost? (Theories: arrays, nonlinear arithmetic, ...)
- Can I improve the cost, shifting the tradeoff of utility and correctness? Can proof complexity be minimized?
- Ultimately, is the cost of better generalization justified?

Conclusion

Many proof search methods rely on interpolants as generalizations, which are useful to the extent that they make the proof simpler. Evidence of utility in existing methods is weak: it usually amounts to utility in one case, and can lead to many useless inferences. Retrospection revisits inferences in the light of new evidence (for example, non-chronological backtracking); it allows a more global view of the problem and reduces commitment to inferences based on little evidence.

Conclusion (cont.)

Compositional SMT is a modular approach to interpolation: a constraint-based search method that finds simple proofs covering many cases, improves convergence of invariant discovery, and exposes the emerging pattern in loop unfoldings. Think about methods in terms of: the form of generalization; the quality of evidence for correctness and utility; and the cost vs. benefit of the evidence provided.

Cost of retrospection

An early retrospective approach is due to Anubhav Gupta: a finite-state localization abstraction method that finds an optimal localization covering all abstract counterexamples. In practice, "quick and dirty" is often better than optimal, and compositional SMT is usually slower than direct SMT. However, if bad generalizations imply divergence, then the cost of retrospection is justified. We need to understand when revisiting generalizations is justified.

Inference problem

Find a clause to separate the A and B samples: the clause must have a literal from every A sample, and the literals must be false in many B samples. This is a binate covering problem, well studied in logic synthesis (related to Bloem et al.). We can still use relative inductive strengthening, for better utility and evidence.