Rahul Sharma (Stanford), Aditya V. Nori (MSR India), Alex Aiken (Stanford)
int i = 1, j = 0;
while (i <= 5) {
  j = j + i;
  i = i + 1;
}
[Figure: increasing precision]
D. Monniaux and J. Le Guen. Stratified static analysis based on variable dependencies. Electr. Notes Theor. Comput. Sci. 2012
A. V. Nori and S. K. Rajamani. An empirical study of optimizations in YOGI. ICSE (1) 2010
Increased precision is causing worse results
Programs have unbounded behaviors
Program analysis
  Analyze all behaviors
  Run for a finite time
In finite time, observe only finite behaviors
Need to generalize
Generalization is ubiquitous
  Abstract interpretation: widening
  CEGAR: interpolants
  Parameter tuning of tools
  Lots of folk knowledge, heuristics, …
“It’s all about generalization”
  Learn a function from observations
  Hope that the function generalizes
  Work on formalization of generalization
Model the generalization process
  Probably Approximately Correct (PAC) model
  Explain known observations by this model
  Use this model to obtain better tools
INTERPOLANTS + CLASSIFIERS
Rahul Sharma, Aditya V. Nori, Alex Aiken. Interpolants as Classifiers. CAV
[Figure: hypothesis class H; for any arbitrary labeling]
[Figure: three fits of Y against X: underfitting (precision is low), good fit, overfitting (precision is high)]
Generalization error is bounded by the sum of
  Bias: empirical error of the best available hypothesis
  Variance: O(VC-d)
[Figure: bias, variance, and generalization error plotted against the set of possible hypotheses as precision increases]
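The slide states the bound informally. For reference, one standard textbook form of a VC-style bound is sketched below; it is not necessarily the exact statement used in the talk. The symbols m (number of observations), d (VC dimension of the hypothesis class H), and δ (confidence parameter) are notation assumed for this illustration.

```latex
\text{With probability at least } 1-\delta, \text{ for all } h \in H:\quad
\operatorname{err}(h) \;\le\; \widehat{\operatorname{err}}(h)
  + O\!\left(\sqrt{\frac{d\,\log(m/d) + \log(1/\delta)}{m}}\right)
```

Read against the slide, the empirical error of the best hypothesis in H plays the role of the bias, and the second term, which grows with d, plays the role of the variance. Increasing precision enlarges H, which can lower the first term while raising the second.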
int i = 1, j = 0;
while (i <= 5) {
  j = j + i;
  i = i + 1;
}
What goes wrong with excess precision?
  Fit polyhedra to program behaviors
  Transfer functions, join, widening
  Too many polyhedra, make a wrong choice
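To make "too many polyhedra" concrete, here is a minimal sketch that runs the example loop and checks two different linear shapes against every state observed at the loop head. Both shapes are hypothetical candidates chosen for this illustration, not invariants taken from the talk or produced by any particular analyzer. Both fit all observed states, so the observations alone do not say which one to pick.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical candidate 1: a box (interval) around the observed states. */
static bool interval_candidate(int i, int j) {
    return 1 <= i && i <= 6 && 0 <= j && j <= 15;
}

/* Hypothetical candidate 2: a tighter polyhedron with two linear constraints. */
static bool polyhedron_candidate(int i, int j) {
    return j >= i - 1 && j <= 3 * i - 3;
}

/* Runs the example loop and counts the loop-head states at which
 * both candidates hold. The final values of i and j are written
 * through the output pointers. */
int count_states_fitting_both(int *final_i, int *final_j) {
    assert(final_i && final_j);
    int i = 1, j = 0;
    int fitting = 0;
    for (;;) {
        /* Loop head: this is where an invariant would have to hold. */
        if (interval_candidate(i, j) && polyhedron_candidate(i, j)) {
            fitting++;
        }
        if (!(i <= 5)) {
            break;
        }
        j = j + i;
        i = i + 1;
    }
    *final_i = i;
    *final_j = j;
    return fitting;
}
```

Calling `count_states_fitting_both` from a small driver shows that every loop-head state satisfies both candidates. A more precise domain admits even more shapes that agree on the same finite observations.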
J. Henry, D. Monniaux, and M. Moy. Pagai: A path sensitive static analyser. Electr. Notes Theor. Comput. Sci
A. V. Nori and S. K. Rajamani. An empirical study of optimizations in YOGI. ICSE (1) 2010
Parameter tuning of program analyses
  Overfitting? Generalization on new tasks?
P. Godefroid, A. V. Nori, S. K. Rajamani, and S. Tetali. Compositional may-must program analysis: unleashing the power of alternation. POPL
[Diagram: Benchmark Set (2490 verification tasks) -> Train]
How to set the test length in Yogi
[Diagram: Benchmark Set (2490 verification tasks) split into Training Set (1743, used to train) and Test Set (747, used to test)]
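The split on this slide (1743 of 2490 tasks for training, 747 for testing) is a 70/30 partition. A minimal sketch of such a split follows, assuming the tasks are identified by indices and assigned by a seeded shuffle. The function names, the seed, and the use of a shuffle are assumptions made for illustration; the slides do not say how the tasks were actually assigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic generator (a linear congruential generator),
 * used so that the split is reproducible for a given seed. */
static uint32_t next_random(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Fills indices[0..n) with a shuffled permutation of 0..n-1 and returns
 * the size of the training set. After the call, indices[0..train) are
 * the training tasks and indices[train..n) are the test tasks. */
size_t split_tasks(size_t *indices, size_t n, unsigned train_percent, uint32_t seed) {
    assert(train_percent <= 100);
    for (size_t k = 0; k < n; k++) {
        indices[k] = k;
    }
    uint32_t state = seed;
    /* Fisher-Yates shuffle. The high bits of the generator are used
     * because the low bits of an LCG are poorly distributed. The modulo
     * introduces a slight bias, which is acceptable for a sketch. */
    for (size_t k = n; k > 1; k--) {
        size_t r = (size_t)(next_random(&state) >> 16) % k;
        size_t tmp = indices[k - 1];
        indices[k - 1] = indices[r];
        indices[r] = tmp;
    }
    return n * train_percent / 100;
}
```

With `n = 2490` and `train_percent = 70` the function returns 1743, leaving 747 tasks for the test set, which matches the numbers on the slide. The parameter being tuned would then be chosen using only the first 1743 indices.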
On 2106 new verification tasks
  40% performance improvement!
  Yogi in production suffers from overfitting
Keep separate training and test sets
  Design of the tools governed by training set
  Test set as a check
SVCOMP: all benchmarks are public
  Test tools on some new benchmarks too
R. Jhala and K. L. McMillan. A practical and complete approach to predicate refinement. TACAS
  Suggests incrementally increasing precision
  Find a sweet spot where generalization error is low
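One way to read "incrementally increasing precision" is as a search that raises the precision level step by step and stops once the error on held-out tasks no longer improves. The sketch below assumes that such per-level error measurements are already available in an array. The function name and the stopping rule are hypothetical and are not taken from the cited paper.

```c
#include <stddef.h>

/* held_out_error[p] is the error measured at precision level p on tasks
 * that were not used for tuning (hypothetical input, levels >= 1).
 * Returns the last level reached before the held-out error stops improving. */
size_t first_sweet_spot(const double *held_out_error, size_t levels) {
    size_t best = 0;
    for (size_t p = 1; p < levels; p++) {
        if (held_out_error[p] >= held_out_error[best]) {
            break; /* more precision no longer helps on held-out tasks */
        }
        best = p;
    }
    return best;
}
```

This rule stops at the first local minimum, which matches the incremental flavor of the suggestion but can miss a better level further along. A full scan over all levels is the obvious alternative when every level is affordable to evaluate.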
No generalization -> no bias-variance tradeoff
  Certain classes of type inference
  Abstract interpretation without widening
  Loop-free and recursion-free programs
Verify a particular program (e.g., seL4)
  Overfit on the one important program
A model to understand generalization
  Bias-variance tradeoffs
These tradeoffs do occur in program analysis
Understand these tradeoffs for better tools