Pentagons: A Weakly Relational Abstract Domain for the Efficient Validation of Array Accesses Francesco Logozzo, Manuel Fahndrich Microsoft Research, Redmond.



The Background
- Efficient static checking of .NET assemblies
- Foxtrot: a language-agnostic contract language
- Clousot: a language-agnostic static analyzer
  - Based on abstract interpretation
  - Checks contracts, array bounds, memory accesses, nullness, …

Demo
[Screenshot not preserved. Analyzer annotations shown: "Wrong?", "Ok: not null"]

Demo
[Screenshot not preserved. Analyzer annotations shown: "Ok: index in bounds", "Ok: not null", "Ok: index in bounds"]

The paper in a nutshell
- Program executions: is 0 ≤ y < x ?
- Testing: try some points
  - What about the others?
- Model checking: try all the points
  - What if we have ∞ points?
- Abstract interpretation: approximation

  Domain      Proves 0 ≤ y < x ?   Cost
  Intervals   No                   O(n)
  Octagons    Yes                  Θ(n^3)
  Polyhedra   Yes                  O(2^n)
  Pentagons   Yes                  O(n)

Pentagons?
- A lightweight numerical domain
- Keeps relations of the form a ≤ x ≤ b && x < y
  - a, b numerical constants
  - x, y variables
- Enough to validate > 83% of the array accesses of mscorlib.dll
  - mscorlib.dll is the main library in .NET
- Fast: analyzes it in a couple of minutes
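To make the shape a ≤ x ≤ b && x < y concrete, here is a minimal sketch of how such facts could discharge an array-access check. The representation (a dict of intervals plus a set of pairs meaning "left < right") and the name access_in_bounds are assumptions made for illustration; they are not taken from Clousot.

```python
import math

def access_in_bounds(intv, lt, index, length):
    """Return True if the facts prove 0 <= index < length.

    intv maps a variable name to (low, high); lt is a set of pairs (x, y)
    meaning x < y. Both encodings are hypothetical, chosen for this sketch.
    """
    low, high = intv[index]
    lower_ok = low >= 0
    # The upper bound may come from a symbolic fact or from the intervals.
    upper_ok = (index, length) in lt or high < intv[length][0]
    return lower_ok and upper_ok

facts_intv = {"i": (0, math.inf), "len": (0, math.inf)}
facts_lt = {("i", "len")}
print(access_in_bounds(facts_intv, facts_lt, "i", "len"))
```

Intervals alone cannot prove this access, since neither bound of i or len is finite; the symbolic fact i < len is what closes the check.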

Abstract domain
- An abstract domain is a complete lattice endowed with
  - A widening operator
    - To ensure the convergence of the analysis
    - Ex. The increasing chain [0,1] ⊑ [0,2] ⊑ [0,3] ⊑ [0,4] ⊑ ... is extrapolated by widening to [0, +∞]
  - Transfer functions
    - To capture the abstract semantics of statements
    - Ex. ⟦x := y + 3⟧([y → [1,2]]) = [y → [1,2], x → [4,5]]
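The two examples on this slide can be replayed with a small sketch. The function names widen and assign_add_const are hypothetical, and intervals are encoded as (low, high) tuples with math.inf for unbounded ends; this is one standard way to write interval widening, not necessarily the one used in Clousot.

```python
import math

def widen(old, new):
    """Interval widening: keep a bound if it is stable, else drop it to infinity."""
    low = old[0] if old[0] <= new[0] else -math.inf
    high = old[1] if old[1] >= new[1] else math.inf
    return (low, high)

def assign_add_const(env, target, source, k):
    """Abstract transfer function for `target := source + k` on intervals."""
    low, high = env[source]
    out = dict(env)  # leave the input state untouched
    out[target] = (low + k, high + k)
    return out

# The chain [0,1], [0,2], [0,3], ... has an unstable upper bound.
print(widen((0, 1), (0, 2)))                          # (0, inf)
# x := y + 3 applied to [y -> [1,2]].
print(assign_add_const({"y": (1, 2)}, "x", "y", 3))   # {'y': (1, 2), 'x': (4, 5)}
```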

Interval domain
- Elements: { [a, b] | a ∈ Z ∪ { -∞ }, b ∈ Z ∪ { +∞ } }
- Order: [a,b] ⊑ [c,d] iff c ≤ a and b ≤ d
- Join: [a,b] ⊔ [c,d] = [min(a,c), max(b,d)]
- Meet: [a,b] ⊓ [c,d] = [max(a,c), min(b,d)]
- Widening: keep the stable bounds
- Transfer functions: ordinary interval arithmetic
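The order, join and meet above translate almost literally into code. In this sketch intervals are (low, high) tuples, and returning None for an empty meet is an assumption of the example (the slide does not say how the empty interval is represented).

```python
import math

def leq(p, q):
    """[a,b] is below [c,d] iff c <= a and b <= d."""
    (a, b), (c, d) = p, q
    return c <= a and b <= d

def join(p, q):
    """Smallest interval containing both arguments."""
    (a, b), (c, d) = p, q
    return (min(a, c), max(b, d))

def meet(p, q):
    """Intersection of the arguments; None stands for the empty interval
    (an encoding chosen for this sketch)."""
    (a, b), (c, d) = p, q
    low, high = max(a, c), min(b, d)
    return (low, high) if low <= high else None

print(join((0, 2), (5, 7)))        # (0, 7)
print(meet((0, 5), (3, math.inf))) # (3, 5)
print(leq((1, 2), (0, 3)))         # True
```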

LT Domain
- Elements: ℘({ X < Y | X and Y are variables })
  - Efficient representation with hash tables
- Order: A ⊑ B iff B ⊆ A
- Join: A ⊔ B = A ∩ B
- Meet: A ⊓ B = A ∪ B
- Widening: just the join, as the lattice has finite height
- Transfer functions: ⟦y := x + 1⟧(A) = (A - {y}) ∪ { x < y }, where A - {y} drops the constraints that mention y
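A set-based sketch of the LT domain follows. Constraints are encoded as pairs (x, y) meaning x < y; the function names are hypothetical, and the transfer function assumes that x and y are distinct variables and that x + 1 does not overflow, neither of which is spelled out on the slide.

```python
def lt_leq(a, b):
    """A is below B iff B is a subset of A (more constraints = more precise)."""
    return b <= a

def lt_join(a, b):
    """Keep only the constraints that hold on both branches."""
    return a & b

def lt_meet(a, b):
    """Combine the constraints of both arguments."""
    return a | b

def assign_succ(a, y, x):
    """Transfer function for `y := x + 1`: forget y, then record x < y."""
    if x == y:
        raise ValueError("this sketch assumes x and y are distinct variables")
    kept = {(l, r) for (l, r) in a if y not in (l, r)}
    return frozenset(kept | {(x, y)})

before = frozenset({("y", "z"), ("a", "b")})
print(sorted(assign_succ(before, "y", "x")))  # [('a', 'b'), ('x', 'y')]
```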

Pentagons
- Reduced Cartesian product of Intervals and LT
- Reduced?
  - Not just pairs: information flows from one element to the other
  - Ex. (x → [1, 4], y → [3, 3], { x < y }) reduces to (x → [1, 2], y → [3, 3], { x < y })
  - May introduce a cubic slowdown
- Reduction is applied
  - At precise points of the analysis
  - Lazily at join points
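The reduction example can be reproduced with a short sketch that pushes each x < y constraint into the interval bounds. The name reduce_intervals is hypothetical, and the code makes a single pass over the constraints, which is enough for the example but is a simplification rather than the reduction defined in the paper.

```python
def reduce_intervals(intv, lt):
    """Refine interval bounds using the strict inequalities in lt.

    For each x < y: the upper bound of x is at most high(y) - 1, and the
    lower bound of y is at least low(x) + 1. Single pass (simplification).
    """
    out = dict(intv)
    for x, y in lt:
        (x_low, x_high), (y_low, y_high) = out[x], out[y]
        out[x] = (x_low, min(x_high, y_high - 1))
        out[y] = (max(y_low, x_low + 1), y_high)
    return out

pentagon_intv = {"x": (1, 4), "y": (3, 3)}
pentagon_lt = {("x", "y")}
print(reduce_intervals(pentagon_intv, pentagon_lt))  # {'x': (1, 2), 'y': (3, 3)}
```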

The (Naive) Join of Pentagons
- Left_P = (left_intv, left_lt), Right_P = (right_intv, right_lt)
  1. Close Left_P and Right_P
  2. Apply the join pairwise
- Closure(intv, lt) applies this rule until saturation:
  if x → [a,b] and y → [c,d] are in intv and b < c, then lt = lt ∪ { x < y }
- Problem: it introduces a quadratic slowdown
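The two steps can be sketched as follows. All names are hypothetical, pentagons are encoded as (intervals dict, frozenset of pairs), and both sides are assumed to track the same variables. Because the closure rule only reads the intervals, one sweep over all ordered pairs of variables saturates it; that sweep is also where the quadratic cost shows up.

```python
from itertools import permutations

def close(intv, lt):
    """Add x < y for every pair whose intervals are strictly separated."""
    derived = {(x, y) for x, y in permutations(intv, 2)
               if intv[x][1] < intv[y][0]}
    return intv, frozenset(lt) | derived

def join_intv(a, b):
    """Pointwise interval join over the variables known on both sides."""
    return {v: (min(a[v][0], b[v][0]), max(a[v][1], b[v][1]))
            for v in a.keys() & b.keys()}

def naive_join(left, right):
    """Close both pentagons, then join intervals and intersect constraints."""
    (l_intv, l_lt), (r_intv, r_lt) = close(*left), close(*right)
    return join_intv(l_intv, r_intv), l_lt & r_lt

left = ({"x": (0, 0), "y": (1, 5)}, frozenset())
right = ({"x": (0, 10), "y": (0, 10)}, frozenset({("x", "y")}))
intv, lt = naive_join(left, right)
print(intv == {"x": (0, 10), "y": (0, 10)}, sorted(lt))  # True [('x', 'y')]
```

Without the closure step, the left branch would carry no symbolic constraint and x < y would be lost at the join, even though it holds on both branches.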

The smarter join on Pentagons
- Idea:
  1. Apply the pairwise join
  2. If a symbolic constraint x < y is dropped, check if the other branch implies it
  3. If it does, then keep the constraint
- Formal details in the paper
- Results:
  - For mscorlib we moved from > 1h to a couple of minutes
  - No access is lost!
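One possible reading of the three steps is sketched below; the formal definition is in the paper, so treat this as an illustration only. The names are hypothetical, pentagons are (intervals dict, frozenset of pairs), and "the other branch implies it" is interpreted here as "the other branch's intervals separate x from y".

```python
def join_intv(a, b):
    """Pointwise interval join over the variables known on both sides."""
    return {v: (min(a[v][0], b[v][0]), max(a[v][1], b[v][1]))
            for v in a.keys() & b.keys()}

def implied_by_intervals(intv, x, y):
    """True if the intervals alone prove x < y."""
    return x in intv and y in intv and intv[x][1] < intv[y][0]

def smart_join(left, right):
    (l_intv, l_lt), (r_intv, r_lt) = left, right
    # Step 1: pairwise join (intervals joined, constraints intersected).
    intv = join_intv(l_intv, r_intv)
    lt = set(l_lt & r_lt)
    # Steps 2 and 3: recover a dropped constraint if the other side implies it.
    lt |= {(x, y) for (x, y) in l_lt - r_lt if implied_by_intervals(r_intv, x, y)}
    lt |= {(x, y) for (x, y) in r_lt - l_lt if implied_by_intervals(l_intv, x, y)}
    return intv, frozenset(lt)

left = ({"x": (0, 0), "y": (1, 5)}, frozenset())
right = ({"x": (0, 10), "y": (0, 10)}, frozenset({("x", "y")}))
intv, lt = smart_join(left, right)
print(sorted(lt))  # [('x', 'y')]
```

The check only visits constraints that were actually dropped, so no sweep over all pairs of variables is needed before the join.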

Experiment: Array bounds analysis
- Assemblies as shipped
  - No pre-processing
  - No pre-selection
- Intraprocedural analysis only
  - Contracts will improve the precision

Conclusions
- A lightweight abstract domain
  - Used for array bounds validation
  - Efficient and scalable
- Implemented in Clousot
- To be used
  - As a first pass to drop most of the proof obligations
  - In combination with other domains