Data Structures and Algorithms for Efficient Shape Analysis by Roman Manevich Prepared under the supervision of Dr. Shmuel (Mooly) Sagiv.

Presentation transcript:


Motivation
TVLA is a powerful and general abstract interpretation system
Abstract interpretation in TVLA
  Operational semantics is expressed with first-order logic + TC formulae
  Program states are represented as sets of evolving first-order structures
Efficiency is an issue

Outline
Shape analysis quick intro
Compactly representing structures
Tuning abstraction to improve performance

What is Shape Analysis
Determines shape invariants for imperative programs
Can be used to verify a wide range of properties over different programming languages

reverse Example
    /* list.h */
    #include <stddef.h>   /* for NULL */
    typedef struct node {
        struct node *n;
        int data;
    } *List;

    /* print.c */
    #include "list.h"
    List reverse(List x) {
        List y, t;
        y = NULL;
        while (x != NULL) {
            t = y;
            y = x;
            x = x->n;
            y->n = t;
        }
        return y;
    }

reverse Example
[Figure: list shapes before and after reverse, drawn as chains of n-links reachable from x and from y]

Definition of a First-Order Logical Structure
$S = \langle U, \iota \rangle$
$U$: a set of individuals ("node set")
$\iota$: a mapping $p^{(r)} \mapsto (U^r \to \{0, 1\})$, the "interpretation" of $p$

Three-Valued Logic
1: True
0: False
1/2: Unknown
A join semi-lattice: $0 \sqcup 1 = 1/2$
Information order: $0 \sqsubseteq 1/2$ and $1 \sqsubseteq 1/2$
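The join and the information order above can be sketched in a few lines of C. This is only an illustration: the enum and function names are assumptions, and the conjunction and disjunction shown are the standard Kleene connectives rather than anything taken from the slides.

```c
#include <assert.h>

/* Kleene values ordered by truth: 0 < 1/2 < 1 (K_HALF stands for 1/2). */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Join in the information order: equal values are kept, 0 and 1 join to 1/2. */
Kleene kleene_join(Kleene a, Kleene b) {
    return a == b ? a : K_HALF;
}

/* Information order: a is below b iff b equals a or b is 1/2. */
int kleene_leq(Kleene a, Kleene b) {
    return a == b || b == K_HALF;
}

/* Standard Kleene connectives: minimum and maximum in the truth order. */
Kleene kleene_and(Kleene a, Kleene b) { return a < b ? a : b; }
Kleene kleene_or(Kleene a, Kleene b)  { return a > b ? a : b; }
```

For example, `kleene_join(K_FALSE, K_TRUE)` yields `K_HALF`, which is the slide's 0 ⊔ 1 = 1/2.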

Canonical Abstraction
Partition the individuals into equivalence classes based on the values of their unary predicates
Collapse other predicates via
$p^S(u'_1, \ldots, u'_k) = \bigsqcup \{\, p^B(u_1, \ldots, u_k) \mid f(u_1) = u'_1, \ldots, f(u_k) = u'_k \,\}$
At most $3^n$ abstract individuals, for $n$ unary predicates
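A minimal sketch of canonical abstraction follows, assuming a fixed-size array layout with two unary predicates and a single binary predicate n. The `Structure` type, the constants, and the function names are hypothetical and are not TVLA's actual data structures. Nodes with the same vector of unary values are collapsed, the binary predicate is joined over each class, and a class with more than one member becomes a summary node (sm = 1/2).

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 8
#define NUM_UNARY 2            /* illustrative: predicates x and r[n,x] */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

typedef struct {
    int    size;                            /* number of individuals */
    Kleene unary[NUM_UNARY][MAX_NODES];     /* unary predicate values */
    Kleene sm[MAX_NODES];                   /* summary-node predicate */
    Kleene n[MAX_NODES][MAX_NODES];         /* binary predicate n */
} Structure;

static Kleene join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

/* Canonical name: the vector of unary values, read as a base-3 number. */
static int canonical_name(const Structure *s, int u) {
    int name = 0;
    for (int p = 0; p < NUM_UNARY; p++) name = name * 3 + (int)s->unary[p][u];
    return name;
}

/* Collapse nodes with equal canonical names; f[u] is the abstract node of u. */
void canonical_abstraction(const Structure *in, Structure *out, int f[]) {
    int name_of[MAX_NODES];                 /* name of each abstract node */
    int seen[MAX_NODES][MAX_NODES];         /* has n[a1][a2] been set yet? */
    memset(out, 0, sizeof *out);
    memset(seen, 0, sizeof seen);
    for (int u = 0; u < in->size; u++) {
        int name = canonical_name(in, u);
        int a = 0;
        while (a < out->size && name_of[a] != name) a++;
        if (a == out->size) {               /* first node with this name */
            name_of[a] = name;
            out->size++;
            for (int p = 0; p < NUM_UNARY; p++) out->unary[p][a] = in->unary[p][u];
            out->sm[a] = in->sm[u];
        } else {
            out->sm[a] = K_HALF;            /* more than one member: summary node */
        }
        f[u] = a;
    }
    /* Join the binary predicate over all concrete tuples mapped to (a1, a2). */
    for (int u1 = 0; u1 < in->size; u1++)
        for (int u2 = 0; u2 < in->size; u2++) {
            int a1 = f[u1], a2 = f[u2];
            out->n[a1][a2] = seen[a1][a2] ? join(out->n[a1][a2], in->n[u1][u2])
                                          : in->n[u1][u2];
            seen[a1][a2] = 1;
        }
}
```

Applied to a four-cell list whose head is pointed to by x, this collapses the three tail cells into one summary node with n = 1/2 edges into and within it.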

Canonical Abstraction Example
[Figure: a list of nodes u0, u1, u2, u3 linked by n, with x on u0 and r[n,x] on every node, abstracted to u0 and a summary node u]

Compactly Representing First-Order Logical Structures
Space is a major bottleneck
Analysis explores many logical structures
Reduce space by sharing information across structures

Desired Properties
Sparse data structures
Share common sub-structures
  Inherited sharing
  Incidental sharing due to program invariants
But feasible time performance
Phase-sensitive data structures

Chapter Outline
Background
First-order structure representations
  Base representation (TVLA 0.91)
  BDD representation
Empirical evaluation
Conclusion

First-Order Logical Structures
Generalize shape graphs
  Arbitrary set of individuals
  Arbitrary set of predicates on individuals
Dynamically evolving
  Usually small changes
Properties are extracted by evaluating a first-order formula: $\exists v_1, v : x(v_1) \land n(v_1, v)$
Join operator requires isomorphism testing

First-Order Structure ADT
Structure : new() /* empty structure */
SetOfNodes : nodeSet(Structure)
Node : newNode(Structure)
removeNode(Structure, Node)
Kleene : eval(Structure, p(r), <u1, ..., ur>)
update(Structure, p(r), <u1, ..., ur>, Kleene)
Structure : copy(Structure)

print_all Example
    /* list.h */
    typedef struct node {
        struct node *n;
        int data;
    } *L;

    /* print.c */
    #include <stdio.h>
    #include "list.h"
    void print_all(L y) {
        L x;
        x = y;
        while (x != NULL) {
            /* assert(x != NULL) */
            printf("elem=%d", x->data);
            x = x->n;
        }
    }

print_all Example
Statement: x = y, with update formula x'(v) := y(v)
copy(S0) : S1
nodeSet(S0) : {u1, u}
eval(S0, y, u1) : 1
update(S1, x, u1, 1)
eval(S0, y, u) : 0
update(S1, x, u, 0)
[Figure: S0 with u1 (y=1), u (sm=½) and n=½; S1 is the same structure with x=1 added on u1]

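print_all Example
Statement: x = x->n
focus: $\exists v_1 : x(v_1) \land n(v_1, v)$
update: $x'(v) := \exists v_1 : x(v_1) \land n(v_1, v)$
Loop condition: while (x != NULL), precondition: $\exists v : x(v)$
[Figure: S1 with u1 (x=1, y=1), u (sm=½) and n=½, and the resulting structures S2.0 (u1 with y=1; u with sm=½; n=½), S2.1 (u1 with y=1; u with x=1; n=1, n=½) and S2.2 (u1 with y=1; u.1 with x=1; u.0 with sm=½; n=1, n=½)]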

Overview and Main Results
1. Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
2. Implementation techniques
3. Empirical evaluation
  Comparison of different representations
  Space is reduced by a factor of 4 to 10
  New representations scale better

Base Representation (Tal Lev-Ami SAS 2000)
Two-level map: Predicate → (Node Tuple → Kleene)
Sparse representation
Limited inherited sharing by "Copy-On-Write"

BDDs in a Nutshell (Bryant 86)
Ordered Binary Decision Diagrams
Data structure for Boolean functions
Functions are represented as (unique) DAGs
[Figure: decision diagram for a function f over the variables x1, x2, x3]

BDDs in a Nutshell (Bryant 86)
Ordered Binary Decision Diagrams
Data structure for Boolean functions
Functions are represented as (unique) DAGs
Also achieve sharing across functions
[Figure: three reduction steps over x1, x2, x3: duplicate terminals, duplicate nonterminals, redundant tests]

Encoding Structures Using Integers
Static encoding of
  Predicates
  Kleene values
Dynamic encoding of nodes: 0, 1, ..., n-1
Encode predicate p's values as $e_p(p).e_n(u_1).e_n(u_2).\,\cdots\,.e_n(u_n).e_k(\mathit{Kleene})$

BDD Representation of Integer Sets
Characteristic function of S = {1, 5}
1 = <001>, 5 = <101>
$\chi_S = (\lnot x_1 \land \lnot x_2 \land x_3) \lor (x_1 \land \lnot x_2 \land x_3)$
[Figure: BDD for $\chi_S$ over the variables x1, x2, x3]
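The idea of storing an integer set as the BDD of its characteristic function can be sketched as follows. This is a toy reduced ordered BDD with a linear-scan unique table, written only for illustration: the names `bdd_mk`, `bdd_from_set` and `bdd_contains`, the 3-bit width, and the table layout are all assumptions, and a real BDD package would use hashing and an apply operation instead of the exhaustive top-down construction used here.

```c
#include <assert.h>

#define MAX_BDD_NODES 256
#define BITS 3                 /* variables x1..x3; x1 is the most significant bit */

typedef struct { int var, lo, hi; } BddNode;

/* Ids 0 and 1 are the terminals; every other node lives in a unique table. */
static BddNode table[MAX_BDD_NODES] = { { BITS, 0, 0 }, { BITS, 1, 1 } };
static int table_size = 2;

/* Return the unique node (var, lo, hi), applying both reduction rules. */
int bdd_mk(int var, int lo, int hi) {
    if (lo == hi) return lo;                        /* redundant test */
    for (int i = 2; i < table_size; i++)            /* duplicate nonterminal */
        if (table[i].var == var && table[i].lo == lo && table[i].hi == hi)
            return i;
    assert(table_size < MAX_BDD_NODES);
    table[table_size].var = var;
    table[table_size].lo = lo;
    table[table_size].hi = hi;
    return table_size++;
}

/* Build the characteristic function top-down, one bit per level. */
static int build(const unsigned *set, int n, int var, unsigned prefix) {
    if (var == BITS) {
        for (int i = 0; i < n; i++)
            if (set[i] == prefix) return 1;
        return 0;
    }
    int lo = build(set, n, var + 1, prefix << 1);
    int hi = build(set, n, var + 1, (prefix << 1) | 1u);
    return bdd_mk(var, lo, hi);
}

int bdd_from_set(const unsigned *set, int n) { return build(set, n, 0, 0u); }

/* Membership test: follow one path from the root to a terminal. */
int bdd_contains(int root, unsigned value) {
    int node = root;
    while (node > 1) {
        unsigned bit = (value >> (BITS - 1 - table[node].var)) & 1u;
        node = bit ? table[node].hi : table[node].lo;
    }
    return node;
}
```

Because every node is created through `bdd_mk`, building the same set twice returns the same root id, so equality of sets reduces to comparing two integers. For S = {1, 5} the function does not depend on x1, so the reduced diagram in this sketch tests only x2 and x3.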

BDD Representation Example
[Figure sequence: shared BDDs for the structures S0, S1 (after x = y) and S2.2 (after x = x->n)]
S0: u1 (y=1), u (sm=½), n=½
S1: u1 (x=1, y=1), u (sm=½), n=½
S2.2: u1 (y=1), u.1 (x=1), u.0 (sm=½), n=1, n=½

Improved BDD Representation
Using this representation directly doesn't save space: canonicity doesn't carry over from propositional to first-order logic
Observation
  Node names can be arbitrarily remapped without affecting the ADT semantics
Our heuristics
  Use canonic node names to encode nodes and obtain a canonic representation
  Increases incidental sharing
  Reduces isomorphism test to pointer comparison
  Space reduced by a factor of 4 to 10

Reducing Time Overhead
Current implementation not optimized
Expensive formula evaluation
Hybrid representation
  Distinguish between phases: mutable phase → Join → immutable phase
  Dynamically switch representations

Functional Representation
Alternative representation for first-order structures
Structures represented by maps from integers to Kleene values
Tailored for representing first-order structures
Achieves better results than BDDs
Techniques similar to the BDD representation
More details in the thesis

Introduction to Functional Maps
A mapping $\mathbb{N} \to \{0, ½, 1\}$
Nodes contain a fixed number of values
Hierarchical maps
Sparse maps
Share unique sub-maps
[Figure sequence: hierarchical maps of size 9 and size 27 that share their common sub-maps]
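One way to picture a hierarchical, sparse map with shared unique sub-maps is the sketch below. It is an assumption-laden illustration, not the thesis's implementation: the fan-out of 3, the convention that ids 0..2 denote uniformly valued sub-maps, and the names `map_get` and `map_set` are all hypothetical. Updates are persistent, and every node goes through a unique table, so equal maps always end up with the same root id.

```c
#include <assert.h>

#define FANOUT 3
#define MAX_MAP_NODES 512

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Ids 0..2 denote a sub-map that is uniformly K_FALSE, K_HALF or K_TRUE. */
typedef struct { int child[FANOUT]; } MapNode;
static MapNode pool[MAX_MAP_NODES];
static int pool_size = 3;

/* Return the unique node with these children (hash-consing by linear scan). */
static int map_mk(const int c[FANOUT]) {
    if (c[0] < 3 && c[0] == c[1] && c[1] == c[2]) return c[0];  /* stay sparse */
    for (int i = 3; i < pool_size; i++)
        if (pool[i].child[0] == c[0] && pool[i].child[1] == c[1] &&
            pool[i].child[2] == c[2])
            return i;
    assert(pool_size < MAX_MAP_NODES);
    for (int k = 0; k < FANOUT; k++) pool[pool_size].child[k] = c[k];
    return pool_size++;
}

/* Look up key in a map covering `span` keys (span is a power of FANOUT). */
Kleene map_get(int root, int span, int key) {
    int node = root;
    while (node >= 3) {
        span /= FANOUT;
        node = pool[node].child[key / span];
        key %= span;
    }
    return (Kleene)node;
}

/* Persistent update: returns a new root; untouched sub-maps are shared. */
int map_set(int root, int span, int key, Kleene value) {
    if (span == 1) return (int)value;
    int sub = span / FANOUT;
    int c[FANOUT];
    for (int k = 0; k < FANOUT; k++)
        c[k] = root >= 3 ? pool[root].child[k] : root;
    c[key / sub] = map_set(c[key / sub], sub, key % sub, value);
    return map_mk(c);
}
```

With this scheme, a map of size 9 that is all 0 costs no nodes at all, and two maps that differ in one entry share every sub-map off the path to that entry.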

Functional Representation Example
[Figure sequence: functional-map representations of S0, S1 and S2.2, each split into maps for nullary, unary and binary predicates (sizes 9, 27 and 81), with sub-maps shared across structures]
S0: u1 (y=1), u (sm=½), n=½
S1: u1 (x=1, y=1), u (sm=½), n=½
S2.2: u1 (y=1), u.1 (x=1), u.0 (sm=½), n=1, n=½

Reducing Time Overhead
"Lazy" normalization is used to balance time/space performance

Empirical Evaluation
Benchmarks:
  Cleanness Analysis (SAS 2000)
  Garbage Collector
  CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks
  Mobile Ambients (ESOP 2000)
Stress testing the representations
  We use "relational analysis"
  Save structures in every CFG location

Space Results

Abstract Counters
Ignore language/implementation details
A more reliable measurement technique
Count only crucial space information
Independent of C/Java

Abstract Counters Results

Trends in the Cleanness Analysis Benchmark

Conclusions
Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
Implementation techniques
Substantially better than inherited sharing
Structure canonization is crucial
Normalization via hash-consing is the key technique

Conclusions
The use of BDDs for static analysis is not a panacea for space saving
Domain-specific encoding is crucial for saving space
Failed attempts
  Original implementation of Veith's encoding
  PAG

Tuning Abstraction for Improved Performance
Analysis can be very costly
Explores many structures
GC example explores >180,000 structures

Existing Analysis Modes
Relational analysis
  Doubly-exponential in worst case
  Our most precise method
Single-structure analysis (Tal Lev-Ami SAS 2000)
  Singly-exponential in worst case
  Can be very efficient
  Can be very imprecise
  Sometimes very inefficient

Single-Structure Analysis
[Figure: structures S0 and S1 over nodes u1 (pointed to by x) and u, and their join S0 ⊔ S1, in which a node may exist]

Single-Structure Analysis
Active property
  ac=0: doesn't exist in every concrete structure
  ac=1: exists in every concrete structure
  ac=1/2: may exist in some concrete structure
[Figure: structures S0 and S1 and their join S0 ⊔ S1; u1 has ac=1 throughout, while u has ac=1 in one input and ac=1/2 in the join]

Single-Structure Analysis
Sometimes overly imprecise
Refine analysis by using nullary predicates to distinguish between different structures

Is there a "sweet spot"?
[Figure: relational analysis placed on axes labeled Efficiency and Precision]

Chapter Outline
Removing embedded structures
Merging structures with same set of canonical names
Staged analysis to localize abstraction
Merging pseudo-embedded structures

Order Relations on Structures and Sets of Structures
$S, S' \in \text{3-STRUCT}$
$S \sqsubseteq^{f} S'$ if for every predicate $p$
1. $p^S(u_1, \ldots, u_k) \sqsubseteq p^{S'}(f(u_1), \ldots, f(u_k))$
2. $(|\{u \mid f(u) = u'\}| > 1) \sqsubseteq sm^{S'}(u')$
$X, X' \in 2^{\text{3-STRUCT}}$
$X \sqsubseteq X'$ if every $S \in X$ has $S' \in X'$ with $S \sqsubseteq S'$
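For a given candidate mapping f, the two embedding conditions can be checked directly. The sketch below assumes the same hypothetical array layout as a toy structure with two unary predicates and one binary predicate n; the `embeds` function and its types are illustrative names. It also assumes that f must be onto, as in the usual definition of embedding, although the slide does not state this.

```c
#include <assert.h>

#define MAX_NODES 8
#define NUM_UNARY 2            /* illustrative: predicates x and r[n,x] */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

typedef struct {
    int    size;
    Kleene unary[NUM_UNARY][MAX_NODES];
    Kleene sm[MAX_NODES];
    Kleene n[MAX_NODES][MAX_NODES];
} Structure;

/* Information order on Kleene values: b equals a, or b is 1/2. */
static int leq(Kleene a, Kleene b) { return a == b || b == K_HALF; }

/* Returns 1 iff s embeds into t via f (f[u] is the image of node u of s). */
int embeds(const Structure *s, const Structure *t, const int f[]) {
    int preimages[MAX_NODES] = { 0 };
    for (int u = 0; u < s->size; u++) {
        if (f[u] < 0 || f[u] >= t->size) return 0;
        preimages[f[u]]++;
    }
    /* Condition 1: every predicate value in s is below its image in t. */
    for (int u = 0; u < s->size; u++) {
        if (!leq(s->sm[u], t->sm[f[u]])) return 0;
        for (int p = 0; p < NUM_UNARY; p++)
            if (!leq(s->unary[p][u], t->unary[p][f[u]])) return 0;
        for (int v = 0; v < s->size; v++)
            if (!leq(s->n[u][v], t->n[f[u]][f[v]])) return 0;
    }
    /* Condition 2: a node with several preimages must be a summary node. */
    for (int a = 0; a < t->size; a++) {
        if (preimages[a] == 0) return 0;    /* assumption: f is onto */
        Kleene many = preimages[a] > 1 ? K_TRUE : K_FALSE;
        if (!leq(many, t->sm[a])) return 0;
    }
    return 1;
}
```

Checking a fixed f is cheap; the hard part is finding an f in the first place.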

Compacting Transformations
We look for a transformation $T : 2^{\text{3-STRUCT}} \to 2^{\text{3-STRUCT}}$ with the following properties:
1. Compacting: $|T(X)| \le |X|$
2. Conservative: $T(X) \sqsupseteq X$
Without sacrificing precision

Removing Embedded Structures
[Figure: structures S0 and S1 over nodes u0 (x, r[n,x]), u1 and u2 (r[n,t], r[n,y]), with y, t and n edges, related by the embedding function f]
Reversing a list with exactly 3 cells
Reversing a list with at least 3 cells

Detecting Embedding is Hard
In general, as hard as GRAPH ISOMORPHISM
Conditions for a unique mapping:
  Canonical abstraction
  Definite values
Polynomial time check

Results (#structures explored)

Canonical Names Method
Canonical abstraction merges individuals with the same canonical names (unary abstraction predicate values)
Merge structures with the same set of canonical names
Both transformations preserve "definity" of abstraction predicates
But ignores precision of non-abstraction predicates

Canonical Abstraction Example
[Figure: a list of nodes u0, u1, u2, u3 linked by n, with x on u0 and r[n,x] on every node, abstracted to u0 and a summary node u]

Merging Structures with Same Canonical Names Example
[Figure sequence: structures S0 and S1, both over the nodes u0 (pointed to by x) and u but with different n edges, and the merged structure S0 ⊔ S1]
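The merge step can be sketched by indexing abstract nodes by canonical name, so that two structures over the same names line up position by position and their remaining predicates are simply joined. The `NamedStructure` layout, the bound of 9 names, and the function name are assumptions made for this illustration only.

```c
#include <assert.h>

#define NUM_NAMES 9    /* assumption: two unary abstraction predicates, 3^2 names */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* An abstract structure whose nodes are indexed by canonical name. */
typedef struct {
    int    present[NUM_NAMES];          /* does a node with this name exist? */
    Kleene sm[NUM_NAMES];
    Kleene n[NUM_NAMES][NUM_NAMES];     /* a non-abstraction predicate */
} NamedStructure;

static Kleene join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

/* If a and b have the same set of canonical names, write their join to out
 * and return 1; otherwise leave out untouched and return 0. */
int merge_same_names(const NamedStructure *a, const NamedStructure *b,
                     NamedStructure *out) {
    for (int i = 0; i < NUM_NAMES; i++)
        if ((a->present[i] != 0) != (b->present[i] != 0)) return 0;
    *out = *a;
    for (int i = 0; i < NUM_NAMES; i++) {
        out->sm[i] = join(a->sm[i], b->sm[i]);
        for (int j = 0; j < NUM_NAMES; j++)
            out->n[i][j] = join(a->n[i][j], b->n[i][j]);
    }
    return 1;
}
```

The abstraction predicates themselves are untouched by the merge, since they are fixed by the names; only predicates such as n can become 1/2.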

Results (#structures explored)

Localizing Abstraction
Find an appropriate subset of abstraction predicates for every CFG node
Observation: programs contain dead variables, which can be exploited to make the corresponding predicates "dead"
Compute "predicate liveness" to determine the subset of abstraction predicates

reverse Example
    List reverse(List x) {
    L0:   List y, t;
    L1:   y = NULL;
    L2:   while (x != NULL) {
    L3:     t = y;
    L4:     y = x;
    L5:     x = x->n;
    L6:     y->n = t;
          }
    L7:   return y;
    }
Liveness annotations shown on the slide: y dead, t dead, all dead
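The dead-variable information on this slide can be reproduced with a textbook backward liveness analysis over the control-flow graph of reverse. The sketch below is plain variable liveness, from which a predicate-liveness analysis would have to be derived; the use/def sets and the CFG encoding are my own reading of the code, and all names are hypothetical.

```c
#include <assert.h>

enum { X = 1, Y = 2, T = 4 };           /* one bit per pointer variable */
#define NUM_STMTS 7                     /* index i stands for label L(i+1) */

/* succ entries are statement indices; -1 means no successor. */
typedef struct { unsigned use, def; int succ[2]; } Stmt;

static const Stmt reverse_cfg[NUM_STMTS] = {
    { 0,     Y, {  1, -1 } },   /* L1: y = NULL           */
    { X,     0, {  2,  6 } },   /* L2: while (x != NULL)  */
    { Y,     T, {  3, -1 } },   /* L3: t = y              */
    { X,     Y, {  4, -1 } },   /* L4: y = x              */
    { X,     X, {  5, -1 } },   /* L5: x = x->n           */
    { Y | T, 0, {  1, -1 } },   /* L6: y->n = t           */
    { Y,     0, { -1, -1 } },   /* L7: return y           */
};

/* Iterate live_in = use | (live_out & ~def) until a fixed point is reached. */
void live_variables(const Stmt *cfg, int n, unsigned live_in[]) {
    int changed = 1;
    for (int i = 0; i < n; i++) live_in[i] = 0;
    while (changed) {
        changed = 0;
        for (int i = n - 1; i >= 0; i--) {
            unsigned out = 0;
            for (int k = 0; k < 2; k++)
                if (cfg[i].succ[k] >= 0) out |= live_in[cfg[i].succ[k]];
            unsigned in = cfg[i].use | (out & ~cfg[i].def);
            if (in != live_in[i]) { live_in[i] = in; changed = 1; }
        }
    }
}
```

Under this encoding, t is not live on entry to L2 or L3 and y is not live on entry to L4, so the predicates for those variables would be candidates for dropping from the abstraction at those points.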

Results (#structures explored)

Compaction via Pseudo-Embedding
Pseudo-embedding: similar to embedding, but only with respect to abstraction predicates
$S, S' \in \text{3-STRUCT}$
$S \sqsubseteq'^{f} S'$ if for every abstraction predicate $p$
1. $p^S(u) \sqsubseteq p^{S'}(f(u))$
2. $(|\{u \mid f(u) = u'\}| > 1) \sqsubseteq sm^{S'}(u')$

Modified blur
Order relation on nodes: $u_1 \sqsubseteq u_2$ if for every abstraction predicate $p$, $p^S(u_1) \sqsubseteq p^{S'}(u_2)$
blur' merges $u_1$ with $u_2$ if $u_1 \sqsubseteq u_2$

blur' Example
[Figure: a structure over nodes u0 and u with r[n,x], x and n edges, before and after applying blur']

Merging Pseudo-Embedded Structures Example
Abstraction predicates = {x, y}
Non-abstraction predicates = {r[n,x], r[n,y], n}
[Figure: structures S0 and S1 and the merged structure S0 ⊔ S1, in which r[n,y] = 1/2 on node u]

Results (#structures explored)

Empirical Evaluation
Benchmarks:
  Garbage Collector
  Mobile Ambients (ESOP 2000)
  Sorting procedures (ISSTA 2000)
MA + J2: completed without instrumentation predicates and without messages

Results (#structures explored)
[Table legend: false alarms, out of memory, out of time]

Conclusion
New method is usually much more efficient (by orders of magnitude)
Doesn't lose precision on benchmarks
Performance more stable than other methods

Future and Ongoing Work
Time optimizations
  Symbolic (BDD) execution of TVLA operations
  Compactly represent sets of structures
Improving abstraction locality
  Truly live predicates
  Analyzing liveness for core predicates and deriving it for instrumentation predicates
Experiment with other compacting transformations
Achieve polynomial complexity

The End