Program Analysis and Verification 0368-4479 Noam Rinetzky Lecture 10: Shape Analysis 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.

Slides:



Advertisements
Similar presentations
Abstract Interpretation Part II
Advertisements

Continuing Abstract Interpretation We have seen: 1.How to compile abstract syntax trees into control-flow graphs 2.Lattices, as structures that describe.
Predicate Abstraction and Canonical Abstraction for Singly - linked Lists Roman Manevich Mooly Sagiv Tel Aviv University Eran Yahav G. Ramalingam IBM T.J.
Shape Analysis by Graph Decomposition R. Manevich M. Sagiv Tel Aviv University G. Ramalingam MSR India J. Berdine B. Cook MSR Cambridge.
Interprocedural Shape Analysis for Recursive Programs Noam Rinetzky Mooly Sagiv.
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
3-Valued Logic Analyzer (TVP) Tal Lev-Ami and Mooly Sagiv.
A survey of techniques for precise program slicing Komondoor V. Raghavan Indian Institute of Science, Bangalore.
1 Lecture 07 – Shape Analysis Eran Yahav. Previously  LFP computation and join-over-all-paths  Inter-procedural analysis  call-string approach  functional.
1 Lecture 08(a) – Shape Analysis – continued Lecture 08(b) – Typestate Verification Lecture 08(c) – Predicate Abstraction Eran Yahav.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
1 Operational Semantics Mooly Sagiv Tel Aviv University Textbook: Semantics with Applications.
Establishing Local Temporal Heap Safety Properties with Applications to Compile-Time Memory Management Ran Shaham Eran Yahav Elliot Kolodner Mooly Sagiv.
Program analysis Mooly Sagiv html://
Program analysis Mooly Sagiv html://
1 Control Flow Analysis Mooly Sagiv Tel Aviv University Textbook Chapter 3
Model Checking of Concurrent Software: Current Projects Thomas Reps University of Wisconsin.
1 Eran Yahav and Mooly Sagiv School of Computer Science Tel-Aviv University Verifying Safety Properties.
Abstract Interpretation Part I Mooly Sagiv Textbook: Chapter 4.
Modular Shape Analysis for Dynamically Encapsulated Programs Noam Rinetzky Tel Aviv University Arnd Poetzsch-HeffterUniversität Kaiserlauten Ganesan RamalingamMicrosoft.
Overview of program analysis Mooly Sagiv html://
Modular Shape Analysis for Dynamically Encapsulated Programs Noam Rinetzky Tel Aviv University Arnd Poetzsch-HeffterUniversität Kaiserlauten Ganesan RamalingamMicrosoft.
1 Shape Analysis via 3-Valued Logic Mooly Sagiv Tel Aviv University Shape analysis with applications Chapter 4.6
Comparison Under Abstraction for Verifying Linearizability Daphna Amit Noam Rinetzky Mooly Sagiv Tom RepsEran Yahav Tel Aviv UniversityUniversity of Wisconsin.
A Semantics for Procedure Local Heaps and its Abstractions Noam Rinetzky Tel Aviv University Jörg Bauer Universität des Saarlandes Thomas Reps University.
Static Program Analysis via Three-Valued Logic Thomas Reps University of Wisconsin Joint work with M. Sagiv (Tel Aviv) and R. Wilhelm (U. Saarlandes)
Precision Going back to constant prop, in what cases would we lose precision?
Dagstuhl Seminar "Applied Deductive Verification" November Symbolically Computing Most-Precise Abstract Operations for Shape.
Thread Quantification for Concurrent Shape Analysis Josh BerdineMSR Cambridge Tal Lev-AmiTel Aviv University Roman ManevichTel Aviv University Mooly Sagiv.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
TVLA: A system for inferring Quantified Invariants Tal Lev-Ami Tom Reps Mooly Sagiv Reinhard Wilhelm Greta Yorsh.
Program Analysis and Verification
Shape Analysis Overview presented by Greta Yorsh.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 9: Abstract Interpretation I Roman Manevich Ben-Gurion University.
Symbolically Computing Most-Precise Abstract Operations for Shape Analysis Greta Yorsh Thomas Reps Mooly Sagiv Tel Aviv University University of Wisconsin.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 12: Abstract Interpretation IV Roman Manevich Ben-Gurion University.
1 Shape Analysis via 3-Valued Logic Mooly Sagiv Tel Aviv University Shape analysis with applications Chapter 4.6
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 13: Abstract Interpretation V Roman Manevich Ben-Gurion University.
Data Structures and Algorithms for Efficient Shape Analysis by Roman Manevich Prepared under the supervision of Dr. Shmuel (Mooly) Sagiv.
1 Combining Abstract Interpreters Mooly Sagiv Tel Aviv University
Program Analysis and Verification
Operational Semantics Mooly Sagiv Tel Aviv University Textbook: Semantics with Applications Chapter.
1 Program Analysis via 3-Valued Logic Mooly Sagiv, Tal Lev-Ami, Roman Manevich Tel Aviv University Thomas Reps, University of Wisconsin, Madison Reinhard.
Quantified Data Automata on Skinny Trees: an Abstract Domain for Lists Pranav Garg 1, P. Madhusudan 1 and Gennaro Parlato 2 1 University of Illinois at.
Program Analysis via 3-Valued Logic Thomas Reps University of Wisconsin Joint work with Mooly Sagiv and Reinhard Wilhelm.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
Roman Manevich Ben-Gurion University Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 16: Shape Analysis.
1 Numeric Abstract Domains Mooly Sagiv Tel Aviv University Adapted from Antoine Mine.
Operational Semantics Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Shape & Alias Analyses Jaehwang Kim and Jaeho Shin Programming Research Laboratory Seoul National University
Interprocedural shape analysis for cutpoint-free programs Noam Rinetzky Tel Aviv University Joint work with Mooly Sagiv Tel Aviv University Eran Yahav.
Operational Semantics Mooly Sagiv Reference: Semantics with Applications Chapter 2 H. Nielson and F. Nielson
Finding bugs with a constraint solver daniel jackson. mandana vaziri mit laboratory for computer science issta 2000.
Putting Static Analysis to Work for Verification A Case Study Tal Lev-Ami Thomas Reps Mooly Sagiv Reinhard Wilhelm.
Spring 2017 Program Analysis and Verification
Noam Rinetzky Lecture 13: Numerical, Pointer & Shape Domains
Interprocedural shape analysis for cutpoint-free programs
Partially Disjunctive Heap Abstraction
Compactly Representing First-Order Structures for Static Analysis
Spring 2016 Program Analysis and Verification
Program Analysis and Verification
Compile-Time Verification of Properties of Heap Intensive Programs
Symbolic Implementation of the Best Transformer
Parametric Shape Analysis via 3-Valued Logic
Program Analysis and Verification
Program Analysis and Verification
Parametric Shape Analysis via 3-Valued Logic
Program Analysis and Verification
Program Analysis and Verification
Symbolic Characterization of Heap Abstractions
Presentation transcript:

Program Analysis and Verification Noam Rinetzky Lecture 10: Shape Analysis 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

Abstract Interpretation [Cousot’77] Mathematical foundation of static analysis – Abstract domains Abstract states Join (  ) – Transformer functions Abstract steps – Chaotic iteration Abstract computation Structured Programs Lattices (D, , , , ,  ) Lattices (D, , , , ,  ) Monotonic functions Fixpoints

The collecting lattice Lattice for a given control-flow node v: L v =(2 State, , , , , State) Lattice for entire control-flow graph with nodes V: L CFG = Map(V, L v ) We will use this lattice as a baseline for static analysis and define abstractions of its elements

Galois Connection 4 1   c 2 (c)(c) 3  (  (c))  The most precise (least) element in A representing c CA c   (  (c))

Galois Connection Given two complete lattices C = (D C,  C,  C,  C,  C,  C )– concrete domain A = (D A,  A,  A,  A,  A,  A )– abstract domain A Galois Connection (GC) is quadruple (C, , , A) that relates C and A via the monotone functions – The abstraction function  : D C  D A – The concretization function  : D A  D C for every concrete element c  D C and abstract element a  D A  (  (a))  a and c   (  (c)) Alternatively  (c)  a iff c   (a) 5

Abstract (conservative) interpretation set of states operational semantics (concrete semantics) statement S set of states  abstract representation abstract semantics statement S abstract representation concretization

Shape Analysis 7 Automatically verify properties of programs manipulating dynamically allocated storage Identify all possible shapes (layout) of the heap

Analyzing Singly Linked Lists

Limitations of pointer analysis 9 // Singly-linked list // data type. class SLL { int data; public SLL n; // next cell SLL(Object data) { this.data = data; this.n = null; } } // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; }

Flow&Field-sensitive Analysis 10 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; }

Flow&Field-sensitive Analysis 11 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t

Flow&Field-sensitive Analysis 12 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t L1 n

Flow&Field-sensitive Analysis 13 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n

Flow&Field-sensitive Analysis 14 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n

Flow&Field-sensitive Analysis 15 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n

Flow&Field-sensitive Analysis 16 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n

Flow&Field-sensitive Analysis 17 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n tmp null h t L1 n  tmp != null tmp == null

Flow&Field-sensitive Analysis 18 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n tmp null h t L1 n 

Flow&Field-sensitive Analysis 19 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n tmp == null ?

Flow&Field-sensitive Analysis 20 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n

Flow&Field-sensitive Analysis 21 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n Possible null dereference!

Flow&Field-sensitive Analysis 22 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n Fixed-point for first loop

Flow&Field-sensitive Analysis 23 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n

Flow&Field-sensitive Analysis 24 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n

Flow&Field-sensitive Analysis 25 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } tmp null h t L1 n L2 n Possible null dereference!

What was the problem? Pointer analysis abstract all objects allocated at same program location into one summary object. However, objects allocated at same memory location may behave very differently – E.g., object is first/last one in the list – Assign extra predicates to distinguish between objects with different roles Number of objects represented by summary object  1 – does not allow strong updates – Distinguish between concrete objects (#=1) and abstract objects (#  1) Join operator very coarse – abstracts away important distinctions (tmp=null/tmp!=null) – Apply disjunctive completion 26

Improved solution Pointer analysis abstract all objects allocated at same program location into one summary object. However, objects allocated at same memory location may behave very differently – E.g., object is first/last one in the list – Add extra instrumentation predicates to distinguish between objects with different roles Number of objects represented by summary object  1 – does not allow strong updates – Distinguish between concrete objects (#=1) and abstract objects (#  1) Join operator very coarse – abstracts away important distinctions (tmp=null/tmp!=null) – Apply disjunctive completion 27

Adding properties to objects Let’s first drop allocation site information and instead… Define a unary predicate x(v) for each pointer variable x meaning x points to x Predicate holds for at most one node Merge together nodes with same sets of predicates 28

Flow&Field-sensitive Analysis 29 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t

Flow&Field-sensitive Analysis 30 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {h, t} n

Flow&Field-sensitive Analysis 31 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {h, t} tmp n

Flow&Field-sensitive Analysis 32 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {h, t} tmp n {tmp} n No null dereference

Flow&Field-sensitive Analysis 33 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {h, t} tmp n {tmp} n

Flow&Field-sensitive Analysis 34 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n {h, tmp} n

Flow&Field-sensitive Analysis 35 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n {h, tmp} n  null h t {h, t} tmp n

Flow&Field-sensitive Analysis 36 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n {h, tmp} n  null h t {h, t} tmp n Do we need to analyze this shape graph again?

Flow&Field-sensitive Analysis 37 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n {h, tmp} n  null h t {h, t} tmp n

Flow&Field-sensitive Analysis 38 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { h} n { tmp} n No null dereference

Flow&Field-sensitive Analysis 39 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { h} n { tmp} n

Flow&Field-sensitive Analysis 40 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n { h, tmp} n

Flow&Field-sensitive Analysis 41 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n { h, tmp} n

Flow&Field-sensitive Analysis 42 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n { h} n { tmp} n No null dereference

Flow&Field-sensitive Analysis 43 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n { h} n { tmp} n

Flow&Field-sensitive Analysis 44 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n n {h, tmp} n

Flow&Field-sensitive Analysis 45 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n {h, tmp} n n Summarization How many objects does it represent?

Flow&Field-sensitive Analysis 46 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n {h} n n { tmp} No null dereference n

Flow&Field-sensitive Analysis 47 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n {h} n n { tmp} n

Flow&Field-sensitive Analysis 48 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n {h, tmp} n n Fixed-point for first loop

Flow&Field-sensitive Analysis 49 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n {h, tmp} n n

Flow&Field-sensitive Analysis 50 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { } n {h, tmp} n n

Flow&Field-sensitive Analysis 51 // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; } null h t {t} tmp n { tmp?} n { tmp} n n

Best (induced) transformer 52 CA 2 3 f f # (a)=  (f(  (a))) 1 f#f# 4 Problem:  incomputable directly

Best transformer for tmp=tmp.n 53 null h t {t} tmp n { } n {h, tmp} n n null h t {t} tmp n { } n {h, tmp} n null h t {t} tmp n { } n {h, tmp} n { } 

Best transformer for tmp=tmp.n:  54 null h t {t} tmp n { } n {h, tmp} n n null h t {t} tmp n { } n {h, tmp} n null h t {t} tmp n { } n {h, tmp} n { }  null h t {t} tmp n { } n {h, tmp} n { } …

Best transformer for  tmp=tmp.n  55 null h t {t} tmp n { } n {h, tmp} n n null h t {t} tmp n {tmp } n {h} n null h t {t} tmp n { } n {h} n { } {tmp } null h t {t} tmp n { } n {h} n { } {tmp } …

Best transformer for tmp=tmp.n:  56 null h t {t} tmp n { } n {h, tmp} n n null h t {t} tmp n {tmp } n {h} n null h t {t} tmp n { } n {h} n {tmp } null h t {t} tmp n { } n {h} n {tmp } …

Handling updates on summary nodes Transformers accessing only concrete nodes are easy Transformers accessing summary nodes are complicated Can’t concretize summary nodes – represents potentially unbounded number of concrete nodes We need to split into cases by “materializing” concrete nodes from summary node – Introduce a new temporary predicate tmp.n – Partial concretization 57

Transformer for tmp=tmp.n:  ’ 58 null h t {t} tmp n { } n {h, tmp} n n ’’ Case 1: Exactly 1 object. Case 2: >1 objects.

Transformer for tmp=tmp.n:  ’ 59 null h t {t} tmp n { } n {h, tmp} n n null h t {t} tmp n {tmp.n } n {h, tmp} n null h t {t} tmp n { } n {h, tmp} n {tmp.n } ’’ tmp.n is a concrete node Summary node spaghetti

Transformer  tmp=tmp.n  60 null h t {t} tmp n { } n {h, tmp} n n null h t {t} tmp n {tmp, tmp.n } n {h} n null h t {t} tmp n { } n {h} n {tmp, tmp.n }

Transformer for tmp=tmp.n:  61 null h t {t} tmp n {tmp} n {h} n null h t {t} tmp n { } n {h} n {tmp} // Build a list SLL h=null, t = null; L1: h=t= new SLL(-1); SLL tmp = null; while (…) { int data = getData(…); L2: tmp = new SLL(data); tmp.n = h; h = tmp; } // Process elements tmp = h; while (tmp != t) { assert tmp != null; tmp.data += 1; tmp = tmp.n; }

Transformer via partial-concretization 62 CA 2 3 f f # (a)=  ’(f # ’(  ’(a))) 1 f#f# 4 A’ 5 f #’ 6 ’’ ’’

Recap Adding more properties to nodes refines abstraction Can add temporary properties for partial concretization – Materialize concrete nodes from summary nodes – Allows turning weak updates into strong ones – Focus operation in shape-analysis lingo – Not trivial in general and requires more semantic reduction to clean up impossible edges – General algorithms available via 3-valued logic and implemented in TVLA system 63

3-Value logic based shape analysis

Sequential Stack void push (int v) { Node  x = malloc(sizeof(Node)); x->d = v; x->n = Top; Top = x; } 65 int pop() { if (Top == NULL) return EMPTY; Node *s = Top->n; int r = Top->d; Top = s; return r; } Want to Verify No Null Dereference Underlying list remains acyclic after each operation Want to Verify No Null Dereference Underlying list remains acyclic after each operation

Shape Analysis via 3-valued Logic 1) Abstraction – 3-valued logical structure – canonical abstraction 2) Transformers – via logical formulae – soundness by construction embedding theorem, [SRW02] 66

Concrete State represent a concrete state as a two-valued logical structure – Individuals = heap allocated objects – Unary predicates = object properties – Binary predicates = relations parametric vocabulary 67 Top nn u1u2u3 (storeless, no heap addresses)

Concrete State S = over a vocabulary P U – universe  - interpretation, mapping each predicate from p to its truth value in S 68 Top nn u1u2u3  U = { u1, u2, u3}  P = { Top, n }   (n)(u1,u2) = 1,  (n)(u1,u3)=0,  (n)(u2,u1)=0,…   (Top)(u1)=1,  (Top)(u2)=0,  (Top)(u3)=0

 v1,v2: n(v1, v2)  n*(v2, v1) Formulae for Observing Properties void push (int v) { Node  x = malloc(sizeof(Node)); x  d = v; x  n = Top; Top = x; } 69 Top nn u1u2u3 1 No Cycles  v1,v2: n(v1, v2)  n*(v2, v1) 1  w:Top(w) Top != null  w: x(w) No node precedes Top  v1,v2: n(v1, v2)  Top(v2) No node precedes Top  v1,v2: n(v1, v2)  Top(v2) 1

Concrete Interpretation Rules 70

Example: s = Top  n 71 Top s ’ (v) =  v1: Top(v1)  n(v1,v) u1 u2u3 Top s u1 u2u3 Top u11 u20 u30 nu1u2U3 u1010 u2001 u3000 s u10 u20 u30 Top u11 u20 u30 nu1u2U3 u1010 u2001 u3000 s u10 u21 u30

Collecting Semantics 72  {  st(w)  (S) | S  CSS[w] }  (w,v)  E(G), w  Assignments(G) { } CSS [v] = if v = entry  { S | S  CSS[w] }  (w,v)  E(G), w  Skip(G)  { S | S  CSS[w] and S  cond(w)}  (w,v)  True-Branches(G)  { S | S  CSS[w] and S   cond(w)} (w,v)  False-Branches(G) othrewise

Collecting Semantics At every program point – a potentially infinite set of two-valued logical structures Representing (at least) all possible heaps that can arise at the program point Next step: find a bounded abstract representation 73

3-Valued Logic 1 = true 0 = false 1/2 = unknown  A join semi-lattice, 0  1 = 1/2 74 1/2 0 1 information order logical order

3-Valued Logical Structures A set of individuals (nodes) U Relation meaning – Interpretation of relation symbols in P p 0 ()  {0,1, 1/2} p 1 (v)  {0,1, 1/2} p 2 (u,v)  {0,1, 1/2} A join semi-lattice: 0  1 = 1/2 75

Boolean Connectives [Kleene]

Property Space 3-struct[P] = the set of 3-valued logical structures over a vocabulary (set of predicates) P Abstract domain –  (3-Struct[P]) –  is  77

Embedding Order Given two structures S =, S’ = and an onto function f : U  U’ mapping individuals in U to individuals in U’ We say that f embeds S in S’ (denoted by S  S’) if – for every predicate symbol p  P of arity k: u1, …, uk  U,  (p)(u1,..., uk)   ’(p)(f(u1),..., f (uk)) – and for all u’  U’ ( | { u | f (u) = u’ } | > 1)   ’(sm)(u’) We say that S can be embedded in S’ (denoted by S  S’) if there exists a function f such that S  f S’ 78

Tight Embedding S’ = is a tight embedding of S= with respect to a function f if: – S’ does not lose unnecessary information  ’(u’ 1,…, u’ k ) =  {  (u 1..., u k ) | f(u 1 )=u’ 1,..., f(u k )=u’ k } One way to get tight embedding is canonical abstraction 80

Canonical Abstraction 81 Top nn u1u2u3 u1u2u3 [Sagiv, Reps, Wilhelm, TOPLAS02]

Canonical Abstraction 82 Top nn u1u2u3 u1u2u3 Top [Sagiv, Reps, Wilhelm, TOPLAS02]

Canonical Abstraction 83 Top nn u1u2u3 u1u2u3 Top

Canonical Abstraction 84 Top nn u1u2u3 u1u2u3 Top

Canonical Abstraction 85 Top nn u1u2u3 u1u2u3 Top

Canonical Abstraction 86 Top nn u1u2u3 u1u2u3 Top

Canonical Abstraction (  ) Merge all nodes with the same unary predicate values into a single summary node Join predicate values  ’(u’ 1,..., u’ k ) =  {  (u 1,..., u k ) | f(u 1 )=u’ 1,..., f(u k )=u’ k } Converts a state of arbitrary size into a 3-valued abstract state of bounded size  (C) =  {  (c) | c  C } 87

Information Loss 88 Top nn nn n Canonical abstraction Top n n

Instrumentation Predicates Record additional derived information via predicates 89 r x (v) =  v1: x(v1)  n*(v1,v) c(v) =  v1: n(v1, v)  n*(v, v1) Top nn c c c c nn n c c Canonical abstraction

Embedding Theorem: Conservatively Observing Properties 90 No Cycles  v1,v2: n(v1, v2)  n*(v2, v1) 1/2 Top r Top No cycles (derived)  v:  c(v) 1

Operational Semantics void push (int v) { Node  x = malloc(sizeof(Node)); x->d = v; x->n = Top; Top = x; } 91 int pop() { if (Top == NULL) return EMPTY; Node *s = Top->n; int r = Top->d; Top = s; return r; } s ’ (v) =  v1: Top(v1)  n(v1,v)  s = Top->n  Top nn u1u2u3 Top nn u1u2u3 s

Abstract Semantics 92 Top s = Top  n ? r Top Top r Top s s ’ (v) =  v1: Top(v1)  n(v1,v)  s = Top->n 

Top 93 Best Transformer (s = Top  n) Concrete Semantics  Canonical Abstraction Abstract Semantics ? r Top Top r Top Top r Top Top s r Top Top s r Top Top s r Top Top s r Top s ’ (v) =  v1: Top(v1)  n(v1,v)

Semantic Reduction Improve the precision of the analysis by recovering properties of the program semantics A Galois connection (C, , , A) An operation op:A  A is a semantic reduction when –  l  L 2 op(l)  l and –  (op(l)) =  (l) l CA   op

The Focus Operation Focus: Formula  (  (3-Struct)   (3-Struct)) Generalizes materialization For every formula  – Focus(  )(X) yields structure in which  evaluates to a definite values in all assignments – Only maximal in terms of embedding – Focus(  ) is a semantic reduction – But Focus(  )(X) may be undefined for some X

Top r Top Top r Top 96 Partial Concretization Based on Transformer (s=Top  n) Abstract Semantics Partial Concretization Canonical Abstraction Abstract Semantics Top s r Top Top s r Top Top r Top s Top s r Top Top r Top s ’ (v) =  v1: Top(v1)  n(v1,v)  u: top(u)  n(u, v) Focus (Top  n)

97 Partial Concretization Locally refine the abstract domain per statement Soundness is immediate Employed in other shape analysis algorithms [Distefano et.al., TACAS’06, Evan et.al., SAS’07, POPL’08]

The Coercion Principle Another Semantic Reduction Can be applied after Focus or after Update or both Increase precision by exploiting some structural properties possessed by all stores (Global invariants) Structural properties captured by constraints Apply a constraint solver

Apply Constraint Solver Top r Top x x Top r Top x x x x rxrx r x, r y n n n n n n n n y y n n Top r Top  x x rxrx r x, r y n n n n y y n n

Sources of Constraints Properties of the operational semantics Domain specific knowledge – Instrumentation predicates User supplied

Example Constraints x(v1)  x(v2)  eq(v1, v2) n(v, v1)  n(v,v2)  eq(v1, v2) n(v1, v)  n(v2,v)   eq(v1, v2)  is(v) n*(v3, v4)  t[n](v1, v2)

Abstract Transformers: Summary Kleene evaluation yields sound solution Focus is a statement-specific partial concretization Coerce applies global constraints

Abstract Semantics 103  { t_embed(coerce(  st(w)  3 (focus F(w) (SS[w] ))))  (w,v)  E(G), w  Assignments(G) { } SS [v] = if v = entry  { S | S  SS[w] }  (w,v)  E(G), w  Skip(G)  { t_embed(S) | S  coerce(  st(w)  3 (focus F(w) (SS[w] ))) and S  3 cond(w)}  (w,v)  True-Branches(G)  { t_embed(S) | S  coerce(  st(w)  3 (focus F(w) (SS[w] ))) and S  3  cond(w)}  (w,v)  False-Branches(G) othrewise

Recap Abstraction – canonical abstraction – recording derived information Transformers – partial concretization (focus) – constraint solver (coerce) – sound information extraction 104

Stack Push void push (int v) { Node  x = alloc(sizeof(Node)); x  d = v; x  n = Top; Top = x; } 105 Top x x x x x x emp x x x x x x Top  v:  c(v)  v: x(v)  v1,v2: n(v1, v2)  Top(v2) …