Shape & Alias Analyses Jaehwang Kim and Jaeho Shin Programming Research Laboratory Seoul National University 2005-08-08.


Shape & Alias Analyses Jaehwang Kim and Jaeho Shin Programming Research Laboratory Seoul National University

Overview: k-limited Graph, Graph Type, Shape Type, Memory Type, Shape Grammar, May-Alias Analysis with Access Paths, Shape Analysis via 3-valued Logic

k-limited Graph [Jones&Muchnick79]
Parts of AST
Motivation
–Optimization: storage allocation, garbage collection
–Memory error detection

Store = Set of Graphs
Semantics of assignments, for σ ∈ Store and node: Var → Store → Node
–AS[[X := atom]]σ = add a new leaf node labeled a; move X to the new node
–AS[[X := Y]]σ = move X to (node Y σ)
–AS[[X := cons(Y,Z)]]σ = make a new node n and move X to label n; add hd and tl edges from n to (node Y σ) and (node Z σ) respectively
[Figure: a store graph with variables X, Y, Z, atoms a, b, c, and hd/tl edges]
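The three assignment rules can be sketched in Python as a rough illustration. The class names `Node` and `Store` and the method names are hypothetical, not taken from the slides; variables are modelled as labels attached to nodes, and `hd`/`tl` are the only selectors.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(eq=False)  # identity-based equality: two nodes are equal only if they are the same cell
class Node:
    atom: Optional[str] = None                               # label of a leaf node
    edges: Dict[str, "Node"] = field(default_factory=dict)   # "hd" / "tl" successors


class Store:
    """One concrete store graph (hypothetical helper, not the authors' code)."""

    def __init__(self):
        self.var = {}                      # variable name -> node it labels

    def node(self, x):
        return self.var[x]

    def assign_atom(self, x, a):           # X := atom
        self.var[x] = Node(atom=a)         # new leaf, then move X onto it

    def assign_var(self, x, y):            # X := Y
        self.var[x] = self.node(y)         # X now labels Y's node

    def assign_cons(self, x, y, z):        # X := cons(Y, Z)
        n = Node()
        n.edges["hd"] = self.node(y)
        n.edges["tl"] = self.node(z)
        self.var[x] = n


store = Store()
store.assign_atom("Y", "a")
store.assign_atom("Z", "b")
store.assign_cons("X", "Y", "Z")
store.assign_var("W", "X")
```

After these four assignments, X and W label the same cons node, whose hd edge reaches Y's leaf and whose tl edge reaches Z's leaf.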

Share = Set of k-limited Graphs
k-limited graph
–All nodes are accessible in k or fewer steps from at least one variable
–Each node may be labeled by one or more variables
–Unknown nodes are labeled ?: such a node represents a set of nodes whose internal structure is not known
–?s: the node may contain sharing
–?c: the node may contain cycles
–Edges from unknown nodes are drawn as --->
[Figure: a k-limited graph over variables X, Y, Z, W with unknown nodes ?, ?s and ?c]
Finite variables, finite size, finite selectors => Share is finite!

AS[[assign]] ∈ Share → Share
[[s]] D = ⋃_{σ ∈ D} clean(next[[s]] σ)
–next[[s]]σ: creates a node or edges, or moves a variable to another node
–clean(σ): makes the graph k-limited
1. U(σ) = subgraph inaccessible from any variable by a path of length k-1 or less
2. Remove inaccessible nodes
3. Partition U(σ) into SCCs; coalesce each SCC into an unknown node labeled ?c
4. Partition U(σ) into CCs; coalesce each CC into an unknown node labeled ?c, ?s, or ?
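A simplified sketch of the clean step in Python may help make the four steps concrete. It is not the algorithm from the paper: it merges the SCC and CC passes into a single pass over connected components, and the graph encoding, the function names and the summary-node names (`unknown0`, ...) are assumptions made for this illustration.

```python
from collections import deque


def classify(component, edges, live):
    """Label a coalesced component: ?c if it has a cycle, ?s if shared, else ?."""
    internal_in = {n: 0 for n in component}
    total_in = {n: 0 for n in component}
    for n in live:
        for m in edges.get(n, {}).values():
            if m in component:
                total_in[m] += 1
                if n in component:
                    internal_in[m] += 1
    # Kahn-style peeling: nodes left over after peeling lie on a cycle.
    ready = [n for n in component if internal_in[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for m in edges.get(n, {}).values():
            if m in component:
                internal_in[m] -= 1
                if internal_in[m] == 0:
                    ready.append(m)
    if removed < len(component):
        return "?c"
    if any(count > 1 for count in total_in.values()):
        return "?s"
    return "?"


def clean(edges, variables, k):
    """edges: {node: {selector: node}}, variables: {var: node}."""
    # Breadth-first distance of every node from the nearest variable.
    dist = {n: 0 for n in variables.values()}
    queue = deque(dist)
    while queue:
        n = queue.popleft()
        for m in edges.get(n, {}).values():
            if m not in dist:
                dist[m] = dist[n] + 1
                queue.append(m)
    far = {n for n, d in dist.items() if d > k - 1}   # the subgraph U
    # Nodes missing from dist are garbage and are never copied.

    neighbours = {n: set() for n in far}              # undirected view of U
    for n in far:
        for m in edges.get(n, {}).values():
            if m in far:
                neighbours[n].add(m)
                neighbours[m].add(n)

    summary, label_of = {}, {}
    for start in sorted(far):
        if start in summary:
            continue
        name = f"unknown{len(label_of)}"
        component, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in component:
                component.add(n)
                stack.extend(neighbours[n])
        for n in component:
            summary[n] = name
        label_of[name] = classify(component, edges, dist)

    def rename(n):
        return summary.get(n, n)

    triples = {(rename(n), sel, rename(m))
               for n in dist for sel, m in edges.get(n, {}).items()
               if not (n in far and m in far)}
    return triples, label_of


# X -> a -> b -> c <-> d, plus an unreachable node g.
edges = {"a": {"tl": "b"}, "b": {"tl": "c"}, "c": {"tl": "d"},
         "d": {"tl": "c"}, "g": {"tl": "a"}}
triples, label_of = clean(edges, {"X": "a"}, k=2)
```

With k = 2, the nodes c and d lie more than one step from X, form a cycle, and collapse into one unknown node labeled ?c; g is dropped as garbage.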

Example: clean for a 2-limited graph
[Figure: a graph over X, Y, Z shown after each step (remove garbage, coalesce cycles, coalesce CC), ending with unknown nodes labeled ?s and ?c]

Graph Types [Klarlund&Schwartzbach93]
A recursive data type is useful for tree-shaped values
L → nonempty(head: Int, tail: L) | empty()
[Figure: the list 1, 2, 3]
A recursive data type can’t represent graph-shaped values!

Routing expression
Graph type = Recursive data type + Routing expression
–T → v(…, a: T[R], …)
–R = {↓a, ↑, ↑a, ^, $}*
Examples
–List with head: H → (first: L, last: L[↓first (↓tail)* $ ↑]), L → (head: Int, tail: L) | ()
–Cyclic list: C → (next: C) | (next: C[↑* ^])
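As a rough illustration, a routing expression can be evaluated by walking the tree backbone. The sketch below reads the directives as: move down along a field, move up to the parent (optionally only when the current node is a given child), test for the root, and test for a leaf. The `Cell` class, the tuple encoding, and the greedy treatment of the star are assumptions of this sketch, which also assumes every starred body moves to a different node; the paper evaluates routes with finite automata instead.

```python
class Cell:
    """A node of the tree backbone (hypothetical helper)."""

    def __init__(self, **children):
        self.parent = None
        self.via = None                    # field name by which the parent reaches this cell
        self.children = children
        for name, child in children.items():
            child.parent, child.via = self, name


def step(node, directive):
    """Apply one directive; return the new node, or None if it does not apply."""
    kind, arg = directive
    if kind == "down":
        return node.children.get(arg)
    if kind == "up":
        if node.parent is None or (arg is not None and node.via != arg):
            return None
        return node.parent
    if kind == "root":
        return node if node.parent is None else None
    if kind == "leaf":
        return node if not node.children else None
    raise ValueError(kind)


def route(node, expr):
    """Follow a routing expression given as a list of directives."""
    for item in expr:
        if item[0] == "star":
            while True:                    # greedy: repeat the body while it applies
                nxt = route(node, item[1])
                if nxt is None:
                    break
                node = nxt
        else:
            node = step(node, item)
            if node is None:
                return None
    return node


# Backbone of a three-element list with a header cell.
empty = Cell()
c3 = Cell(tail=empty)
c2 = Cell(tail=c3)
c1 = Cell(tail=c2)
header = Cell(first=c1)

# "last" route: down first, down tail repeatedly, check leaf, up once.
last = [("down", "first"), ("star", [("down", "tail")]), ("leaf", None), ("up", None)]
# Cyclic-list route: up repeatedly, then check root.
back = [("star", [("up", None)]), ("root", None)]
```

Here `route(header, last)` arrives at the final list cell `c3`, and `route(c3, back)` climbs back to `header`.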

Routing Expression (cont)
Well-formedness
–Every routing expression must always define a unique destination
–Checked using monadic second-order logic
Evaluation
–Routing expressions are evaluated in linear time by finite automata
–Runtime evaluation cost is a major weakness of graph types

Shape Type [Fradet&Metayer97]
Context-free graph grammar H = ⟨NT, T, PR, O⟩
–NT: ranked non-terminals
–T: ranked terminals
–PR: productions l → r s.t. l ∈ NT, r ∈ 2^Term
–Term: built from NT, T and a set of variables
–Terminal: built from T and a set of variables
–O: origin of derivation
Shape(H) = {M | M →* {O} and M ∈ Terminal}
X + (σr) →* X + (σl) iff l → r ∈ PR and (Var(σr) – Var(σl)) ∩ Var(X) = ∅
σ: variable substitution

Example
[Figure: cells a1, a2, a3 linked by next and pred edges, with pointer p]
Doubly → p x, pred x x, L x
L x → next x y, pred y x, L y | next x x

Shape-C
Transformer P = (C => A)
If Check(C,A,PR,O), then for all X and σ, X + (σC) →* {O} => X + (σA) →* {O}
shape int cir { pt x, L x x; L x y = L x z, L z y; L x y = next x y; };
…
cir s = [|=> pt x; next x x; $x=1; |];
…
s:[| pt x; next x y; => pt x; next x z; next z y; $z=i; |]
Checking the result of the transformer by reduction:
pt x; next x z; next z y
pt x; L x z; L z y
pt x; L x y

Memory Type [LeYaYi03]
Goal: Separating reusable heap cells from others
fun insert i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert i t in h::z
fun insert b i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert b i t in free l when b; h::z

Multiset Formula
A term for an abstract multiset of locations: L ∈ Locations → {0, 1, ∞}

Memory Type
For trees: [Figure: multiset formulas for root, left and right]
For functions of type tree → tree: [Figure: multiset formulas for input, result, usage and new]

Estimating Memory Usage

Memory-Usage Analysis
fun copyleft a =
  case a of
    Leaf => Leaf
  | Node(al,ar) => let r = copyleft al (1)
                   in Node(r,ar) (2)
[Figure: memory types X(1) and X(2) at program points (1) and (2), over A.root, A.left and A.right; Result and Usage of copyleft with input A and result R]

Constructing Dynamic Flags 1/2

Constructing Dynamic Flags 2/2
Find the condition to free “ ” under two preservation constraints:

Final copyleft
fun copyleft [β, β_ns] a =
  case a of
    Leaf => Leaf
  | Node(al,ar) => let r = copyleft [β ∧ β_ns, β_ns] al
                   in (free a when β; Node(r,ar))
β: whether it is safe to free the input tree excluding the result
β_ns: whether the input tree has no shared sub-parts

Shape Grammar [LeYaYi05]
–Language
–Abstract Domain & Semantics
–Correctness: provable in separation logic

Normalization
Steps: rmjunk, fold, unify, simplify
[Figure: example shape grammars over x before and after each normalization step]

Cyclic Structure
Non-terminals have parameters
Ex. Doubly linked list
[Figure: grammar for dll, beginning dll ::= nil | …, using a parameterized non-terminal]

Conclusion

May-alias Analysis for Pointers: Beyond k-limiting A. Deutsch PLDI’94

Symbolic Access Path
A string of the form e1.e2.….en, where each ei is either:
–a selector: the name of a variable or structure field, or the dereference operator
–an expression of the form B^k, where B is the basis of some recursive type
e.g.
–X→(tl→)^i hd
–T→{left→, right→}^j key
Note: x→f is shorthand for (*x).f

Symbolic Alias Pairs (⟨f1, f2⟩, K)
–Symbolic access paths f1 and f2 are aliases whenever the equations K over the numeric domain are satisfied
–e.g. a pair whose constraint set is {k1=k2}

Program Semantics
Every operation that a statement applies to an access path is also applied to all of its aliases

But This Approach
–Might be effective for finding aliases
–But is not appropriate for memory leak detection, since unreachable or leaking memory cells (i.e. cells that have no access paths) cannot be expressed and are hence ignored

Shape Analysis via 3-valued Logic M. Sagiv, T. Reps, R. Wilhelm POPL’99

Representing Stores
Using predicates
–x(u): Is u pointed to by x?
–n(u,u’): Is u’ pointed to by field n of u?
–sm(u): Does u summarize multiple cells?

3-valued Logic
Definite/indefinite values
–x(u) = 1 means x points to u
–x(u) = 0 means x does not point to u
–x(u) = ½ means x may or may not point to u
First-order formulae with transitive closure are used
Unbiased analysis is possible
[Figure: the information order on 0, 1 and ½]
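The connectives of this logic are Kleene's: with 0 < ½ < 1, conjunction is minimum, disjunction is maximum and negation is 1 − a. A minimal Python sketch follows; the function names are chosen here for illustration only.

```python
from fractions import Fraction

ZERO, HALF, ONE = Fraction(0), Fraction(1, 2), Fraction(1)


def and3(a, b):
    return min(a, b)


def or3(a, b):
    return max(a, b)


def not3(a):
    return 1 - a


def exists3(values):
    """Existential quantifier: disjunction over all cells."""
    return max(values, default=ZERO)


def forall3(values):
    """Universal quantifier: conjunction over all cells."""
    return min(values, default=ONE)


def info_leq(a, b):
    """Information order: a definite value is below 1/2, and every value is below itself."""
    return a == b or b == HALF
```

For instance, `and3(ZERO, HALF)` is still definitely 0, while `and3(ONE, HALF)` stays indefinite.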

3-valued Structures
A 3-valued structure S = ⟨U, i⟩ ∈ 3-STRUCT[P]
–P: a given set of predicates (the vocabulary)
–U: universe of cells (individuals)
–i: maps each predicate, applied to cells, to one of the three values, e.g. i(x)(u) = ½ is the value of x(u)
Ordered by existence of an embedding
–f embeds S = ⟨U, i⟩ in S’ = ⟨U’, i’⟩ if
i_S(p)(u_1, …, u_k) ≤ i_S’(p)(f(u_1), …, f(u_k)) and
(|{u | f(u) = u’}| > 1) ≤ i_S’(sm)(u’)
where ≤ is the information order (0 ≤ ½ and 1 ≤ ½)

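Abstraction
Renaming cells to their canonical names
–partitions cells by their properties
–bounds the size of a structure, so the set of structures is finite
[Figure: a list u_1, …, u_n, u_{n+1} with variables x and y, mapped by t_embed_c to an abstract structure whose cells are named by the sets of variables that do and do not point to them]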
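A small Python sketch of this abstraction, under the assumption that the canonical name of a cell is the set of unary predicates that hold for it and that the input is a concrete (2-valued) structure. The dictionary encoding and the function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def join(values):
    """Keep a value if all merged cells agree on it, otherwise 1/2."""
    values = set(values)
    return values.pop() if len(values) == 1 else HALF


def abstract(universe, unary, binary):
    """Merge cells that agree on all unary predicates."""
    name = {u: frozenset(p for p in unary if unary[p].get(u, 0) == 1)
            for u in universe}
    classes = {}
    for u in universe:
        classes.setdefault(name[u], []).append(u)

    abs_unary = {p: {c: join(unary[p].get(u, 0) for u in members)
                     for c, members in classes.items()}
                 for p in unary}
    # sm is indefinite exactly for names that stand for several cells.
    abs_unary["sm"] = {c: HALF if len(members) > 1 else 0
                       for c, members in classes.items()}
    abs_binary = {p: {(c1, c2): join(binary[p].get((u, v), 0)
                                     for u in m1 for v in m2)
                      for c1, m1 in classes.items()
                      for c2, m2 in classes.items()}
                  for p in binary}
    return abs_unary, abs_binary


# A four-cell list u1 -> u2 -> u3 -> u4 with x pointing to u1.
universe = ["u1", "u2", "u3", "u4"]
unary = {"x": {"u1": 1}}
binary = {"n": {("u1", "u2"): 1, ("u2", "u3"): 1, ("u3", "u4"): 1}}
abs_unary, abs_binary = abstract(universe, unary, binary)

X, REST = frozenset({"x"}), frozenset()
```

The three cells not pointed to by x collapse into one summary cell, and the n edges into and within that summary cell become ½.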

Abstract Semantics 1/2
The semantics of each statement is given by predicate update formulae, which define the new value of each predicate after the statement runs
e.g. st: x = t->n
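For the statement x = t->n, a hedged sketch of evaluating an update formula in 3-valued logic is given below. It assumes the usual formula x'(v) = ∃v1. t(v1) ∧ n(v1, v), with ∧ as minimum and ∃ as maximum; the function name and the encoding of predicates as dictionaries are illustrative choices.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def update_x(universe, t, n):
    """x'(v) = exists v1 . t(v1) and n(v1, v), evaluated in Kleene logic."""
    return {v: max((min(t.get(v1, 0), n.get((v1, v), 0)) for v1 in universe),
                   default=0)
            for v in universe}


# t points to u1; u is a summary cell reached by an indefinite n edge.
universe = ["u1", "u"]
t = {"u1": 1}
n = {("u1", "u"): HALF, ("u", "u"): HALF}
new_x = update_x(universe, t, n)
```

The result leaves x(u) at ½, which is the kind of imprecision the Focus and Coerce steps are meant to reduce.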

Abstract Semantics 2/2
A 3-STRUCT[P] → 3-STRUCT[P] transformer is associated with st
–In particular, x = malloc() changes the universe

Shape Analysis Algorithm
Computes the least fixpoint of a set of structures
–for each node v in the control flow graph G
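A generic worklist iteration gives the flavour of this fixpoint computation. In the sketch below the CFG encoding, the function name `analyse` and the toy transformer are assumptions; integers stand in for abstract structures, which in a real analysis would be hashable canonical 3-valued structures.

```python
def analyse(cfg, entry, initial, transformer):
    """cfg: {node: [(successor, statement)]}; returns {node: set of structures}."""
    result = {v: set() for v in cfg}
    result[entry] = set(initial)
    worklist = [entry]
    while worklist:
        v = worklist.pop()
        for succ, stmt in cfg[v]:
            new = set()
            for s in result[v]:
                new |= transformer(stmt, s)     # a transformer may return several structures
            if not new <= result[succ]:          # something new reached succ
                result[succ] |= new
                worklist.append(succ)
    return result


def toy_transformer(stmt, s):
    """Stand-in transformer over a finite domain {0, 1, 2, 3}."""
    if stmt == "inc":
        return {min(s + 1, 3)}
    return {s}                                   # "skip"


cfg = {
    "entry": [("loop", "skip")],
    "loop": [("loop", "inc"), ("exit", "skip")],
    "exit": [],
}
result = analyse(cfg, "entry", {0}, toy_transformer)
```

Termination relies on the set of possible structures being finite, which is what the bounded abstraction provides.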

Precision Improvements
Decompose the transformer for st into 3 steps
1. Focus
2. Transform (the original transformer)
3. Coerce

Focus
–For given formulae, Focus returns a set of structures, each of which makes the formulae evaluate to a definite value
–Abstract interpretation continues separately with each case
–Improves precision
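A deliberately simplified sketch of Focus for a single unary predicate: every indefinite value is split into the two definite cases. The real operation also has to split summary cells, which this illustration ignores; the function name and dictionary encoding are assumptions.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def focus(x):
    """Return all variants of predicate x in which every value is definite."""
    cells = list(x)
    choices = [[x[u]] if x[u] != HALF else [0, 1] for u in cells]
    return [dict(zip(cells, combo)) for combo in product(*choices)]


cases = focus({"u1": 0, "u": HALF})
```

Each returned dictionary is then analysed separately, as the slide describes.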

Coerce
Enforces compatibility constraints on structures
–Removes impossible structures
–Sharpens indefinite values to definite ones
–e.g. a single variable cannot point to two cells: if x(u) = 1 and x(u’) = ½, then x(u’) = 0
Improves precision
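The uniqueness constraint from the example can be sketched as follows. The function name is hypothetical, and only this one constraint is handled; a full Coerce applies a whole set of compatibility constraints.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def coerce_unique(x):
    """Apply 'a variable points to at most one cell' to predicate x.

    Returns the sharpened predicate, or None if the structure is impossible.
    """
    definite = [u for u, value in x.items() if value == 1]
    if len(definite) > 1:
        return None                      # x definitely points to two cells
    if definite:
        # x already has its target, so every other cell is definitely not it.
        return {u: (1 if u == definite[0] else 0) for u in x}
    return dict(x)                       # nothing to sharpen


sharpened = coerce_unique({"u": 1, "u2": HALF})
```

In the example, x(u2) is sharpened from ½ to 0 because x already points to u.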

Partially Disjunctive Heap Abstraction R. Manevich et al., SAS2004
–Merges structures with the same universes
–Hence reduces the size of the sets used in the fixpoint computation
–Improves analysis speed by a factor of 3 to 100
–Reduces memory usage by up to a factor of 200

What we can learn
–Sharing is not a simple problem
–We might need a relation rather than a function for points-to information
…