Pointer and Shape Analysis Seminar
Mooly Sagiv
Schriber 317
Office hours: Thursday 15-16
General Information
Prerequisites
–Compilers | Program Analysis
–Select 3 topics by Sunday
–Participate in 9 seminar talks
–Present a paper
Outline
1. Schedule
2. Points-to analysis
Tentative Schedule
7/2 | Shachar Itzhaky | Practical virtual method call resolution for Java
14/2 | Roy Ganor | Effective Static Race Detection for Java
17/ | Hongseok Yang | Scalable Shape Analysis
28/2 | Roza Pogalnikova | Context-Sensitive Points-to Analysis: Is It Worth It?
3/3 | Ory Samorodnitzky | The undecidability of aliasing
14/3 | Alex Shapiro | Error detection using client driven pointer analysis
21/3 | Roman Simkin | Free-Me: A Static Analysis for Automatic Individual Object Reclamation
28/3 | Uri Inon |
Points-To Analysis
Determine whether a variable may point to another variable on some (or all) execution paths
[1] p = &a;
[2] q = &b;
[3] if (getc())
[4]   q = &c;
[5]
Points-to pairs at [5]: p → a, q → c, q → b
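The "some (all) execution paths" distinction can be made concrete by enumerating the two paths of the example. The following is only a sketch: the function name `run` and the encoding of `p = &a` as a dictionary entry are illustrative assumptions, not part of the slides.

```python
def run(take_branch):
    """Execute the slide's program along one path; return the pairs at point [5]."""
    env = {}
    env["p"] = "a"            # [1] p = &a
    env["q"] = "b"            # [2] q = &b
    if take_branch:           # [3] if (getc())
        env["q"] = "c"        # [4] q = &c
    return set(env.items())   # [5]

paths = [run(False), run(True)]
may_point_to = set.union(*paths)           # holds on some path
must_point_to = set.intersection(*paths)   # holds on all paths

print(sorted(may_point_to))    # [('p', 'a'), ('q', 'b'), ('q', 'c')]
print(sorted(must_point_to))   # [('p', 'a')]
```

The "some paths" answer matches the three pairs listed at [5]; only p → a holds on every path.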
Iterative Program Analysis
Start by optimistically assuming that nothing is wrong
–No points-to pairs yet
At every iteration, apply the abstract meaning of the programming language statements and add more points-to pairs
Stop when no changes occur
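The three steps above amount to a fixpoint loop. A minimal sketch follows; `iterate_to_fixpoint`, `COPIES`, and `SEED` are hypothetical names, and the two copy statements are an invented toy input rather than an example from the slides.

```python
def iterate_to_fixpoint(step, start=frozenset()):
    """Apply `step` repeatedly, accumulating facts, until nothing new appears."""
    facts = set(start)                 # optimistic start: no points-to pairs
    while True:
        new = step(facts) - facts
        if not new:                    # no change: stop
            return facts
        facts |= new                   # add the newly derived pairs

# Hypothetical toy input: p = &a; q = p; r = q
SEED = {("p", "a")}
COPIES = [("q", "p"), ("r", "q")]

def step(pt):
    """One round: re-apply every statement to the current set of pairs."""
    return SEED | {(x, z) for (x, y) in COPIES for (v, z) in pt if v == y}

result = iterate_to_fixpoint(step)
print(sorted(result))   # [('p', 'a'), ('q', 'a'), ('r', 'a')]
```

Termination follows because the set only grows and the number of possible pairs is finite.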
Iterative Points-to Analysis
[Figure, animated over four slides: control-flow graph of the program below, each edge annotated with the points-to pairs found so far]
t = &a; y = &b; z = &c; then, on a cycle, the branch p = &y | p = &z and the statement *p = t
Annotations as they grow:
–after t = &a: t → a
–after y = &b: t → a, y → b
–after z = &c: t → a, y → b, z → c
–after p = &y: t → a, y → b, z → c, p → y
–after p = &z: t → a, y → b, z → c, p → z
–at the join: t → a, y → b, z → c, p → y, p → z
–after *p = t: t → a, y → b, z → c, p → y, p → z, y → a, z → a
–second pass, after p = &y: t → a, y → b, z → c, p → y, y → a, z → a
–second pass, after p = &z: t → a, y → b, z → c, p → z, y → a, z → a
–join at the fixpoint: t → a, y → b, z → c, p → y, p → z, y → a, z → a
A Simple Programming Language
Arbitrary (uninterpreted) control-flow statements
Atomic statements
–x = y
–x = &y
–x = *y
–*x = y
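Abstract Semantics
For every atomic statement S: $\llbracket S \rrbracket^\# : P(Var^* \times Var^*) \to P(Var^* \times Var^*)$
$\llbracket x := \&y \rrbracket^\#(pt) = (pt - \{(x, *)\}) \cup \{(x, y)\}$
$\llbracket x := y \rrbracket^\#(pt) = (pt - \{(x, *)\}) \cup \{(x, z) \mid (y, z) \in pt\}$
$\llbracket x := *y \rrbracket^\#(pt) = (pt - \{(x, *)\}) \cup \{(x, z) \mid (y, w), (w, z) \in pt\}$
$\llbracket *x := y \rrbracket^\#(pt) = pt \cup \{(w, t) \mid (x, w), (y, t) \in pt\}$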
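The four transfer functions translate almost directly into set comprehensions. This is a sketch under the assumption that a points-to set is a Python set of (source, target) string pairs; the function names are illustrative.

```python
def kill(pt, x):
    """pt - {(x, *)}: drop every pair whose source is x."""
    return {(v, w) for (v, w) in pt if v != x}

def address_of(pt, x, y):   # x := &y
    return kill(pt, x) | {(x, y)}

def copy(pt, x, y):         # x := y
    return kill(pt, x) | {(x, z) for (v, z) in pt if v == y}

def load(pt, x, y):         # x := *y
    return kill(pt, x) | {(x, z) for (v, w) in pt if v == y
                                 for (u, z) in pt if u == w}

def store(pt, x, y):        # *x := y  (nothing is killed: a weak update)
    return pt | {(w, t) for (v, w) in pt if v == x
                        for (u, t) in pt if u == y}

pt = set()
pt = address_of(pt, "t", "a")
pt = address_of(pt, "y", "b")
pt = address_of(pt, "p", "y")
pt = store(pt, "p", "t")    # adds (y, a); (y, b) survives
pt = load(pt, "q", "p")     # q := *p picks up both targets of y
print(sorted(pt))
```

Only the three assignments to a variable remove pairs; the store through a pointer only adds them, which is the source of the "limited destructive updates" noted in the summary.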
Program: t = &a; y = &b; z = &c; p = &y | p = &z; *p = t
Each row: node processed, then the sets it updates
start: pt[1] = {}
1: pt[2] = {(t, a)}
2: pt[3] = {(t, a), (y, b)}
3: pt[4] = {(t, a), (y, b), (z, c)}
4: pt[5] = {(t, a), (y, b), (z, c)}, pt[6] = {(t, a), (y, b), (z, c)}
5: pt[7] = {(t, a), (y, b), (z, c), (p, y)}
6: pt[7] = {(t, a), (y, b), (z, c), (p, y), (p, z)}
7: pt[4] = {(t, a), (y, b), (z, c), (p, y), (p, z)}
4: pt[5] = {(t, a), (y, b), (z, c), (p, y), (p, z), (y, a), (z, a)}, pt[6] = {(t, a), (y, b), (z, c), (p, y), (p, z), (y, a), (z, a)}
5, 6: (no further updates shown)
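A worklist loop of roughly the following shape would reproduce the trace. The node numbering and edges are an assumption read off the trace (node 4 is `*p = t`, nodes 5 and 6 are the two branches, node 7 is a join that loops back to 4); the original figure is not available to confirm it.

```python
from collections import deque

# Assumed CFG: pt[n] is the set holding just before node n.
STMTS = {1: ("addr", "t", "a"), 2: ("addr", "y", "b"), 3: ("addr", "z", "c"),
         4: ("store", "p", "t"), 5: ("addr", "p", "y"), 6: ("addr", "p", "z"),
         7: ("skip",)}
SUCC = {1: [2], 2: [3], 3: [4], 4: [5, 6], 5: [7], 6: [7], 7: [4]}

def transfer(stmt, pt):
    """Abstract meaning of one statement (only the kinds used here)."""
    if stmt[0] == "addr":                       # x := &y
        _, x, y = stmt
        return {(v, w) for (v, w) in pt if v != x} | {(x, y)}
    if stmt[0] == "store":                      # *x := y
        _, x, y = stmt
        return pt | {(w, t) for (v, w) in pt if v == x
                            for (u, t) in pt if u == y}
    return set(pt)                              # skip

def analyze():
    pt = {n: set() for n in STMTS}
    work = deque([1])
    while work:
        n = work.popleft()
        out = transfer(STMTS[n], pt[n])
        for s in SUCC[n]:
            if not out <= pt[s]:                # successor learns something new
                pt[s] |= out                    # join is set union
                if s not in work:
                    work.append(s)
    return pt

pt = analyze()
print(sorted(pt[5]))
```

Under these assumptions the final `pt[5]` contains the seven pairs shown in the last update of the trace.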
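Supporting Memory Allocation
Uniform treatment of the memory allocated at an allocation statement
For every atomic statement S
–$\llbracket S \rrbracket^\# : P(Var^* \times Var^*) \to P(Var^* \times Var^*)$
–$\llbracket x := \&y \rrbracket^\#(pt) = (pt - \{(x, *)\}) \cup \{(x, y)\}$
–$\llbracket x := y \rrbracket^\#(pt) = (pt - \{(x, *)\}) \cup \{(x, z) \mid (y, z) \in pt\}$
–$\llbracket x := *y \rrbracket^\#(pt) = (pt - \{(x, *)\}) \cup \{(x, z) \mid (y, w), (w, z) \in pt\}$
–$\llbracket *x := y \rrbracket^\#(pt) = pt \cup \{(w, t) \mid (x, w), (y, t) \in pt\}$
–$\llbracket l: x := malloc() \rrbracket^\#(pt) = (pt - \{(x, *)\}) \cup \{(x, l)\}$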
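One consequence of naming heap memory by its allocation statement is that all objects created at the same label collapse into one abstract location. The sketch below illustrates this with a hypothetical two-iteration loop (`prev = cur; l1: cur = malloc()`); the loop and the variable names are assumptions made for illustration.

```python
def kill(pt, x):
    """pt - {(x, *)}: drop every pair whose source is x."""
    return {(v, w) for (v, w) in pt if v != x}

def copy(pt, x, y):            # x := y
    return kill(pt, x) | {(x, z) for (v, z) in pt if v == y}

def malloc(pt, x, site):       # site: x := malloc()
    return kill(pt, x) | {(x, site)}

pt = set()
for _ in range(2):                   # two iterations of the hypothetical loop
    pt = copy(pt, "prev", "cur")     # prev = cur
    pt = malloc(pt, "cur", "l1")     # l1: cur = malloc()

print(sorted(pt))   # [('cur', 'l1'), ('prev', 'l1')]
```

At run time `prev` and `cur` hold two distinct objects, yet both map to `l1`, so the analysis reports them as possible aliases.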
Summary Flow-Sensitive Solution
Limited destructive updates
–Can be improved with must information
$O(N \cdot Var^2)$ space
Context-Sensitivity
How to handle procedures?
–Separate points-to sets for every call (context-sensitive)
–A uniform set for all calls (context-insensitive)
Context Sensitivity Example
x = &t1;
a = &t2;
foo(x, a);
z = &t3;
b = &t4;
foo(z, b);

void foo(source, target) {
  *source = target;
}
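The difference between the two treatments of `foo` shows up directly on this example. The following sketch hard-codes the effect of the body `*source = target`; `call_foo` and `targets` are hypothetical helper names, and the flow of the caller is ignored for simplicity.

```python
def targets(pt, v):
    """Everything v may point to."""
    return {w for (u, w) in pt if u == v}

def call_foo(pt, source_targets, target_targets):
    """Effect of `*source = target`, given what the two formals point to."""
    return pt | {(s, t) for s in source_targets for t in target_targets}

facts = {("x", "t1"), ("a", "t2"), ("z", "t3"), ("b", "t4")}
calls = [("x", "a"), ("z", "b")]          # foo(x, a); foo(z, b)

# Separate points-to sets for every call: each call sees only its own actuals.
sensitive = set(facts)
for src, tgt in calls:
    sensitive = call_foo(sensitive, targets(facts, src), targets(facts, tgt))

# A uniform set for all calls: the formals merge the actuals of both calls.
all_src = set().union(*(targets(facts, s) for s, _ in calls))
all_tgt = set().union(*(targets(facts, t) for _, t in calls))
insensitive = call_foo(set(facts), all_src, all_tgt)

print(sorted(sensitive - facts))     # [('t1', 't2'), ('t3', 't4')]
print(sorted(insensitive - facts))   # also contains ('t1', 't4') and ('t3', 't2')
```

The merged version adds the spurious pairs t1 → t4 and t3 → t2, which no execution can produce.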
Flow-Insensitive Analysis
Ignore control flow statements
Arbitrary statement order
Only accumulate points-to pairs
Usually represented as a directed graph
$O(n^2)$ space
Flow Insensitive Solution
t = &a; y = &b; z = &c; p = &y; p = &z; *p = t
[Figure: points-to graph for this program; the edges did not survive extraction]
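A flow-insensitive solution for this program can be sketched by treating the statements as an unordered bag and applying them, without any kill, until nothing changes. The tuple encoding of statements and the function names are assumptions made for this illustration.

```python
STATEMENTS = [("addr", "t", "a"), ("addr", "y", "b"), ("addr", "z", "c"),
              ("addr", "p", "y"), ("addr", "p", "z"), ("store", "p", "t")]

def generated(stmt, pt):
    """Pairs a statement contributes; nothing is ever removed."""
    kind, x, y = stmt
    if kind == "addr":     # x = &y
        return {(x, y)}
    if kind == "copy":     # x = y
        return {(x, z) for (v, z) in pt if v == y}
    if kind == "load":     # x = *y
        return {(x, z) for (v, w) in pt if v == y for (u, z) in pt if u == w}
    if kind == "store":    # *x = y
        return {(w, t) for (v, w) in pt if v == x for (u, t) in pt if u == y}
    raise ValueError(kind)

def flow_insensitive(stmts):
    pt = set()
    changed = True
    while changed:
        changed = False
        for s in stmts:
            new = generated(s, pt) - pt
            if new:
                pt |= new
                changed = True
    return pt

solution = flow_insensitive(STATEMENTS)
print(sorted(solution))
```

Because pairs are only accumulated, the result does not depend on the order of the list; reversing `STATEMENTS` yields the same set.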
Set Constraints
A set of rules of the form:
–$lhs \subseteq rhs$
–$t \in rhs' \Rightarrow lhs \subseteq rhs$ (conditional constraint)
lhs, rhs, rhs' are variables over sets of terms
t is a term
The least solution can be found iteratively
–start with empty sets
–add terms when needed
Cubic graph-based solution
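t := &a;   $\{a\} \subseteq pt[t]$
y := &b;   $\{b\} \subseteq pt[y]$
z := &c;   $\{c\} \subseteq pt[z]$
if (nondet())
  p := &y;   $\{y\} \subseteq pt[p]$
else
  p := &z;   $\{z\} \subseteq pt[p]$
*p := t;
  $a \in pt[p] \Rightarrow pt[t] \subseteq pt[a]$
  $b \in pt[p] \Rightarrow pt[t] \subseteq pt[b]$
  $c \in pt[p] \Rightarrow pt[t] \subseteq pt[c]$
  $y \in pt[p] \Rightarrow pt[t] \subseteq pt[y]$
  $z \in pt[p] \Rightarrow pt[t] \subseteq pt[z]$
  $t \in pt[p] \Rightarrow pt[t] \subseteq pt[t]$
  $p \in pt[p] \Rightarrow pt[t] \subseteq pt[p]$
[Figure: constraint graph over the nodes t, y, z, a, b, c, p]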
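The constraints above can be solved by naive iteration from empty sets. The sketch below is not the cubic graph-based algorithm; it simply re-checks every constraint until nothing changes. The function `solve` and its tuple encodings are illustrative assumptions.

```python
from collections import defaultdict

def solve(base, subset, conditional):
    """Least solution of a set-constraint system, by naive iteration.

    base:        (term, var)           meaning {term} ⊆ pt[var]
    subset:      (lhs, rhs)            meaning pt[lhs] ⊆ pt[rhs]
    conditional: (term, cond, lhs, rhs) meaning term ∈ pt[cond] ⇒ pt[lhs] ⊆ pt[rhs]
    """
    pt = defaultdict(set)              # start with empty sets
    for term, var in base:
        pt[var].add(term)
    changed = True
    while changed:
        changed = False
        # Conditional constraints become active once their premise holds.
        active = list(subset) + [(lhs, rhs) for (term, cond, lhs, rhs) in conditional
                                 if term in pt[cond]]
        for lhs, rhs in active:
            if not pt[lhs] <= pt[rhs]:
                pt[rhs] |= pt[lhs]     # add terms when needed
                changed = True
    return pt

VARS = ["a", "b", "c", "y", "z", "t", "p"]
base = [("a", "t"), ("b", "y"), ("c", "z"), ("y", "p"), ("z", "p")]
conditional = [(v, "p", "t", v) for v in VARS]   # from *p := t

pt = solve(base, [], conditional)
for v in ["t", "y", "z", "p"]:
    print(v, sorted(pt[v]))
```

Only the premises for y and z ever hold, so a flows into pt[y] and pt[z] and nowhere else.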
Unification Based Solution
Steensgaard 1996
Treat assignments as equalities
Employ union-find algorithm
Almost linear time complexity
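The idea of treating assignments as equalities can be sketched with a small union-find structure in which every equivalence class has at most one pointee class. This is a simplified illustration, not Steensgaard's published algorithm: it always unifies (no conditional joins) and omits union by rank, and the class and method names are assumptions.

```python
import itertools

class Unifier:
    def __init__(self):
        self.parent = {}                 # union-find forest
        self.pointee = {}                # class representative -> node it points to
        self._fresh = itertools.count()

    def find(self, n):
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]   # path halving
            n = self.parent[n]
        return n

    def target(self, n):
        """Class that n's class points to, created on demand."""
        r = self.find(n)
        if r not in self.pointee:
            self.pointee[r] = f"$tmp{next(self._fresh)}"
        return self.find(self.pointee[r])

    def join(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        pa, pb = self.pointee.get(a), self.pointee.pop(b, None)
        if pa is None and pb is not None:
            self.pointee[a] = pb
        elif pa is not None and pb is not None:
            self.join(pa, pb)            # merged classes must share one pointee

    def address_of(self, x, y):          # x = &y
        self.join(self.target(x), y)

    def copy(self, x, y):                # x = y
        self.join(self.target(x), self.target(y))

    def load(self, x, y):                # x = *y
        self.join(self.target(x), self.target(self.target(y)))

    def store(self, x, y):               # *x = y
        self.join(self.target(self.target(x)), self.target(y))

    def points_to(self, x, names):
        r = self.find(x)
        if r not in self.pointee:
            return set()
        tgt = self.find(self.pointee[r])
        return {n for n in names if self.find(n) == tgt}

u = Unifier()
u.address_of("t", "a"); u.address_of("y", "b"); u.address_of("z", "c")
u.address_of("p", "y"); u.address_of("p", "z"); u.store("p", "t")
names = ["a", "b", "c", "t", "y", "z", "p"]
print(sorted(u.points_to("y", names)))   # ['a', 'b', 'c']
```

On the running example, y and z end up in one class and a, b, c in another, so y is reported as possibly pointing to c as well: the price of unification is lost precision compared with the subset-based solution.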
Conclusions
Points-to analysis is a simple pointer analysis problem
Effective solutions (8 MLOC)
–But rather imprecise
Set constraints are useful beyond pointer analysis
–Class level analysis