Abstract Interpretation Part II Mooly Sagiv
Outline
  Tarski’s fixed point theorem
  The Soundness Theorem
  Infinite Domains (Widening & Narrowing)
  Canonical Abstraction
  Shape analysis is a separate three-hour lecture with demos
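Fixed Points
  A monotone function f: L → L, i.e. l1 ⊑ l2 ⇒ f(l1) ⊑ f(l2)
  (L, ⊑, ⊔, ⊓, ⊥, ⊤) is a complete lattice
  Fix(f) = {l : l ∈ L, f(l) = l}
  Red(f) = {l : l ∈ L, f(l) ⊑ l}
  Ext(f) = {l : l ∈ L, l ⊑ f(l)}
  Tarski’s Theorem 1955:
    lfp(f) = ⊓Fix(f) = ⊓Red(f) ∈ Fix(f)
    gfp(f) = ⊔Fix(f) = ⊔Ext(f) ∈ Fix(f)
  [Figure: the lattice L drawn from ⊥ to ⊤, showing the ascending chain ⊥, f(⊥), f²(⊥), … below lfp(f), the region Fix(f) between lfp(f) and gfp(f), the descending chain ⊤, f(⊤), f²(⊤), … above gfp(f), and the regions Red(f) and Ext(f).]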
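As a minimal sketch of how a least fixed point is reached in practice, the Python below iterates ⊥, f(⊥), f²(⊥), … on a finite powerset lattice until the value stops changing. The graph `EDGES`, the function `f`, and the helper `lfp` are hypothetical names chosen only for this illustration.

```python
from typing import Callable, FrozenSet

# Hypothetical example graph: node -> list of successors.
EDGES = {1: [2], 2: [3], 3: [1], 4: [5], 5: []}

def f(s: FrozenSet[int]) -> FrozenSet[int]:
    """Monotone function on subsets of nodes: add node 1 and the successors of s."""
    return frozenset({1}) | s | frozenset(m for n in s for m in EDGES[n])

def lfp(func: Callable[[FrozenSet[int]], FrozenSet[int]],
        bottom: FrozenSet[int] = frozenset()) -> FrozenSet[int]:
    """Iterate bottom, f(bottom), f(f(bottom)), ... until nothing changes."""
    x = bottom
    while True:
        nxt = func(x)
        if nxt == x:
            return x
        x = nxt

result = lfp(f)
print(sorted(result))  # nodes reachable from node 1: [1, 2, 3]
```

The full node set {1, …, 5} is also a fixed point of this `f`, but it is not the least one; the iteration from ⊥ stops at the smallest.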
Abstract (Conservative) interpretation
  [Figure, step 1: the operational semantics of a statement s maps a set of states to a set of states; abstraction maps each set of states to an abstract representation; the abstract semantics of s maps an abstract representation to an abstract representation.]
  [Figure, step 2: the same diagram read through concretization: an abstract representation is concretized to a set of states, the operational semantics of s is applied, and the resulting set of states is compared with the concretization of the abstract representation produced by the abstract semantics of s.]
  [Figure, step 3: an abstract representation is concretized to a set of states, the operational semantics of s is applied, the result is abstracted, and this abstract representation is compared with the one produced by the abstract semantics of s.]
Soundness Theorem [CC]
  Let (α, γ) form a Galois connection from C to A:
    α(c) ⊑ a iff c ⊑ γ(a)
    α and γ are monotone
    α(γ(a)) ⊑ a
    c ⊑ γ(α(c))
  Let f: C → C be a monotone function
  Let f#: A → A be a monotone function
  Local soundness (three equivalent formulations):
    ∀a ∈ A: f(γ(a)) ⊑ γ(f#(a))
    ∀c ∈ C: α(f(c)) ⊑ f#(α(c))
    ∀a ∈ A: α(f(γ(a))) ⊑ f#(a)
  Then:
    lfp(f) ⊑ γ(lfp(f#))
    α(lfp(f)) ⊑ lfp(f#)
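The conditions of the theorem can be checked exhaustively on a tiny example. The sketch below is an illustration under assumptions, not part of the lecture: the concrete domain is the subsets of a hypothetical universe {0, 1, 2, 3}, the abstract domain is intervals over it (with `None` as ⊥), `f` increments every value, and `f_sharp` is its interval counterpart.

```python
from itertools import combinations

U = range(0, 4)  # hypothetical small universe {0, 1, 2, 3}
HI = max(U)

def alpha(c):
    """Abstraction: smallest interval containing the set c (None is bottom)."""
    return (min(c), max(c)) if c else None

def gamma(a):
    """Concretization: all universe elements inside the interval a."""
    return frozenset() if a is None else frozenset(x for x in U if a[0] <= x <= a[1])

def leq(a1, a2):
    """Interval order: a1 is below a2 when a1 is contained in a2."""
    if a1 is None:
        return True
    if a2 is None:
        return False
    return a2[0] <= a1[0] and a1[1] <= a2[1]

def f(c):
    """Concrete transformer: increment every value, dropping values that leave U."""
    return frozenset(x + 1 for x in c if x + 1 <= HI)

def f_sharp(a):
    """Abstract transformer for the same increment on intervals."""
    if a is None or a[0] + 1 > HI:
        return None
    return (a[0] + 1, min(a[1] + 1, HI))

CONCRETE = [frozenset(s) for k in range(len(U) + 1) for s in combinations(U, k)]
ABSTRACT = [None] + [(a, b) for a in U for b in U if a <= b]

# alpha(c) below a  iff  c subset of gamma(a)
galois = all(leq(alpha(c), a) == (c <= gamma(a)) for c in CONCRETE for a in ABSTRACT)
# f(gamma(a)) subset of gamma(f#(a))
sound_gamma = all(f(gamma(a)) <= gamma(f_sharp(a)) for a in ABSTRACT)
# alpha(f(c)) below f#(alpha(c))
sound_alpha = all(leq(alpha(f(c)), f_sharp(alpha(c))) for c in CONCRETE)
print(galois, sound_gamma, sound_alpha)
```

An exhaustive check like this is only feasible for toy domains; it is meant to make the quantifiers in the theorem concrete.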
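[Figure: the concrete lattice and the abstract lattice side by side. The concrete side shows the chain ⊥, f(⊥), f²(⊥), …, the points lfp(f) and gfp(f), the fixed points f(x) = x between them, the chain ⊤, f(⊤), f²(⊤), …, and the regions f(x) ⊑ x and x ⊑ f(x). The abstract side shows the same picture for f#, with lfp(f#) and gfp(f#).]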
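Finite Height Case
  [Figure: the chains obtained by repeatedly applying f and f# drawn side by side, reaching lfp(f) and lfp(f#) after finitely many steps.]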
Example: Interval Analysis
  Find a lower and an upper bound of the value of a variable
  Usages?
  Lattice L = ((Z ∪ {-∞, ∞}) × (Z ∪ {-∞, ∞}), ⊑, ⊔, ⊓, ⊥, ⊤)
  [a, b] ⊑ [c, d] if c ≤ a and d ≥ b
  [a, b] ⊔ [c, d] = [min(a, c), max(b, d)]
  [a, b] ⊓ [c, d] = [max(a, c), min(b, d)]
  ⊥ = the empty interval
  ⊤ = [-∞, ∞]
The need for disjunctions
  if (…) … [1, 5]
  else … [7, 8]
  assert x != 6
  The join [1, 5] ⊔ [7, 8] = [1, 8] contains 6, so a single interval cannot prove the assertion.
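The interval operations, and the precision lost when the two branches are joined, can be sketched as follows. This is a minimal illustration: representing an interval as a Python tuple and ⊥ as `None` is an assumption of the sketch, as are the function names.

```python
import math

INF = math.inf
BOT = None          # the empty interval
TOP = (-INF, INF)

def leq(x, y):
    """[a, b] is below [c, d] when c <= a and d >= b."""
    if x is BOT:
        return True
    if y is BOT:
        return False
    return y[0] <= x[0] and y[1] >= x[1]

def join(x, y):
    """Least upper bound: [min(a, c), max(b, d)]."""
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def meet(x, y):
    """Greatest lower bound: [max(a, c), min(b, d)], or bottom if that is empty."""
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

after_if = join((1, 5), (7, 8))
print(after_if)                 # (1, 8)
print(leq((6, 6), after_if))    # True: x == 6 is not excluded
```

Keeping the two branch results as a set of intervals {[1, 5], [7, 8]} instead of joining them would exclude 6, at the price of a larger abstract value.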
Widening for Interval Analysis
  ⊥ ∇ [c, d] = [c, d]
  [a, b] ∇ [c, d] = [if a ≤ c then a else -∞, if b ≥ d then b else ∞]
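A direct transcription of this operator into Python might look as follows. It is a sketch: intervals are tuples, ⊥ is `None`, and the case where the second argument is ⊥ is not defined on the slide, so returning the first argument there is an assumption.

```python
import math

INF = math.inf
BOT = None  # stands for the bottom element (empty interval)

def widen(x, y):
    """Interval widening: keep stable bounds, push unstable bounds to infinity."""
    if x is BOT:
        return y          # bottom widened with [c, d] is [c, d]
    if y is BOT:
        return x          # assumption: the slide does not define this case
    (a, b), (c, d) = x, y
    return (a if a <= c else -INF, b if b >= d else INF)

print(widen(BOT, (1, 1)))     # (1, 1)
print(widen((1, 1), (1, 2)))  # (1, inf)
print(widen((1, 5), (0, 5)))  # (-inf, 5)
```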
Example Program Interval Analysis
  [x := 1]1 ; while [x ≤ 1000]2 do [x := x + 1;]3
  IntEntry(1) = [-∞, ∞]
  IntExit(1) = [1, 1]
  IntEntry(2) = IntExit(2) ∇ (IntExit(1) ⊔ IntExit(3))
  IntExit(2) = IntEntry(2)
  IntEntry(3) = IntExit(2) ⊓ [-∞, 1000]
  IntExit(3) = IntEntry(3) + [1, 1]
  IntEntry(4) = IntExit(2) ⊓ [1001, ∞]
  IntExit(4) = IntEntry(4)
  [Figure: control-flow graph with nodes [x := 1]1, [x ≤ 1000]2, [x := x+1]3, and [exit]4.]
Requirements on Widening
  For all elements l1, l2: l1 ⊔ l2 ⊑ l1 ∇ l2
  For all ascending chains l0 ⊑ l1 ⊑ l2 ⊑ … the following sequence is finite:
    y0 = l0
    yi+1 = yi ∇ li+1
  For a monotonic function f: L → L define
    x0 = ⊥
    xi+1 = xi ∇ f(xi)
  Theorem: There exists k such that xk+1 = xk, and xk ∈ Red(f) = {l : l ∈ L, f(l) ⊑ l}
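The iteration x0 = ⊥, xi+1 = xi ∇ f(xi) can be tried on the loop x := 1; while x ≤ 1000 do x := x + 1. The sketch below is illustrative only: it collapses the equation system into a single function `f` on the interval at the loop head, and all helper names are assumptions of this sketch.

```python
import math

INF = math.inf
BOT = None  # empty interval

def leq(x, y):
    if x is BOT:
        return True
    if y is BOT:
        return False
    return y[0] <= x[0] and x[1] <= y[1]

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def add(x, y):
    return BOT if x is BOT or y is BOT else (x[0] + y[0], x[1] + y[1])

def widen(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (x[0] if x[0] <= y[0] else -INF, x[1] if x[1] >= y[1] else INF)

def f(entry2):
    """IntExit(1) joined with IntExit(3), as a function of the loop-head interval."""
    entry3 = meet(entry2, (-INF, 1000))   # guard x <= 1000
    return join((1, 1), add(entry3, (1, 1)))

def iterate_with_widening(func, bottom=BOT):
    """x0 = bottom, x_{i+1} = x_i widen f(x_i), until the sequence stabilizes."""
    x = bottom
    while True:
        nxt = widen(x, func(x))
        if nxt == x:
            return x
        x = nxt

entry2 = iterate_with_widening(f)
entry4 = meet(entry2, (1001, INF))        # negated guard at the exit
print(entry2, entry4)                     # (1, inf) (1001, inf)
```

The result stabilizes after three steps and satisfies f(x) ⊑ x, so it lies in Red(f), but the upper bound ∞ is coarser than the exact loop invariant.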
Narrowing
  Improve the result of widening
  y ⊑ x ⇒ y ⊑ (x Δ y) ⊑ x
  For all decreasing chains x0 ⊒ x1 ⊒ … the following sequence is finite:
    y0 = x0
    yi+1 = yi Δ xi+1
  For a monotonic function f: L → L and x ∈ Red(f) = {l : l ∈ L, f(l) ⊑ l} define
    y0 = x
    yi+1 = yi Δ f(yi)
  Theorem: There exists k such that yk+1 = yk, and yk ∈ Red(f) = {l : l ∈ L, f(l) ⊑ l}
Narrowing for Interval Analysis
  [a, b] Δ ⊥ = [a, b]
  [a, b] Δ [c, d] = [if a = -∞ then c else a, if b = ∞ then d else b]
Example Program Interval Analysis
  [x := 1]1 ; while [x ≤ 1000]2 do [x := x + 1;]3
  IntEntry(1) = [-∞, ∞]
  IntExit(1) = [1, 1]
  IntEntry(2) = IntExit(2) Δ (IntExit(1) ⊔ IntExit(3))
  IntExit(2) = IntEntry(2)
  IntEntry(3) = IntExit(2) ⊓ [-∞, 1000]
  IntExit(3) = IntEntry(3) + [1, 1]
  IntEntry(4) = IntExit(2) ⊓ [1001, ∞]
  IntExit(4) = IntEntry(4)
  [Figure: control-flow graph with nodes [x := 1]1, [x ≤ 1000]2, [x := x+1]3, and [exit]4.]
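To see what narrowing recovers on this loop, the sketch below first runs the widening iteration and then the narrowing iteration y0 = x, yi+1 = yi Δ f(yi). It is an illustration under assumptions: the equations are collapsed into one function `f` on the loop-head interval, ⊥ is `None`, and returning ⊥ when the first argument of `narrow` is ⊥ is a choice of this sketch.

```python
import math

INF = math.inf
BOT = None  # empty interval

def join(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (min(x[0], y[0]), max(x[1], y[1]))

def meet(x, y):
    if x is BOT or y is BOT:
        return BOT
    lo, hi = max(x[0], y[0]), min(x[1], y[1])
    return (lo, hi) if lo <= hi else BOT

def add(x, y):
    return BOT if x is BOT or y is BOT else (x[0] + y[0], x[1] + y[1])

def widen(x, y):
    if x is BOT:
        return y
    if y is BOT:
        return x
    return (x[0] if x[0] <= y[0] else -INF, x[1] if x[1] >= y[1] else INF)

def narrow(x, y):
    """Replace only infinite bounds of x by the corresponding bounds of y."""
    if x is BOT:
        return BOT        # assumption: not defined on the slide
    if y is BOT:
        return x          # [a, b] narrowed with bottom is [a, b]
    return (y[0] if x[0] == -INF else x[0], y[1] if x[1] == INF else x[1])

def f(entry2):
    """IntExit(1) joined with IntExit(3), as a function of the loop-head interval."""
    return join((1, 1), add(meet(entry2, (-INF, 1000)), (1, 1)))

def stabilize(step, start):
    """Apply x -> step(x, f(x)) until nothing changes."""
    x = start
    while True:
        nxt = step(x, f(x))
        if nxt == x:
            return x
        x = nxt

widened = stabilize(widen, BOT)         # (1, inf)
narrowed = stabilize(narrow, widened)   # (1, 1001)
exit_interval = meet(narrowed, (1001, INF))
print(widened, narrowed, exit_interval)
```

After narrowing, the interval at the exit is [1001, 1001], which is the exact value of x when the loop terminates.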
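Non-Monotonicity of Widening
  [0, 1] ∇ [0, 2] = [0, ∞]
  [0, 2] ∇ [0, 2] = [0, 2]
  Although [0, 1] ⊑ [0, 2], the first result is not below the second, so widening is not monotone in its first argument.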
Widening and Narrowing Summary
  Very simple but produces impressive precision
  Sometimes non-monotonic
  The McCarthy 91 function
  Also useful in the finite case
  Can be used as a methodological tool

  int f(x) [-∞, ∞]
    if x > 100 then [101, ∞]
      return x - 10 [91, ∞-10];
    else [-∞, 100]
      return f(f(x+11)) [91, 91];
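The interval annotations on the McCarthy 91 function can be sanity-checked by running the function concretely. This is only a test over a sample of inputs (the range below is an arbitrary choice), not an abstract interpretation of the function.

```python
def f(x: int) -> int:
    """McCarthy 91 function as written on the slide."""
    if x > 100:
        return x - 10
    return f(f(x + 11))

# On the else branch (x <= 100) every sampled input returns 91.
else_results = {f(x) for x in range(-50, 101)}
# On the then branch (x > 100) the result is x - 10, so it is at least 91.
then_results = [f(x) for x in range(101, 200)]

print(else_results)        # {91}
print(min(then_results))   # 91
```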
Numerical Abstractions
  Interval: ±x ≤ c, ±y ≤ c
  Octagon: ±x ± y ≤ c (only maintains correlations between two variables)
  Polyhedron: c1x + c2y ≤ c
  [Figure: the three abstractions drawn in the x-y plane.]
Non-Numerical Abstractions
Predicate Abstraction
  L = (P(P(B)), ⊑, ⊔, ⊓, ⊥, ⊤)
  X ⊑ Y if X ⊆ Y
  X ⊔ Y = X ∪ Y
  X ⊓ Y = X ∩ Y
  ⊤ = P(B)
  ⊥ = ∅
Example Programs
  if (x > 0) y = malloc(); … if (x > 0) z = *y;
  while x != y do x = x->n;
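For the first example program, a predicate abstraction over a small predicate set keeps exactly the correlation needed to show that the dereference is safe. The sketch below is hypothetical throughout: the predicate set B = {x>0, y!=NULL}, the sample concrete states, and the function names are assumptions, and `malloc` is assumed to succeed.

```python
# Hypothetical predicate set B for `if (x > 0) y = malloc(); ... if (x > 0) z = *y;`
PREDICATES = {
    "x>0": lambda s: s["x"] > 0,
    "y!=NULL": lambda s: s["y"] is not None,
}

def alpha(states):
    """Abstract a set of states to the set of predicate valuations it exhibits.

    Each valuation is the frozenset of predicates that hold, so an abstract
    value is an element of P(P(B))."""
    return frozenset(
        frozenset(name for name, pred in PREDICATES.items() if pred(s))
        for s in states
    )

def join(X, Y):
    """Join in P(P(B)) is set union."""
    return X | Y

# Sample states after the first conditional, starting with y == NULL.
then_states = [{"x": x, "y": "cell"} for x in (1, 2, 3)]   # branch taken
else_states = [{"x": x, "y": None} for x in (-1, 0)]       # branch skipped
after_if = join(alpha(then_states), alpha(else_states))

# z = *y runs only when x > 0; it is safe if x>0 always comes with y!=NULL.
safe = all("y!=NULL" in v for v in after_if if "x>0" in v)
print(sorted(sorted(v) for v in after_if), safe)
```

Because the abstract value is a set of valuations rather than one valuation, the join keeps the two cases apart; this is the disjunctive information that a single interval cannot express.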
Canonical Abstraction
  Abstract unbounded sets of memory locations into a bounded set
  Partition based abstraction
  Use unary relations (symbols as distinctions)
  Maintain binary relations when necessary
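Canonical Abstraction
  x = null;
  while (…) do {
    t = malloc();
    t.next = x;
    x = t
  }
  [Figure: a concrete list u1, u2, u3 linked by n edges, with x and t pointing to u1; its canonical abstraction has the node u1 (pointed to by x and t) and the summary node u2,3, connected by n edges.]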
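The partitioning step can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in the demos: nodes with the same values for all unary predicates are merged, and a binary relation between two merged nodes is 1 if it holds for all underlying pairs, 0 if it holds for none, and 1/2 (written `0.5` here) otherwise. All names are hypothetical.

```python
from collections import defaultdict

def canonical_abstraction(nodes, unary, n_edges):
    """Merge nodes that agree on every unary predicate; abstract n in 3-valued logic."""
    groups = defaultdict(list)
    for u in nodes:
        # Canonical name: the unary predicates that hold for u.
        key = tuple(sorted(p for p, holds in unary.items() if u in holds))
        groups[key].append(u)

    def abstract_n(a, b):
        values = {(u, v) in n_edges for u in groups[a] for v in groups[b]}
        if values == {True}:
            return 1
        if values == {False}:
            return 0
        return 0.5   # holds for some pairs but not all

    abs_n = {(a, b): abstract_n(a, b) for a in groups for b in groups}
    return dict(groups), abs_n

# Concrete list from the slide: u1 -> u2 -> u3, with x and t pointing to u1.
nodes = ["u1", "u2", "u3"]
unary = {"x": {"u1"}, "t": {"u1"}}
n_edges = {("u1", "u2"), ("u2", "u3")}

groups, abs_n = canonical_abstraction(nodes, unary, n_edges)
print(groups)   # {('t', 'x'): ['u1'], (): ['u2', 'u3']}
print(abs_n)
```

The group `()` with two members plays the role of the summary node u2,3; both n edges touching it get the value 1/2 because they hold for some but not all of the merged concrete pairs.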
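Canonical Abstraction and Equality
  x = null;
  while (…) do {
    t = malloc();
    t.next = x;
    x = t
  }
  [Figure: the concrete list u1, u2, u3 with x, t, and n edges, now also annotated with eq edges; its abstraction with the node u1 and the summary node u2,3, likewise annotated with n and eq edges.]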
Heap Sharing relation
  is(v) = ∃v1,v2: n(v1,v) ∧ n(v2,v) ∧ v1 ≠ v2
  [Figure: a concrete list u1, u2, …, un linked by n edges, with variables x and t, where is(v)=0 for every node; in its abstraction, the node u1 and the summary node u2..n both have is(v)=0.]
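Heap Sharing relation
  is(v) = ∃v1,v2: n(v1,v) ∧ n(v2,v) ∧ v1 ≠ v2
  [Figure: a concrete structure u1, u2, …, un linked by n edges, with variables x and t, in which one node has is(v)=1 and the others is(v)=0; in its abstraction, the node u2 with is(v)=1 is kept separate from u1 with is(v)=0 and from the summary node u3..n.]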
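Reachability relation
  t[n](v1, v2) = n*(v1, v2)
  [Figure: a concrete list u1, u2, ..., un linked by n edges, with variables x and t, annotated with t[n] edges; its abstraction with the node u1 and the summary node u2..n, annotated with n and t[n] edges.]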
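List Segments
  [Figure: a list of eight nodes u1, u2, …, u8 linked by n edges, with variables x and y pointing into it, and the beginning of its abstraction (u1, pointed to by x).]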
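Reachability from a variable
  r[y](v) = ∃w: y(w) ∧ n*(w, v)
  [Figure: a list of eight nodes u1, u2, …, u8 linked by n edges, with variables x and y; nodes are labeled r[y]=0 or r[y]=1. Its abstraction has u1 (pointed to by x), the summary node u2,3,4, u5 (pointed to by y), and the summary node u6,7,8.]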
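The instrumentation predicates is(v) and r[y](v) are easy to evaluate on a concrete heap, and using r[y] as an extra unary predicate yields the four-node partition from the figure. The sketch below assumes, following the abstraction shown on the slide, that x points to u1 and y points to u5; the helper names are hypothetical.

```python
from collections import defaultdict

NODES = [f"u{i}" for i in range(1, 9)]
N_EDGES = {(f"u{i}", f"u{i + 1}") for i in range(1, 8)}   # u1 -> u2 -> ... -> u8
X, Y = "u1", "u5"                                         # targets of x and y

def reachable_from(start, edges):
    """All v with n*(start, v); n* is reflexive, so start itself is included."""
    seen, work = {start}, [start]
    while work:
        u = work.pop()
        for src, dst in edges:
            if src == u and dst not in seen:
                seen.add(dst)
                work.append(dst)
    return seen

def is_shared(v, edges):
    """is(v): at least two distinct nodes have an n edge into v."""
    return len({src for src, dst in edges if dst == v}) >= 2

r_y = reachable_from(Y, N_EDGES)

# Partition nodes by the unary predicates x, y, and r[y].
groups = defaultdict(list)
for v in NODES:
    groups[(v == X, v == Y, v in r_y)].append(v)
partition = sorted(groups.values())

print(partition)
print(any(is_shared(v, N_EDGES) for v in NODES))   # False for an unshared list
```

Without r[y], the nodes u2..u4 and u6..u8 would agree on every unary predicate and collapse into one summary node, losing the distinction between the segment before y and the segment after it.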
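Sortedness
  [Figure: a concrete list u1, u2, ..., un linked by n edges, with variables x and t, annotated with dle edges; its abstraction with the node u1 and the summary node u2..n, annotated with n and dle edges.]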
Example: Sortedness
  inOrder(v) = ∀v1: n(v,v1) → dle(v, v1)
  [Figure: a concrete list u1, u2, ..., un linked by n edges, with variables x and t and dle edges, where inOrder = 1 for every node; in its abstraction, the node u1 and the summary node u2..n both have inOrder = 1.]
Example: InsertSort
  typedef struct list_cell {
    int data;
    struct list_cell *n;
  } *List;

  List InsertSort(List x) {
    List r, pr, rn, l, pl;
    r = x;
    pr = NULL;
    while (r != NULL) {
      l = x;
      rn = r->n;
      pl = NULL;
      while (l != r) {
        if (l->data > r->data) {
          /* unlink r and reinsert it before l */
          pr->n = rn;
          r->n = l;
          if (pl == NULL)
            x = r;
          else
            pl->n = r;
          r = pr;
          break;
        }
        pl = l;
        l = l->n;
      }
      pr = r;
      r = rn;
    }
    return x;
  }
Example: InsertSort
  typedef struct list_cell {
    int data;
    struct list_cell *n;
  } *List;

  List InsertSort(List x) {
    List r, pr, rn, l, pl;
    if (x == NULL) return NULL;
    pr = x;
    r = x->n;
    while (r != NULL) {
      pl = x;
      rn = r->n;
      l = x->n;
      while (l != r) {
        pr->n = rn;
        r->n = l;
        pl->n = r;
        r = pr;
        break;
      }
      pl = l;
      l = l->n;
      pr = r;
      r = rn;
    }
    return x;
  }
void Mark(Node root) {
  if (root != NULL) {
    pending = ∅
    pending = pending ∪ {root}
    marked = ∅
    while (pending ≠ ∅) {
      x = SelectAndRemove(pending)
      marked = marked ∪ {x}
      t = x->left
      if (t ≠ NULL)
        if (t ∉ marked)
          pending = pending ∪ {t}
      /* t = x->right
       * if (t ≠ NULL)
       *   if (t ∉ marked)
       *     pending = pending ∪ {t}
       */
    }
  }
  assert(marked == Reachset(root))
}
There may exist an individual that is reachable from the root, but not marked
  [Figure: an abstract heap with left and right edges, the variables root and x, and nodes labeled r[root] and m; one node has r[root] but not m.]
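The counterexample reported by the analysis can be reproduced concretely. The sketch below transcribes the Mark procedure with the right-child traversal left out, as on the slide, and compares the result with the set of reachable nodes. The tree shape and the dictionary encoding of `left` and `right` are assumptions made for the illustration.

```python
def reach_set(root, left, right):
    """All nodes reachable from root through left and right pointers."""
    seen = set()
    work = [root] if root is not None else []
    while work:
        x = work.pop()
        if x in seen:
            continue
        seen.add(x)
        for child in (left.get(x), right.get(x)):
            if child is not None:
                work.append(child)
    return seen

def mark_left_only(root, left):
    """Mark as on the slide: the traversal of x->right is commented out."""
    marked = set()
    if root is not None:
        pending = {root}
        while pending:
            x = pending.pop()          # SelectAndRemove(pending)
            marked.add(x)
            t = left.get(x)
            if t is not None and t not in marked:
                pending.add(t)
    return marked

# Hypothetical heap: root has a left child a and a right child b; a has a right child c.
left = {"root": "a"}
right = {"root": "b", "a": "c"}

marked = mark_left_only("root", left)
reachable = reach_set("root", left, right)
print(sorted(reachable - marked))   # ['b', 'c']: reachable but not marked
```

Restoring the commented-out block makes `marked` equal to the reachable set on this heap, which is the property stated by the assertion.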
Conclusions (1)
  Good static analysis =
    Precise enough (for the client)
    Efficient enough
  Good static analysis
    Good domain
      Abstract non-important details
      Represent relevant concrete information
    Precise and efficient abstract meaning of abstract interpreters
    Efficient join implementation
    Small height or widening
Conclusions (2)
  The Theory of Static Analysis is well founded
    Abstraction
    Soundness
    Chaotic iterations
    Elimination methods
    Modular methods
  Weak Parts
    Transformations
    Predictable approximations
    User defined abstractions
    System