Numeric Abstract Domains
Mooly Sagiv, Tel Aviv University
Adapted from Antoine Miné
Subjects
Goals
• Infer inductive invariants on numeric values
• Abstract sets of points in $\mathcal{P}(\mathbb{R}^n)$
• Applications:
  – Array bounds
  – Termination
    » infer ranking functions with values in $\mathbb{N}$
  – Cost analysis
    » time and memory consumption are numeric quantities
  – Pointer analysis with pointer arithmetic
    » pointer offset
  – String analysis in C
    » length, index
Numeric Semantics
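Arithmetic Expressions & Commands
• $\mathit{exp} ::= V \mid -\mathit{exp} \mid \mathit{exp}\ \mathit{op}\ \mathit{exp} \mid [c, c']$
  – $V \in \mathit{Var}$
  – $\mathit{op} \in \{+, -, \times, /\}$
  – $c, c' \in \mathbb{R} \cup \{-\infty, +\infty\}$
• $\mathit{com} ::= V := \mathit{exp} \mid \texttt{assume}\ \mathit{exp}\ \mathit{relop}\ 0 \mid \texttt{assert}\ \mathit{exp}\ \mathit{relop}\ 0$
  – $V \in \mathit{Var}$
  – $\mathit{relop} \in \{=, \neq, <, \leq, \geq\}$
• Control flow graph $G(N, E, s)$ where $E \subseteq N \times N$ is annotated with commands
  – $s \in N$ is the start node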
Example Program
  1: X := [1, 10];
  2: Y := 100;
  while 3: X >= 0 do {
    4: X := X - 1;
    5: Y := Y + 10
  }
  6:
Control flow graph edges: 1 → 2: X := [1, 10]; 2 → 3: Y := [100, 100]; 3 → 4: assume X ≥ 0; 4 → 5: X := X - 1; 5 → 3: Y := Y + 10; 3 → 6: assume X < 0
Concrete Operational Semantics
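Semantics of Expressions
• States $\Sigma = \mathit{Var} \to \mathbb{R}$
• Semantics $E[\![\mathit{exp}]\!] : \Sigma \to \mathcal{P}(\mathbb{R})$
• $E[\![V]\!]\,\sigma = \{\sigma(V)\}$
• $E[\![[c, c']]\!]\,\sigma = \{x \in \mathbb{R} \mid c \leq x \leq c'\}$
• $E[\![-e]\!]\,\sigma = \{-x \mid x \in E[\![e]\!]\,\sigma\}$
• $E[\![e\ \mathit{op}\ e']\!]\,\sigma = \{x\ \mathit{op}\ x' \mid x \in E[\![e]\!]\,\sigma,\ x' \in E[\![e']\!]\,\sigma\}$ for $\mathit{op} \in \{+, -, \times\}$
• $E[\![e / e']\!]\,\sigma = \{x / x' \mid x \in E[\![e]\!]\,\sigma,\ x' \in E[\![e']\!]\,\sigma,\ x' \neq 0\}$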
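Semantics of Commands
• States $\Sigma = \mathit{Var} \to \mathbb{R}$
• Semantics $C[\![\mathit{com}]\!] : \mathcal{P}(\Sigma) \to \mathcal{P}(\Sigma)$
• $C[\![V := e]\!]\,Z = \{\sigma[V \mapsto x] \mid \sigma \in Z,\ x \in E[\![e]\!]\,\sigma\}$
• $C[\![\texttt{assume}\ e\ \mathit{relop}\ 0]\!]\,Z = \{\sigma \mid \sigma \in Z,\ \exists x \in E[\![e]\!]\,\sigma : x\ \mathit{relop}\ 0\}$
• $C[\![\texttt{assert}\ e\ \mathit{relop}\ 0]\!]\,Z$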
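Distributivity
• $C[\![\mathit{com}]\!]$ is distributive: applying it to a set of states gives the union of its results on the individual states
• $C[\![\mathit{com}]\!](Z) = \bigcup_{\sigma \in Z} C[\![\mathit{com}]\!]\{\sigma\}$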
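The concrete semantics and its distributivity can be tried out on small finite sets of states. The following is a minimal Python sketch, not part of the original slides. It assumes that interval expressions enumerate integers only (the slides work over the reals) so that every set stays finite, and the names `freeze`, `eval_expr`, `assign` and `assume_lt0` are hypothetical.

```python
from itertools import product

def freeze(state):
    """Turn a dict state into a hashable tuple so it can be stored in a set."""
    return tuple(sorted(state.items()))

def eval_expr(expr, state):
    """E[[expr]] state: the set of values expr may take in one state.
    expr is ('var', name) | ('itv', lo, hi) | ('neg', e) | ('add'|'sub'|'mul', e1, e2).
    Assumption: intervals are finite and enumerate integers only."""
    tag = expr[0]
    if tag == 'var':
        return {state[expr[1]]}
    if tag == 'itv':
        return set(range(expr[1], expr[2] + 1))
    if tag == 'neg':
        return {-x for x in eval_expr(expr[1], state)}
    ops = {'add': lambda a, b: a + b,
           'sub': lambda a, b: a - b,
           'mul': lambda a, b: a * b}
    left, right = eval_expr(expr[1], state), eval_expr(expr[2], state)
    return {ops[tag](a, b) for a, b in product(left, right)}

def assign(var, expr, states):
    """C[[var := expr]] Z: update var with every possible value of expr."""
    out = set()
    for frozen in states:
        state = dict(frozen)
        for value in eval_expr(expr, state):
            out.add(freeze({**state, var: value}))
    return out

def assume_lt0(expr, states):
    """C[[assume expr < 0]] Z: keep states where some value of expr is negative."""
    return {s for s in states if any(v < 0 for v in eval_expr(expr, dict(s)))}

# Distributivity: running the command on Z equals the union over singletons.
Z = {freeze({'X': x, 'Y': 100}) for x in (0, 1, 5)}
dec_x = lambda S: assign('X', ('sub', ('var', 'X'), ('itv', 1, 1)), S)
pointwise = set().union(*(dec_x({s}) for s in Z))
print(dec_x(Z) == pointwise)
```

Because each command is defined state by state, the equality holds for any choice of `Z`; this is the property that lets the collecting semantics be written as a union over incoming edges.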
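Concrete Semantics of Programs
• $[\![G(s, N, E)]\!] : \mathcal{P}(\Sigma) \to N \to \mathcal{P}(\Sigma)$
  – The set of reachable states
  – $D = \langle \mathcal{P}(\Sigma), \subseteq, \cup, \cap, \emptyset, \Sigma \rangle$
• The smallest simultaneous solution to the set of equations $[\![G(s, N, E)]\!]$:
  – $CS[s] = \Sigma$
  – $CS[n] = \bigcup_{(m, c, n) \in E} C[\![c]\!]\, CS[m]$ for $n \neq s$
• Uniquely defined by Tarski's theorem, but not computable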
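Numeric Abstract Domains
• Representation: a set $D^\#$ of representable abstract values
• $\langle D^\#, \sqsubseteq^\#, \sqcup^\#, \sqcap^\#, \bot^\#, \top^\# \rangle$
  – The order $\sqsubseteq^\#$ relates the amount of information given by abstract values
• A concretization function $\gamma : D^\# \to D = \mathcal{P}(\Sigma) = \mathcal{P}(\mathit{Var} \to \mathbb{R})$
• Required algebraic properties:
  – $\gamma$ needs to be monotonic: $d \sqsubseteq^\# d' \Rightarrow \gamma(d) \subseteq \gamma(d')$
  – Strictness: $\gamma(\bot^\#) = \emptyset$
  – $\gamma(\top^\#) = \mathit{Var} \to \mathbb{R}$
• $\gamma$ need not be one-to-one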
Numeric Abstract Domain Examples
[Figure: four panels in the (x, y) plane, one per domain]
• signs: $x \geq 0$
• intervals: $x \in [a, b]$
• octagons: $\pm x \pm y \leq c$
• polyhedra: $\sum_i a_i x_i \leq c$
Requirements on abstract operators
• Algorithmic requirements
  – For each $c \in \mathit{com}$, $C^\#[\![c]\!] : D^\# \to D^\#$ is computable
  – Algorithm for $\sqcup^\#$
    » Used for merging control paths and iterations
  – Algorithm for $\sqcap^\#$
    » Used for assume
  – Algorithm for $\sqsubseteq^\#$
    » Used for checking termination
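Abstract Semantics of Programs
• $[\![G(s, N, E)]\!]^\# : D^\# \to N \to D^\#$
  – The set of reachable abstract states
  – $D^\# = \langle D^\#, \sqsubseteq^\#, \sqcup^\#, \sqcap^\#, \bot^\#, \top^\# \rangle$
• The smallest simultaneous solution to the set of equations $[\![G(s, N, E)]\!]^\#$:
  – $AS[s] = \top^\#$
  – $AS[n] = \bigsqcup^\#_{(m, c, n) \in E} C^\#[\![c]\!]\, AS[m]$ for $n \neq s$
• Uniquely defined by Tarski's theorem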
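Soundness
• $CS$ is the smallest simultaneous solution to the set of equations $[\![G(s, N, E)]\!]$:
  – $CS[s] = \Sigma$
  – $CS[n] = \bigcup_{(m, c, n) \in E} C[\![c]\!]\, CS[m]$ for $n \neq s$
• $AS$ is any solution of the set of equations $[\![G(s, N, E)]\!]^\#$:
  – $AS[s] = \top^\#$
  – $AS[n] = \bigsqcup^\#_{(m, c, n) \in E} C^\#[\![c]\!]\, AS[m]$ for $n \neq s$
• Then $CS[n] \subseteq \gamma(AS[n])$ for all $n \in N$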
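Soundness requirement
• $\Sigma \subseteq \gamma(\top^\#)$
• For each $c \in \mathit{com}$ and $d \in D^\#$: $C[\![c]\!](\gamma(d)) \subseteq \gamma(C^\#[\![c]\!]\, d)$
[Diagram: the abstract transformer $C^\#[\![c]\!]$ on $D^\#$ above the concrete transformer $C[\![c]\!]$ on sets of states, related by $\gamma$]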
Optimality (induced operation)
• Requires the existence of an abstraction $\alpha : D \to D^\#$ such that $(\alpha, \gamma)$ form a Galois connection
• Define $C^\#[\![c]\!] = \lambda d.\ \alpha(C[\![c]\!](\gamma(d)))$
• $\alpha$ may not exist
• $C^\#[\![c]\!]$ may be hard to compute
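Widening
• Accelerates the termination of chaotic iterations by computing a more conservative solution
• Can handle lattices of infinite height
• $\nabla : D^\# \times D^\# \to D^\#$ such that
  – $d \sqcup^\# d' \sqsubseteq^\# d \nabla d'$
  – For every increasing chain $d^\#_0 \sqsubseteq^\# d^\#_1 \sqsubseteq^\# \ldots$,
    » the sequence $s_0 = d^\#_0$, $s_{i+1} = s_i \nabla d^\#_{i+1}$ stabilizes after finitely many steps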
Chaotic Iterations with widening
  for each n in N do AS[n] := $\bot^\#$
  AS[s] := $\top^\#$
  WL := {s}
  while WL $\neq \emptyset$ do
    select and remove an element m $\in$ WL
    for each n such that (m, c, n) $\in$ E do
      temp := $C^\#[\![c]\!]$ AS[m]
      if m is a loop header then new := AS[n] $\nabla$ temp
      else new := AS[n] $\sqcup^\#$ temp
      if new $\neq$ AS[n] then
        AS[n] := new; WL := WL $\cup$ {n}
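The worklist iteration can be sketched in Python for the interval abstraction of the earlier example program (program points 1 to 6). This is an illustrative sketch under stated assumptions, not the slides' implementation: abstract states are dictionaries from variables to `(lo, hi)` pairs with `None` for the unreachable state, the strict test `X < 0` is over-approximated by `X <= 0`, the worklist is a FIFO queue, and widening is applied when the target node `n` is a loop header, which is one common variant of the test in the pseudocode. All function names are hypothetical.

```python
from collections import deque
import math

INF = math.inf
BOT = None  # abstract state of an unreachable program point

def join_itv(p, q):
    return (min(p[0], q[0]), max(p[1], q[1]))

def widen_itv(p, q):
    # Keep a stable bound, push an unstable bound to infinity.
    return (p[0] if p[0] <= q[0] else -INF, p[1] if p[1] >= q[1] else INF)

def lift(op):
    """Lift an interval operator pointwise to states; BOT is the neutral element."""
    def lifted(d1, d2):
        if d1 is BOT:
            return d2
        if d2 is BOT:
            return d1
        return {v: op(d1[v], d2[v]) for v in d1}
    return lifted

join, widen = lift(join_itv), lift(widen_itv)

def assign_const(var, lo, hi):
    return lambda d: BOT if d is BOT else {**d, var: (lo, hi)}

def add_const(var, k):
    return lambda d: BOT if d is BOT else {**d, var: (d[var][0] + k, d[var][1] + k)}

def assume_in(var, lo, hi):
    """Intersect var with [lo, hi]; an empty result makes the state BOT."""
    def transfer(d):
        if d is BOT:
            return BOT
        l, u = max(d[var][0], lo), min(d[var][1], hi)
        return BOT if l > u else {**d, var: (l, u)}
    return transfer

def analyze(nodes, edges, start, loop_headers, top):
    AS = {n: BOT for n in nodes}
    AS[start] = top
    worklist = deque([start])
    while worklist:
        m = worklist.popleft()
        for src, transfer, n in edges:
            if src != m:
                continue
            temp = transfer(AS[m])
            new = widen(AS[n], temp) if n in loop_headers else join(AS[n], temp)
            if new != AS[n]:
                AS[n] = new
                if n not in worklist:
                    worklist.append(n)
    return AS

edges = [
    (1, assign_const('X', 1, 10), 2),
    (2, assign_const('Y', 100, 100), 3),
    (3, assume_in('X', 0, INF), 4),    # assume X >= 0
    (4, add_const('X', -1), 5),
    (5, add_const('Y', 10), 3),
    (3, assume_in('X', -INF, 0), 6),   # assume X < 0, over-approximated by X <= 0
]
TOP = {'X': (-INF, INF), 'Y': (-INF, INF)}
result = analyze(range(1, 7), edges, 1, {3}, TOP)
print(result[3], result[6])
```

With this iteration order the loop header stabilizes after two visits: widening discards the lower bound of X and the upper bound of Y, while the bounds that never moved (X at most 10, Y at least 100) are kept.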
Non-Relational Abstractions
Cartesian Abstraction (independent attribute)
• Forget the relationships between variables
Example Program
Control flow graph edge labels: X := [1, 10]; Y := [100, 100]; assume X ≥ 0; X := X - 1; Y := Y + 10; assume X < 0
The Interval Domain
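The Interval Domain [Moore’66, Cousot’76]
• $D^\# = \{[a, b] \mid a \leq b,\ a \in \mathbb{R} \cup \{-\infty\},\ b \in \mathbb{R} \cup \{+\infty\}\} \cup \{\bot^\#\}$
• $\top^\# = [-\infty, +\infty]$
• $d \sqcup^\# d'$ = if $d = \bot^\#$ then $d'$ else if $d' = \bot^\#$ then $d$ else let $d = [a, b]$ and $d' = [a', b']$ in $[\min(a, a'), \max(b, b')]$
• $d \sqcap^\# d'$ = if $d = \bot^\#$ then $\bot^\#$ else if $d' = \bot^\#$ then $\bot^\#$ else let $d = [a, b]$ and $d' = [a', b']$ in let $l = \max(a, a')$ and $u = \min(b, b')$ in if $l > u$ then $\bot^\#$ else $[l, u]$
• $d \nabla d'$ = if $d = \bot^\#$ then $d'$ else if $d' = \bot^\#$ then $d$ else let $d = [a, b]$ and $d' = [a', b']$ in $[\text{if } a \leq a' \text{ then } a \text{ else } -\infty,\ \text{if } b \geq b' \text{ then } b \text{ else } +\infty]$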
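The interval operators translate almost line by line into code. The sketch below is a hedged Python rendering, assuming intervals are `(lo, hi)` tuples, `None` stands for the bottom element, and infinite bounds are `math.inf`; the ordering test `leq` is not spelled out on the slide and is added here as the usual interval inclusion.

```python
import math

INF = math.inf
BOT = None            # bottom element: the empty interval
TOP = (-INF, INF)

def join(d1, d2):
    """Least upper bound: the smallest interval containing both arguments."""
    if d1 is BOT:
        return d2
    if d2 is BOT:
        return d1
    return (min(d1[0], d2[0]), max(d1[1], d2[1]))

def meet(d1, d2):
    """Greatest lower bound: the intersection, or BOT when it is empty."""
    if d1 is BOT or d2 is BOT:
        return BOT
    lo, hi = max(d1[0], d2[0]), min(d1[1], d2[1])
    return BOT if lo > hi else (lo, hi)

def widen(d1, d2):
    """Widening: a bound that is not stable jumps to infinity."""
    if d1 is BOT:
        return d2
    if d2 is BOT:
        return d1
    return (d1[0] if d1[0] <= d2[0] else -INF,
            d1[1] if d1[1] >= d2[1] else INF)

def leq(d1, d2):
    """Abstract order (assumed here to be interval inclusion)."""
    if d1 is BOT:
        return True
    if d2 is BOT:
        return False
    return d2[0] <= d1[0] and d1[1] <= d2[1]

# The chain [0,0], [0,1], [0,2], ... never stabilizes under join,
# but the widened sequence is stable after one step.
s = (0, 0)
for k in range(1, 5):
    s = widen(s, (0, k))
print(s)
```

The loop at the end illustrates the termination property required of widening: the joins of the chain keep growing, whereas the widened sequence reaches `(0, inf)` immediately and stays there.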
Galois Connection
Abstract Expressions
Abstract Assignments
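The formulas for these two slides did not survive, so the following Python sketch shows only the standard interval-arithmetic reading of abstract expressions and assignments, as an assumption rather than the slides' own definitions. Expressions use a hypothetical tuple encoding, the state is assumed to be reachable (no bottom handling), division is omitted, and the product of zero with an infinite bound is taken to be zero.

```python
import math

INF = math.inf

def mul_bound(a, b):
    # Convention 0 * inf = 0; plain float multiplication would give nan.
    return 0 if a == 0 or b == 0 else a * b

def eval_abs(expr, env):
    """Interval over-approximation of an expression.
    expr is ('var', name) | ('itv', lo, hi) | ('neg', e) | ('add'|'sub'|'mul', e1, e2);
    env maps each variable to a (lo, hi) pair."""
    tag = expr[0]
    if tag == 'var':
        return env[expr[1]]
    if tag == 'itv':
        return (expr[1], expr[2])
    if tag == 'neg':
        lo, hi = eval_abs(expr[1], env)
        return (-hi, -lo)
    (a, b), (c, d) = eval_abs(expr[1], env), eval_abs(expr[2], env)
    if tag == 'add':
        return (a + c, b + d)
    if tag == 'sub':
        return (a - d, b - c)
    if tag == 'mul':
        products = [mul_bound(x, y) for x in (a, b) for y in (c, d)]
        return (min(products), max(products))
    raise ValueError(f"unsupported expression: {tag}")

def assign_abs(var, expr, env):
    """Abstract assignment: replace var's interval, keep the others."""
    return {**env, var: eval_abs(expr, env)}

env = {'X': (1, 10), 'Y': (100, 100)}
env2 = assign_abs('X', ('sub', ('var', 'X'), ('itv', 1, 1)), env)
print(env2)
print(eval_abs(('sub', ('var', 'X'), ('var', 'X')), {'X': (0, 1)}))
```

The last line shows the price of the non-relational abstraction: `X - X` evaluates to `(-1, 1)` rather than `(0, 0)`, because the two occurrences of `X` are treated as independent.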
Optimality (Induced)
Abstract Assume
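The content of this slide was lost, so what follows is one plausible interval treatment of assume, given as a hedged Python sketch rather than the slides' definition. It assumes only two restricted test shapes, `V <= c` and `X <= Y`, represents the unreachable state as `None`, and uses hypothetical function names. The test is implemented with the interval meet, matching the earlier remark that meet is used for assume.

```python
import math

INF = math.inf
BOT = None  # unreachable state

def assume_le_const(env, var, c):
    """Refine env under the test var <= c (meet with [-inf, c])."""
    if env is BOT:
        return BOT
    lo, hi = env[var]
    hi = min(hi, c)
    return BOT if lo > hi else {**env, var: (lo, hi)}

def assume_le_var(env, x, y):
    """Refine env under the test x <= y.
    x cannot exceed the largest y, and y cannot be below the smallest x."""
    if env is BOT:
        return BOT
    (xlo, xhi), (ylo, yhi) = env[x], env[y]
    xhi, ylo = min(xhi, yhi), max(ylo, xlo)
    if xlo > xhi or ylo > yhi:
        return BOT
    return {**env, x: (xlo, xhi), y: (ylo, yhi)}

env = {'X': (6, 10), 'Y': (5, 7)}
print(assume_le_var(env, 'X', 'Y'))
print(assume_le_const({'X': (0, INF)}, 'X', 40))
```

After `assume X <= Y` both variables are narrowed to `(6, 7)`, but the fact that X stays below Y is not recorded anywhere: a later assignment to either variable forgets it.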
Example Program
Control flow graph edge labels: X := 0; assume X < 40; X := X + 1; assume X ≥ 40
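A worked interval iteration for this program may help. The derivation below is a sketch under assumptions not stated on the slide: widening is applied at the loop head, $X^\#_k$ denotes the value at the loop head after the $k$-th widening step, $T_k$ is the value arriving along the back edge, and the strict test $X < 40$ is over-approximated by $X \leq 40$ because closed intervals cannot express a strict bound.

```latex
\begin{align*}
X^\#_0 &= [0, 0]
  && \text{after } X := 0 \\
T_1 &= \bigl([0, 0] \sqcap^\# [-\infty, 40]\bigr) + [1, 1] = [1, 1]
  && \text{assume } X < 40;\ X := X + 1 \\
X^\#_1 &= X^\#_0 \nabla T_1 = [0, 0] \nabla [1, 1] = [0, +\infty]
  && \text{upper bound unstable} \\
T_2 &= \bigl([0, +\infty] \sqcap^\# [-\infty, 40]\bigr) + [1, 1] = [1, 41] \\
X^\#_2 &= X^\#_1 \nabla T_2 = [0, +\infty] \nabla [1, 41] = [0, +\infty] = X^\#_1
  && \text{stable} \\
X^\#_{\mathrm{exit}} &= [0, +\infty] \sqcap^\# [40, +\infty] = [40, +\infty]
  && \text{assume } X \geq 40
\end{align*}
```

The iteration stops after two steps, at the cost of losing the upper bound at the loop head.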
Relational Domains
The need for relational domains
• Non-relational domains cannot represent relationships between variables
  Y := 0;
  while true do {
    X := [-128, 128]; D := [0, 16];
    S := Y; Y := X; R := X - S;
    if R <= -D then Y := S - D fi;
    if R >= D then Y := S + D fi
  }
  – X: input signal
  – Y: output signal
  – S: last output
  – R: Y - S
  – D: maximum allowed for |R|
The need for relational domains
• Infer strong enough inductive invariants
  X := 0; I := 1;
  while I < 5000 do {
    if … then X := X + 1 else X := X - 1 fi;
    I := I + 1
  }
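The missing reasoning step is which invariant would be strong enough here. The following derivation is a sketch, not taken from the slide; it assumes the property of interest is a bound on $X$ at loop exit and uses the fact that $I$ only takes integer values.

```latex
\begin{align*}
\text{Candidate invariant at the loop head: }
  & 1 \leq I \leq 5000 \ \wedge\ -(I - 1) \leq X \leq I - 1 \\
\text{Initially: }
  & I = 1,\ X = 0 \ \Rightarrow\ -(1 - 1) \leq 0 \leq 1 - 1 \\
\text{One iteration } (I < 5000)\text{: }
  & X' = X \pm 1,\quad I' = I + 1 \leq 5000 \\
  & |X'| \leq |X| + 1 \leq (I - 1) + 1 = I' - 1 \\
\text{At exit } (I \geq 5000)\text{: }
  & I = 5000 \ \Rightarrow\ -4999 \leq X \leq 4999
\end{align*}
```

Both $X - I \leq -1$ and $-X - I \leq -1$ relate two variables, so no interval can express them. An interval analysis with widening keeps no finite bound on $X$, since $X$ may grow or shrink on every iteration.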
The need for relational domains
• Modular analysis of procedures
  Z := X;
  if Y > Z then Z := Y;
  if Z < 0 then Z := 0;
Weakly Relational Domains
The Zone Domain [Shacham’00, Miné’01]
Constraints of the form $V_i - V_j \leq c$ and $\pm V_i \leq c$
Machine Representation
• A potential constraint has the form $V_i - V_j \leq c$
• Represented as a directed graph $G$
  – Nodes are labeled with variables
  – An arc with weight $c$ from $V_i$ to $V_j$ for each constraint $V_i - V_j \leq c$
• Difference Bound Matrix (DBM)
  – Adjacency matrix $m$ of $G$
  – $m_{ij} = c < +\infty$: the constraint $V_i - V_j \leq c$
  – $m_{ij} = +\infty$: no such constraint
• Concretization: $\gamma(m) = \{(v_1, \ldots, v_n) \mid \forall i, j : v_i - v_j \leq m_{ij}\}$
Machine Representation (cont)
• Unary constraints
  – Add another variable $V_0$
  – $m$ has size $(n+1) \times (n+1)$
  – $V_i \leq c$ is denoted as $V_i - V_0 \leq c$, i.e., $m_{i,0} = c$
  – $V_i \geq c$ is denoted as $V_0 - V_i \leq -c$, i.e., $m_{0,i} = -c$
  – $\gamma_0(m) = \{(v_1, v_2, \ldots, v_n) \mid (0, v_1, v_2, \ldots, v_n) \in \gamma(m)\}$
[Example DBM over $V_0, V_1, V_2$; first row: $+\infty,\ 4,\ 3$]
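The DBM encoding can be illustrated with a few lines of Python. This is a minimal sketch that follows the index convention stated in the bullets ($m_{ij}$ bounds $V_i - V_j$, index 0 is the extra variable $V_0$ fixed to 0). The helper names and the sample constraints are hypothetical and are not the matrix shown on the slide.

```python
import math

INF = math.inf

def make_dbm(n):
    """(n+1) x (n+1) matrix with no constraints; index 0 is the zero variable V0."""
    return [[INF] * (n + 1) for _ in range(n + 1)]

def add_diff(m, i, j, c):
    """Record V_i - V_j <= c, keeping the tighter of the old and new bounds."""
    m[i][j] = min(m[i][j], c)

def add_upper(m, i, c):
    """V_i <= c is stored as V_i - V_0 <= c."""
    add_diff(m, i, 0, c)

def add_lower(m, i, c):
    """V_i >= c is stored as V_0 - V_i <= -c."""
    add_diff(m, 0, i, -c)

def in_gamma0(m, point):
    """True when (v_1..v_n) is in gamma_0(m), i.e. (0, v_1..v_n) meets every bound."""
    v = (0,) + tuple(point)
    size = len(v)
    return all(v[i] - v[j] <= m[i][j] for i in range(size) for j in range(size))

# Illustrative zone: 1 <= V1 <= 4, 1 <= V2 <= 3, V1 - V2 <= 1
m = make_dbm(2)
add_lower(m, 1, 1)
add_upper(m, 1, 4)
add_lower(m, 2, 1)
add_upper(m, 2, 3)
add_diff(m, 1, 2, 1)

print(in_gamma0(m, (2, 2)))   # inside the zone
print(in_gamma0(m, (4, 2)))   # violates V1 - V2 <= 1
```

Storing unary bounds as differences with $V_0$ means one matrix and one membership test cover both kinds of constraint.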
The DBM Lattice
Relational Domains
The Polyhedra Domain [CH’78]
Constraints of the form $\bigwedge_i \sum_j a_{i,j}\, x_j \leq c_i$
Summary
• Numerical domains are powerful
• They infer interesting invariants
• Cost is an issue
• They need to be combined with other domains
• Next week: some applications