Pentagons: A Weakly Relational Abstract Domain for the Efficient Validation of Array Accesses
Francesco Logozzo, Manuel Fahndrich
Microsoft Research, Redmond
The Background
Efficient static checking of .NET assemblies
Foxtrot: a language-agnostic contract language
Clousot: a language-agnostic static analyzer
  Based on abstract interpretation
  Checks contracts, array bounds, memory accesses, nullness, …
Demo
Wrong?
Ok: not null
Demo
Ok: index in bounds
Ok: not null
Ok: index in bounds
The paper in a nutshell
Program executions: is 0 ≤ y < x?
Testing: try some points. What about the others?
Model checking: try all the points. What if we have ∞ points?
Abstract interpretation: approximation
  Intervals: No, in O(n)
  Octagons: Yes! in Θ(n³)
  Polyhedra: Yes! in O(2ⁿ)
  Pentagons: Yes! in O(n)
Pentagons?
A lightweight numerical domain
Keeps relations of the form a ≤ x ≤ b && x < y
  a, b: numerical constants
  x, y: variables
Enough to validate > 83% of the array accesses in mscorlib.dll
  mscorlib.dll is the main library in .NET
Fast: analyzes it in a couple of minutes
Abstract domain
An abstract domain is a complete lattice endowed with:
Widening operator
  Ensures the convergence of the analysis
  Ex. The increasing chain [0,1] ⊑ [0,2] ⊑ [0,3] ⊑ [0,4] ⊑ ... is extrapolated by widening to [0, +∞]
Transfer functions
  Capture the abstract semantics of statements
  Ex. ⟦x := y + 3⟧([y → [1,2]]) = [y → [1,2], x → [4,5]]
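The two examples on this slide can be replayed with a minimal sketch. It assumes intervals are plain `(lo, hi)` tuples and abstract states are dictionaries from variable names to intervals; the names `widen` and `assign_add_const` are hypothetical and not taken from Clousot.

```python
import math

# Assumption: an interval is a (lo, hi) tuple; math.inf marks an unbounded end.

def widen(old, new):
    """Keep each bound of `old` that `new` does not exceed; otherwise jump to infinity."""
    lo = old[0] if old[0] <= new[0] else -math.inf
    hi = old[1] if old[1] >= new[1] else math.inf
    return (lo, hi)

def assign_add_const(state, target, source, k):
    """Abstract transfer function for `target := source + k` on an interval state."""
    lo, hi = state[source]
    result = dict(state)              # copy, so the input state is left untouched
    result[target] = (lo + k, hi + k)
    return result

# The chain [0,1], [0,2], [0,3], ... has an unstable upper bound,
# so a single widening step extrapolates it to [0, +inf].
widened = widen((0, 1), (0, 2))

# x := y + 3 evaluated in the state [y -> [1, 2]].
after = assign_add_const({"y": (1, 2)}, "x", "y", 3)
```

Here `widened` is `(0, math.inf)` and `after` maps `y` to `(1, 2)` and `x` to `(4, 5)`, matching the slide.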
Interval domain
Elements: { [a, b] | a ∈ Z ∪ { -∞ }, b ∈ Z ∪ { +∞ } }
Order: [a,b] ⊑ [c,d] iff c ≤ a and b ≤ d
Join: [a,b] ⊔ [c,d] = [min(a,c), max(b,d)]
Meet: [a,b] ⊓ [c,d] = [max(a,c), min(b,d)]
Widening: keep the stable bounds
Transfer functions: ordinary interval arithmetic
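The order, join and meet above translate almost literally into code. The following is a sketch only: intervals are assumed to be `(lo, hi)` tuples, and representing an empty meet as `None` is a choice made here, since the slide does not say how the bottom element is encoded.

```python
def leq(i, j):
    """[a,b] ⊑ [c,d] iff c <= a and b <= d."""
    (a, b), (c, d) = i, j
    return c <= a and b <= d

def join(i, j):
    """Smallest interval containing both arguments."""
    (a, b), (c, d) = i, j
    return (min(a, c), max(b, d))

def meet(i, j):
    """Intersection of the two intervals; None (an assumption) when it is empty."""
    (a, b), (c, d) = i, j
    lo, hi = max(a, c), min(b, d)
    return (lo, hi) if lo <= hi else None
```

For instance, `join((0, 2), (4, 6))` gives `(0, 6)`, which also contains the values 3 that neither argument allowed: the join over-approximates the union.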
LT Domain
Elements: ℘({ X < Y | X and Y are variables })
  Efficient representation with hashtables
Order: A ⊑ B iff B ⊆ A
Join: A ⊔ B = A ∩ B
Meet: A ⊓ B = A ∪ B
Widening: just the join, as the lattice has finite height
Transfer functions: ⟦y := x + 1⟧(A) = (A − {y}) ∪ { x < y }
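A small sketch of the LT domain, assuming each constraint `x < y` is stored as the pair `(x, y)` in a hash-based set. It reads `A − {y}` as "drop every constraint that mentions `y`", which is an interpretation of the slide's notation; `assign_succ` is a hypothetical name, and the sketch assumes `x` and `y` are distinct variables and ignores arithmetic overflow.

```python
def lt_leq(a, b):
    """A ⊑ B iff B ⊆ A: more constraints means a more precise element."""
    return b <= a

def lt_join(a, b):
    """Only constraints known on both branches survive a join."""
    return a & b

def lt_meet(a, b):
    """The meet accumulates the constraints of both elements."""
    return a | b

def assign_succ(a, y, x):
    """Transfer function for y := x + 1 (assumes x != y, no overflow)."""
    # The old value of y is gone, so forget every constraint that mentions y ...
    kept = {(l, r) for (l, r) in a if y not in (l, r)}
    # ... then record that the new y is strictly greater than x.
    return frozenset(kept | {(x, y)})

before = frozenset({("i", "n"), ("y", "n")})
after = assign_succ(before, "y", "x")
```

In this example `after` contains `i < n` and `x < y`; the stale constraint `y < n` has been dropped.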
Pentagons
Reduced Cartesian product of Intervals and LT
Reduced?
  Not just pairs: information flows from one element to the other
  Ex. (x → [1, 4], y → [3, 3], { x < y }) reduces to (x → [1, 2], y → [3, 3], { x < y })
  May introduce a cubic slowdown
Reduction is applied
  At precise points of the analysis
  Lazily at join points
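One direction of the reduction, from LT constraints to intervals, can be sketched as follows. This is an illustration under assumptions rather than the paper's algorithm: variables range over integers, intervals are `(lo, hi)` tuples, and the hypothetical `reduce_once` makes a single pass without iterating to a fixpoint or detecting empty intervals.

```python
def reduce_once(intv, lt):
    """Tighten interval bounds using each strict inequality x < y once."""
    out = dict(intv)                        # copy, so the input is not modified
    for x, y in lt:
        (xl, xh), (yl, yh) = out[x], out[y]
        out[x] = (xl, min(xh, yh - 1))      # x < y implies x <= hi(y) - 1
        out[y] = (max(yl, xl + 1), yh)      # x < y implies y >= lo(x) + 1
    return out

# The example from the slide: x in [1, 4], y in [3, 3], and x < y.
reduced = reduce_once({"x": (1, 4), "y": (3, 3)}, {("x", "y")})
```

Here `reduced` maps `x` to `(1, 2)` and leaves `y` at `(3, 3)`, as in the slide's example.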
The (Naive) Join of Pentagons
Left_P = (left_intv, left_lt), Right_P = (right_intv, right_lt)
1. Close Left_P and Right_P
2. Apply the join pairwise
Closure(intv, lt) applies this rule until saturation:
  if x → [a,b] and y → [c,d] are in intv and b < c, then lt = lt ∪ { x < y }
Problem: it introduces a quadratic slowdown
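The naive join can be sketched as below, assuming a pentagon is a pair `(intv, lt)` where `intv` maps variables to `(lo, hi)` tuples and `lt` is a set of `(x, y)` pairs meaning `x < y`. The function names are hypothetical. In this sketch one sweep over all variable pairs already saturates the rule, because the rule never changes the intervals; that sweep is where the quadratic cost comes from.

```python
def close(intv, lt):
    """Add x < y whenever the intervals alone prove it, i.e. hi(x) < lo(y)."""
    derived = set(lt)
    for x, (_, b) in intv.items():          # quadratic: every pair of variables
        for y, (c, _) in intv.items():
            if x != y and b < c:
                derived.add((x, y))
    return derived

def join_intervals(left, right):
    """Pointwise interval join on the variables bound on both sides."""
    return {v: (min(left[v][0], right[v][0]), max(left[v][1], right[v][1]))
            for v in left.keys() & right.keys()}

def naive_join(left, right):
    (l_intv, l_lt), (r_intv, r_lt) = left, right
    # 1. close both operands, 2. join each component separately
    return (join_intervals(l_intv, r_intv),
            close(l_intv, l_lt) & close(r_intv, r_lt))

# Left branch knows x < y only through its intervals; right branch knows it symbolically.
left = ({"x": (0, 2), "y": (3, 3)}, set())
right = ({"x": (0, 5), "y": (0, 9)}, {("x", "y")})
joined = naive_join(left, right)
```

Without the closure step the LT component of `joined` would be empty; with it, `x < y` survives even though the joined intervals `x → [0, 5]`, `y → [0, 9]` no longer imply it.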
The smarter join on Pentagons
Idea:
1. Apply the pairwise join
2. If a symbolic constraint x < y is dropped, check whether the other branch implies it
3. If it does, keep the constraint
Formal details in the paper
Results:
  For mscorlib, analysis time went from > 1h to a couple of minutes
  No access is lost!
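The three steps can be sketched as follows. This is an illustration, not the formal definition from the paper: a pentagon is assumed to be a pair `(intv, lt)` with `(lo, hi)` tuples and a set of `(x, y)` pairs, "the other branch implies it" is checked through the intervals only, and `implied` and `smart_join` are hypothetical names.

```python
def implied(intv, x, y):
    """The intervals alone prove x < y when hi(x) < lo(y)."""
    return x in intv and y in intv and intv[x][1] < intv[y][0]

def smart_join(left, right):
    (l_intv, l_lt), (r_intv, r_lt) = left, right
    # Step 1: pairwise join (interval hull, intersection of constraints).
    intv = {v: (min(l_intv[v][0], r_intv[v][0]), max(l_intv[v][1], r_intv[v][1]))
            for v in l_intv.keys() & r_intv.keys()}
    kept = set(l_lt & r_lt)
    # Steps 2 and 3: a constraint held by only one side is kept
    # when the other side's intervals already imply it.
    kept |= {(x, y) for (x, y) in l_lt - r_lt if implied(r_intv, x, y)}
    kept |= {(x, y) for (x, y) in r_lt - l_lt if implied(l_intv, x, y)}
    return intv, kept

left = ({"x": (0, 2), "y": (3, 3)}, set())
right = ({"x": (0, 5), "y": (0, 9)}, {("x", "y")})
joined = smart_join(left, right)
```

In this sketch the work is proportional to the number of explicit constraints rather than to the number of variable pairs, since no closure over all pairs is computed before joining.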
Experiment: Array bounds analysis
Assemblies as shipped
  No pre-processing
  No pre-selection
Intra-procedural analysis only
  Contracts will improve the precision
Conclusions
A lightweight abstract domain
  Used for array bounds validation
  Efficient and scalable
  Implemented in Clousot
To be used as a first pass to drop most of the proof obligations
  In combination with other domains