Compactly Representing First-Order Structures for Static Analysis

Tel-Aviv University: Roman Manevich, Mooly Sagiv
IBM T.J. Watson: Ganesan Ramalingam, John Field, Deepak Goyal

Motivation
TVLA is a powerful and general abstract interpretation system
Abstract interpretation in TVLA
  Operational semantics is expressed with first-order logic formulae
  Program states are represented as sets of evolving first-order structures
TVLA can do shape analysis but is not restricted to it
Space is a major bottleneck

Desired Properties
Sparse data structures
Share common sub-structures
  Inherited sharing
  Incidental sharing due to program invariants
But feasible time performance
  Phase-sensitive data structures
These are properties that could be exploited to improve performance.

Outline
Background
First-order structure representations
  Base representation (TVLA 0.91)
  BDD representation
Empirical evaluation
Conclusion

First-Order Logical Structures
Generalize shape graphs
  Arbitrary set of individuals
  Arbitrary set of predicates on individuals
Dynamically evolving
  Usually small changes
Properties are extracted by evaluating first-order formulae: ∃v1, v: x(v1) ∧ n(v1, v)
Join operator requires isomorphism testing

First-Order Structure ADT
Structure : new() /* empty structure */
SetOfNodes : nodeSet(Structure)
Node : newNode(Structure)
removeNode(Structure, node)
Kleene eval(Structure, p(r), <u1, …, ur>)
update(Structure, p(r), <u1, …, ur>, Kleene)
Structure copy(Structure)
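The ADT above can be sketched as a small Python class. This is only an illustration of the interface, not TVLA's actual code: the class name, the use of `Fraction` for the Kleene values 0, ½, 1, and the choice to treat unstored entries as 0 are all assumptions made here.

```python
from fractions import Fraction

# Kleene's three truth values, ordered 0 < 1/2 < 1.
FALSE, UNKNOWN, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)


class Structure:
    """Hypothetical in-memory version of the ADT; not TVLA's classes."""

    def __init__(self):                      # new(): empty structure
        self._nodes = set()
        self._values = {}                    # (predicate, node tuple) -> Kleene
        self._next_id = 0

    def node_set(self):                      # nodeSet(Structure)
        return frozenset(self._nodes)

    def new_node(self):                      # newNode(Structure)
        node = self._next_id
        self._next_id += 1
        self._nodes.add(node)
        return node

    def remove_node(self, node):             # removeNode(Structure, node)
        self._nodes.discard(node)
        # Drop every stored value whose node tuple mentions the removed node.
        self._values = {k: v for k, v in self._values.items()
                        if node not in k[1]}

    def eval(self, pred, nodes):             # eval(Structure, p, <u1..ur>)
        # Assumption: entries that are not stored read as FALSE.
        return self._values.get((pred, tuple(nodes)), FALSE)

    def update(self, pred, nodes, value):    # update(Structure, p, <u1..ur>, Kleene)
        key = (pred, tuple(nodes))
        if value == FALSE:
            self._values.pop(key, None)
        else:
            self._values[key] = value

    def copy(self):                          # copy(Structure)
        other = Structure()
        other._nodes = set(self._nodes)
        other._values = dict(self._values)
        other._next_id = self._next_id
        return other
```

Note that `copy` here duplicates the whole value table, which is exactly the cost the representations discussed later try to avoid.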

print_all Example

/* list.h */
typedef struct node {
    struct node *n;
    int data;
} *L;

/* print.c */
#include <stdio.h>
#include "list.h"

void print_all(L y) {
    L x;
    x = y;
    while (x != NULL) {
        /* assert(x != NULL) */
        printf("elem=%d", x->data);
        x = x->n;
    }
}

print_all Example
x = y
  update: x'(v) := y(v)
[Figure: structure S0 and the structure S1 obtained from it]
  S0: u1 (y=1), u (sm=½); edges labeled n=½
  S1: u1 (x=1, y=1), u (sm=½); edges labeled n=½
ADT calls
  copy(S0) : S1
  nodeSet(S0) : {u1, u}
  eval(S0, y, u1) : 1
  update(S1, x, u1, 1)
  eval(S0, y, u) : 0
  update(S1, x, u, 0)
You can see an opportunity for sharing between S0 and S1, since the only difference between their predicate values is in the interpretation of the predicate x for the node u1.
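The sequence of ADT calls for x = y can be replayed with a few lines of Python. This is a sketch only: structures are plain dictionaries, the helper names are made up for the illustration, and the two n=½ edges of S0 are assumed to be n(u1, u) and n(u, u) because the original figure is not available.

```python
from fractions import Fraction

ZERO, HALF, ONE = Fraction(0), Fraction(1, 2), Fraction(1)


def copy(structure):
    return {"nodes": set(structure["nodes"]),
            "values": dict(structure["values"])}


def eval_pred(structure, pred, nodes):
    return structure["values"].get((pred, nodes), ZERO)


def update(structure, pred, nodes, value):
    structure["values"][(pred, nodes)] = value


def assign_x_from_y(s0):
    """Apply the update formula x'(v) := y(v) to a copy of s0."""
    s1 = copy(s0)
    for v in sorted(s0["nodes"]):
        update(s1, "x", (v,), eval_pred(s0, "y", (v,)))
    return s1


# S0 from the slide; the endpoints of the n edges are an assumption.
s0 = {
    "nodes": {"u1", "u"},
    "values": {
        ("y", ("u1",)): ONE,
        ("sm", ("u",)): HALF,
        ("n", ("u1", "u")): HALF,
        ("n", ("u", "u")): HALF,
    },
}
s1 = assign_x_from_y(s0)

# Entries on which S0 and S1 disagree: only the value of x at u1.
changed = {key for key in s1["values"]
           if eval_pred(s1, *key) != eval_pred(s0, *key)}
```

In this sketch `changed` contains a single entry, which is the sharing opportunity the slide points out: everything else in S1 could be reused from S0.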

print_all Example
while (x != NULL)
  precondition: ∃v x(v)
x = x->n
  focus: ∃v1 x(v1) ∧ n(v1, v)
  update: x'(v) := ∃v1 x(v1) ∧ n(v1, v)
[Figure: structure S1 and the structures S2.0, S2.1, S2.2 obtained from it]
  S1: u1 (x=1, y=1), u (sm=½); edges labeled n=½, n=½
  S2.0: u1 (y=1), u (sm=½); edge labeled n=½
  S2.1: u1 (y=1), u (x=1); edges labeled n=1, n=½, n=½
  S2.2: u1 (y=1), u.0 (sm=½), u.1 (x=1); edges labeled n=1, n=½
There are more opportunities for sharing here, since the rightmost nodes, which represent the tail of the linked list, have similar predicate values in most of the structures.
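The formulae on this slide are evaluated in Kleene's 3-valued logic, where conjunction is the minimum and disjunction (and hence ∃) is the maximum over the order 0 < ½ < 1. The sketch below evaluates the precondition and the update formula on S1. The helper names are hypothetical, and the endpoints of the n=½ edges, n(u1, u) and n(u, u), are an assumption because the figure is missing.

```python
from fractions import Fraction

ZERO, HALF, ONE = Fraction(0), Fraction(1, 2), Fraction(1)


def k_and(a, b):
    return min(a, b)


def k_or(a, b):
    return max(a, b)


def exists(nodes, body):
    """Kleene value of (exists u . body(u)) over a finite node set."""
    result = ZERO
    for u in nodes:
        result = k_or(result, body(u))
    return result


# Structure S1 (after x = y); unlisted entries are 0.
nodes = ["u1", "u"]
x = {"u1": ONE}
n = {("u1", "u"): HALF, ("u", "u"): HALF}


def x_next(v):
    """x'(v) := exists v1 . x(v1) and n(v1, v)"""
    return exists(nodes, lambda v1: k_and(x.get(v1, ZERO),
                                          n.get((v1, v), ZERO)))


precondition = exists(nodes, lambda v: x.get(v, ZERO))   # exists v . x(v)
new_x = {v: x_next(v) for v in nodes}
```

Under these assumptions the precondition evaluates to 1, while x'(u) evaluates to ½. An indefinite value like this is what the focus step is meant to eliminate, by splitting S1 into cases such as S2.0, S2.1 and S2.2.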

Overview and Main Results
Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
  Implementation techniques
Empirical evaluation
  Comparison of different representations
  Space is reduced by a factor of 4–10
  New representations scale better

Base Representation (Tal Lev-Ami SAS 2000)
Two-level map: Predicate → (Node Tuple → Kleene)
Sparse representation
Limited inherited sharing by "copy-on-write"
This is the reference against which we compare the new representations (improvements). The maps are implemented with hash tables.
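A two-level map with copy-on-write can be sketched as follows. This is not the TVLA 0.91 code: the class and method names are invented for the illustration, and the granularity of sharing (one inner hash table per predicate, copied on the first write after a `copy`) is an assumption.

```python
from fractions import Fraction

ZERO = Fraction(0)


class TwoLevelStructure:
    """Predicate -> (node tuple -> Kleene), with per-predicate copy-on-write."""

    def __init__(self, tables=None):
        self._tables = dict(tables) if tables else {}
        self._owned = set()      # predicates whose inner map is private

    def copy(self):
        # Only the outer map is duplicated; inner maps are shared.
        clone = TwoLevelStructure(self._tables)
        self._owned.clear()      # both sides must now copy before writing
        return clone

    def eval(self, pred, nodes):
        return self._tables.get(pred, {}).get(tuple(nodes), ZERO)

    def update(self, pred, nodes, value):
        if pred not in self._owned:
            # First write since the last copy: take a private inner map.
            self._tables[pred] = dict(self._tables.get(pred, {}))
            self._owned.add(pred)
        table = self._tables[pred]
        if value == ZERO:
            table.pop(tuple(nodes), None)    # keep the map sparse
        else:
            table[tuple(nodes)] = value

    def shares_table(self, other, pred):
        return (pred in self._tables
                and self._tables[pred] is other._tables.get(pred))


s0 = TwoLevelStructure()
s0.update("y", ("u1",), Fraction(1))
s0.update("n", ("u1", "u"), Fraction(1, 2))
s1 = s0.copy()
s1.update("x", ("u1",), Fraction(1))     # only the x table becomes private
```

After these calls S1 still shares the `y` and `n` tables with S0. The sharing is "limited" in the sense that a single write to a predicate copies that predicate's whole table.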

BDDs in a Nutshell (Bryant 86)
Ordered Binary Decision Diagrams
Data structure for Boolean functions
Functions are represented as (unique) DAGs
[Figure: diagrams of a function f over the variables x1, x2, x3]

BDDs in a Nutshell (Bryant 86)
Also achieve sharing across functions
Every sub-function has a single copy in the BDD system.
[Figure: diagrams over x1, x2, x3 illustrating the three reductions]
  Duplicate terminals
  Duplicate nonterminals
  Redundant tests
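The three reductions (duplicate terminals, duplicate nonterminals, redundant tests) are commonly implemented with a unique table that hands out one node per distinct (variable, low, high) triple. The following is a minimal sketch of that idea, not a real BDD package: the class and its methods are hypothetical, and `build` enumerates all assignments, so it is only suitable for tiny examples.

```python
class BDD:
    """Minimal reduced ordered BDD manager, for illustration only."""

    ZERO, ONE = 0, 1                     # the two terminals, stored once

    def __init__(self):
        self._unique = {}                # (var, low, high) -> node id
        self._nodes = [None, None]       # ids 0 and 1 are the terminals

    def mk(self, var, low, high):
        if low == high:                  # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self._unique:      # duplicate nonterminal: reuse it
            self._unique[key] = len(self._nodes)
            self._nodes.append(key)
        return self._unique[key]

    def build(self, f, num_vars, var=0, assignment=()):
        """Build the BDD of a Python predicate by Shannon expansion."""
        if var == num_vars:
            return self.ONE if f(*assignment) else self.ZERO
        low = self.build(f, num_vars, var + 1, assignment + (False,))
        high = self.build(f, num_vars, var + 1, assignment + (True,))
        return self.mk(var, low, high)

    def size(self):
        return len(self._unique)         # number of nonterminal nodes


bdd = BDD()
f = bdd.build(lambda x1, x2, x3: (x1 and x2) or x3, 3)
g = bdd.build(lambda x1, x2, x3: x3, 3)
f_again = bdd.build(lambda x1, x2, x3: x3 or (x2 and x1), 3)
```

Because nodes are unique, two equivalent functions get the same node id (`f == f_again`), and the node for `g` is reused inside `f` rather than stored a second time.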

Encoding Structures Using Integers
Static encoding of
  Predicates
  Kleene values
Dynamic encoding of nodes
  0, 1, …, n-1
Encode predicate p's values as
  ep(p) . en(u1) . en(u2) . … . en(un) . ek(Kleene)
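The concatenation ep(p) . en(u1) . … . en(un) . ek(Kleene) can be illustrated with fixed-width bit fields. Everything concrete below is an assumption made for the example: the predicate codes, the 2-bit Kleene code, the field widths, and the function names. The slides do not specify them.

```python
# Static codes (assumed): four predicates in 2 bits, Kleene values in 2 bits.
PRED_CODE = {"x": 0, "y": 1, "sm": 2, "n": 3}
PRED_BITS = 2
KLEENE_CODE = {"0": 0, "1/2": 1, "1": 2}
KLEENE_BITS = 2


def to_bits(value, width):
    """Fixed-width binary string for a non-negative integer."""
    if value >= 2 ** width:
        raise ValueError("value does not fit in the field")
    return format(value, "0{}b".format(width))


def encode(pred, nodes, kleene, node_ids, node_bits):
    """ep(p) . en(u1) . ... . en(un) . ek(kleene) as one bit string."""
    fields = [to_bits(PRED_CODE[pred], PRED_BITS)]
    fields += [to_bits(node_ids[u], node_bits) for u in nodes]
    fields.append(to_bits(KLEENE_CODE[kleene], KLEENE_BITS))
    return "".join(fields)


# Dynamic node encoding: nodes are numbered 0, 1, ..., n-1 as they appear.
node_ids = {"u1": 0, "u": 1}
edge_code = encode("n", ("u1", "u"), "1/2", node_ids, node_bits=1)
y_code = encode("y", ("u1",), "1", node_ids, node_bits=1)
```

Each bit of such a string would correspond to one Boolean variable, so a structure becomes a set of bit strings, that is, a set of integers.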

BDD Representation of Integer Sets
Characteristic function
S = {1, 5}
  1 = <001>, 5 = <101>
  S = (¬x1 ∧ ¬x2 ∧ x3) ∨ (x1 ∧ ¬x2 ∧ x3)
[Figure: BDD for S over the variables x1, x2, x3]
Only 3 significant bits are needed for the binary encoding of the members of the set, and therefore the BDD representation requires three Boolean variables.
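As a quick check of the characteristic function for S = {1, 5}, the formula can be evaluated on all eight 3-bit values. The function names are made up for this sketch, and x1 is taken to be the most significant bit, as in the encodings <001> and <101>.

```python
def in_s(x1, x2, x3):
    """Characteristic function of S = {1, 5} over three bits."""
    return (not x1 and not x2 and x3) or (x1 and not x2 and x3)


def bits(value):
    """(x1, x2, x3) for a value in 0..7, with x1 the most significant bit."""
    return tuple(bool((value >> shift) & 1) for shift in (2, 1, 0))


members = {value for value in range(8) if in_s(*bits(value))}
```

Here `members` comes out as the set containing 1 and 5. The two disjuncts differ only in x1, so the formula is logically equivalent to ¬x2 ∧ x3.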

BDD Representation of Integer Sets
[Figure: BDD for S = {1, 5} over the variables x1, x2, x3]
The edges connecting nodes to the 0 terminal will be removed in the following diagrams to avoid clutter.

BDD Representation Example
[Figure: structure S0 (u1: y=1; u: sm=½; edges labeled n=½) and its BDD]
This demonstrates the exploitation of the predicates' sparsity.

[Figure: structures S0 and S1 (after x = y) and their BDDs]
This figure demonstrates the inherited type of sharing.

[Figure: structures S0, S1 (after x = y) and S2.2 (after x = x->n) and their BDDs]
There are two new BDD variables because of materialization (three nodes in S2.2).

[Figure: structures S0, S1 (after x = y) and S2.2 (after x = x->n) and their BDDs]
This figure demonstrates how all three structures share the predicate values that correspond to the tail of the linked list.

Improved BDD Representation
Using this representation directly doesn't save space
Observation
  Node names can be arbitrarily remapped without affecting the ADT semantics
Our heuristics
  Use canonic node names to encode nodes
  Increases incidental sharing
  Reduces isomorphism test to pointer comparison
Space is reduced by a factor of 4–10
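The effect of canonic node names can be sketched as follows. The slides do not say how canonic names are computed, so this example makes an assumption: a node's canonic name is its vector of unary predicate values, and every node in a structure has a distinct vector. The function names and the interning table are also hypothetical.

```python
from fractions import Fraction

ZERO, HALF, ONE = Fraction(0), Fraction(1, 2), Fraction(1)


def canonical_form(nodes, unary, values):
    """Renumber nodes by their unary-predicate vectors (assumed canonic name)."""
    def name(u):
        return tuple(values.get((p, (u,)), ZERO) for p in unary)

    ordered = sorted(nodes, key=name)
    names = [name(u) for u in ordered]
    if len(set(names)) != len(names):
        raise ValueError("this sketch needs distinct canonic names")
    index = {u: i for i, u in enumerate(ordered)}
    return frozenset((p, tuple(index[u] for u in args), k)
                     for (p, args), k in values.items())


_interned = {}


def intern_structure(form):
    """Return the single shared copy of a canonical form."""
    return _interned.setdefault(form, form)


UNARY = ["x", "y", "sm"]

# Two isomorphic structures whose nodes were created under different names.
a = intern_structure(canonical_form(
    ["a", "b"], UNARY,
    {("y", ("a",)): ONE, ("sm", ("b",)): HALF, ("n", ("a", "b")): HALF}))
b = intern_structure(canonical_form(
    ["p", "q"], UNARY,
    {("y", ("q",)): ONE, ("sm", ("p",)): HALF, ("n", ("q", "p")): HALF}))
```

Under these assumptions the two structures reduce to the same interned object, so `a is b` holds and the isomorphism test is a pointer comparison. Structures that arise independently but agree on most values also end up with identical encodings, which is the incidental sharing mentioned above.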

Reducing Time Overhead
Current implementation not optimized
  Expensive formula evaluation
Hybrid representation
  Distinguish between phases: mutable phase → Join → immutable phase
  Dynamically switch representations
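One way to picture the hybrid representation is a structure that keeps a plain hash map while it is being updated and switches to a shared immutable form when it reaches the join. The sketch below is an assumption about how such switching could look, not the implementation described in the talk; the class, its pool of frozen forms, and the method names are hypothetical.

```python
from fractions import Fraction

ZERO = Fraction(0)


class HybridStructure:
    _pool = {}                       # frozen contents -> single shared copy

    def __init__(self):
        self._mutable = {}           # fast to update during the mutable phase
        self._frozen = None          # compact shared form after the join

    def update(self, pred, nodes, value):
        if self._frozen is not None:
            # Switch back to the mutable representation before writing.
            self._mutable = dict(self._frozen)
            self._frozen = None
        self._mutable[(pred, tuple(nodes))] = value

    def eval(self, pred, nodes):
        table = self._mutable if self._frozen is None else dict(self._frozen)
        return table.get((pred, tuple(nodes)), ZERO)

    def freeze(self):
        """Called at the join: move to the immutable, shared representation."""
        if self._frozen is None:
            items = frozenset(self._mutable.items())
            self._frozen = HybridStructure._pool.setdefault(items, items)
            self._mutable = None
        return self._frozen


a, b = HybridStructure(), HybridStructure()
for s in (a, b):
    s.update("y", ("u1",), Fraction(1))
    s.update("n", ("u1", "u"), Fraction(1, 2))
same_after_join = a.freeze() is b.freeze()
```

In this sketch, evaluating on the frozen form rebuilds a dictionary on every call. That is deliberately naive, and it mirrors the trade-off on the slide: the compact form is cheap to store and compare but expensive to evaluate against.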

Functional Representation
Alternative representation for first-order structures
  Structures represented by maps from integers to Kleene values
  Tailored for representing first-order structures
  Achieves better results than BDDs
Techniques similar to the BDD representation
More details in the paper

Empirical Evaluation
Benchmarks
  Cleanness Analysis (SAS 2000)
  Garbage Collector
  CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks
  Mobile Ambients (ESOP 2000)
Stress testing the representations
  We use "relational analysis"
  Save structures in every CFG location
CA: a C program implementing the instruction selection phase of the Tiger educational compiler.
GC: partial verification of the mark phase of a mark-and-sweep garbage collector.
JFE + Kernel: instances of the Concurrent Modification Problem (CMP). CMP requires identifying a specific type of misuse of Java Collection Classes.
MA: verifies certain safety properties of a packet router in the mobile ambient calculus.
We don't choose the best TVLA options for the analyses, because we're interested in testing the data structures in isolation.

Space Results

Abstract Counters
Ignore language/implementation details
A more reliable measurement technique
Count only crucial space information
Independent of C/Java

Abstract Counters Results

Trends in the Cleanness Analysis Benchmark

What’s Missing from this Work?
Investigate other node mapping heuristics
Compactly represent sets of structures
Time optimizations
Since the representation of a structure takes only a small number of nodes (2-3 in some benchmarks), there is not much room left for improving individual structures, which is why we're considering representing sets of structures.

Conclusions
Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
  Implementation techniques
  Normalization techniques are crucial
Empirical evaluation
  Comparison of different representations
  Space is reduced by a factor of 4–10
  New representations scale better

Conclusions
The use of BDDs for static analysis is not a panacea for space saving
  Domain-specific encoding is crucial for saving space
Failed attempts
  Original implementation of Veith's encoding
  PAG

The End
