Published by Madison Carr. Modified over 6 years ago.
1
Compactly Representing First-Order Structures for Static Analysis
Roman Manevich, Mooly Sagiv (Tel-Aviv University)
Ganesan Ramalingam, John Field, Deepak Goyal (IBM T.J. Watson)
2
Motivation
TVLA is a powerful and general abstract interpretation system.
Abstract interpretation in TVLA:
  Operational semantics is expressed with first-order logic formulae.
  Program states are represented as sets of evolving first-order structures.
TVLA can do shape analysis but is not restricted to it.
Space is a major bottleneck.
3
Desired Properties
Sparse data structures
Share common sub-structures
  Inherited sharing
  Incidental sharing due to program invariants
But feasible time performance
  Phase-sensitive data structures
These are properties that could be exploited to improve performance.
4
Outline
Background
First-order structure representations
  Base representation (TVLA 0.91)
  BDD representation
Empirical evaluation
Conclusion
5
First-Order Logical Structures
Generalize shape graphs
  Arbitrary set of individuals
  Arbitrary set of predicates on individuals
Dynamically evolving
  Usually small changes
Properties are extracted by evaluating first-order formulae, for example: ∃v1, v: x(v1) ∧ n(v1, v)
Join operator requires isomorphism testing
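To make the formula evaluation above concrete, here is a minimal sketch of three-valued (Kleene) evaluation, where conjunction is `min`, disjunction is `max`, and the existential quantifier takes the maximum over all individuals. The dictionary layout, the helper names (`k_and`, `exists`, `x_next`) and the sample structure are assumptions made for illustration; they are not TVLA's implementation.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the indefinite Kleene value 1/2


def k_and(a, b):
    """Kleene conjunction: the minimum of the two values."""
    return min(a, b)


def k_or(a, b):
    """Kleene disjunction: the maximum of the two values."""
    return max(a, b)


def k_not(a):
    """Kleene negation: 0 <-> 1, and 1/2 stays 1/2."""
    return 1 - a


def exists(nodes, body):
    """Three-valued existential quantifier: max of body(u) over all individuals."""
    return max((body(u) for u in nodes), default=0)


# Hypothetical structure: x points to u1, and u stands for the rest of the list.
nodes = ["u1", "u"]
x = {"u1": 1}                               # unary predicate; absent entries are 0
n = {("u1", "u"): HALF, ("u", "u"): HALF}   # binary predicate; absent entries are 0


def x_next(v):
    """Value at v of the formula: exists v1. x(v1) and n(v1, v)."""
    return exists(nodes, lambda v1: k_and(x.get(v1, 0), n.get((v1, v), 0)))


print(x_next("u"))   # 1/2
print(x_next("u1"))  # 0
```

The result 1/2 at `u` says that `u` may or may not be the successor of the node pointed to by `x`, which is the kind of indefinite answer that a focus step later resolves.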
6
First-Order Structure ADT
Structure : new() /* empty structure */
SetOfNodes : nodeSet(Structure)
Node : newNode(Structure)
removeNode(Structure, node)
Kleene : eval(Structure, p(r), <u1, ..., ur>)
update(Structure, p(r), <u1, ..., ur>, Kleene)
Structure : copy(Structure)
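The ADT above could be sketched as a small Python class. This is an illustrative toy under stated assumptions: the method names are Python-style renderings of the slide's operations, predicate values are stored sparsely with absent entries meaning 0, and `copy` is a plain deep copy with no sharing at all. It does not reflect how TVLA stores structures.

```python
import copy as _copy
from fractions import Fraction

HALF = Fraction(1, 2)  # the indefinite Kleene value 1/2


class Structure:
    """Toy first-order structure; absent predicate entries are read as 0."""

    def __init__(self):
        self._nodes = set()
        self._next_id = 0
        self._preds = {}  # predicate name -> {node tuple -> Kleene value}

    def node_set(self):
        return frozenset(self._nodes)

    def new_node(self):
        node = self._next_id
        self._next_id += 1
        self._nodes.add(node)
        return node

    def remove_node(self, node):
        self._nodes.discard(node)
        for table in self._preds.values():
            for tup in [t for t in table if node in t]:
                del table[tup]

    def eval(self, pred, nodes):
        return self._preds.get(pred, {}).get(tuple(nodes), 0)

    def update(self, pred, nodes, value):
        table = self._preds.setdefault(pred, {})
        if value == 0:
            table.pop(tuple(nodes), None)  # keep the map sparse
        else:
            table[tuple(nodes)] = value

    def copy(self):
        return _copy.deepcopy(self)


# Usage mirroring the x = y step of the print_all example (node layout assumed).
s0 = Structure()
u1, u = s0.new_node(), s0.new_node()
s0.update("y", (u1,), 1)
s0.update("sm", (u,), HALF)
s0.update("n", (u1, u), HALF)
s0.update("n", (u, u), HALF)

s1 = s0.copy()
for v in s0.node_set():
    # Update formula x'(v) := y(v)
    s1.update("x", (v,), s0.eval("y", (v,)))
```

After the loop, `s1` differs from `s0` only in the value of `x` at `u1`; everything else was duplicated by the deep copy, which is exactly the waste the compact representations aim to avoid.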
7
print_all Example

/* list.h */
typedef struct node {
    struct node *n;
    int data;
} *L;

/* print.c */
#include <stdio.h>
#include "list.h"

void print_all(L y) {
    L x;
    x = y;
    while (x != NULL) {
        /* assert(x != NULL) */
        printf("elem=%d", x->data);
        x = x->n;
    }
}
8
print_all Example: x = y
Update formula: x'(v) := y(v)
[Figure: structure S0 with nodes u1 (y=1) and u (sm=½) and n=½ edges; structure S1 is the same with x=1 added at u1.]
ADT operations used:
  copy(S0) : S1
  nodeSet(S0) : {u1, u}
  eval(S0, y, u1) : 1
  update(S1, x, u1, 1)
  eval(S0, y, u) : 0
  update(S1, x, u, 0)
There is an opportunity for sharing between S0 and S1, since the only difference between their predicate values is in the interpretation of the predicate x for the node u1.
9
print_all Example: while (x != NULL) and x = x->n
while (x != NULL)
  precondition: ∃v x(v)
x = x->n
  focus: ∃v1 x(v1) ∧ n(v1, v)
  update: x'(v) := ∃v1 x(v1) ∧ n(v1, v)
[Figure: structure S1 (u1: x=1, y=1; u: sm=½; n=½ edges) and the resulting structures S2.0 (u1: y=1; u: sm=½; n=½), S2.1 (u1: y=1; u: x=1; n=1 and n=½ edges) and S2.2 (u1: y=1; u.0: sm=½; u.1: x=1; n=1 and n=½ edges).]
There are more opportunities for sharing here, since the right-most nodes, which represent the tail of the linked list, have similar predicate values in most of the structures.
10
Overview and Main Results
Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
  Implementation techniques
Empirical evaluation
  Comparison of different representations
  Space is reduced by a factor of 4–10
  New representations scale better
11
Base Representation (Tal Lev-Ami, SAS 2000)
Two-level map: Predicate → (Node Tuple → Kleene)
Sparse representation
Limited inherited sharing by "copy-on-write"
This is the reference against which we compare the new representations. The maps are implemented with hash tables.
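A two-level map with copy-on-write could look roughly like the sketch below. The granularity chosen here (one shared inner map per predicate, duplicated on the first write after a copy) and the class name `CowStructure` are assumptions for illustration; the slides do not say at which level the base representation shares its maps.

```python
class CowStructure:
    """Two-level map: predicate -> (node tuple -> Kleene value), copy-on-write."""

    def __init__(self, preds=None):
        # Shallow copy of the outer map: the inner maps stay shared.
        self._preds = dict(preds) if preds else {}
        # Predicates whose inner map is private and may be mutated in place.
        self._owned = set()

    def eval(self, pred, nodes):
        return self._preds.get(pred, {}).get(tuple(nodes), 0)

    def update(self, pred, nodes, value):
        if pred not in self._owned:
            # First write since the last copy: duplicate only this inner map.
            self._preds[pred] = dict(self._preds.get(pred, {}))
            self._owned.add(pred)
        table = self._preds[pred]
        if value == 0:
            table.pop(tuple(nodes), None)  # sparse: 0 entries are not stored
        else:
            table[tuple(nodes)] = value

    def copy(self):
        # After a copy neither structure owns the inner maps any more.
        self._owned.clear()
        return CowStructure(self._preds)


s0 = CowStructure()
s0.update("y", ("u1",), 1)
s0.update("n", ("u1", "u"), 0.5)   # 0.5 stands for the Kleene value 1/2

s1 = s0.copy()
s1.update("x", ("u1",), 1)          # x = y touches only predicate x

shared = s1._preds["n"] is s0._preds["n"]
print(shared)  # True
```

Only predicates that are written after the copy get duplicated, which is why the sharing is limited: a single changed entry still costs a copy of that predicate's whole map.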
12
BDDs in a Nutshell (Bryant 86)
Ordered Binary Decision Diagrams
Data structure for Boolean functions
Functions are represented as (unique) DAGs
[Figure: a Boolean function f over the variables x1, x2, x3 drawn as a decision diagram.]
13
BDDs in a Nutshell (Bryant 86)
Ordered Binary Decision Diagrams
Data structure for Boolean functions
Functions are represented as (unique) DAGs
Also achieve sharing across functions
[Figure: decision diagrams over x1, x2, x3 illustrating the reduction steps: duplicate terminals, duplicate nonterminals, redundant tests.]
Every sub-function has a single copy in the BDD system.
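The uniqueness and sharing properties above follow from two rules applied whenever a node is created: skip redundant tests, and look nodes up in a unique table before allocating them. The sketch below is a deliberately tiny manager written for illustration (the class `BDD` and its methods are hypothetical, and `build` enumerates all assignments, so it is exponential); it is not the BDD package used in the work.

```python
class BDD:
    """Minimal reduced ordered BDD manager with a unique table (hash-consing)."""

    def __init__(self):
        self._unique = {}                  # (var, low, high) -> node id
        self._nodes = {0: None, 1: None}   # ids 0 and 1 are the terminals

    def mk(self, var, low, high):
        if low == high:
            # Redundant test: both branches agree, so no node is needed.
            return low
        key = (var, low, high)
        node = self._unique.get(key)
        if node is None:
            # No duplicate nonterminals: allocate only if the triple is new.
            node = len(self._nodes)
            self._nodes[node] = key
            self._unique[key] = node
        return node

    def build(self, f, num_vars, var=1, env=()):
        """Shannon-expand f over x1..x_num_vars (illustration only)."""
        if var > num_vars:
            return 1 if f(*env) else 0
        low = self.build(f, num_vars, var + 1, env + (False,))
        high = self.build(f, num_vars, var + 1, env + (True,))
        return self.mk(var, low, high)

    def size(self):
        """Number of nonterminal nodes in the whole system."""
        return len(self._unique)


bdd = BDD()
f = bdd.build(lambda x1, x2, x3: x3, 3)          # f = x3
g = bdd.build(lambda x1, x2, x3: x1 and x3, 3)   # g = x1 AND x3

print(bdd._nodes[g] == (1, 0, f))  # True: g's high branch is the node of f
print(bdd.size())                  # 2
```

Because `g` reuses the node of `f` instead of rebuilding it, the two functions together need two nonterminal nodes, and testing whether two functions are equal reduces to comparing node ids.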
14
Encoding Structures Using Integers
Static encoding of
  Predicates
  Kleene values
Dynamic encoding of nodes
  0, 1, …, n-1
Encode predicate p's values as
  ep(p) . en(u1) . en(u2) . … . en(un) . ek(Kleene)
where ep, en and ek are the encodings of predicates, nodes and Kleene values.
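As a rough illustration of the concatenated encoding above, the sketch below builds the bit string for one predicate entry. The code tables, the bit widths and the particular two-bit Kleene code are all hypothetical; the slides only state that predicates and Kleene values are encoded statically and nodes dynamically.

```python
def to_bits(value, width):
    """Fixed-width binary string for a non-negative integer."""
    return format(value, "0{}b".format(width))


# Static encodings (hypothetical values, fixed before the analysis starts).
PRED_CODE = {"x": 0, "y": 1, "sm": 2, "n": 3}
KLEENE_CODE = {"0": 0, "1/2": 1, "1": 2}


def encode(pred, nodes, kleene, node_ids, pred_bits=2, node_bits=2, kleene_bits=2):
    """Concatenate ep(p) . en(u1) . ... . en(ur) . ek(kleene) as dotted bit fields."""
    parts = [to_bits(PRED_CODE[pred], pred_bits)]
    parts += [to_bits(node_ids[u], node_bits) for u in nodes]
    parts.append(to_bits(KLEENE_CODE[kleene], kleene_bits))
    return ".".join(parts)


# Dynamic encoding: nodes are numbered 0, 1, ... as they are created.
node_ids = {"u1": 0, "u": 1}

print(encode("n", ("u1", "u"), "1/2", node_ids))  # 11.00.01.01
print(encode("y", ("u1",), "1", node_ids))        # 01.00.10
```

A structure then becomes a set of such bit strings, which is what allows it to be stored as the characteristic function of an integer set.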
15
BDD Representation of Integer Sets
Characteristic function
S = {1, 5}, where 1 = <001> and 5 = <101>
S = (¬x1 ∧ ¬x2 ∧ x3) ∨ (x1 ∧ ¬x2 ∧ x3)
[Figure: BDD for S over the variables x1, x2, x3.]
Only 3 significant bits are needed for the binary encoding of the members of the set, and therefore the BDD representation requires three Boolean variables.
16
BDD Representation of Integer Sets
Characteristic function
S = {1, 5}, where 1 = <001> and 5 = <101>
S = (¬x1 ∧ ¬x2 ∧ x3) ∨ (x1 ∧ ¬x2 ∧ x3)
[Figure: BDD for S over the variables x1, x2, x3.]
The edges connecting nodes to the 0 terminal will be removed in the following diagrams to avoid clutter.
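The characteristic function of S can be checked by brute force over all 3-bit integers. This is a small sanity check written for illustration, with x1 taken as the most significant bit as in the encodings <001> and <101>; the function name `in_S` is an assumption.

```python
def in_S(x1, x2, x3):
    """Characteristic function of S = {1, 5}, with x1 the most significant bit."""
    return (not x1 and not x2 and x3) or (x1 and not x2 and x3)


def bits(i):
    """Split a 3-bit integer into (x1, x2, x3)."""
    return bool(i & 4), bool(i & 2), bool(i & 1)


members = [i for i in range(8) if in_S(*bits(i))]
print(members)  # [1, 5]

# Both disjuncts agree on x2 and x3, so the formula equals (not x2) and x3.
simplified_agrees = all(
    bool(in_S(*bits(i))) == ((not bits(i)[1]) and bits(i)[2]) for i in range(8)
)
print(simplified_agrees)  # True
```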
17
BDD Representation Example
[Figure: structure S0 (u1: y=1; u: sm=½; n=½) and its BDD.]
This demonstrates how the representation exploits the sparsity of the predicates.
18
BDD Representation Example
[Figure: structure S0 (u1: y=1; u: sm=½; n=½) and structure S1, obtained by x = y (u1: x=1, y=1; u: sm=½; n=½), with their BDDs.]
This figure demonstrates the inherited type of sharing.
19
BDD Representation Example
[Figure: structures S0, S1 (after x = y) and S2.2 (after x = x->n; u1: y=1; u.0: sm=½; u.1: x=1; n=1 and n=½ edges), with their BDDs.]
There are two new BDD variables because of materialization (three nodes in S2.2).
20
BDD Representation Example
[Figure: structures S0, S1 (after x = y) and S2.2 (after x = x->n; u1: y=1; u.0: sm=½; u.1: x=1; n=1 and n=½ edges), with their BDDs.]
This figure demonstrates how all three structures share the predicate values that correspond to the tail of the linked list.
21
Improved BDD Representation
Using this representation directly does not save space
Observation
  Node names can be arbitrarily remapped without affecting the ADT semantics
Our heuristics
  Use canonical node names to encode nodes
  Increases incidental sharing
  Reduces the isomorphism test to a pointer comparison
Space is reduced by a factor of 4–10
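One way to picture the canonical-naming idea above: if every node is renamed according to some property that does not depend on its original name, two isomorphic structures end up with identical contents, and an interning table turns the equality test into an identity test. The sketch below orders nodes by their unary predicate values; that particular ordering, the assumption that no two nodes share the same unary values, and the names `canonical_form` and `intern` are illustrative choices, not necessarily the heuristics used in the work.

```python
def canonical_form(nodes, unary, binary):
    """Rename nodes to 0..n-1 in the order of their unary-predicate signature.

    Assumes (for this sketch) that no two nodes have the same signature.
    """
    preds = sorted(unary)

    def signature(u):
        return tuple(unary[p].get(u, 0) for p in preds)

    order = sorted(nodes, key=signature)
    name = {u: i for i, u in enumerate(order)}
    canon_unary = frozenset(
        (p, name[u], val) for p in preds for u, val in unary[p].items()
    )
    canon_binary = frozenset(
        (p, name[a], name[b], val)
        for p, table in binary.items()
        for (a, b), val in table.items()
    )
    return canon_unary, canon_binary


_interned = {}


def intern(form):
    """Return the single stored copy of an equal canonical form."""
    return _interned.setdefault(form, form)


# Two isomorphic structures whose nodes were created under different names.
a = canonical_form(
    ["a", "b"],
    {"y": {"a": 1}, "sm": {"b": 0.5}},
    {"n": {("a", "b"): 0.5, ("b", "b"): 0.5}},
)
b = canonical_form(
    ["p", "q"],
    {"y": {"q": 1}, "sm": {"p": 0.5}},
    {"n": {("q", "p"): 0.5, ("p", "p"): 0.5}},
)

print(a == b)                   # True
print(intern(a) is intern(b))   # True
```

Once structures are stored in canonical form, structures that happen to coincide (incidental sharing) map to the same stored object, and the join operator can compare references instead of searching for an isomorphism.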
22
Reducing Time Overhead
Current implementation not optimized
  Expensive formula evaluation
Hybrid representation
  Distinguish between phases: mutable phase, Join, immutable phase
  Dynamically switch representations
23
Functional Representation
Alternative representation for first-order structures
Structures are represented by maps from integers to Kleene values
Tailored for representing first-order structures
Achieves better results than BDDs
Techniques similar to the BDD representation
More details in the paper
24
Empirical Evaluation
Benchmarks:
  Cleanness Analysis (SAS 2000)
  Garbage Collector
  CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks
  Mobile Ambients (ESOP 2000)
Stress testing the representations
  We use "relational analysis"
  Save structures in every CFG location
CA: a C program implementing the instruction selection phase of the Tiger educational compiler.
GC: partial verification of the mark phase of a mark-and-sweep garbage collector.
JFE + Kernel: instances of the Concurrent Modification Problem (CMP). CMP requires identifying a specific type of misuse of Java Collection Classes.
MA: verifies certain safety properties of a packet router in the mobile ambient calculus.
We do not choose the best TVLA options for the analyses, because we are interested in testing the data structures in isolation.
25
Space Results
26
Abstract Counters
A more reliable measurement technique
Ignore language/implementation details
Count only crucial space information
Independent of C/Java
27
Abstract Counters Results
28
Trends in the Cleanness Analysis Benchmark
29
What's Missing from this Work?
Investigate other node mapping heuristics
Compactly represent sets of structures
Time optimizations
Since the representation of a single structure takes only a small number of nodes (2-3 in some benchmarks), little room for improvement remains there, which is why we are considering representing sets of structures.
30
Conclusions
Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
  Implementation techniques
  Normalization techniques are crucial
Empirical evaluation
  Comparison of different representations
  Space is reduced by a factor of 4–10
  New representations scale better
31
Conclusions
The use of BDDs for static analysis is not a panacea for space saving
Domain-specific encoding is crucial for saving space
Failed attempts
  Original implementation of Veith's encoding
  PAG
32
The End