Set Constraint-Based Program Analysis Manuel Fähndrich CS590 UW Spring 2001

This Lecture
Constraint-based program analysis
  – Set constraint basics
  – Application: closure analysis
  – Relation to typing
Constraint resolution
Extensions
  – Context-sensitivity
  – Control-flow sensitivity

Constraint-Based Analysis
[Figure: Source → Constraint Generator → Constraints → Solver → Solutions → Mapping → Static Info]

Specification vs. Implementation
[Figure: the pipeline of Source, Constraint Generator, Constraints, Solver, Solutions, Mapping, and Static Info, annotated with the labels Problem, Specification, and Implementation]

Example: Andersen's Points-To
    int **x, *y;
    int *z, w;
    if (..) x = &y;
    else x = &z;
    *x = &w;
[Figure: Constraint Generator → Solver → Mapping, yielding a points-to graph over the nodes x, y, z, w]
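
The points-to example above can be worked through with a small inclusion-constraint solver. The sketch below is a naive fixpoint iteration, not an efficient resolution algorithm. The function name `andersen`, its four-way split of statement forms, and the temporary `t` (used to normalize `*x = &w` into `t = &w; *x = t`) are assumptions made for illustration and do not come from the lecture.

```python
from collections import defaultdict

def andersen(address_of, copies, stores, loads):
    """Naive fixpoint solver for Andersen-style inclusion constraints.

    address_of: pairs (p, a) for the statement  p = &a
    copies:     pairs (p, q) for the statement  p = q
    stores:     pairs (p, q) for the statement  *p = q
    loads:      pairs (p, q) for the statement  p = *q
    """
    pts = defaultdict(set)
    for p, a in address_of:
        pts[p].add(a)

    changed = True
    while changed:
        changed = False
        # Each flow (dst, src) stands for the inclusion pts[src] ⊆ pts[dst].
        flows = list(copies)
        for p, q in stores:
            flows += [(target, q) for target in list(pts[p])]
        for p, q in loads:
            flows += [(p, target) for target in list(pts[q])]
        for dst, src in flows:
            new = pts[src] - pts[dst]
            if new:
                pts[dst] |= new
                changed = True
    return {v: s for v, s in pts.items() if s}

# x = &y; x = &z; *x = &w, with the hypothetical temporary t = &w.
result = andersen(
    address_of=[("x", "y"), ("x", "z"), ("t", "w")],
    copies=[],
    stores=[("x", "t")],
    loads=[],
)
```

Under these assumptions, `result` maps `x` to `{y, z}` and both `y` and `z` to `{w}`; both branches of the `if` are merged because the analysis is flow-insensitive.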

Set Constraints
Set expressions
  E ::= X | 0 | E ∪ E | E ∩ E | ¬E | c(E,...,E) | c^-i(E)
Constructors c ∈ C, fixed arity
Constraints ⋀_i L_i ⊆ R_i
Solution σ : X → 2^H (sets of ground terms) such that ⋀_i σ(L_i) ⊆ σ(R_i)

Complexity
Full language: NEXPTIME-complete
Useful polynomial subset: O(n^3)
  – no variable negation
  – restricted intersection and union
  – equivalent to CFL reachability, 2NPDA
In practice?
  – Proportional to explicit solutions

Brief History
1969: Reynolds
1979: Jones and Muchnick
90's
  – Heintze: Set-Based Analysis
  – Complexity results
  – Applications
00's
  – Efficient resolution techniques

Example: Closure Analysis
λ-calculus
Question: which lambdas λx are applied where?
One solution in paper, but we'll do another one.
  – Local specification instead of global
  – First: type inference
e ::= x | λx.e | e1 e2
E ::= X | E1 → E2 | E1 ∪ E2 | E1 ∩ E2 | λx

Quick reminder: Function types
R1 → L1 ⊆ L2 → R2   ⇔   L1 ⊆ R2 ∧ L2 ⊆ R1
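
The function-type rule is a rewrite step: an inclusion between two function types is replaced by two inclusions on their components, contravariant in the domain and covariant in the range. A minimal sketch follows; the class names `Var` and `Fun` and the function `decompose` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fun:
    dom: object  # domain (argument) component
    rng: object  # range (result) component

def decompose(lhs, rhs):
    """Rewrite  lhs ⊆ rhs  between two function types.

    Returns a list of (smaller, larger) pairs, each read as smaller ⊆ larger.
    """
    assert isinstance(lhs, Fun) and isinstance(rhs, Fun)
    return [
        (rhs.dom, lhs.dom),  # domains flip: contravariant
        (lhs.rng, rhs.rng),  # ranges keep their direction: covariant
    ]

# R1 → L1 ⊆ L2 → R2 yields L2 ⊆ R1 and L1 ⊆ R2.
parts = decompose(Fun(Var("R1"), Var("L1")), Fun(Var("L2"), Var("R2")))
```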

Constraint Generation Rules
λx.e      [x] → [e] ⊆ [λx.e]
e1 e2     [e1] ⊆ [e2] → [e1 e2]
Example
twice = λf.λx.f(f(x))
((A → B) ∩ (B → C)) → (A → C)

Twice Example
Set Variables
  [f] = F
  [x] = A
  [f x] = B
  [f(f x)] = C
  [λx.f(f(x))] = R
  [λf.λx.f(f(x))] = T
Constraints
  F → R ⊆ T
  A → C ⊆ R
  F ⊆ B → C
  F ⊆ A → B
Simplify
  R = A → C
  F = (A → B) ∩ (B → C)
  T = F → R
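
The two generation rules can be applied mechanically by walking the term. The sketch below does this for twice, naming each set variable by the printed form of its subterm (so `[f x]` plays the role of B, and so on). The tuple encoding of terms and the helper names `show` and `generate` are assumptions for illustration. The sketch also assumes bound variable names are distinct and treats textually equal subterms as the same set variable.

```python
def show(e):
    """Pretty-print a term encoded as ("var", x), ("lam", x, body) or ("app", f, a)."""
    if e[0] == "var":
        return e[1]
    if e[0] == "lam":
        return f"λ{e[1]}.{show(e[2])}"
    fun, arg = e[1], e[2]
    f = f"({show(fun)})" if fun[0] == "lam" else show(fun)
    a = show(arg) if arg[0] == "var" else f"({show(arg)})"
    return f"{f} {a}"

def generate(e, out=None):
    """Collect (lhs, rhs) pairs, each read as the constraint lhs ⊆ rhs."""
    out = [] if out is None else out
    if e[0] == "lam":
        x, body = e[1], e[2]
        # λx.e :  [x] → [e] ⊆ [λx.e]
        out.append((f"[{x}] → [{show(body)}]", f"[{show(e)}]"))
        generate(body, out)
    elif e[0] == "app":
        fun, arg = e[1], e[2]
        # e1 e2 :  [e1] ⊆ [e2] → [e1 e2]
        out.append((f"[{show(fun)}]", f"[{show(arg)}] → [{show(e)}]"))
        generate(fun, out)
        generate(arg, out)
    return out

f, x = ("var", "f"), ("var", "x")
twice = ("lam", "f", ("lam", "x", ("app", f, ("app", f, x))))
constraints = generate(twice)
```

For twice this produces four constraints, one per lambda and one per application, which correspond to the four constraints on the slide.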

Observation
Types + constraints establish value flow
Think of it as pipes
Can flow information in these pipes
  – E.g. closure analysis
  – Tokens = lambda names λx, λf

twice (λz.z)
[z] = Z
[λz.z] = I
[twice^s (λz.z)] = S
Z → Z ⊆ I
T ⊆ I → S

Closure Analysis using Typing
Inject at lambda abstraction
  λx.e       [x] → [e] ∪ λx ⊆ [λx.e]
Observe at application (site label p)
  e1^p e2    [e1] ⊆ [e2] → [e1 e2] ∪ X_p

Closure Analysis: twice (λz.z)
Set Variables
  [f] = F
  [x] = A
  [f^b x] = B
  [f^c (f^b x)] = C
  [λx.f(f(x))] = R
  [λf.λx.f(f(x))] = T
  [z] = Z
  [λz.z] = I
  [twice^s (λz.z)] = S
Constraints
  F → R ∪ λf ⊆ T
  A → C ∪ λx ⊆ R
  F ⊆ B → C ∪ X_c
  F ⊆ A → B ∪ X_b
  Z → Z ∪ λz ⊆ I
  T ⊆ I → S ∪ X_s
Results
  λf ⊆ X_s
  λz ⊆ X_c
  λz ⊆ X_b
  λx ⊆ S
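
The results for twice (λz.z) can be reproduced with a small solver in which lambda tokens and function types flow forward along inclusion edges until they meet an application's sink. This is a sketch under stated assumptions: every component of a function type is a plain variable, and the class `Solver`, its method names, and the helper `lam` are hypothetical, not taken from the lecture or the paper.

```python
from collections import defaultdict

class Solver:
    """Sources flow along variable edges; new edges appear when a source meets a sink."""

    def __init__(self):
        self.lower = defaultdict(set)  # variable -> sources that reach it
        self.upper = defaultdict(set)  # variable -> sinks attached to it
        self.succ = defaultdict(set)   # variable -> successor variables

    def add_source(self, var, src):
        # src is ("tok", name) or ("fun", dom_var, rng_var); adds src ⊆ var
        if src in self.lower[var]:
            return
        self.lower[var].add(src)
        for sink in list(self.upper[var]):
            self.meet(src, sink)
        for nxt in list(self.succ[var]):
            self.add_source(nxt, src)

    def add_sink(self, var, dom, rng, obs):
        # adds var ⊆ dom → rng ∪ obs
        sink = (dom, rng, obs)
        if sink in self.upper[var]:
            return
        self.upper[var].add(sink)
        for src in list(self.lower[var]):
            self.meet(src, sink)

    def add_edge(self, a, b):
        # adds a ⊆ b between two variables
        if b in self.succ[a]:
            return
        self.succ[a].add(b)
        for src in list(self.lower[a]):
            self.add_source(b, src)

    def meet(self, src, sink):
        dom, rng, obs = sink
        if src[0] == "tok":
            self.add_source(obs, src)    # a token lands in the observer X_p
        else:
            _, src_dom, src_rng = src
            self.add_edge(dom, src_dom)  # contravariant in the domain
            self.add_edge(src_rng, rng)  # covariant in the range

    def tokens(self, var):
        return {s[1] for s in self.lower[var] if s[0] == "tok"}

s = Solver()

def lam(name, dom, rng, result):
    # dom → rng ∪ λname ⊆ result
    s.add_source(result, ("fun", dom, rng))
    s.add_source(result, ("tok", name))

lam("f", "F", "R", "T")
lam("x", "A", "C", "R")
lam("z", "Z", "Z", "I")
s.add_sink("F", "B", "C", "X_c")
s.add_sink("F", "A", "B", "X_b")
s.add_sink("T", "I", "S", "X_s")
```

With these constraints, `s.tokens("X_s")` contains `f`, `s.tokens("X_c")` and `s.tokens("X_b")` contain `z`, and `s.tokens("S")` contains `x`, in line with the Results listed on the slide.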

Closure on Closure Analysis
Purely local formulation
  – In paper: at application (f x)
      λq.e ∈ [f] ⇒ [x] ⊆ [q] ∧ [e] ⊆ [f x]   for all λq.e
    Size of constraints?
    Non-standard resolution
  – In lecture
      [f] ⊆ [x] → [f x]
    Standard resolution

Constraint Resolution
Can be black box
  – But not when figuring out a good encoding
Resolution = Constraint rewriting
  – In practice graph completion

Graph Completion
Sources and sinks are non-variable expressions, e.g. a function (→) node
New edges when sources meet sinks
[Figure: a source → node meeting a sink → node, with edges annotated by ( ) and [ ]]
CFL reachability: [... ] (.. )
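
The link to CFL reachability can be made concrete with matched parentheses: a path counts only if every open label is closed by the matching close label. The sketch below is a naive fixpoint over the grammar S → ε | S S | (_k S )_k, not the cubic worklist algorithm. The function name `dyck_reach`, the edge encoding, and the example graph are assumptions for illustration.

```python
from collections import defaultdict

def dyck_reach(nodes, edges):
    """Return all pairs (u, v) joined by a path whose labels are balanced.

    edges: triples (u, label, v); label is None for a plain edge,
    ("open", k) for an opening parenthesis of kind k, ("close", k) for its match.
    """
    reach = {(n, n) for n in nodes}                        # S → ε
    reach |= {(u, v) for u, lab, v in edges if lab is None}
    opens, closes = defaultdict(set), defaultdict(set)
    for u, lab, v in edges:
        if lab is not None:
            (opens if lab[0] == "open" else closes)[lab[1]].add((u, v))

    while True:
        new = set()
        # S → S S
        for a, b in reach:
            for c, d in reach:
                if b == c:
                    new.add((a, d))
        # S → (_k S )_k
        for k, open_edges in opens.items():
            for u, a in open_edges:
                for b, v in closes[k]:
                    if (a, b) in reach:
                        new.add((u, v))
        if new <= reach:
            return reach
        reach |= new

# a -(1-> b --> c -)1-> d, and a mismatched exit c -)2-> e
pairs = dyck_reach(
    nodes=["a", "b", "c", "d", "e"],
    edges=[
        ("a", ("open", 1), "b"),
        ("b", None, "c"),
        ("c", ("close", 1), "d"),
        ("c", ("close", 2), "e"),
    ],
)
```

Here `("a", "d")` is in `pairs` because the parentheses match, while `("a", "e")` is not, since the path opens with kind 1 and closes with kind 2.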

Extensions: Context Sensitivity
CFL reachability again
  – Interleaved CFL is undecidable
  – Reduce to single CFL
      perfectly nested CFL
      non-recursive case:
        – expand call problem away by inlining
        – expand data problem away by flattening
Other approximations?

Extensions: Control-flow Sensitivity
Local state
  – SSA form
Heap state
  – ???
  – Need symbolic methods

Conclusions
Specification vs. implementation
  – Separation
  – Reuse
  – Non-obvious algorithms
Set constraints are one hammer
  – if it fits, great
  – otherwise, don't bother, but be sure
It's all graphs and reachability