Published by Molly Osborne. Modified over 9 years ago.
1
Polymorphic Type-Based Flow Analysis
Jakob Rehof
Microsoft Research, Redmond, WA, USA
2
Type-Based Program Analysis
Common vocabulary / Type-based
–Data access paths / Type structure ( )
–Function summary / Function type (->)
–Context-sensitivity / Type instantiation, polymorphism (∀)
–Directional flow / Subtyping (≤)
3
Type-Based Program Analysis
–Type inference is a flow engine
–Works directly on higher-order programs (->)
–Takes advantage of the type discipline of the language analyzed
–Gives different complexity tradeoffs (n = size of typed program)
4
Scalable Program Analysis Based on Type Inference
+CS -DI (∀,=); -CS +DI (=,≤); +CS +DI (∀,≤); -CS -DI (=,=)
[RF, POPL 01] [Das, PLDI 00] [FRD, PLDI 00]
research.microsoft.com/spa
5
Outline: Polymorphic Type-Based Analysis (W); New Method Based on Instantiation Constraints; W-Based Flow Analysis w. ∀ + ≤; New Method Based on Instantiation Constraints; Summary
6
Polymorphic Type-Based Analysis via Algorithm W
–Strictness analysis [Kuo&Mishra 89]
–Region inference [Tofte&Talpin 94]
–Binding time analysis [Henglein&Mossin 94]
–Flow analysis [Mossin 96], [Heintze&McAllester 97]
7
Flow Analysis … (*fp)(); … x = malloc(sizeof(S)); … (*y) = 5; … f(p) { … } g(q) { … }
8
Polymorphic Type Inference (W)
fst(x,y) = x;
let fst = lambda(x,y). x in fst(1,2) … fst(true,false) end;
fst: ∀AB.(A * B) -> A
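The instantiation step that W performs at each use of fst can be sketched as follows. This is only an illustration: the tuple encoding of types and the names `instantiate`, `fst_scheme`, `use1` and `use2` are assumptions made here, not part of the slides.

```python
import itertools

_fresh = itertools.count()

def instantiate(quantified, body):
    """Replace each quantified variable in `body` with a fresh variable.

    Types are nested tuples such as ("->", ("*", "A", "B"), "A");
    strings are type variables or base types.
    """
    subst = {v: f"t{next(_fresh)}" for v in quantified}

    def walk(t):
        if isinstance(t, str):
            return subst.get(t, t)
        return (t[0],) + tuple(walk(arg) for arg in t[1:])

    return walk(body)

# fst : forall A B. (A * B) -> A, copied once per use.
fst_scheme = (["A", "B"], ("->", ("*", "A", "B"), "A"))
use1 = instantiate(*fst_scheme)  # e.g. the use fst(1,2)
use2 = instantiate(*fst_scheme)  # e.g. the use fst(true,false)
```

Each use gets its own copy of the type, so the two uses cannot constrain each other.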
9
Polymorphic Type-Based Flow Analysis fst(x:real,y:real) = x; standard type fst: real * real -> real
10
Polymorphic Type-Based Flow Analysis
fst(x:real:a,y:real:b) = x;
analysis type: ∀ab. real:a * real:b -> real:a (a and b are flow labels)
11
Polymorphic Type-Based Flow Analysis max(s:real,t:real) = if s<=t then t else s max: real * real -> real standard type
12
Polymorphic Type-Based Flow Analysis
max(s:real:a,t:real:a) = (if s<=t then t else s) :a
analysis type: ∀a. real:a * real:a -> real:a (a is a flow label)
13
Polymorphic Type-Based Flow Analysis
max(s:a,t:a) : real:a * real:a -> real:a
max(x0:b,y0:b):b uses the instance real:b * real:b -> real:b
max(x1:c,y1:c):c uses the instance real:c * real:c -> real:c
14
Shortcomings of W
–Modularity problem (def < use)
–Flow summarization problem
–Not directly demand-driven
–Does not scale when combined with subtyping (directional flow)
15
Outline: Polymorphic Type-Based Analysis (W); New Method Based on Instantiation Constraints; W-Based Flow Analysis w. ∀ + ≤; New Method Based on Instantiation Constraints; Summary
16
Flow Analysis Overview
[Figure: Source Code, Polymorphic Type Inference (equivalence class based), Type Instantiation Graph, and a Flow Graph with nodes A and B.]
On-demand queries O(|G|).
17
Flow Analysis Overview
Constraint extraction:
–C(e) = { T < T’, T = T’ }
Constraint resolution:
–from C(e) to Cnf
Query on type instantiation graph:
–Cnf |- b ----> p ?
18
Constraint Extraction Phase
[ def max ] <i [ x0 ] * [ y0 ] -> [ max(x0,y0) ]
[ def max ] <j [ x1 ] * [ y1 ] -> [ max(x1,y1) ]
[ def max ] = real:a*real:a->real:a
19
Constraint Resolution Phase
[ real:a ] <i [ x0 ]
[ real:a ] <i [ y0 ]
[ real:a ] <i [ max(x0,y0) ]
[ x0 ] = [ real:b ]
[ y0 ] = [ real:b ]
[ max(x0,y0) ] = [ real:b ]
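One way to picture how a structural instantiation constraint such as [ def max ] <i [ x0 ] * [ y0 ] -> [ max(x0,y0) ] breaks down into the three leaf constraints is the sketch below. The tuple encoding of types and the function name `decompose` are assumptions for illustration only.

```python
def decompose(generic, instance, site):
    """Split an instantiation constraint `generic <site instance`
    into constraints between matching leaves of the two types.

    Types are strings (leaves) or tuples (constructor, child, ...).
    """
    if isinstance(generic, str) or isinstance(instance, str):
        return [(generic, site, instance)]
    # Both sides must be built with the same constructor and arity.
    assert generic[0] == instance[0] and len(generic) == len(instance)
    out = []
    for g, t in zip(generic[1:], instance[1:]):
        out.extend(decompose(g, t, site))
    return out

def_max = ("->", ("*", "real:a", "real:a"), "real:a")
call_i = ("->", ("*", "[x0]", "[y0]"), "[max(x0,y0)]")
leaf_constraints = decompose(def_max, call_i, "i")
# [('real:a', 'i', '[x0]'), ('real:a', 'i', '[y0]'),
#  ('real:a', 'i', '[max(x0,y0)]')]
```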
20
Query Phase?
Type Graph:
–Node equivalence classes
–Instantiation edges
Query = +- graph reachability
[Figure: flow graph with nodes A and B.]
21
Another View of Instantiation
[Figure: real:a*real:a->real:a related to real:b*real:b->real:b and real:c*real:c->real:c by substitutions S1 and S2.]
22
Type Graphs
[Figure: type graph of a * a -> a, with nodes ->, * and label a.]
23
Instantiation Constraints, T.I.G.
[Figure: type instantiation graph over nodes ->, * and labels a, b, c.]
24
Flow Interpretation?
[Figure: type instantiation graph over nodes ->, * and labels a, b, c.]
25
Type Theory to the Rescue!
Polarity (+,-)
[Figure: a -> type tree with each position marked + or -.]
26
Polarized Constraints
[Figure: type instantiation graph over labels a, b, c with each instantiation edge marked + or -.]
27
Reverse Negative Edges!
[Figure: type instantiation graph over labels a, b, c with edges marked + or -, the - edges reversed.]
28
Insight: +- paths
–Every individual flow path is a PN-path (a +- path)
–Phases are present locally, on-demand
–Furthermore: these paths are valid in the higher-order case
–Flow by regular reachability, linear time
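Reachability restricted to PN-shaped paths can be sketched as a search over (node, phase) pairs. The sketch assumes the path shape P* N* stated on the later ML-polymorphic flow slide, and the edge-list encoding and the name `pn_reachable` are hypothetical.

```python
from collections import deque

def pn_reachable(edges, source):
    """Nodes reachable from `source` along paths matching P* N*.

    `edges` is a list of (src, kind, dst) with kind "P" or "N".
    A search state is (node, phase); once an N edge has been
    taken, no further P edge may follow.
    """
    succ = {}
    for src, kind, dst in edges:
        succ.setdefault(src, []).append((kind, dst))
    seen = {(source, "P")}
    queue = deque(seen)
    while queue:
        node, phase = queue.popleft()
        for kind, dst in succ.get(node, []):
            if phase == "N" and kind == "P":
                continue  # P after N would break the P* N* shape
            state = (dst, kind)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return {node for node, _ in seen}

example_edges = [("b", "P", "a"), ("a", "N", "c"), ("c", "P", "d")]
reached = pn_reachable(example_edges, "b")  # d is excluded: P follows N
```

Each (node, phase) state is visited at most once, so the search is linear in the size of the graph, in line with the regular-reachability claim.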
29
Contrast w. Common 2-Phase Approach
–Callee -> Callers, Callers -> Callee
–Unsatisfactory treatment of higher-order programs: ad-hoc or initial call graph
–Phase 1 is a transitive closure step and may produce a result of size n^2
–No straightforward on-demand queries
30
Outline: Polymorphic Type-Based Analysis (W); New Method Based on Instantiation Constraints; W-Based Flow Analysis w. ∀ + ≤; New Method Based on Instantiation Constraints; Summary
31
W-Based Flow Analysis w. ∀ + ≤ (Mossin)
max(s,t) = if s<=t then t else s
standard type: real * real -> real
32
W-Based Flow Analysis w. ∀ + ≤
max(s:a,t:b) = (if s<=t then t else s) :c
analysis type: {a ≤ c, b ≤ c} => real:a * real:b -> real:c
(a, b, c are flow labels; a ≤ c and b ≤ c are subtyping constraints)
33
W-Based Flow Analysis w. ∀ + ≤
max(s:a,t:b) = (if s<=t then t else s) :c
{a ≤ c, b ≤ c} => real:a * real:b -> real:c
34
W-Based Flow Analysis w. ∀ + ≤
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0,y0)
max(x1,y1)
35
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0:a0,y0:b0):c0 with {a0 ≤ c0, b0 ≤ c0} => c0
max(x1:a1,y1:b1):c1
36
W-Based Flow Analysis w. ∀ + ≤
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0:a0,y0:b0):c0 with {a0 ≤ c0, b0 ≤ c0} => c0
max(x1:a1,y1:b1):c1 with {a1 ≤ c1, b1 ≤ c1} => c1
37
W-Based Method
–(∀) Polymorphism by copying types
–(≤) Subtyping by constrained types
–(∀ + ≤) constraint copying
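Constraint copying can be made concrete with a small sketch: every call site renames the labels of max and duplicates its subtyping constraints. The function `instantiate_constrained` and the pair encoding of constraints are assumptions for illustration; this is not Mossin's algorithm itself.

```python
def instantiate_constrained(labels, constraints, site):
    """Copy a constrained type scheme for one call site.

    Every quantified label gets a site-specific copy, and every
    subtyping constraint (lo, hi) is duplicated under that renaming.
    """
    rename = {label: f"{label}{site}" for label in labels}
    copied = [(rename[lo], rename[hi]) for lo, hi in constraints]
    return rename, copied

max_labels = ["a", "b", "c"]
max_constraints = [("a", "c"), ("b", "c")]  # a <= c, b <= c

all_copies = []
for site in range(2):  # the two call sites of max
    _, copied = instantiate_constrained(max_labels, max_constraints, site)
    all_copies.extend(copied)
# all_copies now holds one copy of the constraint set per call site.
```

The number of stored constraints grows with the number of call sites times the size of the constraint set, which is the memory cost that the slides set out to avoid.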
38
Problems w. W-Based Method
–Constraint copying is expensive (memory)
–Constraint simplification is hard
–Previous algorithm (Mossin)
–No on-demand algorithms
(n = size of type-annotated program)
39
Results No constraint copying On-demand queries All flow in
40
Flow Analysis w. + with and
41
Without Subtyping:
norm(x,y) = let m = max(x,y) in scale(x,y,m) end;
scale(z,w,n) = (z/n,w/n)
max(s:a,t:a) = if s<=t then t else s
real:a * real:a -> real:a
:a’
42
Without Subtyping:
norm(x:a’,y:a’) = let m = max(x,y) in scale(x,y,m) end;
scale(z,w,n) = (z/n,w/n)
max(s:a,t:a) = if s<=t then t else s
real:a * real:a -> real:a
43
Outline: Polymorphic Type-Based Analysis (W); New Method Based on Instantiation Constraints; W-Based Flow Analysis w. ∀ + ≤; New Method Based on Instantiation Constraints; Summary
44
Flow Analysis Overview
[Figure: Source Code, Type Inference (Polymorphic Subtyping), Type Instantiation Graph, CFL-Reachability, and a Flow Graph with nodes A and B.]
45
Eliminating constraint copies
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0:a0,y0:b0):c0 {a0 ≤ c0, b0 ≤ c0} => real:a0 * real:b0 -> real:c0
max(x1:a1,y1:b1):c1 {a1 ≤ c1, b1 ≤ c1} => real:a1 * real:b1 -> real:c1
46
1. Get a graph max(s:a,t:b) : real:a * real:b -> real:c max(x0:a0,y0:b0):c0 max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1
47
2. Label instantiation sites max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1
48
3. Represent substitutions
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0 real:a0 * real:b0 -> real:c0 (a ↦ a0, b ↦ b0, c ↦ c0)
j: max(x1:a1,y1:b1):c1 real:a1 * real:b1 -> real:c1 (a ↦ a1, b ↦ b1, c ↦ c1)
49
3.a. … as a graph max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i
50
3.a. … as a graph max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j
51
4. Eliminate constraint copies ! max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j
52
? ? ? max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j
53
Type Theory to the Rescue!
Polarity (+,-)
[Figure: a -> type tree with each position marked + or -.]
54
5. Polarities (+,-) max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +
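The polarity assignment can be sketched as a walk over the type: argument positions of -> flip the sign, result positions and product components keep it. The tuple encoding and the name `polarities` are assumptions for illustration.

```python
def polarities(t, sign=+1, out=None):
    """Collect (label, polarity) for every leaf of a type tree.

    Types are ("->", arg, res), ("*", left, right), or a label string.
    """
    if out is None:
        out = []
    if isinstance(t, str):
        out.append((t, "+" if sign > 0 else "-"))
    elif t[0] == "->":
        polarities(t[1], -sign, out)  # argument position flips polarity
        polarities(t[2], sign, out)   # result position keeps polarity
    else:
        for part in t[1:]:            # product components keep polarity
            polarities(part, sign, out)
    return out

# Labels of max : real:a * real:b -> real:c
max_type = ("->", ("*", "a", "b"), "c")
print(polarities(max_type))  # [('a', '-'), ('b', '-'), ('c', '+')]
```

With two instantiation sites i and j this gives four negative and two positive edges, which matches the count of - and + marks on the slide.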
55
6. Reverse negative edges max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +
56
7. Recover flow max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +
57
7. Recover flow max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +
58
7. Recover flow max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +
59
8. Be careful ! max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + + Spurious !
60
9. Do CFL-reachability
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0 real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1 real:a1 * real:b1 -> real:c1
[Figure: flow graph with edges labeled [i, ]i, [j, ]j and d; matched paths M follow the CFG production M ::= [k M ]k.]
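A small sketch of bracket-matched reachability on the max example follows. It uses a naive fixed point rather than an efficient CFL-reachability algorithm, and the grammar it assumes is the one on the later CFL Formulation slide (S = P N, where P allows unmatched ] edges and N allows unmatched [ edges). The graph encoding, the edge directions chosen for [ and ], and the names `matched` and `flows` are assumptions for illustration.

```python
def matched(nodes, edges):
    """Pairs (u, v) joined by a path whose [k ... ]k labels balance.

    `edges` holds (src, label, dst); label is "d" (plain flow),
    ("[", k) or ("]", k). Naive fixed point, fine for small graphs.
    """
    m = {(n, n) for n in nodes}
    m |= {(u, v) for u, lab, v in edges if lab == "d"}
    opens = [(u, lab[1], x) for u, lab, x in edges if lab[0] == "["]
    closes = [(y, lab[1], v) for y, lab, v in edges if lab[0] == "]"]
    changed = True
    while changed:
        new = {(u, w) for u, v in m for v2, w in m if v == v2}
        new |= {(u, v) for u, k, x in opens for y, k2, v in closes
                if k == k2 and (x, y) in m}
        changed = not new <= m
        m |= new
    return m

def flows(nodes, edges, src, dst):
    """True if some path from src to dst has the shape P N."""
    m = matched(nodes, edges)
    step = {"P": set(m), "N": set(m)}
    for u, lab, v in edges:
        if lab[0] == "]":
            step["P"].add((u, v))
        elif lab[0] == "[":
            step["N"].add((u, v))
    seen, todo = set(), [(src, "P")]
    while todo:
        node, phase = todo.pop()
        if (node, phase) in seen:
            continue
        seen.add((node, phase))
        if phase == "P":
            todo.append((node, "N"))  # switch from the P part to the N part
        todo.extend((v, phase) for u, v in step[phase] if u == node)
    return (dst, "N") in seen

nodes = ["a", "b", "c", "a0", "b0", "c0", "a1", "b1", "c1"]
edges = [
    ("a", "d", "c"), ("b", "d", "c"),  # a <= c, b <= c
    ("a0", ("[", "i"), "a"), ("b0", ("[", "i"), "b"), ("c", ("]", "i"), "c0"),
    ("a1", ("[", "j"), "a"), ("b1", ("[", "j"), "b"), ("c", ("]", "j"), "c1"),
]
print(flows(nodes, edges, "a0", "c0"))  # True: [i d ]i is matched
print(flows(nodes, edges, "a0", "c1"))  # False: [i d ]j is the spurious path
```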
61
Further Issues
Polymorphic type structure
Recursive type structure
–context-sensitive data-dependence analysis is uncomputable [Reps 00]
–our techniques require finite types
–regular unbounded data types handled via finite approximations: recursive type expressions
62
One-level implementation GOLF analysis system for C by Manuvir Das (MSR) and Ben Liblit (Berkeley) Exhaustive points-to sets for MS Word 97, 1.4 Mloc, in 2 minutes
63
Outline: Polymorphic Type-Based Analysis (W); New Method Based on Instantiation Constraints; W-Based Flow Analysis w. ∀ + ≤; New Method Based on Instantiation Constraints; Summary
64
Polymorphic flow analysis based on instantiation constraints Efficient flow-summarization Demand-driven Highly modular Higher-order Graph-based algorithms
65
Summary Reformulation of polymorphic subtyping with instantiation constraints Elimination of constraint copying Transfer of CFL-reachability techniques to type-based flow analysis
66
Summary Type-based flow analysis –all flow in, n = typed pgm size –context-sensitive (polymorphism) –directional (subtyping) –demand-driven algorithm –incorporates label-polymorphic recursion –works directly on H.O. programs –structured data of finite type –unbounded data structures via approx.
67
A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness
68
Type System
e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end
I = { a ≤i a0, b ≤i b0, c ≤i c0, a ≤j a1, b ≤j b1, c ≤j c1 }
C = { a ≤ c, b ≤ c }
I; C; A |- e : c0*c1
69
Type System
e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end
I = { a ≤i a0, b ≤i b0, c ≤i c0, a ≤j a1, b ≤j b1, c ≤j c1 } (instantiation constraints)
C = { a ≤ c, b ≤ c } (subtyping constraints)
A (type environment)
I; C; A |- e : c0*c1
70
Flow Logic I; C; A |- e: I |- a0 a1 I;C |- a1 m a2 I |- a2 a3 I I;C |- a0 a3 pp M [i M ]i
71
CFL Formulation
S ::= P N
P ::= M P | ] P | ε
N ::= M N | [ N | ε
M ::= [k M ]k | M M | d | ε
72
A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness
73
Consistency of Substitution (Henglein)
(F) a ≤i b, a ≤i c implies b = c
Example: instantiating a -> a to b -> c forces b = c
Example: instantiating a -> a to d -> e forces d = e
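Rule (F) says that one instantiation site must map a variable to a single target, so two targets of the same variable at the same site are merged. A minimal sketch with union-find, where the class `UnionFind`, the function `close_under_F` and the triple encoding (a, site, b) are all assumptions for illustration:

```python
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def close_under_F(inst):
    """Apply rule (F) until nothing changes.

    `inst` is a list of (a, site, b) instantiation constraints;
    two constraints with the same source and site force equal targets.
    """
    uf = UnionFind()
    changed = True
    while changed:
        changed = False
        image = {}
        for a, site, b in inst:
            key = (uf.find(a), site)
            b = uf.find(b)
            if key in image and uf.find(image[key]) != b:
                uf.union(image[key], b)
                changed = True
            image.setdefault(key, b)
    return uf

# a -> a instantiated to b -> c at site i, and to d -> e at site j.
uf = close_under_F([("a", "i", "b"), ("a", "i", "c"),
                    ("a", "j", "d"), ("a", "j", "e")])
```

After closure, b and c share a class and d and e share a class, while the two sites stay separate from each other.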
74
Monomorphic situation (Henglein)
G = λx:a. let F = λy. x:b in (i:(F 0):c, j:(F 1):d) end
75
Monomorphic situation (Henglein)
G = λx:a. let F = λy. x:b in (i:(F 0):c, j:(F 1):d) end
a c a d a b a b (F) a = b = c = d a = b
76
Flow Generalization of (F) (F) a b, a c b = c a1 a2 b c M [k M ]k [k ]k
77
Self-loops
G = λx:a. let F = λy. x:b in (i:(F 0):c, j:(F 1):d) end
78
Self-loops
G = λx:a. let F = λy. x:b in (i:(F 0):c, j:(F 1):d) end
79
Self-loops
G = λx:a. let F = λy. x:b in (i:(F 0):c, j:(F 1):d) end
[Figure: self-loops labeled ]i and ]j.]
80
Self-loops
G = λx:a. let F = λy. x:b in (i:(F 0):c, j:(F 1):d) end
[Figure: self-loops labeled ]i, ]j, [i and [j.]
81
A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness
82
ML-Polymorphic Flow: d = M [k M ]k | M M | By induction, M = denotes only self-loops: [k ]k (F) [k]k
83
ML-Polymorphic Flow: P* N*
S ::= P N
P ::= ] P | ε
N ::= [ N | ε
Flow computable in time [FRD, PLDI 00]
84
ML-Polymorphic Flow: P* N*
[Figure: flow graph over labels a, b, c with edges marked N and P.]
85
A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness
86
Theorem
For every judgement I; C; A |- e: in POLYFLOW(CFL), there exists a judgement C; A |- e: in POLYFLOW(Copy) [Mossin], such that a cp b implies a cfl b
87
Research Problems
–Application to OO-languages
–Application to region inference, effect systems
–Connection to subtype simplification
–Non-structural subtyping
–Subtyping over partial orders