Polymorphic Type-Based Flow Analysis Jakob Rehof Microsoft Research Redmond, WA, USA.


1 Polymorphic Type-Based Flow Analysis Jakob Rehof Microsoft Research Redmond, WA, USA

2 Type-Based Program Analysis Common vocabulary: Data access paths, Function summary, Context-sensitivity, Directional flow. Type-based: Type structure (  ), Function type (->), Type instantiation, polymorphism (∀), Subtyping (≤)

3 Type-Based Program Analysis Type inference is a flow-Engine Works directly on higher-order programs (->) Takes advantage of type discipline in language analyzed Gives different complexity tradeoffs (n = size of typed program)

4 Scaleable Program Analysis Based on Type Inference +CS -DI (∀,=) -CS +DI (=,≤) +CS +DI (∀,≤) -CS -DI (=,=) [ RF, POPL 01 ] [ Das, PLDI 00 ] [ FRD, PLDI 00 ] research.microsoft.com/spa

5 Outline Polymorphic Type-Based Analysis (W) New Method Based on Instantiation Constraints W-Based Flow Analysis w. ∀ + ≤ New Method Based on Instantiation Constraints Summary

6 Polymorphic Type-Based Analysis via Algorithm W Strictness analysis –[Kuo&Mishra 89] Region inference –[Tofte&Talpin 94] Binding time analysis –[Henglein&Mossin 94] Flow analysis –[Mossin 96], [Heintze&McAllester 97]

7 Flow Analysis … (*fp)(); … x = malloc(sizeof(S)); … (*y) = 5; … f(p) { … } g(q) { … }

8 Polymorphic Type Inference (W) fst(x,y) = x; let fst = lambda(x,y). x in fst(1,2) … fst(true,false) end; fst: ∀AB.(A * B) -> A

9 Polymorphic Type-Based Flow Analysis fst(x:real,y:real) = x; standard type fst: real * real -> real

10 Polymorphic Type-Based Flow Analysis fst(x:real:a,y:real:b) = x; ∀ab.real:a * real:b -> real:a analysis type flow label

11 Polymorphic Type-Based Flow Analysis max(s:real,t:real) = if s<=t then t else s max: real * real -> real standard type

12 Polymorphic Type-Based Flow Analysis max(s:real:a,t:real:a) = (if s<=t then t else s) :a ∀a.real:a * real:a -> real:a analysis type flow label

13 Polymorphic Type-Based Flow Analysis max(s:a,t:a) : real:a * real:a -> real:a max(x0:b,y0:b):b max(x1:c,y1:c):c real:b * real:b -> real:b real:c * real:c -> real:c

14 Shortcomings of W Modularity problem (def < use) Flow summarization problem Not directly demand-driven Not scaling when combined with subtyping (directional flow)

15 Outline Polymorphic Type-Based Analysis (W) New Method Based on Instantiation Constraints W-Based Flow Analysis w. ∀ + ≤ New Method Based on Instantiation Constraints Summary

16 Flow Analysis Overview Source Code Type Instantiation Graph Polymorphic Type Inference Equivalence class based Flow Graph A B On-demand queries O(|G|).

17 Flow Analysis Overview Constraint extraction: –C(e) = { T < T’, T = T’ } Constraint resolution: –C(e) → Cnf Query on type instantiation graph: –Cnf |- b ----> p ?

18 Constraint Extraction Phase [ def max ] <i [ x0 ] * [ y0 ] -> [ max(x0,y0) ] [ def max ] <j [ x1 ] * [ y1 ] -> [ max(x1,y1) ] [ def max ] = real:a*real:a->real:a

19 Constraint Resolution Phase [ real:a ] <i [ x0 ] [ real:a ] <i [ y0 ] [ real:a ] <i [ max(x0,y0) ] [ x0 ] = [ real:b ] [ y0 ] = [ real:b ] [ max(x0,y0) ] = [ real:b ]
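
The resolution step on slide 19 can be sketched as follows. This is a minimal illustration, assuming a union-find over node names; the `UnionFind` class, the `resolve` function, and the string node names are hypothetical and not taken from the talk. Equalities merge nodes into equivalence classes, and each instantiation constraint becomes a site-labelled edge between class representatives.

```python
class UnionFind:
    """Equivalence classes over hashable node names (hypothetical helper)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps the trees shallow.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.parent[root_x] = root_y


def resolve(equalities, instantiations):
    """Return the union-find and the set of (source, site, target) class edges."""
    uf = UnionFind()
    for left, right in equalities:
        uf.union(left, right)
    edges = {(uf.find(src), site, uf.find(dst))
             for src, site, dst in instantiations}
    return uf, edges


# Constraints for the call max(x0, y0) at site i, as listed on the slide.
equalities = [("x0", "b"), ("y0", "b"), ("max(x0,y0)", "b")]
instantiations = [("a", "i", "x0"), ("a", "i", "y0"), ("a", "i", "max(x0,y0)")]
uf, edges = resolve(equalities, instantiations)
```

After resolution the three instantiation constraints collapse to a single edge from the class of `a` to the class of `b`, labelled with site `i`.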

20 Query Phase? Type Graph: –Node equivalence classes –Instantiation edges Query = +- graph reachability Flow Graph A B

21 Another View of Instantiation real:a*real:a->real:a real:b*real:b->real:b real:c*real:c->real:c S1 S2

22 Type Graphs -> * a a * a -> a

23 Instantiation Constraints, T.I.G. -> * a * b * c

24 Flow Interpretation? -> * a * b * c

25 Type Theory to the Rescue ! Polarity (+,-) ->    - + + - - + +

26 Polarized Constraints -> * a * b * c + - - + - - + +
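
Polarity follows the usual variance rule: a function type flips polarity in its argument and preserves it in its result, while products preserve it in both components. The sketch below assumes a hypothetical tuple encoding of labelled types and that each flow label occurs once; neither comes from the slides.

```python
def polarities(ty, sign="+"):
    """Map each flow label in `ty` to the polarity of its occurrence.

    Assumed encoding: ("real", label), ("*", left, right), ("->", arg, result).
    Each label is assumed to occur once, so a plain dict is enough.
    """
    flip = {"+": "-", "-": "+"}
    kind = ty[0]
    if kind == "real":
        return {ty[1]: sign}
    if kind == "*":
        result = polarities(ty[1], sign)
        result.update(polarities(ty[2], sign))
        return result
    if kind == "->":
        # Argument position is contravariant, result position covariant.
        result = polarities(ty[1], flip[sign])
        result.update(polarities(ty[2], sign))
        return result
    raise ValueError(f"unknown type constructor: {kind!r}")


# real:a * real:b -> real:c
max_type = ("->", ("*", ("real", "a"), ("real", "b")), ("real", "c"))
max_polarities = polarities(max_type)

# A higher-order type: (real:p -> real:q) -> real:r
higher_order = ("->", ("->", ("real", "p"), ("real", "q")), ("real", "r"))
ho_polarities = polarities(higher_order)
```

For the first-order type the arguments are negative and the result positive; in the higher-order case the polarity flips twice for `p`, which is why it ends up positive.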

27 Reverse Negative Edges! -> * a * b * c + - - + - - + +

28 Insight: +- paths Every individual flow path has the form: PN-paths: (+)*(-)* Phases are present locally, on-demand Furthermore: These paths are valid in the higher-order case. Flow by regular reachability, linear time
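
Reachability along paths of the shape P*N* is a regular-language reachability problem, so a plain graph search over (node, phase) pairs suffices. The sketch below assumes edges have already been tagged "P" or "N"; the function name and edge encoding are illustrative only.

```python
from collections import deque


def pn_reachable(edges, source):
    """Nodes reachable from `source` by a run of P edges followed by N edges.

    `edges` is a list of (src, kind, dst) triples with kind "P" or "N".
    """
    successors = {}
    for src, kind, dst in edges:
        successors.setdefault(src, []).append((kind, dst))

    # State = (node, phase). Once an N edge is taken, P edges are disallowed.
    start = (source, "P")
    seen = {start}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        for kind, dst in successors.get(node, []):
            if kind == "P" and phase == "N":
                continue
            state = (dst, kind)  # the edge kind just taken becomes the phase
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return {node for node, _ in seen}


edges = [("x", "P", "f"), ("f", "N", "y"), ("y", "P", "z")]
from_x = pn_reachable(edges, "x")
from_y = pn_reachable(edges, "y")
```

Each node is visited in at most two phases and each edge is examined at most twice, so the search is linear in the size of the graph. In the example, `z` is not reachable from `x` because that would require a P edge after an N edge.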

29 Contrast w. Common 2-Phase approach Callee -> Callers, Callers -> Callee Unsatisfactory treatment of higher-order programs: –Ad-hoc or initial call graph Phase 1 is a transitive closure step and may produce n^2 size result No straight-forward on-demand queries

30 Outline Polymorphic Type-Based Analysis (W) New Method Based on Instantiation Constraints W-Based Flow Analysis w. ∀ + ≤ New Method Based on Instantiation Constraints Summary

31 W-Based Flow Analysis w. ∀ + ≤ (Mossin) max(s,t) = if s<=t then t else s real * real -> real standard type

32 W-Based Flow Analysis w. ∀ + ≤ max(s:a,t:b) = (if s<=t then t else s) :c {a ≤ c, b ≤ c} => real:a * real:b -> real:c analysis type subtyping constraints flow label

33 W-Based Flow Analysis w. ∀ + ≤ max(s:a,t:b) = (if s<=t then t else s) :c {a ≤ c, b ≤ c} => real:a * real:b -> real:c

34 max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c max(x0,y0) max(x1,y1) W-Based Flow Analysis w. ∀ + ≤

35 max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c max(x0:a0,y0:b0):c0 max(x1:a1,y1:b1):c1 {a0 ≤ c0, b0 ≤ c0}=>c0

36 max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c max(x0:a0,y0:b0):c0 max(x1:a1,y1:b1):c1 {a0 ≤ c0, b0 ≤ c0}=>c0 {a1 ≤ c1, b1 ≤ c1}=>c1 W-Based Flow Analysis w. ∀ + ≤

37 W-Based Method (∀) Polymorphism by copying types (≤) Subtyping by constrained types (∀ + ≤) ⇒ constraint copying
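
To make the cost of constraint copying concrete, here is a minimal sketch of W-style instantiation of a constrained type scheme. The encoding of a scheme as a pair of constraint list and quantified labels, and the `instantiate` helper, are assumptions made for illustration; all labels in the constraints are assumed to be quantified.

```python
def instantiate(scheme, site):
    """W-style instantiation: rename the labels and copy the constraint set.

    `scheme` is (constraints, labels), where constraints is a list of
    (lower, upper) pairs meaning lower <= upper, and labels lists the
    quantified flow labels.
    """
    constraints, labels = scheme
    renaming = {label: f"{label}{site}" for label in labels}
    copied = [(renaming[lo], renaming[hi]) for lo, hi in constraints]
    return renaming, copied


# {a <= c, b <= c} => real:a * real:b -> real:c
MAX_SCHEME = ([("a", "c"), ("b", "c")], ["a", "b", "c"])

constraint_store = []
for site in range(2):  # the two calls max(x0,y0) and max(x1,y1)
    _, copied = instantiate(MAX_SCHEME, site)
    constraint_store.extend(copied)
```

Every call site adds a full copy of the scheme's constraint set to the store, so the store grows with the number of instantiations times the size of each scheme's constraints.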

38 Problems w. W-Based Method Constraint copying is expensive (memory) Constraint simplification is hard Previous algorithm (Mossin) No on-demand algorithms (n = size of type-annotated program)

39 Results No constraint copying On-demand queries All flow in

40 Flow Analysis w. ∀ + ≤ with and

41 Without Subtyping: norm(x,y ) = let m = max(x,y) in scale(x,y,m) end; scale(z,w,n) = (z/n,w/n) max(s:a,t:a) = if s<=t then t else s real:a * real:a -> real:a  :a’

42 Without Subtyping: norm(x:a’,y:a’) = let m = max(x,y) in scale(x,y,m) end; scale(z,w,n) = (z/n,w/n) max(s:a,t:a) = if s<=t then t else s real:a * real:a -> real:a 

43 Outline Polymorphic Type-Based Analysis (W) New Method Based on Instantiation Constraints W-Based Flow Analysis w. ∀ + ≤ New Method Based on Instantiation Constraints Summary

44 Flow Analysis Overview Source Code Type Instantiation Graph Flow Graph A B Type Inference CFL-Reachability Polymorphic Subtyping

45 Eliminating constraint copies max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c max(x0:a0,y0:b0):c0 max(x1:a1,y1:b1):c1 {a0 ≤ c0, b0 ≤ c0} => real:a0 * real:b0 -> real:c0 {a1 ≤ c1, b1 ≤ c1} => real:a1 * real:b1 -> real:c1

46 1. Get a graph max(s:a,t:b) : real:a * real:b -> real:c max(x0:a0,y0:b0):c0 max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1

47 2. Label instantiation sites max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1

48 3. Represent substitutions max(s:a,t:b) : real:a * real:b -> real:c a a0 a a1 b b0 b b1 c c0 c c1 i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1

49 3.a. … as a graph max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i

50 3.a. … as a graph max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j

51 4. Eliminate constraint copies ! max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j

52 ? ? ? max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j

53 Type Theory to the Rescue ! Polarity (+,-) ->    - + + - - + +

54 5. Polarities (+,-) max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +

55 6. Reverse negative edges max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +
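
Steps 2 through 6 can be sketched as one small function that turns the per-site substitutions into directed, parenthesis-labelled edges. The data layout is hypothetical, and the choice of `[` for reversed negative edges and `]` for positive edges is an assumed labelling convention, picked to agree with the P and N productions shown later in the deck.

```python
def flow_edges(substitutions, polarity):
    """Turn instantiation substitutions into parenthesis-labelled flow edges.

    substitutions: {site: {generic_label: instance_label}}
    polarity:      {generic_label: "+" or "-"}
    Returns a list of (source, (paren, site), target) triples.
    """
    edges = []
    for site, subst in substitutions.items():
        for generic, instance in subst.items():
            if polarity[generic] == "+":
                # Positive: flow goes from the generic type out to the instance.
                edges.append((generic, ("]", site), instance))
            else:
                # Negative: the instantiation edge is reversed.
                edges.append((instance, ("[", site), generic))
    return edges


substitutions = {
    "i": {"a": "a0", "b": "b0", "c": "c0"},
    "j": {"a": "a1", "b": "b1", "c": "c1"},
}
polarity = {"a": "-", "b": "-", "c": "+"}
edges = flow_edges(substitutions, polarity)
```

With the argument labels `a` and `b` negative and the result label `c` positive, actuals flow into `max` along `[` edges and the result flows back out along `]` edges.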

56 7. Recover flow max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +

57 7. Recover flow max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +

58 7. Recover flow max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + +

59 8. Be careful ! max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 i i i j j j - - - - + + Spurious !

60 9. Do CFL-reachability max(s:a,t:b) : real:a * real:b -> real:c i: max(x0:a0,y0:b0):c0 j: max(x1:a1,y1:b1):c1 real:a0 * real:b0 -> real:c0 real:a1 * real:b1 -> real:c1 [i ]i [j ]j M → [k M ]k d d CFG
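
The effect of the parenthesis matching can be illustrated with a naive search that carries a stack of open call sites. This is a sketch only: it is not the efficient CFL-reachability algorithm, the `max_depth` bound is an assumption added so the search terminates on cyclic graphs, and the edge labelling (`[` on reversed negative edges, `]` on positive ones, `d` for subtyping flow) is an assumed convention.

```python
from collections import deque


def cfl_reachable(edges, source, target, max_depth=8):
    """True if `target` is reachable from `source` without mismatched parens.

    `edges` holds (src, label, dst) with label "d", ("[", k) or ("]", k).
    A close paren must match the innermost pending open paren, if there is one.
    """
    start = (source, ())
    seen = {start}
    queue = deque([start])
    while queue:
        node, stack = queue.popleft()
        if node == target:
            return True
        for src, label, dst in edges:
            if src != node:
                continue
            if label == "d":
                new_stack = stack
            elif label[0] == "[":
                if len(stack) >= max_depth:
                    continue
                new_stack = stack + (label[1],)
            elif not stack:
                new_stack = stack        # unmatched close: leaving a callee
            elif stack[-1] == label[1]:
                new_stack = stack[:-1]   # matched call and return
            else:
                continue                 # mismatched sites: spurious path
            state = (dst, new_stack)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False


EDGES = [
    ("a0", ("[", "i"), "a"), ("b0", ("[", "i"), "b"), ("c", ("]", "i"), "c0"),
    ("a1", ("[", "j"), "a"), ("b1", ("[", "j"), "b"), ("c", ("]", "j"), "c1"),
    ("a", "d", "c"), ("b", "d", "c"),
]
```

On the `max` example, `a0` reaches `c0` through the matched path `[i d ]i`, while the path from `a0` to `c1` spells `[i d ]j` and is rejected as spurious.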

61 Further Issues Polymorphic type structure Recursive type structure –context-sensitive data-dependence analysis is uncomputable [Reps 00] –our techniques require finite types –regular unbounded data types handled via finite approximations: recursive type expressions

62 One-level implementation GOLF analysis system for C by Manuvir Das (MSR) and Ben Liblit (Berkeley) Exhaustive points-to sets for MS Word 97, 1.4 Mloc, in 2 minutes

63 Outline Polymorphic Type-Based Analysis (W) New Method Based on Instantiation Constraints W-Based Flow Analysis w. ∀ + ≤ New Method Based on Instantiation Constraints Summary

64 Polymorphic flow analysis based on instantiation constraints Efficient flow-summarization Demand-driven Highly modular Higher-order Graph-based algorithms

65 Summary Reformulation of polymorphic subtyping with instantiation constraints Elimination of constraint copying Transfer of CFL-reachability techniques to type-based flow analysis

66 Summary Type-based flow analysis –all flow in, n = typed pgm size –context-sensitive (polymorphism) –directional (subtyping) –demand-driven algorithm –incorporates label-polymorphic recursion –works directly on H.O. programs –structured data of finite type –unbounded data structures via approx.

67 A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness

68 Type System e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end {a <i a0, b <i b0, c <i c0, a <j a1, b <j b1, c <j c1}; {a ≤ c, b ≤ c}; A |- e : c0*c1

69 Type System e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end {a <i a0, b <i b0, c <i c0, a <j a1, b <j b1, c <j c1}; {a ≤ c, b ≤ c}; A |- e : c0*c1 (instantiation constraints; subtyping constraints; type environment)

70 Flow Logic I; C; A |- e:  I |- a0 a1 I;C |- a1  m a2 I |- a2 a3 I I;C |- a0 a3 pp M  [i M ]i

71 CFL Formulation S → P N; P → M P | ] P | ε; N → M N | [ N | ε; M → [k M ]k | M M | d | ε

72 A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness

73 Consistency of Substitution (Henglein) (F) a <i b, a <i c ⇒ b = c a -> a b -> c ⇒ b = c a -> a d -> e ⇒ d = e

74 Monomorphic situation (Henglein) G = x:a. let F = y. x:b in (i:(F 0):c, j:(F 1):d) end

75 Monomorphic situation (Henglein) G = x:a. let F = y. x:b in (i:(F 0):c, j:(F 1):d) end a c a d a b a b (F)  a = b = c = d a = b

76 Flow Generalization of (F) (F) a <i b, a <i c ⇒ b = c a1 a2 b c M → [k M ]k [k ]k

77 Self-loops G = x:a. let F = y. x:b in (i:(F 0):c, j:(F 1):d) end

78 Self-loops G = x:a. let F = y. x:b in (i:(F 0):c, j:(F 1):d) end

79 Self-loops G = x:a. let F = y. x:b in (i:(F 0):c, j:(F 1):d) end ]i ]j

80 Self-loops G = x:a. let F = y. x:b in (i:(F 0):c, j:(F 1):d) end ]i ]j [i [j

81 A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness

82 ML-Polymorphic Flow: d = ε, so M → [k M ]k | M M | ε. By induction, M denotes only self-loops: by (F), a matched pair [k ]k returns to the node it started from.

83 ML-Polymorphic Flow: P* N* S → P N; P → ] P | ε; N → [ N | ε Flow computable in time [FRD, PLDI 00]

84 ML-Polymorphic Flow: P* N* a a b b c c N P P N

85 A Closer Look at … Type System, Flow Logic & Grammar Flow Interpretation of Henglein’s Algebra (The Meaning of Loops) Regular Flow = ML Polymorphism Soundness

86 Theorem For every judgement I; C; A |- e:  in POLYFLOW(CFL), there exists a judgement C; A |- e:  in POLYFLOW(Copy) [Mossin], such that a  cp b implies a  cfl b

87 Research Problems Application to OO-languages Application to region inference, effect systems Connection to subtype simplification Non-structural subtyping Subtyping over partial orders

