Reachability Analysis for Callbacks. Peking University, Hao Tang, 2015.4.25.


1 Reachability Analysis for Callbacks
Peking University, Hao Tang, tanghaoth90@pku.edu.cn, 2015.4.25

2 Outline
• Introduction
  - Program Analysis: graph reachability problem
  - Summary-based Analysis: one challenge, callbacks
• CFL-reachability
• Reachability Analysis for Callbacks
  - Callbacks: conditions
  - TAL-reachability: conditional reachability

3 Introduction
• Program Analysis
  - Generally speaking, automated analysis of program behaviors
  - Flow analysis tasks
      data/control flow analysis
      information flow analysis (security)
      points-to/alias analysis
      …
    can be modeled as graph reachability problems

4 Introduction
• Example: transitive data dependence analysis

    int gcd(int a, int b, string msg) {
        write(msg);
        while (b != 0) {
            int tmp = a % b;
            a = b;
            b = tmp;
        }
        return a;
    }

  Graph nodes: a, b, tmp, ret, msg
  All transitive data dependence relationships:
    a --> b, b --> a, a --> tmp, b --> tmp, tmp --> a, tmp --> b,
    a --> ret, b --> ret, tmp --> ret

5 Introduction
• Summary-based Analysis
  - Summarizing behaviors of a component (modular/compositional analysis)
      Result: summary
      Goal:
        reusable: to reuse analysis results
        concise: to hide internal complexity
        efficient: to avoid unnecessary re-computation
  - A general model: a summary function transferring facts from entries to exits

6 Introduction
• Example: transitive data dependence analysis

    int gcd(int a, int b, string msg) {
        write(msg);
        while (b != 0) {
            int tmp = a % b;
            a = b;
            b = tmp;
        }
        return a;
    }

  Graph nodes: a, b, tmp, ret, msg
  All transitive data dependence relationships:
    a --> b, b --> a, a --> tmp, b --> tmp, tmp --> a, tmp --> b,
    a --> ret, b --> ret, tmp --> ret
  Summary:
    a --> ret
    b --> ret
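The gcd example can be replayed as a small graph computation. The following is a minimal sketch, not the speaker's implementation: the direct dependence edges are an assumed reading of the assignments in gcd (control dependences on the loop condition are ignored), and the names `transitive_closure`, `direct`, `params` and `exits` are hypothetical. It computes the transitive relationships listed on the slide and then projects them onto entry/exit pairs to obtain the summary.

```python
def transitive_closure(edges):
    """Warshall-style closure over a set of (source, target) pairs."""
    nodes = {n for edge in edges for n in edge}
    reach = set(edges)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if (i, k) in reach and (k, j) in reach:
                    reach.add((i, j))
    return reach


# Assumed direct data dependences, read off the body of gcd:
#   tmp = a % b  ->  a --> tmp, b --> tmp
#   a = b        ->  b --> a
#   b = tmp      ->  tmp --> b
#   return a     ->  a --> ret
direct = {("a", "tmp"), ("b", "tmp"), ("b", "a"), ("tmp", "b"), ("a", "ret")}

# Drop self-dependences created by the loop, as the slide does.
closure = {(u, v) for (u, v) in transitive_closure(direct) if u != v}

# The summary keeps only entry-to-exit pairs (parameters to return value).
params = {"a", "b", "msg"}
exits = {"ret"}
summary = {(u, v) for (u, v) in closure if u in params and v in exits}

print(len(closure))     # 9
print(sorted(summary))  # [('a', 'ret'), ('b', 'ret')]
```

The parameter msg contributes nothing to the summary because it never flows into the return value, which is the kind of internal detail a summary is meant to hide.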

7 Introduction
• Summary-based Analysis
  - Summarizing behaviors of a component (modular/compositional analysis)
  - Challenge: handling incompleteness (incomplete/partial program analysis)
      calling context unknown
        parameters
        global variables
        …
      callbacks (due to dynamic dispatch)
        unknown client code

8 Introduction
• Example

    class Math {
        int gcd(int a, int b);
        int gcd20(int a) { return gcd(a, 20); }
    }
    class Math1 extends Math { int gcd(int a, int b) {…} }
    class Math2 extends Math { int gcd(int a, int b) {…} }

    // main
    Math2 m = new Math2();
    int x = 30;
    int y = m.gcd20(x);

  Figure labels: Main, Math::gcd20, Math::gcd, Math1::gcd, Math2::gcd; client code; library code; incomplete

9 CFL-reachability
• Interprocedural Analysis
• IFDS/IDE [Reps et al. 1995, Sagiv et al. 1996]
  - Realizable path: matched parentheses
  - Filtering out unrealizable paths

    void fun() {
        …
        y1 = p(x1);
        …
        y2 = p(x2);
        …
    }
    int p(int x) { … }

  Edge labels in the figure: {_1, }_1, {_2, }_2
  Matched parenthesis language:
    S → e
    S → S S
    S → {_i S }_i,  i = 1, 2, …
* We only discuss realizable paths (reachability) defined by the matched parenthesis language in the following part.
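To make "realizable path" concrete, here is a minimal sketch of a membership check for the matched parenthesis language, applied to the label sequence along one path. The function name `is_realizable` and the label encoding (the string "e" for a normal edge, a tuple ("{", i) or ("}", i) for call and return edges) are assumptions made for illustration. Because the grammar as shown has no empty production, the sketch rejects the empty path and any call whose body contains no edge.

```python
def is_realizable(labels):
    """Check whether a sequence of edge labels is derivable from
    S -> e | S S | {_i S }_i  (the grammar as shown on the slide)."""
    # Each frame is [call index, number of complete S items seen inside it].
    # The bottom frame stands for the top level and has no call index.
    frames = [[None, 0]]
    for label in labels:
        if label == "e":
            frames[-1][1] += 1
        elif label[0] == "{":
            frames.append([label[1], 0])
        else:  # a return edge ("}", i)
            index, count = frames[-1]
            if len(frames) == 1 or index != label[1] or count == 0:
                return False
            frames.pop()
            frames[-1][1] += 1  # the matched pair is one complete S item
    return len(frames) == 1 and frames[0][1] > 0


# Call p from the first call site and return to the same site: realizable.
print(is_realizable([("{", 1), "e", ("}", 1)]))  # True
# Enter p from call site 1 but return to call site 2: unrealizable.
print(is_realizable([("{", 1), "e", ("}", 2)]))  # False
```

The second path is exactly the kind that the analysis filters out: data entering p through x1 must not appear to flow out into y2.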

10 CFL-reachability
• Algorithm: Dynamic Programming
  - (similar to the Floyd-Warshall algorithm)
  - O(n^3)
  Matched parenthesis language:
    S → e
    S → S S
    S → {_i S }_i,  i = 1, 2, …
  Graph:
    Invocation edge: {_i
    Return edge: }_i
    Normal edge: e
  Figure labels: nodes a, p, b, q, x, y; edges {_1, {_2, }_1, }_2, e; derived edges S
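The dynamic programming idea can be sketched as a fixpoint over derived S edges. This is a deliberately naive version under stated assumptions: it re-scans all facts on every round instead of using the worklist needed for the O(n^3) bound, and the function name `cfl_reachable`, the edge encoding, and the node names in the example (taken loosely from the fun/p program on the previous slide) are hypothetical.

```python
def cfl_reachable(normal, calls, returns):
    """Return all (u, v) such that some path from u to v spells a string
    of the matched parenthesis language.

    normal:  set of (u, v) edges labeled e
    calls:   set of (u, v, i) edges labeled {_i
    returns: set of (u, v, i) edges labeled }_i
    """
    s_edges = set(normal)  # S -> e
    changed = True
    while changed:
        new = set()
        # S -> S S
        for (u, v) in s_edges:
            for (v2, w) in s_edges:
                if v == v2:
                    new.add((u, w))
        # S -> {_i S }_i
        for (u, x, i) in calls:
            for (y, v, j) in returns:
                if i == j and (x, y) in s_edges:
                    new.add((u, v))
        changed = not new <= s_edges
        s_edges |= new
    return s_edges


# Assumed graph for fun/p: x is p's parameter, r is p's return value.
normal = {("x", "r")}
calls = {("x1", "x", 1), ("x2", "x", 2)}
returns = {("r", "y1", 1), ("r", "y2", 2)}

reach = cfl_reachable(normal, calls, returns)
print(("x1", "y1") in reach)  # True
print(("x1", "y2") in reach)  # False
```

Plain graph reachability would report x1 reaching y2; matching the call and return indices is what removes that spurious result.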

11 Reachability Analysis for Callbacks
• Summarizing “Incomplete” Graph
  - Postponing analysis of callbacks
  - Leaving unnecessary nodes in the summary
  Matched parenthesis language:
    S → e
    S → S S
    S → {_i S }_i,  i = 1, 2, …
  Figure labels: library; nodes a, c, d, b; edges {_2, {_3, {_4, }_4, }_3, }_2; client nodes u, v, x, y; edges {_1, }_1, {_5, }_5
  d = g(c), [g: callback function]

12 Reachability Analysis for Callbacks
• Conditional Reachability
  Figure labels: library; nodes a, c, d, b; edges {_2, {_3, {_4, }_4, }_3, }_2; callback site

13 Reachability Analysis for Callbacks
• Conditional Reachability
  - CR_{a,b}(x,y): x ~> y, if a ~> b
• Unconditional Reachability (by CFL-reachability)
  - UR(x,y): x ~> y
• Summary: CR_{a,b}(x,y) and UR(x,y)
  Figure labels (two diagrams): nodes x, a, b, y; edges {, }

14 Reachability Analysis for Callbacks
• Client-code Analysis
  - Turn conditional into unconditional if the condition is satisfied
  - CR_{c,d}(a,b) becomes UR(a,b)
  Figure labels: library; nodes a, c, d, b; client nodes x, y; edges {_1, }_1, {_5, }_5
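The client-side step can be sketched as discharging conditions against known unconditional facts. This is a minimal sketch under assumptions: the function name `instantiate` and the encoding of CR_{a,b}(x,y) as the pair ((a, b), (x, y)) are hypothetical, and the sketch does not itself close UR under the matched parenthesis grammar, which a real client analysis would do with CFL-reachability before and after this step.

```python
def instantiate(conditional, unconditional):
    """Turn conditional facts into unconditional ones where possible.

    conditional:   set of ((a, b), (x, y)) meaning CR_{a,b}(x, y)
    unconditional: set of (x, y) meaning UR(x, y)
    Returns (all unconditional facts, conditional facts still pending).
    """
    ur = set(unconditional)
    pending = set(conditional)
    changed = True
    while changed:
        changed = False
        for (condition, fact) in list(pending):
            if condition in ur:
                # The condition a ~> b now holds, so x ~> y holds too.
                pending.discard((condition, fact))
                ur.add(fact)
                changed = True
    return ur, pending


# Library summary from the slide: CR_{c,d}(a,b).
library_summary = {(("c", "d"), ("a", "b"))}

# Assume the client's callback g makes c reach d.
ur, pending = instantiate(library_summary, {("c", "d")})
print(("a", "b") in ur)  # True
print(pending)           # set()
```

If the client does not establish c ~> d, the fact stays in `pending` and a ~> b is not reported, which is how the summary avoids guessing about unknown callbacks.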

15 Reachability Analysis for Callbacks
• Library Summarization
  - Unconditional Reachability: CFL-reachability
  - Conditional Reachability: ?

16 Reachability Analysis for Callbacks
• Tree Adjoining Language (TAL)
  - Mildly Context Sensitive Language
  - Parsable in O(n^6)
  - Application: Natural Language Processing
• Our Contribution
  - TAL-Reachability: Conditional Reachability

17 Reachability Analysis for Callbacks
• Tree Adjoining Language
  - Strings
  - Non-terminals

18 Reachability Analysis for Callbacks
• Operators

19 Reachability Analysis for Callbacks
• First-order “S”
  - One string
  - Reachability for a 2-tuple (x,y)
  - One path
  Figure labels: nodes x, a, b, y; edges {, }; UR(x,y)

20 Reachability Analysis for Callbacks
• Second-order “”
  - A pair of strings
  - Reachability for a 4-tuple (x,a,b,y)
  - A pair of paths
  Figure labels: nodes x, a, b, y; edges {, }; CR_{a,b}(x,y)

21 Reachability Analysis for Callbacks
• Operators
  - Operations for TAL-reachability
  Figure labels: α, β, a, b, q, p

22 Reachability Analysis for Callbacks
• Algorithm
  - Result: concise and efficient summary
  - Keep three types of node (empirically about 10% of all nodes)
      boundary nodes (entries and exits of the library)
      chaining nodes
      hidden chaining nodes
  - Evaluation: about 8X speed-up in client-code analysis

23 Reachability Analysis for Callbacks
• Future Work
  - Callback analysis for real applications
      Android / Web applications
  - A more general case
      Handling multiple callbacks in a path

24 Conclusion
• An important question, but few research papers
  - Callbacks in summary-based analysis techniques
• Borrow ideas from other research fields
  - Tree adjoining language (NLP)
• Create conditions for unknown facts
  - Instantiate when facts are available

25 Thank you!

26 Library Summarization: TAL Reachability
• Complete TAL Grammar

27 Library Summarization: TAL Reachability
  Figure labels: nodes x1, x2, y1, y2; edges {_i, }_i
  {_i(x1, y1) + }_i(y2, x2) ⇒ CR_{y1,y2}(x1, x2)

28 Library Summarization: TAL Reachability
  Figure labels: nodes x1, x2, y
  CR_{y,y}(x1, x2) ⇒ UR(x1, x2)

29 Library Summarization: TAL Reachability
  Figure labels: nodes x1, x2, y1, y2, z1, z2
  CR_{y1,y2}(x1, x2) + CR_{z1,z2}(y1, y2) ⇒ CR_{z1,z2}(x1, x2)

30 Library Summarization: TAL Reachability
  Figure labels: nodes x0, x1, x2, y1, y2
  CR_{y1,y2}(x1, x2) + UR(x0, x1) ⇒ CR_{y1,y2}(x0, x2)

31 Reachability Analysis for Callbacks
• Keeping reachability between only boundary nodes is not sufficient
• Chaining nodes & Hidden chaining nodes
  - Chaining nodes (“connectors”): x1, x2
  - Hidden chaining nodes (“start/end nodes”): x0, x3
  Figure labels: nodes x0, x1, x2, x3; edges {_2, }_2, }_3, {_3, }_4, {_4

32 Reachability Analysis for Callbacks
• {_2 + }_2 ⇒ CR_{p,q}(a,b)
• {_3 + }_3 ⇒ CR_{r,s}(p,q)
• {_4 + }_4 ⇒ CR_{c,d}(r,s)
  Figure labels: library; nodes a, c, d, b, p, q, r, s; edges {_2, {_3, {_4, }_4, }_3, }_2

33 Reachability Analysis for Callbacks
• {_2 + }_2 ⇒ CR_{p,q}(a,b)
• {_3 + }_3 ⇒ CR_{r,s}(p,q)
• {_4 + }_4 ⇒ CR_{c,d}(r,s)
• CR_{p,q}(a,b) + CR_{r,s}(p,q) ⇒ CR_{r,s}(a,b)
• CR_{r,s}(p,q) + CR_{c,d}(r,s) ⇒ CR_{c,d}(p,q)
• CR_{r,s}(a,b) + CR_{c,d}(r,s) ⇒ CR_{c,d}(a,b)
  Figure labels: library; nodes a, c, d, b, p, q, r, s; edges {_2, {_3, {_4, }_4, }_3, }_2
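The derivation above can be replayed mechanically. The sketch below covers only the four inference rules shown on slides 27 to 30, not the complete TAL grammar, and it is a naive fixpoint rather than the speaker's algorithm. The function name `tal_summary` and the encoding of CR_{y1,y2}(x1, x2) as the tuple (x1, y1, y2, x2) are assumptions, as is the edge layout read off the figure labels: a {_2 p, p {_3 r, r {_4 c on the way in, and d }_4 s, s }_3 q, q }_2 b on the way out.

```python
def tal_summary(calls, returns, unconditional=()):
    """Derive conditional (CR) and unconditional (UR) reachability facts.

    calls:   set of (u, v, i) edges labeled {_i
    returns: set of (u, v, i) edges labeled }_i
    A CR fact (x1, y1, y2, x2) means: x1 ~> x2, if y1 ~> y2.
    """
    ur = set(unconditional)
    cr = set()
    # Rule 1: {_i(x1, y1) + }_i(y2, x2)  =>  CR_{y1,y2}(x1, x2)
    for (x1, y1, i) in calls:
        for (y2, x2, j) in returns:
            if i == j:
                cr.add((x1, y1, y2, x2))
    changed = True
    while changed:
        new_cr, new_ur = set(), set()
        for (x1, y1, y2, x2) in cr:
            # Rule 2: CR_{y,y}(x1, x2)  =>  UR(x1, x2)
            if y1 == y2:
                new_ur.add((x1, x2))
            # Rule 3: CR_{y1,y2}(x1,x2) + CR_{z1,z2}(y1,y2) => CR_{z1,z2}(x1,x2)
            for (p, z1, z2, q) in cr:
                if (p, q) == (y1, y2):
                    new_cr.add((x1, z1, z2, x2))
            # Rule 4: CR_{y1,y2}(x1,x2) + UR(x0,x1) => CR_{y1,y2}(x0,x2)
            for (x0, target) in ur:
                if target == x1:
                    new_cr.add((x0, y1, y2, x2))
        changed = not (new_cr <= cr and new_ur <= ur)
        cr |= new_cr
        ur |= new_ur
    return cr, ur


calls = {("a", "p", 2), ("p", "r", 3), ("r", "c", 4)}
returns = {("q", "b", 2), ("s", "q", 3), ("d", "s", 4)}

cr, ur = tal_summary(calls, returns)
print(("a", "c", "d", "b") in cr)  # True, i.e. CR_{c,d}(a,b)
print(len(cr))                     # 6
```

The six derived facts are the six CR relationships listed on the slide; CR_{c,d}(a,b) is the one that relates the library's boundary nodes to the callback site.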

34 Reachability Analysis for Callbacks
• CR_{p,q}(a,b)
• CR_{r,s}(p,q)
• CR_{c,d}(r,s)
• CR_{r,s}(a,b)
• CR_{c,d}(p,q)
• CR_{c,d}(a,b)
  Figure labels: library; nodes a, c, d, b, p, q, r, s; edges {_2, {_3, {_4, }_4, }_3, }_2; redundant reachability relationships; boundary nodes

35 Reachability Analysis for Callbacks
• Evaluation: 15 subjects
  - Fundamental nodes: <10%
      Boundary nodes
      Chaining nodes
      Hidden chaining nodes

36 Reachability Analysis for Callbacks
• Evaluation: library summarization
  - 3.16X slow-down
  - More memory required

37 Reachability Analysis for Callbacks
• Evaluation: client-code analysis
  - 8.24X speed-up
  - Less memory required

