Presentation is loading. Please wait.

Presentation is loading. Please wait.

Spring 2016 Program Analysis and Verification

Similar presentations


Presentation on theme: "Spring 2016 Program Analysis and Verification"— Presentation transcript:

1 Spring 2016 Program Analysis and Verification
Lecture 18: Interprocedural Analysis Roman Manevich Ben-Gurion University

2 Syllabus Program Verification Program Analysis Basics
Operational semantics Hoare Logic Applying Hoare Logic Weakest Precondition Calculus Proving Termination Data structures Automated Verification Program Analysis Basics From Hoare Logic to Static Analysis Control Flow Graphs Equation Systems Collecting Semantics Using Soot Abstract Interpretation fundamentals Lattices Fixed-Points Chaotic Iteration Galois Connections Domain constructors Widening/ Narrowing Analysis Techniques Numerical Domains Pointer analysis Shape Analysis Interprocedural Analysis

3 Previously Shape analysis TVLA

4 Today Handling procedures

5 Analyzing Procedures

6 Interprocedural Analysis
Until now assumed no procedures Intraprocedural analysis: analyzing one procedure at a time x = foo(…) interpreted by forgetting everything about x Very conservative analysis Does not handle global variables or heap Interprocedural analysis: uses calling relationships among procedures Enables more precise analysis information

7 Interprocedural analysis
foo() bar() Call bar() The effect of calling a procedure is the effect of executing its body – a big step

8 Interprocedural analysis
foo() bar() Call bar() Goal: compute the abstract effect of calling a procedure

9 Interprocedural analysis challenges
Stack can grow without a bound Matching of call/return

10 Reduction to intraprocedural analysis
Procedure inlining Naive solution: call-as-goto

11 Solution attempt #1 Inline callees into callers Good Bad
End up with one big procedure CFGs of individual procedures = duplicated many times Good Reduce to intraprocedural analysis: can use existing analyses as is Precise: distinguishes different calls to the same function Bad Exponential blow-up of code size Not efficient: re-analyze procedure bodies many times Doesn’t work with recursion

12 Inlining example void main() { int x; x = p(7); x = p(9); }
int p(int a) { return a + 1; }

13 Inlining example void main() { int x; int ret_p; { int a = 7; ret_p = a + 1; } x = ret_p; { int a = 9; ret_p = a + 1; } x = ret_p; } int p(int a) { return a + 1; } {x =8} {x =10}

14 Inlining: exponential blowup
main() { f1(); } f1() { f2(); fn() { ... }

15 Solution attempt #2 Build a “supergraph” = inter-procedural CFG
Replace each call from P to Q with An edge from point before the call (call point) to Q’s entry point An edge from Q’s exit point to the point after the call (return pt) Add assignments of actual arguments to formal arguments, and assignment of return value Good: efficient Graph of each function included exactly once in the supergraph Works for recursive functions (although local variables need additional treatment) Bad: imprecise, “context-insensitive” The “unrealizable paths problem”: dataflow facts can propagate along infeasible control paths

16 Method calls in detail x=1; y=2; z=0; L1: z = foo(x, y, z)
assert x==1 && y==2 && z==3 foo(a, b, c) { int i = a+b; return i; }

17 Method calls in detail entry node call node Call context
x=1; y=2; z=0; L1: z = foo(x, y, z) L1 foo(a, b, c) int i = a+b; foo_ret = i; Call context assert x==1 && y==2 && z==3 L1 return; exit node return node

18 Simple example: CP int p(int a) { void main() { return a + 1; int x ;
} void main() { int x ; x = p(7); x = p(9) ; }

19 Simple example: CP int p(int a) { void main() { [a 7] int x ;
return a + 1; } void main() { int x ; x = p(7); x = p(9) ; }

20 Simple example: CP Special returned-value variable int p(int a) {
return a + 1; [a 7, $$ 8] } void main() { int x ; x = p(7); x = p(9) ; }

21 Simple example: CP int p(int a) { void main() { [a 7] int x ;
return a + 1; [a 7, $$ 8] } void main() { int x ; x = p(7); [x 8] x = p(9) ; }

22 Simple example: CP int p(int a) { void main() { [a 7] int x ;
return a + 1; [a 7, $$ 8] } void main() { int x ; x = p(7); [x 8] x = p(9) ; }

23 Simple example: CP int p(int a) { void main() { [a ] int x ;
return a + 1; [a 7, $$ 8] } void main() { int x ; x = p(7); [x 8] x = p(9) ; }

24 Simple example: CP int p(int a) { void main() { [a ] int x ;
return a + 1; [a , $$ ] } void main() { int x ; x = p(7); [x 8] x = p(9); }

25 Simple example: CP int p(int a) { void main() { [a ] int x ;
return a + 1; [a , $$ ] } void main() { int x ; x = p(7) ; [x ] x = p(9) ; }

26 A naive interprocedural solution
Treat procedure calls as gotos Pros: Simple Usually fast Cons: Abstracts away call/return correlations: context-insensitive Obtain a conservative solution

27 why was the naive solution less precise?
Analysis by reduction Procedure inlining Call-as-goto void main() { int x ; x = p(7) ; [x ] x = p(9) ; } int p(int a) { [a ] return a + 1; [a , $$ ] } void main() { int a, x, ret; a = 7; ret = a+1; x = ret; a = 9; ret = a+1; x = ret; } [a ⊥, x ⊥, ret ⊥] [a 7, x 8, ret 8] [a 9, x 10, ret 10] why was the naive solution less precise?

28 Stack regime R P P() { … R(); } Q() { … R(); } R(){ … } call call
return return R P

29 Guiding light Exploit stack regime Precision Efficiency
Main idea / guiding light

30 Simplifying assumptions
Parameter passed by value No procedure nesting No concurrency Recursion is supported

31 Unrealizable paths zoo() foo() bar() Call bar() Call bar()

32 IVP: Interprocedural Valid Paths
f1 callq ret f2 fk-1 fk f3 ( ) fk-2 enterq exitq f4 fk-3 f5 IVP: all paths with matching calls and returns And prefixes

33 Valid paths zoo() foo() bar() (1 (2 Call bar() Call bar() )2 )1

34 Interprocedural valid paths
IVP set of paths Start at program entry Only considers matching calls and returns aka, valid Can be defined via context-free grammar matched ::= matched (i matched )i | ε valid ::= valid (i matched | matched paths can be defined by a regular expression

35 Sharir and Pnueli ‘82 Call String approach Functional approach
Blend interprocedural flow with intra procedural flow Tag every dataflow fact with call history Functional approach Determine the effect of a procedure E.g., in/out map Construct abstract transformers for entire procedure On-the-fly Functional approach very similar to Cousot & Cousot

36 The call string approach
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Ctx = Calling contexts in program = {c1, c2, …} Program locations of method call statements CtxStr = Ctx* Intuitively – represent all possible call stacks A flat abstract domain (different call strings incomparable) Abstract transformers C: call f()# = ? return from C: call f()# = ?

37 Call strings abstract domain
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Ctx = Calling contexts in program = {c1, c2, …} Program locations of method call statements CtxStr = Ctx* Intuitively – represent all possible call stacks IPA = (CtxStrDA, , , , , ) (c1c2c3, d1)  (c1c2c3, d2) = ?

38 Call strings abstract domain
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Ctx = Calling contexts in program = {c1, c2, …} Program locations of method call statements CtxStr = Ctx* Intuitively – represent all possible call stacks IPA = (CtxStrDA, , , , , ) (c1c2c3, d1)  (c1c2c3, d2) = (c1c2c3, d1A d2)

39 Call strings abstract domain
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Ctx = Calling contexts in program = {c1, c2, …} Program locations of method call statements CtxStr = Ctx* Intuitively – represent all possible call stacks IPA = (CtxStrDA, , , , , ) (c1c2c3, d1)  (c1c2c3, d2) = (c1c2c3, d1A d2) Obeys ascending chain condition? Use Chaotic iterations over the supergraph

40 Extending analysis with call strings
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Construct abstract domain associating a single abstract state with each call string IPA = (CtxStrDA, , , , , ) (c1c2c3, d1)  (c1c2c3, d2) = ?

41 Extending analysis with call strings
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Construct abstract domain associating a single abstract state with each call string IPA = (CtxStrDA, , , , , ) (c1c2c3, d1)  (c1c2c3, d2) = (c1c2c3, d1A d2)

42 Extending analysis with call strings
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Construct abstract domain associating a single abstract state with each call string IPA = (CtxStrDA, , , , , ) (c1c2c3, d1)  (c1c2c3, d2) = (c1c2c3, d1A d2) (c1c2c3, d1)  (c1c2c4, d2) = ?

43 Extending analysis with call strings
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Construct abstract domain associating a single abstract state with each call string IPA = (CtxStrDA, , , , , ) (c1c2c3, d1)  (c1c2c3, d2) = (c1c2c3, d1A d2) (c1c2c3, d1)  (c1c2c4, d2) = {(c1c2c3, d1), (c1c2c4, d2)}

44 Extending analysis with call strings
Abstract domain used for intraprocedural analysis A = (DA, A, A, A, A, A) Construct abstract domain associating a single abstract state with each call string IPA = (CtxStrDA, , , , , ) Use Chaotic iterations over the supergraph Obeys ascending chain condition?

45 Supergraph Q P R fc2e f1 fc2e f1 f1 f1 f2 f3 f1 fx2r f5 f3 fx2r f6
What happens here? Q P R Entry node sP sQ sR Call node fc2e f1 Call node fc2e f1 f1 f1 n4 n6 n1 n2 Call R Call R f2 f3 f1 n5 n3 n7 Return node Return node fx2r f5 f3 fx2r f6 eP eR eQ What happens here? Exit node

46 Simple call-string example
int p(int a) { return a + 1; } void main() { int x ; c1: x = p(7); c2: x = p(9) ; }

47 Simple call-string example
int p(int a) { c1: [a7] return a + 1; } void main() { int x ; c1: x = p(7); c2: x = p(9) ; }

48 Simple call-string example
int p(int a) { c1: [a7] return a + 1; c1: [a7, $$8] } void main() { int x ; c1: x = p(7); c2: x = p(9) ; }

49 Simple call-string example
int p(int a) { c1: [a7] return a + 1; c1: [a7, $$8] } void main() { int x ; c1: x = p(7); : [x8] c2: x = p(9) ; }

50 Simple call-string example
int p(int a) { c1: [a7] return a + 1; c1: [a7, $$8] } void main() { int x ; c1: x = p(7); : [x8] c2: x = p(9) ; }

51 Simple call-string example
int p(int a) { c1: [a7] c2: [a9] return a + 1; c1: [a7, $$8] } void main() { int x ; c1: x = p(7); : [x8] c2: x = p(9) ; }

52 Simple call-string example
int p(int a) { c1: [a7] c2: [a9] return a + 1; c1: [a7, $$8] c2: [a9, $$10] } void main() { int x ; c1: x = p(7); : [x8] c2: x = p(9) ; }

53 Simple call-string example
int p(int a) { c1: [a7] c2: [a9] return a + 1; c1: [a7, $$8] c2: [a9, $$10] } void main() { int x ; c1: x = p(7); : [x8] c2: x = p(9) ; : [x10] }

54 Enforcing termination
Adding arbitrary call strings violates ACC Analysis will not terminate for recursive procedures Too heavy even without recursion – exponential blow-up in number of possible call strings What can we do?

55 Abstracting call strings
CtxStr = Ctx* Can abstract CtxStr into a finite domain in different ways Most popular: k-limited suffix abstraction (keep only last k contexts) [k](c1…cn cn+1…cn+k) = cn+1…cn+k What we will talk about in this class Set abstraction: store all contexts in a set set(c1, c2, c1, c2, c3) = {c1, c2, c3} Forgets order of calls Combinations More generally – abstract using finite automata/regular expressions

56 Another example (|cs|=2)
void main() { int x ; c1: x = p(7);  : [x  16] c2: x = p(9) ;  : [x  20] } int p(int a) { c1:[a 7] c2:[a 9] return c3: p1(a + 1); c1:[a 7, $$ 16] c2:[a 9, $$ 20] } int p1(int b) { c1.c3:[b 8] c2.c3:[b 10] return 2 * b; c1.c3:[b 8,$$16] c2.c3:[b 10,$$20] }

57 Another example (|cs|=1)
void main() { int x ; c1: x = p(7);  : [x  ] c2: x = p(9) ; } int p(int a) { c1:[a 7] c2:[a 9] return c3: p1(a + 1); c1:[a 7, $$ ] c2:[a 9, $$ ] } int p1(int b) { (c1|c2)c3:[b ] return 2 * b; (c1|c2)c3:[b , $$] }

58 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] if (…) { a = a -1 ; c1: [a  6] c2: p (a); a = a + 1; } x = -2*a + 5;

59 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] c1.c2: [a  6] if (…) { c1: [a  7] c1.c2: [a  6] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2: p (a); a = a + 1; } x = -2*a + 5;

60 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] c1.c2: [a  6] c2.c2: [a  5] if (…) { c1: [a  7] c1.c2: [a  6] c2.c2: [a  5] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2.c2: [a  4] c2: p (a); a = a + 1; } x = -2*a + 5;

61 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] c1.c2: [a  6] c2.c2: [a  5]  c2.c2: [a  4] if (…) { c1: [a  7] c1.c2: [a  6] c2.c2: [a  5] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2.c2: [a  4] c2: p (a); a = a + 1; } x = -2*a + 5;

62 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] c1.c2: [a  6] c2.c2: [a  ] if (…) { c1: [a  7] c1.c2: [a  6] c2.c2: [a  5] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2.c2: [a  4] c2: p (a); a = a + 1; } x = -2*a + 5;

63 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] c1.c2: [a  6] c2.c2: [a ] if (…) { c1: [a  7] c1.c2: [a  6] c2.c2: [a ] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2.c2: [a ] c2: p (a); a = a + 1; } x = -2*a + 5;

64 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] c1.c2: [a  6] c2.c2: [a ] if (…) { c1: [a  7] c1.c2: [a  6] c2.c2: [a ] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2.c2: [a ] c2: p (a); a = a + 1; } x = -2*a + 5; c1: [a7, x9] c1.c2: [a6, x7] c2.c2: [a , x7]

65 Handling recursion (|cs|=2)
void main() { c1: p(7);  : [x ] } int p(int a) { c1: [a  7] c1.c2: [a  6] c2.c2: [a ] if (…) { c1: [a  7] c1.c2: [a  6] c2.c2: [a ] a = a -1 ; c1: [a  6] c1.c2: [a  5] c2.c2: [a ] c2: p (a); a = a + 1; c1: [a  7] c1.c2: [a  6] c2.c2: [a ] } x = -2*a + 5; c1: [a7, x9] c1.c2: [a6, x7] c2.c2: [a , x7] What should the value of a be here?

66 Summary: call string Associate abstract state with each call string
Apply chaotic iterations to supergraph To guarantee termination apply abstraction to call string (e.g., suffix  2) Represents tails of calls Exponential in call-string length A kind of abstract inline Simple Often loses precision under recursion Although can still be precise in some cases

67 Functional approach The meaning of a procedure is mapping from states into states The abstract meaning of a procedure is function from abstract state to abstract states Procedure summary Usually implemented as a table and constructed on-the-fly

68 Interprocedural shape analysis for cutpoint-free programs
Noam Rinetzky Tel Aviv University Mooly Sagiv Tel Aviv University Eran Yahav Technion

69 How to handle procedures?
Pure functions Procedure  input/output relation No side-effects p ret 1 2 3 .. main() { int w=0,x=0,y=0,z=0; w = inc(y); x = inc(z); assert: w+x is even } int inc(int p) { return 2 + p - 1; }

70 How to handle procedures?
Pure functions Procedure  input/output relation No side-effects p ret Even Odd Tabulation main() { int w=0,x=0,y=0,z=0; w = inc(y); x = inc(z); assert: w+x is even } int inc(int p) { return 2 + p - 1; } w x y z E O E O E

71 What about global variables?
p g ret g’ 1 Procedures have side-effects Easy fix p g ret g’ Even E/O Odd int g = 0; g = p; int g = 0; main() { int w=0,x=0,y=0,z=0; w = inc(y); x = inc(z); assert: w+x+g is even } int inc(int p) { g = p; return 2 + p - 1; }

72 But what about pointers and heap?
Aliasing Destructive update Heap Global resource Anonymous objects x.n.n ~ y n n append(y,z) n y.n=z x z x.n.n.n ~ z y How to tabulate append?

73 How to tabulate procedures?
Procedure  input/output relation Not reachable  Not effected proc: local (reachable) heap  local heap main() { append(y,z); } p q append(List p, List q) { } p q x n t x n t n y z y n z p q n n

74 How to handle sharing? External sharing may break the functional view
main() { append(y,z); } p q n append(List p, List q) { } p q n n n y t t x z y n z p q n x n

75 What’s the difference? n n n n x y append(y,z); append(y,z); t t y z x
1st Example 2nd Example append(y,z); append(y,z); x n t n n n y y t z x z

76 Cutpoints An object is a cutpoint for an invocation
Reachable from actual parameters Not pointed to by an actual parameter Reachable without going through a parameter append(y,z) append(y,z) n n n n y y n n t t x z z

77 Cutpoint freedom Cutpoint-free Invocation: has no cutpoints
Execution: every invocation is cutpoint-free Program: every execution is cutpoint-free append(y,z) append(y,z) n n n n y x t x t y z z

78 Memory states A memory state encodes a local heap
Local variables of the current procedure invocation Relevant part of the heap Relevant  Reachable main append p q n n x t y z

79 Interprocedural shape analysis
Tabulation exits p p call f(x) y x x y

80 Interprocedural shape analysis
Analyze f p p Tabulation exits p p call f(x) y x x y

81 Interprocedural shape analysis
Procedure  input/output relation Input Output q q rq rq p q p q n rp rq rp rq rp p n n q p q n n n rp rp rq rp rp rp rq

82 Interprocedural shape analysis
Reusable procedure summaries Heap modularity q p rp rq n y n z x append(y,z) rx ry rz y x z n rx ry rz append(y,z) g h i k n rg rh ri rk append(h,i)

83 Interprocedural analysis summary
Call string approach Simple Efficient for coarse abstractions of call stacks Not modular Functional approach More complex chaotic iteration algorithm (not shown in detail here) Often uses tabulation Modular Better handles recursion

84 We’ve covered all the course material Good luck with the home exam!


Download ppt "Spring 2016 Program Analysis and Verification"

Similar presentations


Ads by Google