Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code
Hongtao Yu, Zhaoqing Zhang, Xiaobing Feng, Wei Huo, Jingling Xue


1 Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code
Hongtao Yu, Zhaoqing Zhang, Xiaobing Feng, Wei Huo
Institute of Computing Technology, Chinese Academy of Sciences
{ htyu, zqzhang, fxb, huowei }@ict.ac.cn
Jingling Xue
University of New South Wales
jingling@cse.unsw.edu.au

2 INSTITUTE OF COMPUTING TECHNOLOGY
Outline
– Introduction
– Framework
– Analyzing a Level
– Experiments
– Conclusion

3 Introduction
Motivation
– Who needs flow- and context-sensitive (FSCS) pointer analysis?
  Software checking tools
  Program understanding
  Parallelization tools
  Hardware synthesis
– Existing methods cannot scale to large real programs
– Aiming at millions of lines of C code

4 Improve scalability
For flow-sensitivity
– Decrease the number of iterations in the dataflow analysis
– Save space in the points-to graph
For context-sensitivity
– Summary-based
– Low storage penalty
– Low apply penalty

5 Idea
Level-by-level analysis
– Analyze the pointers in decreasing order of their points-to levels.
  Suppose int **q, *p, x; then q has level 2, p has level 1, and x has level 0.
– Fast flow-sensitive analysis on full-sparse SSA
– Fast and accurate context-sensitive analysis using a full transfer function

6 Contribution
– Performs a full-sparse flow-sensitive pointer analysis using a flow-insensitive algorithm
– Performs a context-sensitive pointer analysis efficiently with a precise full transfer function
– Yields flow- and context-sensitive interprocedural may/must mod/ref information on a compact SSA form
– Analyzes a million lines of code in minutes, faster than state-of-the-art FSCS pointer analysis algorithms

7 Framework
Figure 1. Level-by-level pointer analysis (LevPA).
– Compute points-to levels
– For each points-to level, from the highest to the lowest:
  Bottom-up: evaluate transfer functions
  Top-down: propagate points-to sets
– Incrementally build the call graph

8 Points-to level
Property 1. If a variable x is possibly pointed to by a pointer y, then ptl(x) ≤ ptl(y).
Property 2. If a variable y is possibly assigned to x, then ptl(x) = ptl(y).
Points-to levels are computed by a unification-based pointer analysis.
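As a rough illustration of how a unification-based pass can yield points-to levels, the sketch below runs a Steensgaard-style union-find over the pointer statements of the example program used in the following slides. The class `Unifier`, its method names, and the temporaries `t1` and `t2` are hypothetical, and the sketch assumes the unified points-to graph is acyclic; it is not the authors' implementation.

```python
# Sketch only: Steensgaard-style unification used to derive points-to levels.
# All names here are hypothetical; cycles in the points-to graph are not handled.
class Unifier:
    def __init__(self):
        self.parent = {}    # union-find parent links
        self.pointee = {}   # class representative -> a member of its pointee class
        self.fresh = 0      # counter for placeholder locations

    def find(self, v):
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        tb = self.pointee.pop(b, None)
        if tb is not None:
            if a in self.pointee:
                self.union(self.pointee[a], tb)  # merged classes share one pointee class
            else:
                self.pointee[a] = tb

    def target(self, v):
        """Representative of the class v points to (a placeholder is created if unknown)."""
        r = self.find(v)
        if r not in self.pointee:
            self.fresh += 1
            self.pointee[r] = ("$loc", self.fresh)
        return self.find(self.pointee[r])

    def addr(self, x, a):    # x = &a
        self.union(self.target(x), a)

    def copy(self, x, y):    # x = y
        self.union(self.target(x), self.target(y))

    def load(self, x, y):    # x = *y
        self.union(self.target(x), self.target(self.target(y)))

    def store(self, x, y):   # *x = y
        self.union(self.target(self.target(x)), self.target(y))

    def level(self, v):
        """Length of the points-to chain starting at v's class."""
        r = self.find(v)
        return 0 if r not in self.pointee else 1 + self.level(self.pointee[r])


u = Unifier()
for x, a in [("x", "a"), ("y", "b"), ("x", "c"), ("y", "e"),
             ("x", "d"), ("y", "d"), ("c", "t")]:
    u.addr(x, a)
u.copy("p", "x")      # parameter binding at the calls to foo
u.copy("q", "y")
u.load("t1", "q")     # *p = *q, split through a temporary t1
u.store("p", "t1")
u.addr("t2", "obj")   # *q = &obj, split through a temporary t2
u.store("q", "t2")

levels = {v: u.level(v) for v in ["x", "y", "p", "q", "a", "b", "c", "d", "e", "t", "obj"]}
```

Under these assumptions `levels` assigns 2 to x, y, p, q, 1 to a through e, and 0 to t and obj, which agrees with the levels stated on the example slide.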

9 Example
    int obj, t;
    main() {
      L1:  int **x, **y;
      L2:  int *a, *b, *c, *d, *e;
      L3:  x = &a; y = &b;
      L4:  foo(x, y);
      L5:  *b = 5;
      L6:  if ( … ) { x = &c; y = &e; }
      L7:  else { x = &d; y = &d; }
      L8:  c = &t;
      L9:  foo(x, y);
      L10: *e = 10;
    }
    void foo(int **p, int **q) {
      L11: *p = *q;
      L12: *q = &obj;
    }
ptl(x, y, p, q) = 2
ptl(a, b, c, d, e) = 1
ptl(t, obj) = 0
Analyze { x, y, p, q } first, then { a, b, c, d, e }, and { t, obj } last.

10 Bottom-up analyze level 2
    void foo(int **p, int **q) {
      L11: *p = *q;
      L12: *q = &obj;
    }
    main() {
      L1:  int **x, **y;
      L2:  int *a, *b, *c, *d, *e;
      L3:  x = &a; y = &b;
      L4:  foo(x, y);
      L5:  *b = 5;
      L6:  if ( … ) { x = &c; y = &e; }
      L7:  else { x = &d; y = &d; }
      L8:  c = &t;
      L9:  foo(x, y);
      L10: *e = 10;
    }

11 Bottom-up analyze level 2
    void foo(int **p, int **q) {
      L11: *p1 = *q1;
      L12: *q1 = &obj;
    }
    main() {
      L1:  int **x, **y;
      L2:  int *a, *b, *c, *d, *e;
      L3:  x = &a; y = &b;
      L4:  foo(x, y);
      L5:  *b = 5;
      L6:  if ( … ) { x = &c; y = &e; }
      L7:  else { x = &d; y = &d; }
      L8:  c = &t;
      L9:  foo(x, y);
      L10: *e = 10;
    }
p1's points-to set depends on formal-in p.
q1's points-to set depends on formal-in q.

12 Bottom-up analyze level 2
    void foo(int **p, int **q) {
      L11: *p1 = *q1;
      L12: *q1 = &obj;
    }
    main() {
      L1:  int **x, **y;
      L2:  int *a, *b, *c, *d, *e;
      L3:  x1 = &a; y1 = &b;
      L4:  foo(x1, y1);
      L5:  *b = 5;
      L6:  if ( … ) { x2 = &c; y2 = &e; }
      L7:  else { x3 = &d; y3 = &d; }
           x4 = ϕ(x2, x3); y4 = ϕ(y2, y3)
      L8:  c = &t;
      L9:  foo(x4, y4);
      L10: *e = 10;
    }
p1's points-to set depends on formal-in p.
q1's points-to set depends on formal-in q.
x1 → { a }    y1 → { b }
x2 → { c }    y2 → { e }
x3 → { d }    y3 → { d }
x4 → { c, d } y4 → { e, d }

13 Full-sparse Analysis
Achieve flow-sensitivity flow-insensitively
– Regard each SSA name as a unique variable
– Use a set-constraint-based pointer analysis
Full sparse
– Saves time
– Saves space
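The idea of treating each SSA name as its own variable can be sketched with a small inclusion-based (set-constraint) solver. The function `solve` and the constraint encoding are assumptions made for illustration; the statements are the level-2 assignments of main in SSA form as shown on the slides, with each ϕ modelled as two copy constraints.

```python
# Sketch only: a flow-insensitive inclusion-based solver run over SSA names.
# `solve` and the tuple encoding of constraints are hypothetical.
def solve(addr_of, copies):
    """addr_of: (lhs, obj) for lhs = &obj; copies: (lhs, rhs) for lhs ⊇ rhs."""
    pts = {}
    for lhs, obj in addr_of:
        pts.setdefault(lhs, set()).add(obj)
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for lhs, rhs in copies:
            target = pts.setdefault(lhs, set())
            before = len(target)
            target |= pts.get(rhs, set())
            if len(target) != before:
                changed = True
    return pts


addr_of = [("x1", "a"), ("y1", "b"),
           ("x2", "c"), ("y2", "e"),
           ("x3", "d"), ("y3", "d")]
# x4 = ϕ(x2, x3) and y4 = ϕ(y2, y3) become plain inclusion constraints.
copies = [("x4", "x2"), ("x4", "x3"), ("y4", "y2"), ("y4", "y3")]

pts = solve(addr_of, copies)
```

Because every SSA name has a single definition, the flow-insensitive result for `x4` and `y4` is already the flow-sensitive one at the second call to foo.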

14 Top-down analyze level 2
main: propagate to callsites
L4: foo.p → { a }, foo.q → { b }
L9: foo.p → { c, d }, foo.q → { d, e }
Merged: foo.p → { a, c, d }, foo.q → { b, d, e }
    void foo(int **p, int **q) {
      L11: *p = *q;
      L12: *q = &obj;
    }
    main() {
      L1:  int **x, **y;
      L2:  int *a, *b, *c, *d, *e;
      L3:  x = &a; y = &b;
      L4:  foo(x, y);
      L5:  *b = 5;
      L6:  if ( … ) { x = &c; y = &e; }
      L7:  else { x = &d; y = &d; }
      L8:  c = &t;
      L9:  foo(x, y);
      L10: *e = 10;
    }

15 Top-down analyze level 2
foo: expand pointer dereferences (calling contexts are merged here)
    void foo(int **p, int **q) {
           μ(b, d, e)
      L11: *p1 = *q1;
           χ(a, c, d)
      L12: *q1 = &obj;
           χ(b, d, e)
    }
    void foo(int **p, int **q) {
      L11: *p = *q;
      L12: *q = &obj;
    }
    main() {
      L1:  int **x, **y;
      L2:  int *a, *b, *c, *d, *e;
      L3:  x = &a; y = &b;
      L4:  foo(x, y);
      L5:  *b = 5;
      L6:  if ( … ) { x = &c; y = &e; }
      L7:  else { x = &d; y = &d; }
      L8:  c = &t;
      L9:  foo(x, y);
      L10: *e = 10;
    }
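The top-down step for level 2 can be sketched as two small operations: merge the points-to sets of foo's formals over all callsites, then expand each dereference into its μ (use) or χ (definition) operand list. The helper names `merge_contexts` and `expand` are hypothetical; the per-callsite sets are the ones shown on the slides.

```python
# Sketch only: merging calling contexts and expanding dereferences of formals.
# Helper names are hypothetical; the input sets come from the slides.
callsites = {
    "L4": {"p": {"a"}, "q": {"b"}},
    "L9": {"p": {"c", "d"}, "q": {"d", "e"}},
}


def merge_contexts(callsites):
    """Union the points-to sets of each formal over all callsites."""
    merged = {}
    for bindings in callsites.values():
        for formal, targets in bindings.items():
            merged.setdefault(formal, set()).update(targets)
    return merged


def expand(kind, formal, merged):
    """Operand list of a mu (use) or chi (def) for a dereference of `formal`."""
    return (kind, sorted(merged[formal]))


merged = merge_contexts(callsites)
annotations = {
    "L11": [expand("mu", "q", merged), expand("chi", "p", merged)],  # *p = *q
    "L12": [expand("chi", "q", merged)],                             # *q = &obj
}
```

Merging at this point loses which callsite contributed which target; the context conditions introduced on the next slides are what restore that distinction.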

16 Context Condition
To be context-sensitive:
Points-to relation c_i
– p ⇒ v (p → v): p must (may) point to v, where p is a formal parameter.
Context condition ℂ(c_1,…,c_k)
– A Boolean function consisting of higher-level points-to relations
Context-sensitive μ and χ
– μ(v_i, ℂ(c_1,…,c_k))
– v_{i+1} = χ(v_i, M, ℂ(c_1,…,c_k)), where M ∈ {may, must} indicates a weak/strong update

17 Context-sensitive μ and χ
    void foo(int **p, int **q) {
           μ(b, q ⇒ b)  μ(d, q → d)  μ(e, q → e)
      L11: *p1 = *q1;
           a = χ(a, must, p ⇒ a)  c = χ(c, may, p → c)  d = χ(d, may, p → d)
      L12: *q1 = &obj;
           b = χ(b, must, q ⇒ b)  d = χ(d, may, q → d)  e = χ(e, may, q → e)
    }
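One way to read the annotations above is that each target of a formal is guarded by the relation under which it is reached. The sketch below reproduces the χ guards for foo under the assumption that a relation is "must" exactly when the formal has a single target in that calling context; this rule and the function name `conditioned_chis` are illustrative assumptions, not a statement of the authors' algorithm.

```python
# Sketch only: deriving (variable, M, condition) guards for chi operations.
# Assumption: a formal with exactly one target at a callsite gives a "must" relation.
callsites = {
    "L4": {"p": {"a"}, "q": {"b"}},
    "L9": {"p": {"c", "d"}, "q": {"d", "e"}},
}


def conditioned_chis(formal, callsites):
    entries = set()
    for bindings in callsites.values():
        targets = bindings[formal]
        for v in targets:
            if len(targets) == 1:
                entries.add((v, "must", f"{formal} => {v}"))  # strong update
            else:
                entries.add((v, "may", f"{formal} -> {v}"))   # weak update
    return sorted(entries)


chis_L11 = conditioned_chis("p", callsites)  # guards for *p1 = ...
chis_L12 = conditioned_chis("q", callsites)  # guards for *q1 = ...
```

With the callsite sets from the slides, `chis_L11` lists a as must and c, d as may, and `chis_L12` lists b as must and d, e as may, matching the annotations on L11 and L12.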

18 Bottom-up analyze level 1
    void foo(int **p, int **q) {
           μ(b1, q ⇒ b)  μ(d1, q → d)  μ(e1, q → e)
      L11: *p1 = *q1;
           a2 = χ(a1, must, p ⇒ a)  c2 = χ(c1, may, p → c)  d2 = χ(d1, may, p → d)
      L12: *q1 = &obj;
           b2 = χ(b1, must, q ⇒ b)  d3 = χ(d2, may, q → d)  e2 = χ(e1, may, q → e)
    }

19 Points-to Set
Local Points-to Set
– Loc(p) = { <v, ℂ(c_1,…,c_k)> | ℂ(c_1,…,c_k) is a context condition }.
– p can point to v if and only if ℂ(c_1,…,c_k) holds.
– Loc(p) is computed explicitly during the bottom-up analysis.
Dependence Set
– Dep(p) = { <q, ℂ(c_1,…,c_k)> | q is a formal-in parameter of level lev and ℂ(c_1,…,c_k) is a context condition }.
– Ptr(p) includes Ptr(q) if and only if ℂ(c_1,…,c_k) holds.
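A minimal sketch of how Ptr(p) could be assembled from Loc(p) and Dep(p) once the relations holding at a callsite are known. Everything concrete here is an assumption for illustration: conditions are modelled as conjunctions (a frozenset of relations that must all hold), and the Loc and Dep entries for `e2` and `c2` are hand-derived from the example rather than taken from the slides.

```python
# Sketch only: resolving Ptr(name) from Loc and Dep under a set of known relations.
# The encoding and the concrete Loc/Dep entries below are hypothetical.
TRUE = frozenset()  # the empty conjunction always holds


def holds(cond, facts):
    """A condition is a frozenset of relations that must all be among the facts."""
    return cond <= facts


def resolve(name, loc, dep, facts, ptr_in):
    """Ptr(name) = guarded local targets plus guarded Ptr(q) of formal-ins q."""
    result = {v for v, cond in loc.get(name, ()) if holds(cond, facts)}
    for q, cond in dep.get(name, ()):
        if holds(cond, facts):
            result |= ptr_in.get(q, set())
    return result


loc = {"e2": [("obj", frozenset({"q -> e"}))]}
dep = {
    "e2": [("e", TRUE)],
    "c2": [("c", TRUE),
           ("d", frozenset({"p -> c", "q -> d"})),
           ("e", frozenset({"p -> c", "q -> e"}))],
}

# Relations holding at callsite L9 and the incoming values there (c1 = &t at L8).
facts_L9 = frozenset({"p -> c", "p -> d", "q -> d", "q -> e"})
ptr_in_L9 = {"c": {"t"}, "d": set(), "e": set()}

ptr_e2 = resolve("e2", loc, dep, facts_L9, ptr_in_L9)
ptr_c2 = resolve("c2", loc, dep, facts_L9, ptr_in_L9)
```

The point of the split is that Loc entries are resolved once inside the callee, while Dep entries defer to whatever the caller supplies for the formal-in parameters.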

20 Transfer function
Trans(proc, v) = < Loc(v), Dep(v), ℂ(c_1,…,c_k), M >
– v is a formal-out parameter and ℂ(c_1,…,c_k) is a context condition.
– v can be modified at a callsite invoking proc only if ℂ(c_1,…,c_k) holds at the callsite.
– M ∈ {may, must} indicates a may/must mod effect.
Trans(proc)
– The set of all individual transfer functions Trans(proc, v).

21 Bottom-up analyze level 1
    void foo(int **p, int **q) {
           μ(b1, q ⇒ b)  μ(d1, q → d)  μ(e1, q → e)
      L11: *p1 = *q1;
           a2 = χ(a1, must, p ⇒ a)  c2 = χ(c1, may, p → c)  d2 = χ(d1, may, p → d)
      L12: *q1 = &obj;
           b2 = χ(b1, must, q ⇒ b)  d3 = χ(d2, may, q → d)  e2 = χ(e1, may, q → e)
    }
Trans(foo, a) = < …, p ⇒ a, must >
Trans(foo, c) = < …, p → c, may >
Trans(foo, b) = < {…}, { }, q ⇒ b, must >
Trans(foo, e) = < {…}, { }, q → e, may >
Trans(foo, d) = < {…}, {…}, p → d ∨ q → d, may >

22 Bottom-up analyze level 1
    int obj, t;
    main() {
      L1:  int **x, **y;
      L2:  int *a, *b, *c, *d, *e;
      L3:  x1 = &a; y1 = &b;
           μ(b1, true)
      L4:  foo(x1, y1);
           a2 = χ(a1, must, true)  b2 = χ(b1, must, true)
      L5:  *b1 = 5;
      L6:  if ( … ) { x2 = &c; y2 = &e; }
      L7:  else { x3 = &d; y3 = &d; }
           x4 = ϕ(x2, x3)  y4 = ϕ(y2, y3)
      L8:  c1 = &t;
           μ(d1, true)  μ(e1, true)
      L9:  foo(x4, y4);
           c2 = χ(c1, may, true)  d2 = χ(d1, may, true)  e2 = χ(e1, may, true)
      L10: *e1 = 10;
    }
At L4, p ⇒ a and q ⇒ b hold.
At L9, p → c, p → d, q → e, and q → d hold.
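Applying foo's transfer functions at a callsite amounts to keeping the entries whose context condition holds there. In the sketch below only the condition and the M component of each Trans(foo, v) are modelled; the dictionary layout (a list of alternative relations, any of which suffices) and the function `apply_transfer` are assumptions made for illustration.

```python
# Sketch only: selecting the chi operations a call to foo induces at a callsite.
# Each entry maps a variable to (alternative relations, M); layout is hypothetical.
trans_foo = {
    "a": (["p => a"], "must"),
    "b": (["q => b"], "must"),
    "c": (["p -> c"], "may"),
    "d": (["p -> d", "q -> d"], "may"),   # p → d ∨ q → d
    "e": (["q -> e"], "may"),
}


def apply_transfer(trans, facts):
    """Return (variable, M) for every entry whose condition holds under `facts`."""
    return [(v, m) for v, (alts, m) in trans.items()
            if any(rel in facts for rel in alts)]


facts_L4 = {"p => a", "q => b"}
facts_L9 = {"p -> c", "p -> d", "q -> e", "q -> d"}

chis_at_L4 = apply_transfer(trans_foo, facts_L4)
chis_at_L9 = apply_transfer(trans_foo, facts_L9)
```

The two results correspond to the χ operations placed after the calls at L4 and L9 above, where each surviving condition has been simplified to true.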

23 BDD and context condition
Context conditions are implemented using BDDs
– Compact representation
– Efficient Boolean operations
[Figure: BDD for ℂ = (p → a ∧ q → a) ∨ p → b over the variables x1, x2, x3]
– Variable x1 represents p → a
– Variable x2 represents q → a
– Variable x3 represents p → b
If only p → b holds at a call site, we can evaluate ℂ | x1=0; x2=0; x3=1 to see whether ℂ holds at the call site.
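The restriction ℂ | x1=0; x2=0; x3=1 can be sketched with a hand-built decision diagram. This is an assumption-laden toy: nodes are plain tuples, the diagram is written out by hand for the variable order x1, x2, x3, and no real BDD package is involved.

```python
# Sketch only: a hand-built decision diagram for C = (x1 AND x2) OR x3.
# A node is (variable, low_child, high_child); leaves are the booleans False/True.
import itertools

X3 = ("x3", False, True)
X2 = ("x2", X3, True)
BDD = ("x1", X3, X2)


def restrict(node, assignment):
    """Cofactor the diagram by a (possibly partial) variable assignment."""
    if isinstance(node, bool):
        return node
    var, low, high = node
    if var in assignment:
        return restrict(high if assignment[var] else low, assignment)
    low_r, high_r = restrict(low, assignment), restrict(high, assignment)
    return low_r if low_r == high_r else (var, low_r, high_r)


# Only p → b holds at the call site: x1 = 0, x2 = 0, x3 = 1.
holds_at_callsite = restrict(BDD, {"x1": 0, "x2": 0, "x3": 1})

# Cross-check the diagram against the formula on all eight assignments.
agrees = all(
    restrict(BDD, {"x1": a, "x2": b, "x3": c}) == ((a and b) or c)
    for a, b, c in itertools.product([False, True], repeat=3)
)
```

A partial assignment leaves a smaller diagram rather than a constant, for example `restrict(BDD, {"x3": 0})` yields the diagram for x1 ∧ x2.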

24 Experiment
– Analyzes a million lines of code in minutes
– Faster than state-of-the-art FSCS pointer analysis algorithms
Table 2. Performance (secs).
Benchmark        KLOC   LevPA (64bit)   LevPA (32bit)   Bootstrapping (PLDI'08)
Icecast-2.3.1      22            2.18            5.73                        29
sendmail          115           72.63          143.68                       939
httpd             128           16.32           35.42                       161
445.gobmk         197           21.37           40.78                         /
wine-0.9.24      1905          502.29          891.16                         /
wireshark-1.2.2  2383          366.63          845.23                         /

25 Conclusion
– We present a scalable method for flow- and context-sensitive pointer analysis.
– It analyzes the pointers in a program level by level, in terms of their points-to levels.
  Fast flow-sensitive analysis on full-sparse SSA form
  Fast and accurate context-sensitive analysis using full transfer functions represented by BDDs
– It can analyze a million lines of C code in minutes, faster than state-of-the-art methods.

26 Thanks

