Presentation is loading. Please wait.

Presentation is loading. Please wait.

Probabilistic Pointer Analysis [PPA]

Similar presentations


Presentation on theme: "Probabilistic Pointer Analysis [PPA]"— Presentation transcript:

1 Probabilistic Pointer Analysis [PPA]
Presented by: Jeff DaSilva CARG April 12, 2005

2 The Pointer Alias Analysis Problem
Definitely Not Definitely Maybe Typically, ‘Maybe’ is treated like ‘Definitely’ *A = ~ ~ = *B Statically decide for any pair of pointers, at any point in the program, whether two pointers point to the same memory location. This problem is known to be undecidable. [Landi 1992]

3 The Compiler Writer vs. The Programming Language Designer
Pointers are needed by programmers to realize complex data structures Pointers can make life difficult for optimizing compilers Helping the compiler consists of any specification of program details which are not strictly necessary to solve the problem at hand. For example, the "const" keyword See

4 Concluding Remarks (from last time)
Traditional pointer analysis techniques are either overly conservative or are so complex that they fail to scale with respect to code size Examples include: Address taken, Anderson’s analysis, Steensgaard, Emami Pointer analysis is a very difficult problem that may never be adequately solved. Does hardware support for data speculation make the analysis easier for the compiler?

5 Support for Data Speculation Exists
EPIC instruction sets Uses explicit load/store speculation Thread Level Speculation (TLS) Speculative Compiler Optimizations Speculative PRE, register promotion and strength reduction [ ] ‘Speculative’ behavioral synthesis Why are these techniques not more widely used? All techniques rely on data dependence profiling. One Reason: Currently, they rely on data dependence profiling.

6 Example Speculative Compiler Optimization
void foo_1(int *a, int *b) { x = *a + 1; *b = x; chk a != b } Speculate a != b void foo(int *a, int *b) { for(i = 0; i < N; i++) x = *a + 1; *b = x; } void foo_2(int *a, int *b) { x = *a + N; *b = x; chk a == b } Speculate a == b

7 Example Thread Level Speculation (TLS)
int *vector_add(int *a, int *b) { for(i = 0; i < N; i++) a[i] = a[i] + b[i]; } return a;  Can this loop be parallelized? To take advantage of data speculation support, a probability metric and a cost function are required.

8 Probabilistic Pointer Analysis (PPA)
Probability = 0.0 Probability = 1.0 0.0 < p < 1.0 Definitely Not Definitely Maybe Typically, ‘Maybe’ is treated like ‘Definitely’ *A = ~ ~ = *B Statically estimate for any pair of pointers, at any point in the program, the probability that two pointers point to the same memory location. Statically decide for any pair of pointers, at any point in the program, whether two pointers point to the same memory location. Isn’t this problem even worse? This problem is known to be undecidable. [Landi 1992]

9 Outline PPA Objectives Probabilistic Pointer Analysis (PPA) Theory
The Probabilistic Points-To Graph The Points-To Matrix The Transformation Matrix An Example Some Preliminary Results

10 How is Pointer Analysis used?
Optimizing Compiler Dependence Analysis Executable Source Code

11 Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; r s t x y z a b c

12 Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; r s t x y z a b c

13 Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { x=&s; s=&c; } r s t x y z a b c

14 Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { x=&s; s=&c; } r=&b; z=&r; r s t x y z a b c

15 Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { x=&s; s=&c; } r=&b; z=&r; if(…) y = x; r s t x y z a b c

16 Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { x=&s; s=&c; } r=&b; z=&r; if(…) y = x; *x = &a; r s t x y z a b c

17 How is PPA used? Probabilistic Pointer Analysis Edge Profiling
Control Flow Edge Profiling (Optional) Probabilistic Pointer Analysis Optimizing Compiler Probabilistic Dependence Analysis Speculative Executable Source Code with Data Speculation Support

18 Definition: Probability Analysis
Let <p,v> denote a points-to relationship from a pointer p to a location v. At every static program point s there exists a probability function P(s, <p,v>) that denotes the probability that p points to v during dynamic program execution. P(s, <p,v>) = D(s, <p,v>) / D(s) Where D(s) is the number of times s is (expected to be) dynamically visited and D(s, <p,v>) is the number of times that the points-to relation <p,v> dynamically holds.

19 Algorithm should output a Safe Conservative may alias Probability
A probability of 0.0 [P(s,<p,v>) = 0.0] indicates that a points-to relation <p,v> will never hold. The converse in not necessarily true A probability of 1.0 [P(s,<p,v>) = 1.0] indicates that a points to relation <p,v> will always hold. The converse is also not necessarily true: a dynamic points-to relationship <p,v> that always exists may not be reported with a probability of 1.0

20 Probabilistic Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; r s t x y z a b c 1.0 1.0 1.0 1.0 1.0 1.0

21 Probabilistic Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { /*60% taken*/ x=&s; s=&c; } r s t x y z a b c 0.4 0.6 0.6 0.4

22 Probabilistic Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { /*60% taken*/ x=&s; s=&c; } r=&b; z=&r; r s t x y z a b c 0.6 0.4 0.6 0.4

23 Probabilistic Points-To Graph
x y z a b c int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { /*60% taken*/ x=&s; s=&c; } r=&b; z=&r; if(…) /*10% taken*/ y = x; 0.04 0.96 0.6 0.4 0.6 0.4

24 Probabilistic Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z; x=&r; y=&s; z=&t; r=&a; s=&b; z=&c; if(…) { /*60% taken*/ x=&s; s=&c; } r=&b; z=&r; if(…) /*10% taken*/ y = x; *x = &a; y x z 0.04 0.6 0.96 0.4 r s t 0.4 0.16 0.4 0.24 0.6 0.6 0.6 a b c  What is the probability that **y points to a?

25 Our PPA Algorithm Objectives
An Interprocedural, Flow Sensitive, Context Sensitive approach that uses Linear Transfer Functions. Must be scalable in time and space. Provides an approximate probability for the ‘Maybe’ output.

26 Accuracy/Efficiency Tradeoff
PPA Polynomial Time Complexity Inaccurate – many false ‘maybe’ outputs, but provides probability metric Doubly Exponential Accurate – very few ‘maybe’ outputs Does not scale Address-taken Anderson BDD based Chen, et al: Only Other PPA Steensgaard SPAN Emami Linear Time Complexity Inaccurate - many false ‘maybe’ outputs Memory Required: Negligible

27 Encoding the Probabilistic Points-To Graph The Points-To Matrix
The Probabilistic Points-To graph is encoded using a sparse Markov Matrix All elements are real numbers in the closed interval [0,1] All rows sum to 1.0 Each pointer set and location set is given a unique id representing it’s matrix row & column Rules for linearity: Pointers can only point to Location sets Location sets always point to themselves with probability 1.0

28 Points-To Matrix Structure
Pointer Sets Location Sets 1 2 3 … N-1 N 1 2 ø Area Of Interest Pointer Sets ø I Location Sets N-1 N

29 Points-To Matrix Example
int a, b, *p, *q; void main() { p = &a; q = &b; foo(); } void foo() for(i = 0; i < 10; i++) if(…) q = &a; else p = q; int a, b, *p, *q; void main() { p = &a; q = &b; foo(); } void foo() for(i = 0; i < 10; i++) if(…) /* 50% chance taken */ q = &a; else p = q; p a q b und p a q b und p a q b und

30 Another Points-To Matrix Example
q b und p a q b und 0.04 0.46 0.5

31 What about double pointers?
und q p a q b und 0.04 0.46 0.5

32 Basic Pointer Assignments
x = &y Address-of Assignment x = y Copy Assignment x = *y Load Assignment *x = y Store Assignment

33 Transforming the points-to matrix
Let Xs represent the probabilistic points-to matrix at a specific program point s. XIN Basic pointer assignment instruction XOUT Claim: There exists a transformation function T(X) for every instruction i, such that XOUT = Ti(XIN).

34 The fundamental PPA Equation
Points-To Matrix Out Transformation Matrix Points-To Matrix In =

35 Transformation Matrix Structure
Pointer Sets Location Sets 1 2 3 … N-1 N 1 2 Area of Interest Pointer Sets ø I Location Sets N-1 N

36 Example int a, b, *p, *q; void main() { p = &a; q = &b; foo() }
void foo() for(i = 0; i < 10; i++) if(…) q = &a; else p = q; p a q b und

37 Example int a, b, *p, *q; void main() { p = &a; q = &b; foo() }

38 = Example int a, b, *p, *q; void main() { p = &a; q = &b; foo() }
p a q b und p a q b und =

39 Combining Transformation Matrices
XIN T1: Basic pointer assignment instruction T2: Basic pointer assignment instruction XOUT XOUT = T2 T1 XIN

40 Combining Transformation Matrices Example
int a, b, *p, *q; void main() { p = &a; q = &b; foo(); }

41 Combining Transformation Matrices Example
int a, b, *p, *q; void main() { p = &a; q = &b; foo(); }

42 Combining Transformation Matrices Example
int a, b, *p, *q; void main() { p = &a; q = &b; foo(); }

43 Combining Transformation Matrices Example
=

44 Combining Transformation Matrices Example
int a, b, *p, *q; void main() { p = &a; q = &b; foo(); }

45 Combining Transformation Matrices Example
int a, b, *p, *q; void main() { p = &a; q = &b; foo(); } p a q b und p a q b und 0.001 0.99 0.999 0.01 =

46 How to handle Control Flow?
q = &a; < 9 > p = q; 0.5 int a, b, *p, *q; void main() { p = &a; q = &b; foo(); } void foo() for(i = 0; i < 10; i++) if(…) q = &a; else p = q; 1 3 2 4 [ ]^10 TF_foo = [0.5 +0.5 ] 4 3 2 1

47 PPA Infrastructure ICFG MATLAB C Library Suif Infrastructure
.spd files .spx files PPA Results ICFG Points-To Matrix Propagator [TD] Edge Profile Abstract Memory Model (AMM) Transformation Matrix Collector [BU] Transfer Function Builder (Suif2TF) PPA Matrix Builder MATLAB C Library MATLAB Debugging Script

48 Current Results Total CPU Time (sec) (m:s) Mem Required (mB) gzip 0.6
Spec2000 Total CPU Time (sec) (m:s) Mem Required (mB) gzip 0.6 0:1 8.72 vpr 3 0:3 14.51 gcc 23:1 376.69 mcf 0.14 0:0 5.84 crafty 4.05 0:4 34.01 parser 5.44 0:5 15.73 perlbmk 252.56 4:13 172.62 gap 142.12 2:22 76.29 vortex 98.55 1:39 61.83 bzip2 0.28 7.2 twolf 4.21 29.96

49 Current Results go 3.2 0:3 28.31 m88ksim 3.61 0:4 17.88 gcc 905.59
Spec95 Total CPU Time (sec) (min:sec) Mem Required (mB) go 3.2 0:3 28.31 m88ksim 3.61 0:4 17.88 gcc 905.59 15:6 346.08 compress 0.05 0:0 5.32 li 10.61 0:11 14.21 ijpeg 5.26 0:5 18.88 perl 26.02 0:26 94.18 vortex 99.79 1:40 61.83

50 The End… Any Comments or Questions?

51 References Peng-Sheng Chen, Ming-Yu Hung, Yuan-Shin Hwang, Roy Dz-Ching Ju, Jenq Kuen Lee. Compiler support for speculative multithreading architecture with probabilistic points-to analysis PPOPP 2003: 25-36 Only other PPA research Group Jin Lin, Tong Chen, Wei-Chung Hsu, Peng-Chung Yew, Roy Dz-Ching Ju, Tin-Fook Ngai and Sun Chan, “A Compiler Framework for Speculative Analysis and Optimizations,” in Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, San Diego, California, June 9-11, 2003, pp Roy Dz-Ching Ju. Probabilistic Memory Disambiguation and its Application to Data Speculation (1996) Probabilistic array subscript analysis Manel Fernandez and Roger Espasa. Speculative Alias Analysis for Executable Code (2002)

52 BACKUP SLIDES

53 Does Probabilistic Aliasing Exist?

54 Does Probabilistic Aliasing Exist?

55 Does Probabilistic Aliasing Exist?

56 Does Probabilistic Aliasing Exist?

57 Does Probabilistic Aliasing Exist?

58 Does Probabilistic Aliasing Exist?

59 Probabilistic Dependence Matrix
src snk Compress95 Ref input set Flow Sensitive Analysis Location Oriented DDA 85x85 static lod/str pairs

60 Pointer Analysis Issues
Scalability vs. Accuracy Generally, a ‘difficult’ tradeoff exists between: the amount of computation and memory required vs. the accuracy of the analysis. Precision/Efficiency tradeoff, where is the sweet spot? Which metric should be used? Direct metric Report performance applied to an optimization Dynamically measure false positives Which benchmark suite? Are the results reproducible? Direct metric: The average number of objects aliased to a pointer expression appearing in the program

61 Pointer Analysis Issues
Complications associated with pointer arithmetic, casting, function pointers, long jumps, and multithreaded applications. Can these be ignored? Different pointer analysis uses have different needs. A universal pointer analysis probably doesn’t exist.

62 Optimizing Compilers 101 Optimizing compilers must preserve program correctness All compiler algorithms (code transformations) are developed within this rule What if this rule was relaxed? If not bound by this rule of program correctness, the opportunity constitutes a reevaluation of all optimizations that were originally created within this rule. If given a framework for speculation recovery (hardware or software), this rule becomes flexible.

63 Compiler Support for Speculation
Control Speculation Executing instructions before determining that they will execute in the normal flow of execution. Existing compiler frameworks can adequately incorporate and exploit control speculation using control flow profiling or simple heuristic rules. Data speculation Executing loads before potentially dependant stores Very little has been done to effectively exploit data speculation. Data dependence profiling is expensive No effective heuristics exist Control speculation attacks control dependences (spre) Data speculation attacks data dependences

64 Pointer Analysis Pointer analysis is a critical compiler component used to analyze programs written in C-like programming languages, which utilize pointers and pointer-based data structures It attempts to disambiguate indirect memory references, so that subsequent compiler passes have a more accurate view of program behaviour.

65 How is Pointer Analysis used?
Parallelizing Compiler Can the loop be parallelized? (TLP) void foo(int *a, int *b) { for(i=1; i<N; i++) { a[i] = b[i] / 13; } *a = *b + 1; Guide Compiler Optimizations load/store redundancy elimination, register allocation, CSE, dead code elimination, live variable, instruction scheduling [eg. VLIW] (ILP), loop invariant code motion, etc. Behavioral Synthesis & Data Flow Processors Necessary for partitioning and instruction scheduling. Error Detection & Program Understanding Programmer can detect errors or discover poorly written code.

66 Pointer Analysis Design Choices
Flow/Path sensitivity Context sensitivity Heap modeling Aggregate modeling Alias representation Whole Program Incremental Compilation

67 How is Pointer Analysis used?
Parallelizing Compiler Can the loop be parallelized? (TLP) void foo(int *a, int *b) { for(i=1; i<N; i++) a[i] = b[i-1]; for(i=1; i<N; i++) { *a = *b + 1; } Guide Compiler Optimizations load/store redundancy elimination, register allocation, CSE, dead code elimination, live variable, instruction scheduling [eg. VLIW] (ILP), loop invariant code motion, etc. Behavioral Synthesis & Data Flow Processors Necessary for partitioning and instruction scheduling. Error Detection & Program Understanding Programmer can detect errors or discover poorly written code. Any more?

68 Pointer Analysis Published by Year
Haven’t we Solved this Problem Yet?


Download ppt "Probabilistic Pointer Analysis [PPA]"

Similar presentations


Ads by Google