Context-Sensitivity Analysis Literature Review by José Nelson Amaral University of Alberta
Dimensions of Pointer Analysis
Unification-based × Insertion-based
Flow-sensitive × flow-insensitive
Field-sensitive × field-insensitive × field-based
Context-sensitive × context-insensitive
CMPUT Compiler Design and Optimization
Andersen’s × Steensgaard’s (Example): Insertion × Unification
Program:
    a = &b;
Steensgaard: S = {(a,b)}; graph a → b
Andersen: S = {(a,b)}; graph a → b
After (Shapiro/Horwitz, POPL97)
Andersen’s × Steensgaard’s (Example)
Program:
    a = &b;
    b = &c;
Steensgaard: S = {(a,b); (b,c)}; graph a → b → c
Andersen: S = {(a,b); (b,c)}; graph a → b → c
After (Shapiro/Horwitz, POPL97)
Andersen’s × Steensgaard’s (Example)
Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
What should happen in each analysis?
Steensgaard and Andersen, before the new statement: S = {(a,b); (b,c)}; graph a → b → c
After (Shapiro/Horwitz, POPL97)
Andersen’s × Steensgaard’s (Example)
Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
Steensgaard: S = {(a,b); (b,c); (a,d); (d,c)}; graph a → (b,d) → c
Andersen: S = {(a,b); (b,c); (a,d)}; graph a → b → c, a → d
After (Shapiro/Horwitz, POPL97)
Andersen’s × Steensgaard’s (Example)
Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
    d = &e;
And now?
Steensgaard, before the new statement: S = {(a,b); (b,c); (a,d); (d,c)}; graph a → (b,d) → c
Andersen, before the new statement: S = {(a,b); (b,c); (a,d)}; graph a → b → c, a → d
After (Shapiro/Horwitz, POPL97)
Andersen’s × Steensgaard’s (Example)
Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
    d = &e;
Steensgaard: S = {(a,b); (b,c); (a,d); (d,c); (d,e); (b,e)}; graph a → (b,d) → (c,e)
Andersen: S = {(a,b); (b,c); (a,d); (d,e)}; graph a → b → c, a → d → e
After (Shapiro/Horwitz, POPL97)
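The two result sets for this program can be reproduced with a small sketch. It is only an illustration under simplifying assumptions, not the algorithm from either paper: it handles only statements of the form p = &q, ignores control flow, and the names AddrOf, andersen, steensgaard, and unify are hypothetical. Points-to sets are bitmasks over the variables a to e.

```c
#include <assert.h>

enum { A, B, C, D, E, NVARS };

typedef struct { int lhs, rhs; } AddrOf;   /* models the statement lhs = &rhs */

/* Insertion-based (Andersen-style): each statement adds exactly one pair. */
static void andersen(const AddrOf *st, int n, unsigned pts[NVARS]) {
    for (int v = 0; v < NVARS; v++) pts[v] = 0;
    for (int i = 0; i < n; i++) {
        assert(st[i].lhs < NVARS && st[i].rhs < NVARS);
        pts[st[i].lhs] |= 1u << st[i].rhs;
    }
}

static int parent[NVARS];    /* union-find forest over variables */
static int pointee[NVARS];   /* for a class representative: a member of the
                                class it points to, or -1 if it points nowhere */

static int find(int x) {
    while (parent[x] != x) x = parent[x];
    return x;
}

/* Merge two classes and, recursively, the classes they point to. */
static void unify(int x, int y) {
    x = find(x);
    y = find(y);
    if (x == y) return;
    int px = pointee[x], py = pointee[y];
    parent[y] = x;
    if (px < 0) pointee[x] = py;
    else if (py >= 0) unify(px, py);
}

/* Unification-based (Steensgaard-style): every class has at most one pointee class. */
static void steensgaard(const AddrOf *st, int n, unsigned pts[NVARS]) {
    for (int v = 0; v < NVARS; v++) { parent[v] = v; pointee[v] = -1; }
    for (int i = 0; i < n; i++) {
        assert(st[i].lhs < NVARS && st[i].rhs < NVARS);
        int p = find(st[i].lhs);
        if (pointee[p] < 0) pointee[p] = st[i].rhs;
        else unify(pointee[p], st[i].rhs);
    }
    /* Expand: v points to every member of the class its own class points to. */
    for (int v = 0; v < NVARS; v++) {
        int t = pointee[find(v)];
        pts[v] = 0;
        if (t < 0) continue;
        for (int w = 0; w < NVARS; w++)
            if (find(w) == find(t)) pts[v] |= 1u << w;
    }
}
```

Calling both functions on the four statements {A,B}, {B,C}, {A,D}, {D,E} gives the same set for a in both analyses, while the unification-based version also reports the extra pairs (d,c) and (b,e) because b and d were merged into one class.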
Flow-sensitive × Flow-insensitive (Example)
Program:
    a = &b;
    b = &c;
    if(cond) a = &d;
    d = &e;
[Figure: flow-sensitive points-to graphs after each of the first three statements (nodes a, b; then a, b, c; then a, b, c, d), followed by the flow-insensitive graphs, insertion-based and unification-based]
Strong update: not only does a now point to d, but a also no longer points to b.
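The effect of a strong update can be sketched by tracking only the points-to set of a through the first three statements. This is a hand-written illustration, not an analysis implementation: the transfer functions are written out statement by statement, and the names PtsSet, FlowSensitiveA, flow_sensitive_a, and flow_insensitive_a are hypothetical.

```c
#include <assert.h>

enum { VAR_B, VAR_D, NTARGETS };
typedef unsigned PtsSet;                 /* bitmask over the targets above */

static PtsSet singleton(int target) {
    assert(target >= 0 && target < NTARGETS);
    return 1u << target;
}

typedef struct { PtsSet in_branch, after_join; } FlowSensitiveA;

/* Flow-sensitive view of:  a = &b;  if (cond) a = &d; */
static FlowSensitiveA flow_sensitive_a(void) {
    FlowSensitiveA r;
    PtsSet a = singleton(VAR_B);         /* a = &b;                           */
    PtsSet fallthrough = a;              /* state on the path that skips the if */
    a = singleton(VAR_D);                /* a = &d; strong update kills a -> b */
    r.in_branch = a;
    r.after_join = a | fallthrough;      /* merge of the two paths            */
    return r;
}

/* Flow-insensitive view: statement order is ignored, so targets only accumulate. */
static PtsSet flow_insensitive_a(void) {
    PtsSet a = 0;
    a |= singleton(VAR_B);               /* a = &b; */
    a |= singleton(VAR_D);               /* a = &d; */
    return a;
}
```

Inside the branch the flow-sensitive result for a is {d} only; after the join it is {b, d}, which is also what the flow-insensitive analysis reports for every program point.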
Flow-sensitivity in SSA (incomplete slide)
Program:
    pb = &b; pc = &c; pd = &d; pe = &e;
    a0 = pb;
    *pb = pc;
    if(cond) a1 = pd;
    a2 = phi(a0, a1, FALSE, TRUE);
    *pd = pe;
All variables that had their address taken must have an “access path”, which is their address. They can only be referenced through their access paths.
[Figure: a single points-to graph over a0, a1, a2, pb, pc, pd, pe, b, c, d, e]
In SSA, flow-sensitive information can be obtained from the single graph above.
Field-insensitive × Field-based × Field-sensitive analysis
Field-insensitive: each aggregate object is modeled by a single abstract variable.
Field-based: an abstract variable models all instances of a field of an aggregate type.
Field-sensitive: a unique abstract variable models each field of each aggregate object.
(PearceKellyHankinTOPLAS07)
Field Sensitivity (Example) (PearceKellyHankinTOPLAS07)
Assume a flow-insensitive, insertion-based analysis.
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
Field-insensitive: a → d
Field-based: f1 → d
Field-sensitive: a.f1 → d
Field Sensitivity (Example) (PearceKellyHankinTOPLAS07)
Assume a flow-insensitive, insertion-based analysis.
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
Field-insensitive: a → d
Field-based: f1 → d
Field-sensitive: a.f1 → d
[Figure: node f is added to each of the three graphs; no edge to f is drawn yet]
Field Sensitivity (Example) (PearceKellyHankinTOPLAS07)
Assume a flow-insensitive, insertion-based analysis.
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
Field-insensitive: a → d, a → f
Field-based: f1 → d, f2 → f
Field-sensitive: a.f1 → d, a.f2 → f
Field Sensitivity (Example) (PearceKellyHankinTOPLAS07)
Assume a flow-insensitive, insertion-based analysis.
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
    b.f1 = &e;
Field-insensitive: a → d, a → f
Field-based: f1 → d, f2 → f
Field-sensitive: a.f1 → d, a.f2 → f
[Figure: node e is added to each of the three graphs; no edge to e is drawn yet]
Field Sensitivity (Example) (PearceKellyHankinTOPLAS07)
Assume a flow-insensitive, insertion-based analysis.
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
    b.f1 = &e;
Field-insensitive: a → d, a → f, b → e
Field-based: f1 → d, f1 → e, f2 → f
Field-sensitive: a.f1 → d, a.f2 → f, b.f1 → e
Field Sensitivity (Example) (PearceKellyHankinTOPLAS07)
Assume a flow-insensitive, insertion-based analysis.
Program:
    typedef struct { int *f1; int *f2; } aggr;
    aggr a, b;
    int *c, d, e, f;
    a.f1 = &d;
    a.f2 = &f;
    b.f1 = &e;
    c = a.f1;
Field-insensitive: a → d, a → f, b → e; c copies from a, so c → d, c → f
Field-based: f1 → d, f1 → e, f2 → f; c copies from f1, so c → d, c → e
Field-sensitive: a.f1 → d, a.f2 → f, b.f1 → e; c copies from a.f1, so c → d
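The three models differ only in which abstract variable stands for an expression such as a.f1. The sketch below makes that choice explicit and computes the points-to set of c for the program on this slide. It is an illustration, not the analysis from the cited paper: the names Model, abstract_var, and pts_of_c are hypothetical, and because the program has a single copy statement at the end, the statements are simply applied once in order instead of iterating to a fixpoint.

```c
#include <assert.h>

enum Model { FIELD_INSENSITIVE, FIELD_BASED, FIELD_SENSITIVE };
enum { OBJ_A, OBJ_B, NOBJS };      /* aggregate objects a and b */
enum { F1, F2, NFIELDS };          /* fields f1 and f2          */
enum { TGT_D, TGT_E, TGT_F };      /* address-taken ints        */

/* Index of the abstract variable that models obj.field under a given model. */
static int abstract_var(enum Model m, int obj, int field) {
    assert(obj >= 0 && obj < NOBJS && field >= 0 && field < NFIELDS);
    switch (m) {
    case FIELD_INSENSITIVE: return obj;                   /* one per object          */
    case FIELD_BASED:       return field;                 /* one per field name      */
    default:                return obj * NFIELDS + field; /* one per (object, field) */
    }
}

/* Points-to set of c after the slide's program, as a bitmask over TGT_*. */
static unsigned pts_of_c(enum Model m) {
    unsigned pts[NOBJS * NFIELDS] = {0};
    pts[abstract_var(m, OBJ_A, F1)] |= 1u << TGT_D;   /* a.f1 = &d; */
    pts[abstract_var(m, OBJ_A, F2)] |= 1u << TGT_F;   /* a.f2 = &f; */
    pts[abstract_var(m, OBJ_B, F1)] |= 1u << TGT_E;   /* b.f1 = &e; */
    return pts[abstract_var(m, OBJ_A, F1)];           /* c = a.f1;  */
}
```

Under these assumptions pts_of_c yields {d, f} for the field-insensitive model, {d, e} for the field-based model, and {d} for the field-sensitive model.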
Field Sensitivity in C
A field-sensitive analysis for C is fundamentally harder than a field-sensitive analysis for Java:
– C allows the address of a field to be taken
Existing field-sensitive analyses for C:
– YongHorwitzRepsPLDI99
– ChandraRepsPASTE99
– JohnsonWagnerUSENIX04
– PearceKellyHankinTOPLAS07
What does context-sensitivity mean?
Context-sensitive analysis: “the effects of a procedure call are estimated within a specific calling context”
Context-insensitive analysis: “the effects of a procedure call summarizes the information for all calling contexts.”
(EmamiGhiyaHendrenPLDI94)
Another definition
“A context-insensitive (CI) algorithm does not distinguish the different calling contexts of a procedure, whereas a context-sensitive (CS) does.” (ZhuCalmanPLDI04)
“CS treats multiple calls to a single procedure independently.” (RufPLDI95)
“CI constructs a single approximation to a procedure’s effect on all of its callers.” (RufPLDI95)
Alternative definition: The calling context problem
The calling context problem is “the problem of correctly accounting for the calling context of a called procedure.”
(HorwitzRepsBinkleyTOPLAS90)
A stricter definition
“A precise CS analysis yields results as precise as if they were computed on a modified program with all method calls inlined.”
– Requires a context-sensitive heap abstraction: a separate abstraction is needed for each copy of an allocation statement.
– Virtual call targets must be computed context-sensitively, that is, separately for each calling context, using precise points-to information.
(SridharanBodikPLDI06)
Context-Sensitive Example
Two calls to a function foo produce different return values because of the points-to set at the point immediately before each call to foo.
– In other words, the return value of foo changes depending on the context within which foo is invoked.
Context-sensitive example
    typedef int arr[10000];
    arr a1, a2, a3;
    int cond1, cond2;

    /* Conditionally swaps *p2 and *p3, then returns *p2. */
    int *foo(int **p2, int **p3) {
        int *t;
        if (cond2) {
            t = *p2;
            *p2 = *p3;
            *p3 = t;
        }
        return *p2;
    }

    int main(int argc, char *argv[]) {
        int *x1, *x2, *x3, *y1, *y2, *y3;
        int *lp, *lq, r;
        cond1 = argc - 1;
        cond2 = argc - 2;
        a1[0] = argc; a2[0] = argc + 1; a3[0] = argc + 2;
        x1 = a1; x2 = a2; x3 = a3;
        y1 = a1; y2 = a2; y3 = a3;
        if (cond1) {
            x1 = a2;
            x2 = a1;
        }
        lp = foo(&x2, &x3);
        lq = foo(&y2, &y3);
        return (*lp + *lq);
    }
Context-sensitive example
Is there an algorithm that “gets” this example?
Emami, Ghiya, and Hendren (PLDI94) should get it.
We need to study the points-to sets that the algorithm computes at points P1, P2, and P3.
[Figure: the program from the previous slide, with foo and the points P1, P2, and P3 marked]
Context-sensitive example
In the following animation:
[Figure legend: two arrow styles drawn from x to y]
– x definitely points to y (variable x contains the address of variable y)
– x possibly points to y
(Arrows are colored red only for convenience in the animation; they represent new points-to relations that were not in the previous slide.)
Context-sensitive example (animation)
[Figure sequence: points-to graphs for the program above, one per slide. Each point is first posed as a question (for example “P1?”) and then answered. Points and the nodes shown, in order:]
– P1: a1, x1, y1, a2, x2, y2, a3, x3, y3
– PA, PA’, PA”, PA’’’, PB: the nodes of P1 plus p2, p3, t
– P2: the nodes of P1 plus lp
– PA, PB: the nodes of P1 plus lp, p2, p3, t
– P3: the nodes of P1 plus lp, lq
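The gain from distinguishing the two calls to foo can be sketched with a much simpler analysis than the one in the animation. The code below is an assumption-laden illustration, not the algorithm of Emami, Ghiya, and Hendren: it is flow-insensitive and insertion-based, it ignores the tests on cond1 and cond2, it models each array as a single location, and it obtains context sensitivity by giving each call site its own copy of foo. All names (Constraint, push, solve, emit_foo, analyze, and the TMP variable that splits *p2 = *p3 into two steps) are hypothetical.

```c
#include <assert.h>

enum { A1, A2, A3, X2, X3, Y2, Y3, LP, LQ,
       P2, P3, T, TMP, RET,          /* foo: formals, locals, return value */
       P2B, P3B, TB, TMPB, RETB,     /* second copy of foo (cloning only)  */
       NVARS };
enum { MAX_CONSTRAINTS = 32 };

enum Kind { ADDR, COPY, LOAD, STORE };
typedef struct { enum Kind kind; int lhs, rhs; } Constraint;

static void push(Constraint *cs, int *n, enum Kind kind, int lhs, int rhs) {
    assert(*n < MAX_CONSTRAINTS);
    cs[*n].kind = kind; cs[*n].lhs = lhs; cs[*n].rhs = rhs;
    (*n)++;
}

/* Union src into *dst; return 1 if *dst grew. */
static int add(unsigned *dst, unsigned src) {
    unsigned old = *dst;
    *dst |= src;
    return *dst != old;
}

/* Naive fixpoint over the constraints; points-to sets are bitmasks. */
static void solve(const Constraint *cs, int n, unsigned pts[NVARS]) {
    for (int v = 0; v < NVARS; v++) pts[v] = 0;
    for (int changed = 1; changed; ) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            int l = cs[i].lhs, r = cs[i].rhs;
            switch (cs[i].kind) {
            case ADDR: changed |= add(&pts[l], 1u << r); break;   /* l = &r */
            case COPY: changed |= add(&pts[l], pts[r]); break;    /* l = r  */
            case LOAD:                                            /* l = *r */
                for (int v = 0; v < NVARS; v++)
                    if ((pts[r] >> v) & 1u) changed |= add(&pts[l], pts[v]);
                break;
            case STORE:                                           /* *l = r */
                for (int v = 0; v < NVARS; v++)
                    if ((pts[l] >> v) & 1u) changed |= add(&pts[v], pts[r]);
                break;
            }
        }
    }
}

/* Body of foo for the copy whose variables start at P2 + off. */
static void emit_foo(Constraint *cs, int *n, int off) {
    push(cs, n, LOAD,  T + off,   P2 + off);    /* t = *p2;    */
    push(cs, n, LOAD,  TMP + off, P3 + off);    /* tmp = *p3;  */
    push(cs, n, STORE, P2 + off,  TMP + off);   /* *p2 = tmp;  */
    push(cs, n, STORE, P3 + off,  T + off);     /* *p3 = t;    */
    push(cs, n, LOAD,  RET + off, P2 + off);    /* return *p2; */
}

/* clone == 0: both calls share one copy of foo (context-insensitive).
   clone != 0: the second call gets its own copy (context-sensitive). */
static void analyze(int clone, unsigned pts[NVARS]) {
    Constraint cs[MAX_CONSTRAINTS];
    int n = 0, off = clone ? P2B - P2 : 0;
    push(cs, &n, ADDR, X2, A2);  push(cs, &n, ADDR, X2, A1);  /* x2 = a2; x2 = a1; */
    push(cs, &n, ADDR, X3, A3);                               /* x3 = a3;          */
    push(cs, &n, ADDR, Y2, A2);  push(cs, &n, ADDR, Y3, A3);  /* y2 = a2; y3 = a3; */
    emit_foo(cs, &n, 0);
    if (clone) emit_foo(cs, &n, off);
    push(cs, &n, ADDR, P2, X2);  push(cs, &n, ADDR, P3, X3);  /* lp = foo(&x2, &x3); */
    push(cs, &n, COPY, LP, RET);
    push(cs, &n, ADDR, P2 + off, Y2);                         /* lq = foo(&y2, &y3); */
    push(cs, &n, ADDR, P3 + off, Y3);
    push(cs, &n, COPY, LQ, RET + off);
    solve(cs, n, pts);
}
```

With a shared copy of foo, both lp and lq may point to a1, a2, or a3, because the formals p2 and p3 are bound to the arguments of both calls. With one copy per call site, lp keeps {a1, a2, a3} but lq shrinks to {a2, a3}: a1 can only reach foo through x2, which the second call never passes.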
Solutions to the context-sensitive problem
Create a context for each acyclic path from the root of the call graph to the current invocation (EmamiGhiyaHendrenPLDI94).
Create a context for each set of “relevant” aliases on entry to the procedure, also known as partial transfer functions (PTF) (WilsonLamPLDI95).
– “to answer simple queries (PTF) requires all the results to be computed.” (WhaleyLamPLDI04)
(Descriptions taken from RufPLDI95)
Solutions to the context-sensitive problem (cont.)
Tag each alias to allow a procedure to propagate only appropriate aliases to its callers:
– Use aliases on entry to the enclosing procedure (LandiRyderPLDI93)
– Augment the summary with an abstraction of the call stack (Cooper89MScThesis, ChoiBurkeCariniPOPL93)
A fully context-sensitive analysis is exponential in the size of the input program, unless the number of contexts considered is limited somehow.
Solutions to the context-sensitive problem (cont.)
Create a clone of the method for each context (WhaleyLamPLDI04)
– Reports up to 5 × clones (for a Java source code analyzer called pmd).
– No discussion of how the results of the analysis could be used in a real compiler.
Ruf’s Evaluation of Context Sensitivity
Compares flow-sensitive CS and CI analyses.
– Benchmarks:
  Largest benchmark has 6771 lines of code and 5435 pointer or function outputs in the analysis.
  Sparse call graphs (4.2 callers/procedure on average; 54% of procedures have a single caller).
  Shallow nesting of pointer datatypes: most pointers reference scalar datatypes.
– CI finds that, on average, each memory operation references very few locations.
– CS analysis generates 2% fewer points-to pairs.
– CS does not affect the indirect memory references at all.
RufPLDI95
Definition of a context
“A context is a static abstraction of a method invocation”
– A context-sensitive analysis “distinguishes invocations if their context is different”
LhotakHendrenCC06
Invocation (Context) Abstractions
call sites: the context of an invocation is the program statement from which the method was invoked.
– Derived from the call-string abstraction: a different approximation is computed for each distinct path in the call graph (defined by SharirPnueli1981).
receiver object: the context is the static abstraction of the object on which the method was invoked.
– (defined by MilanovaRountevRyderISSTA02)
LhotakHendrenCC06
1-level call string sensitivity

     1 class A {
     2   Object f;
     3   Object get() {
     4     return new Object();
     5   }
     6   A() {
     7     this.f = this.get();
     8   }
     9   static void main() {
    10     A a1 = new A();
    11     A a2 = new A();
    12     Object p = a2.get();
    13     p.toString();
    14   }
    15 }

LiangPenningsHarroldPASTE05
Points-to Graph Using 1-level Call String Sensitivity
[Figure: points-to graph for the class A example; nodes so far: a1, o10, A:this 10.]
A node is a variable or an instance. An edge is a variable reference or an instance field reference.
LiangPenningsHarroldPASTE05
1-level call string sensitivity (cont.)
[Figure: points-to graph; nodes so far: a1, o10, A:this 10, get:this 7, get:ret 7, o7#4; field edge f.]
get:ret 7 is a special local variable that represents the return value of method get().
The call string is limited to size one.
Node o7#4 represents the object allocated at line 4 because of a call to get from line 7.
LiangPenningsHarroldPASTE05
1-level call string sensitivity (cont.)
[Figure: points-to graph; nodes so far: a1, a2, o10, o11, A:this 10, A:this 11, get:this 7, get:ret 7, o7#4; field edge f.]
LiangPenningsHarroldPASTE05
1-level call string sensitivity (cont.)
[Figure: points-to graph; nodes so far: a1, a2, o10, o11, A:this 10, A:this 11, get:this 7, get:this 12, get:ret 7, o7#4; two field edges f.]
The analysis cannot distinguish between the object allocations initiated by lines 10 and 11, because in both cases the new object is created at line 4 through a call from line 7.
LiangPenningsHarroldPASTE05
1-level call string sensitivity (cont.)
[Figure: complete points-to graph; nodes: a1, a2, p, o10, o11, A:this 10, A:this 11, get:this 7, get:this 12, get:ret 7, get:ret 12, o7#4, o12#4; two field edges f.]
LiangPenningsHarroldPASTE05
1-level context-bound receiver-object sensitivity
[Figure: receiver-object points-to graph; nodes so far: a1, o10, A:this 10. The completed 1-level call string graph is shown alongside for comparison.]
LiangPenningsHarroldPASTE05
1-level context-bound receiver-object sensitivity (cont.)
[Figure: receiver-object points-to graph; nodes so far: a1, o10, A:this 10, get:this 10, get:ret 10, o10#4; field edge f. The completed 1-level call string graph is shown alongside for comparison.]
Node o10#4 represents the object created at line 4 when get() is invoked on the object allocated at line 10.
The context is given by the receiver object and is independent of the call chain that leads to the object creation.
LiangPenningsHarroldPASTE05
1-level context-bound receiver-object sensitivity (cont.)
[Figure: receiver-object points-to graph; nodes so far: a1, a2, o10, o11, A:this 10, A:this 11, get:this 10, get:ret 10, o10#4; field edge f. The completed 1-level call string graph is shown alongside for comparison.]
LiangPenningsHarroldPASTE05
1-level context-bound receiver-object sensitivity (cont.)
[Figure: receiver-object points-to graph; nodes so far: a1, a2, o10, o11, A:this 10, A:this 11, get:this 10, get:this 11, get:ret 10, get:ret 11, o10#4, o11#4; two field edges f. The completed 1-level call string graph is shown alongside for comparison.]
Now the objects created on behalf of lines 10 and 11 have distinct abstract representations.
LiangPenningsHarroldPASTE05
1-level context-bound receiver-object sensitivity (cont.)
[Figure: complete receiver-object points-to graph; nodes: a1, a2, p, o10, o11, A:this 10, A:this 11, get:this 10, get:this 11, get:ret 10, get:ret 11, o10#4, o11#4; two field edges f. The completed 1-level call string graph is shown alongside for comparison.]
LiangPenningsHarroldPASTE05
Context insensitive
[Figure: context-insensitive points-to graph; nodes so far: a1, o10, A:this. The completed 1-level receiver-object and 1-level call string graphs are shown alongside for comparison.]
LiangPenningsHarroldPASTE05
Context insensitive (cont.)
[Figure: context-insensitive points-to graph; nodes so far: a1, o10, A:this, get:this, get:ret, o4; field edge f. The completed 1-level receiver-object and 1-level call string graphs are shown alongside for comparison.]
Without context, there is a single abstraction to represent all objects allocated at line 4.
LiangPenningsHarroldPASTE05
Context insensitive (cont.)
[Figure: context-insensitive points-to graph; nodes so far: a1, a2, o10, o11, A:this, get:this, get:ret, o4; field edges f. The completed 1-level receiver-object and 1-level call string graphs are shown alongside for comparison.]
LiangPenningsHarroldPASTE05
Context insensitive (cont.)
[Figure: complete context-insensitive points-to graph; nodes: a1, a2, p, o10, o11, A:this, get:this, get:ret, o4; field edges f. The completed 1-level receiver-object and 1-level call string graphs are shown alongside for comparison.]
LiangPenningsHarroldPASTE05
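The three graphs differ only in how the allocation at line 4 is named. A minimal sketch of that naming step follows, assuming a hand-written table of the three invocations of get() in the class A example. The class AllocationNaming and its table are hypothetical and stand in for what a real analysis would discover from the call graph.

```java
import java.util.*;

// Hypothetical sketch: names given to the allocation at line 4 of class A.
final class AllocationNaming {
    // One entry per invocation of get(): {call-site line, receiver allocation line}.
    // Line 7 runs inside the constructors called from lines 10 and 11;
    // line 12 calls get() on the object allocated at line 11.
    static final int[][] GET_INVOCATIONS = {{7, 10}, {7, 11}, {12, 11}};

    static String insensitive(int[] inv) { return "o4"; }
    static String callSite(int[] inv) { return "o" + inv[0] + "#4"; }
    static String receiver(int[] inv) { return "o" + inv[1] + "#4"; }

    // Names of the objects stored in a1.f, a2.f and p, in that order.
    static List<String> names(String abstraction) {
        List<String> out = new ArrayList<>();
        for (int[] inv : GET_INVOCATIONS) {
            switch (abstraction) {
                case "call-site":
                    out.add(callSite(inv));
                    break;
                case "receiver":
                    out.add(receiver(inv));
                    break;
                default:
                    out.add(insensitive(inv));
            }
        }
        return out;
    }
}
```

Read in the order a1.f, a2.f, p, the call-site abstraction merges a1.f with a2.f but keeps p apart, the receiver abstraction separates a1.f from a2.f but merges a2.f with p, and the insensitive naming merges all three.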
Strings of Contexts
A context of a method invocation i can be defined by a context string that represents the top invocations on the stack when i is invoked.
Managing unbounded growth in the number of contexts:
– k-limiting: limit the context strings considered to length k
– cycle collapsing: collapse all cycles in the context-insensitive call graph into a single context. Used by ZhuCalmanPLDI04 and WhaleyLamPLDI04
LhotakHendrenCC06
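k-limiting can be sketched as keeping only the last k entries of the call stack. In the sketch below, KLimitedContext and the GET_STACKS table (the call stacks that reach get() in the class A example, outermost call first) are assumptions made for illustration.

```java
import java.util.*;

// Hypothetical sketch of k-limited call strings.
final class KLimitedContext {
    // Call stacks (outermost call site first) that reach get() in the class A example.
    static final List<List<Integer>> GET_STACKS = Arrays.asList(
            Arrays.asList(10, 7), Arrays.asList(11, 7), Arrays.asList(12));

    // Keeps only the k most recent call sites, i.e. the top of the stack.
    static List<Integer> context(List<Integer> callStack, int k) {
        int from = Math.max(0, callStack.size() - k);
        return new ArrayList<>(callStack.subList(from, callStack.size()));
    }

    // Number of different contexts the given stacks map to under limit k.
    static int distinctContexts(List<List<Integer>> stacks, int k) {
        Set<List<Integer>> seen = new HashSet<>();
        for (List<Integer> s : stacks) seen.add(context(s, k));
        return seen.size();
    }
}
```

With k = 0 every stack maps to the empty context (context-insensitive), with k = 1 the stacks through lines 10 and 11 collapse into the single context [7], and with k = 2 all three stacks stay distinct.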
Equivalent Contexts
Two contexts are equivalent if their points-to relations are the same.
– The number of distinct method-context pairs indicates how worthwhile context sensitivity may be in improving the precision of points-to sets.
LhotakHendrenCC06
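Counting equivalence classes amounts to grouping contexts by the points-to relation computed under each one. The sketch below assumes that relation is available as a map from variable to abstract objects. The class EquivalentContexts, the context names c1..c3, and the sample data are hypothetical.

```java
import java.util.*;

// Hypothetical sketch: counts contexts of one method up to equivalence.
final class EquivalentContexts {
    // pointsToByContext maps a context to the points-to relation
    // (variable -> abstract objects) computed for the method in that context.
    static int countDistinct(Map<String, Map<String, Set<String>>> pointsToByContext) {
        // java.util maps and sets compare structurally, so equal relations collapse.
        return new HashSet<>(pointsToByContext.values()).size();
    }

    static Map<String, Set<String>> relation(String var, String... objs) {
        Map<String, Set<String>> r = new HashMap<>();
        r.put(var, new HashSet<>(Arrays.asList(objs)));
        return r;
    }

    // Made-up data: three contexts, two of which yield the same relation.
    static Map<String, Map<String, Set<String>>> example() {
        Map<String, Map<String, Set<String>>> m = new LinkedHashMap<>();
        m.put("c1", relation("this", "o10"));
        m.put("c2", relation("this", "o10"));
        m.put("c3", relation("this", "o11"));
        return m;
    }
}
```

In the made-up data, three contexts are analyzed but only two are distinct, which is the kind of gap between total and distinct contexts that the measurements in LhotakHendrenCC06 quantify.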
Call Site × Receiver Object Context Sensitivity
Call-site Sensitivity: The context of an invocation is the program statement from which the method is invoked.
Receiver-Object Sensitivity: The context of an invocation is the abstraction of the object on which the method is invoked.
[Figure: for the class A example, call-site sensitivity gives p and get:ret 12 the target o12#4; receiver-object sensitivity gives get:ret 11 the target o11#4.]
Call Site × Receiver Object Context Sensitivity (cont.)
Hybrid Sensitivity: The context of an invocation is the abstraction of both the call site and the object on which the method is invoked.
[Figure: for the class A example, nodes (c7,o11)#4, p, get:ret 12.]
Using a limit of 1 for string and object.
Is CS (in Java) Worth It? (number of contexts)
[Table: total and distinct numbers of contexts for the benchmarks compress, jess, and jython. Columns: Insensitive; Object-sensitivity (1, 2, 3, 1H); Call-site string (1, 2); Collapsed cycles (max k).]
The total number of contexts is the product of the number in the column and the number of methods.
Insensitive: 1 context per method.
1, 2, 3: pointers are context-sensitive but pointer targets are not.
1H: both pointers and pointer targets are modeled with context strings of maximum length 1.
LhotakHendrenCC06
Is CS (in Java) Worth It? (number of contexts, cont.)
[Table: same table of total and distinct numbers of contexts for compress, jess, and jython.]
The number of contexts is large, but fewer of them are distinct.
Collapsing cycles models large parts of the call graph context-insensitively.
LhotakHendrenCC06
Is CS (in Java) Worth It? (Virtual Call Resolution)
[Table: number of potentially polymorphic call sites (non-library code) for the benchmarks javac, soot-c, polyglot, bloat, and pmd. Columns: CHA; Insensitive; Object-sensitivity (1, 2, 3, 1H); Call-site string (1, 2).]
LhotakHendrenCC06
Application of CS in Java (Cast Safety)
A cast can potentially fail if the analysis cannot prove statically that the new type is a supertype of the run-time type of the object being cast.
Cast Safety Analysis determines which casts cannot fail.
A 1H analysis reduced the number of potentially failing casts from 3539 to 1017 in the polyglot benchmark.
LhotakHendrenCC06
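The check behind cast safety can be sketched as a subtype test over the points-to set of the cast operand. The class CastSafety below is a hypothetical illustration that represents each abstract object by its run-time class. It is not the implementation evaluated in LhotakHendrenCC06.

```java
import java.util.*;

// Hypothetical sketch of the cast-safety check.
final class CastSafety {
    // A cast to `target` cannot fail if every abstract object that may reach it
    // has a run-time class that is `target` itself or one of its subtypes.
    static boolean cannotFail(Class<?> target, Collection<Class<?>> pointsToTypes) {
        for (Class<?> t : pointsToTypes) {
            if (!target.isAssignableFrom(t)) return false;
        }
        return true;
    }
}
```

A more precise analysis helps here only by shrinking the points-to set: removing a spurious String object from the set of an operand cast to Integer turns a potentially failing cast into a safe one.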
Is CS (in Java) Worth It? (Lhotak-Hendren Conclusions)
CS slightly improves call graph precision.
CS yields a more significant improvement in virtual call resolution.
A 1-object-sensitive or a 1H-object-sensitive analysis seems to be the best tradeoff.
Extending the length of context strings in an object-sensitive analysis has little benefit.
Collapsing cycles in the call graph is not a good idea for Java.
LhotakHendrenCC06
Liang/Harrold Evaluate CS on Andersen’s Analysis for Java
CS results in more precise reference information in some benchmarks.
– Both call-string contexts and receiver contexts are useful (in different benchmarks).
– In some benchmarks CS makes no difference.
They use precise models to simulate collection and map classes.
LiangPenningsHarroldPASTE05
k-limiting object names
In the code below, the number of object names (shadows) is unbounded.
Landi and Ryder limit the number of shadows to k.
All object names with more than k dereferences are represented by the same name (shadow).
LandiRyderPLDI92

    #include <stddef.h>

    typedef struct CELL {
        int number;
        struct CELL *next;
    } cell;

    cell *head;

    /* Returns the largest number in the list, or 0 for an empty list. */
    int FindMax(cell *cursor)
    {
        int local_max;
        if (cursor == NULL)
            return 0;
        local_max = cursor->number;
        /* Visit every cell, including the last one. */
        for ( ; cursor != NULL; cursor = cursor->next)
        {
            if (cursor->number > local_max)
                local_max = cursor->number;
        }
        return local_max;
    }
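A minimal sketch of k-limited naming follows. It assumes object names are written as a base variable followed by a chain of field dereferences, and that a trailing "*" marks the shared name for all longer chains. The class KLimitedNames and the "*" marker are hypothetical notation, not the representation used by Landi and Ryder.

```java
// Hypothetical sketch: k-limited names for chains such as cursor->next->next.
final class KLimitedNames {
    // Name of the object reached from `base` after `derefs` applications of `field`.
    // Chains with more than k dereferences all receive the same truncated name.
    static String name(String base, String field, int derefs, int k) {
        StringBuilder sb = new StringBuilder(base);
        int kept = Math.min(derefs, k);
        for (int i = 0; i < kept; i++) sb.append("->").append(field);
        if (derefs > k) sb.append("*"); // stands for every longer chain
        return sb.toString();
    }
}
```

For the list traversal in FindMax with k = 2, the cells reached after 3, 4, or any larger number of next dereferences all share the name cursor->next->next*, so the set of names stays finite even though the list length is unbounded.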
Heap Cloning
For each procedure, create a graphical representation of the heap objects that are manipulated by the procedure (allocated, assigned to, referenced, etc.).
Traverse the call graph, cloning the graph of the callee into each call site.
LattnerLenharthAdvePLDI07
Dealing with Cloning Complexity
Use unification-based analysis so that many clones are merged together;
Do not clone unreachable objects from a callee into a caller;
– For example, objects whose scope is entirely within the callee are not cloned;
Merge (instead of cloning) global variables.
LattnerPhD05
Recursive functions
Abandon context-sensitivity in strongly connected components (SCCs) of the call graph.
– Merge the graphs for all functions in the SCC
LattnerPhD05
Heap Specialization
Heap specialization: clone heap objects along call chains (paths in the call graph).
Nystrom et al. propose that only heap objects that escape the callee need to be cloned.
They observe, empirically, that if the only exposure of an escaped object is through a global variable, there is no benefit to cloning.
Their analysis is flow-insensitive, Andersen style.
NystromKimHwuPASTE04
Demand-driven Pointer Analysis
Aimed at JIT compilers.
Analyzes only the portions of the program relevant to queries.
Achieves 90% of the precision of field-sensitive Andersen’s analysis within 2 ms per query.
SridharanGopanShanBodikOOPSLA05
Incremental/Compositional/Partial Pointer/Escape Analysis for Java
Generate parameterized analysis results for each method.
– Recursive methods use a fixed-point iterative algorithm.
– Analyze each method independently of its callers.
– Trade precision for time: can analyze a method without analyzing all the methods that it invokes.
– Function summaries are flow-insensitive.
– Based on “points-to escape graphs”: (inside nodes/edges, outside nodes/edges, return value)
Slow. Complexity of O(N^10), where N is the number of instructions in the scope of the analysis:
– compress is 3 times slower to compile with the analysis.
VivienRinardPLDI01, WhaleyRinardOOPSLA99, SalcianuPhDMIT01
On-demand and Incremental Region-based Shape Analysis for C
Main idea: break down the abstraction into smaller components and analyze each component separately.
– Use a “cheap” flow-insensitive and context-sensitive pointer analysis to partition the memory into disjoint regions.
  Each node in the points-to graph represents a “memory region”.
  Regions must be disjoint.
Interprocedural propagation: uses a pair of input/output transfer functions for each function.
On-demand: can limit interprocedural propagation to a set of regions.
Incremental: can reuse results from previously analyzed regions.
Analyzes OpenSSH (18.6 KLOC) in 45 seconds.
HackettRuginaPOPL05
Refinement-Based On-Demand CS points-to analysis for Java
Based on Context-Free-Language (CFL) reachability.
– The context-free language L represents paths in the program that might cause a variable to point to an abstract location.
– The balanced-parentheses property filters out unrealizable paths: call/return pairs must match.
  In Java, stores/loads to fields must also match (the same is not true for C).
– Significant increase in precision relative to context-insensitive analysis.
– 13 minutes to analyze polyglot.
SridharanBodikPOPL05
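The balanced-parentheses filter for call/return pairs can be sketched with a stack. The encoding below is an assumption made for illustration: "(i" labels an edge that enters a callee through call site i, ")i" labels the matching return edge, and any other label is an intraprocedural edge. The class RealizablePath is hypothetical and checks a single given path; it does not perform the reachability search of a CFL-reachability solver.

```java
import java.util.*;

// Hypothetical sketch: checks that call/return labels on a path match.
final class RealizablePath {
    static boolean isRealizable(List<String> path) {
        Deque<String> open = new ArrayDeque<>();
        for (String label : path) {
            if (label.startsWith("(")) {
                open.push(label.substring(1));           // entering a callee
            } else if (label.startsWith(")")) {
                // A return with no pending call is allowed (the path may start
                // inside the callee); otherwise it must match the latest call.
                if (!open.isEmpty() && !open.pop().equals(label.substring(1))) {
                    return false;
                }
            }
        }
        return true; // calls still pending at the end are allowed
    }
}
```

In the class A example, a path that enters get() through line 7 and returns to line 7 is realizable, while one that enters through line 7 and returns to line 12 is rejected, which is how the object stored in a field by the constructor is kept apart from p.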
References
M. Sharir and A. Pnueli, “Two approaches to interprocedural data flow analysis,” in Program Flow Analysis: Theory and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1981.
S. Horwitz, T. Reps, D. Binkley, “Interprocedural slicing using dependence graphs,” TOPLAS (1).
W. Landi, B. G. Ryder, “A safe approximate algorithm for interprocedural aliasing,” PLDI 1992.
M. Emami, R. Ghiya, L. J. Hendren, “Context-sensitive interprocedural points-to analysis in the presence of function pointers,” PLDI 1994.
J.-D. Choi, R. Cytron, J. Ferrante, “On the Efficient Engineering of Ambitious Program Analysis,” IEEE Trans. on Software Engineering, Vol. 20, No. 2, Feb. 1994.
– Describes Factored SSA (FSSA).
E. Ruf, “Context-insensitive alias analysis reconsidered,” PLDI 1995.
R. P. Wilson, M. S. Lam, “Efficient context-sensitive pointer analysis for C programs,” PLDI 1995.
– Partial Transfer Functions are not practical.
References (cont.)
J. Whaley, M. Rinard, “Compositional Pointer and Escape Analysis for Java Programs,” OOPSLA 1999.
M. Fähndrich, J. Rehof, M. Das, “Scalable context-sensitive flow analysis using instantiation constraints,” PLDI 2000.
– Based exclusively on types. Unification-based at the intra-procedural level.
F. Vivien, M. Rinard, “Incrementalized Pointer and Escape Analysis,” PLDI 2001.
M. Berndl, O. Lhoták, F. Qian, L. Hendren, N. Umanee, “Points-to analysis using BDDs,” PLDI 2003.
J. Zhu, S. Calman, “Symbolic pointer analysis revisited,” PLDI 2004.
– Treats call-graph cycles context-insensitively, which loses precision in Java.
J. Whaley, M. S. Lam, “Cloning-based context-sensitive pointer alias analysis using binary decision diagrams,” PLDI 2004.
– Treats call-graph cycles context-insensitively, which loses precision in Java (lots of contexts, with no practical way to use them).
References (cont.)
E. M. Nystrom, H.-S. Kim, W. W. Hwu, “Importance of Heap Specialization in Pointer Analysis,” PASTE 2004.
D. Liang, M. Pennings, M. J. Harrold, “Evaluating the Impact of Context-Sensitivity on Andersen’s Algorithm for Java Programs,” PASTE 2005.
B. Hackett, R. Rugina, “Region-Based Shape Analysis with Tracked Locations,” POPL 2005.
M. Sridharan, D. Gopan, L. Shan, R. Bodik, “Demand-Driven Points-to Analysis for Java,” OOPSLA 2005.
O. Lhoták, L. Hendren, “Context-Sensitive Points-to Analysis: Is It Worth It?,” Compiler Construction 2006.
C. Lattner, A. Lenharth, V. Adve, “Making context-sensitive points-to analysis with heap cloning practical for the real world,” PLDI 2007, pp. 278–289.