Pointer analysis John Rollinson & Kaiyuan Li

Optimizing Compilers for Modern Architectures, Spring 2019

Points-to analysis when...
Pointer arithmetic
Source code is not helpful (or not accessible)
Multiple threads

Symbolic Range Analysis
Global Analysis:
  Heap allocations treated as unique memory locations
  Pointers tracked as a set of memory locations and corresponding range offsets
  Pointers do not alias when there is no range overlap at each possible memory location
Local Analysis:
  Solves path-sensitive false positives for global analysis (pointers inside a loop)
  Models each ɸ as a unique memory location and performs local overlap analysis for that memory location
Paisante, Vitor, et al. "Symbolic range analysis of pointers." 2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). IEEE, 2016.
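As a rough illustration of the global overlap test, the sketch below assumes each pointer is summarized as a map from allocation site to a closed offset range. The names may_alias and ranges_overlap and the sites a1 and a2 are hypothetical, and concrete integer bounds stand in for the symbolic bounds used in the paper.

```python
from typing import Dict, Tuple

# Assumed representation: allocation site -> (lowest offset, highest offset)
Range = Tuple[int, int]
PointerInfo = Dict[str, Range]


def ranges_overlap(a: Range, b: Range) -> bool:
    """Closed intervals overlap when neither one ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]


def may_alias(p: PointerInfo, q: PointerInfo) -> bool:
    """Two pointers may alias if some shared location has overlapping offsets."""
    for loc in p.keys() & q.keys():
        if ranges_overlap(p[loc], q[loc]):
            return True
    return False


# p walks the first half of buffer a1, q walks the second half,
# r straddles both halves of a1 and may also point into a2.
p = {"a1": (0, 7)}
q = {"a1": (8, 15)}
r = {"a1": (4, 11), "a2": (0, 3)}

NO_ALIAS_PQ = not may_alias(p, q)
MAY_ALIAS_PR = may_alias(p, r)
```

Here p and q are reported as non-aliasing even though both point into a1, which is the case a location-only analysis cannot separate.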

ATLAS: automatic JAVA library points-to analysis
Dynamic analysis, flow-insensitive, context-sensitive, iterative
Libraries are black boxes
Minimize false positives (precise but not sound)
For local optimization, but not for transformation
Based on Andersen’s algorithm
Heap effects of functions (fragments)
2 steps: guess (precise) -> check (accurate)

All the methods we have seen so far assume that we have access to the source code and that the source code is useful for extracting information. That is not always true. The source code could be entirely private, or it could contain so much inline assembly or so many system calls that analyzing it is too expensive. ATLAS is a tool that can help optimization under such conditions. It uses a modified version of Andersen’s algorithm and performs dynamic analysis on library APIs regardless of their implementations. An important point is that this method emphasizes precision instead of soundness, which is very different from all the methods we have seen. Why do they do this? When the tool outputs a code fragment and says it is 100% sure about the points-to behavior of that code, you can pattern-match against it when compiling or interpreting code that uses the library and apply local optimizations. That is a good result considering we never use the library's source code. The trade-off is that we no longer have an overall picture of the program.

Bastani, Osbert, et al. "Active learning of points-to specifications." Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 2018.
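For readers who want a reminder of the baseline, here is a minimal sketch of textbook Andersen-style (inclusion-based, flow-insensitive) points-to analysis. It is not the modified version used by ATLAS; the constraint encoding and the function name andersen are assumptions made for this illustration.

```python
from collections import defaultdict


def andersen(constraints):
    """Solve inclusion constraints by naive fixed-point iteration.

    Each constraint is (kind, lhs, rhs) with kind one of:
      'addr'  : lhs = &rhs
      'copy'  : lhs = rhs
      'load'  : lhs = *rhs
      'store' : *lhs = rhs
    """
    pts = defaultdict(set)

    def add(var, targets):
        before = len(pts[var])
        pts[var] |= targets
        return len(pts[var]) != before

    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in constraints:
            if kind == "addr":
                changed |= add(lhs, {rhs})
            elif kind == "copy":
                changed |= add(lhs, set(pts[rhs]))
            elif kind == "load":
                for target in list(pts[rhs]):
                    changed |= add(lhs, set(pts[target]))
            elif kind == "store":
                for target in list(pts[lhs]):
                    changed |= add(target, set(pts[rhs]))
    return dict(pts)


# Statement order is deliberately scrambled: the result does not depend on it.
RESULT = andersen([
    ("load", "r", "p"),    # r = *p
    ("store", "p", "q"),   # *p = q
    ("addr", "p", "a"),    # p = &a
    ("addr", "q", "b"),    # q = &b
])
```

Because the analysis is flow-insensitive, r is found to point to b even though the load is listed before the store.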

Guess (specification search space & testing specifications):
Problem: too many guesses
Solution: path specification (context sensitive)
Using unit tests to guarantee precision
S_box = ob ⇢ this_set → this_get ⇢ r_get

The algorithm takes two steps: guessing, then checking. In the guessing step, running Andersen’s algorithm makes it hard to be precise because there are too many possibilities. Remember that our goal is to be precise rather than sound, so we do not need that many candidates. We also need to represent the relationships along a chain of transfers. The solution is a representation called a path specification. Here is the example. We create a new object and store it in a container with set. Then we call get on the container and return the result. If we want to know what the return value is, we can build a path specification like the one above. The dashed arrows are the steps we are sure of from the given code, and the solid arrow represents an assumption, that is, a guess. If the guess holds, the path specification is valid. There can be multiple guesses at a point, so they use automatically generated unit tests to determine which of them are correct.
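The idea of testing a guessed specification can be sketched as follows. ATLAS targets Java libraries; this Python version is only an analogy, and Box, CopyingBox, and check_candidate are hypothetical names. The test treats the container as a black box and asks whether get returns the very object that was passed to set.

```python
class Box:
    """Hypothetical stand-in for a black-box library container."""

    def __init__(self):
        self._f = None

    def set(self, ob):
        self._f = ob

    def get(self):
        return self._f


class CopyingBox(Box):
    """A container for which the guess is wrong: get returns a fresh object."""

    def get(self):
        return object()


def check_candidate(make_container):
    """Unit test for the guess ob -> this_set -> this_get -> r_get.

    Passes only if the object returned by get is identical to the
    object stored by set, observed purely through the public API.
    """
    box = make_container()
    ob = object()
    box.set(ob)
    return box.get() is ob


BOX_OK = check_candidate(Box)
COPYING_BOX_OK = check_candidate(CopyingBox)
```

A passing test is a witness that the flow really happens, which is why accepted specifications are precise even though the search may miss some.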

Check (specification inference):
Problem: infinite possibilities
Solution: active language learning algorithm via unit tests
Finally, convert path specifications to code fragments
S_box = ob ⇢ this_set (→ this_clone ⇢ r_clone)* → this_get ⇢ r_get

After a set of path specifications has been generated, we would like to know how many of them are really helpful. This is why there is a checking step. For example, suppose that after set, the object is cloned many times before it is returned. A path specification can represent this, but it may not be helpful because we do not know how many times the object is cloned, so there are infinitely many possibilities. Here they use an algorithm called active language learning to try to determine the exact number of repetitions for some cases. If that succeeds, the path specification is helpful; otherwise, it is discarded. Finally, all the valid specifications are converted to code fragments so that you can use them in further optimizations.
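One way to picture the unit tests that drive the learning step is as membership queries on words of the form set clone^k get. The sketch below is a simplified analogy in Python, not the ATLAS algorithm: Box, oracle, and holds_up_to are hypothetical, and clone is assumed to be a shallow copy.

```python
import copy


class Box:
    """Hypothetical black-box container with a shallow clone."""

    def __init__(self):
        self._f = None

    def set(self, ob):
        self._f = ob

    def get(self):
        return self._f

    def clone(self):
        # Assumption: a shallow copy, so the clone shares the stored object.
        return copy.copy(self)


def oracle(num_clones):
    """Membership query for set clone^k get, answered by running a test."""
    box = Box()
    ob = object()
    box.set(ob)
    for _ in range(num_clones):
        box = box.clone()
    return box.get() is ob


def holds_up_to(bound):
    """Query every repetition count from 0 to bound inclusive."""
    return all(oracle(k) for k in range(bound + 1))


ZERO_CLONES = oracle(0)
THREE_CLONES = oracle(3)
UP_TO_FIVE = holds_up_to(5)
```

Passing queries up to a finite bound are only evidence for the starred form of the specification, not a proof of it.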

Sparse Flow-Sensitive Pointer Analysis for Multithreaded Programs (FSAM)
Whole-program points-to analysis for generic Pthread-based concurrent C programs
Multi-stage analysis:
  Run Andersen’s analysis (fast, but imprecise)
  Generate sparse def-use model:
    Use sequential model to approximate def-use chains for each memory location
    Identify may-happen-in-parallel (MHP) relations between threads
    Remove MHP relations prevented by locks
    Add parallel execution def-use relationships into sparse graph
  Use existing sparse def-use analysis on the new def-use model
1-2 orders of magnitude faster than non-sparse analysis
Sui, Yulei, Peng Di, and Jingling Xue. "Sparse flow-sensitive pointer analysis for multithreaded programs." Proceedings of the 2016 International Symposium on Code Generation and Optimization. ACM, 2016.
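The lock-filtering stage can be pictured with a coarse sketch: drop every MHP pair whose two statements hold a lock in common. This is a simplification of what FSAM does, and everything in it is hypothetical: the statement names A, B, C, the lock m, and the assumption that a lock name identifies exactly one runtime lock.

```python
# Hypothetical statement records: name -> (thread, locks held at that statement)
STMTS = {
    "A": ("t0", frozenset({"m"})),
    "B": ("t1", frozenset({"m"})),
    "C": ("t1", frozenset()),
}

# Hypothetical MHP relation, given as unordered statement pairs
MHP = {frozenset({"A", "B"}), frozenset({"A", "C"})}


def interfering_pairs(stmts, mhp):
    """Keep only the MHP pairs whose statements hold no lock in common."""
    kept = set()
    for pair in mhp:
        s, t = sorted(pair)
        if stmts[s][1] & stmts[t][1]:
            continue  # a shared lock makes the two statements mutually exclusive
        kept.add(pair)
    return kept


PAIRS = interfering_pairs(STMTS, MHP)
```

Only the pair A, C survives, so only that pair would contribute extra cross-thread def-use edges to the sparse graph in this simplified picture.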

FSAM: Example
main:
  S1: *p = …
  fork(t1, foo)
  S2: … = *p
  join(t1)
  S3: *p = …
  fork(t2, foo)
  lock(l1)
  S4: *p = ...
  unlock(l1)
  join(t2)
  S5: *p = …
  S6: … = *p
foo:
  lock(l2)
  S7: *q = …
  S8: *q = ...
  unlock(l2)
[Figure: def-use graphs over threads t0, t1, t2 under Andersen’s Pointer Analysis and under Flow-Sensitive Pointer Analysis. Edge legend: sequential approximation; fork/join adjustments; may happen in parallel; ignored because of lock. Shown for a variable in the “may alias” set of both q and p.]

Discussions
