Pointer analysis John Rollinson & Kaiyuan Li

Presentation transcript:

Pointer analysis
John Rollinson & Kaiyuan Li
15-745: Optimizing Compilers for Modern Architectures, Spring 2019

Points-to analysis when there is...
- Pointer arithmetic
- Unhelpful (or inaccessible) source code
- Multiple threads

Symbolic Range Analysis
Global analysis:
- Heap allocations are treated as unique memory locations
- Pointers are tracked as a set of memory locations with corresponding offset ranges
- Two pointers do not alias when their ranges do not overlap at any possible memory location
Local analysis:
- Resolves the path-sensitivity false positives of the global analysis (pointers inside a loop)
- Models each φ as a unique memory location and performs local overlap analysis for that memory location
Paisante, Vitor, et al. "Symbolic range analysis of pointers." 2016 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). IEEE, 2016.
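The no-alias rule of the global analysis can be illustrated with a minimal sketch. It assumes plain integer offset intervals instead of the symbolic bounds used in the paper, and the names `Ranges`, `intervals_overlap`, `may_alias`, and the location labels are hypothetical:

```python
from typing import Dict, Tuple

# Hypothetical model: a pointer maps each abstract memory location it may
# reference to a closed offset interval [lo, hi]. The real analysis uses
# symbolic bounds; plain integers are an assumption made for this sketch.
Ranges = Dict[str, Tuple[int, int]]


def intervals_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """Closed intervals overlap when neither one ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]


def may_alias(p: Ranges, q: Ranges) -> bool:
    """Two pointers may alias if their ranges overlap at any shared location."""
    shared = p.keys() & q.keys()
    return any(intervals_overlap(p[loc], q[loc]) for loc in shared)


# Two pointers into the same heap allocation that cover disjoint halves,
# plus a third pointer that straddles both halves and a second allocation.
p = {"malloc@1": (0, 7)}
q = {"malloc@1": (8, 15)}
r = {"malloc@1": (4, 11), "malloc@2": (0, 3)}
```

Here `may_alias(p, q)` is false because the two ranges never overlap at `malloc@1`, while `r` may alias both `p` and `q`.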

ATLAS: automatic Java library points-to analysis
- Dynamic analysis: flow-insensitive, context-sensitive, iterative
- Libraries are treated as black boxes
- Minimizes false positives (precise but not sound)
- For local optimization, but not for transformation
- Based on Andersen's algorithm
- Heap effects of functions (fragments)
- 2 steps: guess (precise) -> check (accurate)
All the methods we have seen so far assume that we have access to the source code and that the source code is useful for extracting information. That is not always true. The source code may be entirely private, or it may contain so much inline assembly or so many system calls that analyzing it is too expensive. ATLAS is a tool that can help optimization under such conditions. It uses a modified version of Andersen's algorithm and performs dynamic analysis on library APIs regardless of their implementations. One important point is that this method emphasizes precision rather than soundness, which is very different from all the methods we have seen. Why? When the tool outputs some code fragments and says it is 100% sure about their points-to behavior, you can pattern-match against them when compiling or interpreting code that uses those libraries and perform local optimizations. That is a good result considering that we never use the library source code. The trade-off is that we no longer have an overall view of the program.
Bastani, Osbert, et al. "Active learning of points-to specifications." Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 2018.

ATLAS: automatic Java library points-to analysis
Guess (specification search space & testing specifications):
- Problem: too many guesses
- Solution: path specifications (context-sensitive)
- Use unit tests to guarantee precision
S_box = ob ⇢ this_set → this_get ⇢ r_get
The algorithm takes two steps: the first is guessing and the second is checking. In the guessing step, Andersen's algorithm makes it hard to be precise because there are too many possibilities. Remember that our goal is to be precise rather than sound, so we do not really need that many candidates. We also need to represent the relationships along a chain of transfers. The solution is a method called path specification. Here is the example. We create a new object and set it into a container. Then we get from the container and return the result. If we would like to know what the return value is, we can build a path specification like the one above. The dashed arrows (⇢) are relationships that we are sure of from the given code, and the solid arrow (→) represents an assumption, that is, a guess. If this guess holds, the path specification is valid. There can be multiple guesses at a given point, so automatically generated unit tests are used to determine which of them are correct.
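The idea of testing a guessed path specification with a unit test can be sketched as follows. This is an illustration only, not the ATLAS implementation: the `Box` class stands in for a black-box library class, and the two candidate tests and their names are hypothetical.

```python
class Box:
    """Stand-in for a black-box library container (hypothetical)."""

    def __init__(self):
        self._item = None

    def set(self, ob):
        self._item = ob

    def get(self):
        return self._item


def test_set_then_get() -> bool:
    """Candidate ob -> this_set -> this_get -> r_get: get returns what set stored."""
    ob = object()
    box = Box()
    box.set(ob)
    return box.get() is ob


def test_get_returns_receiver() -> bool:
    """Alternative guess: get returns the box itself. Expected to be rejected."""
    box = Box()
    box.set(object())
    return box.get() is box


# Run every candidate's unit test and keep only the guesses that are witnessed.
candidates = {
    "set_then_get": test_set_then_get,
    "get_returns_receiver": test_get_returns_receiver,
}
accepted = [name for name, test in candidates.items() if test()]
```

Only `set_then_get` survives, because it is the only guess for which the unit test actually observes the claimed aliasing. A guess that no test can witness is dropped, which favors precision over soundness.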

ATLAS: automatic Java library points-to analysis
Check (specification inference):
- Problem: infinitely many possibilities
- Solution: active language learning algorithm via unit tests
- Finally, convert path specifications to code fragments
S_box = ob ⇢ this_set (→ this_clone ⇢ r_clone)* → this_get ⇢ r_get
After a set of path specifications has been generated, we would like to know how many of them are really helpful. This is why there is a checking step. For example, suppose that after set, the object is cloned many times before it is returned. This can be represented by a path specification, but it may not be helpful because we do not know how many times the object is cloned: there are infinitely many possibilities. So an algorithm called active language learning is used to try to determine the exact number of repetitions for some cases. If it can, the path specification is helpful; otherwise, it is discarded. Finally, all the valid items are converted to code fragments so that you can use them in further optimizations.
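A starred path specification describes a regular set of paths, and each finite unrolling can be tested on its own. The sketch below illustrates that point only; it does not implement the active language learning algorithm. The space-separated word encoding, the `Box` class with `clone`, and the helpers `unroll` and `holds_for` are assumptions made for this example.

```python
import re

# Hypothetical encoding: a path is a space-separated word over the
# specification's symbols, and the starred specification is a regular
# expression over such words.
STARRED_SPEC = re.compile(r"ob this_set( this_clone r_clone)* this_get r_get")


def unroll(k: int) -> str:
    """Return the path word in which the clone step is repeated k times."""
    steps = ["ob", "this_set"] + ["this_clone", "r_clone"] * k + ["this_get", "r_get"]
    return " ".join(steps)


class Box:
    """Stand-in for a black-box library container (hypothetical)."""

    def __init__(self, item=None):
        self._item = item

    def set(self, ob):
        self._item = ob

    def clone(self):
        # Shallow clone: the new box shares the stored object.
        return Box(self._item)

    def get(self):
        return self._item


def holds_for(k: int) -> bool:
    """Unit test for the k-clone unrolling: the stored object survives k clones."""
    ob = object()
    box = Box()
    box.set(ob)
    for _ in range(k):
        box = box.clone()
    return box.get() is ob


# Test the first few unrollings of the starred specification.
checked = {k: holds_for(k) for k in range(4)}
```

Every unrolling produced by `unroll` is a member of `STARRED_SPEC`, and each one passes its unit test for this particular `Box`.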

Sparse Flow-Sensitive Pointer Analysis for Multithreaded Programs (FSAM)
Whole-program points-to analysis for generic Pthread-based concurrent C programs
Multi-stage analysis:
1. Run Andersen's analysis (fast, but imprecise)
2. Generate a sparse def-use model:
   - Use a sequential model to approximate def-use chains for each memory location
   - Identify may-happen-in-parallel (MHP) relations between threads
   - Remove MHP relations prevented by locks
   - Add parallel-execution def-use relationships to the sparse graph
3. Use an existing sparse def-use analysis on the new def-use model
1-2 orders of magnitude faster than non-sparse analysis
Sui, Yulei, Peng Di, and Jingling Xue. "Sparse flow-sensitive pointer analysis for multithreaded programs." Proceedings of the 2016 International Symposium on Code Generation and Optimization. ACM, 2016.
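The first stage, Andersen's analysis, can be sketched as a naive fixpoint over inclusion constraints. This is a minimal illustration, not the pre-analysis used by FSAM: the constraint tuple encoding and the function name `andersen` are assumptions, and a practical implementation would use a worklist over a constraint graph.

```python
from collections import defaultdict


def andersen(constraints):
    """Flow-insensitive, inclusion-based points-to analysis (naive fixpoint).

    Each constraint is a tuple (kind, lhs, rhs):
      ("addr",  p, x)  models  p = &x
      ("copy",  p, q)  models  p = q
      ("load",  p, q)  models  p = *q
      ("store", p, q)  models  *p = q
    """
    pts = defaultdict(set)
    changed = True
    while changed:  # repeat until no points-to set grows
        changed = False
        for kind, lhs, rhs in constraints:
            if kind == "addr":
                updates = [(lhs, {rhs})]
            elif kind == "copy":
                updates = [(lhs, pts[rhs])]
            elif kind == "load":
                updates = [(lhs, pts[t]) for t in list(pts[rhs])]
            elif kind == "store":
                updates = [(t, pts[rhs]) for t in list(pts[lhs])]
            else:
                raise ValueError(f"unknown constraint kind: {kind}")
            for target, new in updates:
                if not new <= pts[target]:
                    pts[target] |= new
                    changed = True
    return {var: targets for var, targets in pts.items() if targets}


# p = &a; q = &b; r = &p; *r = q; s = *r
result = andersen([
    ("addr", "p", "a"),
    ("addr", "q", "b"),
    ("addr", "r", "p"),
    ("store", "r", "q"),
    ("load", "s", "r"),
])
```

Because the analysis ignores statement order, `p` ends up pointing to both `a` and `b`. This is the kind of imprecision that the later flow-sensitive stage is meant to reduce.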

FSAM: Example
main:
    S1: *p = …
    fork(t1, foo)
    S2: … = *p
    join(t1)
    S3: *p = …
    fork(t2, foo)
    lock(l1)
    S4: *p = ...
    unlock(l1)
    join(t2)
    S5: *p = …
    S6: … = *p
foo:
    lock(l2)
    S7: *q = …
    S8: *q = ...
    unlock(l2)
[Figure: def-use graphs over threads t0, t1, t2, comparing Andersen's Pointer Analysis with Flow-Sensitive Pointer Analysis]
Legend:
- Sequential approximation
- Fork/join adjustments
- May happen in parallel
- Ignored because of lock
For a variable in the "may alias" set of both q & p
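The may-happen-in-parallel step for this example can be sketched with a simple scan over the fork and join events of main. This is a simplified illustration, not FSAM's MHP analysis: it assumes a straight-line main thread, only pairs main statements with child-thread statements, and leaves out lock/unlock, so the lock-based filtering from the slide is not modeled. The event encoding and the name `mhp_pairs` are hypothetical.

```python
def mhp_pairs(main_events, thread_bodies):
    """Pair each main statement with the statements of threads live at that point.

    main_events: ordered list of ("stmt", label), ("fork", tid, func), ("join", tid)
    thread_bodies: maps a function name to the statement labels in its body
    """
    live = {}  # thread id -> function it runs, for threads forked but not yet joined
    pairs = set()
    for event in main_events:
        if event[0] == "fork":
            live[event[1]] = event[2]
        elif event[0] == "join":
            live.pop(event[1], None)
        elif event[0] == "stmt":
            for func in live.values():
                for thread_stmt in thread_bodies[func]:
                    pairs.add((event[1], thread_stmt))
    return pairs


# The example program, with lock/unlock omitted (an assumption of this sketch).
main_events = [
    ("stmt", "S1"),
    ("fork", "t1", "foo"),
    ("stmt", "S2"),
    ("join", "t1"),
    ("stmt", "S3"),
    ("fork", "t2", "foo"),
    ("stmt", "S4"),
    ("join", "t2"),
    ("stmt", "S5"),
    ("stmt", "S6"),
]
thread_bodies = {"foo": ["S7", "S8"]}
pairs = mhp_pairs(main_events, thread_bodies)
```

Only S2 and S4 fall between a fork and its matching join, so they are the only main statements paired with S7 and S8. S1, S3, S5, and S6 execute while no child thread is live.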

Discussions