Type-Based Flow Analysis: From Polymorphic Subtyping to CFL-Reachability Jakob Rehof and Manuel Fähndrich Microsoft Research.

Presentation transcript:

Type-Based Flow Analysis: From Polymorphic Subtyping to CFL-Reachability Jakob Rehof and Manuel Fähndrich Microsoft Research

Type-Based Program Analysis
Common vocabulary: data access paths; function summary; context-sensitivity; directional flow
Type-based: type structure (×); function type (->); type instantiation, polymorphism (∀); subtyping (≤)

Precision and Cost
GOAL: Scalable flow analysis of H.O. programs w. polymorphic subtyping: type-based, higher-order, context-sensitive (CS), directional (DI)
-CS -DI (=,=)
+CS -DI (∀,=)
-CS +DI (=,≤)
+CS +DI (∀,≤): the goal

Outline: Goals; Problems and Results; Current Flow Analysis w. ∀ + ≤; Our Solution; Summary

Current Method
(∀) Polymorphism by copying types
(≤) Subtyping by constrained types
(∀ + ≤) ⇒ constraint copying

Problems w. Current Method
Constraint copying is expensive (memory)
Constraint simplification is hard
Previous algorithm (Mossin): O(n^8) (n = size of type-annotated program)
No on-demand algorithms

Results
No constraint copying
On-demand queries
All flow in O(n^3)

Outline: Goals; Problems and Results; Current Flow Analysis w. ∀ + ≤; Our Solution; Summary

Current Flow Analysis w. ∀ + ≤ (Mossin)
max(s,t) = if s<=t then t else s
real * real -> real    (standard type)

Current Flow Analysis w. ∀ + ≤
max(s:a,t:b) = (if s<=t then t else s) :c
{a ≤ c, b ≤ c} => real:a * real:b -> real:c
(analysis type; subtyping constraints; flow labels)

Current Flow Analysis w. ∀ + ≤
max(s:a,t:b) = (if s<=t then t else s) :c
{a ≤ c, b ≤ c} => real:a * real:b -> real:c
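The subtyping constraints on the labels of max can be read as directed flow edges. The following is a minimal Python sketch of that reading, not the authors' implementation: the function name `flows_to` and the pair encoding of a constraint `l1 ≤ l2` as `(l1, l2)` are assumptions made for illustration.

```python
from collections import defaultdict

def flows_to(constraints, source):
    """Return every label reachable from `source` along l1 <= l2 constraints."""
    succ = defaultdict(set)
    for lo, hi in constraints:
        succ[lo].add(hi)
    seen, stack = {source}, [source]
    while stack:
        label = stack.pop()
        for nxt in succ[label]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Constraints of max(s:a,t:b):c from the slide: {a <= c, b <= c}
max_constraints = [("a", "c"), ("b", "c")]
print(sorted(flows_to(max_constraints, "a")))  # ['a', 'c']
```

Because the constraints are directional, a flows to c but not to b, which is the precision that subtyping adds over label equality.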

Current Flow Analysis w. ∀ + ≤
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0,y0)
max(x1,y1)

max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0:a0,y0:b0):c0    {a0 ≤ c0, b0 ≤ c0} => c0
max(x1:a1,y1:b1):c1

Current Flow Analysis w. ∀ + ≤
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0:a0,y0:b0):c0    {a0 ≤ c0, b0 ≤ c0} => c0
max(x1:a1,y1:b1):c1    {a1 ≤ c1, b1 ≤ c1} => c1
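The copying step on this slide can be sketched in a few lines of Python. This is an illustrative sketch only: `instantiate` and the dictionary encoding of the substitutions are hypothetical names, not part of the talk.

```python
def instantiate(constraints, subst):
    """Copy a constraint set, renaming each generic label via `subst`."""
    return [(subst.get(lo, lo), subst.get(hi, hi)) for lo, hi in constraints]

# Generic constraints of max: {a <= c, b <= c}
generic = [("a", "c"), ("b", "c")]

# One full copy per call site, as in the copying-based method.
site0 = instantiate(generic, {"a": "a0", "b": "b0", "c": "c0"})
site1 = instantiate(generic, {"a": "a1", "b": "b1", "c": "c1"})
copied = site0 + site1

print(site0)        # [('a0', 'c0'), ('b0', 'c0')]
print(len(copied))  # 4
```

Every additional call site adds another copy of the whole constraint set, which is the memory cost the talk sets out to eliminate.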

Without Subtyping:
norm(x,y) = let m = max(x,y) in scale(x,y,m) end;
scale(z,w,n) = (z/n,w/n)
max(s:a,t:a) = if s<=t then t else s
real:a * real:a -> real:a    :a’

Without Subtyping:
norm(x:a’,y:a’) = let m = max(x,y) in scale(x,y,m) end;
scale(z,w,n) = (z/n,w/n)
max(s:a,t:a) = if s<=t then t else s
real:a * real:a -> real:a
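Without subtyping, labels are related by equality only, so the call max(x,y) forces x and y to share one label. A small union-find sketch in Python illustrates the effect; the class `LabelSets` and the label name `a_inst` (standing for the instance a’ of a) are assumptions introduced here, not names from the talk.

```python
class LabelSets:
    """Union-find over flow labels: equal labels end up in one class."""

    def __init__(self):
        self.parent = {}

    def find(self, label):
        self.parent.setdefault(label, label)
        while self.parent[label] != label:
            # Path halving keeps the trees shallow.
            self.parent[label] = self.parent[self.parent[label]]
            label = self.parent[label]
        return label

    def unify(self, l1, l2):
        r1, r2 = self.find(l1), self.find(l2)
        if r1 != r2:
            self.parent[r1] = r2

labels = LabelSets()
# max : real:a * real:a -> real:a, instantiated at the call with a := a_inst
labels.unify("x", "a_inst")  # first argument
labels.unify("y", "a_inst")  # second argument
labels.unify("m", "a_inst")  # result bound to m

print(labels.find("x") == labels.find("y"))  # True
```

The analysis can no longer tell x and y apart inside norm, even though max is polymorphic in a.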

Outline: Goals; Problems and Results; Current Flow Analysis w. ∀ + ≤; Our Solution; Summary

Flow Analysis Overview (diagram): Source Code; Type Inference; Type Instantiation Graph; Flow Graph; A; B

Flow Analysis Overview (diagram): Source Code; Type Inference; Polymorphic Subtyping; Type Instantiation Graph; Flow Graph; A; B; CFL-Reachability

Eliminating constraint copies
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0:a0,y0:b0):c0    {a0 ≤ c0, b0 ≤ c0} => real:a0 * real:b0 -> real:c0
max(x1:a1,y1:b1):c1    {a1 ≤ c1, b1 ≤ c1} => real:a1 * real:b1 -> real:c1

1. Get a graph
max(s:a,t:b) : real:a * real:b -> real:c
max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1

2. Label instantiation sites
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1

3. Represent substitutions
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0    a ↦ a0, b ↦ b0, c ↦ c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1    a ↦ a1, b ↦ b1, c ↦ c1

3.a. … as a graph
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i

3.a. … as a graph
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i, j, j, j

4. Eliminate constraint copies!
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i, j, j, j

? ? ?
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i, j, j, j

Type Theory to the Rescue!
Polarity (+,-): in a function type (->), argument positions are negative (-) and the result position is positive (+)

5. Polarities (+,-)
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i, j, j, j

6. Reverse negative edges
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i, j, j, j
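Steps 5 and 6 can be sketched as follows. This is a hedged Python illustration, not the authors' code: the tuple encoding of types, the helper names `polarities` and `instantiation_edges`, and the choice to write an edge at site i as `[i` when it enters the generic type and `]i` when it leaves it are all assumptions, picked to agree with the bracket labels shown in step 9.

```python
def polarities(ty, sign=+1, out=None):
    """Map each flow label in `ty` to +1 (positive) or -1 (negative).

    Types are encoded as ("real", label) or ("fun", [params], result).
    Parameters flip the polarity; the result keeps it.
    """
    if out is None:
        out = {}
    if ty[0] == "real":
        out[ty[1]] = sign
    elif ty[0] == "fun":
        _, params, result = ty
        for param in params:
            polarities(param, -sign, out)
        polarities(result, sign, out)
    return out

def instantiation_edges(generic_ty, subst, site):
    """Orient one instantiation: positive labels flow out of the generic
    type (']site'), negative ones are reversed and flow into it ('[site')."""
    edges = []
    for label, sign in polarities(generic_ty).items():
        instance = subst[label]
        if sign > 0:
            edges.append((label, f"]{site}", instance))
        else:
            edges.append((instance, f"[{site}", label))
    return edges

# max : real:a * real:b -> real:c
max_ty = ("fun", [("real", "a"), ("real", "b")], ("real", "c"))
print(polarities(max_ty))  # {'a': -1, 'b': -1, 'c': 1}
print(instantiation_edges(max_ty, {"a": "a0", "b": "b0", "c": "c0"}, "i"))
# [('a0', '[i', 'a'), ('b0', '[i', 'b'), ('c', ']i', 'c0')]
```

After reversal, every edge points in the direction values actually move: arguments flow from the call site into max, and the result flows back out.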

7. Recover flow
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i, j, j, j

8. Be careful!
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: i, i, i, j, j, j
Spurious!

9. Do CFL-reachability
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0    real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1    real:a1 * real:b1 -> real:c1
Edge labels: [i, ]i, [j, ]j, d, d
CFG: M → [k M ]k
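The matched-bracket check in step 9 can be sketched with a naive fixpoint computation. This Python sketch is an illustration under stated assumptions, not the cubic worklist algorithm from the talk: it covers only the matched nonterminal M (with the rules M → [k M ]k | M M | d | ε), the function name `matched_reachability` is hypothetical, and the edge orientation (arguments enter max on `[k`, the result leaves on `]k`) is assumed.

```python
def matched_reachability(nodes, d_edges, open_edges, close_edges):
    """Return all pairs (u, v) joined by a path whose labels derive from M.

    d_edges:     (src, dst) pairs for subtyping edges labeled d
    open_edges:  (src, site, dst) triples for edges labeled [site
    close_edges: (src, site, dst) triples for edges labeled ]site
    """
    m = {(n, n) for n in nodes} | set(d_edges)  # M -> epsilon | d
    changed = True
    while changed:
        new = set()
        # M -> M M
        for (u, v) in m:
            for (v2, w) in m:
                if v == v2:
                    new.add((u, w))
        # M -> [k M ]k : brackets must come from the same site k
        for (x, k, u) in open_edges:
            for (v, k2, y) in close_edges:
                if k == k2 and (u, v) in m:
                    new.add((x, y))
        changed = not new <= m
        m |= new
    return m

nodes = ["a", "b", "c", "a0", "b0", "c0", "a1", "b1", "c1"]
d_edges = [("a", "c"), ("b", "c")]
open_edges = [("a0", "i", "a"), ("b0", "i", "b"),
              ("a1", "j", "a"), ("b1", "j", "b")]
close_edges = [("c", "i", "c0"), ("c", "j", "c1")]

flow = matched_reachability(nodes, d_edges, open_edges, close_edges)
print(("a0", "c0") in flow)  # True
print(("a0", "c1") in flow)  # False
```

The path from a0 to c1 exists in the plain graph but spells `[i d ]j`, which no M derivation matches, so the spurious flow from step 8 is rejected while the generic constraints are stored only once.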

Further Issues
Polymorphic type structure
Recursive type structure
– context-sensitive data-dependence analysis is uncomputable [Reps 00]
– our techniques require finite types
– regular unbounded data types handled via finite approximations: recursive type expressions

One-level implementation
GOLF analysis system for C by Manuvir Das (MSR) and Ben Liblit (Berkeley)
Exhaustive points-to sets for MS Word 97, 1.4 MLOC, in 2 minutes

Outline: Goals; Problems and Results; Current Flow Analysis w. ∀ + ≤; Our Solution; Summary

Elimination of constraint copying
Reformulation of polymorphic subtyping with instantiation constraints
Transfer of CFL-reachability techniques to type-based flow analysis

Scalable Program Analysis Project (MSR, spt)
-CS -DI (=,=)
+CS -DI (∀,=)
-CS +DI (=,≤)
+CS +DI (∀,≤)
[RF, POPL 01] [Das, PLDI 00] [FRD, PLDI 00]
research.microsoft.com/spa

Summary
Type-based flow analysis
– all flow in O(n^3), n = typed pgm size
– context-sensitive (polymorphism)
– directional (subtyping)
– demand-driven algorithm
– incorporates label-polymorphic recursion
– works directly on H.O. programs
– structured data of finite type
– unbounded data structures via approx.

CFL Formulation
S → P N
P → M P | [ P | ε
N → M N | ] N | ε
M → [k M ]k | M M | d | ε

Type System
e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end
instantiation constraints; subtyping constraints; type environment |- e : c0*c1
Instantiation constraints: a ≼_i a0, b ≼_i b0, c ≼_i c0, a ≼_j a1, b ≼_j b1, c ≼_j c1
Subtyping constraints: a ≤ c, b ≤ c