Lecture 13 – Dataflow Analysis (and more!) Eran Yahav 1 Reference: Dragon 9, 12.

Slides:



Advertisements
Similar presentations
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
Advertisements

1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Dataflow Analysis Introduction Guo, Yao Part of the slides are adapted from.
Program Representations. Representing programs Goals.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Partial Redundancy Elimination Guo, Yao.
1 Data flow analysis Goal : collect information about how a procedure manipulates its data This information is used in various optimizations For example,
Trace-based Just-in-Time Type Specialization for Dynamic Languages Andreas Gal, Brendan Eich, Mike Shaver, David Anderson, David Mandelin, Mohammad R.
Lecture 01 - Introduction Eran Yahav 1. 2 Who? Eran Yahav Taub 734 Tel: Monday 13:30-14:30
Lecture 15 – Dataflow Analysis Eran Yahav 1
Lecture 02 – Structural Operational Semantics (SOS) Eran Yahav 1.
Lecture 16 – Dataflow Analysis Eran Yahav 1 Reference: Dragon 9, 12.
Lecture 17 – Dataflow Analysis (and more!) Eran Yahav 1 Reference: Dragon 9, 12.
Lecture 03 – Background + intro AI + dataflow and connection to AI Eran Yahav 1.
Lecture 04 – Numerical Abstractions Eran Yahav 1.
Foundations of Data-Flow Analysis. Basic Questions Under what circumstances is the iterative algorithm used in the data-flow analysis correct? How precise.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
Cpeg421-08S/final-review1 Course Review Tom St. John.
1 Data flow analysis Goal : –collect information about how a procedure manipulates its data This information is used in various optimizations –For example,
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
Advanced Compilers CSE 231 Instructor: Sorin Lerner.
Chair of Software Engineering Fundamentals of Program Analysis Dr. Manuel Oriol.
Data Flow Analysis Compiler Design Nov. 3, 2005.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
Class canceled next Tuesday. Recap: Components of IR Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements.
1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Data Flow Analysis Compiler Design Nov. 8, 2005.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Loops Guo, Yao.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Ben Livshits Based in part of Stanford class slides from
Eran Yahav Technion Recent Developments in the Real World 1.
Constant Propagation. The constant propagation framework is different from all the data-flow problems discussed so far, in that It has an unbounded set.
Precision Going back to constant prop, in what cases would we lose precision?
1 CS 201 Compiler Construction Data Flow Analysis.
1 ECE 453 – CS 447 – SE 465 Software Testing & Quality Assurance Instructor Kostas Kontogiannis.
Data Flow Analysis Compiler Baojian Hua
Created by, Author Name, School Name—State FLUENCY WITH INFORMATION TECNOLOGY Skills, Concepts, and Capabilities.
Dataflow Analysis Topic today Data flow analysis: Section 3 of Representation and Analysis Paper (Section 3) NOTE we finished through slide 30 on Friday.
MIT Introduction to Program Analysis and Optimization Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.
Jeffrey D. Ullman Stanford University. 2 boolean x = true; while (x) {... // no change to x }  Doesn’t terminate.  Proof: only assignment to x is at.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
1 Data Flow Analysis Data flow analysis is used to collect information about the flow of data values across basic blocks. Dominator analysis collected.
Program Representations. Representing programs Goals.
 Control Flow statements ◦ Selection statements ◦ Iteration statements ◦ Jump statements.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
Optimization Simone Campanoni
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 8: Static Analysis II Roman Manevich Ben-Gurion University.
Dataflow Analysis CS What I s Dataflow Analysis? Static analysis reasoning about flow of data in program Different kinds of data: constants, variables,
Code Optimization Data Flow Analysis. Data Flow Analysis (DFA)  General framework  Can be used for various optimization goals  Some terms  Basic block.
ActionScript Programming Help
Data Flow Analysis Suman Jana
Textbook: Principles of Program Analysis
Program Representations
Topic 10: Dataflow Analysis
University Of Virginia
1. Reaching Definitions Definition d of variable v: a statement d that assigns a value to v. Use of variable v: reference to value of v in an expression.
Data Flow Analysis Compiler Design
Topic-4a Dataflow Analysis 2019/2/22 \course\cpeg421-08s\Topic4-a.ppt.
Static Single Assignment
Presentation transcript:

Lecture 13 – Dataflow Analysis (and more!) Eran Yahav 1 Reference: Dragon 9, 12

Last week: dataflow  General recipe  design the domain  transfer functions  determine join operation (may/must?)  direction: forward/backward?  note initial values 2

Dataflow Analysis  Build control-flow graph  Assign transfer functions  Compute fixed point 3

Control-Flow Graph 1: y := x; 2: z := 1; 3: while y > 0 { 4: z := z * y; 5: y := y − 1 } 6: y := 0 1: y:=x 2: z:=1 3: y > 0 4: z=z*y 5: y=y-1 6: y:=0 4

Executions 5 1: y:=x 2: z:=1 3: y > 0 4: z=z*y 5: y=y-1 6: y:=0 1: y := x; 2: z := 1; 3: while y > 0 { 4: z := z * y; 5: y := y − 1 } 6: y := 0

Input/output Sets 6 1: y := x; 2: z := 1; 3: while y > 0 { 4: z := z * y; 5: y := y − 1 } 6: y := 0 1: y:=x 2: z:=1 3: y > 0 4: z=z*y 5: y=y-1 6: y:=0 in(1) out(1) in(2) out(2) in(3) out(3) in(4) out(4) in(5) out(5) in(6) out(6)

Transfer Functions 1: y:=x 2: z:=1 3: y > 0 4: z=z*y 5: y=y-1 6: y:=0 out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } in(1) = { (x,?), (y,?), (z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) in(6) = out(3) out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) } out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } out(3) = in(3) 7

System of Equations in(1) = { (x,?), (y,?), (z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) In(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } out(3) = in(3) out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) } out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } F: (  (Var x Lab) ) 12  (  (Var x Lab) ) 12 8

System of Equations in(1) = { (x,?), (y,?), (z,?) } in(2) = out(1) in(3) = out(2) U out(5) in(4) = out(3) in(5) = out(4) In(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) } out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } out(3) = in(3) out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) } out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } F: (  (Var x Lab) ) 12  (  (Var x Lab) ) 12 in(1)={(x,?),(y,?),(z,?)}, in(2)=out(1), in(3)=out(2) U out(5), in(4)=out(3), in(5)=out(4),In(6) = out(3) out(1) = in(1) \ { (y,l) | l  Lab } U { (y,1) }, out(2) = in(2) \ { (z,l) | l  Lab } U { (z,2) } out(3) = in(3), out(4) = in(4) \ { (z,l) | l  Lab } U { (z,4) } out(5) = in(5) \ { (y,l) | l  Lab } U { (y,5) }, out(6) = in(6) \ { (y,l) | l  Lab } U { (y,6) } In(1)  {(x,?),(y,?),(z,?)}== In(2)  {(y,1)}{(x,?),(y,1),(z,?)}== In(3)  {(z,2),(y,5)} {(z,2),(z,4),(y,5)} In(4)  {(z,2),(y,5)} In(5)  {(z,4)} In(6)  {(z,2),(y,5)} Out(1)  {(y,1)}{(x,?),(y,1),(z,?)}== Out(2)  {(z,2)} {(z,2),(y,1)}== Out(3)  {(z,2),(y,5)} Out(4)  {(z,4)} Out(5)  {(y,5)} {(z,4),(y,5)} Out(6)  {(y,6)} … 9

F: (  (Var x Lab) ) 12  (  (Var x Lab) ) 12 RD  RD’ when  i: RD(i)  RD’(i) … RD  F(RD  )F(F(RD  ))F(F(F(RD  )))F(F(F(F(RD  )))) In(1)  {(x,?),(y,?),(z,?)}== In(2)  {(y,1)}{(x,?),(y,1),(z,?)}== In(3)  {(z,2),(y,5)} {(z,2),(z,4),(y,5)} In(4)  {(z,2),(y,5)} In(5)  {(z,4)} In(6)  {(z,2),(y,5)} Out(1)  {(y,1)}{(x,?),(y,1),(z,?)}== Out(2)  {(z,2)} {(z,2),(y,1)}== Out(3)  {(z,2),(y,5)} Out(4)  {(z,4)} Out(5)  {(y,5)} {(z,4),(y,5)} Out(6)  {(y,6)} System of Equations 10

Monotonicity F: (  (Var x Lab) ) 12  (  (Var x Lab) ) 12 RD  F(RD  )F(F(RD  ))F(F(F(RD  )))F(F(F(F(RD  )))) … Out(1)  {(y,1)}{(x,?),(y,1),(z,?)}{(y,1)}…. … … out(1) = {(y,1)} when (x,?)  out(1) in(1) \ { (y,l) | l  Lab } U { (y,1) }, otherwise (silly example just for illustration) 11

Analyses as Monotone Frameworks  Property space  Powerset  Clearly a complete lattice  Transformers  Kill/gen form  Monotone functions (let’s show it) 12

Kill/Gen formulation for Reaching Definitions 13 Blockout (lab) [x := a] lab in(lab) \ { (x,l) | l  Lab } U { (x,lab) } [skip] lab in(lab) [b] lab in(lab) Blockkillgen [x := a] lab { (x,l) | l  Lab }{ (x,lab) } [skip] lab  [b] lab  For each program point, which assignments may have been made and not overwritten, when program execution reaches this point along some path.

Monotonicity of Kill/Gen transformers  Have to show that x  x’ implies f(x)  f(x’)  Assume x  x’, then for kill set k and gen set g (x \ k) U g  (x’ \ k) U g  Technically, since we want to show it for all functions in F, we also have to show that the set is closed under function composition 14

Distributivity of Kill/Gen transformers  Have to show that f(x  y)  f(x)  f(y)  f(x  y) = ((x  y) \ k) U g = ((x \ k)  (y \ k)) U g = (((x \ k) U g)  ((y \ k) U g)) = f(x)  f(y)  Used distributivity of  and U  Works regardless of whether  is U or  15

Solving Gen/Kill Equations  Designated block Entry with OUT[Entry]=   pred(B) = predecessor nodes of B in the control flow graph 16 OUT[ENTRY] =  ; for (each basic block B other than ENTRY) OUT[B] =  ; while (changes to any OUT occur) { for (each basic block B other than ENTRY) { OUT[B]= (IN[B]  killB)  genB IN[B] =  p  pred(B) OUT[p] }

Analyses Summary 17 Reaching Definitions Available Expressions Live Variables L  (Var x Lab)  (AExp)  (Var)    AExp  Initial{ (x,?) | x  Var}  Entry labels{ init } final DirectionForward Backward F{ f: L  L |  k,g : f(val) = (val \ k) U g } f lab f lab (val) = (val \ kill)  gen

Meanwhile, IRL  new compilers for new languages  new compilers for old languages  e.g., Java->JavaScript  new uses of compiler technology  … 18

Google Web Toolkit 19 Javascript code js Java code txt Semantic Representation Backend (synthesis) Compiler Frontend (analysis)

GWT Compiler Optimization public class ShapeExample implements EntryPoint { private static final double SIDE_LEN_SMALL = 2; private final Shape shape = new SmallSquare(); public static abstract class Shape { public abstract double getArea(); } public static abstract class Square extends Shape { public double getArea() { return getSideLength() * getSideLength(); } public abstract double getSideLength(); } public static class SmallSquare extends Square { public double getSideLength() { return SIDE_LEN_SMALL; } } public void onModuleLoad() { Shape shape = getShape(); Window.alert("Area is " + shape.getArea()); } private Shape getShape() { return shape; } 20 (source:

GWT Compiler Optimization public class ShapeExample implements EntryPoint { public void onModuleLoad() { Window.alert("Area is 4.0"); } 21 (source:

Adobe ActionScript First introduced in Flash Player 9, ActionScript 3 is an object-oriented programming (OOP) language based on ECMAScript—the same standard that is the basis for JavaScript—and provides incredible gains in runtime performance and developer productivity. ActionScript 2, the version of ActionScript used in Flash Player 8 and earlier, continues to be supported in Flash Player 9 and Flash Player ActionScript 3.0 introduces a new highly optimized ActionScript Virtual Machine, AVM2, which dramatically exceeds the performance possible with AVM1. As a result, ActionScript 3.0 code executes up to 10 times faster than legacy ActionScript code. (source:

Adobe ActionScript 23 AVM bytecode swf AS code txt Semantic Representation Backend (synthesis) Compiler Frontend (analysis)

Adobe ActionScript: Tamarin 24 The goal of the "Tamarin" project is to implement a high-performance, open source implementation of the ActionScript™ 3 language, which is based upon and extends ECMAScript 3rd edition (ES3). ActionScript provides many extensions to the ECMAScript language, including packages, namespaces, classes, and optional strict typing of variables. "Tamarin" implements both a high-performance just-in-time compiler and interpreter. The Tamarin virtual machine is used within the Adobe® Flash® Player and is also being adopted for use by projects outside Adobe. The Tamarin just-in- time compiler (the "NanoJIT") is a collaboratively developed component used by both Tamarin and Mozilla TraceMonkey. The ActionScript compiler is available as a component from the open source Flex SDK.Flex SDK

Mozilla SpiderMonkey  SpiderMonkey is a fast interpreter  runs an untyped bytecode and operates on values of type jsval—type-tagged double-sized values that represent the full range of JavaScript values.jsvaldouble-sized  SpiderMonkey contains two JavaScript Just-In-Time (JIT) compilers, a garbage collector, code implementing the basic behavior of JavaScript values…  SpiderMonkey's interpreter is mainly a single, tremendously long function that steps through the bytecode one instruction at a time, using a switch statement (or faster alternative, depending on the compiler) to jump to the appropriate chunk of code for the current instruction. 25 (source:

Mozilla SpiderMonkey: Compiler  consumes JavaScript source code  produces a script which contains bytecode, source annotations, and a pool of string, number, and identifier literals. The script also contains objects, including any functions defined in the source code, each of which has its own, nested script.  The compiler consists of  a random-logic rather than table-driven lexical scanner  a recursive-descent parser that produces an AST  a tree-walking code generator  The emitter does some constant folding and a few codegen optimizations 26 (source:

Mozilla SpiderMonkey  SpiderMonkey contains a just-in-time trace compiler that converts bytecode to machine code for faster execution.  The JIT works by detecting commonly executed loops, tracing the executed bytecodes in those loops as they run in the interpreter, and then compiling the trace to machine code.  See the page about the Tracing JIT for more details.Tracing JIT  The SpiderMonkey GC is a mark-and-sweep, non- conservative (exact) collector. 27 (source:

Mozilla TraceMonkey 28 (source:

Mozilla TraceMonkey  Goal: generate type-specialized code  Challenges  no type declarations  statically trying to determine types mostly hopeless  Idea  run the program for a while and observe types  use observed types to generate type-specialized code  compile traces  Sounds suspicious? 29

Mozilla TraceMonkey  Problem 1: “observing types” + compiling possibly more expensive than running the code in the interpreter  Solution: only compile code that executes many times  “hot code” = loops  initially everything runs in the interpreter  start tracing a loop once it becomes “hot” 30

Mozilla TraceMonkey  Problem 2: past types do not guarantee future types… what happens if types change?  Solution: insert type-checks into the compiled code  if type-check fails, need to recompile for new types  code with frequent type changes will suffer some slowdown 31

Mozilla TraceMonkey 32 function addTo(a, n) { for (var i = 0; i < n; ++i) a = a + i; return a; } var t0 = new Date(); var n = addTo(0, ); print(n); print(new Date() - t0); a = a + i; // a is an integer number (0 before, 1 after) ++i; // i is an integer number (1 before, 2 after) if (!(i < n)) // n is an integer number ( ) break; trace_1_start: ++i; // i is an integer number (0 before, 1 after) temp = a + i; // a is an integer number (1 before, 2 after) if (lastOperationOverflowed()) exit_trace(OVERFLOWED); a = temp; if (!(i < n)) // n is an integer number ( ) exit_trace(BRANCHED); goto trace_1_start;

33 Mozilla TraceMonkey SystemRun Time (ms) SpiderMonkey (FF3)990 TraceMonkey (FF3.5)45 Java (using int)25 Java (using double)74 C (using int)5 C (using double)15

Static Analysis Tools  Coverity  SLAM  ASTREE  … 34

35

36

Lots and lots of research  Program Analysis  Program Synthesis  …  Next semester  Project –contact me if you’re interested  Advanced course in program analysis 37

THE END 38