Data Flow Analysis Compiler Baojian Hua


Front End
source code -> lexical analyzer -> tokens -> parser -> abstract syntax tree -> semantic analyzer -> IR

Middle End
AST -> translation -> IR1 -> translation -> IR2 -> other IRs and translations -> asm

Optimizations
AST -> translation -> IR1 -> translation -> IR2 -> other IRs and translations -> asm, with optimization passes (opt) added to this pipeline

General Scheme for Optimization
Analysis: control flow, data flow, dependency, ... The goal is to obtain conservative static knowledge of the program being optimized, which is an approximation of its dynamic behavior.
Rewriting: rewrite the program depending on the knowledge obtained above.
IR -> analysis -> static information -> rewriting -> IR'

"Conservative Static"
    Cjump (x==5 ? L1 : L2)
    y = 1   |   y = 2        (the two branch targets)
    print (y)
Can we substitute y with the value 2 in print (y)? That amounts to proving that the branch assigning 2 is always taken, i.e. deciding statically whether x is always equal to 5. Suppose x is an input from the user: it is impossible to know its value statically. So one must be conservative when using static knowledge.

Liveness Analysis

Motivation
Low-level IRs assume an infinite number of abstract "registers". This is good for code generation but bad for execution on a real machine, because a machine has a finite number of registers. So how do we bridge the gap?
The goal of register allocation (an optimization) is to map an unbounded number of variables onto a few registers. This needs liveness analysis.

Example
Consider this TAC:
    a = 1
    b = a + 2
    c = b + 3
    return c
Three variables: a, b, and c. Assume that the target machine has only one register: r. Is it possible to put all three variables "a", "b" and "c" in register "r"?

Example
Calculate which variables are "live" at a given program point. Consider this TAC, annotated with the set of variables live after each statement:
    a = 1          {a}
    b = a + 2      {b}
    c = b + 3      {c}
    return c
The "liveness" information gives live ranges. The live ranges don't overlap, thus all three variables can be put into one register.

Example
Consider this TAC:
    a = 1          {a}
    b = a + 2      {b}
    c = b + 3      {c}
    return c
Register allocation: a => r, b => r, c => r
Code rewriting:
    r = 1
    r = r + 2
    r = r + 3
    return r

Data Flow Equations for Liveness
Inside basic blocks (backward), where in and out are the sets of variables live before and after statement n:
    in[n] = use[n] \/ (out[n] - def[n])
// Example:
    a = 1
    b = a + 2
    c = b + 3
    return c
// Example:
    a = 1
    b = a + 2
    c = b + 3
    return a + c
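The backward rule for a single basic block can be sketched in a few lines of Python. This is a minimal illustration, not the course's lab code: the statement encoding as (target, uses) pairs and the name live_after_each are assumptions made here.

```python
def live_after_each(stmts, live_out):
    """Walk a basic block backward applying in = use \\/ (out - def).

    stmts: list of (target, uses); target is None for statements such as
    'return' that define nothing (hypothetical encoding).
    Returns (live set at block entry, live set after each statement).
    """
    live = set(live_out)
    after = []
    for target, uses in reversed(stmts):
        after.append(set(live))                 # live after this statement
        kill = {target} if target is not None else set()
        live = set(uses) | (live - kill)        # live before this statement
    after.reverse()
    return live, after

# First example: a = 1; b = a + 2; c = b + 3; return c
example1 = [("a", set()), ("b", {"a"}), ("c", {"b"}), (None, {"c"})]
# Second example: same, but 'return a + c'
example2 = [("a", set()), ("b", {"a"}), ("c", {"b"}), (None, {"a", "c"})]

entry1, after1 = live_after_each(example1, set())
entry2, after2 = live_after_each(example2, set())
```

In the second example a stays live across the definitions of b and c, so the live ranges overlap and a single register is no longer enough.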

For general CFG
Equations (each node n carries use[n], def[n], in[n] and out[n]):
    in[n]  = use[n] \/ (out[n] - def[n])
    out[n] = \/ s ∈ succ[n] in[s]
Fixpoint algorithm:
    initialize all in and out sets to {}
    loop until no set changes

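Example
    in[n]  = use[n] \/ (out[n] - def[n])
    out[n] = \/ s ∈ succ[n] in[s]
One statement per node; node 5 branches back to node 2 or falls through to node 6.
    node  statement    def   use      final live_out
    1     a = 0        {a}   {}       {a,c}
    2     b = a + 1    {b}   {a}      {b,c}
    3     c = c + b    {c}   {b,c}    {b,c}
    4     a = b * 2    {a}   {b}      {a,c}
    5     a<N          {}    {a,N}    {a,c}
    6     return c     {}    {c}      {}
All in/out sets start as {}. Loop over the nodes in the order 1, 2, 3, 4, 5, 6 and repeat until no set changes (intermediate iterations: …). The set live on entry to node 1 is {c}. Note that N appears in use[5] but is left out of the live sets, i.e. it is treated as a constant.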
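The fixpoint iteration over the six-node example can be sketched as follows. This is an illustrative Python version under stated assumptions: the dictionaries succ, use and defs are a hand encoding of the example, the function name liveness is made up here, and N is treated as a constant (left out of use[5]) so that the result matches the live sets on the slide.

```python
def liveness(nodes, succ, use, defs):
    """Iterate the two liveness equations until no set changes."""
    live_in = {n: set() for n in nodes}
    live_out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            # out[n] = union of in[s] over all successors s
            new_out = set().union(*(live_in[s] for s in succ[n]))
            # in[n] = use[n] \/ (out[n] - def[n])
            new_in = use[n] | (new_out - defs[n])
            if new_in != live_in[n] or new_out != live_out[n]:
                live_in[n], live_out[n] = new_in, new_out
                changed = True
    return live_in, live_out

nodes = [1, 2, 3, 4, 5, 6]                      # visiting order from the slide
succ = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
use = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}
defs = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}

live_in, live_out = liveness(nodes, succ, use, defs)
```

Visiting the nodes in the order 6, 5, 4, 3, 2, 1 instead reaches the same fixpoint in fewer passes, since information flows backward.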

Interference Graph
    statement    final live_out
    a = 0        {a,c}
    b = a + 1    {b,c}
    c = c + b    {b,c}
    a = b * 2    {a,c}
    a<N          {a,c}
    return c     {}
For any two variables x and y, if they are live simultaneously, draw an (undirected) edge x - y. The graph has the nodes a, b and c; from the sets above, the edges are a - c and b - c.
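Building the graph from the live sets is mechanical. The sketch below follows the rule stated on the slide (an edge for every pair of variables that appear in the same live set); the function name interference_edges and the encoding of edges as sorted pairs are assumptions made for the illustration.

```python
from itertools import combinations

def interference_edges(live_sets):
    """Return undirected edges as sorted (x, y) pairs."""
    edges = set()
    for live in live_sets:
        # every pair of variables live at the same point interferes
        for x, y in combinations(sorted(live), 2):
            edges.add((x, y))
    return edges

final_live_out = [{"a", "c"}, {"b", "c"}, {"b", "c"},
                  {"a", "c"}, {"a", "c"}, set()]
edges = interference_edges(final_live_out)
```

Since a and b never share a live set, they have no edge and could share a register, while c needs one of its own.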

Speeding-up the analysis
Ordering the nodes for liveness analysis: reverse top-sort order (you do this in lab 5)
Once a variable
Careful selection of set representation
Careful data structure engineering, say: bit-vector
Basic block (you do this in lab 5)
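As an illustration of the bit-vector idea, the sketch below stores each set of variables in one Python integer, one bit per variable, so that union, difference and the liveness transfer function become single bitwise operations. The variable numbering and the helper names to_bits, to_names and transfer are assumptions for this example only.

```python
VARS = ["a", "b", "c"]                       # fixed numbering of variables
BIT = {v: 1 << i for i, v in enumerate(VARS)}

def to_bits(names):
    """Encode a set of variable names as an integer bit vector."""
    bits = 0
    for v in names:
        bits |= BIT[v]
    return bits

def to_names(bits):
    """Decode a bit vector back into a set of variable names."""
    return {v for v in VARS if bits & BIT[v]}

def transfer(use, defs, out):
    """in = use \\/ (out - def), expressed with bitwise OR / AND / NOT."""
    return use | (out & ~defs)

# Node 2 of the example (b = a + 1) with out = {b, c}
in_2 = transfer(to_bits({"a"}), to_bits({"b"}), to_bits({"b", "c"}))
```

A compiler written in C or Java would typically use fixed-width words or a bitset class in the same way.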

Basic Blocks
Step 1: calculate def and use for each basic block b (one-pass backward calculation)
Step 2: do liveness analysis on the blocks, just as discussed above
Step 3: calculate liveness information for each statement in each block (one-pass backward calculation)

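Example
    block 1: a = 0
    block 2: b = a + 1; c = c + b; a = b * 2; a<N
    block 3: return c

    block  def        use
    1      {a}        {}
    2      {a,b,c}    {a,c}
    3      {}         {c}
The set use[2] does NOT contain variable "b". Why?

Blocks are visited in reverse topo-sort order (3, 2, 1):
    block  initial  out      in       out (second pass)
    3      {}       {}       {c}
    2      {}       {c}      {a,c}    {a,c}
    1      {}       {a,c}    {c}
live_out for each block: block 1 {a,c}, block 2 {a,c}, block 3 {}.
Backward calculation of live_out for each statement in block 2:
    b = a + 1    {b,c}
    c = c + b    {b,c}
    a = b * 2    {a,c}
    a<N          {a,c}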
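Step 1 (def and use per block) can be sketched as one backward pass over the statements of the block. The encoding of statements as (target, uses) pairs and the name block_def_use are assumptions of this illustration, and N is again treated as a constant.

```python
def block_def_use(stmts):
    """Compute (def, use) of a basic block in one backward pass.

    use holds only variables read before any definition inside the block.
    """
    defs, use = set(), set()
    for target, uses in reversed(stmts):
        if target is not None:
            defs.add(target)
            use.discard(target)   # a later read is satisfied by this definition
        use |= uses               # reads of this statement happen before its write
    return defs, use

# Block 2: b = a + 1; c = c + b; a = b * 2; a<N
block2 = [("b", {"a"}), ("c", {"c", "b"}), ("a", {"b"}), (None, {"a"})]
defs2, use2 = block_def_use(block2)
```

The run shows why b is missing from use[2]: every read of b in the block comes after the definition b = a + 1, so the block never needs a value of b from outside.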

Reaching Definition

    a = 0
    b = a + 1
    c = c + b
    a = b * 2
    a<N
    return c
The problem: at any program point, we'd like to know where the value of a variable x is defined.
E.g., can we substitute the variable a with 0? If so, we are doing the so-called constant propagation optimization.

Implementation
Number each definition:
    5: a = 0
    6: b = a + 1
    7: c = c + b
    8: a = b * 2
       a<N
       return c
Here we number the four definitions with 5, 6, 7, 8. The numbers have no special meaning, except that (1) they are different from the block numbers, and (2) they are all unique.

Equations
    block 1:  5: a = 0
    block 2:  6: b = a + 1
              7: c = c + b
              8: a = b * 2
                 a<N
    block 3:     return c
Calculate def and kill for each block, based on the equations for a statement:
    def[d: x = …]  = {d}
    kill[d: x = …] = defs(x) - {d}
Result:
    def[1] = {5}        kill[1] = {8}
    def[2] = {6,7,8}    kill[2] = {5}
    def[3] = {}         kill[3] = {}
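Lifting the statement-level equations to a whole block can be sketched as below. This is an assumed encoding, not the course's code: each block is a list of (definition number, variable) pairs, all_defs maps every definition number in the program to its variable, and block_def_kill is a name chosen here.

```python
def block_def_kill(block, all_defs):
    """Combine def[d] = {d} and kill[d] = defs(x) - {d} over one block."""
    gen, kill = set(), set()
    for d, var in block:
        # every other definition of the same variable is killed by d
        others = {e for e, v in all_defs.items() if v == var and e != d}
        gen = (gen - others) | {d}
        kill |= others
    return gen, kill

all_defs = {5: "a", 6: "b", 7: "c", 8: "a"}
blocks = {1: [(5, "a")], 2: [(6, "b"), (7, "c"), (8, "a")], 3: []}

def_sets = {b: block_def_kill(stmts, all_defs)[0] for b, stmts in blocks.items()}
kill_sets = {b: block_def_kill(stmts, all_defs)[1] for b, stmts in blocks.items()}
```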

Data Flow Equation
Forward calculation:
    in[b]  = \/ q ∈ pred(b) out[q]
    out[b] = def[b] \/ (in[b] - kill[b])

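Fixpoint algorithm
    block 1:  5: a = 0
    block 2:  6: b = a + 1
              7: c = c + b
              8: a = b * 2
                 a<N
    block 3:     return c

    in[b]  = \/ q ∈ pred(b) out[q]
    out[b] = def[b] \/ (in[b] - kill[b])

    block  def        kill
    1      {5}        {8}
    2      {6,7,8}    {5}
    3      {}         {}

    block  initial  in         out        in (second pass)  out (second pass)
    1      {}       {}         {5}
    2      {}       {5}        {6,7,8}    {5,6,7,8}         {6,7,8}
    3      {}       {6,7,8}    {6,7,8}
Final in sets: block 1 {}, block 2 {5,6,7,8}, block 3 {6,7,8}.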
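The forward fixpoint computation can be sketched in Python as follows. The dictionaries pred, def_sets and kill_sets are a hand encoding of the three-block example, and the function name reaching_definitions is chosen for this illustration.

```python
def reaching_definitions(blocks, pred, def_sets, kill_sets):
    """Iterate the forward equations until no set changes."""
    rd_in = {b: set() for b in blocks}
    rd_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # in[b] = union of out[q] over all predecessors q
            new_in = set().union(*(rd_out[q] for q in pred[b]))
            # out[b] = def[b] \/ (in[b] - kill[b])
            new_out = def_sets[b] | (new_in - kill_sets[b])
            if (new_in, new_out) != (rd_in[b], rd_out[b]):
                rd_in[b], rd_out[b] = new_in, new_out
                changed = True
    return rd_in, rd_out

blocks = [1, 2, 3]
pred = {1: [], 2: [1, 2], 3: [2]}            # block 2 loops back to itself
def_sets = {1: {5}, 2: {6, 7, 8}, 3: set()}
kill_sets = {1: {8}, 2: {5}, 3: set()}

rd_in, rd_out = reaching_definitions(blocks, pred, def_sets, kill_sets)
```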

Constant Propagation
    block 1:  5: a = 0           in = {}
    block 2:  6: b = a + 1       in = {5,6,7,8}
              7: c = c + b
              8: a = b * 2
                 a<N
    block 3:     return c        in = {6,7,8}
Can we substitute the variable a in "b = a + 1" (entry of block 2) with the constant "0"?
No! Because there are two definitions of "a" which may reach this point: 5 and 8.
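The check made on this slide can be written down as a small function over the reaching-definition sets. It is only a sketch under assumptions: def_var and def_const are hypothetical tables mapping a definition number to its variable and, when the right-hand side is a literal, to its constant value.

```python
def constant_for(var, reaching, def_var, def_const):
    """Return the constant to substitute for var, or None if unsafe.

    Substitution is safe only if every reaching definition of var
    assigns the same known constant.
    """
    defs_of_var = [d for d in reaching if def_var[d] == var]
    values = {def_const.get(d) for d in defs_of_var}   # None = not a constant
    if defs_of_var and len(values) == 1 and None not in values:
        return values.pop()
    return None

def_var = {5: "a", 6: "b", 7: "c", 8: "a"}
def_const = {5: 0}                    # only '5: a = 0' has a literal right-hand side

at_block2 = constant_for("a", {5, 6, 7, 8}, def_var, def_const)
straight_line = constant_for("a", {5}, def_var, def_const)
```

With both 5 and 8 reaching, the function refuses the substitution; if only definition 5 reached the use, it would return 0.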

Available Expressions

    a = 0
    b = a + 1
    c = c + b
    a = a + 1
    a<N
    return c
The problem: at a given program point, we'd like to know whether or not the value of an expression e has been calculated and is still available. Two conditions must hold:
    1. The expression e must be calculated on every path to the point, and
    2. the variables used in e must not have been redefined after that calculation.
E.g., has the right-side expression "a+1" in "a = a + 1" already been calculated, and is it thus available here? If so, the second calculation can be avoided!

Implementation
    block 1:  a = 0
    block 2:  b = a + 1
              c = c + b
              a = a + 1
              a<N
    block 3:  return c
Calculate gen and kill for each block, based on the equations for a statement (Tiger table 17.4).
All possible expressions: ALL = {a+1, c+b}
    gen[1] = {}    kill[1] = {a+1}
    gen[2] = {}    kill[2] = ALL
    gen[3] = {}    kill[3] = {}

Implementation
    block 1:  a = 0
    block 2:  b = a + 1
              c = c + b
              a = a + 1
              a<N
    block 3:  return c
Calculate in/out for each block, based on the fixpoint algorithm.
All possible expressions: ALL = {a+1, c+b}
    gen[1] = {}    kill[1] = {a+1}
    gen[2] = {}    kill[2] = ALL
    gen[3] = {}    kill[3] = {}

    block  in/out (initial -> after iteration)
    1      in = {};  ALL -> {}
    2      ALL -> {}
    3      ALL -> {}
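Unlike liveness and reaching definitions, available expressions is a "must" analysis: sets are combined with intersection and start from ALL rather than {}. The sketch below assumes the standard formulation (in of the entry block is {}, in[b] is the intersection of out[q] over predecessors, out[b] = gen[b] \/ (in[b] - kill[b])); the slide itself only shows the resulting table, and the function and parameter names are made up for this example.

```python
def available_expressions(blocks, entry, pred, gen, kill, all_exprs):
    """Forward must-analysis: meet is set intersection, start from ALL."""
    av_in = {b: set(all_exprs) for b in blocks}
    av_out = {b: set(all_exprs) for b in blocks}
    av_in[entry] = set()                     # nothing is available on entry
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                new_in = set()
            else:
                new_in = set(all_exprs)
                for q in pred[b]:
                    new_in &= av_out[q]      # must hold on every incoming path
            new_out = gen[b] | (new_in - kill[b])
            if (new_in, new_out) != (av_in[b], av_out[b]):
                av_in[b], av_out[b] = new_in, new_out
                changed = True
    return av_in, av_out

ALL = {"a+1", "c+b"}
blocks = [1, 2, 3]
pred = {1: [], 2: [1, 2], 3: [2]}
gen = {1: set(), 2: set(), 3: set()}
kill = {1: {"a+1"}, 2: set(ALL), 3: set()}

av_in, av_out = available_expressions(blocks, 1, pred, gen, kill, ALL)
```

At block level nothing is available anywhere in this program, because block 2 kills both of the expressions it computes.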

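Implementation
Calculate in/out for each statement, based on the in/out for each block.
All possible expressions: ALL = {a+1, c+b}
    block  in/out (initial -> after iteration)
    1      in = {};  ALL -> {}
    2      ALL -> {}
    3      ALL -> {}
Available expressions inside block 2 (block in = {}):
                 {}         on entry to the block
    b = a + 1    {a+1}
    c = c + b    {a+1}
    a = a + 1    {}
    a<N          {}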

Common Sub-expression Elimination (CSE)
    a = 0
    b = a + 1      {a+1}
    c = c + b      {a+1}
    a = a + 1      {}
    a<N
    return c
E.g., has the right-side expression "a+1" in "a = a + 1" already been calculated, and is it thus available here?
After the available expression analysis, we know that "a+1" is available just before "a = a + 1", so the second calculation can be omitted!
But which variable should the expression "a+1" be replaced with? We need to do reaching expression analysis... (Read the text and do homework!)
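Detecting the redundant calculation inside a block can be sketched as a forward walk that keeps the set of available expressions up to date. The statement encoding as (target, expression) pairs, the EXPR_VARS table and the name find_redundant are assumptions of this illustration; it only reports which statements recompute an available expression and does not decide which variable to reuse.

```python
# Variables read by each expression (hypothetical table for the example)
EXPR_VARS = {"a+1": {"a"}, "c+b": {"c", "b"}}

def find_redundant(stmts, avail_in):
    """Return (indices of redundant statements, available set after each)."""
    avail = set(avail_in)
    redundant, after = [], []
    for i, (target, expr) in enumerate(stmts):
        if expr is not None:
            if expr in avail:
                redundant.append(i)          # already computed and still valid
            avail.add(expr)
        if target is not None:
            # assigning to target invalidates every expression that reads it
            avail = {e for e in avail if target not in EXPR_VARS[e]}
        after.append(set(avail))
    return redundant, after

# Block 2: b = a + 1; c = c + b; a = a + 1; a<N
block2 = [("b", "a+1"), ("c", "c+b"), ("a", "a+1"), (None, None)]
redundant, after = find_redundant(block2, set())
```

Statement index 2 (a = a + 1) is flagged because "a+1" is still available when it is reached.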