Data-Flow Analysis. Approaches Static Analysis Inspections Dependence analysis Symbolic execution Software Verification Data flow analysis Concurrency.

Slides:



Advertisements
Similar presentations
DATAFLOW TESTING DONE BY A.PRIYA, 08CSEE17, II- M.s.c [C.S].
Advertisements

Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
A System to Generate Test Data and Symbolically Execute Programs Lori A. Clarke September 1976.
Lecture 11: Code Optimization CS 540 George Mason University.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
Chair of Software Engineering From Program slicing to Abstract Interpretation Dr. Manuel Oriol.
1 Introduction to Data Flow Analysis. 2 Data Flow Analysis Construct representations for the structure of flow-of-data of programs based on the structure.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Dataflow Analysis Introduction Guo, Yao Part of the slides are adapted from.
A Deeper Look at Data-flow Analysis Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University.
Data Flow Testing (DFT) Data flow testing is NOT the same as constructing Design Diagrams in the form of data-flow-diagrams (DFD) or E-R diagrams. It is.
The Application of Graph Criteria: Source Code  It is usually defined with the control flow graph (CFG)  Node coverage is used to execute every statement.
1 Data flow analysis Goal : collect information about how a procedure manipulates its data This information is used in various optimizations For example,
1 Data flow analysis Goal : –collect information about how a procedure manipulates its data This information is used in various optimizations –For example,
CS 536 Spring Global Optimizations Lecture 23.
Software Testing Sudipto Ghosh CS 406 Fall 99 November 16, 1999.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
Data Flow Analysis Compiler Design Nov. 3, 2005.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
Lecture #17, June 5, 2007 Static Single Assignment phi nodes Dominators Dominance Frontiers Dominance Frontiers when inserting phi nodes.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
Software Testing and Quality Assurance
Finite State Verification for Software Systems Lori A. Clarke University of Massachusetts Laboratory for Advanced Software Engineering Research
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
Data Flow Analysis Compiler Design Nov. 8, 2005.
Ben Livshits Based in part of Stanford class slides from
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Handouts Software Testing and Quality Assurance Theory and Practice Chapter 5 Data Flow Testing
1 CS 201 Compiler Construction Data Flow Analysis.
1 ECE 453 – CS 447 – SE 465 Software Testing & Quality Assurance Instructor Kostas Kontogiannis.
Topics in Software Dynamic White-box Testing Part 2: Data-flow Testing
Code Optimization, Part III Global Methods Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Data Flow Analysis Compiler Baojian Hua
Software testing techniques Testing criteria based on data flow
Presented By Dr. Shazzad Hosain Asst. Prof., EECS, NSU
Finite-State Verification. A quick look at three approaches to FSV Model Checking Flow Equations Data Flow Analysis FLAVERS.
Software (Program) Analysis. Automated Static Analysis Static analyzers are software tools for source text processing They parse the program text and.
Formal Methods Program Slicing & Dataflow Analysis February 2015.
Dataflow Analysis Topic today Data flow analysis: Section 3 of Representation and Analysis Paper (Section 3) NOTE we finished through slide 30 on Friday.
MIT Introduction to Program Analysis and Optimization Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.
Λλ Fernando Magno Quintão Pereira P ROGRAMMING L ANGUAGES L ABORATORY Universidade Federal de Minas Gerais - Department of Computer Science P ROGRAM A.
Jeffrey D. Ullman Stanford University. 2 boolean x = true; while (x) {... // no change to x }  Doesn’t terminate.  Proof: only assignment to x is at.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
1 Data Flow Analysis Data flow analysis is used to collect information about the flow of data values across basic blocks. Dominator analysis collected.
Control Flow Graphs : The if Statement 1 if (x < y) { y = 0; x = x + 1; } else { x = y; } x >= yx < y x = y y = 0 x = x + 1 if (x < y) { y = 0;
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
1 Liveness analysis and Register Allocation Cheng-Chia Chen.
Optimization Simone Campanoni
Dataflow Analysis CS What I s Dataflow Analysis? Static analysis reasoning about flow of data in program Different kinds of data: constants, variables,
Software Testing and QA Theory and Practice (Chapter 5: Data Flow Testing) © Naik & Tripathy 1 Software Testing and Quality Assurance Theory and Practice.
COMPILERS Liveness Analysis hussein suleman uct csc3003s 2009.
Data Flow Analysis Suman Jana
Dataflow Testing G. Rothermel.
Machine-Independent Optimization
Topic 10: Dataflow Analysis
University Of Virginia
1. Reaching Definitions Definition d of variable v: a statement d that assigns a value to v. Use of variable v: reference to value of v in an expression.
Flow Analysis Data-flow analysis, Control-flow analysis, Abstract interpretation, AAM.
Sudipto Ghosh CS 406 Fall 99 November 16, 1999
Data Flow Analysis Compiler Design
COMPILERS Liveness Analysis
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Software Testing and QA Theory and Practice (Chapter 5: Data Flow Testing) © Naik & Tripathy 1 Software Testing and Quality Assurance Theory and Practice.
Presentation transcript:

Data-Flow Analysis

Approaches Static Analysis Inspections Dependence analysis Symbolic execution Software Verification Data flow analysis Concurrency analysis Interprocedural analysis Dynamic Analysis Assertions Error seeding, mutation testing Coverage criteria Fault-based testing Object oriented testing Regression testing

Data Flow Analysis (DFA) Efficient technique for proving properties about programs Not as powerful as automated theorem provers, but requires less human expertise Uses an annotated control flow graph model of the program Compute facts for each node Use the flow in the graph to compute facts about the whole program We’ll focus on single units

Some examples of DFA techniques DFA used extensively in program optimization e.g., determine if a definition is dead (and can be removed) determine if a variable always has a constant value determine if an assertion is always true and can be removed DFA can also be used to find anomalies in the code Find def/ref anomalies [Osterweil and Fosdick] Cecil/Cesar system demonstrated the ability to prove general user- specified properties [Olender and Osterweil] FLAVERS demonstrated applicability to concurrent system [Dwyer and Clarke] Why “anomalies” and not faults? May not correspond to an actual executable failure

Data flow analysis First,determine local information that is true at each node in the CFG e.g., What variables are defined What variables are referenced Usually stored in sets e.g.,ref(n) is the set of variables referenced at node n Second, use this local information and control flow information to compute global information about the whole program Done incrementally by looking at each node’s successors or predecessors Use a fixed point algorithm-continue to update global information until a fixed point is reached

Reaching Definitions Definition reaches a node if there is a def clear path from the definition to that node Definition of x at node 1 reaches nodes 2,3,4,5 but not 6 :=x x:= Could be used to determine data dependencies, useful for debugging, data flow testing, etc. Start from a definition, and move forward on the graph to see how far it reaches

Computing global information- reaching definitions reaching_def= {x i,y j } 3 21 reaching _def={x i } reaching _def={y j } Definitions that might reach a node reaching_def ={y j } 3 21 reaching_ def={x i,y j } reaching_ def={y j } Definitions that must reach a node Start from a definition, and move forward on the graph to see how far it reaches

Reaching Definitions X i means that the definition of variable x at node i {might|must} reach the current node Also stated as {possible|definite} {some|all} {any|all} {may|must}

Computing values for a node y:= …x In(n) Out(n) Forward flow Keep track of the definitions into a node that have not been redefined Flow out of a node depends on flow in and on what happens in the node def(n)={y} ref(n)={x…}

Example: Possible Reaching Definitions X1X1 Y2Y2 X4X4 {X 1 } {X 1,Y 2 } {X 1,Y 2,X 4 } { } {X 1,Y 2 } {X 1 } {X 1,Y 2 } {X 4,Y 2 } {X 1,Y 2,X 4 } int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + y; end if;... x = foo() y = x + 2 if(x > 0) x = x + y Forward flow, any path problem, def/ref sets are the initial facts associated with each node def={x} ref={ } def={y} ref={x} def={} ref={x} def={x} ref={x,y} def={} ref={ }

Example: Definite Reaching Definitions X1X1 Y2Y2 X4X4 {X 1 } {X 1,Y 2 } {Y 2 } { } {X 1,Y 2 } {X 1 } {X 1,Y 2 } {X 4,Y 2 } {Y 2 } int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + y; end if;... x = foo() y = x + 2 if(x > 0) x = x + y Forward flow, all path problem, def/ref sets are the initial facts associated with each node

Example: Definite Reaching Definitions X1X1 Y2Y2 X4X4 {X 1 } {X 1,Y 2 } {Y 2 } { } {X 1,Y 2 } {X 1 } {X 1,Y 2 } {X 4,Y 2 } {Y 2 } int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + y; end if;... x = foo() y = x + 2 if(x > 0) x = x + y Forward flow, all path problem, def/ref sets are the initial facts associated with each node What happens when we add a loop?

Example: Definite Reaching Definitions X1X1 Y2Y2 X4X4 {X 1 } {X 1,Y 2 } {Y 2 } { } {X 1,Y 2 } {X 1 } {X 1,Y 2 } {X 4,Y 2 } {Y 2 } int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + y; end if;... x = foo() y = x + 2 if(x > 0) x = x + y Forward flow, all path problem, def/ref sets are the initial facts associated with each node X X X X What happens when we add a loop?

Computing values for a node y:= …x In(n) Out(n) Forward flow Keep track of the definitions into a node that have not been redefined Flow out of a node depends on flow in and on what happens in the node For definite reaching defs: In(n) =  k є pred Out(k) Out(n) = In(n)-def(n) U def(n n )

Used to determine what variables need to be kept in a register after executing a node Live Variables a variable, x, is live at node p if there exists a def-clear path wrt x from node p to a use of x x is live at 2, 3, 4, but not at node 5 :=x x:= Start from a use, and move backward on the graph until a def is encountered

Computing global information- live variables live={x,y} 2 1 live={y} Possible live variables 43 live={x} 2 1 live={x, y} Definite live variables 43 live={x}

Live Variables: Computing values for a node y:= …x In(n) Out(n) Backward flow Keep track of the (forward) references that have not found their (backward) definition site Flow out of a node depends on flow in and on what happens in the node

{} {x} {x,y} {} Backward flow, any path problem, def/ref sets are the initial facts associated with each node Example: Possible Live Variables {x,y} {} {x,y} {x} { } int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + y; end if;... x = foo() y = x + 2 if(x > 0) x = x + y

{} {x} {} Example: Definite Live Variables {x,y} {} {x} { } int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + y; end if;... x = foo() y = x + 2 if(x > 0) x = x + y Backward flow, all path problem, def/ref sets are the initial facts associated with each node

{} {x} {} Example: Definite Live Variables {x,y} {} {x} { } int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + y; end if;... x = foo() y = x + 2 if(x > 0) x = x + y Backward flow, all path problem def/ref sets are the initial facts associated with each node In(n) =  k є succ(n) Out(k) Out(n) = In(n)-def(n) U ref(n)

Data Flow Analysis decisions 3 21 Union or Intersection 54 Backward/Forward Any/All Facts Equations Initial values

Constant Propagation Some variables at a point in a program may only take on one value If we know this, can optimize the code when it is compiled

Constant Propagation x = 3 y = x + 2 if(x > z) x = x + y int x,y;... x := 3; y := x + 2; if x > z then x := x + y; end if;...

Constant Propagation x = 3 y = x + 2 if(x >z) x = x + y U=unknown N= not a constant Forward flow, all paths problem Facts are the computations and assignments made at each node (U,U) (3,U) (3,5) (N,5) (8,5) (3,5) int x,y;... x := 3; y := x + 2; if x > z then x := x + y; end if;... (x,y) (3,U) (3,5)

Constant Propagation with a loop (U,U) (3,U) (3,5) (8,5) (N,5) x = 3 y = x + 2 if(x > z) x = x + y (3,U) (3,5)(N,5) U=unknown N= not a constant (3,5)(N,5) Forward flow, all paths problem Facts are the computations and assignments made at each node

Fixed point The data flow analysis algorithm will eventually terminate If there are only a finite number of possible sets that can be associated with a node If the function that determines the sets that can be associated with a node is monotonic

Constant Propagation with a loop (U,U) (3,U) (3,5) (8,5) (N,5) x = 3 y = x + 2 if(x > 0) x = x + y (3,U) (3,5)(N,5) U=unknown N= not a constant Forward flow, all paths problem Facts are the computations and assignments made at each node

DAVE: detects anomalous def/ref behavior L. D. Fosdick and Leon J. Osterweil, " Data Flow Analysis in Software Reliability", ACM Computing Surveys, September 1976, 8 (3), pp Application independent specification of erroneous behavior use of an uninitialized variable redefinition of a variable that is not referenced

Anomalous pairs of ref/def information d - defined, r - referenced, u - undefined d..r: defined variable reaches a reference u..d: undefined variable reaches a definition d..d: definition is redefined before being used d..u: definition is undefined before being used u..r: undefined variable reaches a reference Unreference definition undefined reference

Consider an unreferenced definition For a definition of a, want to know if that definition is not going to be referenced Is there some path where a is redefined or undefined before being used? May be indicative of a problem Usually just a programming convenience and not a problem For a definition of a, want to know if on all paths is a redefined or undefined before being used May be indicative of a problem Or could just be wasteful

Some versus All def a ref a def a For some path, a is redefined without a subsequent reference For all paths, a is redefined without a subsequent reference

Unreferenced definition: Computing values for a node y:= …x In(n) Out(n) Forward flow Keep track of the definitions into a node that have not been referenced Flow out of a node depends on flow in and on what happens in the node

Unreferenced definition: Computing values for a node y:= …x In(n) Out(n) Forward flow Keep track of the definitions into a node that have not been referenced Flow out of a node depends on flow in and on what happens in the node Forward flow, all paths problem In(n) =  k є pred(n) Out(k) Out(n) = In(n)-ref(n) U def(n)

Unreferenced definitions x = foo() y = x + 2 if(x > 0) x = x + 1 y:= z Forward flow, all paths problem { } {x} {y} {x,y} {y} int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + 1; end if; y:= … (unreferenced defs) {x} {y} Need to look at each node where there is a def

Continuing with the unreferenced def example x = foo() y = x + 2 if(x > 0) x = x +1 y:= … { } {x} {y} {x,y} {y} {x} {y} A definiton is redefined without being used if def(n)  In(n)-ref(n) Must compute this for each node For this example, the last node would report a redefinition of y on all paths Finds the location where the redef occurs def={x} ref={ } def={y} ref={x} def={} ref={x} def={x} ref={x} def={y} ref={ }

Unreferenced definitions, (finds the location of the first def of the pair) x = foo() y = x + 2 if(x > 0) x = x + 1 y:= … Backward flow, all paths problem {x,y} {y} {y} double def {y} int x,y;... x := foo(); y := x + 2; if x > 0 then x := x + 1; end if; y :=... (unreferenced defs) {y}

In values depend on direction y:= … In(n) Out(n) Forward flow Backward flow

Data Flow Analysis Problem Need to determine the information that should be computed at a node Need to determine how that information should flow from node to node Backward or Forward Union or Intersection Often there is more than one way to solve a problem Can often be solved forward or backward, but usually one way is easier than the other