Dataflow Testing G. Rothermel.


White Box Adequacy Criteria
- Statement coverage
- Decision coverage
- Condition coverage
- Path coverage
- Dataflow coverage

Comparing Criteria Analytically
Criterion A subsumes criterion B if, for any program P and test suite T for P, T being A-adequate for P implies that T is B-adequate for P.
(Figure: subsumption hierarchy over the path, decision, condition, and statement criteria.)
Can we find a criterion that is stronger than decision coverage but doesn't have the problems that path coverage has?

Dataflow Testing: Motivation
Suppose that a statement assigns a value, but the use of that value is never executed under test. To expose this, we need definition-use pairs (du-pairs): associations between definitions and uses of the same variable or memory location.
Example: a = c + 10 defines a, and d = a + y uses a on one path; on the other path, a is not used.

Dataflow Testing: Find the Du-Pairs Starting at Statement 1
PROGRAM GCD
begin
1   read(x)
2   read(y)
3   while (x <> y) do
4     if (x > y) then
5       x = x - y
      else
6       y = y - x
      endif
    endwhile
7   print x
end
(Figure: control flow graph of GCD from Entry to Exit, with T and F branches at the while and if predicates.)

Introduction
Data-flow analysis provides information for dataflow testing and other tasks by computing the flow of data to points in the program.
For structured programs, data-flow analysis can be performed on an abstract syntax tree; in general, intraprocedural data-flow analysis is performed on the control flow graph.

Introduction
Compute the flow of data to points in the program, e.g.:
- Where does the assignment to I in statement 1 reach?
- Where does the assignment computed in statement 2 reach?
- Which uses of variable J are reachable from the end of B1?
- Is the value of variable I used after statement 3?
Interesting points are before and after basic blocks or statements.
Example CFG (Entry, B1 to B4, Exit):
B1: 1. I := 2
    2. J := I + 1
B2: 3. I := 1
B3: 4. J := J + 1
B4: 5. J := J - 4

Data-flow Problems (Reaching Definitions)
- A definition of a variable or memory location is a point or statement where that variable gets a value, e.g., a read or assignment statement.
- A use of a variable or memory location is a point or statement where that variable's value is fetched and used in a computation.
- A definition of V reaches a point p if there exists a control-flow path in the CFG from the definition to p with no other definitions of V on the path (called a definition-clear path).
- Such a path may exist in the graph but may not be executable (i.e., there may be no input to the program that will cause it to be executed); such a path is infeasible.

Data-flow Problems (Reaching Definitions)
Where are the definitions in the program?
- Of variable I: 1, 3
- Of variable J: 2, 4, 5
Which basic blocks (before block) do these definitions reach?
- Def 1 reaches B2
- Def 2 reaches B1, B2, B3
- Def 3 reaches B1, B3, B4
- Def 4 reaches B4
- Def 5 reaches Exit

Data-flow Problems (Reaching Definitions)
Where are the definitions in the program?
- Of variable I: 1, 3
- Of variable J: 2, 4, 5
Which uses do these definitions reach? (B3:4 denotes the use in statement 4 of B3.)
- Def 1 reaches B2
- Def 2 reaches B1, B2, B3:4
- Def 3 reaches B1, B3, B4
- Def 4 reaches B4:5
- Def 5 reaches Exit

Reaching Definitions Algorithm
How can we compute this information? What would be a naïve way?

Reaching Definitions Algorithm
Method: compute two kinds of local information (i.e., within a basic block):
- GEN[B] is the set of definitions that are created (generated) within B.
- KILL[B] is the set of definitions that, if they reach the point before B (i.e., the beginning of B), won't reach the end of B.
Notes: Describe the sets for reaching definitions. Ask what GEN and KILL are for blocks 1-4. Then ask how GEN and KILL could be computed, given the CFG for the program. (GEN can be obtained with one pass over the program; GEN is needed to get KILL.)
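The local sets can be derived mechanically from a list of definitions. The following is a minimal sketch, not the lecture's own code: the `DEFS` table and the `gen_kill` helper are hypothetical names, and the assignment of statements 1-5 to blocks B1-B4 is assumed from the running example. The sketch also assumes that KILL[B] holds every definition of a variable that B assigns, B's own included, which appears to be the convention in the worked example; OUT is unaffected by that choice because GEN[B] is unioned back in.

```python
# Assumed running example: definition number -> (block, variable defined).
DEFS = {
    1: ("B1", "I"),
    2: ("B1", "J"),
    3: ("B2", "I"),
    4: ("B3", "J"),
    5: ("B4", "J"),
}


def gen_kill(defs):
    """Return (GEN, KILL): dicts mapping block name -> set of definition numbers.

    Assumes definition numbers increase in program order within a block.
    """
    blocks = sorted({blk for blk, _ in defs.values()})
    gen, kill = {}, {}
    for b in blocks:
        # One pass over the block: the last definition of each variable
        # is the one that survives to the end of the block.
        last = {}
        for d in sorted(defs):
            blk, var = defs[d]
            if blk == b:
                last[var] = d
        gen[b] = set(last.values())
        # KILL needs to know which variables the block defines, so it is
        # computed after GEN: every definition of those variables.
        kill[b] = {d for d, (_, var) in defs.items() if var in last}
    return gen, kill


gen, kill = gen_kill(DEFS)
for b in sorted(gen):
    print(b, "GEN", sorted(gen[b]), "KILL", sorted(kill[b]))
```

Computing KILL after GEN mirrors the note above: the set of variables a block defines falls out of the GEN pass.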

Reaching Definitions Algorithm
Method (cont'd): Now what can we do with these sets to get the reaching definitions? Discuss intuitive methods.

Reaching Definitions Algorithm
Method (cont'd): compute two other sets by propagation:
- IN[B] is the set of definitions that reach the beginning of B.
- OUT[B] is the set of definitions that reach the end of B.
How can we initialize these sets?

Reaching Definitions Algorithm
Method (cont'd): Now what?

Reaching Definitions Algorithm
Method (cont'd): propagation method:
- Initialize the IN[B] and OUT[B] sets for all B.
- Iterate over all B until there are no changes to the IN[B] and OUT[B] sets.
- On each iteration, visit all B, and compute IN[B] and OUT[B] as
    IN[B] = union of OUT[P], for each P that is a predecessor of B
    OUT[B] = GEN[B] union (IN[B] - KILL[B])

Reaching Definitions Algorithm
algorithm ReachingDefinitions
Input: CFG with GEN[B], KILL[B] for all B
Output: IN[B], OUT[B] for all B
begin ReachingDefinitions
  IN[B] = empty; OUT[B] = GEN[B], for all B
  change = true
  while change do
    change = false
    foreach B do
      IN[B] = union of OUT[P], for each P that is a predecessor of B
      oldout = OUT[B]
      OUT[B] = GEN[B] union (IN[B] - KILL[B])
      if OUT[B] != oldout then change = true
    endfor
  endwhile
end ReachingDefinitions
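The pseudocode translates almost line for line into Python sets. This is a sketch under stated assumptions rather than a reference implementation: the function name and the `PREDS` map are hypothetical, and the CFG edges (B1 to B2, B2 back to B1, B2 to B3, B3 to B4) are inferred from the reaching-definition answers for the running example, not read off the original figure. Entry is omitted because it contributes no definitions.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Iterate IN/OUT to a fixed point; returns (IN, OUT) as dicts of sets."""
    in_sets = {b: set() for b in blocks}          # IN[B] = empty
    out_sets = {b: set(gen[b]) for b in blocks}   # OUT[B] = GEN[B]
    change = True
    while change:
        change = False
        for b in blocks:
            # IN[B] = union of OUT[P] over all predecessors P of B.
            in_sets[b] = set().union(*(out_sets[p] for p in preds[b]))
            # OUT[B] = GEN[B] union (IN[B] - KILL[B]).
            new_out = gen[b] | (in_sets[b] - kill[b])
            if new_out != out_sets[b]:
                out_sets[b] = new_out
                change = True
    return in_sets, out_sets


# Assumed example data (edges inferred, see the note above).
BLOCKS = ["B1", "B2", "B3", "B4"]
PREDS = {"B1": ["B2"], "B2": ["B1"], "B3": ["B2"], "B4": ["B3"]}
GEN = {"B1": {1, 2}, "B2": {3}, "B3": {4}, "B4": {5}}
KILL = {"B1": {1, 2, 3, 4, 5}, "B2": {1, 3}, "B3": {2, 4, 5}, "B4": {2, 4, 5}}

IN, OUT = reaching_definitions(BLOCKS, PREDS, GEN, KILL)
for b in BLOCKS:
    print(b, "IN", sorted(IN[b]), "OUT", sorted(OUT[b]))
```

Under these assumed edges the result agrees with the answers given earlier: definition 1 reaches only B2, definition 2 reaches B1, B2, and B3, and so on. As in the pseudocode, only changes to OUT trigger another pass, since IN is recomputed from OUT each time.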

Reaching Definitions Algorithm
Data-flow for example (set approach). All entries are sets; sets in red on the slide indicate changes from the last iteration, thus requiring another iteration of the algorithm.

Block  GEN  KILL       Init IN  Init OUT  Iter1 IN  Iter1 OUT  Iter2 IN  Iter2 OUT
B1     1,2  1,2,3,4,5  --       1,2       3         1,2        2,3       1,2
B2     3    1,3        --       3         1,2       2,3        1,2       2,3
B3     4    2,4,5      --       4         2,3       3,4        2,3       3,4
B4     5    2,4,5      --       5         3,4       3,5        3,4       3,5

Reaching Definitions Algorithm
Data-flow for example (bit-vector approach). Bit k from the left stands for definition k.

Block  GEN    KILL   Init IN  Init OUT  Iter1 IN  Iter1 OUT
B1     11000  11111  00000    11000     00100     11000
B2     00100  10100  00000    00100     11000     01100
B3     00010  01011  00000    00010     01100     00110
B4     00001  01011  00000    00001     00110     00101
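With bit vectors, union becomes bitwise OR and set difference becomes AND with the complement. The sketch below illustrates this with plain Python integers; the helper names `bits` and `show`, the predecessor map, and the CFG edges are assumptions made for illustration (the edges are inferred from the example's answers). The leftmost bit is taken to represent definition 1, so that the printed strings line up with five-digit vectors such as 11000.

```python
N_DEFS = 5  # definitions 1..5 in the running example


def bits(defs):
    """Encode a set of definition numbers; definition 1 is the leftmost bit."""
    v = 0
    for d in defs:
        v |= 1 << (N_DEFS - d)
    return v


def show(v):
    """Render a bit vector as a fixed-width binary string."""
    return format(v, f"0{N_DEFS}b")


def reaching_definitions_bits(blocks, preds, gen, kill):
    in_v = {b: 0 for b in blocks}
    out_v = dict(gen)  # OUT[B] = GEN[B]
    change = True
    while change:
        change = False
        for b in blocks:
            acc = 0
            for p in preds[b]:
                acc |= out_v[p]                   # union is bitwise OR
            in_v[b] = acc
            new_out = gen[b] | (acc & ~kill[b])   # difference is AND NOT
            if new_out != out_v[b]:
                out_v[b] = new_out
                change = True
    return in_v, out_v


# Assumed example data (edges inferred, not taken from the original figure).
BLOCKS = ["B1", "B2", "B3", "B4"]
PREDS = {"B1": ["B2"], "B2": ["B1"], "B3": ["B2"], "B4": ["B3"]}
GEN = {"B1": bits({1, 2}), "B2": bits({3}), "B3": bits({4}), "B4": bits({5})}
KILL = {"B1": bits({1, 2, 3, 4, 5}), "B2": bits({1, 3}),
        "B3": bits({2, 4, 5}), "B4": bits({2, 4, 5})}

in_v, out_v = reaching_definitions_bits(BLOCKS, PREDS, GEN, KILL)
for b in BLOCKS:
    print(b, "IN", show(in_v[b]), "OUT", show(out_v[b]))
```

The appeal of this representation is that each transfer function costs a handful of word operations regardless of how many definitions are in the set, as long as the vector fits in a machine word.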

Conservatism and Approximation
Computing exact solutions to most dataflow problems is undecidable, so we compute approximations.
- Approximate analysis can overestimate the solution: the solution contains the actual information plus some spurious information, but does not omit information. This type of information is safe, or conservative.
- Approximate analysis can underestimate the solution: the solution may not contain all actual information. This type of information is unsafe.
For optimization, we need conservative, safe analysis. For software engineering tasks, we may be able to use unsafe analysis information.

Definition-Use Pairs A definition-use pair (DU-pair) consists of a definition D of variable v and a use U of v that D reaches.

Definition-Use Pairs
Example CFG (entry, B1 to B6, exit):
B1: Z > 1
B2: X = 1; Z > 2
B3: Y = X + 1
B4: X = 2
B5: Z = X - 3; X = 4
B6: Z = X + 7
DU-pairs for (2:X): {(2:X,3:X),(2:X,5:X)}
DU-pairs for (4:X): {(4:X,5:X)}
DU-pairs for (5:X): {(5:X,6:X)}
DU-pairs for (3:Y): {}
DU-pairs for (5:Z):
DU-pairs for (6:Z):
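DU-pairs follow directly from reaching definitions: a pair exists wherever a definition of v is in the IN set of a statement that uses v. As a sketch, the code below applies this to the GCD program from the earlier slide, working at statement level because that program's control flow is fully determined by its text. The `SUCC`, `DEF`, and `USE` tables and the `du_pairs` function are hypothetical names introduced for this illustration.

```python
# GCD program, one CFG node per numbered statement.
SUCC = {1: [2], 2: [3], 3: [4, 7], 4: [5, 6], 5: [3], 6: [3], 7: []}
DEF = {1: "x", 2: "y", 5: "x", 6: "y"}            # statement -> variable defined
USE = {3: {"x", "y"}, 4: {"x", "y"}, 5: {"x", "y"},
       6: {"x", "y"}, 7: {"x"}}                   # statement -> variables used


def du_pairs(succ, defs, uses):
    """Return sorted (def_stmt, variable, use_stmt) triples."""
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    # Reaching definitions over (statement, variable) facts.
    ins = {n: set() for n in succ}
    out = {n: ({(n, defs[n])} if n in defs else set()) for n in succ}
    change = True
    while change:
        change = False
        for n in succ:
            ins[n] = set().union(*(out[p] for p in preds[n]))
            # A statement kills every other definition of its variable.
            survivors = {(d, v) for d, v in ins[n] if v != defs.get(n)}
            new_out = survivors | ({(n, defs[n])} if n in defs else set())
            if new_out != out[n]:
                out[n] = new_out
                change = True

    # A use at u reads whatever definitions of that variable reach u's entry.
    return sorted((d, v, u) for u in uses for d, v in ins[u] if v in uses[u])


pairs = du_pairs(SUCC, DEF, USE)
for d, v, u in pairs:
    print(f"({d}:{v}, {u}:{v})")
```

Because of the loop, every definition of x or y in GCD reaches every use of the same variable, including the use in the defining statement itself on a later iteration, such as (5:x, 5:x).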

Data Dependence Graph
A data dependence graph has nodes for every basic block and edges representing the flow of data between nodes.
Different types of data dependence:
- Flow: def to use
- Anti: use to def
- Output: def to def
(Figure: control flow graph of the example, blocks B1 to B6.)
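The three dependence types can be illustrated on a straight-line sequence of statements. The sketch below is a simplification: it works per statement rather than per basic block, ignores branching, and assumes a path through B2, B5, and B6 of the example. The `STATEMENTS` table and the `dependences` function are hypothetical names.

```python
# (text, variables defined, variables used), in execution order.
STATEMENTS = [
    ("X = 1",     {"X"}, set()),
    ("Z > 2",     set(), {"Z"}),
    ("Z = X - 3", {"Z"}, {"X"}),
    ("X = 4",     {"X"}, set()),
    ("Z = X + 7", {"Z"}, {"X"}),
]


def dependences(statements):
    """Return (flow, anti, output) lists of (from_stmt, to_stmt, variable)."""
    flow, anti, output = [], [], []
    for i, (_, defs_i, uses_i) in enumerate(statements):
        for var in defs_i | uses_i:
            for j in range(i + 1, len(statements)):
                _, defs_j, uses_j = statements[j]
                if var in defs_i and var in uses_j:
                    flow.append((i + 1, j + 1, var))    # def to use
                if var in uses_i and var in defs_j:
                    anti.append((i + 1, j + 1, var))    # use to def
                if var in defs_i and var in defs_j:
                    output.append((i + 1, j + 1, var))  # def to def
                if var in defs_j:
                    break  # var is redefined; later statements see the new value
    return sorted(flow), sorted(anti), sorted(output)


flow, anti, output = dependences(STATEMENTS)
print("flow  ", flow)
print("anti  ", anti)
print("output", output)
```

Statements are numbered from 1 in sequence order here, so these numbers are positions in the assumed path, not block labels.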

Data Dependence Graph
(Figure: the control flow graph of the example, blocks B1 to B6, shown beside its data dependence graph over the same blocks.)

Data Flow Testing
Data flow testing involves covering du-pairs (or covering data dependence edges in a data dependence graph).
To render this stronger than branch coverage, we distinguish predicate uses (p-uses) from computation uses (c-uses), and say that to cover a du-pair ending in a p-use, you must exercise all outcomes of the predicate.
Having done that, which pairs do we need to cover?
- All-defs coverage: test each def to some use.
- All-uses coverage: test each def to each use by some path.
- All-paths coverage: test each def to each use by all acyclic paths.
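To see how a test suite is judged against these criteria, one can record which du-pairs each execution exercises. The sketch below does this for the GCD program, under several assumptions: inputs are positive integers, `trace_gcd` is a hand-instrumented stand-in for the program that returns the statement numbers it executes, the p-use refinement (exercising both predicate outcomes) is ignored, and the full pair set relies on the fact that in GCD every definition reaches every use of its variable. All names are hypothetical.

```python
DEF = {1: "x", 2: "y", 5: "x", 6: "y"}
USE = {3: {"x", "y"}, 4: {"x", "y"}, 5: {"x", "y"}, 6: {"x", "y"}, 7: {"x"}}
# In GCD every definition reaches every use of the same variable (18 pairs).
ALL_PAIRS = {(d, v, u) for d, v in DEF.items()
             for u, used in USE.items() if v in used}


def trace_gcd(x, y):
    """Run GCD on positive integers, returning the executed statement numbers."""
    trace = [1, 2]
    while True:
        trace.append(3)
        if x == y:
            break
        trace.append(4)
        if x > y:
            trace.append(5)
            x -= y
        else:
            trace.append(6)
            y -= x
    trace.append(7)
    return trace


def covered_pairs(trace):
    """Du-pairs exercised by one trace: each use pairs with the latest def."""
    last_def, covered = {}, set()
    for s in trace:
        for v in USE.get(s, ()):       # uses are read before the statement's def
            if v in last_def:
                covered.add((last_def[v], v, s))
        if s in DEF:
            last_def[DEF[s]] = s
    return covered


def suite_coverage(tests):
    covered = set()
    for x, y in tests:
        covered |= covered_pairs(trace_gcd(x, y))
    return covered


def all_defs_adequate(covered):
    return {d for d, _, _ in ALL_PAIRS} <= {d for d, _, _ in covered}


def all_uses_adequate(covered):
    return ALL_PAIRS <= covered


covered = suite_coverage([(4, 4), (2, 1), (1, 2)])
print(len(covered), "of", len(ALL_PAIRS), "du-pairs covered")
print("all-defs:", all_defs_adequate(covered))
print("all-uses:", all_uses_adequate(covered))
```

The three-input suite reaches every definition but misses pairs that need a second loop iteration, so it is all-defs adequate without being all-uses adequate. Adding inputs such as (3, 1), (1, 3), (3, 2), and (2, 3) covers the remaining pairs.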

Comparing Criteria Analytically
Criterion A subsumes criterion B if, for any program P and test suite T for P, T being A-adequate for P implies that T is B-adequate for P.
(Figure: subsumption hierarchy over the path, condition, all-uses, all-defs, decision, and statement criteria.)

Comparing Criteria Empirically (Hutchins et al., ICSE 94)

Strategy        Mean Cases  Faults Found
Random Testing  100         79.5%
Branch Testing  34          85.5%
All Uses        84          90.0%

Dataflow Testing: Find the Du-Pairs Starting at Statement 5
PROGRAM GCD
begin
1   read(x)
2   read(y)
3   while (x <> y) do
4     if (x > y) then
5       x = x - y
      else
6       y = y - x
      endif
    endwhile
7   print x
end
(Figure: control flow graph of GCD from Entry to Exit, with T and F branches at the while and if predicates.)