1 Cost Effective Dynamic Program Slicing Xiangyu Zhang Rajiv Gupta The University of Arizona.

Presentation transcript:

1 Cost Effective Dynamic Program Slicing Xiangyu Zhang Rajiv Gupta The University of Arizona

2 Program Slicing
Definition: the slice of v at S is the set of statements involved in computing v's value at S. [Mark Weiser, 1982]
A static slice is the set of statements that COULD influence the value of a variable for ANY input.
- Construct the static dependence graph
  - Control dependences
  - Data dependences
- Traverse the dependence graph to compute the slice
  - Transitive closure over control and data dependences
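A minimal sketch of the traversal step, assuming the static dependences of the 10-statement example program from slide 8 have already been worked out by hand. The dictionaries `DATA_DEPS` and `CONTROL_DEPS` and the function `static_slice` are illustrative names, not part of the authors' tooling.

```python
# Static dependences for the example program:
#  1: z=0   2: a=0   3: b=2   4: p=&b
#  5: for i = 1 to N   6: if (i%2 == 0)   7: p=&a
#  8: a=a+1   9: z=2*(*p)   10: print(z)
# The edge sets below are hand-derived assumptions, not analysis output.
DATA_DEPS = {
    5: {5},               # the loop header reads i, which it also updates
    6: {5},               # i%2 reads i
    8: {2, 8},            # a=a+1 reads a
    9: {2, 3, 4, 7, 8},   # reads p (defs 4, 7) and *p, which may be a or b
    10: {1, 9},           # print(z) reads z
}
CONTROL_DEPS = {6: {5}, 7: {6}, 8: {5}, 9: {5}}


def static_slice(criterion):
    """Backward transitive closure over data and control dependences."""
    in_slice = set()
    worklist = [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt in in_slice:
            continue
        in_slice.add(stmt)
        worklist.extend(DATA_DEPS.get(stmt, ()))
        worklist.extend(CONTROL_DEPS.get(stmt, ()))
    return in_slice


print(sorted(static_slice(10)))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Under these assumed edges the static slice of z at statement 10 contains every statement of the program, because the may-alias on *p pulls in the definitions of both a and b.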

3 Dynamic Slicing
A dynamic slice is the set of statements that DID affect the value of a variable at a program point for ONE specific execution. [Korel and Laski, 1988]
- Execution trace
  - Control flow trace: dynamic control dependences
  - Memory reference trace: dynamic data dependences
- Construct a dynamic dependence graph
- Traverse the dynamic dependence graph to compute slices
- Smaller, more precise slices are more helpful
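The steps above can be sketched as follows, assuming a hand-written trace of the 10-statement example program for input N=2. The trace encoding (statement, defined location, used locations, controlling statement) and the names `TRACE`, `build_ddg`, and `dynamic_slice` are assumptions made for illustration; the authors' implementation works on IR-level traces.

```python
# One tuple per executed statement instance; the list index is its timestamp.
# (statement, location defined, locations used, controlling statement)
TRACE = [
    (1, "z", [], None),
    (2, "a", [], None),
    (3, "b", [], None),
    (4, "p", [], None),
    (5, "i", ["i", "N"], None),
    (6, None, ["i"], 5),
    (8, "a", ["a"], 5),
    (9, "z", ["p", "b"], 5),   # first iteration: *p is b
    (5, "i", ["i", "N"], None),
    (6, None, ["i"], 5),
    (7, "p", [], 6),
    (8, "a", ["a"], 5),
    (9, "z", ["p", "a"], 5),   # second iteration: *p is a
    (10, None, ["z"], None),
]


def build_ddg(trace):
    """Return, for each timestamp, the timestamps it depends on."""
    last_def = {}    # location -> timestamp of its latest definition
    last_inst = {}   # statement -> timestamp of its latest instance
    deps = []
    for t, (stmt, defined, used, ctrl) in enumerate(trace):
        preds = {last_def[u] for u in used if u in last_def}
        if ctrl is not None and ctrl in last_inst:
            preds.add(last_inst[ctrl])
        deps.append(preds)
        if defined is not None:
            last_def[defined] = t
        last_inst[stmt] = t
    return deps


def dynamic_slice(trace, criterion):
    """Statements reachable backwards from the instance at `criterion`."""
    deps = build_ddg(trace)
    seen, worklist = set(), [criterion]
    while worklist:
        t = worklist.pop()
        if t not in seen:
            seen.add(t)
            worklist.extend(deps[t])
    return {trace[t][0] for t in seen}


print(sorted(dynamic_slice(TRACE, 13)))  # [2, 5, 6, 7, 8, 9, 10]
```

For this run the dynamic slice of z at print(z) omits statements 1, 3, and 4, since their values never reach the final z.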

4 Slice Sizes: Static vs. Dynamic
Columns: Program | Statements | Static slice (avg. of 25 slices) | Dynamic slice (avg. of 25 slices) | Static / Dynamic
Programs: 126.gcc, 099.go, 134.perl, 130.li, 008.espresso
Values: 585,491 95, ,182 31,829 74,039 51,098 16,941 5,242 2,450 2,353 6,614 5,
A static slice can be much larger than the dynamic slice.

5 Applications of Dynamic Slicing
- Debugging [Korel & Laski]
- Detecting spyware [Jha]: installed without users' knowledge
- Software testing [Duesterwald, Gupta, & Soffa]: dependence-based structural testing, output slices
- Module cohesion [N. Gupta & Rao]: guide program structuring
- Performance-enhancing transformations
  - Instruction criticality [Zilles & Sohi]
  - Instruction isomorphism [Sazeides]
- Others…

6 The Graph Size Problem
Columns: Program | Statements Executed (Millions) | Dynamic Dependence Graph Size (MB)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go
Values: ,568 1,296 1,442 1,816 1, ,954 1,745 1,534 1,707
Graphs of realistic program runs do not fit in memory.

7 Space and Time Cost of LP [ICSE 2003]
Columns: Program | Slicing Time, Average (Minutes) | Max. Dynamic Dependence Graph Size (MB)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go
Still not fast enough. Need to keep graph in memory.

8 Dependence Graph Representation
Input: N=2
Program:
1: z=0
2: a=0
3: b=2
4: p=&b
5: for i = 1 to N do
6:   if (i%2 == 0) then
7:     p=&a
     endif
8:   a=a+1
9:   z=2*(*p)
   endfor
10: print(z)
Graph nodes (statement^instance): 5^1, 6^1, 7^1, 8^1, 9^1, 10^1, 1^1, 2^1, 3^1, 4^1, 5^2, 6^2, 8^2, 9^2

9 Dependence Graph Representation
Input: N=2
[Diagram: control flow graph over statements 1-10 of the example program, with T/F labels on the branch edges.]
Execution trace (statement^instance):
1^1: z=0
2^1: a=0
3^1: b=2
4^1: p=&b
5^1: for i = 1 to N do
6^1: if (i%2 == 0) then
8^1: a=a+1
9^1: z=2*(*p)
5^2: for i = 1 to N do
6^2: if (i%2 == 0) then
7^1: p=&a
8^2: a=a+1
9^2: z=2*(*p)
10^1: print(z)

10 OPT: Compacted Graph Algorithm
- Compaction: elimination of timestamp labels
  - Remove labels that can be inferred
  - Transform the dependence graph to enable elimination
  - Remove labels that are redundant
- Fast traversal: a long search for the relevant dependence is often replaced by a quick computation of the dependence
  - A consequence of compaction

11 OPT-1a. Infer Local Def-Use Labels: Full Elimination
[Diagram: a local def-use edge from "X =" to "= X", shown with timestamp labels (10,10) (20,20) (30,30) and with the single label 0.]
Assign timestamps on node level.
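A rough sketch of the idea behind full elimination, assuming each edge stores (definition timestamp, use timestamp) pairs and that timestamps are assigned per node, so a definition and a use executed in the same node instance share one timestamp. The `Edge` class and both functions are hypothetical names. This sketch checks the recorded pairs at run time for simplicity, whereas the slides present the labels as something that can be inferred.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A def-use edge with one (def_ts, use_ts) pair per exercised instance."""
    labels: list = field(default_factory=list)
    inferred: bool = False  # True once the labels have been dropped


def compact(edge):
    """Drop the labels when every pair has def_ts == use_ts."""
    if edge.labels and all(d == u for d, u in edge.labels):
        edge.labels = []
        edge.inferred = True
    return edge


def def_timestamp(edge, use_ts):
    """Timestamp of the defining instance for the use executed at use_ts."""
    if edge.inferred:
        return use_ts  # same node instance, so the same timestamp
    for d, u in edge.labels:  # otherwise search the explicit labels
        if u == use_ts:
            return d
    return None


local = compact(Edge([(10, 10), (20, 20), (30, 30)]))
non_local = compact(Edge([(10, 11), (20, 21)]))
print(local.labels, def_timestamp(local, 20))          # [] 20
print(non_local.labels, def_timestamp(non_local, 21))  # [(10, 11), (20, 21)] 20
```

The label-free edge answers a lookup with a constant-time computation instead of a search, which is the traversal benefit the compaction slide refers to.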

12 OPT-1b. Infer Local Def-Use Labels: Partial Elimination in Presence of Aliasing
[Diagram: statements "X =", "*P =", "= X" with edge labels (10,10) (20,20), then (10,10) and 0.]
*P is a may alias of X.

13 OPT-2a. Transform Local Def-Use Labels: Full Elimination in Presence of Aliasing
[Diagram: statements "Z =", "Y =", "X = f(Y)", "*P = g(Z)", "= X" with edge labels (10,11) (20,21), (11,11) (21,21), and 0.]

14 OPT-2b. Transform Non-Local Def-Use to Local Use-Use Edges
[Diagram: statements "= X" and "X =" with edge labels (10,11) (20,21), and a use-use edge labeled 0.]

15 OPT-2c. Transform Non-Local Def-Use to Local Def-Use Edges
[Diagram: statements "X =", "Y =", "= Y", "= X" in nodes 1 and 2, with edge labels (1,3) (2,3) (10,12) (11,12), then (1,3) (2,3), then 0 0. Caption: "Node for path".]

16 OPT-3. Redundant Labels Across Non-Local Def-Use Edges
[Diagram: statements "X =", "Y =", "= Y", "= X" with edge labels (1,2) (10,11).]

17 OPT-4. (Control Dep.) Infer Fixed Distance Unique Control Ancestor
[Diagram: path timestamps (32,33) (10,13) (20,23) (30,34) (21,22) (11,12) (31,32) (10,11) (20,21) (30,31); edge labels 1, 1.]

18 OPT-5a. Transform Multiple Control Ancestors
[Diagram: timestamps (32,33) (10,13) (20,23) (30,34) (21,22), then (10,13) (30,34).]

19 OPT-5b. Transform Varying Distance to Unique Control Ancestors

20 OPT-6. Redundant Across Non-Local Def-Use and Control Dependence Edges
[Diagram: statements "X =", "If P", "= X" with edge label (1,2).]

21 Completeness of Label Elimination Optimizations
- Data dependence labels
  - Local to a basic block
    - Infer (OPT-1a, OPT-1b)
    - Transform (OPT-2a)
  - Non-local, across basic blocks
    - Transform (OPT-2b, OPT-2c)
    - Redundant (OPT-3)
- Control dependence labels
  - Infer (OPT-4)
  - Transform (OPT-5a, OPT-5b)
  - Redundant (OPT-6)

22 Slicing Algorithm (1)
[Diagram: statements "s2: x=…" and "s1: v=f(x,…)" at timestamp t, joined by an edge labeled 0; result annotated "{s2} ∪ t".]

23 Slicing Algorithm (2)
[Diagram: statements "s2: …=x" and "s1: v=f(x,…)" at timestamp t, joined by a use-use edge labeled 0.]

24 Slicing Algorithm (3)
[Diagram: statement "s1: v=f(x,…)" at timestamp t with candidate definitions "s3: x=…" and "s4: x=…"; result annotated "{s3} ∪ t'".]

25 Shortcuts to Speed Up Traversal
0: X =
1: Y = f(X)
2: Z = g(Y)
3: … = Z
[Diagram: the dependence chain above drawn twice, first with edge labels (10,11) (20,21), 0, 0, then with (10,11) (20,21), 0 and the annotation {2}.]

26 Experimental Setup
- Implementation
  - Trimaran: C programs, IR (intermediate representation).
  - An instrumented interpreter executes the IR and collects a compact control flow trace and memory trace.
  - The CFG and PDG are constructed at the IR level, so slicing is also done at the IR level.
- Experiment
  - To get fair comparisons among algorithms, the different implementations share as much code as possible.
  - 2.2 GHz Pentium, 2 GB RAM, 1 GB swap space.
  - For each benchmark, we collected 3 different traces; for each trace, we computed 25 randomly chosen slices.

27 OPT: Compacted Graph Sizes
Columns: Program | Graph Size (MB), Before | Graph Size (MB), After | Before / After | Explicit Dependences (%)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go
Values: 1,568 1,296 1,442 1,816 1, ,954 1,745 1,534 1,

28 OPT: Effects

29 OPT: Slicing Times at Different Execution Points

30 OPT: Benefit of Shortcuts
Columns: Program | OPT Slicing Times (Avg. of 25 slices): W/O Shortcuts (Seconds) | With Shortcuts (Seconds)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go

31 OPT vs. LP: Graph Sizes
Columns: Program | Graph Size (MB): OPT | LP (Max. of 25)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go

32 OPT vs. LP: Slicing Times
Columns: Program | Slicing Times (Avg. of 25 slices): OPT (Seconds) | LP (Minutes)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go

33 Traditional vs. OPT: Short Program Runs
Columns: Program | Slicing Times (Avg. of 25 slices): OPT (Seconds) | Traditional (Seconds)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go
Values: 36.3 : : : : : : : : : :

34 Graph Construction Cost
Trace generation: the instrumented program takes twice as long to run as the uninstrumented program.
Trace preprocessing for graph construction: Time(LP) < Time(OPT) < Time(Traditional)
Columns: Program | LP (min) | OPT (min) | Trad. (min)
Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go

35 Conclusion
- A straightforward implementation of a precise algorithm is not practical.
- Carefully designed precise dynamic slicing algorithms provide precise dynamic slices at reasonable space and time costs.
- Our work is one step toward making dynamic slicing practical.
Ongoing work: efficient online compression
- another 5-10 times reduction;
- 15 MB for 150Mills (over 100 times reduction in total);
- 4-10 times slowdown.